跳转至

Chapter 2 Framework for analyzing algorithms

---
title: 算法分析框架图
---
flowchart LR
id1(Analysis)
id2(Description)
id3(Design)

id1 --> id3
id3 --> id2
id2 --> id1
  • Description: See how to describe algorithms in pseudocode(伪代码), C/C++ and argue the correctness of that.
  • Analysis: Begin using asymptotic(一致性) notation(符号) to express running-time analysis.
  • Design: for example, the technique of “divide and conquer”(分治) in the context of merge sort.

Start using framework for describing and analyzing algorithms.

2.1 Insertion sort: Framework for describing algorithms

  • Pseudocode: That is similar in many respects(方面) to C, Pascal, or Java, py, …

Differences between psuedocode and real code

  • Psuedocode is clear and concise(简明) to specify(指定) a given algorithm
  • Pseudocode is not typically concerned with issues of software engineering (data abstraction, modularity(模块化), error handling)

2.1.1 Pseudocode conventions(约定)

  • Indentation indecates block structure.
  • The loop constructs (while, for, repeat) and the conditional constructs (if, then, else) have interpretations(解释) similar to those in Pascal, C.
  • "//" or "\(\triangleright\)" indicates a comment.
  • A multiple assignment(多重赋值) \(i \leftarrow j \leftarrow e\) is equivalent to \(j \leftarrow e\) then \(i \leftarrow j\).
  • Variables are local(本地的) to the given procedure(程序).
  • Array elements access: \(A[i]; A[1 \cdots j] = <A[1], A[2], \cdots, A[i]>\)
  • Compound(复合) data are typically(通常) organazed into objects.
  • Parameters are passed to a procedure by value. 参数按值传递给程序。
  • The boolean operators "and" and "or" are short circuiting(短路): for example "x and (or) y": whether or not evaluate(v.计算) y rely on the evaluating(n.计算) of x.

Insertion sort --Description

INSERTION-SORT(A)
    for (j = 2; j <= length[A]; j++) // loop header
    {
        key = A[j]
        // Insert A[j] into the sorted sequence A[1 ... j-1]
        i <- j - 1
        while (i > 0 && A[i] > key)
        {
            A[i + 1] = A[i]
            i = i - 1
        }
        A[i + 1] = key
    }

2.1.2 Loop invariants(不变量) and the correctness of insertion sert

  • Loop invariants循环不变量: At the start of each iteration of for loop, the subarray A[1 .. j-1] consists of the elements originally in A[1 .. j-1] but in sorted order.

It can help us understand why an algorithm is correct.

Three properties(属性) about a loop invariant

  • Initialization: it is true(正确的) prior to(在...之前) the first iteration of the loop. (base case)
  • Maintenance: if it is true before an iteration of the loop, it remains true before the next iteration. (indective step检测步骤)
  • Termination: When the loop terminates, the invariant show that the algorithms is correct. (stopping the "indection" when the loop terminates; the indective step of mathematical indection is used infinitely)

Three properties hold for insertion sort

Loop invariant: original A[1 .. j-1] is permuted(已排列) to A'[1 .. j-1] but in sorted order

  • Initialization: j = 2, A[1 .. j-1] consists of just the single A[1], which is the original element in A[1], and is sorted. Obviously, the loop invariant holds prior to the first iteration of the loop.
  • Maintenance: A'[1 .. j-1] is in sorted order, sorting A[j] into A'[1 .. j-1], s.t. \(<a'_1, a'_2, \cdots, a'_k, a_j, a'_{k + 1} \cdots, a'_{j - 1}>\), obviously, it is sorted. The second property holds for the outer loop.
  • Termination: when j = n + 1, A'[1 .. j-1] = A'[1 .. n] consists of the elements originally in A[1 ..n], but in sorted order! Hence, the algorithm is correct.

2.2 Insertion sort: Analysing algorithms

Effeciency

Analysing an algorithm means predicting the resources

  • Resoutces:
  • computational time (in CPU, I/O, ...)
  • space (memory)
  • communication(通信) bandwidth(带宽)
  • primary(主要的) concern computer hardware

By analyzing several candidate(候选) algorithms for a problem, a most efficient one can be identified, but several inferior(劣) are discarded(丢弃)

  • Computing model
  • 原始递归函数: 一种能停机的递归函数,可以用图灵机计算。
  • λ演算:可以被称为最小的通用程序设计语言。它包括一条变换规则 (变量替换) 和一条函数定义方式,λ演算之通用在于,任何一个可计算函数都能用这种形式来表达和求值。
  • 图灵机(七元组的符合体系,抽象的计算机器):状态自动转移,就是机器指令的例行程序。但指令不能存储,不能形成程序结构。
  • RAM, Random-access machine 随机存取器(冯诺依曼):带着可随机访问的内存,主要是能用电子元器件做出来。

RAM model: Algorithms will be implemented as computer programs.

  • RAM: a generic(通用) one-processor(单处理器), instructions are executed one after another, with no concurrent(并发) operations.
  • RAM contains instructions commonly (basic instructions)
  • Arithmetic(算术) (add, subtract, multiply, divide, remainder(剩余), floor, ceiling)
  • Data movement (load, store, copy)
  • Control (conditional and unconditional branch, subroutine(子程序) call and return)
  • Each such instruction takes a constant amount of time.

Challenge

Analyzing a simple algorithm can be a challenge

  • Mathematics required: probability theory, combinatorics(组合数学), algebraic dexterity(代数灵巧性)
  • An ability to identify the most significant terms in a formula(公式)
  • Summarizing an algorithm in simple formulas

2.3 Merge sort: Designing algorithms

2.3.1 The divide-and-conquer approach

Many useful algorithms are recursive(递归的) in structure: to solve a given problem, they call themselves recursively one or more times to deal with closely related subproblems. These algorithms typically follow a divide-and-conquer approach: they break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and then combine these solutions to create a solution to the original problem.

The divide-and-conquer paradigm(模式) involves three steps at each level of the recursion:

  1. Divede the problem into a number of subproblems that are smaller instances of the same problem.
  2. Conquer the subproblems by solving them recursicely. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner.
  3. Combine the solutions to the subproblems into the solution for the original problem.