Chapter 2 Framework for analyzing algorithms
---
title: 算法分析框架图
---
flowchart LR
id1(Analysis)
id2(Description)
id3(Design)
id1 --> id3
id3 --> id2
id2 --> id1
Description
: See how to describe algorithms in pseudocode(伪代码), C/C++ and argue the correctness of that.Analysis
: Begin using asymptotic(一致性) notation(符号) to express running-time analysis.Design
: for example, the technique of “divide and conquer”(分治) in the context of merge sort.
Start using framework for describing and analyzing algorithms.
2.1 Insertion sort: Framework for describing algorithms
Pseudocode
: That is similar in many respects(方面) to C, Pascal, or Java, py, …
Differences between psuedocode and real code
- Psuedocode is clear and concise(简明) to specify(指定) a given algorithm
- Pseudocode is not typically concerned with issues of software engineering (data abstraction, modularity(模块化), error handling)
2.1.1 Pseudocode conventions(约定)
- Indentation indecates block structure.
- The loop constructs (while, for, repeat) and the conditional constructs (if, then, else) have interpretations(解释) similar to those in Pascal, C.
- "//" or "\(\triangleright\)" indicates a comment.
- A multiple assignment(多重赋值) \(i \leftarrow j \leftarrow e\) is equivalent to \(j \leftarrow e\) then \(i \leftarrow j\).
- Variables are local(本地的) to the given procedure(程序).
- Array elements access: \(A[i]; A[1 \cdots j] = <A[1], A[2], \cdots, A[i]>\)
- Compound(复合) data are typically(通常) organazed into objects.
- Parameters are passed to a procedure by value. 参数按值传递给程序。
- The boolean operators "and" and "or" are short circuiting(短路): for example "x and (or) y": whether or not evaluate(v.计算) y rely on the evaluating(n.计算) of x.
Insertion sort --Description
INSERTION-SORT(A)
for (j = 2; j <= length[A]; j++) // loop header
{
key = A[j]
// Insert A[j] into the sorted sequence A[1 ... j-1]
i <- j - 1
while (i > 0 && A[i] > key)
{
A[i + 1] = A[i]
i = i - 1
}
A[i + 1] = key
}
2.1.2 Loop invariants(不变量) and the correctness of insertion sert
Loop invariants
循环不变量: At the start of each iteration of for loop, the subarray A[1 .. j-1] consists of the elements originally in A[1 .. j-1] but in sorted order.
It can help us understand why an algorithm is correct.
Three properties(属性) about a loop invariant
Initialization
: it is true(正确的) prior to(在...之前) the first iteration of the loop. (base case)Maintenance
: if it is true before an iteration of the loop, it remains true before the next iteration. (indective step检测步骤)Termination
: When the loop terminates, the invariant show that the algorithms is correct. (stopping the "indection" when the loop terminates; the indective step of mathematical indection is used infinitely)
Three properties hold for insertion sort
Loop invariant
: original A[1 .. j-1] is permuted(已排列) to A'[1 .. j-1] but in sorted order
Initialization
: j = 2, A[1 .. j-1] consists of just the single A[1], which is the original element in A[1], and is sorted. Obviously, the loop invariant holds prior to the first iteration of the loop.Maintenance
: A'[1 .. j-1] is in sorted order, sorting A[j] into A'[1 .. j-1], s.t. \(<a'_1, a'_2, \cdots, a'_k, a_j, a'_{k + 1} \cdots, a'_{j - 1}>\), obviously, it is sorted. The second property holds for the outer loop.Termination
: when j = n + 1, A'[1 .. j-1] = A'[1 .. n] consists of the elements originally in A[1 ..n], but in sorted order! Hence, the algorithm is correct.
2.2 Insertion sort: Analysing algorithms
Effeciency
Analysing an algorithm means predicting the resources
- Resoutces:
- computational time (in CPU, I/O, ...)
- space (memory)
- communication(通信) bandwidth(带宽)
- primary(主要的) concern computer hardware
By analyzing several candidate(候选) algorithms for a problem, a most efficient one can be identified, but several inferior(劣) are discarded(丢弃)
- Computing model
- 原始递归函数: 一种能停机的递归函数,可以用图灵机计算。
- λ演算:可以被称为最小的通用程序设计语言。它包括一条变换规则 (变量替换) 和一条函数定义方式,λ演算之通用在于,任何一个可计算函数都能用这种形式来表达和求值。
- 图灵机(七元组的符合体系,抽象的计算机器):状态自动转移,就是机器指令的例行程序。但指令不能存储,不能形成程序结构。
- RAM, Random-access machine 随机存取器(冯诺依曼):带着可随机访问的内存,主要是能用电子元器件做出来。
RAM model
: Algorithms will be implemented as computer programs.
RAM
: a generic(通用) one-processor(单处理器), instructions are executed one after another, with no concurrent(并发) operations.- RAM contains instructions commonly (basic instructions)
- Arithmetic(算术) (add, subtract, multiply, divide, remainder(剩余), floor, ceiling)
- Data movement (load, store, copy)
- Control (conditional and unconditional branch, subroutine(子程序) call and return)
- Each such instruction takes a constant amount of time.
Challenge
Analyzing a simple algorithm can be a challenge
- Mathematics required: probability theory, combinatorics(组合数学), algebraic dexterity(代数灵巧性)
- An ability to identify the most significant terms in a formula(公式)
- Summarizing an algorithm in simple formulas
2.3 Merge sort: Designing algorithms
2.3.1 The divide-and-conquer approach
Many useful algorithms are recursive(递归的) in structure: to solve a given problem, they call themselves recursively one or more times to deal with closely related subproblems. These algorithms typically follow a divide-and-conquer approach: they break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and then combine these solutions to create a solution to the original problem.
The divide-and-conquer paradigm(模式) involves three steps at each level of the recursion:
Divede
the problem into a number of subproblems that are smaller instances of the same problem.Conquer
the subproblems by solving them recursicely. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner.Combine
the solutions to the subproblems into the solution for the original problem.