LECTURE 2 SORTING
Beakal Gizachew
Based on slides of Serdar Taşıran, Shafi Goldwasser, Erik Demaine, and Alptekin Küpçü
SORTING
• Sorting
• Input: a sequence of n numbers <a1,a2,…,an>
• Output: a re-ordering <a’1, a’2, …, a’n> of the sequence such that a’1 ≤ a’2 ≤ … ≤ a’n
• Example:
• Input 8 2 4 9 3 6
• Output 2 3 4 6 8 9
• Such an input is called an instance of the sorting problem
• Check these demos:
• http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/sortcontest/sortcontest.htm
• http://www.cs.oswego.edu/~mohammad/classes/csc241/samples/sort/Sort2-E.html
• http://www.cs.ubc.ca/~harrison/Java/sorting-demo.html
• http://www.cs.pitt.edu/~kirk/cs1501/animations/Sort3.html
INSERTION SORT
• Takes array A[1..n] containing a sequence of length n to be sorted
• Sorts the array in place
• Numbers are re-arranged within A, with at most a constant number of them stored outside the array at any time.
• O(1) extra space
INSERTION SORT: EXAMPLE (figure omitted)
To insert 12, we need to make room for it by moving first 36 and then 24.
INSERTION SORT
INSERTION-SORT (A, n)  ⊳ A[1 . . n]   “pseudocode”
  for j ← 2 to n
      do key ← A[ j ]
         i ← j − 1
         while i > 0 and A[ i ] > key
             do A[i+1] ← A[ i ]
                i ← i − 1
         A[i+1] ← key
(figure omitted: A[1 . . j−1] is the already-sorted prefix into which key = A[ j ] is inserted)
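A direct rendering of this pseudocode in Python (a minimal sketch, not part of the original slides; Python lists are 0-indexed, so the outer loop starts at index 1):

def insertion_sort(a):
    """Sort the list a in place and return it."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that are larger than key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

print(insertion_sort([8, 2, 4, 9, 3, 6]))   # [2, 3, 4, 6, 8, 9]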
EXAMPLE OF INSERTION SORT
8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9 done
CORRECTNESS OF INSERTION SORT
INSERTION-SORT (A, n)  ⊳ A[1 . . n]
  for j ← 2 to n
      do key ← A[ j ]
         i ← j − 1
         while i > 0 and A[ i ] > key
             do A[i+1] ← A[ i ]
                i ← i − 1
         A[i+1] ← key
• Loop invariant:
• At the start of each iteration, the subarray A[1..j-1] contains the elements originally in A[1..j-1], but in sorted (increasing) order
• Prove initialization, maintenance, and termination
• The invariant must imply an interesting property of the algorithm: at termination j = n+1, so the invariant says A[1..n] is sorted
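One way to make the invariant concrete is to check it at the top of every outer-loop iteration; a hedged sketch (the assertion and helper name are illustrative, not from the slides):

def insertion_sort_checked(a):
    """Insertion sort with a runtime check of the loop invariant."""
    original = list(a)
    for j in range(1, len(a)):
        # Invariant: a[0..j-1] holds exactly the elements originally in
        # a[0..j-1], in increasing order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a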
RUNNING TIME
• The running time of insertion sort depends on the input: e.g., an
already sorted sequence is easier to sort.
• Major Simplifying Convention: Parameterize the running time by
the size of the input, since short sequences are easier to sort than
long ones.
➢ T_A(n) = running time of algorithm A on inputs of length n
➢ n: Number of bits required to encode input
➢ Ignore machine-dependent constants
➢ Look at growth of T(n) as n → ∞
➢ For small inputs, the algorithm will run fast anyway. What matters is how large an input we can still handle in a reasonable amount of time.
KINDS OF ANALYSES
• Worst-case: (most commonly used)
• T(n) = maximum time of algorithm on any input of size n.
• Generally, we seek upper bounds on the running time, to have a
guarantee of performance.
• Average-case: (sometimes used)
• T(n) = expected time of algorithm over all inputs of size n.
• Need assumption of statistical distribution of real inputs.
• Best-case: (almost never used)
• T(n) = best possible time of algorithm over any input of size n.
• Cheat with a slow algorithm that works fast on some input.
• May be useful when proving negative results
• e.g., Even the best case of algorithm X is slower than the worst
case of algorithm Y.
ANALYSIS OF INSERTION SORT
INSERTION-SORT (A)                                            cost   times
for j ← 2 to n                                                c1     n
    do key ← A[ j ]                                           c2     n − 1
       ⊳ Insert A[ j ] into the sorted sequence A[1 . . j−1]  0      n − 1
       i ← j − 1                                              c4     n − 1
       while i > 0 and A[ i ] > key                           c5     Σ_{j=2..n} t_j
           do A[i+1] ← A[ i ]                                 c6     Σ_{j=2..n} (t_j − 1)
              i ← i − 1                                       c7     Σ_{j=2..n} (t_j − 1)
       A[i+1] ← key                                           c8     n − 1
t_j: number of times the while-loop test is executed in iteration j
T(n) = c1·n + c2·(n − 1) + c4·(n − 1) + c5·Σ_{j=2..n} t_j + c6·Σ_{j=2..n} (t_j − 1) + c7·Σ_{j=2..n} (t_j − 1) + c8·(n − 1)
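To see the t_j counts concretely, one can instrument the sort to count executions of the while-loop test; a small sketch (not from the slides). For reverse-sorted input, t_j = j, so the total is Σ_{j=2..n} j = n(n+1)/2 − 1:

def insertion_sort_counting(a):
    """Insertion sort that counts how many times the while-loop test runs."""
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                       # one execution of the while test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests

print(insertion_sort_counting([6, 5, 4, 3, 2, 1]))   # n = 6: 6*7/2 - 1 = 20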
INSERTION SORT ANALYSIS
Worst case: Input reverse sorted.
T(n) = Σ_{j=2..n} Θ(j) = Θ(n²)   [arithmetic series]
Average case: All permutations equally likely.
T(n) = Σ_{j=2..n} Θ(j/2) = Θ(n²)
Best case: Input already sorted. O(n)
Is insertion sort a fast sorting algorithm?
• Moderately so, for small n.
• Not at all, for large n.
HOW TO DESIGN ALGORITHMS?
• We’ll see many example styles throughout the semester
• Insertion sort was an example of an “incremental algorithm”
• Another common paradigm: Divide and Conquer
• Divide into simpler/smaller sub-problems
• Solve (conquer) sub-problems recursively
• Combine results of sub-problems
MERGE SORT
MERGE-SORT A[1 . . n]
1. If n = 1, return.
2. Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ].
3. Merge the two sorted lists.
Key subroutine: MERGE
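A compact Python version of this scheme, with MERGE written as a two-pointer pass over the two halves (a sketch under the assumption that plain lists and slicing are acceptable; it returns a new list rather than sorting in place):

def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])    # at most one of these two is non-empty
    out.extend(right[j:])
    return out

def merge_sort(a):
    """MERGE-SORT: split, sort each half recursively, then merge."""
    if len(a) <= 1:
        return a
    mid = (len(a) + 1) // 2             # ceil(n/2), matching A[1 . . ⌈n/2⌉]
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([8, 2, 4, 9, 3, 6]))   # [2, 3, 4, 6, 8, 9]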
MERGE SORT EXAMPLE
(figure omitted)
MERGING TWO SORTED ARRAYS
(animation frames omitted) Example: merging the sorted lists 2, 7, 13, 20 and 1, 9, 11, 12. At each step, compare the smallest remaining element of each list and move the smaller one to the output, producing 1, 2, 7, 9, 11, 12, …
Time = O(n) to merge a total of n elements (linear time).
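The omitted frames correspond to the following two-pointer process; a small standalone sketch (using the example values above) that prints the partial output after each comparison:

left, right = [2, 7, 13, 20], [1, 9, 11, 12]    # the two sorted lists
merged = []
i = j = 0
while i < len(left) and j < len(right):
    # Compare the smallest remaining element of each list; emit the smaller one.
    if left[i] <= right[j]:
        merged.append(left[i]); i += 1
    else:
        merged.append(right[j]); j += 1
    print(merged)
merged += left[i:] + right[j:]                   # append whatever is left over
print(merged)                                    # [1, 2, 7, 9, 11, 12, 13, 20]

Each comparison takes O(1) time and every element is moved to the output exactly once, which is why merging n elements takes O(n) time.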
ANALYZING MERGE SORT
T(n)        MERGE-SORT A[1 . . n]
O(1)            If n = 1, return
2T(n/2)         Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ]
O(n)            Merge the two sorted lists
Sloppiness: Should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out this does not matter asymptotically.
RECURRENCE FOR MERGE SORT
T(n) = O(1)               if n = 1
T(n) = 2T(n/2) + O(n)     if n > 1
Note: the base case T(1) = O(1) is usually left implicit, since it does not affect the asymptotic solution.
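As a sanity check, one can evaluate this recurrence numerically and compare it with n·log₂ n; a small sketch assuming c = 1 for both the recursive term and the base case:

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = T(ceil(n/2)) + T(floor(n/2)) + n, with T(1) = 1."""
    if n <= 1:
        return 1
    return T(n // 2) + T(n - n // 2) + n

for n in (16, 256, 4096):
    print(n, T(n), n * (math.log2(n) + 1))   # for powers of 2, T(n) = n*(log2 n + 1)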
RECURSION TREE
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
(tree figure omitted; level-by-level costs:)
Level 0: cn                                 → cost cn
Level 1: cn/2 + cn/2                        → cost cn
Level 2: cn/4 + cn/4 + cn/4 + cn/4          → cost cn
…
Height h = log n; #leaves = n, each costing O(1)  → last level costs O(n)
Total = O(n log n)
CONCLUSIONS
• O(n log n) grows more slowly than O(n2).
• Therefore, merge sort asymptotically beats insertion sort in the
worst case.
• In practice, merge sort beats insertion sort for n > 30 or so.
MASTER THEOREM
• Let T(n) = a·T(n/b) + f(n), where
• a ≥ 1, b > 1 are constants
• f(n) is an asymptotically positive function
• Then T(n) can be bounded asymptotically as follows:
• If f(n) = O(n^(log_b a − ε)) for some ε > 0, then T(n) = Θ(n^(log_b a))
• If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n)
• If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n))
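As a quick check against the recursion-tree result: for merge sort, a = 2, b = 2, and f(n) = Θ(n). Since n^(log_b a) = n^(log_2 2) = n, we have f(n) = Θ(n^(log_b a)), so case 2 applies and T(n) = Θ(n log n).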