Merge Sort and Quick Sort
CSC 391
2
Sorting
• Insertion sort
– Design approach: incremental
– Sorts in place: Yes
– Best case: Θ(n)
– Worst case: Θ(n²)
• Bubble Sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
3
Sorting
• Selection sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Merge Sort
– Design approach: divide and conquer
– Sorts in place: No
– Running time: Let's see!!
4
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
• Conquer the sub-problems
– Solve the sub-problems recursively
– Sub-problem size small enough → solve the problem in a
straightforward manner
• Combine the solutions of the sub-problems
– Obtain the solution for the original problem
5
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences
6
Merge Sort
Alg.: MERGE-SORT(A, p, r)
if p < r                         Check for base case
   then q ← ⌊(p + r)/2⌋          Divide
        MERGE-SORT(A, p, q)      Conquer
        MERGE-SORT(A, q + 1, r)  Conquer
        MERGE(A, p, q, r)        Combine
• Initial call: MERGE-SORT(A, 1, n)
[Array diagram: A = (5, 2, 4, 7, 1, 3, 2, 6) at indices 1..8, with p, q and r marked]
7
Example – n Power of 2
[Figure: the divide phase. A = (5, 2, 4, 7, 1, 3, 2, 6) is split at q = 4 into (5, 2, 4, 7) and (1, 3, 2, 6), then into (5, 2), (4, 7), (1, 3), (2, 6), and finally into single elements.]
8
Example – n Power of 2
[Figure: the conquer-and-merge phase. The single elements are merged pairwise into (2, 5), (4, 7), (1, 3), (2, 6), then into (2, 4, 5, 7) and (1, 2, 3, 6), and finally into the sorted array (1, 2, 2, 3, 4, 5, 6, 7).]
9
Example – n Not a Power of 2
[Figure: the divide phase for an 11-element array. A = (4, 7, 2, 6, 1, 4, 7, 3, 5, 2, 6) is split at q = 6 into (4, 7, 2, 6, 1, 4) and (7, 3, 5, 2, 6); on the next level q = 3 and q = 9 give (4, 7, 2), (6, 1, 4), (7, 3, 5) and (2, 6), and so on down to single elements.]
10
Example – n Not a Power of 2
[Figure: the conquer-and-merge phase for the 11-element array. The pieces are merged back into (2, 4, 7), (1, 4, 6), (3, 5, 7) and (2, 6), then into (1, 2, 4, 4, 6, 7) and (2, 3, 5, 6, 7), and finally into the sorted array (1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7).]
11
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
1. Compute n1 and n2
2. Copy the first n1 elements into L[1
. . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.     do if L[ i ] ≤ R[ j ]
7.          then A[k] ← L[ i ]
8.               i ← i + 1
9.          else A[k] ← R[ j ]
10.              j ← j + 1
[Diagram: merging L = (2, 4, 5, 7, ∞) and R = (1, 2, 3, 6, ∞) back into A[p . . r]; the ∞ sentinels mark the ends of the two runs, and n1, n2 are the run lengths.]
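For reference, a possible Java rendering of this MERGE pseudocode (a sketch only: 0-indexed, with Integer.MAX_VALUE standing in for the ∞ sentinels; the method name merge matches the call in the mergesort code on the next slide):

    // Merge the sorted runs a[p..q] and a[q+1..r] back into a[p..r].
    static void merge(int[] a, int p, int q, int r) {
        int n1 = q - p + 1;                  // length of the left run
        int n2 = r - q;                      // length of the right run
        int[] L = new int[n1 + 1];
        int[] R = new int[n2 + 1];
        for (int i = 0; i < n1; i++) L[i] = a[p + i];
        for (int j = 0; j < n2; j++) R[j] = a[q + 1 + j];
        L[n1] = Integer.MAX_VALUE;           // sentinel playing the role of ∞
        R[n2] = Integer.MAX_VALUE;
        int i = 0, j = 0;
        for (int k = p; k <= r; k++) {       // always copy the smaller front element
            if (L[i] <= R[j]) a[k] = L[i++];
            else              a[k] = R[j++];
        }
    }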
Complexity of Merge Sort
Algorithm:
void mergesort(int[] a, int left, int right)
{
    if (right > left)
    {
        int middle = left + (right - left)/2;   // split point
        mergesort(a, left, middle);             // sort the left half
        mergesort(a, middle + 1, right);        // sort the right half
        merge(a, left, middle, right);          // merge the two sorted halves
    }
}
Assumption: N is a power of two. For N = 1 the time is a constant (denoted by 1).
Otherwise, the time to mergesort N elements = the time to mergesort the two halves of N/2
elements each + the time to merge two arrays of N/2 elements each.
The time to merge two arrays of N/2 elements each is linear, i.e. N.
Thus we have:
(1) T(1) = 1
(2) T(N) = 2T(N/2) + N
Next we will solve this recurrence relation. First we divide (2) by N:
(3) T(N) / N = T(N/2) / (N/2) + 1
N is a power of two, so we can write
(4) T(N/2) / (N/2) = T(N/4) / (N/4) +1
(5) T(N/4) / (N/4) = T(N/8) / (N/8) +1
(6) T(N/8) / (N/8) = T(N/16) / (N/16) +1
(7) ……
(8) T(2) / 2 = T(1) / 1 + 1
Now we add equations (3) through (8): the sum of their left-hand sides will be equal to
the sum of their right-hand sides:
T(N) / N + T(N/2) / (N/2) + T(N/4) / (N/4) + … + T(2)/2 =
T(N/2) / (N/2) + T(N/4) / (N/4) + … + T(2) / 2 + T(1) / 1 + log N
(the log N comes from adding up the 1's on the right-hand sides: there is one per equation, and there are log N equations)
After cancelling the terms that appear on both sides, we get
(9) T(N)/N = T(1)/1 + log N
T(1) is 1, hence we obtain
(10) T(N) = N + N log N = O(N log N)
Hence the complexity of the MergeSort algorithm is O(N log N).
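As a quick sanity check (not part of the slides), the closed form can be compared against the recurrence numerically for powers of two; the two values agree exactly:

    // Evaluate the recurrence T(1) = 1, T(N) = 2*T(N/2) + N and compare it
    // with the closed form N + N*log2(N) derived above.
    public class MergeSortRecurrence {
        static long T(long n) {
            return (n == 1) ? 1 : 2 * T(n / 2) + n;
        }
        public static void main(String[] args) {
            for (long n = 1; n <= 1024; n *= 2) {
                long log2 = 63 - Long.numberOfLeadingZeros(n);   // log2(n) for a power of two
                System.out.println("N=" + n + "  T(N)=" + T(n)
                        + "  N+N*log2(N)=" + (n + n * log2));
            }
        }
    }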
14
Quicksort
• Sort an array A[p…r]
• Divide
– Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such that
each element of A[p..q] is smaller than or equal to each element in
A[q+1..r]
– Need to find index q to partition the array
A[p…q] ≤ A[q+1…r]
15
Quicksort
• Conquer
– Recursively sort A[p..q] and A[q+1..r] using Quicksort
• Combine
– Trivial: the arrays are sorted in place
– No additional work is required to combine them
– The entire array is now sorted
A[p…q] ≤ A[q+1…r]
16
Recurrence
Alg.: QUICKSORT(A, p, r)
if p < r
   then q ← PARTITION(A, p, r)
        QUICKSORT(A, p, q)
        QUICKSORT(A, q + 1, r)
Recurrence:
Initially: p=1, r=n
T(n) = T(q) + T(n – q) + n
17
Partitioning the Array
Alg. PARTITION (A, p, r)
1. x ← A[p]
2. i ← p – 1
3. j ← r + 1
4. while TRUE
5.     do repeat j ← j – 1
6.              until A[j] ≤ x
7.        repeat i ← i + 1
8.              until A[i] ≥ x
9.        if i < j
10.          then exchange A[i] ↔ A[j]
11.          else return j
Running time: Θ(n), where n = r – p + 1
[Diagram: partitioning the example array A = (5, 3, 2, 6, 4, 1, 3, 7); the indices i and j sweep toward each other until they cross at position q, leaving A[p…q] ≤ A[q+1…r]. Each element is visited once!]
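A possible Java sketch of this PARTITION together with the QUICKSORT procedure from the previous slide (0-indexed; the class and method names are illustrative only):

    public class QuickSortDemo {
        // Hoare-style partition around the pivot x = a[p]; returns an index q such that
        // every element of a[p..q] is <= every element of a[q+1..r].
        static int partition(int[] a, int p, int r) {
            int x = a[p];
            int i = p - 1;
            int j = r + 1;
            while (true) {
                do { j--; } while (a[j] > x);     // repeat j <- j - 1 until a[j] <= x
                do { i++; } while (a[i] < x);     // repeat i <- i + 1 until a[i] >= x
                if (i < j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
                else return j;
            }
        }

        static void quicksort(int[] a, int p, int r) {
            if (p < r) {
                int q = partition(a, p, r);
                quicksort(a, p, q);        // note: position q stays in the left part
                quicksort(a, q + 1, r);
            }
        }

        public static void main(String[] args) {
            int[] a = {5, 3, 2, 6, 4, 1, 3, 7};   // the array from the diagram above
            quicksort(a, 0, a.length - 1);
            System.out.println(java.util.Arrays.toString(a));
        }
    }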
Analysis of Quicksort
Partition can be done in O(n) time, where n is the size of
the array. Let T(n) be the number of comparisons
required by Quicksort.
If the pivot ends up at position k, then we have
T(n) = T(n – k) + T(k – 1) + n
To determine best-, worst-, and average-case complexity
we need to determine the values of k that correspond
to these cases.
Best-Case Complexity
The best case is clearly when the pivot always
partitions the array equally.
Intuitively, this would lead to a recursive depth of at
most lg n calls.
We can actually prove this. In this case
– T(n) = T(n/2) + T(n/2) + n = Θ(n lg n)
Best Case Partitioning
• Best-case partitioning
– Partitioning produces two regions of size n/2
• Recurrence: q = n/2
T(n) = 2T(n/2) + Θ(n)
T(n) = Θ(n lg n) (by the Master theorem)
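For completeness (the slide leaves this step implicit): here a = 2, b = 2 and f(n) = Θ(n); since n^(log_b a) = n^(log_2 2) = n = Θ(f(n)), case 2 of the Master theorem applies and gives T(n) = Θ(n lg n).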
Average-Case Complexity
The average case is rather complex to analyze, but it is where the
algorithm earns its name. The bottom line is: T(n) = Θ(n lg n)
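One standard way to set up this analysis (a sketch, not shown on the slides): if each pivot position k, 1 ≤ k ≤ n, is equally likely, the expected cost satisfies
T(n) = n + (1/n) · Σ k=1..n [T(k – 1) + T(n – k)] = n + (2/n) · [T(0) + T(1) + … + T(n – 1)],
and this recurrence solves to Θ(n lg n).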
Worst Case Partitioning
The worst-case behavior for quicksort occurs when the
partitioning routine produces one region with n – 1 elements and
one with only 1 element.
Let us assume that this unbalanced partitioning arises
at every step of the algorithm.
Since partitioning costs Θ(n) time
and T(1) = Θ(1), the recurrence for
the running time is
T(n) = T(n – 1) + Θ(n).
To evaluate this recurrence, we observe that T(1) = Θ(1) and then
iterate:
[Recursion-tree figure: the subproblem sizes shrink as n, n – 1, n – 2, …, 2, 1, the cost at each level is proportional to the subproblem size, and the total work is Θ(n²).]
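Spelling out the sum represented by the figure (a standard step, not written out on the slide): T(n) = n + (n – 1) + (n – 2) + … + 2 + Θ(1) = Θ(n(n + 1)/2) = Θ(n²).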
Quicksort: Summary
Best case: split in the middle — Θ( n log n)
Worst case: already-sorted array! — Θ( n2)
Average case: random arrays — Θ( n log n)