David Luebke 1
10/08/25
CS 332: Algorithms
Linear-Time Sorting Algorithms
Sorting So Far
• Insertion sort:
  – Easy to code
  – Fast on small inputs (less than ~50 elements)
  – Fast on nearly-sorted inputs
  – O(n²) worst case
  – O(n²) average case (equally likely inputs)
  – O(n²) reverse-sorted case
Sorting So Far
• Merge sort:
  – Divide-and-conquer:
    ◦ Split array in half
    ◦ Recursively sort subarrays
    ◦ Linear-time merge step
  – O(n lg n) worst case
  – Doesn’t sort in place
Sorting So Far
• Heap sort:
  – Uses the very useful heap data structure
    ◦ Complete binary tree
    ◦ Heap property: parent key ≥ children’s keys
  – O(n lg n) worst case
  – Sorts in place
  – Fair amount of shuffling memory around
Sorting So Far
• Quick sort:
  – Divide-and-conquer:
    ◦ Partition array into two subarrays, recursively sort
    ◦ All of first subarray ≤ all of second subarray
    ◦ No merge step needed!
  – O(n lg n) average case
  – Fast in practice
  – O(n²) worst case
    ◦ Naïve implementation: worst case on sorted input
    ◦ Address this with randomized quicksort
How Fast Can We Sort?
• We will prove a lower bound, then beat it
  – How do you suppose we’ll beat it?
• First, an observation: all of the sorting algorithms so far are comparison sorts
  – The only operation used to gain ordering information about a sequence is the pairwise comparison of two elements
  – Theorem: all comparison sorts are Ω(n lg n)
    ◦ A comparison sort must do Ω(n) comparisons (why?)
    ◦ What about the gap between Ω(n) and Ω(n lg n)?
Decision Trees
• Decision trees provide an abstraction of comparison sorts
  – A decision tree represents the comparisons made by a comparison sort; everything else is ignored
  – (Draw examples on board)
• What do the leaves represent?
• How many leaves must there be?
Decision Trees
• Decision trees can model comparison sorts. For a given algorithm:
  – One tree for each input size n
  – Tree paths are all possible execution traces
  – What’s the longest path in a decision tree for insertion sort? For merge sort?
• What is the asymptotic height of any decision tree for sorting n elements?
• Answer: Ω(n lg n) (now let’s prove it…)
Lower Bound for Comparison Sorting
• Thm: Any decision tree that sorts n elements has height Ω(n lg n)
• What’s the minimum # of leaves? (One for each permutation of the input: n!)
• What’s the maximum # of leaves of a binary tree of height h? (2^h)
• Clearly the minimum # of leaves is less than or equal to the maximum # of leaves
Lower Bound for Comparison Sorting
• So we have…
      n! ≤ 2^h
• Taking logarithms:
      lg(n!) ≤ h
• Stirling’s approximation tells us:
      n! > (n/e)^n
• Thus:
      h ≥ lg((n/e)^n)
Lower Bound for Comparison Sorting
• So we have
      h ≥ lg((n/e)^n) = n lg(n/e) = n lg n − n lg e = Ω(n lg n)
• Thus the minimum height of a decision tree is Ω(n lg n)
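The bound above can be sanity-checked numerically. The sketch below (not from the slides) compares lg(n!) against the Stirling lower bound n lg n − n lg e for a few values of n, computing lg(n!) via Python's log-gamma function.

```python
import math

# Sanity check of the lower-bound derivation: a decision tree sorting n
# elements needs height h >= lg(n!), and Stirling gives
# lg(n!) >= n lg n - n lg e.
for n in (10, 100, 1000):
    lg_fact = math.lgamma(n + 1) / math.log(2)            # lg(n!) from ln(n!)
    stirling_bound = n * math.log2(n) - n * math.log2(math.e)
    assert lg_fact >= stirling_bound
    print(f"n={n}: lg(n!) = {lg_fact:.1f} >= {stirling_bound:.1f}")
```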
Lower Bound for Comparison Sorts
• Thus the time to comparison sort n elements is Ω(n lg n)
• Corollary: Heapsort and Mergesort are asymptotically optimal comparison sorts
• But the name of this lecture is “Sorting in linear time”!
  – How can we do better than Ω(n lg n)?
Sorting In Linear Time
• Counting sort
  – No comparisons between elements!
  – But… depends on an assumption about the numbers being sorted
    ◦ We assume numbers are integers in the range 1..k
  – The algorithm:
    ◦ Input: A[1..n], where A[j] ∈ {1, 2, 3, …, k}
    ◦ Output: B[1..n], sorted (notice: not sorting in place)
    ◦ Also: array C[1..k] for auxiliary storage
Counting Sort
CountingSort(A, B, k)
    for i = 1 to k
        C[i] = 0
    for j = 1 to n
        C[A[j]] += 1
    for i = 2 to k
        C[i] = C[i] + C[i-1]
    for j = n downto 1
        B[C[A[j]]] = A[j]
        C[A[j]] -= 1
Work through example: A = {4 1 3 4 3}, k = 4
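A runnable translation of the pseudocode may help when working through the example. This is a sketch: the slides use 1-indexed arrays throughout, while Python lists are 0-indexed, so the output position is shifted by one.

```python
def counting_sort(a, k):
    """Stable counting sort of a list of integers in the range 1..k."""
    n = len(a)
    b = [0] * n                # output array B
    c = [0] * (k + 1)          # counts C[1..k]; index 0 unused
    for x in a:                # count occurrences of each key
        c[x] += 1
    for i in range(2, k + 1):  # prefix sums: C[i] = # of elements <= i
        c[i] += c[i - 1]
    for x in reversed(a):      # walk right-to-left so the sort is stable
        b[c[x] - 1] = x        # C[x] is x's 1-based final position
        c[x] -= 1
    return b

print(counting_sort([4, 1, 3, 4, 3], 4))  # → [1, 3, 3, 4, 4]
```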
Counting Sort
CountingSort(A, B, k)
    for i = 1 to k            ← takes time O(k)
        C[i] = 0
    for j = 1 to n            ← takes time O(n)
        C[A[j]] += 1
    for i = 2 to k            ← takes time O(k)
        C[i] = C[i] + C[i-1]
    for j = n downto 1        ← takes time O(n)
        B[C[A[j]]] = A[j]
        C[A[j]] -= 1
What will be the running time?
Counting Sort
• Total time: O(n + k)
  – Usually, k = O(n)
  – Thus counting sort runs in O(n) time
• But sorting is Ω(n lg n)!
  – No contradiction: this is not a comparison sort (in fact, there are no comparisons at all!)
  – Notice that this algorithm is stable
Counting Sort
• Cool! Why don’t we always use counting sort?
• Because it depends on the range k of the elements
• Could we use counting sort to sort 32-bit integers? Why or why not?
• Answer: no, k is too large (2³² = 4,294,967,296)
Counting Sort
• How did IBM get rich originally?
• Answer: punched card readers for census tabulation in the early 1900s
  – In particular, a card sorter that could sort cards into different bins
    ◦ Each column can be punched in 12 places
    ◦ Decimal digits use 10 places
  – Problem: only one column can be sorted on at a time
Radix Sort
• Intuitively, you might sort on the most significant digit, then the second-most significant, etc.
• Problem: lots of intermediate piles of cards (read: scratch arrays) to keep track of
• Key idea: sort on the least significant digit first

    RadixSort(A, d)
        for i = 1 to d
            StableSort(A) on digit i

  – Example: Fig 9.3
Radix Sort
• Can we prove it will work?
• Sketch of an inductive argument (induction on the number of passes):
  – Assume the lower-order digits {j : j < i} are sorted
  – Show that sorting on the next digit i leaves the array correctly sorted
    ◦ If two digits at position i are different, ordering the numbers by that digit is correct (lower-order digits are irrelevant)
    ◦ If they are the same, the numbers are already sorted on the lower-order digits; since we use a stable sort, they stay in the right order
Radix Sort
• What sort will we use to sort on digits?
• Counting sort is the obvious choice:
  – Sort n numbers on digits that range from 1..k
  – Time: O(n + k)
• Each pass over n numbers with d digits takes time O(n + k), so the total time is O(dn + dk)
  – When d is constant and k = O(n), this takes O(n) time
• How many bits in a computer word?
Radix Sort
• Problem: sort 1 million 64-bit numbers
  – Treat them as four-digit radix-2¹⁶ numbers
  – Can sort in just four passes with radix sort!
• Compares well with a typical O(n lg n) comparison sort
  – Requires approximately lg n = 20 operations per number being sorted
• So why would we ever use anything but radix sort?
Radix Sort
• In general, radix sort based on counting sort is
  – Fast
  – Asymptotically fast (i.e., O(n))
  – Simple to code
  – A good choice
• To think about: can radix sort be used on floating-point numbers?
The End
  • 24.