Chapter 8: Sorting
• One of the most important concepts and
  common applications in computing.
              23 78 45 8 32 56
               8 23 32 45 78 56
                                           1
     General Sort Concepts
• Internal sort: all data are held in primary
  memory during the sorting process.
• External sort: primary memory for data
  currently being sorted and secondary
  storage for data that do not fit in primary
  memory.
                                                2
    General Sort Concepts
                            Sorts
               Internal                External
                                       • Natural
                                       • Balanced
Insertion     Selection     Exchange   • Polyphas
                                         e
• Insertion   • Selection   • Bubble
• Shell       • Heap        • Quick
                                                    3
     General Sort Concepts
• Sort stability: data with equal keys maintain
  their relative input order in the output.
                78 8 45 8 32 56
                8   8 32 45 56 78
                                              4
     General Sort Concepts
• Sort efficiency: a measure of the relative
  efficiency of a sort = number of comparisons +
  number of moves
                                               5
     General Sort Concepts
• Sort pass: each traversal of the data being
  sorted.
                                                6
            Insertion Sorts
• In each pass, one or more pieces of data
  are inserted into their correct location in an
  ordered list.
                                                   7
     Straight Insertion Sort
• The list is divided into two parts: sorted and
  unsorted.
• In each pass, the first element of the unsorted
  sublist is inserted into the sorted sublist.
                        unsorte
                        d
                                               8
Straight Insertion Sort
      23 78 45 8 32 56
                          9
Straight Insertion Sort
      23 78 45 8 32 56
      23 78 45 8 32 56
                          10
Straight Insertion Sort
      23 78 45 8 32 56
      23 78 45 8 32 56
      23 45 78 8 32 56
                          11
Straight Insertion Sort
      23 78 45 8 32 56
      23 78 45 8 32 56
      23 45 78 8 32 56
      8 23 45 78 32 56
                          12
Straight Insertion Sort
      23 78 45 8 32 56
      23 78 45 8 32 56
      23 45 78 8 32 56
      8 23 45 78 32 56
      8 23 32 45 78 56
                          13
23 78 45 8 32 56
23 78 45 8 32 56
23 45 78 8 32 56
8 23 45 78 32 56
8 23 32 45 78 56
8 23 32 45 56 78
                   14
         Straight Insertion Sort
Algorithm      insertionSort (ref list <array>, val last <index>)
Sorts list[1..last] using straight insertion sort
         Pre  list contains at least one element
              last is index to last element in the list
         Post list has been rearranged
1    current = 2
2   loop (current <= last)
    1 hold = list[current]
    2 walker = current - 1
    3 loop (walker >= 1 AND hold.key < list[walker].key)
        1 list[walker + 1] = list[walker]
        2 walker = walker - 1
    4 list[walker + 1] = hold
    5 current = current + 1
3   return
End    insertionSort                                                15
                Shell Sort
• Named after its creator Donald L. Shell
  (1959).
• Given a list of N elements, the list is divided
  into K segments (K is called the increment).
• Each segment contains N/K or more
  elements.
• Segments are dispersed throughout the
  list.
                                                16
                    Shell Sort
          [1   [2   [3     [4     [5     [6   [7   [8     [9 [10]
          ]    ]    ]      ]      ]      ]    ]    ]      ]
K=3
          [1             [1 + K]          [1 +                 [1 +
          ]                               2*K]                 3*K]
Segment
1
               [2               [2 + K]          [2 +
               ]                                 2*K]
Segment
2
                    [3                 [3 + K]          [3 +
                    ]                                   2*K]
Segment
3
                                                                      17
Shell Sort
23 78 45 8 32 56
                   18
                Shell Sort
• For the value of K in each iteration, sort the
  K segments.
• After each iteration, K is reduced until it is 1
  in the final iteration.
                                                 19
                           Shell Sort
Algorithm       shellSort (ref list <array>, val last <index>)
Sorts list[1..last] using shell sort
         Pre      list must contain at least one element
                  last is index to last element in the list
         Post     list has been rearranged
1    K = last/2
2   loop (K not 0)
    1 seg = 1
    2 loop (seg <= K)
        1 sortSegment (seg)
        2 seg = seg + 1
    3 K = K/2
3   return
End     shellSort
                                                                 20
                              Shell Sort
Algorithm       shellSort (ref list <array>, val last <index>)
Sorts list[1..last] using shell sort
          Pre       list must contain at least one element
                    last is index to last element in the list
        Post        list has been rearranged
1 K = last/2
2 loop (K not 0)
   1 seg = 1
   2 loop (seg <= K)
      1 current = seg + K
      2 loop (current <= last)
          1 hold = list[current]
          2 walker = current - K
          3 loop (walker >= 1 AND hold.key < list[walker].key)
              1 list[walker + K] = list [walker]
              2 walker = walker - K
          4 list[walker + K] = hold
          5 current = current + K
      3 seg = seg + 1
   3 K = K/2
3 return
End   shellSort                                                  21
                               Shell Sort
Algorithm        shellSort (ref list <array>, val last <index>)
Sorts list[1..last] using shell sort
          Pre        list must contain at least one element
                     last is index to last element in the list
          Post       list has been rearranged
1       incre = last/2
2   loop (incre not 0)
    1 current = 1 + incre
    2 loop (current <= last)
        1 hold = list[current]
        2 walker = current - incre
        3 loop (walker >= 1 AND hold.key < list[walker].key)
            1 list[walker + incre] = list [walker]
            2 walker = walker - incre
        4 list[walker + incre] = hold
        5 current = current + 1
    3 incre = incre/2
3   return
End      shellSort
                                                                  22
    Insertion Sort Efficiency
• Straight insertion sort:
         f(n) = n(n + 1)/2   = O(n2)
                                       23
    Insertion Sort Efficiency
• Shell sort:
            O(n1.25)   Empirical study
                                         24
    General Sort Concepts
                            Sorts
               Internal                External
                                       • Natural
                                       • Balanced
Insertion     Selection     Exchange   • Polyphas
                                         e
• Insertion   • Selection   • Bubble
• Shell       • Heap        • Quick
                                                    25
           Selection Sorts
• In each pass, the smallest/largest item is
  selected and placed in a sorted list.
                                               26
     Straight Selection Sort
• The list is divided into two parts: sorted and
  unsorted.
• In each pass, in the unsorted sublist, the
  smallest element is selected and
  exchanged with the first element.
                        unsorte
                        d
                                               27
Straight Selection Sort
      23 78 45 8 32 56
                          28
Straight Selection Sort
     23 78 45 8 32 56
      8 78 45 23 32 56
                          29
Straight Selection Sort
     23 78 45 8 32 56
      8 78 45 23 32 56
      8 23 45 78 32 56
                          30
Straight Selection Sort
     23 78 45 8 32 56
      8 78 45 23 32 56
      8 23 45 78 32 56
      8 23 32 78 45 56
                          31
Straight Selection Sort
     23 78 45 8 32 56
      8 78 45 23 32 56
      8 23 45 78 32 56
      8 23 32 78 45 56
      8 23 32 45 78 56
                          32
23 78 45 8 32 56
8 78 45 23 32 56
8 23 45 78 32 56
8 23 32 78 45 56
8 23 32 45 78 56
8 23 32 45 56 78
                   33
         Straight Selection Sort
Algorithm       selectionSort (ref list <array>, val last <index>)
Sorts list[1..last] using straight selection sort
          Pre  list contains at least one element
               last is index to last element in the list
          Post list has been rearranged
1       current = 1
2   loop (current < last)
    1 smallest = current
    2 walker = current + 1
    3 loop (walker <= last)
        1 if (list[walker] < list[smallest])
            1 smallest = walker
        2 walker = walker + 1
    4 exchange (list, current, smallest)
    5 current = current + 1
3   return
End     selectionSort
                                                                     34
               Heap Sort
• The unsorted sublist is organized into a
  heap.
• In each pass, in the unsorted sublist, the
  largest element is selected and exchanged
  with the last element.
  Then the heap is reheaped.
                   heap
                                               35
Heap Sort
23 78 45 8 32 56
               23
     78
               [0        45
               ]
     [1                  [2
8         32        56
     ]                   ]
[3        [4        [5
]         ]         ]
                              36
                         Heap Sort
23 78 45 8 32 56                      78 32 56 8 23 45
                              build
               23             heap                   78
     78
               [0        45                32
                                                     [0        56
               ]                                     ]
     [1                  [2                [1                  [2
8         32        56                8         23        45
     ]                   ]                 ]                   ]
[3        [4        [5                [3        [4        [5
]         ]         ]                 ]         ]         ]
                                                                    37
                         Heap Sort
78 32 56 8 23 45                 45 32 56 8 23 78
               78                               45
               [0                     32
                                                [0        56
     32                  56
               ]                                ]
     [1                  [2           [1                  [2
8         23        45           8         23        78
     ]                   ]            ]                   ]
[3        [4        [5           [3        [4        [5
]         ]         ]            ]         ]         ]
                                                               38
                            Heap Sort
Algorithm       heapSort (ref heap <array>, val last <index>)
Sorts list[0..last] using heap sort
          Pre  array is filled
               last is index to last element in the list
          Post array has been sorted
   Creat a heap
1     walker = 1
2 loop (walker <= last)
   1 reheapUp (heap, walker)
   2 walker = walker + 1
   Sort the list
3 sorted = last
4 loop (sorted > 0)
   1 exchange (heap, 0, sorted)
   2 sorted = sorted - 1
   3 reheapDown (heap, 0, sorted)
5 return
End    heapSort
                                                                39
    Selection Sort Efficiency
• Straight selection sort:
                     O(n2)
                                40
   Selection Sort Efficiency
• Heap sort:
         O(n log2n)
                               41
       General Sort Concepts
                            Sorts
               Internal                External
                                       • Natural
                                       • Balanced
Insertion     Selection     Exchange   • Polyphas
                                         e
• Insertion   • Selection   • Bubble
• Shell       • Heap        • Quick
                                                    42
          Exchange Sorts
• In each pass, elements that are out of order
  are exchanged, until the entire list is
  sorted.
• Exchange is extensively used.
                                             43
              Bubble Sort
• The list is divided into two parts: sorted and
  unsorted.
• In each pass, the smallest element is
  bubbled from the unsorted sublist and
  moved to the sorted sublist.
                        unsorte
                        d
                                               44
Bubble Sort
23 78 45 8 56 32
                   45
Bubble Sort
23 78 45 8 56 32
23 78 45 8 32 56
23 78 45 8 32 56
23 78 8 45 32 56
23 8 78 45 32 56
8 23 78 45 32 56
                   46
                          Bubble Sort
Algorithm       bubbleSort (ref list <array>, val last <index>)
Sorts list[1..last] using bubble sort
          Pre  list must contain at least one element
               last is index to last element in the list
          Post list has been rearranged
1       current = 1
2   sorted = false
3   loop (current <= last AND sorted false)
    1 walker = last
    2 sorted = true
    3 loop (walker > current)
        1 if (list[walker] < list[walker - 1])
            1 sorted = false
            2 exchange (list, walker, walker - 1)
        2 walker = walker -1
    4    current = current + 1
4   return
End     bubbleSort
                                                                  47
              Quick Sort
• Developed by C. A. Hoare (1962).
• In each pass, a pivot element is selected and
  the list is divided into three groups:
              < pivot, = pivot, > pivot
 Quick sort is continued for the first and third
 groups.            pivo
                    t
                        k
              <k             >k
                                              48
                 Quick Sort
• Pivot selection:
  – C. A. Hoare (1962): the first element in the list.
  – R. C. Singleton (1969): the median of the left,
    right and middle elements of the list.
• Pivot location:
  – Use left and right walls.
  – Exchange the two elements at the left and right
    wall positions if they are out of order with
    respect to the pivot.
                                                         49
       Quick Sort
pivo
t
        pivo
        t
                    50
       Quick Sort
78 21 14 97 87 62 74 85 76 45 84 22
                                      51
78 21 14 97 87 62 74 85 76 45 84 22
78 21 14 97 87 62 74 85 76 45 84 22
78 21 14 22 87 62 74 85 76 45 84 97
78 21 14 22 87 62 74 85 76 45 84 97
78 21 14 22 45 62 74 85 76 87 84 97
78 21 14 22 45 62 74 85 76 87 84 97
78 21 14 22 45 62 74 76 85 87 84 97   52
              Quick Sort
• D.E. Knuth suggested that when the sort
  partitions becomes small, straight insertion
  sort should be used to complete the sorting.
                   pivo
                   t
                     k
             <k            >k
                                            53
                             Quick Sort
Algorithm       quickSort (ref list <array>, val left <index>, val right <index>)
Sorts list[left..right] using quick sort
          Pre  list must contain at least one element
               left and right are indices to first and last elements in the list
          Post list has been rearranged
1       if (right - left > minsize)
        quick sort
    1   medianLeft (list, left, right)
    2   pivot = list[left]
    3   sortLeft = left + 1
    4   sortRight = right
    5   loop (sortLeft <= sortRight)
        Find key on left that belongs on right
        1 loop (list[sortLeft].key < pivot.key)
              1 sortLeft = sortLeft + 1
        Find key on right that belongs on left
        1 loop (list[sortRight].key >= pivot.key)
              1 sortRight = sortRight - 1
                                                                                    54
                          Quick Sort
        Find key on left that belongs on right
        1 loop (list[sortLeft].key < pivot.key)
            1 sortLeft = sortLeft + 1
        Find key on right that belongs on left
        2 loop (list[sortRight].key >= pivot.key)
            1 sortRight = sortRight - 1
        3 if (sortLeft <= sortRight)
            1 exchange (list, sortLeft, sortRight)
            2 sortLeft = sortLeft + 1
            3 sortRight = sortRight - 1
        Prepare for next phase
    6 list[left] = list[sortLeft - 1]
    7 list[sortLeft - 1] = pivot
    8 if (left < sortRight)
        1 quickSort (list, left, sortRight - 1)
    9 if (sortLeft < right)
        1 quickSort (list, sortLeft, right )
2   else
        1 insertionSort (list, left, right)
End    quickSort                                     55
                           Quick Sort
Algorithm      medianLeft (ref sortData <array>, val left <index>, val right
<index>)
Finds the median of an array sortData[left..right], and places it in the location
sortData[left]
         Pre  sortData is an array of at least three elements
              left and right are the boundaries of the array
         Post median value located and placed at sortData[left]
1   mid = (left + right)/2
2   if (sortData[left].key > sortData[mid].key)
    1 exchange (sortData, left, mid)
3   if (sortData[left].key > sortData[right].key)
    1 exchange (sortData, left, right)
4   if (sortData[mid].key > sortData[right].key)
    1 exchange (sortData, mid, right)
5   exchange (sortData, left, mid)
6   return
End      medianLeft
                                                                                    56
   Exchange Sort Efficiency
• Bubble sort:
        f(n) = n(n + 1)/2   = O(n2)
                                      57
   Exchange Sort Efficiency
• Quick sort:
           O(n log2n)
                              58
    General Sort Concepts
                            Sorts
               Internal                External
                                       • Natural
                                       • Balanced
Insertion     Selection     Exchange   • Polyphas
                                         e
• Insertion   • Selection   • Bubble
• Shell       • Heap        • Quick
                                                    59
            External Sorts
• Sorts that allow portions of the data to be
  stored in secondary memory during the
  sorting process.
• Most of the work spent ordering files is not
  sorting but merging.
                                                 60
Merging Ordered Files
          1
    1     2    2
    3     3    4
    5     4    6
          5    8
          6    10
          8
         10
                        61
          Merging Ordered Files
Algorithm      mergeFiles
Merges two files into one file
         Pre    Input files are ordered
         Post Input files sequentially combined in output file
1       open files
2   read (file1 into record1)
3   read (file2 into record2)
4   loop (not end file1 OR not end file2)
    1 if (record1.key <= record2.key)
        1 write (file3 from record1)
        2 read (file1 into record1)
        3 if (end of file1)
             1 record1.key = +∝
    2 else
        1 write (file3 from record2)
        2 read (file2 into record2)
        3 if (end of file2)
             1 record2.key = +∝
5   close files
End      mergeFiles                                              62
       File Sorting Process
• Sort phase.
• Merge phase.
                              63
                 Sort Phase
                     2,300 records
Input
file
                         Sort
records 1-500   1,001-1,500     2,001-2,300
Merge 1
                               501-1,000      1,501-2,000
                              Merge 2                       64
           Merge Phase
• Natural merge
• Balanced merge
• Polyphase merge
                         65
natural two-way                input
merge
    records 1-500           sort phase               records 501-1000
                      mrg                    mrg
    records 1,001-                            2      records 1,501-
                       1
    1,500                     merge                  2000
    records 2.001-
    2,300                                  records 1-1000
                               mrg
                                3          records 1,001-
                                           2000
                                           records 2.001-
                            distribution
    records 1-1000                         2,300
                                              mrg    records 1,001-
                      mrg
    records 2,001-     1                       2     2,000
                              merge
    2,300
                               mrg         records 1-2000
                                3          records 2,001-
                                           2,300
                            distribution
                      mrg                    mrg
    records 1-2,000                           2      records 2,001-
                       1                             2,300
                              merge
                               mrg
                                3          records 1-2,300              66
balanced two-way
merge
                              input
   records   1-500          sort phase            records 501-1000
                      mrg                  mrg
   records   1,001-                         2     records 1,501-
                       1
   1,500                     merge                2000
   records   2.001-
   records   1-1000   mrg                  mrg    records 1,001-
   2,300                                    2
   records   2,001-    1                          2,000
                             merge
   2,300
                      mrg                  mrg
   records 1-2,000                          2     records 2,001-
                       1                          2,300
                             merge
                               mrg
                                3        records 1-2,300
                                                                     67
polyphase merge              input
   records 1-500           sort phase             records 501-1000
                     mrg                  mrg
   records 1,001-                          2      records 1,501-
                      1
   1,500                    merge                 2000
   records 2.001-
   2,300                                records 1-1000
                              mrg
                               3        records 1,001-
                                        2000
                     mrg                  mrg     records 1-1,000
   records 2,001-     1                    2             2,001-
   2,300                    merge
                                                  2,300
                              mrg       records 1-1000
                               3        records 1,001-
                                        2,000
                     mrg                  mrg     records 1-1,000
   records 1-2,300    1                    2             2,001-
                            merge
                                                  2,300
                              mrg
                               3        records 1,001-               68
                                        2,000