Popular sorting algorithms
Martin Talamanoski Kris Petrevski Ustijana R. Shikoska
CSE - 948 ICSBM – 895 Professor and Assistant
Abstract: Here we want to introduce some of sorting algorithms. So, we begin with Bubble sort
and will continue with Selection Sort, Insertion Sort, Merge Sort and Quick Sort. These are the
most popular sorting algorithms. Then we introduce stacks and basic operations such as Pop,
Push and Peek. All of these algorithms are perfectly described during the paper.
Key word: Bubble sort Selection sort Insertion sort Shell sort Merge sort Heapsort,
Quicksort Bucket sort . Pop operation . Push operation . Peek operation
I. INTRODUCTION
In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order.
The most-used orders are numerical order and lexicographical order. Efficient sorting is important for
optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to
work correctly; it is also often useful for canonicalizing data and for producing human-readable output.
More formally, the output must satisfy two conditions:
1. The output is in nondecreasing order (each element is no smaller than the previous element
according to the desired total order);
2. The output is a permutation, or reordering, of the input.
Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due
to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort
was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting
algorithms are still being invented (for example, library sort was first published in 2004). Sorting
algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for
the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation,
divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case
analysis, time-space tradeoffs, and lower bounds.
In computer science, a stack is an abstract data type that serves as a collection of elements, with two
principal operations:
1. push, which adds an element to the collection, and
2. pop, which removes the most recently added element that was not yet removed.
The order in which elements come off a stack gives rise to its alternative name, LIFO (last in, first out).
Additionally, a peek operation may give access to the top without modifying the stack.[1] The name "stack"
for this type of structure comes from the analogy to a set of physical items stacked on top of each other,
which makes it easy to take an item off the top of the stack, while getting to an item deeper in the stack may
require taking off multiple other items first.[2]
Considered as a linear data structure, or more abstractly a sequential collection, the push and pop operations
occur only at one end of the structure, referred to as the top of the stack. This makes it possible to implement
a stack as a singly linked list and a pointer to the top element. A stack may be implemented to have a
bounded capacity. If the stack is full and does not contain enough space to accept an entity to be pushed, the
stack is then considered to be in an overflow state. The pop operation removes an item from the top of the
stack.
1
II. POPULAR SORTING ALGORITHMS
In this section we are going to list some of the most popular sorting algorithms and describe them.
They are: Bubble Sort, Selection Sort, Insertion Sort, Merge Sort and Quick Sort
Bubble sort
Bubble sort, also known as sinking sort, is a simple sorting algorithm that works by repeatedly
stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they
are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates
that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of
the list. Because it only uses comparisons to operate on elements, it is a comparison sort. The equally
simple insertion sort has better performance than bubble sort, so some have suggested no longer teaching
the bubble sort [2] [3].
Bubble sort has worst-case and average complexity both О(n2), where n is the number of items being
sorted. There exist many sorting algorithms with substantially better worst-case or average complexity of
O(n log n). Even other О(n2) sorting algorithms, such as insertion sort, tend to have better performance
than bubble sort. Therefore, bubble sort is not a practical sorting algorithm when n is large.
The only significant advantage that bubble sort has over most other implementations, even quicksort,
but not insertion sort, is that the ability to detect that the list is sorted is efficiently built into the algorithm.
Performance of bubble sort over an already-sorted list (best-case) is O(n). By contrast, most other
algorithms, even those with better average-case complexity, perform their entire sorting process on the set
and thus are more complex. However, not only does insertion sort have this mechanism too, but it also
performs better on a list that is substantially sorted (having a small number of inversions).
The positions of the elements in bubble sort will play a large part in determining its performance.
Large elements at the beginning of the list do not pose a problem, as they are quickly swapped. Small
elements towards the end, however, move to the beginning extremely slowly. This has led to these types
of elements being named rabbits and turtles, respectively.
Various efforts have been made to eliminate turtles to improve upon the speed of bubble sort.
Cocktail sort achieves this goal fairly well, but it retains O(n2) worst-case complexity. Comb sort
compares elements separated by large gaps, and can move turtles extremely quickly before proceeding to
smaller and smaller gaps to smooth out the list. Its average speed is comparable to faster algorithms like
quicksort. The algorithm can be expressed as:
procedure bubbleSort( A : list of sortable items )
do
swapped = false
for each i in 1 to length(A) - 1 inclusive do:
if A[i-1] > A[i] then
swap( A[i-1], A[i] )
swapped = true
end if
end for
while swapped
end procedure
2
Selection sort
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n2)
complexity, making it inefficient on large lists, and generally performs worse than the similar insertion
sort. Selection sort is noted for its simplicity, and also has performance advantages over more
complicated algorithms in certain situations.
Selection sort is not difficult to analyze compared to other sorting algorithms since none of the loops
depend on the data in the array. Selecting the lowest element requires scanning all n elements (this takes n
− 1 comparisons) and then swapping it into the first position. Finding the next lowest element requires
scanning the remaining n − 1 elements and so on, for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1) / 2 ∈θ(n2)
comparisons. Each of these scans requires one swap for n − 1 elements (the final element is already in
place).
Selection Sort's philosophy most closely matches human intuition: It finds the largest element and
puts it in its place. Then it finds the next largest and places it and so on until the array is sorted. To put an
element in its place, it trades positions with the element in that location (this is called a swap). As a result,
the array will have a section that is sorted growing from the end of the array and the rest of the array will
remain unsorted [4].
Algorithm is as follow [5]:
for i ← 1 to n-1 do
min j ← i;
min x ← A[i]
for j ← i + 1 to n do
If A[j] < min x then
min j ← j
min x ← A[j]
A[min j] ← A [i]
A[i] ← min x
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array (or list) is built
one entry at a time. It is much less efficient on large lists than more advanced algorithms such as
quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:
Simple implementation
Efficient for (quite) small data sets
Adaptive, i.e. efficient for data sets that are already substantially sorted: the time complexity is O(n +
d), where d is the number of inversions
More efficient in practice than most other simple quadratic, i.e. O(n2) algorithms such as selection
sort or bubble sort; the best case (nearly sorted input) is O(n)
Stable, i.e. does not change the relative order of elements with equal keys
In-place, i.e. only requires a constant amount O(1) of additional memory space
Online, i.e. can sort a list as it receives it
Most humans when sorting—ordering a deck of cards, for example—use a method that is similar to
insertion sort.[6]
Every repetition of insertion sort removes an element from the input data, inserting it into the correct
position in the already-sorted list, until no input elements remain. The choice of which element to remove
from the input is arbitrary, and can be made using almost any choice algorithm.
Sorting is typically done in-place. The resulting array after k iterations has the property where the
first k + 1 entries are sorted. In each iteration the first remaining entry of the input is removed, inserted
into the result at the correct position, thus extending the result:
Becomes
3
with each element greater than x copied to the right as it is compared against x.
The most common variant of insertion sort, which operates on arrays, can be described as follow
1. Suppose that there exists a function called Insert designed to insert a value into a sorted sequence
at the beginning of an array. It operates by beginning at the end of the sequence and shifting each
element one place to the right until a suitable position is found for the new element. The function
has the side effect of overwriting the value stored immediately after the sorted sequence in the
array.
2. To perform an insertion sort, begin at the left-most element of the array and invoke Insert to
insert each element encountered into its correct position. The ordered sequence into which the
element is inserted is stored at the beginning of the array in the set of indices already examined.
Each insertion overwrites a single value: the value being inserted.
Below is the pseudocode for insertion sort for a zero-based array:
for j ←1 to length(A)-1
key ← A[ j ]
> A[ j ] is added in the sorted sequence A[1, .. j-1]
i ← j - 1
while i >= 0 and A [ i ] > key
A[ i +1 ] ← A[ i ]
i ← i -1
A [i +1] ← key
Merge sort
Merge sort is an O(n log n) comparison-based sorting algorithm. Most implementations produce a
stable sort, meaning that the implementation preserves the input order of equal elements in the sorted
output. It is a divide and conquer algorithm. Merge sort was invented by John von Neumann in 1945 [10].
In sorting n objects, merge sort has an average and worst-case performance of O(n log n). If the
running time of merge sort for a list of length n is T(n), then the recurrence T(n) = 2T(n/2) + n follows
from the definition of the algorithm (apply the algorithm to two lists of half the size of the original list,
and add the n steps taken to merge the resulting two lists). The closed form follows from the master
theorem.
In the worst case, merge sort does an amount of comparisons equal to or slightly smaller than
(n ⌈lg n⌉ - 2⌈lg n⌉ + 1), which is between (n lg n - n + 1) and (n lg n + n + O(lg n)) [11].
For large n and a randomly ordered input list, merge sort's expected (average) number of
comparisons approaches α·n fewer than the worst case where
The worst case, merge sort does about 39% fewer comparisons than quicksort does in the average
case; merge sort always makes fewer comparisons than quicksort, except in extremely rare cases, when
they tie, where merge sort's worst case is found simultaneously with quicksort's best case. In terms of
moves, merge sort's worst case complexity is O(n log n)—the same complexity as quicksort's best case,
and merge sort's best case takes about half as many iterations as the worst case.
Recursive implementations of merge sort make 2n − 1 method calls in the worst case, compared to
quicksort's n, thus merge sort has roughly twice as much recursive overhead as quicksort. However,
iterative, non-recursive implementations of merge sort, avoiding method call overhead, are not difficult to
code. Merge sort's most common implementation does not sort in place; therefore, the memory size of the
input must be allocated for the sorted output to be stored in.
Merge sort as described here also has an often overlooked, but practically important, best-case
property. If the input is already sorted, its complexity falls to O(n). Specifically, n-1 comparisons and
zero moves are performed, which is the same as for simply running through the input, checking if it is pre-
sorted.
4
Sorting in-place is possible (e.g., using lists rather than arrays) but is very complicated, and will offer
little performance gains in practice, even if the algorithm runs in O(n log n) time. (Katajainen, Pasanen &
Teuhola 1996) In these cases, algorithms like heapsort usually offer comparable speed, and are far less
complex. Additionally, unlike the standard merge sort, in-place merge sort is not a stable sort. In the case
of linked lists the algorithm does not use more space than that the already used by the list representation,
but the O(log(k)) used for the recursion trace.
Merge sort is more efficient than quick sort for some types of lists if the data to be sorted can only be
efficiently accessed sequentially, and is thus popular in languages such as Lisp, where sequentially
accessed data structures are very common. Unlike some (efficient) implementations of quicksort, merge
sort is a stable sort as long as the merge operation is implemented properly.
As can be seen from the procedure merge sort, there are some demerits. One complaint we might
raise is its use of 2n locations; the additional n locations were needed because one couldn't reasonably
merge two sorted sets in place. But despite the use of this space the algorithm must still work hard: The
contents of m are first copied into left and right and later into the list result on each invocation of
merge_sort (variable names according to the pseudocode above). An alternative to this copying is to
associate a new field of information with each key (the elements in m are called keys). This field will be
used to link the keys and any associated information together in a sorted list (a key and its related
information is called a record). Then the merging of the sorted lists proceeds by changing the link values;
no records need to be moved at all. A field which contains only a link will generally be smaller than an
entire record so less space will also be used.
Another alternative for reducing the space overhead to n/2 is to maintain left and right as a combined
structure, copy only the left part of m into temporary space, and to direct the merge routine to place the
merged output into m. With this version it is better to allocate the temporary space outside the merge
routine, so that only one allocation is needed. The excessive copying mentioned in the previous paragraph
is also mitigated, since the last pair of lines before the return result statement (function merge in the
pseudo code above) become superfluous.
function merge_sort(m)
if length(m) ≤ 1
return m
var list left, right, result
var integer middle = length(m) / 2
for each x in m up to middle
add x to left
for each x in m after middle
add x to right
left = merge_sort(left)
right = merge_sort(right)
result = merge(left, right)
return result
Quicksort
Quicksort is a sorting algorithm developed by C. A. R. Hoare that, on average, makes O(nlogn) (big
O notation) comparisons to sort n items. In the worst case, it makes O(n2) comparisons, though if
implemented correctly this behavior is rare. Typically, quicksort is significantly faster in practice than
other O(nlogn) algorithms, because its inner loop can be efficiently implemented on most architectures,
and in most real-world data it is possible to make design choices that minimize the probability of
requiring quadratic time. Additionally, quicksort tends to make excellent usage of the memory hierarchy,
taking perfect advantage of virtual memory and available caches. Although quicksort is usually not
implemented as an in-place sort, it is possible to create such an implementation.[12]
Quicksort (also known as "partition-exchange sort") is a comparison sort and, in efficient
implementations, is not a stable sort.
Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists.The
steps are:
1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements with values less than the pivot come before the pivot, while
all elements with values greater than the pivot come after it (equal values can go either way).
5
After this partitioning, the pivot is in its final position. This is called the partition operation
3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
The bases cases of the recursion are lists of size zero or one, which never need to be sorted.
Algorithm is listed bellow:
function quicksort(array)
var list less, greater
if length(array) ≤ 1
return array // an array of zero or one elements is already sorted
select and remove a pivot value pivot from array
for each x in array
if x ≤ pivot then append x to less
else append x to greater
return concatenate(quicksort(less), pivot, quicksort(greater))
III. LUSION
IV. CONCLUSION
In this paper, we got into sorting problem and investigated different solutions. We talked about the
most popular algorithms that are useful for sorting lists. They are: Bubble sort, Selection sort, Insertion
sort, Shell sort, Merge sort, Heapsort, Quicksort and Bucket sort. Algorithms were represented with perfect
descriptions. Also, it was tried to indicate the computational complexity of them in the worst, middle and
best cases. At the end, implementation code was placed.
REFERENCES
[1] Wikipedia. Address: http://www.wikipedia.com
[2] Owen Astrachan. Bubble Sort: An Archaeological Algorithmic Analysis. SIGCSE 2003 Hannan Akhtar. Available at:
http://www.cs.duke.edu/~ola/papers/bubble.pdf.
[3] Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998.
ISBN 0-201-89685-0. Pages 106–110 of section 5.2.2: Sorting by Exchanging.
[4] Available at: http://webspace.ship.edu/cawell/Sorting/selintro.htm
[5] Available at: http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/selectionSort.htm
6
[6] Robert Sedgewick, Algorithms, Addison-Wesley 1983 (chapter 8 p. 95)
[7] Sedgewick, Robert (1998). Algorithms in C. Addison Wesley. pp. 273–279.
[8] Weiss, Mark Allen (1997). Data Structures and Algorithm Analysis in C. Addison Wesley Longman. pp. 222–226.
[9] Pratt, V (1979). Shellsort and sorting networks (Outstanding dissertations in the computer sciences). Garland. ISBN 0-824-
04406-1. (This was originally presented as the author's Ph.D. thesis, Stanford University, 1971)
[10] Merge Sort - Wolfram MathWorld, available at: http://mathworld.wolfram.com/MergeSort.html
[11] The worst case number given here does not agree with that given in Knuth's Art of Computer Programming, Vol 3. The
discrepancy is due to Knuth analyzing a variant implementation of merge sort that is slightly sub-optimal.
[12] R. Sedgewick, Implementing quicksort programs, Comm. ACM, 21(10):847-857, 1978. Implementing Quicksort Programs,
available:
http://delivery.acm.org/10.1145/360000/359631/p847-
sedgewick.pdf?key1=359631&key2=9191985921&coll=DL&dl=ACM&CFID=6618157&CFTOKEN=73435998