Unit III
Searching and Sorting
PROF . SHITAL R. BEDSE
MET,IOE,NASHIK
Contents
⚫Searching: Search Techniques-Sequential Search/Linear
Search, Variant of Sequential Search- Sentinel Search, Binary
Search, Fibonacci Search, and Indexed Sequential Search.
⚫Sorting: Types of Sorting-Internal and External Sorting,
General Sort Concepts-Sort Order, Stability, efficiency, and
Number of Passes,
⚫Comparison Based Sorting Methods-Bubble Sort, Insertion
Sort, Selection Sort, Quick Sort, Shell Sort,
⚫Non-comparison Based Sorting Methods-Radix Sort,
Counting Sort, and Bucket Sort,
⚫Comparison of All Sorting Methods and their complexities.
Searching
⚫Searching is the process of finding the location of the
target among a list of objects. The two basic
⚫search techniques are the following:
⚫1. Sequential search
⚫2. Binary search
⚫There are certain ways of organizing data, which make
the search process more efficient. If the data is kept in a
proper order, it is much easier to search.
Searching
⮚ External searching : secondary storage (hard disk)
⮚ Internal searching : primary storage (main memory)
.This is faster than external searching.
⮚ It accepts two arguments as parameters—
• a target value to be searched and
• the list to be searched.
Searching Techniques
1. Sequential search
2. Binary search
3. Fibonacci search
4. Index sequential search
5. Hashed search
The performance of a searching algorithm can be
computed by counting the number of comparisons to
find a given value.
Linear Search
⚫Linear Search is defined as a sequential search
algorithm that starts at one end and goes through each
element of a list until the desired element is found,
otherwise the search continues till the end of the data set.
How does Linear Search Algorithm Work?
In Linear Search Algorithm,
⚫Every element is considered as a potential match for the
key and checked for the same.
⚫If any element is found equal to the key, the search is
successful and the index of that element is returned.
⚫If no element is found equal to the key, the search yields
“No match found”.
Linear Search Algorithm
Step 1 - Read the search element from the user.
Step 2 - Compare the search element with the first element in the
list.
Step 3 - If both are matched, then display "Given element is
found!!!" and terminate the function
Step 4 - If both are not matched, then compare search element
with the next element in the list.
Step 5 - Repeat steps 3 and 4 until search element is compared
with last element in the list.
Step 6 - If last element in the list also doesn't match, then display
"Element is not found!!!" and terminate the function.
Pseudocode
int SeqSearch (int A[max], int key, int n)
{
int i, flag = 0, position;
for(i = 0; i < n; i++)
{
if(key == A[i])
{
position = i;
flag = 1;
break;
}
}
if(flag == 1) // if found return position
return(position);
else // return −1 if not found
return(−1);
}
Properties of Linear Search:
⚫Time Complexity: The worst case time complexity of
linear search is O(n), where n is the size of the array.
⚫Space Complexity: Linear search requires a constant
amount of memory to run.
⚫Efficiency: Linear search is efficient for small datasets
but becomes inefficient for larger datasets. In practice,
linear search is often used as a subroutine in more
complex algorithms.
⚫Implementation: Linear search can be easily
implemented using a loop, with each iteration comparing
the target value to the current element of the array.
Application of Linear Search:
⚫Phonebook Search: Linear search can be used to search
through a phonebook to find a person’s name, given their
phone number.
⚫Spell Checkers: The algorithm compares each word in
the document to a dictionary of correctly spelled words
until a match is found.
⚫Finding Minimum and Maximum Values: Linear
search can be used to find the minimum and maximum
values in an array or list.
⚫Searching through unsorted data: Linear search is
useful for searching through unsorted data.
Advantages of Linear Search:
⚫Simplicity: Linear search is a very simple algorithm to
understand and implement.
⚫Works with unsorted data: Linear search works well
with unsorted data. It does not require any pre-
processing or sorting of the data before performing the
search.
⚫Low memory usage: Linear search only requires a
small amount of memory to store the index of the current
element being searched.
⚫Easy to debug: Because linear search is a simple
algorithm, it is easy to debug and troubleshoot any issues
that may arise.
Disadvantages of Linear Search:
⚫Inefficient for large datasets: As the size of the dataset
grows, the time taken by linear search also increases
proportionally.
⚫Limited applicability: Linear search is only suitable for
datasets that are not too large or not too complex.
⚫No early termination: Linear search does not have a
mechanism to terminate early once the target element is found.
⚫Inefficient for sorted data: When the data is already sorted,
linear search is not efficient because it needs to check each
element one by one, even if it has already passed the target
element.
Time Complexity
Best Case Complexity
⚫ The element being searched could be found in the first position.
⚫ In this case, the search ends with a single successful comparison.
⚫ Thus, in the best-case scenario, the linear search algorithm performs O(1)
operations.
Worst Case Complexity
⚫ The element being searched may be at the last position in the array or not at
all.
⚫ In the first case, the search succeeds in ‘n’ comparisons.
⚫ In the next case, the search fails after ‘n’ comparisons.
⚫ Thus, in the worst-case scenario, the linear search algorithm performs O(n)
operations.
Average Case Complexity
⚫ When the element to be searched is in the middle of the array, the average case
of the Linear Search Algorithm is O(n).
Space Complexity
⚫The linear search algorithm takes up no extra space;
its space complexity is O(n) for an array of n elements.
Basic Questions
Drawbacks of Linear Search:
● Linear search has a time complexity of O(N), which in turn
makes it slow for large datasets.
● Not suitable for large arrays.
When to use Linear Search?
● When we are dealing with a small dataset.
● When you are searching for a dataset stored in contiguous
memory.
Variations on Sequential Search
⚫Three useful variations on the sequential search
algorithm are:
(I) Sentinel search,
(II) Probability search, and
(III) Ordered list search.
Sentinel Search
Sentinel Linear Search as the name suggests is a type of Linear Search where the number
of comparisons is reduced as compared to a traditional linear search. In a traditional linear
search, only N comparisons are made, and in a Sentinel Linear Search, the sentinel value is
used to avoid any out-of-bounds comparisons, but there is no additional comparison made
specifically for the index of the element being searched.
IIn this search, the last element of the array is replaced with the element to be searched
and then the linear search is performed on the array without checking whether the current
index is inside the index range of the array or not because the element to be searched will
definitely be found inside the array even if it was not present in the original array since the
last element got replaced with it. So, the index to be checked will never be out of the
bounds of the array. The number of comparisons in the worst case there will be (N + 2).
Sentinel Search
The basic idea of Sentinel Linear Search is to add an extra element at the end of
the array (i.e., the sentinel value) that matches the search key. By doing so, we
can avoid the conditional check for the end of the array in the loop and terminate
the search early, as soon as we find the sentinel element. This eliminates the
need for a separate check for the end of the array, resulting in a slight
improvement in the average case performance of the algorithm.
Sentinel Search Algorithm
● Initialize the search index variable i to 0.
● Set the last element of the array to the search key.
● While the search key is not equal to the current element of the array
(i.e., arr[i]), increment the search index i.
● If i is less than the size of the array or arr[i] is equal to the search key,
return the value of i (i.e., the index of the search key in the array).
● Otherwise, the search key is not present in the array, so return -1 (or
any other appropriate value to indicate that the key is not found).
.
def sentinelLinearSearch(array, key):
last = array[-1] ------------------------------------------------------------1
array[-1] = key ------------------------------------------------------------ 1
i=0 ------------------------------------------------------
------1
while array[i] != key:------------------------------------------------------------n
i += 1 ------------------------------------------------------
------n-1
array[len(array) - 1] = last--------------------------------------------------1
if i < len(array) - 1 or last == key:----------------------------------------1
return i ------------------------------------------------------
------1
else:
return -1 ------------------------------------------------------------1
array = [1, 2, 3, 4, 5, 6, 7, 8, 9]
key = 8
index = sentinelLinearSearch(array, key)
if index == -1:
print(f"{key} is not found in the array: {array}")
else:
print(f"{key} is found at index {index+1} in the array: {array}")
Time Complexity
The time complexity of the Sentinel Linear Search algorithm is
O(n) in the worst case.
In the best case, when the key is found in the first iteration, the
time complexity will be O(1).
However, the average time complexity is still O(n), because on
average, the key will be found after
Is Sentinel Linear Search better than normal
Linear Search?
The number of comparisons is reduced in this search as compared to a
traditional linear search. When a linear search is performed on an array of
size N then in the worst case a total of N comparisons are made when the
element to be searched is compared to all the elements of the array and (N +
1) comparisons are made for the index of the element to be compared so
that the index is not out of bounds of the array which can be reduced in a
Sentinel Linear Search. So total comparisons in the worst case can be 2*N +
1.
But in this search, the last element of the array is replaced with the element
to be searched and then the linear search is performed on the array without
checking whether the current index is inside the index range of the array or
not because the element to be searched will definitely be found inside the
array even if it was not present in the original array. So, the index to be
checked will never be out of the bounds of the array. The number of
comparisons in the worst case there will be (N + 2).
Probability search
In probability search, the elements that are more
probable are placed at the beginning of the array and
those that are less probable are placed at the end of the
array.
Ordered list search
When elements are ordered, binary search is preferred.
However, when data is ordered and is of smaller size,
sequential search with a small change is preferred to binary
search.
When the data is ordered but stored in a data structure such
as a linked list, modified sequential search is preferred.
While searching an ordered list, we need not continue the
search till the end of list to know that the target element is not
in the list.
While searching in an ascending ordered list, whenever an
element that is greater than or equal to the target is
encountered, the search stops. We can also add a sentinel to
avoid the end of list test.
Binary Search
Binary Search is defined as a searching algorithm used
in a sorted array by repeatedly dividing the search
interval in half. The idea of binary search is to use the
information that the array is sorted and reduce the
time complexity to O(log N).
Conditions for when to apply Binary Search
in a Data Structure:
To apply Binary Search algorithm:
● The data structure must be sorted.
● Access to any element of the data structure takes
constant time.
Binary Search Algorithm:
● Divide the search space into two halves by finding the middle index “mid”.
● Compare the middle element of the search space with the key.
● If the key is found at middle element, the process is terminated.
● If the key is not found at middle element, choose which half will be used as the next search
space.
● If the key is smaller than the middle element, then the left side is used for next search.
● If the key is larger than the middle element, then the right side is used for next search.
● This process is continued until the key is found or the total search space is exhausted.
Example
Consider an array arr[] = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91}, and the target = 23.
First Step: Calculate the mid and compare the mid element with the key. If the key is less
than mid element, move to left and if it is greater than the mid then move search space to
the right.
● Key (i.e., 23) is greater than current mid element (i.e., 16). The search space moves
to the right.
Step 2
● Key is less than the current mid 56. The search space moves
to the left.
Step 3
Second Step: If the key matches the value of the mid element,
the element is found and stop search.
Time Complexity
Time complexity of binary search is O(log(n)) as it
halves the list size in each step. It is a large
improvement over linear search; for a list with 10
million entries, linear search will need 10 million key
comparisons in the worst case, whereas binary
search will need just about 24 comparisons
The time complexity can be written as a recurrence
relation as
T(1), n=1
T(n) = { T(n/2) + c, n>1
Pros and Cons of Binary Search
Pros
1. Suitable for sorted data
2. Efficient for large lists
3. Suitable for storage structures that support direct
access to data
4. Time complexity is O(log2 (n))
Cons
1. Not applicable for unsorted data
2. Not suitable for storage structures that do not support
direct access to data, for example,
magnetic tape and linked list
3. Inefficient for small list
Fibonacci Search
It has many diverse applications from estimation of
the number of cells in successive reproductions to
the number of leaves on branches.
The Fibonacci series has 0 and 1 as the first two
terms, and each successive term is the sum of the
previous two terms. Fibonacci numbers are 0, 1, 1, 2,
3, 5, 8, 13, 21, 34, ... with Fn = Fn−1+ Fn−2 for n ≥ 2
where, F0 = 0 and F1 = 1.
Fibonacci Search vs Binary Search
Fibonacci search modifies the binary search algorithm
slightly. Instead of halving the index for a search, a
Fibonacci number is subtracted from it. The Fibonacci
number to be subtracted decreases as the size of the list
decreases
Similarities with Binary Search:
1. Works for sorted arrays
2. A Divide and Conquer Algorithm.
3. Has Log n time complexity.
Fibonacci Search
Fibonacci search starts searching for the target by
⚫
comparing it with the element at the Fkth location.
Here, Fk ≥ n and Fk−1 < n.
The Fibonacci search works like the binary search
⚫
but with a few modifications. In binary search, we
have low, high, and mid positions for the sub-list.
Here, we have mid = n - Fk−1 + 1, F1 = Fk−2, and F2
= Fk−3. The target to be searched is compared with
A[mid]. mid is computed as follows
Find nth Fibonacci number
=
Example
⚫ Suppose we have a sorted array of elements {12, 14,
16, 17, 20, 24, 31, 43, 50, 62} and need to identify
the location of element 24 in it using Fibonacci
Search.
Step 1
⚫ The size of the input array is 10. The smallest
Fibonacci number greater than 10 is 13.
⚫ Therefore, Fm = 13, Fm-1 = 8, Fm-2 = 5.
⚫ We initialize offset = -1
Step 2
⚫ In the first iteration, compare it with the element at
index = minimum (offset + Fm-2, n – 1) = minimum
(-1 + 5, 9) = minimum (4, 9) = 4.
⚫ The fourth element in the array is 20, which is not a
match and is less than the key element.
⚫ Step 3
⚫ In the second iteration, update the offset value and the
Fibonacci numbers.
⚫ Since the key is greater, the offset value will become the
index of the element, i.e. 4. Fibonacci numbers are
updated as Fm = Fm-1 = 8.
⚫ Fm-1 = 5, Fm-2 = 3.
⚫ Now, compare it with the element at index = minimum
(offset + Fm-2, n – 1) = minimum (4 + 3, 9) = minimum
(7, 9) = 7.
⚫ Element at the 7th index of the array is 43, which is not a
match and is also lesser than the key.
Step 4
⚫ We discard the elements after the 7th index, so n = 7
and offset value remains 4.
⚫ Fibonacci numbers are pushed two steps backward, i.e.
Fm = Fm-2 = 3.
⚫ Fm-1 = 2, Fm-2 = 1.
⚫ Now, compare it with the element at index = minimum
(offset + Fm-2, n – 1) = minimum (4 + 1, 6) = minimum
(5, 7) = 5.
⚫ The element at index 5 in the array is 24, which is our
key element. 5th index is returned as the output for this
example array.
Analysis
The Fibonacci Search algorithm takes logarithmic
time complexity to search for an element. Since it is
based on a divide on a conquer approach and is
similar to idea of binary search, the time taken by
this algorithm to be executed under the worst case
consequences is O(log n).
Indexed Sequential Search
⚫ In this searching method, first of all, an index file is
created, that contains some specific group or
division of required record when the index is
obtained, then the partial indexing takes less time
cause it is located in a specified group.
⚫ Note: When the user makes a request for specific
records it will find that index group first where that
specific record is recorded.
Indexed Sequential Search
Characteristics of Indexed Sequential Search:
⚫ In Indexed Sequential Search a sorted index is set
aside in addition to the array.
⚫ Each element in the index points to a block of
elements in the array or another expanded index.
⚫ The index is searched 1st then the array and guides
the search in the array.
Advantage of Indexed Sequential Search
⚫ The advantage of indexed sequential over
sequential is that here we can achieve both
sequential as well as Random access (i.e. jump
directly to a record from the index value).
⚫ Moreover, it is faster than sequential method.
Indexed Sequential Search Algorithm
1. Read the search element from user.
2. Divide the array into groups according to group
size.
3. Create index array that contains starting index of
groups.
4. If group is present and 1st element of that group is
less than to key element go to next group.
Else apply linear search in previous group
5. Repeat step 4 for all groups.
⚫Sorting: Types of Sorting-Internal and External Sorting,
General Sort Concepts-Sort Order, Stability, efficiency,
and Number of Passes,
Sorting
⚫ Sorting techniques in data structures is a process of
rearranging data elements in an array or list in
order to make it easier to search and retrieve.
⚫ By sorting in data structure, the complexity of
searching for a particular item is reduced.
Types of Sorting-Internal and External Sorting,
⚫ If all the records to be sorted are kept internally in
the main memory, they can be sorted using an
internal sort.
⚫ However, if there are a large number of records to
be sorted, they must be kept in external files on
auxiliary storage. They have to be sorted using
external sort.
Internal Sorting
⚫ Any sort algorithm that uses main memory exclusively
during the sorting is called as an internal sort algorithm.
⚫ This assumes high-speed and random access to all data
members.
⚫ All the methods described in this chapter assume that all
the data is stored in high-speed main memory of the
computer and are therefore internal sorting techniques,
except for merge sort.
Internal sorting
Internal sorting is faster than external sorting. The
various internal sorting techniques are the following:
1. Bubble sort
2. Insertion sort
3. Selection sort
4. Quick sort
5. Heap sort
6. Shell sort
7. Bucket sort
8. Radix sort
9. File sort
External Sorting
⚫ Any sort algorithm that uses external memory, such
as tape or disk, during the sorting is called as an
external sort algorithm. Merge sort uses external
memory.
⚫ Do note that the other algorithms may read the
initial values from a magnetic tape or write sorted
values to a disk, but this is not using external
memory during the sort.
Sort Order
⚫ Data can be ordered either in ascending or in
descending order. The order in which the data is
organized, either ascending or descending, is called
sort order.
⚫ For example, the percentages of marks obtained by
students in the examination are organized in
descending order to decide ranks, whereas the
names in the telephone directory are organized
alphabetically in ascending order.
Sort Stability
⚫ A sorting method is said to be stable if at the end of the
method, identical elements occur in the same relative
order as in the original unsorted set.
⚫ While sorting, we must take care of the special case—
when two or more of the records have the same key, it is
important to preserve the order of records in this case
of duplicate keys.
⚫ A sorting algorithm is said to be stable if it preserves the
order for all records with duplicate keys; that means, if
for all records i and j is such that k[i] is equal to k[j] and
if r[i] precedes to r[j] in the unsorted table, then r[i]
precedes to r[j] in the sorted table too.
⚫ Bubble sort, merge sort and insertion sort are the stable
sort methods
Sort Efficiency
⚫ Each sorting method may be analysed depending
on the amount of time necessary for running the
program and the amount of space required for the
program. The amount of time for running a
program is proportional to the number of key
comparisons and the movement of records or the
movement of pointers to records.
⚫ Sort efficiency is a measure of the relative
efficiency of a sort. It is usually an estimate of the
number of comparisons and data movement
required to sort the data.
Passes
⚫ During the sorted process, the data is traversed
many times. Each traversal of the data is referred to
as a sort pass.
⚫ Depending on the algorithm, the sort pass may
traverse the whole list or just a section of the list. In
addition, the characteristic of a sort pass is the
placement of one or more elements in a sorted list
Comparison Based Sorting Methods -Bubble Sort,
Insertion Sort, Selection Sort, Quick Sort, Shell Sort,
Comparison Based Sorting Methods
⚫ Comparison Based Soring techniques are bubble
sort, selection sort, insertion sort, Merge sort,
quicksort, heap sort etc. These techniques are
considered as comparison based sort because in
these techniques the values are compared, and
placed into sorted position in ifferent phases. Here
we will see time complexity of these techniques.
Comparison Based Sorting Methods
Bubble Sort
⚫ The bubble sort is the oldest and the simplest sort in
use. Unfortunately, it is also the slowest.
⚫ The bubble sort works by comparing each item in the
list with the item next to it and swapping them if
required. The algorithm repeats this process until it
makes a pass all the way through the list without
swapping any items.
⚫ This causes larger values to ‘bubble’ to the end of the list
while smaller values ‘sink’ towards the beginning of the
list. In brief, the bubble sort derives its name from the
fact that the smallest data item bubbles up to the top of
the sorted array.
Insertion Sort
⚫ Insertion sort is a simple sorting algorithm that
works similar to the way you sort playing cards in
your hands. The array is virtually split into a sorted
and an unsorted part. Values from the unsorted
part are picked and placed at the correct position in
the sorted part.
Characteristics of Insertion Sort
⚫ Insertion Sort is one of the simplest algorithms
with a simple implementation
⚫ Basically, Insertion sort is efficient for small data
values
⚫ Insertion sort is adaptive in nature, i.e. it is
appropriate for data sets that are already partially
sorted.
Example
⚫ Let us consider a list L = {3, 6, 9, 14}. Given this
sorted list, we need to insert a new element 5 in it.
The commonly used process would involve the
following steps:
1. Compare the new element 5 and the last element 14
2. Shift 14 right to get 3, 6, 9, ,14
3. Shift 9 right to get 3, 6, ,9, 14
4. Shift 6 right to get 3, ,6, 9, 14
5. Insert 5 to get 3, 5, 6, 9, 14
Insertion Sort Algoritham
1. Set J = 1, where J is an integer
2. Check if list (J) < list (J − 1): if so interchange
them; set J = J −1 and repeat step (2) until J = 1
3. Set J = 3, 4, 5,. . ., N and keep on executing step (2)
Complexity Analysis
⚫ Time Complexity of Insertion Sort
⚫ The worst-case time complexity of the Insertion
sort is O(N^2)
⚫ The average case time complexity of the Insertion
sort is O(N^2)
⚫ The time complexity of the best case is O(N).
⚫ Space Complexity of Insertion Sort
⚫ The auxiliary space complexity of Insertion Sort
is O(1)
Frequently Asked Questions on Insertion Sort
Q1. What are the Boundary Cases of the Insertion Sort algorithm?
Insertion sort takes the maximum time to sort if elements are sorted in reverse
order and it takes minimum time (Order of n) when elements are already sorted.
Q2. What is the Algorithmic Paradigm of the Insertion Sort
algorithm?:
The Insertion Sort algorithm follows an incremental approach.
Q3. Is Insertion Sort an in-place sorting algorithm?
Yes, insertion sort is an in-place sorting algorithm.
Q4. Is Insertion Sort a stable algorithm?
Yes, insertion sort is a stable sorting algorithm.
Q5. When is the Insertion Sort algorithm used?
Insertion sort is used when number of elements is small. It can also be useful
When the input array is almost sorted, and only a few elements are misplaced in
complete big array.
Selection Sort
⚫ Selection sort is a simple sorting algorithm. This sorting
algorithm is an in-place comparison-based algorithm in
which the list is divided into two parts, the sorted part
at the left end and the unsorted part at the right end.
Initially, the sorted part is empty and the unsorted part
is the entire list.
⚫ The smallest element is selected from the unsorted
array and swapped with the leftmost element, and that
element becomes a part of the sorted array. This
process continues moving unsorted array boundary by
one element to the right.
⚫ This algorithm is not suitable for large data sets as its
average and worst case complexities are of Ο(n2),
where n is the number of items.
Algorithm
⚫ Step 1 − Set MIN to location 0
⚫ Step 2 − Search the minimum element in the list
⚫ Step 3 − Swap with value at location MIN
⚫ Step 4 − Increment MIN to point to next element
⚫ Step 5 − Repeat until list is sorted
Pass 1: Pass 4:
Pass 2:
Pass 3:
Pass 5 :
Pseudocode
void selectionSort(int arr[], int n)
{
int i, j, min_idx;
// One by one move boundary of unsorted subarray
for (i = 0; i < n - 1; i++)
{ // Find the minimum element in unsorted array
min_idx = i;
for (j = i + 1; j < n; j++)
{
if (arr[j] < arr[min_idx])
min_idx = j;
}// Swap the found minimum element with the first element
if (min_idx != i)
swap(arr[min_idx], arr[i]);
}
}
Time Complexity
Total number of comparisons
= 1+ 2 + 3 + ……….+ (n - 3) + (n - 2)+ (n - 1)
= n*(n-1)/2
= O(n^2)
⚫ Worst-case complexity: O(n^2)
When the input array is sorted in descending order
In this case, n-1 swaps are performed
⚫ Best-case complexity: O(n^2)
When the array is already sorted (ascending order)
In this case, no swap operation is performed
⚫ Average-case complexity: O(n^2)
When the array is neither sorted in ascending nor in descending order
Space Complexity
⚫ Selection sort’s space complexity is O(1), as it doesn’t take any extra space.
Advantages of Selection Sort
⚫ Is easy to implement
⚫ Works well on small lists
⚫ Does not require additional temporary storage
Disadvantages of Selection Sort
⚫ Does not work well for huge lists of items; has a
poor run time — n^2 for n elements
⚫ Is not adaptive: For example, an optimized bubble
sort algorithm will take O(n) time to run on a sorted
array, as it is adaptive. But selection sort will take
O(n^2) in every case
Selection Sort FAQs
Question 1: Is the selection sort algorithm an in-place, comparison-based
sorting algorithm?
Answer: The selection sort algorithm is an in-place, comparison-based sorting
algorithm. This basically means that this algorithm transforms a given input (in
this case, an array) without using any other data structure. In such cases, the input
is usually overwritten by the output.
Question 2: How many swaps does the selection sort algorithm make?
Answer: The selection sort algorithm makes n-1 swaps in the worst case and zero
swaps in the best case. Therefore, it never makes more than O(n) swaps. So, it is
handy in situations where “memory write” is a costly operation.
Question 3: Is the selection sort algorithm faster than bubble sort?
Answer: Both selection sort and bubble sort have a worst-case complexity of O(n^2).
However, the selection sort algorithm is still faster than bubble sort in the worst-
case scenario when “memory write” is a costly operation. This is because selection
sort makes fewer swaps compared to bubble sort.
Quick Sort
⚫ Quick sort is based on the divide-and-conquer strategy.
This sort technique initially selects an element called as
pivot that is near the middle of the list to be sorted.
⚫ Then the items on either side are moved so that the
elements on one side of pivot are smaller and on the
other side are larger. Now, the pivot is at the right
position with respect to the sorted sequence.
⚫ These two steps, selecting the pivot and arranging the
elements on either side of pivot, are now applied
recursively to both the halves of the list till the list size
reduces to one.
Recursive Algorithm
Recursive algorithm consists of four steps:
1. If the array size is 1, return immediately.
2. Pick an element in the array to serve as a ‘pivot’
(usually the left-most element in the list).
3. Partition the array into two parts—one with
elements smaller than the pivot and the other with
elements larger than the pivot by traversing from
both the ends and performing swaps if needed.
4. Recursively repeat the algorithm for both
partitions.
Example
⚫ Let us consider an example. Let the list of numbers
to be sorted be {13, 11, 14, 11, 15, 19, 12, 16, 15, 13,
15, 18, 19}.
⚫ Now, the first element 13 becomes pivot. We need
to place 13 at a proper location so that all elements
to its left are smaller and the right are greater
Shell Sort
⚫ It is a sorting algorithm that is an extended version of
insertion sort. Shell sort has improved the average time
complexity of insertion sort. As similar to insertion sort, it is
a comparison-based and in-place sorting algorithm. Shell
sort is efficient for medium-sized data sets.
⚫ In insertion sort, at a time, elements can be moved ahead by
one position only.
⚫ To move an element to a far-away position, many
movements are required that increase the algorithm's
execution time. But shell sort overcomes this drawback
of insertion sort. It allows the movement and swapping
of far-away elements as well.
⚫ This algorithm first sorts the elements that are far away from
each other, then it subsequently reduces the gap between
them. This gap is called as interval.
Algorithm
Step 1 − Start
Step 2 − Initialize the value of gap size. Example: h
Step 3 − Divide the list into smaller sub-part. Each
must have equal intervals to h
Step 4 − Sort these sub-lists using insertion sort
Step 5 – Repeat this step 2 until the list is sorted.
Step 6 – Print a sorted list.
Step 7 – Stop
Working of Shell Sort
Let the elements of array are –
⚫ We will use the original sequence of shell sort, i.e., N/2,
N/4,....,1 as the intervals.
⚫ In the first loop, n is equal to 8 (size of the array), so the
elements are lying at the interval of 4 (n/2 = 4). Elements
will be compared and swapped if they are not in order.
⚫ Here, in the first loop, the element at the 0th position will be
compared with the element at 4th position. If the 0th element
is greater, it will be swapped with the element at 4th position.
Otherwise, it remains the same. This process will continue
for the remaining elements.
Pseudocode
Void shell(int a[], int n)
{
//Rearrange the array elements at n/2, n/4, ..., 1 intervals
for (int interval = n/2; interval > 0; interval /= 2)
{
for (int i = interval; i < n; i += 1)
{
// store a[i] to the variable temp and make the ith position empty
int temp = a[i];
int j;
for (j = i; j >= interval && a[j - interval] > temp; j -= interval)
a[j] = a[j - interval];
// put temp (the original a[i]) in its correct position
a[j] = temp;
}
}
}
Shell Sort Complexity
Space Complexity
Non-comparison Based Sorting Methods- Radix Sort,
Counting Sort, and Bucket Sort,
Non-comparison Based Sorting Methods
⚫ The sorting algorithms that sort the data without
comparing the elements are called non-comparison
sorting algorithms. The examples for non-
comparison sorting algorithms are counting sort,
bucket sort, radix sort, and others.
Radix Sort
⚫ Radix Sort is a linear sorting algorithm that sorts
elements by processing them digit by digit. It is an
efficient sorting algorithm for integers or strings
with fixed-size keys.
⚫ Rather than comparing elements directly, Radix
Sort distributes the elements into buckets based on
each digit’s value. By repeatedly sorting the
elements by their significant digits, from the least
significant to the most significant, Radix Sort
achieves the final sorted order.
Radix Sort
⚫ The key idea behind Radix Sort is to exploit the
concept of place value. It assumes that sorting
numbers digit by digit will eventually result in a
fully sorted list. Radix Sort can be performed using
different variations, such as Least Significant Digit
(LSD) Radix Sort or Most Significant Digit (MSD)
Radix Sort.
⚫ How does Radix Sort Algorithm work?
⚫ To perform radix sort on the array [170, 45, 75, 90,
802, 24, 2, 66], we follow these steps:
Radix Sort algoritham
⚫ Step 1: Find the largest element in the array,
which is 802. It has three digits, so we will iterate
three times, once for each significant place.
⚫ Step 2: Sort the elements based on the unit place
digits (X=0). We use a stable sorting technique,
such as counting sort, to sort the digits at each
significant place.
Time Complexity:
⚫ Radix sort is a non-comparative integer sorting
algorithm that sorts data with integer keys by grouping
the keys by the individual digits which share the same
significant position and value. It has a time complexity
of O(d * (n + b)), where d is the number of digits, n is
the number of elements, and b is the base of the
number system being used.
⚫ In practical implementations, radix sort is often faster
than other comparison-based sorting algorithms, such
as quicksort or merge sort, for large datasets, especially
when the keys have many digits. However, its time
complexity grows linearly with the number of digits,
and so it is not as efficient for small datasets.
Complexity
Auxiliary Space:
⚫ Radix sort also has a space complexity of O(n +
b), where n is the number of elements and b is the
base of the number system. This space complexity
comes from the need to create buckets for each digit
value and to copy the elements back to the original
array after each digit has been sorted.
Counting Sort
⚫ Counting Sort is a non-comparison-based sorting
algorithm that works well when there is limited range of
input values.
⚫ It is particularly efficient when the range of input values
is small compared to the number of elements to be
sorted.
⚫ The basic idea behind Counting Sort is to count
the frequency of each distinct element in the input
array and use that information to place the elements in
their correct sorted positions.
Counting Sort Algorithm
⚫ Declare an auxiliary array countArray[] of
size max(inputArray[])+1 and initialize it with 0.
⚫ Traverse array inputArray[] and map each element
of inputArray[] as an index of countArray[] array, i.e.,
execute countArray[inputArray[i]]++ for 0 <= i < N.
⚫ Calculate the prefix sum at every index of array inputArray[].
⚫ Create an array outputArray[] of size N.
⚫ Traverse array inputArray[] from end and
update outputArray[ countArray[ inputArray[i] ] – 1] =
inputArray[i]. Also, update countArray[ inputArray[i] ] =
countArray[ inputArray[i] ]–1.
Counting Sort
1. Find out the maximum element from the given array.
2. Initialize a countArray[] of length max+1 with all
elements as 0. This array will be used for storing the
occurrences of the elements of the input array.
⚫ Step 3:
In the countArray[], store the count of each unique
element of the input array at their respective indices.
For Example: The count of element 2 in the input array
is 2. So, store 2 at index 2 in the countArray[].
Similarly, the count of element 5 in the input array is 1,
hence store 1 at index 5 in the countArray[].
Step 4:
Store the cumulative sum or prefix sum of the elements of
the countArray[] by doing
countArray[i] = countArray[i – 1] + countArray[i].
This will help in placing the elements of the input array at the
correct index in the output array.
Step 5:
Iterate from end of the input array and because traversing input
array from end preserves the order of equal elements, which
eventually makes this sorting algorithm stable.
Update outputArray[ countArray[ inputArray[i] ] – 1] =
inputArray[i]. Also, update countArray[ inputArray[i] ] =
countArray[ inputArray[i] ]-1.
Step 6:
For i = 6, Update outputArray[ countArray[ inputArray[6] ] –
1] = inputArray[6].
Also, update
countArray[ inputArray[6] ] = countArray[ inputArray[6] ]–1.
Advantages of counting sort
⚫ It is faster than merge sort. The time complexity of
merge sort is O(n*log(n)) whereas the time
complexity of counting sort is O(n+k) where k is the
maximum value in the array.
⚫ It is a stable sort.
Disadvantages of counting sort
⚫ It is not suitable for large data values.
⚫ It uses more space.
⚫ It is not suitable for sorting strings.
Bucket Sort
⚫ Bucket sort is a sorting technique that involves
dividing elements into various groups, or buckets.
These buckets are formed by uniformly
distributing the elements. Once the elements are
divided into buckets, they can be sorted using any
other sorting algorithm. Finally, the sorted
elements are gathered together in an ordered
fashion.
Bucket Sort Algorithm
⚫ Create n empty buckets (Or lists) and do the
following for every array element arr[i].
⚫ Insert arr[i] into bucket[n*array[i]]
⚫ Sort individual buckets using insertion sort.
⚫ Concatenate all sorted buckets.
How does Bucket Sort work?
⚫ To apply bucket sort on the input array [0.78,
0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23,
0.68], we follow these steps:
⚫ Step 1: Create an array of size 10, where each slot
represents a bucket.
Step 2:
⚫ Insert elements into the buckets from the input array
based on their range.
⚫ Inserting elements into the buckets:
⚫ Take each element from the input array.
⚫ Multiply the element by the size of the bucket array (10
in this case). For example, for element 0.23, we get 0.23
* 10 = 2.3.
⚫ Convert the result to an integer, which gives us the
bucket index. In this case, 2.3 is converted to the integer
2.
⚫ Insert the element into the bucket corresponding to the
calculated index.
⚫ Repeat these steps for all elements in the input array.
Step 3:
⚫ Sort the elements within each bucket. In this
example, we use quicksort (or any stable sorting
algorithm) to sort the elements within each bucket.
⚫ Sorting the elements within each bucket:
⚫ Apply a stable sorting algorithm (e.g., quicksort) to
sort the elements within each bucket.
⚫ The elements within each bucket are now sorted.
Step 4:
⚫ Gather the elements from each bucket and put
them back into the original array.
⚫ Gathering elements from each bucket:
⚫ Iterate through each bucket in order.
⚫ Insert each individual element from the bucket into
the original array.
⚫ Once an element is copied, it is removed from the
bucket.
⚫ Repeat this process for all buckets until all elements
have been gathered.
Step 5:
⚫ The original array now contains the sorted
elements.
⚫ The final sorted array using bucket sort for the
given input is [0.12, 0.17, 0.21, 0.23, 0.26, 0.39,
0.68, 0.72, 0.78, 0.94].
Complexity Analysis of Bucket Sort Algorithm
⚫ Time Complexity: O(n2),
⚫ If we assume that insertion in a bucket takes O(1)
time then steps 1 and 2 of the above algorithm
clearly take O(n) time.
⚫ The O(1) is easily possible if we use a linked list to
represent a bucket.
⚫ Step 4 also takes O(n) time as there will be n items
in all buckets.
⚫ The main step to analyze is step 3. This step also
takes O(n) time on average if all numbers are
uniformly distributed.
Comparison of All Sorting Methods and their
complexities
In-place and Stable sorting Algorithms
⚫ A sorting algorithm is In-place if the algorithm
does not use extra space for manipulating the input
but may require a small though non constant extra
space for its operation. Or we can say, a sorting
algorithm sorts in-place if only a constant number
of elements of the input array are ever stored
outside the array.
⚫ A sorting algorithm is stable if it does not change
the order of elements with the same value.
Key Points
⚫ Understanding the sorting algorithms are the best way to learn
problem solving and complexity analysis in the algorithms. In
some cases, We often use sorting as a key routine to solve several
coding problems.
⚫ Knowing which algorithm is best possible depends heavily on
details of the application and implementation.
⚫ In most practical situations, quicksort is a popular algorithm for
sorting large input arrays because its expected running time is
O(nlogn).
⚫ If stability is important and space is available, the merge sort
might be the best choice for the implementation.
⚫ Like insertion sort, quick sort has tight code and hidden constant
factor in its running time is small.
⚫ When the input array is almost sorted or input size is small, then
the insertion sort can be preferred.
⚫ We can prefer merge sort for sorting a linked list.
References
⚫ https://www.geeksforgeeks.org/sorting-algorithms/
⚫ https://www.tutorialspoint.com/data_structures_alg
orithms/sorting_algorithms.htm
⚫ https://www.geeksforgeeks.org/searching-
algorithms/