SORTING
&
SEARCHING
SEARCHING
Searching: It is the process of finding a given value
position in a list of values. It decides whether a search
key is present in the data or not.
It can be done on internal data structure or on
external data structure.
Internal Search
The search in which the whole list resides in the main
memory is called internal search.
External Search
The search in which the whole list resides in the
Secondary memory is called external search.
TYPES OF SEARCHING ALGORITHM
Linear or Sequential searching: In linear search, each
element of an array is read one-by-one sequentially and it
is compared with the desired element.
Binary searching: If we place our items in an array sort
them in either ascending or descending order on the key
first, then we can obtain much better performance with
an algorithm called binary search.
Binary search looks for a particular item by comparing
the middle most item of the collection. If a match occurs,
then the index of item is returned. If the middle item is
greater than the item, then the item is searched in the
sub-array to the left of the middle item.
ALGORITHM FOR LINEAR
SEARCH
LINEAR(DATA,N,ITEM,LOC)
1.Set K:=1 and LOC := 0.
2.Repeat Steps 3 and 4 while LOC = 0 and
K<=N.
3.If ITEM = DATA[K] ,then Set LOC:= K.
4.Set K:=K+1.
5.If LOC = 0,then:
Write: ITEM is not in the array
DATA.
Else:
Write: LOC is the location of ITEM.
6.Exit.
Advantages
It is the simplest way for finding an
element in a list.
Not Expensive.
List/Array may or may not be sorted.
Disadvantages
Less Efficient.
Time consuming.
It is used only for small amount of data.
BEST CASE: O(1)
WORST CASE: O(N)
AVERAGE CASE:TARGET IS MIDDLE ITEM
O(N/2)
1+2+3+…N=N(N+1)/2*(1/N)
O(N+1/2)
ALGORITHM FOR BINARY
SEARCH
BINARY(DATA , LB , UB, ITEM , LOC)
1. Set BEG:= LB, END:=UB and
MID = INT((BEG+END)/2).
2. Repeat Steps 3 and 4 while BEG<=END
and DATA[MID]!=ITEM.
3. If ITEM< DATA[MID] ,then:
Set END := MID-1.
Else:
Set BEG := MID+1.
4. Set MID := INT((BEG+END)/2).
5. If DATA[MID]=ITEM, then:
Set LOC:= MID.
Else:
Set LOC:= NULL.
6. Exit.
Advantages
Extremely Efficient
Less Time Consuming
Disadvantages
Expensive
List/Array may be sorted.
INTRODUCTION TO
SORTING
Sorting: An operation that arranges items into
groups according to specified criterion.
A={3 1 6 2 1 34 5 9 0 }
A={0 1 1 2 3 34 5 6 9 }
WHY SORT AND EXAMPLES
Consider:
Sorting Books in Library (Dewey system)
Sorting Individuals by Height (Feet and
Inches)
Sorting Movies in Blockbuster
(Alphabetical)
Sorting Numbers (Sequential)
TYPES OF SORTING
ALGORITHMS
There are many different types of
sorting algorithms, but the
primary ones are:
Bubble Sort
Selection Sort
Insertion Sort
Merge Sort
Quick Sort
Radix Sort
Heap Sort
COMPLEXITY
Most of the primary sorting algorithms run
on different space and time complexity.
Time Complexity is defined to be the time
the computer takes to run a program (or
algorithm in our case).
Space complexity is defined to be the
amount of memory the computer needs to
run a program.
COMPLEXITY (CONT.)
Complexity in general, measures the
algorithms efficiency in internal factors such
as the time needed to run an algorithm.
External Factors (not related to complexity):
Size of the input of the algorithm
Speed of the Computer
Quality of the Compiler
O(N), Ω(N), & Θ(N)
An algorithm or function T(n) is O(f(n))
whenever T(n)'s rate of growth is less than or
equal to f(n)'s rate.
An algorithm or function T(n) is Ω(f(n))
whenever T(n)'s rate of growth is greater
than or equal to f(n)'s rate.
An algorithm or function T(n) is Θ(f(n)) if
and only if the rate of growth of T(n) is equal
to f(n).
COMMON BIG-OH’S
Time complexity Example
O(1) constant Adding to the front of a linked list
O(log N) log Finding an entry in a sorted array
O( N ) linear Finding an entry in an unsorted array
O( N log N) n-log-n Sorting n items by ‘divide-and-conquer’
O( N 2) quadratic Shortest path between two nodes in a graph
O( N 3) cubic Simultaneous linear equations
BIG-OH TO PRIMARY SORTS
●
Bubble Sort = n²
●
Selection Sort = n²
●
Insertion Sort = n²
●
Merge Sort = n log(n)
●
Quick Sort = n log(n)
TIME EFFICIENCY
How do we improve the time efficiency of a
program?
The 90/10 Rule
90% of the execution time of a program is spent
in
executing 10% of the code
So, how do we locate the critical 10%?
software metrics tools
global counters to locate bottlenecks (loop
executions, function calls)
TIME EFFICIENCY
IMPROVEMENTS
Possibilities (some better than others!)
Move code out of loops that does not belong
there (just good programming!)
Remove any unnecessary I/O operations (I/O
operations are expensive time-wise)
Code so that the compiled code is more
efficient
Moral - Choose the most appropriate
algorithm(s) BEFORE program implementation
BUBBLE SORT
ALGORITHM FOR BUBBLE
SORT
BUBBLE(DATA , N)
1.Repeat Steps 2 and 3 for K=1 to N-1.
2.Set PTR := 1. // Initialize pass pointer
PTR
3.Repeat while PTR<=N-K.
a) If DATA[PTR]>DATA[PTR+1],then
Interchange DATA[PTR] and
DATA[PTR+1]
b) Set PTR:= PTR+1
4. Exit.
SELECTION SORT
SELECTION SORT
MIN(A,K,N,LOC)
1. Set MIN = A[K] and LOC = K
2. Repeat for J=K+1, K+2 ……… N
If MIN>A[J], then Set MIN = A[J]
and LOC = J
End of Loop
3. Return
SELECTION SORT
SELECTION (A,N)
1. Repeat steps 2 and 3 for K=1, 2 ………
N-1
2. Call MIN(A,K,N,LOC)
3. Set TEMP = A[K]
A[K] = A[LOC]
A[LOC] = TEMP
End of Step 1 Loop
4. Exit
COMPLEXITY OF THE
SELECTION SORT
Worst Case
O(n2)
Average Case
O(n2)
INSERTION SORT
INSERTION SORT
INSERTION SORT
INSERTION (A,N)
1. Set A[0] = - ∞
2. Repeat Steps 3 to 5 for K = 2,3,….N
3. Set TEMP = A[K] and PTR = K-1
4. Repeat while TEMP<A[PTR]
1. Set A[PTR+1] = A[PTR]
2. Set PTR = PTR – 1
End of Loop
5. Set A[PTR+1] = TEMP
End of Step 2 Loop
6. Return
COMPLEXITY OF THE
INSERTION SORT
Worst Case
O(n2)
Average Case
O(n2)
MERGE SORT
Working of merge sort:
1. Divide the unsorted array into two sub-
arrays, half the size of the original.
2. Continue to divide the sub-arrays as long as
the current piece of the array has more than
one element.
3. Merge two sub-arrays together by always
putting the lowest value first.
4. Keep merging until there are no sub-arrays
left.
MERGE-SORT
MERGE-SORT(CONT.)
Merge-Sort (cont’d)
MERGE SORT
MERGE_SORT (A, BEG, END)
1. IF BEG<ENG
SET MID=(BEG+END)/2
CALL MERGE_SORT (A, BEG, MID)
CALL MERGE_SORT (A, MID+1, END)
MERGE(A, BEG, MID, END)
2. END
MERGE(A, BEG, MID, END)
1. SET I = BEG, J=MID+1, INDEX=0
2. REPEAT WHILE (I<=MID) AND (J<=END)
IF A[I]<A[J], THEN
SET TEMP [INDEX] = A[I]
SET I=I+1
ELSE
SET TEMP [INDEX]=A[J]
SET J=J+1
SET INDEX = INDEX +1
3. IF I>MID THEN
REPEAT WHILE J<=END
SET TEMP [INDEX]=A[J]
SET INDEX=INDEX+1
SET J=J+1
ELSE
REPEAT WHILE I<=MID
SET TEMP[INDEX]=A[I]
SET INDEX = INDEX +1
SET I=I+1
4. SET K=0
5. REPEAT WHILE K<INDEX
SET A[K]=TEMP[K]
SET K=K+1
6. END
EXAMPLE (MERGE SORT)
QUICKSORT
Quicksort is one of the fastest sorting algorithms.
The Quicksort algorithm takes an array of values, chooses one of the values as
the 'pivot' element, and moves the other values so that lower values are on the
left of the pivot element, and higher values are on the right of it.
Starting with first element of list do the operations
1. Compare from right to find element less than selected and interchange
2. Compare from left to find element greater than selected and interchange
44,33,11,55,77,90,40,60,33,22,88,66
22,33,11,55,77,90,40,60,99,44,88,66
22,33,11,44,77,90,40,60,99,55,88,66
22,33,11,40,77,90,44,60,99,55,88,66
22,33,11,40,44,90,77,60,99,55,88,66
22,33,11,40,44,90,77,60,99,55,88,66
Two Sublists are created:
Before 44 and After 44
QUICK SORT ALGORITHM
QUICK (A, N, BEG, END, LOC)
1. [Initialize] Set LEFT= BEG, RIGHT=END and LOC=BEG
2. [Scan from right to left]
a) Repeat while A[LOC]<=A[RIGHT] and LOC != RIGHT
RIGHT=RIGHT-1
[End of loop]
b) If LOC=RIGHT then : Return
c) If A[LOC]>A[RIGHT] then
i) Interchange A[LOC] and A[RIGHT]
ii) Set LOC=RIGHT
iii) Go to step 3
[End of If structure]
39
3. [Scan from left to right]
a) Repeat while A[LEFT]<=A[LOC] and LEFT!
=LOC
LEFT=LEFT+1
[End of loop]
b) If LOC=LEFT, then Return
c) If A[LEFT]>A[LOC], then
i) Interchange A[LEFT] and A[LOC]
ii) Set LOC=LEFT
iii) Goto step 2
[End of If structure]
40
(QuickSort) This algorithm sorts an array A with N
elements.
1. TOP=NULL
2. If N>1 then TOP=TOP+1, LOWER[1]=1, UPPER[1]=N
3. Repeat Steps 4 to 7 while TOP!=NULL
4. Set BEG=LOWER[TOP], END=UPPER[TOP]
TOP=TOP-1
5. Call QUICK(A, N, BEG, END, LOC)
6. If BEG<LOC-1 then
TOP=TOP+1, LOWER[TOP]=BEG, UPPER[TOP]=LOC-1
[End of If structure]
7. If LOC+1<END then
TOP=TOP+1, LOWER[TOP]=LOC+1, UPPER[TOP]=END
[End of If Structure]
[End of Step 3 loop]
8. Exit 41
Radix Sort
RADIX SORT
RADIX SORT (OR BUCKET
SORT)
It is a non comparative integer sorting
algorithm
It sorts data with integer keys by
grouping keys by the individual digits
which share the same significant
position and value.
A positional notation is required, but
because integers can represent strings
of characters (e.g., names or dates) and
specially formatted floating point
numbers, Radix sort is not limited to
integers.
RADIX SORT ALGORITHM
Let A be an array with N elements in the
memory
Find the largest element of the array
Find the total no. of digits ‘NUM’ in the
largest element
Repeat Steps 4, 5 for PASS = 1 to NUM
Initialize buckets (0 to 9 for numbers, A to Z
for Characters)
For i = 0 to (N-1)
Set DIGIT = Obtain digit number PASS of A
[i]
Put A [i] in bucket number DIGIT
End of for loop
5. Collect all numbers from the bucket in order
6. Exit
HEAP SORT
HEAP_SORT (A)
1. BUILD_MAX_HEAP(A)
2. For i=length[A] to 2
do exchange A[1] <-> A[i]
Heap-Size [A]=Heap-Size[A]-1
MAX_HEAPIFY (A, 1)
BUILD_MAX_HEAP(A)
1. Heap-Size[A]=Length[A]
2. for i=|length[A]/2| to 1
3. MAX_HEAPIFY (A, i)
MAX_HEAPIFY(A, i)
1. l= Left[i]
2. r=Right[i]
3. If l<=Heap-Size [A] && A[l] > A[i]
4. Largest = l
5. Else largest =i
6. If r<=Heap-Size[A] && A[r]>A[largest]
7. Largest = r
8. If largest != i
9. then interchange A[i] <-> A[largest]
and MAX_HEAPIFY (A, largest)
(HEAP SORT) EXAMPLE
A=(5, 13, 2, 25, 7, 17, 20, 8, 4)
5
13 2
25 7 17 20
8 4
EXAMPLE
First we call BUILD_MAX_HEAP
5
13 2
i 25 7 17 20
8 4
l r
EXAMPLE
5
13 2i
25 7 17
l
20 r
8 4
EXAMPLE
5
i
13 20
l25 7 r 17 2
8 4
EXAMPLE
i
5
25
l 20 r
13 7 17 2
8 4
EXAMPLE
25
5i 20
13
l
7 r 17 2
8 4
EXAMPLE
25
13 20
5 i 7 17 2
8 4
l r
EXAMPLE
25
13 20
8 7 17 2
5 4
This is the final tree after BUILD-MAX-HEAP
EXAMPLE
25 4
13 20
13 20
8 7 17
2 8 7 17
5 4 2
5 25
4 13 20 8 7 17 2 5 25
EXAMPLE
BUILD_MAX_HEAPIFY
20
13 17
8 7 42
5
EXAMPLE
20 5
13 17
13 17
8 7 4 2
5
8 7 42
20
5 13 17 8 7 4 2 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
17
13 5
8 7 42
EXAMPLE
17 2
13 5
13 5
8 7 4
2 8 7 4 17
2 13 5 8 7 4 17 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
13
8 5
2 7 4
EXAMPLE
13 4
8 5
8 5
2 7 4
2 7 13
4 8 5 2 7 13 17 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
8
7 5
2 4
EXAMPLE
8 4
7 5
7 5
2 4
2 8
4 7 5 2 8 13 17 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
7
4 5
2
EXAMPLE
7 2
4 5
4 5
2
7
2 4 5 7 8 13 17 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
5
4 2
EXAMPLE
5
2
4 2
4 5
2 4 5 7 8 13 17 20 25
EXAMPLE
BUILD_MAX_HEAPIFY
4
2
EXAMPLE
4 2
2
4
2 4 5 7 8 13 17 20 25
EXAMPLE
2 4 5 7 8 13 17 20 25