CS3401 Algorithm

The document outlines the curriculum for the CS3401 Algorithms course, detailing course objectives and five main units covering algorithm analysis, graph algorithms, algorithm design techniques, state space search algorithms, and NP-completeness. It includes discussions on time and space complexity, asymptotic notations, and various algorithmic strategies such as divide and conquer, dynamic programming, and greedy techniques. Additionally, it addresses searching and sorting algorithms, as well as approximation and randomized algorithms.

Uploaded by

priyadharsini
Copyright
© © All Rights Reserved

Department of Computer Science and Engineering

Regulation 2021
II Year – IV Semester
CS3401- Algorithm
CS3401 ALGORITHMS L T P C 3 0 2 4

COURSE OBJECTIVES:

To understand and apply the algorithm analysis techniques on searching and sorting algorithms

To critically analyze the efficiency of graph algorithms

To understand different algorithm design techniques

To solve programming problems using state space tree

To understand the concepts behind NP Completeness, Approximation algorithms and randomized algorithms.

UNIT I INTRODUCTION 9

Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties - Best case,
Worst case and average case analysis – Recurrence relation: substitution method - Lower bounds –
Searching: linear search, binary search and Interpolation Search - Pattern search: The naïve string
matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt algorithm. Sorting: Insertion sort –
heap sort

UNIT II GRAPH ALGORITHMS 9

Graph algorithms: Representations of graphs - Graph traversal: DFS – BFS - applications -
Connectivity, strong connectivity, bi-connectivity - Minimum spanning tree: Kruskal’s and Prim’s
algorithm - Shortest path: Bellman-Ford algorithm - Dijkstra’s algorithm - Floyd-Warshall algorithm -
Network flow: Flow networks - Ford-Fulkerson method – Matching: Maximum bipartite matching

UNIT III ALGORITHM DESIGN TECHNIQUES 9

Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort.
Dynamic programming: Elements of dynamic programming - Matrix-chain multiplication -
Multistage graph - Optimal Binary Search Trees. Greedy Technique: Elements of the greedy strategy -
Activity-selection problem - Optimal Merge pattern - Huffman Trees.

UNIT IV STATE SPACE SEARCH ALGORITHMS 9

Backtracking: n-Queens problem - Hamiltonian Circuit Problem - Subset Sum Problem – Graph
colouring problem. Branch and Bound: Solving 15-Puzzle problem - Assignment problem - Knapsack
Problem - Travelling Salesman Problem

UNIT V NP-COMPLETE AND APPROXIMATION ALGORITHM 9

Tractable and intractable problems: Polynomial time algorithms – Venn diagram representation -
NP algorithms - NP-hardness and NP-completeness – Bin Packing problem - Problem reduction: TSP –
3CNF problem. Approximation Algorithms: TSP - Randomized Algorithms: concept and application -
primality testing - randomized quick sort - Finding kth smallest number
UNIT I INTRODUCTION


Introduction

Definition of Algorithm

The word Algorithm means "a set of finite rules or instructions to be followed in calculations or
other problem-solving operations"
Or
"a procedure for solving a mathematical problem in a finite number of steps that frequently
involves recursive operations".

What is an Algorithm?

[Figure: Input → Set of rules to obtain the expected output from the given input → Output]

Algorithm Analysis
Algorithm analysis is the process of determining the time and space complexity of an
algorithm, which are measures of the algorithm's efficiency.

Time complexity refers to the amount of time it takes for an algorithm to run as a function of the size of
the input, and is typically expressed using big O notation. Space complexity refers to the amount of
memory required by an algorithm as a function of the size of the input, and is also typically expressed
using big O notation.

Time and Space Complexity

Time complexity is a measure of how long an algorithm takes to run as a function of the size of the
input. It is typically expressed using big O notation, which describes the upper bound on the growth
of the time required by the algorithm. For example, the running time of an algorithm with a time
complexity of O(n) grows at most linearly as the input size (n) increases.
There are different types of time complexities:

• O(1) or constant time: the algorithm takes the same amount of time to run regardless of the
size of the input.

• O(log n) or logarithmic time: the algorithm's running time increases logarithmically with the
size of the input.
• O(n) or linear time: the algorithm's running time increases linearly with the size of the input.

• O(n log n) or linear logarithmic time: the algorithm's running time increases in proportion to
n multiplied by log n, which is slightly faster than linear growth.

• O(n^2) or quadratic time: the algorithm's running time increases quadratically with the size
of the input.
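The growth rates listed above can be made concrete by counting loop iterations. The following is a minimal sketch, not a benchmark; the function names are hypothetical and chosen only for illustration.

```python
def count_logarithmic(n):
    """Count the steps of a loop that halves n each time: about log2(n)."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


def count_linear(n):
    """Count the steps of a single loop over n items: exactly n."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_quadratic(n):
    """Count the steps of two nested loops over n items: exactly n * n."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, count_logarithmic(n), count_linear(n), count_quadratic(n))
```

Doubling n adds only one step to the logarithmic loop, doubles the linear count, and quadruples the quadratic count.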

Space complexity, on the other hand, is a measure of how much memory an algorithm uses as a
function of the size of the input. Like time complexity, it is typically expressed using big O notation.
For example, the memory used by an algorithm with a space complexity of O(n) grows at most
linearly with the input size (n). Space complexities are generally categorized as:
• O(1) or constant space: the algorithm uses the same amount of memory regardless of the size
of the input.

• O(n) or linear space: the algorithm's memory usage increases linearly with the size of the
input.
• O(n^2) or quadratic space: the algorithm's memory usage increases quadratically with the
size of the input.
• O(2^n) or exponential space: the algorithm's memory usage increases exponentially with the
size of the input.

Asymptotic notation and its properties


Asymptotic notation is a mathematical notation used to describe the behavior of an algorithm as the
size of the input (usually denoted by n) becomes arbitrarily large.

There are mainly three asymptotic notations:

1. Big-O Notation (O-notation)
2. Omega Notation (Ω-notation)
3. Theta Notation (Θ-notation)

1. Big-O Notation

Big O notation is a mathematical concept used in computer science to describe the upper bound of an
algorithm's time or space complexity. It is most often used to express how an algorithm performs in
the worst case as the size of the input increases.
Mathematical Representation of Big O Notation

A function f(n) is said to be O(g(n)) if there exist positive constants c0 and c1 such that 0 ≤ f(n) ≤
c0*g(n) for all n ≥ c1. This means that for sufficiently large values of n, the function f(n) does not grow
faster than g(n) up to a constant factor.

O(g(n)) = {f(n): there exist positive constants c0 and c1 such that 0 ≤ f(n) ≤ c0g(n) for all n ≥ c1}.

For Example:

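Let, f(n) = n^2 + n + 1
g(n) = n^2
n^2 + n + 1 <= c (n^2)
This inequality holds with c = 3 for all n ≥ 1, because n ≤ n^2 and 1 ≤ n^2 when n ≥ 1. The time
complexity of the above function is therefore O(n^2): for large n, it grows no faster than a constant
multiple of n^2.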

Omega Notation(Ω)
Omega notation is used to denote the lower bound of the algorithm; it represents the minimum growth
rate of the running time. It is therefore commonly used to state the best-case complexity of an algorithm.
Ω(g(n)) = {f(n): there exist positive constants c0 and c1, such that 0 ≤ c0g(n) ≤ f(n) for all n ≥ c1}.

For Example:
Let,
1. f(n) = n^2 + n
Then f(n) = Ω(n^2), since n^2 + n ≥ n^2 for all n ≥ 1.
2. f(n) = 100n + log(n)
Then f(n) = Ω(n), since 100n + log(n) ≥ n for all n ≥ 1.

Theta Notation(Θ)
Theta notation is used to denote a tight bound on the algorithm; it bounds a function from above
and below, which is why it is used to represent exact asymptotic behaviour.

Θ(g(n)) = {f(n): there exist positive constants c0, c1 and c2, such that 0 ≤ c0g(n) ≤ f(n) ≤ c1g(n) for
all n ≥ c2}
Difference between Big O Notation, Omega Notation, and Theta Notation

Definition
• Big O Notation (O): Describes an upper bound on the time or space complexity of an algorithm.
• Omega Notation (Ω): Describes a lower bound on the time or space complexity of an algorithm.
• Theta Notation (Θ): Describes both an upper and a lower bound on the time or space complexity.

Purpose
• Big O Notation (O): Used to characterize the worst-case scenario of an algorithm.
• Omega Notation (Ω): Used to characterize the best-case scenario of an algorithm.
• Theta Notation (Θ): Used to characterize an algorithm's precise bound (both worst and best cases).

Interpretation
• Big O Notation (O): Indicates the maximum rate of growth of the algorithm's complexity.
• Omega Notation (Ω): Indicates the minimum rate of growth of the algorithm's complexity.
• Theta Notation (Θ): Indicates the exact rate of growth of the algorithm's complexity.

Mathematical Expression
• Big O Notation (O): f(n) = O(g(n)) if ∃ constants c > 0, n₀ such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n₀.
• Omega Notation (Ω): f(n) = Ω(g(n)) if ∃ constants c > 0, n₀ such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n₀.
• Theta Notation (Θ): f(n) = Θ(g(n)) if ∃ constants c₁, c₂ > 0, n₀ such that 0 ≤ c₁g(n) ≤ f(n) ≤ c₂g(n) for all n ≥ n₀.

Focus
• Big O Notation (O): Focuses on the upper limit of performance (less efficient aspects).
• Omega Notation (Ω): Focuses on the lower limit of performance (more efficient aspects).
• Theta Notation (Θ): Focuses on both the upper and lower limits, providing a balanced view of performance.

Usage in Algorithm Analysis
• Big O Notation (O): It is commonly used to analyze efficiency, especially concerning worst-case performance.
• Omega Notation (Ω): Used to demonstrate the effectiveness under optimal conditions.
• Theta Notation (Θ): Used to provide a precise analysis of algorithm efficiency in typical scenarios.

Common Usage
• Big O Notation (O): Predominant in theoretical and practical applications for worst-case analysis.
• Omega Notation (Ω): It is less common than Big O but important for understanding best-case efficiency.
• Theta Notation (Θ): Used when an algorithm exhibits a consistent performance across different inputs.

Examples
• Big O Notation (O): Searching in an unsorted list: O(n).
• Omega Notation (Ω): Inserting an element in a sorted array: Ω(1).
• Theta Notation (Θ): Linear search in a sorted array, where the element is always in the middle: Θ(n).

Recurrence Relation

A recurrence is an equation or inequality that describes a function in terms of its values on smaller
inputs. To solve a Recurrence Relation means to obtain a function defined on the natural numbers that
satisfies the recurrence.
For example, the worst-case running time T(n) of the MERGE SORT procedure is described by
the recurrence

T(n) = Θ(1) if n = 1

T(n) = 2T(n/2) + Θ(n) if n > 1

There are three methods for solving Recurrence:

1. Substitution Method
2. Recursion Tree Method
3. Master Method

1. Substitution Method:

The Substitution Method consists of two types

1. Forward substitution
2. Backward substitution

1. Forward substitution

This method makes use of an initial condition in the initial term and value for the next term is
generated. This process is continued until some formula is guessed.

For example:

Consider a recurrence relation

T(n)=T(n-1)+n

With initial condition T(0)=0

If n=1 then T(1)=T(0)+1=1

If n=2 then T(2)=T(1)+2=3

If n=3 then T(3)=T(2)+3=6

From this pattern we guess that T(n) is the sum of the first n natural numbers:

T(n) =n(n+1)/2 =n^2/2 +n/2

We can denote T(n) in terms of Big oh notation as T(n)=O(n^2)
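The guessed formula can be checked numerically by generating the terms of the recurrence one after another, exactly as forward substitution does. This is only a sanity check under the stated initial condition T(0)=0, not a proof; the function name is hypothetical.

```python
def t_forward(n):
    """Evaluate T(n) = T(n-1) + n with T(0) = 0 by forward substitution."""
    value = 0  # T(0) = 0
    for i in range(1, n + 1):
        value = value + i  # T(i) = T(i-1) + i
    return value


if __name__ == "__main__":
    for n in range(6):
        # Each generated term is compared with the closed form n(n+1)/2.
        print(n, t_forward(n), n * (n + 1) // 2)
```

An induction argument is still needed to show that the closed form holds for every n.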


2. Backward substitution

In this method backward values are substituted recursively in order to derive some formula.

For example,

Consider a recurrence relation

T(n)=T(n-1)+n

With initial condition T(0)=0

Substituting T(n-1)=T(n-2)+(n-1) into the recurrence gives

T(n)=T(n-2)+(n-1)+n

Let T(n-2)=T(n-3)+(n-2)

T(n)=T(n-3)+(n-2)+(n-1)+n

::

After k substitutions,

T(n)=T(n-k)+(n-k+1)+(n-k+2)+….+n

If k=n then

T(n)=T(0)+1+2+….+n

T(n)=0+1+2+….+n

T(n) =n(n+1)/2 =n^2/2 +n/2

We can denote T(n) in terms of Big oh notation as T(n)=O(n^2)

2. Recursion Tree Method

Steps to solve recurrence relation using recursion tree method:

1. Draw a recursive tree for given recurrence relation
2. Calculate the cost at each level and count the total no of levels in the recursion tree.
3. Count the total number of nodes in the last level and calculate the cost of the last level
4. Sum up the cost of all the levels in the recursive tree

For example

T(n) = 2T(n/2) + c
Step 1: Draw a recursive tree

[Figure: Recursion tree for T(n) = 2T(n/2) + c]

Step 2: Calculate the work done or cost at each level and count total no of levels in recursion tree

[Figure: Recursion tree with the cost of each level]

Count the total number of levels –
Choose the longest path from root node to leaf node
n/2^0 → n/2^1 → n/2^2 → ……… → n/2^k

Size of problem at last level = n/2^k

At last level size of problem becomes 1
n/2^k = 1
2^k = n
k = log2(n)

Total no of levels in recursive tree = k + 1 = log2(n) + 1

Step 3: Count total number of nodes in the last level and calculate cost of last level
No. of nodes at level 0 = 2^0 = 1
No. of nodes at level 1 = 2^1 = 2
………………………………………………………
No. of nodes at level log2(n) = 2^(log2(n)) = n^(log2(2)) = n
Cost of sub problems at level log2(n) (last level) = n x T(1) = n x 1 = n

Step 4: Sum up the cost of all the levels in the recursive tree

T(n) = c + 2c + 4c + … (log2(n) terms, one for each level except the last) + last level cost
= c(2^0 + 2^1 + 2^2 + … + 2^(log2(n) - 1)) + Θ(n)
The bracketed sum is a Geometric Progression (G.P.) with ratio 2 and log2(n) terms, so it equals
2^(log2(n)) - 1 = n - 1.
= c(n - 1) + Θ(n)
= Θ(n)
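The total obtained from the recursion tree can be cross-checked by evaluating the recurrence directly. The sketch below assumes that n is a power of 2 and that T(1) = 1, as in the last-level cost used above; the function name is hypothetical.

```python
def recurrence_cost(n, c=1):
    """Evaluate T(n) = 2T(n/2) + c with T(1) = 1, for n a power of 2."""
    if n == 1:
        return 1  # cost of a single leaf, T(1) = 1
    return 2 * recurrence_cost(n // 2, c) + c


if __name__ == "__main__":
    c = 3
    for n in (1, 2, 4, 8, 1024):
        # Internal levels contribute c(n - 1); the n leaves contribute n.
        print(n, recurrence_cost(n, c), c * (n - 1) + n)
```

Both columns agree, which matches the conclusion that the total cost grows linearly in n.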

3. Master Method

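The Master Method is used for solving recurrences of the following type:

T (n) = a T (n/b) + f (n), where a ≥ 1 and b > 1 are constants and f(n) is a function. It can be
interpreted as follows: a problem of size n is divided into a subproblems, each of size n/b, and f(n) is
the cost of dividing the problem and combining the results.

Let T (n) be defined on the non-negative integers by the recurrence

T (n) = a T (n/b) + f (n)

Master Theorem:

It is possible to obtain an asymptotically tight bound in these three cases:

Case 1: If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).

Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) log n).

Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a f(n/b) ≤ c f(n) for some constant
c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).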

Lower Bound
Steps to establish a bound on a recurrence:

1. Guess the form of the solution.
2. Use mathematical induction to find the constants and boundary conditions, and show that the
guess is correct.
Searching
Searching is a technique by which the location of a desired element is obtained. The search
is based on a search key, which is compared with the array elements. If the key and the current element
match, then the position of that element is returned.

Commonly used searching algorithms are,

1. Linear search
2. Binary search
3. Interpolation search
1. Linear Search

Linear search is a type of sequential searching algorithm. In this method, every element within
the input array is traversed and compared with the key element to be found. If a match is found in the
array the search is said to be successful; if there is no match found the search is said to be unsuccessful
and gives the worst-case time complexity.
Linear Search Algorithm

The algorithm for linear search is relatively simple. The procedure starts at the very first index of the
input array to be searched.

Step 1 − Start from the 0th index of the input array, compare the key value with the value present in the
0th index.

Step 2 − If the value matches with the key, return the position at which the value was found.

Step 3 − If the value does not match with the key, compare the next element in the array.

Step 4 − Repeat Step 3 until a match is found or the end of the array is reached. If a match is found,
return the position at which the match was found.

Step 5 − If it is an unsuccessful search, print that the element is not present in the array and exit the
program.

Pseudo code
procedure linear_search (list, value)
   for each item in the list
      if item == value
         return the item's location
      end if
   end for
   return not found
end procedure
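The pseudo code above can be written as a short runnable function. This is a minimal sketch; returning -1 for an unsuccessful search is an assumption made here, since the pseudo code does not fix a convention for it.

```python
def linear_search(items, key):
    """Return the index of the first element equal to key, or -1 if absent."""
    for index, item in enumerate(items):
        if item == key:
            return index  # successful search: position of the match
    return -1  # unsuccessful search: all n elements were compared


if __name__ == "__main__":
    data = [34, 10, 66, 27, 47]
    print(linear_search(data, 47))  # key is at the last index
    print(linear_search(data, 99))  # key is not in the array
```

Searching for the first element is the best case with one comparison, while a missing key forces a comparison with every element.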

Analysis

Linear search traverses the elements sequentially; therefore, the best case is when the element
is found in the very first iteration. The best-case time complexity would be O(1).

However, the worst case of the linear search method is an unsuccessful search that does not find
the key value in the array, because it performs n iterations. Therefore, the worst-case time complexity
of the linear search algorithm would be O(n).

Example
Let us look at the step-by-step searching of the key element (say 47) in an array using the linear search
method.
Step 1

The linear search starts from the 0th index. Compare the key element with the value in the 0th index, 34.

However, 47 ≠ 34. So it moves to the next element.

Step 2

Now, the key is compared with value in the 1st index of the array.

Still, 47 ≠ 10, making the algorithm move for another iteration.

Step 3

The next element 66 is compared with 47. They are both not a match so the algorithm compares the
further elements.
Step 4

Now the element in 3rd index, 27, is compared with the key value, 47. They are not equal so the
algorithm is pushed forward to check the next element.

Step 5

Comparing the element in the 4th index of the array, 47, to the key 47, we find that both the
elements match. Now, the position in which 47 is present, i.e., 4, is returned.

The output achieved is “Element found at 4th index”.

2. Binary Search

Binary search is a fast search algorithm with run-time complexity of Ο(log n). This search algorithm
works on the principle of divide and conquer, since it divides the array into half before searching. For
this algorithm to work properly, the data collection should be in the sorted form.

Binary search looks for a particular key value by comparing it with the middle most item of the
collection. If a match occurs, then the index of the item is returned. But if the middle item has a value
greater than the key value, the left sub-array of the middle item is searched. Otherwise, the right
sub-array is searched. This process continues recursively until the size of a sub array reduces to zero.
Binary Search Algorithm

Step 1 − Select the middle item in the array and compare it with the key value to be searched. If it is
matched, return the position of the middle item.

Step 2 − If it does not match the key value, check if the key value is either greater than or less than the
middle value.

Step 3 − If the key is greater, perform the search in the right sub-array; but if the key is lower than the
middle value, perform the search in the left sub-array.

Step 4 − Repeat Steps 1, 2 and 3 iteratively, until the key is found or the sub-array becomes empty.

Step 5 − If the key value does not exist in the array, then the algorithm returns an unsuccessful search.

Pseudocode
Procedure binary_search
   A ← sorted array
   n ← size of array
   x ← value to be searched

   Set lowerBound = 1
   Set upperBound = n

   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist.

      set midPoint = lowerBound + ( upperBound - lowerBound ) / 2

      if A[midPoint] < x
         set lowerBound = midPoint + 1

      if A[midPoint] > x
         set upperBound = midPoint - 1

      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure
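The procedure above translates almost line for line into runnable code. This sketch uses 0-based indexing instead of the 1-based bounds of the pseudo code and returns -1 when the value is absent; both choices are assumptions made for illustration, as is the sample array.

```python
def binary_search(sorted_items, x):
    """Return the index of x in sorted_items, or -1 if x is not present."""
    lower, upper = 0, len(sorted_items) - 1
    while lower <= upper:
        # Written this way to mirror the pseudo code's midpoint formula.
        mid = lower + (upper - lower) // 2
        if sorted_items[mid] < x:
            lower = mid + 1  # x can only be in the right sub-array
        elif sorted_items[mid] > x:
            upper = mid - 1  # x can only be in the left sub-array
        else:
            return mid
    return -1  # the bounds crossed, so x does not exist


if __name__ == "__main__":
    # Hypothetical sorted sample array with ten elements.
    data = [10, 14, 19, 26, 27, 31, 33, 35, 42, 44]
    print(binary_search(data, 31))
    print(binary_search(data, 20))
```

For this sample array the probes for 31 visit indices 4, 7 and 5 in that order.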

Analysis

In the worst case, the search keeps halving the array until only one element is left. If this takes i
iterations, then

n/2^i = 1

That gives us − n = 2^i

Applying log (base 2) on both sides,

log n = log 2^i
log n = i. log 2
i = log n

The time complexity of the binary search algorithm is O(log n)

Example

For a binary search to work, it is mandatory for the target array to be sorted. We shall learn the process
of binary search with a pictorial example. The following is our sorted array and let us assume that we
need to search the location of value 31 using binary search.

First, we shall determine half of the array by using this formula −

mid = low + (high - low) / 2

Here it is, 0 + (9 - 0) / 2 = 4 (integer value of 4.5). So, 4 is the mid of the array.

Now we compare the value stored at location 4, with the value being searched, i.e. 31. We find that the
value at location 4 is 27, which is not a match. As the target value 31 is greater than 27 and we have a
sorted array, we also know that the target value must be in the upper portion of the array.

We change our low to mid + 1 and find the new mid value again.

low = mid + 1
mid = low + (high - low) / 2

Our new mid is 7 now. We compare the value stored at location 7 with our target value 31.
The value stored at location 7 is not a match, rather it is greater than what we are looking for. So, the
value must be in the lower part from this location.

Hence, we change our high to mid - 1 and calculate the mid again. This time it is 5.

We compare the value stored at location 5 with our target value. We find that it is a match.

We conclude that the target value 31 is stored at location 5.

3. Interpolation Search

Interpolation search is an improved variant of binary search. This search algorithm works on the
probing position of the required value. For this algorithm to work properly, the data collection should
be in a sorted form and uniformly distributed.

Position Probing in Interpolation Search

Interpolation search finds a particular item by computing the probe position. Initially, the probe
position is the position of the middle most item of the collection.
If a match occurs, then the index of the item is returned. Otherwise, the list is split into two parts at a
new probe position, which is computed from the key and the values at both ends of the search range
using the formula given in the example below.

Interpolation Search Algorithm

1. Start searching data from middle of the list.

2. If it is a match, return the index of the item, and exit.

3. If it is not a match, probe position.

4. Divide the list using probing formula and find the new middle.

5. If data is greater than middle, search in higher sub-list.

6. If data is smaller than middle, search in lower sub-list.

7. Repeat until match.

Example

To understand the step-by-step process involved in the interpolation search, let us look at an example
and work around it.

Consider an array of sorted elements given below −

Let us search for the element 19.

Solution

Unlike binary search, the middle point in this approach is chosen using the formula

mid = low + ((high - low) * (key - A[low])) / (A[high] - A[low])

So in this given array input,

Lo = 0, A[Lo] = 10

Hi = 9, A[Hi] = 44

X = 19

Applying the formula to find the middle point in the list,

mid = 0 + ((9 - 0) * (19 - 10)) / (44 - 10) = 81 / 34 ≈ 2.38

Since mid is an index value, we only consider the integer part of the decimal. That is, mid = 2.

Comparing the key element given, that is 19, to the element present in the mid index, it is found that
both the elements match.

Therefore, the element is found at index 2.
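The probing formula can be tried out with a short function. This is a minimal sketch: the sample array is an assumption chosen to agree with the values quoted in the example (A[0] = 10, A[9] = 44, and 19 at index 2), and returning -1 for a missing key is also an assumption.

```python
def interpolation_search(sorted_items, key):
    """Return the index of key in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high and sorted_items[low] <= key <= sorted_items[high]:
        if sorted_items[high] == sorted_items[low]:
            # All remaining values are equal; avoid dividing by zero.
            return low if sorted_items[low] == key else -1
        # Probe position from the formula, keeping only the integer part.
        mid = low + (high - low) * (key - sorted_items[low]) // (
            sorted_items[high] - sorted_items[low]
        )
        if sorted_items[mid] == key:
            return mid
        if sorted_items[mid] < key:
            low = mid + 1  # search in the higher sub-list
        else:
            high = mid - 1  # search in the lower sub-list
    return -1


if __name__ == "__main__":
    # Hypothetical array consistent with the example's stated values.
    data = [10, 14, 19, 26, 27, 31, 33, 35, 42, 44]
    print(interpolation_search(data, 19))
```

The loop condition also stops the search as soon as the key falls outside the range of values that remain, which keeps the computed probe position inside the array.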

String Matching Algorithm

String Matching Algorithm is also called "String Searching Algorithm." This is a vital class of string
algorithms, described as "the method to find a place where one or several strings are found within
a larger string."

Given a text array, T [1.....n], of n characters and a pattern array, P [1......m], of m characters, the
problem is to find an integer s, called a valid shift, where 0 ≤ s ≤ n-m and T [s+1......s+m] = P [1......m].
In other words, to find whether P occurs in T, i.e., whether P is a substring of T. The items of P and T are
characters drawn from some finite alphabet such as {0, 1} or {A, B .....Z, a, b..... z}.

There are different methods used for finding the string:

1. The Naive String Matching Algorithm
2. The Rabin-Karp-Algorithm
3. Knuth-Morris-Pratt Algorithm

1. Naive String Matching Algorithm

The naïve approach tests all the possible placement of Pattern P [1.......m] relative to text T [1......n].
We try shift s = 0, 1.......n-m, successively and for each shift s. Compare T [s+1.......s+m] to P [1......m].
The naïve algorithm finds all valid shifts using a loop that checks the condition P [1.......m] = T
[s+1.......s+m] for each of the n - m +1 possible value of s.

NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s

Analysis: The for loop from line 3 to line 5 executes n-m+1 times (we need at least m characters at the
end), and in each iteration we do up to m comparisons. So the total complexity is O ((n-m+1) m).

Example:
Suppose T = 1011101110 and P = 111. Find all the valid shifts.
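The example can be worked out by running the naive matcher directly. The sketch below follows the pseudo code, with Python's 0-based slices standing in for T [s+1....s+m]; the function name is hypothetical.

```python
def naive_string_matcher(text, pattern):
    """Return every valid shift s at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):  # s = 0, 1, ..., n - m
        # The slice comparison performs up to m character comparisons.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts


if __name__ == "__main__":
    print(naive_string_matcher("1011101110", "111"))
```

For T = 1011101110 and P = 111 the valid shifts are s = 2 and s = 6, that is, T [3....5] and T [7....9] in the 1-based notation of the text.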
2. Rabin-Karp-Algorithm

The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each
M-character substring of the text to be compared. If the hash values are unequal, the algorithm
moves on and determines the hash value for the next M-character substring. If the hash values are equal,
the algorithm compares the pattern and the M-character substring character by character. In this way,
there is only one hash comparison per text substring, and character matching is only required when the
hash values match.

RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← d^(m-1) mod q
4. p ← 0
5. t_0 ← 0
6. for i ← 1 to m
7. do p ← (d*p + P[i]) mod q
8. t_0 ← (d*t_0 + T [i]) mod q
9. for s ← 0 to n-m
10. do if p = t_s
11. then if P [1.....m] = T [s+1.....s + m]
12. then "Pattern occurs with shift" s
13. If s < n-m
14. then t_(s+1) ← (d (t_s - T [s+1]h) + T [s+m+1]) mod q

Example: For string matching, working modulo q = 11, how many spurious hits does the Rabin-Karp
matcher encounter in the text T = 31415926535.......

1. T = 31415926535.......
2. P = 26
3. Here T.Length =11 so Q = 11
4. And P mod Q = 26 mod 11 = 4
5. Now find the exact match of P mod Q...

Solution:
Complexity:

The running time of RABIN-KARP-MATCHER in the worst case is O ((n-m+1) m), but it has a
good average case running time. If the expected number of valid shifts is small, O (1), and the prime q
is chosen to be quite large, then the Rabin-Karp algorithm can be expected to run in time O (n+m) plus
the time required to process spurious hits.
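The rolling hash of the matcher can be traced on the example with T = 31415926535, P = 26 and q = 11. The following is a sketch for strings of decimal digits only, so d = 10; counting the spurious hits alongside the matches is an addition made here for illustration, and only the eleven digits shown in the example are used.

```python
def rabin_karp_matcher(text, pattern, d=10, q=11):
    """Return (valid shifts, number of spurious hits) for digit strings."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], 0
    h = pow(d, m - 1, q)  # d^(m-1) mod q
    p = t = 0
    for i in range(m):  # hash of the pattern and of the first window
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    shifts, spurious = [], 0
    for s in range(n - m + 1):
        if p == t:
            if text[s:s + m] == pattern:
                shifts.append(s)
            else:
                spurious += 1  # equal hash values but different strings
        if s < n - m:
            # Drop the leading digit, shift left, append the next digit.
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return shifts, spurious


if __name__ == "__main__":
    print(rabin_karp_matcher("31415926535", "26"))
```

On these eleven digits the windows 15, 59 and 92 all hash to 4, the same value as 26 mod 11, so the matcher meets three spurious hits before the valid shift s = 6.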

Knuth-Morris-Pratt (KMP) Algorithm

Knuth, Morris and Pratt introduced a linear time algorithm for the string matching problem. A matching
time of O (n) is achieved by avoiding comparisons with elements of 'S' that have previously been
involved in a comparison with some element of the pattern 'p' to be matched, i.e., backtracking on the
string 'S' never occurs.

Components of KMP Algorithm:

1. The Prefix Function (Π): The Prefix Function, Π for a pattern encapsulates knowledge about how
the pattern matches against the shift of itself. This information can be used to avoid a useless shift of
the pattern 'p.' In other words, this enables avoiding backtracking of the string 'S.'

2. The KMP Matcher: With string 'S,' pattern 'p' and prefix function 'Π' as inputs, it finds the
occurrences of 'p' in 'S' and returns the shifts of 'p' at which the occurrences are found.

The following pseudo code computes the prefix function, Π:

COMPUTE-PREFIX-FUNCTION (P)
1. m ← length [P] // 'p' pattern to be matched
2. Π [1] ← 0
3. k ← 0
4. for q ← 2 to m
5. do while k > 0 and P [k + 1] ≠ P [q]
6. do k ← Π [k]
7. If P [k + 1] = P [q]
8. then k ← k + 1
9. Π [q] ← k
10. Return Π

Running Time Analysis:

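In the above pseudo code for calculating the prefix function, the for loop from step 4 to step 9 runs
m - 1 times. Inside the loop, k increases by at most 1 per iteration and every pass of the while loop
decreases k, so the while loop runs fewer than m times in total. Step 1 to Step 3 take constant time.
Hence the running time of computing prefix function is O (m).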

Example: Compute Π for the pattern 'p' below:


Solution:

Initially: m = length [p] = 7


Π [1] = 0
k = 0
After iteration 6 times, the prefix function computation is complete:

The KMP Matcher:

The KMP Matcher, with the pattern 'p,' the string 'S' and prefix function 'Π' as input, finds the matches
of p in S. The following pseudo code computes the matching component of the KMP algorithm:

KMP-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. Π← COMPUTE-PREFIX-FUNCTION (P)
4. q ← 0 // numbers of characters matched
5. for i ← 1 to n // scan S from left to right
6. do while q > 0 and P [q + 1] ≠ T [i]
7. do q ← Π [q] // next character does not match
8. If P [q + 1] = T [i]
9. then q ← q + 1 // next character matches
10. If q = m // is all of p matched?
11. then print "Pattern occurs with shift" i - m
12. q ← Π [q] // look for the next match

Running Time Analysis:

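The for loop beginning in step 5 runs 'n' times, i.e., once for each character of the string 'S.' Inside the
loop, q increases by at most 1 per iteration and every pass of the while loop in step 6 decreases q, so the
while loop runs at most n times in total. Thus the running time of the matching function is O (n), in
addition to the O (m) time taken by COMPUTE-PREFIX-FUNCTION in step 3.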
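Both components of the KMP algorithm can be sketched in runnable form. The code below uses 0-based indexing, so pi[q] holds the value that the pseudo code stores in Π [q + 1]; the sample pattern and text in the usage lines are hypothetical and are not taken from the worked example.

```python
def compute_prefix_function(pattern):
    """pi[q] = length of the longest proper prefix of pattern[:q + 1]
    that is also a suffix of it."""
    m = len(pattern)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]  # fall back to the next shorter border
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text, pattern):
    """Return every shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i in range(n):  # the text index i never moves backwards
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]  # next character does not match
        if pattern[q] == text[i]:
            q += 1  # next character matches
        if q == m:  # all of the pattern is matched
            shifts.append(i - m + 1)
            q = pi[q - 1]  # look for the next match
    return shifts


if __name__ == "__main__":
    print(compute_prefix_function("ababaca"))
    print(kmp_matcher("1011101110", "111"))
```

Because the text index only moves forward, overlapping occurrences are still found without re-reading any text character.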

Insertion Sort

Insertion sort is a very simple method to sort numbers in an ascending or descending order. This
method follows the incremental method. It can be compared with the technique how cards are sorted at
the time of playing a game.

This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is always
sorted. For example, the lower part of an array is maintained to be sorted. An element which is to be
'inserted' in this sorted sub-list, has to find its appropriate place and then it has to be inserted there.
Hence the name, insertion sort.

The array is scanned sequentially and unsorted items are moved and inserted into the sorted sub-list (in
the same array). This algorithm is not suitable for large data sets as its average and worst case
complexity are O(n^2), where n is the number of items.

Insertion Sort Algorithm

Now we have a bigger picture of how this sorting technique works, so we can derive simple steps by
which we can achieve insertion sort.

Step 1 − If it is the first element, it is already sorted.

Step 2 − Pick next element

Step 3 − Compare it with the elements in the sorted sub-list, moving from right to left

Step 4 − Shift all the elements in the sorted sub-list that are greater than the value to be sorted

Step 5 − Insert the value

Step 6 − Repeat until list is sorted


Algorithm: Insertion-Sort(A)

for j = 2 to A.length
   key = A[j]
   i = j - 1
   while i > 0 and A[i] > key
      A[i + 1] = A[i]
      i = i - 1
   A[i + 1] = key
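The algorithm above can be run as follows. This is a minimal sketch with 0-based indexing, so the outer loop starts at index 1 and the inner loop stops at i >= 0; the sample values are only an illustration.

```python
def insertion_sort(items):
    """Sort the list in place in ascending order and return it."""
    for j in range(1, len(items)):
        key = items[j]  # next value to insert into the sorted sub-list
        i = j - 1
        # Shift every larger element of the sorted sub-list one place right.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key  # insert the value at its place
    return items


if __name__ == "__main__":
    print(insertion_sort([14, 33, 27, 10]))
```

On an already sorted input the while loop never runs, which gives the linear best case; a reversed input makes it shift every earlier element, which gives the quadratic worst case.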

Analysis

The run time of this algorithm depends heavily on the given input.

If the given numbers are already sorted, this algorithm runs in O(n) time. If the given numbers are in
reverse order, the algorithm runs in O(n^2) time.

Example

We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in sorted sub-list.
Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position. It swaps 33 with 27. It also checks with all the elements
of sorted sub-list. Here we see that the sorted sub-list has only one element 14, and 27 is greater than
14. Hence, the sorted sub-list remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10. These values are not in
a sorted order.

So they are swapped.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.


Again we find 14 and 10 in an unsorted order.

We swap them again.

By the end of third iteration, we have a sorted sub-list of 4 items.

Heap Sort

Heap Sort is an efficient sorting technique based on the heap data structure.

The heap is a nearly-complete binary tree in which every parent node is either smaller than or equal to
its children, or greater than or equal to them. A heap with the minimum element at the root is called a
min-heap, and a heap with the maximum element at the root is called a max-heap. The heap sort
algorithm processes the input data using one of these two kinds of heap.

The heap sort algorithm follows two main operations in this procedure −

• Builds a heap H from the input data using the heapify method (explained further into the chapter),
based on the way of sorting – ascending order or descending order.
• Deletes the root element of the heap and repeats until all the input elements are
processed.
The heap sort algorithm heavily depends upon the heapify method of the binary tree. So what is this
heapify method?

Heapify Method

The heapify method converts a binary tree into a heap data structure. This method uses
recursion to heapify the nodes of the binary tree.

Note − The binary tree must be a complete (nearly complete) binary tree: every level is full except
possibly the last, which is filled from left to right. This is what allows the tree to be stored in an array.

The complete binary tree will be converted into either a max-heap or a min-heap by applying
the heapify method.


Heap Sort Algorithm

As described in the algorithm below, the sorting algorithm first constructs a max-heap by calling the
Build-Max-Heap algorithm. It then repeatedly swaps the root element (the current maximum) with the last
leaf of the heap, shrinks the heap by one element, and applies the heapify method to restore the heap property.

Algorithm: Heapsort(A)
    BUILD-MAX-HEAP(A)
    for i = A.length downto 2
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)

Analysis

The heap sort algorithm combines the better attributes of two other sorting algorithms: insertion sort and merge
sort.

Like insertion sort, it sorts in place: only a constant number of array elements are stored
outside the input array at any time.

Like merge sort, its time complexity is O(n log n).

Example

Let us look at an example array to understand the sort algorithm better −

12 3 9 14 10 18 8 23

Building a heap using the BUILD-MAX-HEAP algorithm from the input array −
Rearrange the obtained binary tree by exchanging the nodes such that a heap data structure is formed.
The Heapify Algorithm

Applying the heapify method, remove the root node from the heap and replace it with the next
immediate maximum valued child of the root.

The root node is 23, so 23 is popped and 18 is made the next root because it is the next maximum node
in the heap.
Now, 18 is popped after 23 which is replaced by 14.

The current root 14 is popped from the heap and is replaced by 12.
12 is popped and replaced with 10.

Similarly all the other elements are popped using the same process.
Here the current root element 9 is popped and the elements 8 and 3 remain in the tree.

Then, 8 will be popped leaving 3 in the tree.


After completing the heap sort operation on the given heap, the sorted elements are displayed as shown
below −

Every time an element is popped, it is added at the beginning of the output array, since the heap
formed is a max-heap. If the heapify method instead converts the binary tree to a min-heap,
the popped elements are added at the end of the output array.

The final sorted list is,

3 8 9 10 12 14 18 23
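
The Heapsort pseudocode above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names max_heapify and heap_sort are illustrative, the array is 0-based (so the children of index i are 2i + 1 and 2i + 2, not 2i and 2i + 1 as in the 1-based pseudocode), and heapify is written as a loop rather than recursively.

```python
def max_heapify(a, heap_size, i):
    """Sift a[i] down until the subtree rooted at index i is a max-heap."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(values):
    """Return a new list sorted in ascending order using a max-heap."""
    a = list(values)
    n = len(a)
    # BUILD-MAX-HEAP: heapify every internal node, bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, n, i)
    # Repeatedly move the current maximum to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        max_heapify(a, end, 0)
    return a


print(heap_sort([12, 3, 9, 14, 10, 18, 8, 23]))
```

The input is the example array from this section; the sort happens inside the copied array, which matches the constant extra storage noted in the analysis.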
UNIT II GRAPH ALGORITHMS

Graph algorithms: Representations of graphs - Graph traversal: DFS – BFS - applications - Connectivity,
strong connectivity, bi-connectivity - Minimum spanning tree: Kruskal’s and Prim’s algorithm- Shortest
path: Bellman-Ford algorithm - Dijkstra’s algorithm - Floyd-Warshall algorithm Network flow: Flow
networks - Ford-Fulkerson method – Matching: Maximum bipartite matching

Definition

A graph G(V, E) is a non-linear data structure that consists of a set of vertices (nodes) V and a set of
edges E, where each edge links a pair of vertices.

There are 2 types of graphs:

1.Directed

2.Undirected

1.Directed graph

A graph with only directed edges is said to be a directed graph.

Example

The following directed graph has 5 vertices and 8 edges. This graph G can be defined as G = (V, E),
where V = {A, B, C, D, E} and E = {(A,B), (A,C), (B,E), (B,D), (D,A), (D,E), (C,D), (D,D)}.

2.Undirected graph

A graph with only undirected edges is said to be an undirected graph.

Example

The following is an undirected graph.

Representation of Graphs

Graph data structure is represented using the following representations.

1. Adjacency Matrix
2. Adjacency List

1.Adjacency Matrix

In this representation, the graph can be represented using a matrix of size n x n, where n is the
number of vertices.

This matrix is filled with either 1’s or 0’s.

Here, 1 represents that there is an edge from row vertex to column vertex, and 0 represents that
there is no edge from row vertex to column vertex.

2.Adjacency list

In this representation, every vertex of the graph contains a list of its adjacent vertices.

If the graph is sparse, i.e., the number of edges is small compared with the number of possible edges, then it is
more space-efficient to represent the graph through the adjacency list.
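
As an illustration, both representations can be built for the directed graph G defined earlier in this unit (5 vertices, 8 edges). This is only a sketch: the variable names vertices, edges, index, matrix and adj_list are assumptions chosen for the example.

```python
vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "E"), ("B", "D"),
         ("D", "A"), ("D", "E"), ("C", "D"), ("D", "D")]

index = {v: k for k, v in enumerate(vertices)}   # vertex name -> row/column number
n = len(vertices)

matrix = [[0] * n for _ in range(n)]             # n x n adjacency matrix, all zeros
adj_list = {v: [] for v in vertices}             # vertex -> list of adjacent vertices

for u, v in edges:
    matrix[index[u]][index[v]] = 1               # edge from row vertex u to column vertex v
    adj_list[u].append(v)

for v, row in zip(vertices, matrix):
    print(v, row)
print(adj_list)
```

The matrix always uses n x n cells, while the adjacency list stores one entry per edge, which is why the list is preferred for sparse graphs.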

Graph traversals

Graph traversal is a technique used to search for a vertex in a graph. It is also used to decide the order
of vertices to be visited in the search process.

A graph traversal finds the edges to be used in the search process without creating loops. This means that,
with graph traversal, we can visit all the vertices of the graph without getting into a looping path.
There are two graph traversal techniques:

1. DFS (Depth First Search)


2. BFS (Breadth-First Search)

Applications of graphs

Social network graphs : To tweet or not to tweet. Graphs that represent who knows whom, who
communicates with whom, who influences whom, or other relationships in social structures. An
example is the twitter graph of who follows whom.

Graphs in epidemiology: Vertices represent individuals and directed edges to view the transfer of an
infectious disease from one individual to another. Analyzing such graphs has become an important
component in understanding and controlling the spread of diseases.
Protein-protein interactions graphs: Vertices represent proteins and edges represent interactions
between them that carry out some biological function in the cell. These graphs can be used to, for
example, study molecular pathway—chains of molecular interactions in a cellular process.

Network packet traffic graphs: Vertices are IP (Internet protocol) addresses and edges are the packets
that flow between them. Such graphs are used for analyzing network security, studying the spread of
worms, and tracking criminal or non-criminal activity.

Neural networks: Vertices represent neurons and edges are the synapses between them. Neural
networks are used to understand how our brain works and how connections change when we learn. The
human brain has about 10^11 neurons and close to 10^15 synapses.

DFS – Depth First Search

The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to remember
the vertex from which to resume the search when a dead end occurs in any iteration.

As in the example given above, DFS algorithm traverses from S to A to D to G to E to B first, then to F and
lastly to C. It employs the following rules.

Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it in a stack.

Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It will pop up all the vertices from
the stack, which do not have adjacent vertices.)

Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step  Description

1  Initialize the stack.

2  Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S. We have three nodes and we can pick any of them. For this example, we shall take the nodes in alphabetical order.

3  Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A. Both S and D are adjacent to A but we are concerned with unvisited nodes only.

4  Visit D, mark it as visited and put it onto the stack. Here, we have B and C nodes, which are adjacent to D and both are unvisited. However, we shall again choose in alphabetical order.

5  We choose B, mark it as visited and put it onto the stack. Here B does not have any unvisited adjacent node. So, we pop B from the stack.

6  We check the stack top to return to the previous node and check if it has any unvisited nodes. Here, we find D to be on the top of the stack.

7  The only unvisited adjacent node from D is now C. So we visit C, mark it as visited and put it onto the stack.

DFS(G, u)
    u.visited = true
    for each v ∈ G.Adj[u]
        if v.visited == false
            DFS(G, v)

init() {
    for each u ∈ G
        u.visited = false
    for each u ∈ G
        if u.visited == false
            DFS(G, u)
}
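
The recursive DFS above can be sketched in Python as follows. The graph here is an assumption reconstructed from the step table (S adjacent to A, B and C; D adjacent to A, B and C), since the original figure is not reproduced; the function name dfs is also illustrative. Neighbours are taken in alphabetical order, as in the walk-through.

```python
def dfs(graph, u, visited=None):
    """Return the vertices reachable from u in depth-first visiting order."""
    if visited is None:
        visited = []
    visited.append(u)                    # mark u as visited
    for v in sorted(graph[u]):           # alphabetical order, as in the example
        if v not in visited:
            dfs(graph, v, visited)       # the call stack plays the role of the explicit stack
    return visited


# Hypothetical undirected graph matching the walk-through.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}

print(dfs(graph, "S"))
```

The list membership test keeps the sketch short; for large graphs a set of visited vertices would be the usual choice.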

Application of DFS Algorithm

• For finding the path


• To test if the graph is bipartite
• For finding the strongly connected components of a graph
• For detecting cycles in a graph

Breadth First Search

The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to
remember the vertex from which to continue the search when a dead end occurs in any iteration.

As in the example given above, BFS algorithm traverses from A to B to E to F first then to C and G lastly to
D. It employs the following rules.

Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.

Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step  Description

1  Initialize the queue.

2  We start from visiting S (starting node), and mark it as visited.

3  We then see an unvisited adjacent node from S. In this example, we have three nodes but alphabetically we choose A, mark it as visited and enqueue it.

4  Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.

5  Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.

6  Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.

7  From A we have D as unvisited adjacent node. We mark it as visited and enqueue it.

BFS pseudocode

create a queue Q
mark v as visited and put v into Q
while Q is non-empty
    remove the head u of Q
    mark and enqueue all (unvisited) neighbours of u

BFS Algorithm Complexity

The time complexity of the BFS algorithm is represented in the form of O(V + E), where V is the number of
nodes and E is the number of edges.

The space complexity of the algorithm is O(V).
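
The BFS pseudocode can be sketched in Python as follows. As with the step table, the graph is an assumption reconstructed from the walk-through (S adjacent to A, B and C; D adjacent to A, B and C), and the name bfs is illustrative.

```python
from collections import deque


def bfs(graph, start):
    """Return the vertices reachable from start in breadth-first visiting order."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()              # remove the head u of Q
        for v in sorted(graph[u]):       # alphabetical order, as in the example
            if v not in seen:
                seen.add(v)              # mark v as visited
                order.append(v)
                queue.append(v)          # enqueue v
    return order


# Hypothetical undirected graph matching the walk-through.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}

print(bfs(graph, "S"))
```

Every vertex is enqueued once and every adjacency list is scanned once, which is where the O(V + E) bound comes from (ignoring the sorting done here only to fix the visiting order).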

BFS Algorithm Applications

• To build an index for a search engine by crawling pages


• For GPS navigation
• Path finding algorithms
• In Ford-Fulkerson algorithm to find maximum flow in a network
• Cycle detection in an undirected graph
• In minimum spanning tree

Bi Connectivity Graph

An undirected graph is said to be bi-connected if there are two vertex-disjoint paths between any two
vertices. In other words, any two vertices lie on a common cycle.
We can say that a graph G is bi-connected if it is connected and has no articulation point
(cut vertex).

To solve this problem, we will use the DFS traversal. Using DFS, we will try to find whether any
articulation point is present. We also check whether all vertices are visited by the DFS; if not, we
can say that the graph is not connected.

Pseudocode for Bi connectivity

isArticulation(start, visited, disc, low, parent)
Begin
   time := 0   //static: the value of time is not re-initialized on later function calls
   dfsChild := 0
   mark start as visited
   set disc[start] := time + 1 and low[start] := time + 1
   time := time + 1

   for all vertex v in the graph G, do
      if there is an edge between (start, v), then
         if v is not visited, then
            increase dfsChild
            parent[v] := start
            if isArticulation(v, visited, disc, low, parent) is true, then
               return true
            low[start] := minimum of low[start] and low[v]
            if parent[start] is φ AND dfsChild > 1, then
               return true
            if parent[start] is not φ AND low[v] >= disc[start], then
               return true
         else if v is not the parent of start, then
            low[start] := minimum of low[start] and disc[v]
   done
   return false
End

isBiconnected(graph)
Begin
   initially set all vertices as unvisited and the parent of each vertex as φ
   if isArticulation(0, visited, disc, low, parent) = true, then
      return false
   for each node i of the graph, do
      if i is not visited, then
         return false
   done
   return true
End
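
The two procedures can be sketched in Python as one function. This is a sketch under assumptions: the graph is a dictionary mapping each vertex to its neighbours (undirected, no parallel edges), the names is_biconnected and has_articulation are illustrative, and the discovery-time dictionary disc doubles as the visited set.

```python
def is_biconnected(graph):
    """graph: dict vertex -> list of neighbours of an undirected simple graph."""
    if not graph:
        return True
    disc, low = {}, {}                   # discovery time and low-link value per vertex
    counter = 0

    def has_articulation(u, parent):
        nonlocal counter
        counter += 1
        disc[u] = low[u] = counter
        children = 0
        for v in graph[u]:
            if v not in disc:            # tree edge to an unvisited vertex
                children += 1
                if has_articulation(v, u):
                    return True
                low[u] = min(low[u], low[v])
                if parent is None and children > 1:
                    return True          # root with two DFS children
                if parent is not None and low[v] >= disc[u]:
                    return True          # subtree of v cannot bypass u
            elif v != parent:            # back edge
                low[u] = min(low[u], disc[v])
        return False

    root = next(iter(graph))
    if has_articulation(root, None):
        return False
    return len(disc) == len(graph)       # every vertex must have been reached


cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(is_biconnected(cycle), is_biconnected(path))
```

In the path graph the middle vertex is an articulation point, so the check fails there, while the 4-cycle has none.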

Minimum Spanning Tree

A spanning tree of a connected, undirected graph with V vertices is a subgraph that is a tree and contains all
V vertices; it therefore has V-1 edges. All nodes in a spanning tree are reachable from each other.

A Minimum Spanning Tree(MST) or minimum weight spanning tree for a weighted, connected, undirected
graph is a spanning tree having a weight less than or equal to the weight of every other possible spanning
tree. The weight of a spanning tree is the sum of weights given to each edge of the spanning tree. In short out
of all spanning trees of a given graph, the spanning tree having minimum weight is MST.

Algorithms for finding Minimum Spanning Tree(MST):-

1. Prim’s Algorithm
2. Kruskal’s Algorithm

Prim’s Algorithm

Prim's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the subset of the
edges of that graph which

• form a tree that includes every vertex
• has the minimum sum of weights among all the trees that can be formed from the graph

How Prim's algorithm works

It falls under a class of algorithms called greedy algorithms that find the local optimum in the hopes of
finding a global optimum.

We start from one vertex and keep adding edges with the lowest weight until we reach our goal. The steps for
implementing Prim's algorithm are as follows:

1. Initialize the minimum spanning tree with a vertex chosen at random.

2. Find all the edges that connect the tree to new vertices, find the minimum and add it to the tree.

3. Keep repeating step 2 until we get a minimum spanning tree.

Example of Prim's algorithm


Start with a weighted graph

Choose a vertex

Choose the shortest edge from this vertex and add it

Choose the nearest vertex not yet in the solution

Choose the nearest edge not yet in the solution, if there are multiple choices, choose one at random
Prim's Algorithm pseudocode

The pseudocode for prim's algorithm shows how we create two sets of vertices U and V-U. U
contains the list of vertices that have been visited and V-U the list of vertices that haven't. One
by one, we move vertices from set V-U to set U by connecting the least weight edge.

T = ∅;
U = { 1 };
while (U ≠ V)
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U;
    T = T ∪ {(u, v)}
    U = U ∪ {v}

Prim's Algorithm Complexity

The time complexity of Prim's algorithm is O(E log V) when the graph is stored as adjacency lists and the candidate edges are kept in a binary heap.
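
A heap-based version of the pseudocode can be sketched in Python as follows. The small weighted graph, the function name prim and the (weight, neighbour) adjacency format are all assumptions made for this illustration; they do not come from the figures of this section.

```python
import heapq


def prim(graph, start):
    """graph: dict u -> list of (weight, v) for a connected undirected graph."""
    in_tree = {start}                                   # the set U of the pseudocode
    tree_edges = []                                     # the set T of the pseudocode
    frontier = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(frontier)
    while frontier and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(frontier)               # lowest-cost candidate edge
        if v in in_tree:
            continue                                    # both ends already in U: skip
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (w2, v, x))
    return tree_edges


# Hypothetical example graph given as an undirected edge list.
edges = [("A", "B", 4), ("A", "C", 1), ("B", "C", 2), ("B", "D", 5), ("C", "D", 8)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((w, v))
    graph.setdefault(v, []).append((w, u))

mst = prim(graph, "A")
print(mst, sum(w for _, _, w in mst))
```

Each edge is pushed onto the heap at most twice, so the heap operations dominate the running time.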

Kruskal Algorithm

Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds
the subset of the edges of that graph which

• form a tree that includes every vertex
• has the minimum sum of weights among all the trees that can be formed from the graph

How Kruskal's algorithm works


It falls under a class of algorithms called greedy algorithms that find the local optimum in the
hopes of finding a global optimum.

We start from the edges with the lowest weight and keep adding edges until we reach our goal.
The steps for implementing Kruskal's algorithm are as follows:

1. Sort all the edges from low weight to high.

2. Take the edge with the lowest weight and add it to the spanning tree. If adding the edge would create a
cycle, then reject this edge.

3. Keep adding edges until we reach all vertices.

Example of Kruskal's algorithm

Start with a weighted graph

Choose the edge with the least weight; if there is more than one, choose any one

Choose the next shortest edge and add it


Choose the next shortest edge that doesn't create a cycle and add it

Choose the next shortest edge that doesn't create a cycle and add it

Repeat until you have a spanning tree

Kruskal Algorithm Pseudocode

KRUSKAL(G):
    A = ∅
    For each vertex v ∈ G.V:
        MAKE-SET(v)
    For each edge (u, v) ∈ G.E ordered by increasing order by weight(u, v):
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
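
The pseudocode can be sketched in Python with a simple disjoint-set structure. The function name kruskal, the edge format (u, v, weight) and the example graph are assumptions for illustration only; the union step here links roots directly, without the union-by-rank refinement.

```python
def kruskal(vertices, edges):
    """edges: list of (u, v, weight) for an undirected graph; returns the chosen edges."""
    parent = {v: v for v in vertices}            # MAKE-SET for every vertex

    def find(v):                                 # FIND-SET with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):   # increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                     # the edge joins two different components
            tree.append((u, v, w))
            parent[root_u] = root_v              # UNION
    return tree


# Hypothetical example graph.
vertices = ["A", "B", "C", "D"]
edges = [("A", "B", 4), ("A", "C", 1), ("B", "C", 2), ("B", "D", 5), ("C", "D", 8)]

mst = kruskal(vertices, edges)
print(mst, sum(w for _, _, w in mst))
```

The edge (A, B, 4) is rejected because A and B are already connected through C when it is examined.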
Shortest Path Algorithm

The shortest path problem is about finding a path between vertices in a graph such that the total
sum of the edge weights is minimum.

Algorithm for Shortest Path

1. Bellman-Ford Algorithm
2. Dijkstra Algorithm
3. Floyd Warshall Algorithm

Bellman-Ford Algorithm

Bellman Ford algorithm helps us find the shortest path from a vertex to all other vertices of a
weighted graph.

It is similar to Dijkstra's algorithm but it can work with graphs in which edges can have negative
weights.

How Bellman Ford's algorithm works

Bellman Ford algorithm works by overestimating the length of the path from the starting vertex
to all other vertices. Then it iteratively relaxes those estimates by finding new paths that are
shorter than the previously overestimated paths.

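Repeating this relaxation over all edges V - 1 times, where V is the number of vertices, is enough to guarantee that the result is optimal, provided no negative-weight cycle is reachable from the starting vertex.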
Bellman Ford Pseudo code

We need to maintain the path distance of every vertex. We can store that in an array of size v,
where v is the number of vertices.

We also want to be able to get the shortest path, not only know the length of the shortest path.
For this, we map each vertex to the vertex that last updated its path length.

Once the algorithm is over, we can backtrack from the destination vertex to the source vertex to
find the path.
function bellmanFord(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
    distance[S] <- 0

    for each vertex V in G
        for each edge (U,V) in G
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U

    for each edge (U,V) in G
        If distance[U] + edge_weight(U, V) < distance[V]
            Error: Negative Cycle Exists

    return distance[], previous[]

Bellman Ford's Complexity

Time Complexity

Best Case Complexity O(E)

Average Case Complexity O(VE)

Worst Case Complexity O(VE)
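
The pseudocode can be sketched in Python as follows. The function name bellman_ford, the edge format (u, v, weight) and the small example graph are assumptions for illustration. The early exit when a pass changes nothing is an addition to the pseudocode and is what gives the O(E) best case.

```python
def bellman_ford(vertices, edges, source):
    """edges: list of directed (u, v, weight). Returns (distance, previous)."""
    distance = {v: float("inf") for v in vertices}
    previous = {v: None for v in vertices}
    distance[source] = 0

    for _ in range(len(vertices) - 1):           # V - 1 relaxation passes are enough
        changed = False
        for u, v, w in edges:
            if distance[u] + w < distance[v]:    # relax edge (u, v)
                distance[v] = distance[u] + w
                previous[v] = u
                changed = True
        if not changed:                          # nothing improved: stop early
            break

    for u, v, w in edges:                        # one extra pass detects negative cycles
        if distance[u] + w < distance[v]:
            raise ValueError("negative cycle reachable from the source")
    return distance, previous


# Hypothetical example with one negative edge.
vertices = ["S", "A", "B", "C"]
edges = [("S", "A", 4), ("S", "B", 5), ("B", "A", -2), ("A", "C", 3)]

distance, previous = bellman_ford(vertices, edges, "S")
print(distance, previous)
```

Here the path S, B, A (cost 3) beats the direct edge S, A (cost 4), which a greedy choice on the first edge would miss.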

Dijkstra Algorithm

Dijkstra's algorithm allows us to find the shortest path between any two vertices of a graph whose edge weights are non-negative.

It differs from the minimum spanning tree because the shortest distance between two vertices
might not include all the vertices of the graph.

How Dijkstra's Algorithm works

Dijkstra's Algorithm works on the basis that any subpath B -> D of the shortest path A -> D
between vertices A and D is also the shortest path between vertices B and D.
Each subpath is the shortest path

Dijkstra used this property in the opposite direction, i.e. we overestimate the distance of each
vertex from the starting vertex. Then we visit each node and its neighbors to find the shortest
subpath to those neighbors.

The algorithm uses a greedy approach in the sense that we find the next best solution hoping that
the end result is the best solution for the whole problem.

Example of Dijkstra's algorithm

It is easier to start with an example and then think about the algorithm.

Start with a weighted graph

Choose a starting vertex and assign infinity path values to all other vertices
Go to each vertex and update its path length

If the path length of the adjacent vertex is lesser than new path length, don't update it

Avoid updating path lengths of already visited vertices


After each iteration, we pick the unvisited vertex with the least path length. So we choose 5
before 7

Notice how the rightmost vertex has its path length updated twice

Repeat until all the vertices have been visited

Dijkstra's algorithm pseudocode

We need to maintain the path distance of every vertex. We can store that in an array of size v,
where v is the number of vertices.

We also want to be able to get the shortest path, not only know the length of the shortest path.
For this, we map each vertex to the vertex that last updated its path length.

Once the algorithm is over, we can backtrack from the destination vertex to the source vertex to
find the path.

A minimum priority queue can be used to efficiently retrieve the vertex with the least path distance.

function dijkstra(G, S)
    for each vertex V in G
        distance[V] <- infinite
        previous[V] <- NULL
        add V to Priority Queue Q
    distance[S] <- 0

    while Q IS NOT EMPTY
        U <- Extract MIN from Q
        for each unvisited neighbour V of U
            tempDistance <- distance[U] + edge_weight(U, V)
            if tempDistance < distance[V]
                distance[V] <- tempDistance
                previous[V] <- U

    return distance[], previous[]

Dijkstra's Algorithm Complexity

Time Complexity: O(E log V), where E is the number of edges and V is the number of vertices.

Space Complexity: O(V)
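
The pseudocode can be sketched in Python with the standard heapq module. The names dijkstra and path_to, the adjacency format (vertex to list of (neighbour, weight) pairs) and the example graph are assumptions for illustration. Instead of decreasing keys inside the queue, this sketch pushes a new entry and skips stale ones, a common substitute that keeps the O(E log V) bound.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict u -> list of (v, weight) with non-negative weights."""
    distance = {v: float("inf") for v in graph}
    previous = {v: None for v in graph}
    distance[source] = 0
    queue = [(0, source)]
    done = set()
    while queue:
        d, u = heapq.heappop(queue)              # unvisited vertex with least distance
        if u in done:
            continue                             # stale queue entry
        done.add(u)
        for v, w in graph[u]:
            if d + w < distance[v]:
                distance[v] = d + w
                previous[v] = u
                heapq.heappush(queue, (distance[v], v))
    return distance, previous


def path_to(previous, target):
    """Backtrack from target to the source using the previous[] map."""
    path = []
    while target is not None:
        path.append(target)
        target = previous[target]
    return path[::-1]


# Hypothetical directed example graph.
graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 5)],
    "C": [("B", 2), ("D", 8)],
    "D": [],
}

distance, previous = dijkstra(graph, "A")
print(distance, path_to(previous, "D"))
```

The distance of D is updated twice (first to 9 via C, then to 8 via B), the same effect the example above points out for its rightmost vertex.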

Floyd Warshall Algorithm

Floyd-Warshall Algorithm is an algorithm for finding the shortest path between all the pairs of
vertices in a weighted graph. This algorithm works for both the directed and undirected weighted
graphs. But, it does not work for the graphs with negative cycles (where the sum of the edges in
a cycle is negative).

A weighted graph is a graph in which each edge has a numerical value associated with it.

Floyd-Warshall algorithm is also called Floyd's algorithm, Roy-Floyd algorithm, Roy-Warshall algorithm, or WFI algorithm.

This algorithm follows the dynamic programming approach to find the shortest paths.

How Floyd-Warshall Algorithm Works?

Let the given graph be:


Initial graph

Follow the steps below to find the shortest path between all the pairs of vertices.

Create a matrix A0 of dimension n*n where n is the number of vertices. The row and the column
are indexed as i and j respectively. i and j are the vertices of the graph.

Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there is no path
from ith vertex to jth vertex, the cell is left as infinity.

Fill each cell with the distance between ith and jth vertex

Now, create a matrix A1 using matrix A0. The elements in the first column and the first row are
left as they are. The remaining cells are filled in the following way.

Let k be the intermediate vertex in the shortest path from source to destination. In this step, k is
the first vertex. A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).

That is, if the direct distance from the source to the destination is greater than the path through the
vertex k, then the cell is filled with A[i][k] + A[k][j].

In this step, k is vertex 1. We calculate the distance from the source vertex to the destination vertex
through this vertex k.

For example: For A1[2, 4], the direct distance from vertex 2 to 4 is 4 and the sum of the distances
from vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from vertex 1 to 4) is 7.

Since 4 < 7, A1[2, 4] is filled with 4.


Similarly, A2 is created using A1. The elements in the second column and the second row are
left as they are.

In this step, k is the second vertex (i.e. vertex 2). The remaining steps are the same as in the previous
step: we calculate the distance from the source vertex to the destination vertex through this vertex 2.

Similarly, A3 and A4 are also created, calculating the distance from the source vertex to the destination
vertex through vertex 3 and through vertex 4 respectively.

A4 gives the shortest path between each pair of vertices.


Floyd-Warshall Algorithm

n = no of vertices
A = matrix of dimension n*n
for k = 1 to n
    for i = 1 to n
        for j = 1 to n
            Ak[i, j] = min (Ak-1[i, j], Ak-1[i, k] + Ak-1[k, j])
return A

Time Complexity

There are three nested loops, each running n times, and the work inside the innermost loop takes constant
time. So, the time complexity of the Floyd-Warshall algorithm is O(n^3).
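
The triple loop can be sketched in Python as follows. The 4-vertex weight matrix is an assumption: it is chosen to agree with the numbers quoted in the worked step (direct distance 4 from vertex 2 to vertex 4, and 2 + 5 = 7 through vertex 1), but the original figure is not reproduced here. Indices are 0-based, so vertex 1 is row 0. A single matrix is updated in place, which is a common space-saving variant of keeping A0 to A4 separately.

```python
INF = float("inf")


def floyd_warshall(weights):
    """weights: n x n matrix, weights[i][j] = edge weight or INF. Returns shortest distances."""
    n = len(weights)
    dist = [row[:] for row in weights]           # copy so the input is not modified
    for k in range(n):                           # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical initial matrix A0 (rows and columns are vertices 1..4).
a0 = [
    [0,   3,   INF, 5],
    [2,   0,   INF, 4],
    [INF, 1,   0,   INF],
    [INF, INF, 2,   0],
]

for row in floyd_warshall(a0):
    print(row)
```

After the last pass every INF entry of this example has been replaced, because each vertex can reach every other one through some intermediate vertex.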

Network Flow

A flow network is a directed graph that is used for modeling the flow of material. There are two
distinguished vertices: a source, which produces material at some steady rate, and a sink,
which consumes the material at the same rate. The flow of the material at any
point in the system is the rate at which the material moves.

Some real-life problems like the flow of liquids through pipes, the current through wires and
delivery of goods can be modelled using flow networks.

Definition: A Flow Network is a directed graph G = (V, E) such that

For each edge (u, v) ∈ E, we associate a nonnegative weight capacity c (u, v) ≥ 0. If (u, v) ∉ E,
we assume that c (u, v) = 0.

There are two distinguishing points, the source s, and the sink t;

For every vertex v ∈ V, there is a path from s to t containing v.

Let G = (V, E) be a flow network. Let s be the source of the network, and let t be the sink. A
flow in G is a real-valued function f: V x V→R such that the following properties hold:


Capacity Constraint: For all u, v ∈ V, we need f (u, v) ≤ c (u, v).

Skew Symmetry: For all u, v ∈ V, we need f (u, v) = - f (v, u).

Flow Conservation: For all u ∈ V-{s, t}, we need

$$\sum_{v \in V} f(u, v) = 0$$

The quantity f (u, v), which can be positive or negative, is known as the net flow from vertex u to
vertex v. In the maximum-flow problem, we are given a flow network G with source s and sink t,
and we wish to find a flow of maximum value from s to t.

Ford-Fulkerson Algorithm

Initially, the flow of value is 0. Find some augmenting Path p and increase flow f on each edge
of p by residual Capacity cf (p). When no augmenting path exists, flow f is a maximum flow.

FORD-FULKERSON METHOD (G, s, t)
    Initialize flow f to 0
    while there exists an augmenting path p
        do augment flow f along p
    Return f

FORD-FULKERSON (G, s, t)
    for each edge (u, v) ∈ E [G]
        do f [u, v] ← 0
           f [v, u] ← 0
    while there exists a path p from s to t in the residual network Gf
        do cf (p) ← min {cf (u, v) : (u, v) is on p}
           for each edge (u, v) in p
               do f [u, v] ← f [u, v] + cf (p)
                  f [v, u] ← - f [u, v]

Example: Each Directed Edge is labeled with capacity. Use the Ford-Fulkerson algorithm to find
the maximum flow.
Solution: The left side of each part shows the residual network Gf with a shaded augmenting
path p,and the right side of each part shows the net flow f.
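
The method can be sketched in Python as follows. Two assumptions are made here: augmenting paths are found with breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson, since the method itself does not fix how the path is chosen), and the example network is a hypothetical one, not necessarily the network of the figure. The names max_flow and capacity are illustrative.

```python
from collections import deque


def max_flow(capacity, s, t):
    """capacity: dict u -> dict v -> capacity of edge (u, v). Returns the max flow value."""
    # Residual capacities, with a reverse edge of capacity 0 for every edge.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                          # no augmenting path: flow is maximum

        # Residual capacity cf(p) of the path: its smallest edge capacity.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Augment the flow along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical network with source "s" and sink "t".
capacity = {
    "s": {"v1": 16, "v2": 13},
    "v1": {"v3": 12},
    "v2": {"v1": 4, "v4": 14},
    "v3": {"v2": 9, "t": 20},
    "v4": {"v3": 7, "t": 4},
}

print(max_flow(capacity, "s", "t"))
```

For this network the edges (v1, v3), (v4, v3) and (v4, t) form a cut of capacity 12 + 7 + 4 = 23, which bounds the flow from above.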
Maximum Bipartite Matching

A bipartite matching is a set of edges of a bipartite graph chosen in such a way that no two edges in
that set share an endpoint. A maximum matching is a matching with the maximum number of
edges.

When a maximum matching is found, we cannot add another edge. If one edge is added to a
maximum matching, it is no longer a matching. A bipartite graph can have more
than one maximum matching.

Algorithm

bipartiteMatch(u, visited, assign)

Input − Starting node, visited list to keep track, assign the list to assign node with another node.

Output − Returns true when a matching for vertex u is possible.

Begin
   for all vertex v, which are adjacent with u, do
      if v is not visited, then
         mark v as visited
         if v is not assigned, or bipartiteMatch(assign[v], visited, assign) is true, then
            assign[v] := u
            return true
   done
   return false
End

maxMatch(graph)

Input − The given graph.

Output − The maximum number of the match.

Begin
   initially no vertex is assigned
   count := 0
   for all applicant u in M, do
      make all node as unvisited
      if bipartiteMatch(u, visited, assign), then
         increase count by 1
   done
   return count
End
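
The two procedures can be sketched in Python as follows. The applicant and job names, the function names and the adjacency format (left vertex to list of right vertices) are assumptions for this illustration.

```python
def max_bipartite_matching(adj):
    """adj: dict left vertex -> list of adjacent right vertices. Returns (count, assign)."""
    assign = {}                                  # right vertex -> left vertex matched to it

    def try_match(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or the vertex currently holding v can be moved elsewhere.
            if v not in assign or try_match(assign[v], visited):
                assign[v] = u
                return True
        return False

    count = 0
    for u in adj:
        if try_match(u, set()):                  # fresh visited set for every left vertex
            count += 1
    return count, assign


# Hypothetical applicants (left side) and the jobs they can take (right side).
adj = {
    "a1": ["j1", "j2"],
    "a2": ["j1"],
    "a3": ["j2", "j3"],
}

count, assign = max_bipartite_matching(adj)
print(count, assign)
```

When a2 asks for j1, which a1 already holds, the recursive call moves a1 to j2, which is exactly the reassignment step in bipartiteMatch.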
UNIT III ALGORITHM DESIGN TECHNIQUES

Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort Dynamic
programming: Elements of dynamic programming — Matrix-chain multiplication - Multi stage graph —
Optimal Binary Search Trees. Greedy Technique: Elements of the greedy strategy - Activity-selection
problem –- Optimal Merge pattern — Huffman Trees.

Divide and Conquer Algorithm

A divide and conquer algorithm is a strategy of solving a large problem by breaking the problem into smaller
sub-problems solving the sub-problems, and combining them to get the desired output.

To use the divide and conquer algorithm, recursion is used.

How Divide and Conquer Algorithms Work?

Here are the steps involved:

Divide: Divide the given problem into sub-problems using recursion.

Conquer: Solve the smaller sub-problems recursively. If the subproblem is small enough, then solve it
directly.

Combine: Combine the solutions of the sub-problems that are part of the recursive process to solve the
actual problem.

Finding Maximum and Minimum

To find the maximum and minimum numbers in a given array numbers[] of size n, the following algorithm
can be used. First we present the naive method and then the divide and conquer
approach.

Naïve Method

Naïve method is a basic method to solve any problem. In this method, the maximum and minimum number
can be found separately. To find the maximum and minimum numbers, the following straightforward
algorithm can be used.

Algorithm: Max-Min-Element (numbers[])
    max := numbers[1]
    min := numbers[1]
    for i = 2 to n do
        if numbers[i] > max then
            max := numbers[i]
        if numbers[i] < min then
            min := numbers[i]
    return (max, min)

Analysis

The number of comparisons in the naive method is 2n - 2.

The number of comparisons can be reduced using the divide and conquer approach. Following is the
technique.
Divide and Conquer Approach

In this approach, the array is divided into two halves. Then, using a recursive approach, the maximum and
minimum numbers in each half are found. Finally, return the maximum of the two maxima and the
minimum of the two minima.

In this given problem, the number of elements in an array is y−x+1 , where y is greater than or equal to x.

Max−Min(x,y) will return the maximum and minimum values of an array numbers[x...y].

Algorithm: Max-Min(x, y)
    if y – x ≤ 1 then
        return (max(numbers[x], numbers[y]), min(numbers[x], numbers[y]))
    else
        (max1, min1) := Max-Min(x, ⌊(x + y)/2⌋)
        (max2, min2) := Max-Min(⌊(x + y)/2⌋ + 1, y)
        return (max(max1, max2), min(min1, min2))

Analysis

Let T(n) be the number of comparisons made by Max−Min(x,y), where the number of elements n=y−x+1.

The recurrence relation for T(n) can be represented as

$$T(n) = \begin{cases} T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 2 & \text{for } n > 2 \\ 1 & \text{for } n = 2 \\ 0 & \text{for } n = 1 \end{cases}$$

Let us assume that n is a power of 2. Hence, n = 2^k where k is the height of the recursion tree.

So,

$$T(n) = 2\,T(n/2) + 2 = 2\,(2\,T(n/4) + 2) + 2 = \dots = \frac{3n}{2} - 2$$

Compared to the naïve method, the divide and conquer approach makes fewer comparisons. However,
in asymptotic notation both approaches are O(n).
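
The divide and conquer approach can be sketched in Python as follows. The function name max_min and the third return value (a comparison counter added only to check the analysis) are assumptions for this illustration; indices are 0-based and inclusive.

```python
def max_min(numbers, x, y):
    """Return (maximum, minimum, comparisons) for numbers[x..y], both ends inclusive."""
    if x == y:                                   # one element: no comparison needed
        return numbers[x], numbers[x], 0
    if y - x == 1:                               # two elements: a single comparison
        if numbers[x] > numbers[y]:
            return numbers[x], numbers[y], 1
        return numbers[y], numbers[x], 1
    mid = (x + y) // 2
    max1, min1, c1 = max_min(numbers, x, mid)
    max2, min2, c2 = max_min(numbers, mid + 1, y)
    # Two more comparisons: one between the maxima, one between the minima.
    return max(max1, max2), min(min1, min2), c1 + c2 + 2


data = [12, 3, 9, 14, 10, 18, 8, 23]
print(max_min(data, 0, len(data) - 1))
```

For these n = 8 elements the counter gives 10 comparisons, which agrees with 3n/2 - 2, against 2n - 2 = 14 for the naive method.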
Merge Sort

Merge Sort is one of the most popular sorting algorithms that is based on the principle of Divide and
Conquer Algorithm.

Here, a problem is divided into multiple sub-problems. Each sub-problem is solved individually. Finally,
sub-problems are combined to form the final solution.

Merge Sort example

Divide and Conquer Strategy

Using the Divide and Conquer technique, we divide a problem into subproblems. When the solution to each
sub problem is ready, we 'combine' the results from the subproblems to solve the main problem.

Suppose we had to sort an array A. A subproblem would be to sort a sub-section of this array starting at
index p and ending at index r, denoted as A[p..r].

Divide

If q is the half-way point between p and r, then we can split the subarray A[p..r] into two arrays A[p..q] and
A[q+1..r].

Conquer

In the conquer step, we try to sort both the subarrays A[p..q] and A[q+1..r]. If we haven't yet reached the
base case, we again divide both these subarrays and try to sort them.

Combine

When the conquer step reaches the base step and we get two sorted subarrays A[p..q] and A[q+1..r] for array
A[p..r], we combine the results by creating a sorted array A[p..r] from two sorted subarrays A[p..q] and
A[q+1..r].
MergeSort Algorithm

The MergeSort function repeatedly divides the array into two halves until we reach a stage where we try to
perform MergeSort on a subarray of size 1 i.e. p == r.

After that, the merge function comes into play and combines the sorted arrays into larger arrays until the
whole array is merged.

MergeSort(A, p, r):
    if p >= r
        return
    q = (p+r)/2
    mergeSort(A, p, q)
    mergeSort(A, q+1, r)
    merge(A, p, q, r)

void merge(int arr[], int p, int q, int r) {
  // Create L ← A[p..q] and M ← A[q+1..r]
  int n1 = q - p + 1;
  int n2 = r - q;

  int L[n1], M[n2];

  for (int i = 0; i < n1; i++)
    L[i] = arr[p + i];
  for (int j = 0; j < n2; j++)
    M[j] = arr[q + 1 + j];

  // Maintain current index of sub-arrays and main array
  int i = 0, j = 0, k = p;

  // Until we reach the end of either L or M, pick the smaller of the
  // current elements of L and M and place it in the correct position at A[p..r]
  while (i < n1 && j < n2) {
    if (L[i] <= M[j]) {
      arr[k] = L[i];
      i++;
    } else {
      arr[k] = M[j];
      j++;
    }
    k++;
  }

  // When we run out of elements in either L or M,
  // pick up the remaining elements and put them in A[p..r]
  while (i < n1) {
    arr[k] = L[i];
    i++;
    k++;
  }
  while (j < n2) {
    arr[k] = M[j];
    j++;
    k++;
  }
}

Time Complexity

Best Case Complexity: O(n*log n)

Worst Case Complexity: O(n*log n)

Average Case Complexity: O(n*log n)
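
The same bound in all three cases can be seen from the recurrence for the running time. The derivation below is a sketch that assumes n is a power of 2 and that merging two halves of total size n costs cn for some constant c (both are simplifying assumptions, not stated in the notes above):

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= \Theta(n \log n)
\end{aligned}
```

The merge step always copies every element once, regardless of the input order, which is why the best, worst and average cases coincide.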

Dynamic Programming - Matrix Chain Multiplication

Dynamic programming is a method for solving optimization problems.

It is an algorithmic technique for solving complex problems that have overlapping sub-problems. The solution to
each sub-problem is computed once and stored in a table, so that it can be reused later instead of being recomputed.

For such problems, dynamic programming is usually more efficient than plain recursion or divide and
conquer, which solve the same sub-problems repeatedly, and it finds an optimal solution in cases where a greedy choice may not.

Many practical problems cannot be solved efficiently with simple, traditional approaches, for example the coin
change problem, the knapsack problem, generating the Fibonacci sequence and matrix chain multiplication.
Solving them by brute force repeats the same computations again and again and becomes very time consuming.
When a problem must be divided into sub-problems whose solutions are needed repeatedly, and an optimal
solution is required, dynamic programming is the appropriate technique.

Basic Features of Dynamic programming:-

• Considers all the possible solutions and picks the best (optimal) one.
• Works on the principle of optimality.
• Defines sub-problems and solves them recursively.
• Uses more space (for the table of stored results) in exchange for less time.
• Saves us from having to recompute previously calculated sub-solutions.
• Can be difficult to understand.

Dynamic programming applies to many real-world problems, such as making coin change, problems in
robotics and aircraft, and mathematical problems like the Fibonacci sequence. Multiplying more than two
matrices is another: there are many possible orders of multiplication, and we want the best one.
We now look at this problem, the MATRIX CHAIN MULTIPLICATION PROBLEM.

Suppose we are given a sequence (chain) (A1, A2, …, An) of n matrices to be multiplied, and we wish to
compute the product A1A2…An. We can evaluate this expression using the standard algorithm for
multiplying pairs of matrices as a subroutine once we have parenthesized it to resolve all ambiguities in how
the matrices are multiplied together.

Matrix multiplication is associative, and so all parenthesizations yield the same product. For example, if the
chain of matrices is (A1, A2, A3, A4) then we can fully parenthesize the product (A1A2A3A4) in five
distinct ways:

1. (A1(A2(A3A4)))
2. (A1((A2A3)A4))
3. ((A1A2)(A3A4))
4. ((A1(A2A3))A4)
5. (((A1A2)A3)A4)

We can multiply two matrices A and B only if they are compatible: the number of columns of A must equal
the number of rows of B. If A is a p x q matrix and B is a q x r matrix, the resulting matrix C is a p x r matrix.
The time to compute C is dominated by the number of scalar multiplications, which is pqr, so we shall express
costs in terms of the number of scalar multiplications. For example, suppose we have three matrices
(A1, A2, A3) with dimensions (10x100), (100x5) and (5x500) respectively. Computing ((A1A2)A3) costs
10*100*5 = 5000 for A1A2 plus 10*5*500 = 25000 for multiplying the result by A3, a total of 30000.
Computing (A1(A2A3)) costs 100*5*500 = 250000 for A2A3 plus 10*100*500 = 500000 for multiplying A1
by the result, a total of 750000.

Note that in the matrix-chain multiplication problem, we are not actually multiplying matrices.
Our goal is only to determine an order for multiplying matrices that has the lowest cost; in the example above
the minimum cost is 30000. Checking every parenthesization repeats the same cost calculations many times,
so the brute-force method is very time consuming and tedious. We therefore apply dynamic programming to
solve this kind of problem.

When we use the dynamic programming technique, we follow these steps:

1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution.
4. Construct an optimal solution from computed information.

The matrices may be of any order, and our goal is to find the optimal (minimum) cost of multiplying them.
Applying step 2 of the dynamic programming method to this problem, we get

m[i, j] = 0 if i = j
m[i, j] = min { m[i, k] + m[k+1, j] + p[i-1]*p[k]*p[j] : i ≤ k < j } if i < j

where matrix Ai has dimension p[i-1] x p[i].

The basic algorithm of matrix chain multiplication:-

// Matrix A[i] has dimension dims[i-1] x dims[i] for i = 1..n
MatrixChainMultiplication(int dims[])
{
    // length[dims] = n + 1
    n = dims.length - 1;

    // m[i,j] = Minimum number of scalar multiplications (i.e., cost)
    // needed to compute the matrix A[i]A[i+1]...A[j] = A[i..j]
    // The cost is zero when multiplying one matrix
    for (i = 1; i <= n; i++)
        m[i, i] = 0;

    for (len = 2; len <= n; len++) {    // Subsequence lengths
        for (i = 1; i <= n - len + 1; i++) {
            j = i + len - 1;
            m[i, j] = MAXINT;
            for (k = i; k <= j - 1; k++) {
                cost = m[i, k] + m[k+1, j] + dims[i-1]*dims[k]*dims[j];
                if (cost < m[i, j]) {
                    m[i, j] = cost;
                    s[i, j] = k;    // Index of the subsequence split that achieved minimal cost
                }
            }
        }
    }
}

Example of Matrix Chain Multiplication

Example: We are given the sequence {4, 10, 3, 12, 20, 7}. The matrices have sizes 4 x 10,
10 x 3, 3 x 12, 12 x 20 and 20 x 7. We need to compute M [i, j] for 1 ≤ i ≤ j ≤ 5. We know M [i, i] = 0 for all i.

Let us proceed by working away from the diagonal. We first compute the optimal solution for each product of 2
matrices.

In the dynamic programming table, the main diagonal is initialized with ‘0’, and the remaining entries are
then filled in diagonal by diagonal.

For each entry we evaluate all the possible splits, but only the minimum cost is entered in the table.

Calculation of Product of 2 matrices:

1. m (1, 2) = M1 x M2 = (4 x 10) x (10 x 3) = 4 x 10 x 3 = 120

2. m (2, 3) = M2 x M3 = (10 x 3) x (3 x 12) = 10 x 3 x 12 = 360

3. m (3, 4) = M3 x M4 = (3 x 12) x (12 x 20) = 3 x 12 x 20 = 720

4. m (4, 5) = M4 x M5 = (12 x 20) x (20 x 7) = 12 x 20 x 7 = 1680

The diagonal elements, where i = j, were initialized with ‘0’.

The second diagonal has now been filled in with the values above. The third diagonal
is computed in the same way.

Now product of 3 matrices:

M [1, 3] = M1 M2 M3

There are two cases by which we can solve this multiplication: (M1 x M2) x M3 and M1 x (M2 x M3).

(M1 x M2) x M3: M [1, 2] + M [3, 3] + 4 x 3 x 12 = 120 + 0 + 144 = 264
M1 x (M2 x M3): M [1, 1] + M [2, 3] + 4 x 10 x 12 = 0 + 360 + 480 = 840

After solving both cases we choose the case with the minimum cost.

M [1, 3] = 264

Comparing both results, 264 is the minimum, so we insert 264 in the table and choose the combination
(M1 x M2) x M3.

M [2, 4] = M2 M3 M4

There are two cases by which we can solve this multiplication: (M2 x M3) x M4 and M2 x (M3 x M4).

(M2 x M3) x M4: M [2, 3] + M [4, 4] + 10 x 12 x 20 = 360 + 0 + 2400 = 2760
M2 x (M3 x M4): M [2, 2] + M [3, 4] + 10 x 3 x 20 = 0 + 720 + 600 = 1320

M [2, 4] = 1320

Comparing both results, 1320 is the minimum, so we insert 1320 in the table and choose the combination
M2 x (M3 x M4).

M [3, 5] = M3 M4 M5

There are two cases by which we can solve this multiplication: (M3 x M4) x M5 and M3 x (M4 x M5).

(M3 x M4) x M5: M [3, 4] + M [5, 5] + 3 x 20 x 7 = 720 + 0 + 420 = 1140
M3 x (M4 x M5): M [3, 3] + M [4, 5] + 3 x 12 x 7 = 0 + 1680 + 252 = 1932

M [3, 5] = 1140

Comparing both results, 1140 is the minimum, so we insert 1140 in the table and choose the combination
(M3 x M4) x M5.

Now Product of 4 matrices:

M [1, 4] = M1 M2 M3 M4

There are three cases by which we can solve this multiplication:

1. (M1 x M2 x M3) x M4: M [1, 3] + M [4, 4] + 4 x 12 x 20 = 264 + 0 + 960 = 1224
2. M1 x (M2 x M3 x M4): M [1, 1] + M [2, 4] + 4 x 10 x 20 = 0 + 1320 + 800 = 2120
3. (M1 x M2) x (M3 x M4): M [1, 2] + M [3, 4] + 4 x 3 x 20 = 120 + 720 + 240 = 1080

After solving these cases we choose the case with the minimum cost.

M [1, 4] = 1080

Comparing the results of the different cases, 1080 is the minimum, so we insert 1080 in the table and
choose the combination (M1 x M2) x (M3 x M4).

M [2, 5] = M2 M3 M4 M5

There are three cases by which we can solve this multiplication:

1. (M2 x M3 x M4) x M5: M [2, 4] + M [5, 5] + 10 x 20 x 7 = 1320 + 0 + 1400 = 2720
2. M2 x (M3 x M4 x M5): M [2, 2] + M [3, 5] + 10 x 3 x 7 = 0 + 1140 + 210 = 1350
3. (M2 x M3) x (M4 x M5): M [2, 3] + M [4, 5] + 10 x 12 x 7 = 360 + 1680 + 840 = 2880

After solving these cases we choose the case with the minimum cost.

M [2, 5] = 1350

Comparing the results of the different cases, 1350 is the minimum, so we insert 1350 in the table and
choose the combination M2 x (M3 x M4 x M5).

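Now Product of 5 matrices:

M [1, 5] = M1 M2 M3 M4 M5

There are four cases by which we can solve this multiplication:

1. (M1 x M2 x M3 x M4) x M5: M [1, 4] + M [5, 5] + 4 x 20 x 7 = 1080 + 0 + 560 = 1640
2. M1 x (M2 x M3 x M4 x M5): M [1, 1] + M [2, 5] + 4 x 10 x 7 = 0 + 1350 + 280 = 1630
3. (M1 x M2 x M3) x (M4 x M5): M [1, 3] + M [4, 5] + 4 x 12 x 7 = 264 + 1680 + 336 = 2280
4. (M1 x M2) x (M3 x M4 x M5): M [1, 2] + M [3, 5] + 4 x 3 x 7 = 120 + 1140 + 84 = 1344

After solving these cases we choose the case with the minimum cost.

M [1, 5] = 1344

Comparing the results of the different cases, 1344 is the minimum, so we insert 1344 in the table and
choose the combination (M1 x M2) x (M3 x M4 x M5).

Final Output is:

((M1 x M2) x ((M3 x M4) x M5)), with a minimum cost of 1344 scalar multiplications.

This gives the optimal order of matrix multiplication.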
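
The table-filling procedure can be checked mechanically. The following is a minimal Python sketch of the same recurrence, run on the dimension sequence from the example; the function names `matrix_chain_order` and `parenthesize` are illustrative choices, not part of the notes.

```python
import sys

def matrix_chain_order(dims):
    """Return (m, s) for matrices A[i] of size dims[i-1] x dims[i], i = 1..n.

    m[i][j] is the minimum number of scalar multiplications for A[i..j];
    s[i][j] is the split index k that achieves it.
    """
    n = len(dims) - 1                          # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):             # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):              # try every split i <= k < j
                cost = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m, s

def parenthesize(s, i, j):
    """Rebuild the optimal parenthesization from the split table s."""
    if i == j:
        return f"M{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)} {parenthesize(s, k + 1, j)})"

m, s = matrix_chain_order([4, 10, 3, 12, 20, 7])
print(m[1][5], parenthesize(s, 1, 5))
```

The split table `s` is what makes step 4 of the method (constructing the solution) possible without recomputing any costs.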

Multi Stage Graph

Multistage Graph problem is defined as follows:

A multistage graph G = (V, E, W) is a weighted directed graph in which the vertices are partitioned into k ≥ 2
disjoint subsets V = {V1, V2, …, Vk} such that if edge (u, v) is present in E then u ∈ Vi and v ∈ Vi+1 for some
1 ≤ i < k. The first and last stages, V1 and Vk, each contain a single vertex: the source and the destination.
The goal of the multistage graph problem is to find the minimum cost path from the source to the destination vertex.

The input to the algorithm is a k-stage graph whose n vertices are indexed in increasing order of stages.

The algorithm operates in the backward direction, i.e. it starts from the last vertex of the graph and proceeds
backward to find the minimum cost path.

The minimum cost of reaching the destination from vertex j ∈ Vi, taken over its successors r ∈ Vi+1, is defined as
Cost[j] = min{ c[j, r] + cost[r] }

where c[j, r] is the weight of edge <j, r> and cost[r] is the minimum cost of moving from vertex r to the end vertex.

Algorithm for the multistage graph is described below :

Algorithm for Multistage Graph

Algorithm MULTI_STAGE(G, k, n, p)
// Description: Solve multi-stage problem using dynamic programming
// Input:
//   k: Number of stages in graph G = (V, E)
//   c[i, j]: Cost of edge (i, j)
// Output: p[1:k]: Minimum cost path

cost[n] ← 0
for j ← n – 1 to 1 do
    // Let r be a vertex such that (j, r) in E and c[j, r] + cost[r] is minimum
    cost[j] ← c[j, r] + cost[r]
    π[j] ← r
end

// Find minimum cost path
p[1] ← 1
p[k] ← n
for j ← 2 to k - 1 do
    p[j] ← π[p[j - 1]]
end

Complexity Analysis of Multistage Graph

If graph G has |E| edges, then cost computation time would be O(n + |E|). The complexity of tracing the
minimum cost path would be O(k), k < n. Thus total time complexity of multistage graph using dynamic
programming would be O(n + |E|).

Example

Example: Find minimum path cost between vertex s and t for following multistage graph using dynamic
programming.

Solution:

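Solution to multistage graph using dynamic programming is constructed as,

Cost[j] = min{c[j, r] + cost[r]}

Here, number of stages k = 5, number of vertices n = 12, source s = 1 and target t = 12.

Initialization:

Cost[n] = 0 ⇒ Cost[12] = 0
p[1] = s ⇒ p[1] = 1
p[k] = t ⇒ p[5] = 12
r = t = 12

Stage 4:

Vertices 9, 10 and 11 form the last stage before the target, so each connects directly to vertex 12 and its cost is the
weight of that edge. The values used in the stages below are Cost[9] = 4, Cost[10] = 2 and Cost[11] = 5, with
p[9] = p[10] = p[11] = 12.

Stage 3: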

Vertex 6 is connected to vertices 9 and 10:

Cost[6] = min{ c[6, 10] + Cost[10], c[6, 9] + Cost[9] }

= min{5 + 2, 6 + 4} = min{7, 10} = 7

p[6] = 10

Vertex 7 is connected to vertices 9 and 10:

Cost[7] = min{ c[7, 10] + Cost[10], c[7, 9] + Cost[9] }

= min{3 + 2, 4 + 4} = min{5, 8} = 5

p[7] = 10

Vertex 8 is connected to vertex 10 and 11:

Cost[8] = min{ c[8, 11] + Cost[11], c[8, 10] + Cost[10] }

= min{6 + 5, 5 + 2} = min{11, 7} = 7

p[8] = 10

Stage 2:

Vertex 2 is connected to vertices 6, 7 and 8:

Cost[2] = min{ c[2, 6] + Cost[6], c[2, 7] + Cost[7], c[2, 8] + Cost[8] }

= min{4 + 7, 2 + 5, 1 + 7} = min{11, 7, 8} = 7

p[2] = 7

Vertex 3 is connected to vertices 6 and 7:

Cost[3] = min{ c[3, 6] + Cost[6], c[3, 7] + Cost[7] }

= min{2 + 7, 7 + 5} = min{9, 12} = 9

p[3] = 6

Vertex 4 is connected to vertex 8:

Cost[4] = c[4, 8] + Cost[8] = 11 + 7 = 18

p[4] = 8

Vertex 5 is connected to vertices 7 and 8:

Cost[5] = min{ c[5, 7] + Cost[7], c[5, 8] + Cost[8] }

= min{11 + 5, 8 + 7} = min{16, 15} = 15

p[5] = 8

Stage 1:

Vertex 1 is connected to vertices 2, 3, 4 and 5:

Cost[1] = min{ c[1, 2] + Cost[2], c[1, 3] + Cost[3], c[1, 4] + Cost[4], c[1, 5] + Cost[5]}

= min{ 9 + 7, 7 + 9, 3 + 18, 2 + 15 }

= min { 16, 16, 21, 17 } = 16

p[1] = 2 (vertices 2 and 3 tie at cost 16; we take vertex 2)

Trace the solution:

p[1] = 2

p[2] = 7

p[7] = 10
p[10] = 12

Minimum cost path is : 1 – 2 – 7 – 10 – 12

Cost of the path is : 9 + 2 + 3 + 2 = 16
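
The backward computation above can be reproduced with a short Python sketch. The edge weights are those read off from the worked example; the weights of the three edges into vertex 12 are inferred from the values Cost[9] = 4, Cost[10] = 2 and Cost[11] = 5 used in Stage 3, which is an assumption. The function name and the dictionary representation of the graph are illustrative, and the sketch assumes every vertex can reach the target.

```python
def multistage_shortest_path(n, edges):
    """Backward dynamic programming on a multistage graph.

    Vertices are numbered 1..n in increasing order of stage, so every
    edge (j, r) has j < r. edges[j] is a list of (r, weight) pairs.
    Returns (minimum cost, path from vertex 1 to vertex n).
    """
    inf = float("inf")
    cost = [inf] * (n + 1)
    succ = [0] * (n + 1)            # succ[j] plays the role of π[j]
    cost[n] = 0
    for j in range(n - 1, 0, -1):
        for r, weight in edges.get(j, []):
            if weight + cost[r] < cost[j]:   # strict <: first minimum wins ties
                cost[j] = weight + cost[r]
                succ[j] = r
    path = [1]
    while path[-1] != n:            # follow successors from the source
        path.append(succ[path[-1]])
    return cost[1], path

edges = {
    1: [(2, 9), (3, 7), (4, 3), (5, 2)],
    2: [(6, 4), (7, 2), (8, 1)],
    3: [(6, 2), (7, 7)],
    4: [(8, 11)],
    5: [(7, 11), (8, 8)],
    6: [(9, 6), (10, 5)],
    7: [(9, 4), (10, 3)],
    8: [(10, 5), (11, 6)],
    9: [(12, 4)],                   # inferred from Cost[9] = 4
    10: [(12, 2)],                  # inferred from Cost[10] = 2
    11: [(12, 5)],                  # inferred from Cost[11] = 5
}
result = multistage_shortest_path(12, edges)
print(result)
```

Each edge is examined once, which matches the O(n + |E|) bound given in the complexity analysis.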

Optimal Binary Search Tree

Optimal Binary Search Tree extends the concept of the binary search tree. A Binary Search Tree (BST) is a
nonlinear data structure which is used in many scientific applications for reducing the search time. In a BST,
every key in the left subtree is smaller than the root and every key in the right subtree is greater than the root.
This arrangement simplifies the search procedure.

Optimal Binary Search Tree (OBST) is very useful in dictionary search, where the probability of searching is
different for different words. OBST has great application in translation.

If we translate a book from English to German, the equivalent of each word is looked up in an English-to-German
dictionary and substituted in the translation. Words are looked up in binary search tree order.

A binary search tree keeps the words in lexicographical order. Words like ‘the’, ‘is’, ‘there’ are very
frequent, whereas words like ‘xylophone’, ‘anthropology’ etc. appear rarely.

It is not a wise idea to keep less frequent words near the root of the binary search tree. While still keeping the
keys in lexicographical order, we choose the shape of the tree according to the search probabilities, so that
frequent words lie near the root and are found in few comparisons. Such a tree is called an
Optimal Binary Search Tree.

Consider the sequence of n keys K = < k1, k2, k3, …, kn> in sorted order, such that
k1 < k2 < … < kn, where key ki is searched for with probability pi. A search for a value that falls between two
adjacent keys (or outside the range of the keys) is unsuccessful, so for n keys the binary search
tree contains n + 1 dummy keys d0, d1, …, dn representing unsuccessful searches, where dummy key di is reached
with probability qi.

Two different representations of a BST with the same five keys {k1, k2, k3, k4, k5} are shown in the
following figure.

With n nodes, there exist (2n)!/((n + 1)! * n!) different binary search trees. An exhaustive search for optimal
binary search tree leads to huge amount of time.

The goal is to construct a tree which minimizes the total search cost. Such tree is called optimal binary search
tree. OBST does not claim minimum height. It is also not necessary that parent of sub tree has higher priority
than its child.

Dynamic programming can help us to find such an optimal tree.


Binary search trees with 5 keys

Mathematical formulation

We formulate the OBST with the following observations:

Any sub tree in an OBST contains keys in sorted order ki…kj, where 1 ≤ i ≤ j ≤ n.

A sub tree containing keys ki…kj has leaves with dummy keys di-1…dj.

Suppose kr is the root of the sub tree containing keys ki…kj. Then the left sub tree of root kr contains keys
ki…kr-1 and the right sub tree contains keys kr+1…kj. Recursively, optimal sub trees are constructed from the
left and right sub trees of kr.

Let e[i, j] represent the expected cost of searching an optimal BST containing keys ki…kj. With n keys, our aim
is to find e[1, n].

The base case occurs when j = i – 1, because we just have the dummy key di-1 for this case. The expected search cost
for this case is e[i, j] = e[i, i – 1] = qi-1.

For the case j ≥ i, we have to select some key kr from ki…kj as the root of the tree.

For a sub tree with keys ki…kj, the sum of probabilities is defined as

\[ w(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l \]

(Actual keys start at index 1 and dummy keys start at index 0.)

Thus, a recursive formula for forming the OBST is stated below :

\[ e[i, j] = \begin{cases} q_{i-1} & \text{if } j = i - 1 \\ \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \} & \text{if } i \le j \end{cases} \]

e[i, j] gives the expected cost in the optimal binary search tree.

Algorithm for Optimal Binary Search Tree

The algorithm for optimal binary search tree is specified below :

Algorithm OBST(p, q, n)
// e[1…n+1, 0…n] : Optimal sub tree
// w[1…n+1, 0…n] : Sum of probability
// root[1…n, 1…n] : Used to construct OBST

for i ← 1 to n + 1 do
    e[i, i – 1] ← q[i – 1]
    w[i, i – 1] ← q[i – 1]
end

for m ← 1 to n do
    for i ← 1 to n – m + 1 do
        j ← i + m – 1
        e[i, j] ← ∞
        w[i, j] ← w[i, j – 1] + p[j] + q[j]
        for r ← i to j do
            t ← e[i, r – 1] + e[r + 1, j] + w[i, j]
            if t < e[i, j] then
                e[i, j] ← t
                root[i, j] ← r
            end
        end
    end
end

return (e, root)

Complexity Analysis of Optimal Binary Search Tree

It is simple to derive the complexity of this approach from the above algorithm. It uses three nested
loops. Statements in the innermost loop run in Θ(1) time. The running time of the algorithm is computed as

\[ T(n) = \sum_{m=1}^{n} \sum_{i=1}^{n-m+1} \sum_{r=i}^{i+m-1} \Theta(1) = O(n^3) \]

Thus, the OBST algorithm runs in cubic time.

Example

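Problem: Let p (1 : 3) = (0.5, 0.1, 0.05) and q (0 : 3) = (0.15, 0.1, 0.05, 0.05). Compute and
construct the OBST for the above values using the dynamic programming approach.

Solution:

Here, given that

i     0      1      2      3
pi           0.5    0.1    0.05
qi    0.15   0.1    0.05   0.05

The recursive formula to solve the OBST problem is

\[ e[i, j] = \begin{cases} q_{i-1} & \text{if } j = i - 1 \\ \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \} & \text{if } i \le j \end{cases} \]

where

\[ w(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l \]

Now, we will compute e[i, j].

Initially,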

e[1, 0] = q0 = 0.15 (∵ j = i – 1)

e[2, 1] = q1 = 0.1 (∵ j = i – 1)

e[3, 2] = q2 = 0.05 (∵ j = i – 1)

e[4, 3] = q3 = 0.05 (∵ j = i – 1)

e[1, 1] = min { e[1, 0] + e[2, 1] + w(1, 1) }

= min { 0.15 + 0.1 + 0.75 } = 1.0

e[2, 2] = min { e[2, 1] + e[3, 2] + w(2, 2) }

= min { 0.1 + 0.05 + 0.25 } = 0.4


e[3, 3] = min { e[3, 2] + e[4, 3] + w(3, 3) }

= min { 0.05 + 0.05 + 0.15 } = 0.25

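e[1, 2] = min { e[1, 0] + e[2, 2] + w(1, 2), e[1, 1] + e[3, 2] + w(1, 2) }

= min { 0.15 + 0.4 + 0.9, 1.0 + 0.05 + 0.9 } = min { 1.45, 1.95 } = 1.45

e[2, 3] = min { e[2, 1] + e[3, 3] + w(2, 3), e[2, 2] + e[4, 3] + w(2, 3) }

= min { 0.1 + 0.25 + 0.35, 0.4 + 0.05 + 0.35 } = min { 0.7, 0.8 } = 0.7

e[1, 3] = min { e[1, 0] + e[2, 3] + w(1, 3), e[1, 1] + e[3, 3] + w(1, 3), e[1, 2] + e[4, 3] + w(1, 3) }

= min { 0.15 + 0.7 + 1.0, 1.0 + 0.25 + 1.0, 1.45 + 0.05 + 1.0 } = min { 1.85, 2.25, 2.5 } = 1.85

e[1, 3] is minimum for r = 1, so r[1, 3] = 1

e[2, 3] is minimum for r = 2, so r[2, 3] = 2

e[1, 2] is minimum for r = 1, so r[1, 2] = 1

e[3, 3] is minimum for r = 3, so r[3, 3] = 3

e[2, 2] is minimum for r = 2, so r[2, 2] = 2

e[1, 1] is minimum for r = 1, so r[1, 1] = 1

Let us now construct the OBST for the given data.

r[1, 3] = 1, so k1 will be at the root. Keys k2 and k3 are on the right side of k1.

r[2, 3] = 2, so k2 will be the root of this sub tree. k3 will be on the right of k2.

Thus, finally, we get a tree with k1 at the root, k2 as the right child of k1 and k3 as the right child of k2.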
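
The OBST table computation can be sketched in Python as follows. This is a direct transcription of the pseudocode under the same indexing (keys from 1, dummy keys from 0); the function name `optimal_bst` and the padding of `p` with an unused entry at index 0 are illustrative choices.

```python
def optimal_bst(p, q):
    """p[1..n] are key probabilities (p[0] is unused padding),
    q[0..n] are dummy-key probabilities.
    Returns the tables e and root, indexed as in the pseudocode."""
    n = len(p) - 1
    e = [[0.0] * (n + 1) for _ in range(n + 2)]     # e[1..n+1][0..n]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]     # w[1..n+1][0..n]
    root = [[0] * (n + 1) for _ in range(n + 1)]    # root[1..n][1..n]
    for i in range(1, n + 2):                       # empty sub trees
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]
    for m in range(1, n + 1):                       # number of keys in sub tree
        for i in range(1, n - m + 2):
            j = i + m - 1
            e[i][j] = float("inf")
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):               # try each key as root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

e, root = optimal_bst([0, 0.5, 0.1, 0.05], [0.15, 0.1, 0.05, 0.05])
print(round(e[1][3], 2), root[1][3], root[2][3])
```

Because the probabilities are floating-point values, results are best compared with a small tolerance rather than exact equality.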

Greedy Technique

The Activity Selection problem is the problem of selecting non-conflicting tasks based on their start and
end times. It can be solved in O(N log N) time using a simple greedy approach. Modifications
of this problem are more complex. Surprisingly, a straightforward Dynamic Programming approach
takes O(N^3) time, which is slower.

The problem statement for Activity Selection is: "Given a set of n activities with their
start and finish times, we need to select the maximum number of non-conflicting activities that
can be performed by a single person, given that the person can handle only one activity at a
time." The Activity Selection problem follows the Greedy approach, i.e. at every step we
make the choice that looks best at the moment, and this leads to an optimal solution of the complete
problem.

Our objective is to complete the maximum number of activities. Choosing the activity which
finishes first leaves us the maximum time to fit in the later activities. This is the
intuition that greedily choosing the activity with the earliest finish time will give us an optimal
solution. By induction on the number of choices made, making the greedy choice at every
step produces an optimal solution, so we always choose the compatible activity which finishes first.
If we instead sorted the activities by their starting time, the activity with the least starting time
could also have the longest duration, and we would not be able to maximise the number of
activities.
Algorithm
The algorithm of Activity Selection is as follows:

Activity-Selection(Activity, start, finish)
    Sort Activity by finish times stored in finish
    Selected = {Activity[1]}
    n = Activity.length
    j = 1
    for i = 2 to n:
        if start[i] ≥ finish[j]:
            Selected = Selected U {Activity[i]}
            j = i
    return Selected

Complexity
Time Complexity:

When activities are already sorted by their finish time: O(N)

When activities are not sorted by their finish time: O(N log N), due to the
complexity of sorting

Example
In this example, we take the start and finish times of the activities as follows:
start = [1, 3, 2, 0, 5, 8, 11]

finish = [3, 4, 5, 7, 9, 10, 12]

The activities are already sorted by their finish time, so activity 0 gets selected first. As activity 1 has a
starting time equal to the finish time of activity 0, it also gets selected. Activities 2 and 3 have starting
times smaller than the finish time of activity 1, so they get rejected. Based on similar
comparisons, activities 4 and 6 also get selected, whereas activity 5 gets rejected. In this
example, activities 0, 1, 4 and 6 get selected, while the others get rejected.
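
A minimal Python sketch of the greedy selection, using 0-based activity indices as in the example, might look as follows. The function name `select_activities` is an illustrative choice; the sort makes the sketch work even when the input is not already ordered by finish time.

```python
def select_activities(start, finish):
    """Return indices of a maximum set of non-conflicting activities."""
    order = sorted(range(len(start)), key=lambda i: finish[i])
    if not order:
        return []
    selected = [order[0]]           # earliest-finishing activity
    last = order[0]
    for i in order[1:]:
        # Compatible if it starts no earlier than the last one finishes
        if start[i] >= finish[last]:
            selected.append(i)
            last = i
    return selected

chosen = select_activities([1, 3, 2, 0, 5, 8, 11], [3, 4, 5, 7, 9, 10, 12])
print(chosen)
```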
Optimal Merge Pattern
Merge a set of sorted files of different lengths into a single sorted file. We need to find an
optimal solution, where the resultant file is generated in minimum time.

Given a number of sorted files, there are many ways to merge them into a single
sorted file. The merge can be performed pairwise; this type of merging is called
a 2-way merge pattern.

As different pairings require different amounts of time, we want to
determine an optimal way of merging many files together. In the greedy strategy, the two shortest
sequences are merged at each step.

Merging a p-record file and a q-record file requires possibly p + q record moves, so the
obvious choice is to merge the two smallest files together at each step.

Two-way merge patterns can be represented by binary merge trees. Let us consider a set of
n sorted files {f1, f2, f3, …, fn}. Initially, each file is considered as a single-node
binary tree. To find the optimal solution, the following algorithm is used.

Algorithm: TREE (n)
for i := 1 to n – 1 do
    declare new node
    node.leftchild := least (list)
    node.rightchild := least (list)
    node.weight := ((node.leftchild).weight) + ((node.rightchild).weight)
    insert (list, node);
return least (list);

Here least (list) removes and returns the node with the smallest weight. At the end of this algorithm, the
returned node is the root of the merge tree, and the sum of the weights of all its internal nodes is the optimal cost.
Example

Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of
elements respectively.

If merge operations are performed according to the provided sequence, then


M1 = merge f1 and f2 => 20 + 30 = 50

M2 = merge M1 and f3 => 50 + 10 = 60


M3 = merge M2 and f4 => 60 + 5 = 65

M4 = merge M3 and f5 => 65 + 30 = 95

Hence, the total number of operations is


50 + 60 + 65 + 95 = 270

Now, the question arises: is there any better solution?

Sorting the numbers according to their size in an ascending order, we get the following
sequence −

f4, f3, f1, f2, f5

Hence, merge operations can be performed on this sequence


M1 = merge f4 and f3 => 5 + 10 = 15

M2 = merge M1 and f1 => 15 + 20 = 35

M3 = merge M2 and f2 => 35 + 30 = 65

M4 = merge M3 and f5 => 65 + 30 = 95

Therefore, the total number of operations is


15 + 35 + 65 + 95 = 210

Obviously, this is better than the previous one.

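We now solve the problem using the greedy algorithm above, which always merges the two smallest files currently in the list.

Initial Set: 5, 10, 20, 30, 30

Step 1: merge 5 and 10 => 15; the list becomes 15, 20, 30, 30

Step 2: merge 15 and 20 => 35; the list becomes 30, 30, 35

Step 3: merge 30 and 30 => 60; the list becomes 35, 60

Step 4: merge 35 and 60 => 95

Hence, the solution takes 15 + 35 + 60 + 95 = 205 operations.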
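
The greedy merge pattern maps naturally onto a min-heap. The following Python sketch uses the standard `heapq` module to play the role of least (list) and insert (list, node); it tracks only the total cost, not the tree itself, and the function name is an illustrative choice.

```python
import heapq

def optimal_merge_cost(sizes):
    """Total record moves when the two smallest files are merged at each step."""
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)         # smallest file
        b = heapq.heappop(heap)         # second smallest file
        total += a + b                  # cost of this merge
        heapq.heappush(heap, a + b)     # merged file goes back in the list
    return total

cost = optimal_merge_cost([20, 30, 10, 5, 30])
print(cost)
```

With a heap, each of the n – 1 merges costs O(log n), so the whole pattern is found in O(n log n) time.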

Huffman Tree

Huffman coding assigns codes to characters such that the length of each code depends on the
relative frequency or weight of the corresponding character. Huffman codes are of variable
length and prefix-free (that is, no code is a prefix of any other). Any prefix-free
binary code can be displayed or visualized as a binary tree with the encoded characters stored
at the leaves.

A Huffman tree, or Huffman coding tree, is defined as a full binary tree in which each leaf of the
tree corresponds to a letter in the given alphabet.

The Huffman tree is the binary tree with minimum external path weight,
that is, the one with the minimum sum of weighted path lengths for the given
set of leaves. So the goal is to construct a tree with the minimum external path weight.

An example is given below -

Letter frequency table

Letter     z   k   m   c   u   d   l   e
Frequency  2   7   24  32  37  42  42  120

Huffman code

Letter  Freq  Code    Bits
e       120   0       1
d       42    101     3
l       42    110     3
u       37    100     3
c       32    1110    4
m       24    11111   5
k       7     111101  6
z       2     111100  6
The Huffman tree (for the above example) is given below -

Algorithm Huffman (c)
{
n = |c|
Q = c
for i <- 1 to n-1
do
{
temp <- get_node ()
left [temp] <- Get_min (Q)
right [temp] <- Get_min (Q)
a = left [temp]
b = right [temp]
f [temp] <- f [a] + f [b]
insert (Q, temp)
}
return Get_min (Q)
}
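
The construction can be sketched in Python with a min-heap standing in for the queue Q. Rather than building explicit tree nodes, this sketch tracks only the depth of each letter, which equals its code length; the exact bit patterns depend on how left and right children are labelled, so only the lengths are computed. The function name and the tie-breaking counter are illustrative choices.

```python
import heapq
from itertools import count

def huffman_code_lengths(freq):
    """Map each symbol to the length of its Huffman code."""
    tie = count()   # unique sequence number so equal frequencies never compare dicts
    heap = [(f, next(tie), {ch: 0}) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fa, _, a = heapq.heappop(heap)      # two least frequent sub trees
        fb, _, b = heapq.heappop(heap)
        # Merging puts every symbol of both sub trees one level deeper
        merged = {ch: depth + 1 for ch, depth in {**a, **b}.items()}
        heapq.heappush(heap, (fa + fb, next(tie), merged))
    return heap[0][2]

freq = {"z": 2, "k": 7, "m": 24, "c": 32, "u": 37, "d": 42, "l": 42, "e": 120}
lengths = huffman_code_lengths(freq)
print(lengths)
```

The resulting lengths agree with the Bits column of the table above. Note that an alphabet with a single symbol yields length 0 in this sketch.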
UNIT IV STATE SPACE SEARCH ALGORITHMS

Backtracking: n-Queens problem - Hamiltonian Circuit Problem - Subset Sum Problem – Graph colouring
problem
Branch and Bound: Solving 15-Puzzle problem - Assignment problem - Knapsack Problem -
Travelling Salesman Problem

Backtracking

N queen Problem

The N-Queens problem is to place n queens on an n x n chessboard in such a manner that no two queens
attack each other by being in the same row, column or diagonal.
It can be seen that for n = 1 the problem has a trivial solution, and no solution exists for n = 2 and n = 3. So
first we will consider the 4-queens problem and then generalize it to the n-queens problem.
Given a 4 x 4 chessboard, number the rows and columns of the chessboard 1 through 4.

We have to place 4 queens q1, q2, q3 and q4 on the chessboard such that no two queens
attack each other. Each queen must therefore be placed on a different row, i.e., we put queen "i"
on row "i."
Now, we place queen q1 in the very first acceptable position (1, 1). Next, we put queen q2 so that the two
queens do not attack each other. We find that if we place q2 in column 1 or 2, a dead end is
encountered. Thus the first acceptable position for q2 is column 3, i.e. (2, 3), but then no position is left for
placing queen q3 safely. So we backtrack one step and place queen q2 in (2, 4), the next possible
position. Then we obtain the position for placing q3, which is (3, 2). But later this position also leads to
a dead end, and no place is found where q4 can be placed safely. Then we have to backtrack to q1 and
place it at (1, 2), and then all other queens are placed safely by moving q2 to (2, 4), q3 to (3, 1) and q4 to (4,
3). That is, we get the solution (2, 4, 1, 3). This is one possible solution for the 4-queens problem.
For other possible solutions, the whole method is repeated for all partial solutions. The other solution for the
4-queens problem is (3, 1, 4, 2).
The implicit tree for the 4-queens problem for the solution (2, 4, 1, 3) is as follows:

Fig shows the complete state space for the 4-queens problem. With backtracking we generate only the
necessary nodes and stop exploring a node as soon as it violates the rule, i.e., if two queens are attacking.

4 - Queens solution space with nodes numbered in DFS

It can be seen that all the solutions to the 4-queens problem can be represented as 4-tuples (x1, x2,
x3, x4) where xi represents the column in which queen "qi" is placed.
One possible solution for the 8-queens problem is shown in fig:
Thus, the solution for the 8-queens problem is (4, 6, 8, 2, 7, 1, 3, 5).

Suppose two queens are placed at positions (i, j) and (k, l).

Then they are on the same diagonal only if i - j = k - l or i + j = k + l.

The first equation implies that j - l = i - k.

The second equation implies that j - l = k - i.

Therefore, two queens lie on the same diagonal if and only if |j - l| = |i - k|.

Place (k, i) returns a Boolean value that is true if the kth queen can be placed in column i. It tests
both whether i is distinct from all previous values x1, x2, …, xk-1 and whether there is no other
queen on the same diagonal.
Using Place, we give a precise solution to the n-queens problem.
Place (k, i)
{
For j ← 1 to k - 1
do if (x [j] = i)
or (Abs (x [j] - i) = Abs (j - k))
then return false;
return true;
}
Place (k, i) returns true if a queen can be placed in the kth row and ith column; otherwise it returns
false. x [] is a global array whose first k - 1 values have been set. Abs (r) returns the absolute
value of r.
N - Queens (k, n)
{
For i ← 1 to n
do if Place (k, i) then
{
x [k] ← i;
if (k == n) then
write (x [1 ... n]);
else
N - Queens (k + 1, n);
}
}
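
The two procedures translate almost line for line into Python. In this sketch `x` is passed explicitly instead of being global and solutions are collected in a list rather than written out; both are illustrative changes. Index 0 of `x` is unused so that rows and columns are numbered from 1 as in the text.

```python
def place(x, k, i):
    """True if the queen of row k can go in column i, given x[1..k-1]."""
    for j in range(1, k):
        # Same column, or same diagonal (|column difference| = |row difference|)
        if x[j] == i or abs(x[j] - i) == abs(j - k):
            return False
    return True

def n_queens(n):
    """Return all solutions as tuples (x1, ..., xn) of column numbers."""
    solutions = []
    x = [0] * (n + 1)

    def solve(k):
        for i in range(1, n + 1):
            if place(x, k, i):
                x[k] = i
                if k == n:
                    solutions.append(tuple(x[1:]))
                else:
                    solve(k + 1)

    solve(1)
    return solutions

four = n_queens(4)
print(four)
```

Because columns are tried in increasing order, the first 4-queens solution found is the one reached in the walkthrough above.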

Hamiltonian Circuit
The Hamiltonian cycle is a cycle in the graph which visits every vertex of the graph exactly once
and terminates at the starting node. It may not include all the edges.

▪ The Hamiltonian cycle problem is the problem of finding a Hamiltonian cycle in a graph, if
any such cycle exists.
▪ The input to the problem is an undirected, connected graph. For the graph shown in Figure
(a), the path A – B – E – D – C – A forms a Hamiltonian cycle. It visits all the vertices exactly once,
but does not use the edge <B, D>.

The Hamiltonian cycle problem can be posed both as a decision problem and as an optimization problem. The
decision problem is stated as, “Given a graph, does it contain a Hamiltonian cycle?”.

1st and (n – 1)th vertex must be adjacent (nth of cycle is the initial vertex itself).
Vertex i must not appear in the first (i – 1) vertices of any path.
With the adjacency matrix representation of the graph, the adjacency of two vertices can be verified in
constant time.
Algorithm
HAMILTONIAN (i)
// Description : Solve Hamiltonian cycle problem using backtracking.
// Input : Undirected, connected graph G = <V, E>; V[0… i] holds the path built so far, V[0] is the initial vertex
// Output : Hamiltonian cycle
if FEASIBLE(i) then
    if (i == n - 1) then
        if Adjacent(V[n – 1], V[0]) then
            Print V[0… n – 1]
        end
    else
        j ← 2
        while (j ≤ n) do
            V[i + 1] ← j
            HAMILTONIAN(i + 1)
            j ← j + 1
        end
    end
end

function FEASIBLE(i)
// The vertex at position i must not already appear on the path
for j ← 0 to i – 1 do
    if V[i] = V[j] then
        return 0
    end
end
// The vertex at position i must be adjacent to its predecessor
if i > 0 and not Adjacent(V[i], V[i – 1]) then
    return 0
end
return 1

Complexity Analysis
Looking at the state space tree, in the worst case the total number of nodes in the tree would be

\[ T(n) = 1 + (n-1) + (n-1)^2 + (n-1)^3 + \dots + (n-1)^{n-1} = \frac{(n-1)^n - 1}{n-2} \]

T(n) = O(n^n). Thus, the Hamiltonian cycle algorithm runs in exponential time.
Example: Find the Hamiltonian cycle by using the backtracking approach for a given graph.

The backtracking approach uses a state-space tree to check whether there exists a Hamiltonian cycle in the
graph. Figure (g) shows the simulation of the Hamiltonian cycle algorithm. For simplicity, not all
possible paths are explored; only a few of the successful and unsuccessful paths are traced. Black nodes
indicate the Hamiltonian cycle.
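
A minimal Python sketch of the backtracking search is given below. It uses a visited array instead of scanning the path for duplicates, which is a small variation on the pseudocode. The adjacency matrix is an assumption: the figure is not available, so the graph contains only the edges named in the text for Figure (a) (A–B, B–E, E–D, D–C, C–A and B–D), with A..E numbered 0..4.

```python
def hamiltonian_cycle(adj, start=0):
    """Return a Hamiltonian cycle as a vertex list, or None if none exists."""
    n = len(adj)
    path = [start]
    visited = [False] * n
    visited[start] = True

    def extend():
        if len(path) == n:
            # All vertices used: the cycle must close back to the start
            return adj[path[-1]][start] == 1
        for v in range(n):
            if not visited[v] and adj[path[-1]][v] == 1:
                visited[v] = True
                path.append(v)
                if extend():
                    return True
                path.pop()              # dead end: backtrack
                visited[v] = False
        return False

    return path + [start] if extend() else None

# Assumed graph: A=0, B=1, C=2, D=3, E=4
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0],
]
cycle = hamiltonian_cycle(adj)
print(cycle)
```

On this graph the search first tries A – B – D and backtracks twice before finding A – B – E – D – C – A.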
Subset Sum Problem

Sum of Subsets Problem: Given a set of positive integers, find the combinations of numbers that sum
to a given value M.

The sum of subsets problem is analogous to the knapsack problem. The Knapsack Problem tries to fill
the knapsack using a given set of items to maximize the profit. Items are selected in such a way that
the total weight in the knapsack does not exceed the capacity of the knapsack. The inequality
condition in the knapsack problem is replaced by equality in the sum of subsets problem.
Given the set of n positive integers, W = {w1, w2, …, wn}, and a positive integer M, the
sum of subsets problem can be formulated as follows (where wi and M correspond to item
weights and knapsack capacity in the knapsack problem):

\[ \sum_{i=1}^{n} w_i x_i = M \]

where x_i ∈ {0, 1}.

Numbers are sorted in ascending order, such that w1 < w2 < w3 < …. < wn. The solution is often
represented using the solution vector X. If the ith item is included, set xi to 1, else set it to 0. In each
iteration, one item is tested. If the inclusion of an item does not violate the constraint of the problem,
add it. Otherwise, backtrack, remove the previously added item, and continue the same procedure for
all remaining items. The solution is easily described by the state space tree. Each left edge denotes the
inclusion of wi and the right edge denotes the exclusion of wi. Any path from the root to a leaf forms
a subset. A state-space tree for n = 3 is demonstrated in Fig. (a).

Fig. (a): State space tree for n = 3


Algorithm for Sum of subsets
The algorithm for solving the sum of subsets problem using recursion is stated below:
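
A minimal Python sketch of the recursive backtracking procedure follows. The function name, the two pruning tests, and the sample input W = {1, 2, 5, 6, 8} with M = 9 are illustrative assumptions, not taken from the notes. The input list is assumed to be sorted in ascending order, as required above.

```python
def sum_of_subsets(w, m):
    """Return every subset of w (sorted ascending, positive) that sums to m."""
    solutions = []
    x = [0] * len(w)                    # solution vector X

    def explore(i, current, remaining):
        if current == m:
            solutions.append([w[k] for k in range(len(w)) if x[k] == 1])
            return
        # Prune: no items left, next item overshoots, or the target is unreachable
        if i == len(w) or current + w[i] > m or current + remaining < m:
            return
        x[i] = 1                        # left edge: include w[i]
        explore(i + 1, current + w[i], remaining - w[i])
        x[i] = 0                        # right edge: exclude w[i]
        explore(i + 1, current, remaining - w[i])

    explore(0, 0, sum(w))
    return solutions

subsets = sum_of_subsets([1, 2, 5, 6, 8], 9)
print(subsets)
```

Because the weights are sorted, once the next item overshoots M every later item does too, so the whole subtree can be abandoned.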
Examples
Graph Colouring

In this problem, an undirected graph and m colors are given. The problem is to
find whether it is possible to color the nodes with at most m different colors, such that no two adjacent vertices of the
graph have the same color. If a solution exists, then display which color is assigned to which
vertex.
Starting from vertex 0, we try to assign colors one by one to the nodes. Before
assigning a color, we have to check whether it is safe. A color is not safe if an adjacent
vertex already has the same color.
Input and Output
Input:
The adjacency matrix of a graph G(V, E) and an integer m, which indicates the maximum number
of colors that can be used.

Let the maximum number of colors m = 3.

Output:
This algorithm will return which node will be assigned with which color. If the solution is not
possible, it will return false.
For this input the assigned colors are:
Node 0 -> color 1
Node 1 -> color 2
Node 2 -> color 3
Node 3 -> color 2

Algorithm
isValid(vertex, colorList, col)
Input − The vertex, the colorList to check, and the color col that we are trying to assign.
Output − True if the color assignment is valid, otherwise false.
Begin
   for all vertices v of the graph, do
      if there is an edge between v and vertex, and col = colorList[v], then
         return false
   done
   return true
End
graphColoring(colors, colorList, vertex)
Input − The number of available colors, the list recording which color each vertex has, and the starting
vertex.
Output − True, when colors are assigned, otherwise false.
Begin
   if all vertices are checked, then
      return true
   for all colors col from the available colors, do
      if isValid(vertex, colorList, col), then
         add col to the colorList for vertex
         if graphColoring(colors, colorList, vertex+1) = true, then
            return true
         remove color for vertex
   done
   return false
End
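
The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering; the adjacency matrix SAMPLE_GRAPH is an assumed 4-vertex input (the graph figure is not reproduced in these notes), chosen so that the result agrees with the colors listed earlier.

```python
def is_valid(graph, color_list, vertex, col):
    """True if no neighbour of `vertex` already carries colour `col`."""
    return all(not (graph[vertex][v] and color_list[v] == col)
               for v in range(len(graph)))


def graph_coloring(graph, m, color_list=None, vertex=0):
    """Return a list with a colour (1..m) per vertex, or None if impossible."""
    n = len(graph)
    if color_list is None:
        color_list = [0] * n           # 0 means "not coloured yet"
    if vertex == n:                    # all vertices are checked
        return color_list
    for col in range(1, m + 1):
        if is_valid(graph, color_list, vertex, col):
            color_list[vertex] = col
            if graph_coloring(graph, m, color_list, vertex + 1) is not None:
                return color_list
            color_list[vertex] = 0     # backtrack: remove colour for vertex
    return None


# Assumed sample input: adjacency matrix of an undirected graph on 4 vertices.
SAMPLE_GRAPH = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
```

With m = 3 this sketch assigns colors 1, 2, 3, 2 to nodes 0 to 3; with m = 2 it returns None because vertices 0, 1 and 2 form a triangle.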

Branch and Bound

Solving 15 puzzle Problem (LCBB)


The problem consists of 15 numbered (1-15) tiles in a square box with 16 cells (one cell is blank or
empty). The objective of this problem is to change the arrangement of the initial node into that of the goal node by
using a series of legal moves.
The initial and goal arrangements are shown in the following figure.
Initial Arrangement     Final Arrangement
1  2  4  15             1  2  3  4
2  5  12                5  6  7  8
7  6  11 14             9  10 11 12
8  9  10 13             13 14 15

In the initial node four moves are possible. The user can move any one of the tiles 2, 3, 5, or 6 into the
empty cell. This gives four possible moves from the initial node.
A legal move slides a tile adjacent to the empty cell left, right, up, or down into it, one tile at a time.
Every move creates a new arrangement, and this arrangement is called a state of the puzzle
problem. From the different states, a state space tree is created, in which edges are labeled
according to the direction in which the empty space moves.

The state space tree is very large because there can be 16! different arrangements.
In the state space tree, nodes are numbered level by level. For each node we calculate the
value or cost using the formula:
C(x) = f(x) + g(x),
where f(x) is the length of the path from the root (initial node) to node x, and
g(x) is the estimated length of the path from x down to the goal node, taken here as the number of
non-blank tiles that are not in their correct position.
The bound is initially set to infinity, so C(x) < ∞.
Each time, the node with the smallest cost is selected for further expansion towards the goal node. This
node becomes the E-node.
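
The cost function can be sketched in a few lines of Python. The board encoding (a tuple of 16 numbers read row by row, with 0 for the blank) and the board SAMPLE_STATE are assumptions made only for this illustration.

```python
# Goal arrangement read row by row; 0 stands for the blank cell (assumed encoding).
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0)


def misplaced_tiles(state, goal=GOAL):
    """g(x): number of non-blank tiles that are not in their goal position."""
    return sum(1 for tile, target in zip(state, goal)
               if tile != 0 and tile != target)


def cost(depth, state):
    """C(x) = f(x) + g(x), where f(x) is the depth of the node in the tree."""
    return depth + misplaced_tiles(state)


# Hypothetical board used only to exercise the functions.
SAMPLE_STATE = (1, 2, 3, 4, 5, 6, 0, 8, 9, 10, 7, 11, 13, 14, 15, 12)
```

In SAMPLE_STATE the tiles 7, 11 and 12 are out of place, so g(x) = 3, and at the root (depth 0) the cost is C(x) = 0 + 3 = 3.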

State Space tree with node cost is shown in diagram.


Assignment Problem
Problem Statement
Let’s first define a job assignment problem. In a standard version of a job assignment problem,
there can be jobs and workers. To keep it simple, we’re taking jobs and
workers in our example:

We can assign any of the available jobs to any worker with the condition that if a job is
assigned to a worker, the other workers can’t take that particular job. We should also notice that
each job has some cost associated with it, and it differs from one worker to another.
Here the main aim is to complete all the jobs by assigning one job to each worker in such a way
that the sum of the cost of all the jobs should be minimized.
Branch and Bound Algorithm Pseudocode
Now let’s discuss how to solve the job assignment problem using a branch and bound
algorithm. Let’s see the pseudocode first:

Here, is the input cost matrix that contains information like the number of available jobs, a list of
available workers, and the associated cost for each job. The function MinCost() maintains a list of
active nodes. The function Leastcost() calculates the minimum cost of the active node at each
level of the tree. After finding the node with minimum cost, we remove the node from the list of
active nodes and return it.
We’re using the add() function in the pseudocode, which calculates the cost of a particular node
and adds it to the list of active nodes.
In the search space tree, each node contains some information, such as cost, a total number of
jobs, as well as a total number of workers.
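A least-cost branch and bound search for this problem might look like the Python sketch below. It does not reproduce the pseudocode referred to above: the function names assign_jobs and lower_bound, the heap used as the list of active nodes, and the particular bound (each remaining worker takes its cheapest free job) are all assumptions chosen for illustration.

```python
import heapq


def assign_jobs(cost):
    """cost[w][j] is the cost of worker w doing job j (square matrix).

    Returns (minimum total cost, jobs) where jobs[w] is the job given to worker w.
    """
    n = len(cost)

    def lower_bound(worker, used, so_far):
        # Optimistic estimate: every remaining worker takes its cheapest free job.
        return so_far + sum(
            min(cost[w][j] for j in range(n) if j not in used)
            for w in range(worker, n))

    # Active nodes: (bound, next worker to assign, cost so far, jobs chosen so far)
    active = [(lower_bound(0, frozenset(), 0), 0, 0, ())]
    while active:
        bound, worker, so_far, jobs = heapq.heappop(active)   # least-cost node
        if worker == n:            # complete assignment with the smallest bound
            return so_far, list(jobs)
        used = frozenset(jobs)
        for j in range(n):
            if j not in used:      # branch: give free job j to this worker
                new_cost = so_far + cost[worker][j]
                heapq.heappush(active, (
                    lower_bound(worker + 1, used | {j}, new_cost),
                    worker + 1, new_cost, jobs + (j,)))
    return None


# Hypothetical 4 x 4 cost matrix used only to exercise the function.
SAMPLE_COST = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]
```

Because the bound never overestimates, the first complete assignment removed from the heap is optimal; for SAMPLE_COST the minimum total cost is 13.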
Now let’s run the algorithm on the sample example we’ve created:
Advantages
In a branch and bound algorithm, we don’t explore all the nodes in the tree. That’s why
branch and bound usually runs faster in practice than exhaustive search, even though its
worst-case time complexity is the same.

If the problem is not large and if we can do the branching in a reasonable amount of time, it
finds an optimal solution for a given problem.

The branch and bound algorithm finds a minimal path to reach the optimal solution for a
given problem. It doesn’t repeat nodes while exploring the tree.

Disadvantages
The branch and bound algorithm is time-consuming. Depending on the size of the given
problem, the number of nodes in the tree can be too large in the worst case.

Knapsack Problem using branch and bound

Problem Statement

We are given a set of n objects, each of which has a value vi and a weight wi. The
objective of the 0/1 Knapsack problem is to find a subset of objects such that the total value is
maximized and
the sum of weights of the objects does not exceed a given threshold W. An important condition
here is that one can either take the entire object or leave it. It is not possible to take a fraction of
the object.
Consider an example where n = 4, the values are given by {10, 10, 12, 18}, and the weights are
given by {2, 4, 6, 9}. The maximum weight is W = 15. Here, the solution to the
problem is to include the first, second, and fourth objects.

The procedure to solve the problem is as follows:


• Calculate the cost function and the Upper bound for the two children of each node.
Here, the (i + 1)th level indicates whether the ith object is to be included or not.
• If the cost function for a given node is greater than the upper bound, then the node
need not be explored further. Hence, we can kill this node. Otherwise, calculate the upper bound
for this node. If this value is less than U, then replace the value of U with this value. Then, kill all
unexplored nodes which have cost function greater than this value.
• The next node to be checked after reaching all nodes in a particular level will be the
one with the least cost function value among the unexplored nodes.
• While including an object, one needs to check whether adding the object
exceeds the threshold. If it does, one has reached the terminal point in that branch, and all the
succeeding objects will not be included.

Time and Space Complexity


Even though this method is more efficient than the other solutions to this problem, its worst
case time complexity is still O(2^n), in cases where the entire tree has to be explored.
However, in its best case, only one path through the tree has to be explored, and hence its best
case time complexity is O(n). Since this method requires the creation of the state space
tree, its space complexity is also exponential.

Solving an Example
Consider the problem with n = 4, V = {10, 10, 12, 18}, w = {2, 4, 6, 9} and W = 15. Here, we
calculate the initial upper bound to be U = 10 + 10 + 12 = 32. Note that the 4th object cannot
be included here, since that would exceed W. For the cost, we add 3/9 of the final value (3/9 × 18 = 6), and
hence the cost function is 38. Remember to negate the values after calculation before
comparison.
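
The two quantities computed for the root node can be checked with a short Python sketch. The function name root_bounds is an assumption, and the sketch assumes the objects are already ordered by value/weight ratio, as they are in this example.

```python
from fractions import Fraction


def root_bounds(values, weights, capacity):
    """Return (upper_bound, cost) for the root node of the state space tree.

    Objects are taken greedily in the given order. The upper bound counts only
    whole objects; the cost also adds a fractional share of the first object
    that does not fit.
    """
    total_value = 0
    remaining = capacity
    for v, w in zip(values, weights):
        if w <= remaining:
            total_value += v
            remaining -= w
        else:
            return total_value, total_value + Fraction(remaining, w) * v
    return total_value, total_value


upper, cost = root_bounds([10, 10, 12, 18], [2, 4, 6, 9], 15)
```

Here upper is 32 and cost is 38, matching the values above; negating them gives -32 and -38 for the comparisons.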
After calculating the cost at each node, kill nodes that do not need exploring. Hence, the final
state space tree will be as follows (Here, the number of the node denotes the order in which the
state space tree was explored):
Note here that node 3 and node 5 have been killed after updating U at node 7. Also, node 6 is not
explored further, since adding any more weight exceeds the threshold. At the end, only nodes 6
and 8 remain. Since the value of U is less for node 8, we select this node. Hence the solution is
{1, 1, 0, 1}, and we can see here that the total weight is exactly equal to the threshold value in this
case.

Travelling salesman problem


▪ Travelling Salesman Problem (TSP) is an interesting problem. It is defined as
“given n cities and the distance between each pair of cities, find the path which visits each city
exactly once and comes back to the starting city, while minimizing the total travelling
distance.”
▪ TSP has many practical applications. It is used in network design and
transportation route design. The objective is to minimize the distance. We can start the tour from
any city and visit the other cities in any order. With n cities, n! different permutations are
possible, so exploring all paths by brute force is not practical in real-life
applications.
LCBB using Static State Space Tree for Travelling Salesman Problem
▪ Branch and bound is an effective way to find a better, if not the best, solution
quickly by pruning some of the unnecessary branches of the search tree.
▪ It works as follows:
Consider a directed weighted graph G = (V, E, W), where nodes represent cities and weighted
directed edges represent the direction and distance between two cities.
1. Initially, the graph is represented by the cost matrix C, where
Cij = cost of edge, if there is a direct path from city i to city j
Cij = ∞, if there is no direct path from city i to city j.
2. Convert the cost matrix to a reduced matrix by subtracting minimum values from
appropriate rows and columns, such that each row and column contains at least one zero entry.
3. Find the cost of the reduced matrix. The cost is the sum of the amounts subtracted from
the cost matrix to convert it into a reduced matrix.
4. Prepare the state space tree for the reduced matrix.
5. Find the least cost valued node A (i.e. the E-node) by computing the reduced cost matrix
for every remaining node.
6. If edge <i, j> is to be included, then do the following:
(a) Set all values in row i and all values in column j of A to ∞
(b) Set A[j, 1] = ∞
(c) Reduce A again, except rows and columns having all ∞ entries.
7. Compute the cost of the newly created reduced matrix
as Cost = L + A[i, j] + r,
where L is the cost of the parent's reduced cost matrix, A[i, j] is the entry for edge <i, j> in that
matrix, and r is the reduction cost from step 6(c).
8. If all nodes are not visited, then go to
step 4.
The reduction procedure is described below.
Row Reduction:
A matrix M is called a reduced matrix if each of its rows and columns has at least one zero entry or
consists entirely of ∞ values. Let M represent the distance matrix of 5 cities. M
can be reduced as follows:
MRowRed = {Mij – min {Mij | 1 ≤ j ≤ n, and Mij < ∞}}
Consider the following distance matrix:
Find the minimum element of each row and subtract it from each cell of that row.

Reduced matrix would be:

The row reduction cost is the sum of the values subtracted from the
rows:
Row reduction cost (M) = 10 + 2 + 2 + 3 + 4 = 21
Column reduction:
Matrix MRowRed is row reduced but not column reduced. A matrix is called column reduced
if each of its columns has at least one zero entry or all ∞ entries.
MColRed = {Mji – min {Mji | 1 ≤ j ≤ n, and Mji < ∞}}
To reduce the above matrix, we find the minimum element of each column and subtract it
from each cell of that column.

Column reduced matrix MColRed would be:


Each row and column of MColRed has at least one zero entry, so this matrix is a reduced
matrix.
Column reduction cost (M) = 1 + 0 + 3 + 0 + 0 = 4
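Row and column reduction can be sketched in Python as below. The distance matrix SAMPLE_M is an assumption: the matrix figure is not reproduced in these notes, so a 5-city matrix was chosen whose row minima (10, 2, 2, 3, 4) and column minima (1, 0, 3, 0, 0) agree with the reduction costs quoted above.

```python
INF = float("inf")


def reduce_matrix(m):
    """Row-reduce, then column-reduce a copy of m.

    Returns (reduced matrix, total reduction cost). Rows or columns that
    contain only INF, or that already contain a zero, are left unchanged.
    """
    n = len(m)
    m = [row[:] for row in m]
    cost = 0
    for i in range(n):                       # row reduction
        low = min(m[i])
        if 0 < low < INF:
            cost += low
            m[i] = [x - low for x in m[i]]   # INF - low is still INF
    for j in range(n):                       # column reduction
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            cost += low
            for i in range(n):
                m[i][j] -= low
    return m, cost


# Assumed 5-city distance matrix (not taken from the figure in these notes).
SAMPLE_M = [
    [INF, 20, 30, 10, 11],
    [15, INF, 16, 4, 2],
    [3, 5, INF, 2, 4],
    [19, 6, 18, INF, 3],
    [16, 4, 7, 16, INF],
]

reduced, reduction_cost = reduce_matrix(SAMPLE_M)
```

For this matrix the total reduction cost is 21 + 4 = 25, and the first row of the reduced matrix becomes ∞, 10, 17, 0, 1.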
State space tree for 5 city problem is depicted in Fig. 6.6.1. Number within circle indicates the
order in which the node is generated, and number of edge indicates the city being visited.

Example
Example: Find the solution of following travelling salesman problem using branch
and bound method.

Solution:
▪ The procedure for dynamic reduction is as follows:
▪ Draw the state space tree with the optimal reduction cost at the root node.
▪ Derive the cost of the path from node i to j by setting all entries in the ith row and jth
column to ∞. Set M[j][1] = ∞
▪ The cost of the corresponding node N for path i to j is the sum of the parent's cost +
reduction cost + M[i][j]
▪ After exploring all nodes at level i, set the node with minimum cost as the E-node and
repeat the procedure until all nodes are visited.
▪ The given matrix is not reduced. In order to find its reduced matrix, we first find
the row reduced matrix, followed by the column reduced matrix if needed. We can find the row
reduced matrix by subtracting the minimum element of each row from each element of the
corresponding row. The procedure is described below:
▪ Reduce the above cost matrix by subtracting the minimum value from each row and column.

M‘1 is not a reduced matrix. Reduce it by subtracting the minimum value from the corresponding column.
Doing this we get,

Cost of M1 = C(1)
= Row reduction cost + Column reduction cost
= (10 + 2 + 2 + 3 + 4) + (1 + 3) = 25
This means every tour in the graph has length at least 25. This is a lower bound on the cost of the optimal tour.
State space tree
Let us find cost of edge from node 1 to 2, 3, 4, 5.
Select edge 1-2:
Set M1 [1] [ ] = M1 [ ] [2] = ∞
Set M1 [2] [1] = ∞
Reduce the resultant matrix if required.

M2 is already reduced.
Cost of node 2 :
C(2) = C(1) + Reduction cost + M1 [1] [2]
= 25 + 0 + 10 = 35
Select edge 1-3
Set M1 [1][ ] = M1 [ ] [3] = ∞
Set M1 [3][1] = ∞
Reduce the resultant matrix if required.

Cost of node 3:
C(3) = C(1) + Reduction cost + M1[1] [3]
= 25 + 11 + 17 = 53
Select edge 1-4:
Set M1 [1][ ] = M1[ ][4] = ∞
Set M1 [4][1] = ∞
Reduce resultant matrix if required.
Matrix M4 is already reduced.
Cost of node 4:
C(4) = C(1) + Reduction cost + M1 [1] [4]
= 25 + 0 + 0 = 25
Select edge 1-5:
Set M1 [1] [ ] = M1 [ ] [5] = ∞
Set M1 [5] [1] = ∞
Reduce the resultant matrix if required.

Cost of node 5:
C(5) = C(1) + reduction cost + M1 [1] [5]
= 25 + 5 + 1 = 31
State space diagram:

Node 4 has minimum cost for path 1-4. We can go to vertex 2, 3 or 5. Let’s explore all three nodes.
Select path 1-4-2 : (Add edge 4-2)
Set M4 [1] [] = M4 [4] [] = M4 [] [2] = ∞
Set M4 [2] [1] = ∞
Reduce resultant matrix if required.
Matrix M6 is already reduced. Cost of node 6:
C(6) = C(4) + Reduction cost + M4 [4] [2]
= 25 + 0 + 3 = 28
Select edge 4-3 (Path 1-4-3):
Set M4 [1] [ ] = M4 [4] [ ] = M4 [ ] [3] = ∞
Set M4 [3][1] = ∞
Reduce the resultant matrix if required.

M‘7 is not reduced. Reduce it by subtracting 11 from column 1.

Cost of node 7:
C(7) = C(4) + Reduction cost + M4 [4] [3]
= 25 + 2 + 11 + 12 = 50
Select edge 4-5 (Path 1-4-5):

Matrix M8 is reduced.
Cost of node 8:
C(8) = C(4) + Reduction cost + M4 [4][5]
= 25 + 11 + 0 = 36
State space tree
Path 1-4-2 leads to minimum cost. Let’s find the cost for two possible paths.

Add edge 2-3 (Path 1-4-2-3):


Set M6 [1][ ] = M6 [4][ ] = M6 [2][ ]
= M6 [][3] = ∞
Set M6 [3][1] = ∞
Reduce resultant matrix if required.

Cost of node 9:
C(9) = C(6) + Reduction cost + M6 [2][3]
= 28 + 11 + 2 + 11 = 52
Add edge 2-5 (Path 1-4-2-5):
Set M6 [1][ ] = M6 [4][ ] = M6 [2][ ] = M6 [ ][5] = ∞
Set M6 [5][1] = ∞
Reduce resultant matrix if required.
Cost of node 10:
C(10) = C(6) + Reduction cost + M6 [2][5]
= 28 + 0 + 0 = 28
State space tree

Add edge 5-3 (Path 1-4-2-5-3):

Cost of node 11:


C(11) = C(10) + Reduction cost + M10 [5][3]
= 28 + 0 + 0 = 28
State space tree:

So we can select any of the edges. Thus the final path includes the edges <3, 1>, <5, 3>, <1, 4>,
<4, 2>, <2, 5>, which form the path 1 – 4 – 2 – 5 – 3 – 1. This path has a cost of 28.
UNIT 5

Tractable and Intractable Problems

Tractable problems are computational problems that can be solved efficiently by algorithms that
scale with the input size of the problem. In other words, the time required to solve a tractable
problem increases at most polynomially with the input size.
On the other hand, intractable problems are computational problems for which no known algorithm
is efficient in the worst case. This means that, for the best known algorithms, the time required to solve an
intractable problem grows exponentially or even faster with the input size.
One example of a tractable problem is computing the sum of a list of n numbers. The time required to
solve this problem scales linearly with the input size, as each number can be added to a running total
in constant time. Another example is computing the shortest path between two nodes in a graph, which
can be solved efficiently using algorithms like Dijkstra’s algorithm or the A* algorithm.
In contrast, some well-known intractable problems include the traveling salesman problem, the
knapsack problem, and the Boolean satisfiability problem. These problems are NP-hard, meaning that
any problem in NP (the set of decision problems that can be solved in polynomial time using a
non-deterministic Turing machine) can be reduced to them in polynomial time. While it is possible to find
approximate solutions to some of these problems, there is no known algorithm that can solve them exactly in
polynomial time.
In summary, tractable problems are those that can be solved efficiently with algorithms that scale well
with the input size, while intractable problems are those that cannot be solved efficiently in the
worst-case scenario.
Examples of Tractable problems
1. Sorting: Given a list of n items, the task is to sort them in ascending or descending order.
Merge Sort solves this problem in O(n log n) time, and Quick Sort does so on
average.
2. Matrix multiplication: Given two matrices A and B, the task is to find their product C = AB.
The best-known algorithm for matrix multiplication runs in O(n^2.37) time complexity, which
is polynomial and therefore tractable.
3. Shortest path in a graph: Given a graph G and two nodes s and t, the task is to find the shortest
path between s and t. For non-negative edge weights, Dijkstra’s algorithm (implemented with a
Fibonacci heap) solves this problem in O(m + n log n) time complexity, where m is the number of
edges and n is the number of nodes in the graph.
4. Linear programming: Given a system of linear constraints and a linear objective function, the
task is to find the values of the variables that optimize the objective function subject to the
constraints. The simplex method is fast in practice but exponential in the worst case; interior-point
and ellipsoid methods solve this problem in polynomial time.
5. Greedy graph coloring: Given an undirected graph G, the task is to assign a color to each node such
that no two adjacent nodes have the same color. The greedy
algorithm produces a valid coloring in O(n^2) time complexity, where n is the number of nodes in
the graph, although not necessarily one with the fewest colors; minimizing the number of colors is NP-hard.
These problems are considered tractable because algorithms exist that can solve them in polynomial
time complexity, which means that the time required to solve them grows no faster than a polynomial
function of the input size.

Examples of intractable problems


1. Traveling salesman problem (TSP): Given a set of cities and the distances between them, the
task is to find the shortest possible route that visits each city exactly once and returns to the
starting city. The best-known algorithms for solving the TSP have an exponential worst-case
time complexity, which makes it intractable for large instances of the problem.

2. Knapsack problem: Given a set of items with weights and values, and a knapsack that can carry
a maximum weight, the task is to find the most valuable subset of items that can be carried by
the knapsack. The knapsack problem is also NP-hard and is intractable for large instances of
the problem.

3. Boolean satisfiability problem (SAT): Given a boolean formula in conjunctive normal form
(CNF), the task is to determine if there exists an assignment of truth values to the variables that
makes the formula true. The SAT problem is one of the most well-known NP-complete
problems, which means that any NP problem can be reduced to SAT in polynomial time.

4. Subset sum problem: Given a set of integers and a target sum, the task is to find a subset of the
integers that sums up to the target sum. Like the knapsack problem, the subset sum problem is
also intractable for large instances of the problem.

P (Polynomial) problems
P problems are decision problems that an algorithm can solve in a polynomial amount of time,
i.e. where Big-O is a polynomial (O(1), O(n), O(n²), etc). These are problems that
are considered ‘easy’ to solve, and thus do not generally have immense run times.
NP (Non-deterministic Polynomial) Problems
NP problems are decision problems for which a proposed solution can be verified in polynomial
time. Every P problem is also in NP. For the hardest NP problems, the best known exact algorithms
take exponential or factorial time, such as O(2^n) or O(n!), and it is not known whether
polynomial-time algorithms exist for them.
NP-Hard Problems
A problem is classified as NP-Hard when an algorithm for solving it can be translated, in polynomial time, to solve
any NP problem. Then we can say that this problem is at least as hard as any NP problem, but it
could be much harder or more complex.
NP-Complete Problems
NP-Complete problems are problems that live in both the NP and NP-Hard classes. This means
that NP-Complete problems can be verified in polynomial time and that any NP problem can
be reduced to this problem in polynomial time.
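To make "verified in polynomial time" concrete, the sketch below checks a proposed solution (a certificate) for the subset sum problem. The function name and the choice of a list of indices as the certificate are assumptions made for illustration; the point is that checking takes time linear in the input, even though finding such a subset may take exponential time.

```python
def verify_subset_sum(numbers, target, certificate):
    """Check a claimed solution: `certificate` lists distinct indices into
    `numbers` whose elements are supposed to add up to `target`."""
    if len(set(certificate)) != len(certificate):
        return False                       # an index is used more than once
    if any(not 0 <= i < len(numbers) for i in certificate):
        return False                       # index out of range
    return sum(numbers[i] for i in certificate) == target
```

For instance, with numbers [3, 34, 4, 12, 5, 2] and target 9, the certificate [2, 4] is accepted because 4 + 5 = 9.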
Bin Packing problem
The bin packing problem involves assigning n items of different weights to bins, each of capacity
c, such that the total number of bins used is minimized. It may be assumed that all items
have weights smaller than the bin capacity.
The following algorithms depend on the order of their inputs. They pack the item given first
and then move on to the next item.
1) Next Fit algorithm

The simplest approximate approach to the bin packing problem is the Next-Fit (NF) algorithm.
The first item is assigned to bin 1. Items 2, ..., n are then
considered by increasing indices: each item is assigned to the current bin, if it fits; otherwise, it
is assigned to a new bin, which becomes the current one.
Visual Representation
Let us consider the following example with bins of size 1.

Assume the sizes of the items are {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
The minimum number of bins required is at least Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins.
The Next-Fit solution NF(I) for this instance I is built as follows. Considering the 0.5 sized item first, we
can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new
bin.
Moving on to the 0.5 sized item, we cannot place it in the current bin. Hence we place it in a
new bin.

Moving on to the 0.2 sized item, we can place it in the current (third bin)

Similarly, placing all the other items following the Next-Fit algorithm we get-

Thus we need 6 bins as opposed to the 4 bins of the optimal solution. Thus we can see that this
algorithm is not very efficient.
Analyzing the approximation ratio of Next-Fit algorithm
The time complexity of the algorithm is clearly O(n). It is easy to prove that, for any instance I
of BPP,the solution value NF(I) provided by the algorithm satisfies the bound
NF(I)<2z(I)
where z(I) denotes the optimal solution value. Furthermore, there exist instances for which the
ratio NF(I)/z(I) is arbitrarily close to 2, i.e. the worst-case approximation ratio of NF is r(NF) =
2.
Pseudocode
NEXTFIT(size[], n, c)
// size[] is the array containing the sizes of the items, n is the number of items
// and c is the capacity of the bin
{
    // Initialize result (count of bins) and remaining capacity in the current bin.
    // No bin is open yet, so the first item always opens bin 1.
    res = 0;
    bin_rem = 0;
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // If this item can't fit in the current bin, use a new bin
        if (size[i] > bin_rem) {
            res++;
            bin_rem = c - size[i];
        }
        else
            bin_rem -= size[i];
    }
    return res;
}
2) First Fit algorithm

A better algorithm, First-Fit (FF), considers the items according to increasing indices and
assigns each item to the lowest indexed initialized bin into which it fits; only when the current
item cannot fit into any initialized bin, is a new bin introduced
Visual Representation
Let us consider the same example as used above and bins of size 1

Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
The minimum number of bins required is at least Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins.
The First-Fit solution FF(I) for this instance I is built as follows. Considering the 0.5 sized item first, we
can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new
bin.

Moving on to the 0.5 sized item, we can place it in the first bin.
Moving on to the 0.2 sized item, we cannot place it in the first bin, so we check the second bin
and we can place it there.

Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a
new bin.

Similarly, placing all the other items following the First-Fit algorithm we get-

Thus we need 5 bins as opposed to the 4 bins of the optimal solution, which is better
than the Next-Fit algorithm.
Analyzing the approximation ratio of First-Fit algorithm
If FF(I) is the First-Fit solution value for instance I and z(I) is the optimal solution value, then:

FF(I) ≤ 1.7 z(I)

That is, First Fit never uses more than 1.7 * z(I) bins. So First-Fit is better than
Next Fit in terms of the upper bound on the number of bins.
Pseudocode
FIRSTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the number of items
    // and c is the capacity of the bin
    // Initialize result (count of bins)
    res = 0;
    // Create an array to store remaining space in bins; there can be at most n bins
    bin_rem[n];
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the first bin that can accommodate size[i]
        int j;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i]) {
                bin_rem[j] = bin_rem[j] - size[i];
                break;
            }
        }
        // If no bin could accommodate size[i], open a new bin
        if (j == res) {
            bin_rem[res] = c - size[i];
            res++;
        }
    }
    return res;
}

3) Best Fit Algorithm

The next algorithm, Best-Fit (BF), is obtained from FF by assigning the current item to the
feasible bin (if any) having the smallest residual capacity (breaking ties in favor of the lowest
indexed bin).
Simply put, the idea is to place the next item in the tightest spot. That is, put it in the bin so that
the smallest empty space is left.
Visual Representation
Let us consider the same example as used above and bins of size 1

Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
The minimum number of bins required is at least Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins.
The Best-Fit solution BF(I) for this instance I is built as follows.
Considering the 0.5 sized item first, we can place it in the first bin.

Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a new
bin.
Moving on to the 0.5 sized item, we can place it in the first bin tightly.

Moving on to the 0.2 sized item, we cannot place it in the first bin but we can place it in second
bin tightly.

Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a
new bin.

Similarly, placing all the other items following the Best-Fit algorithm we get-

Thus we need 5 bins as opposed to the 4 bins of the optimal solution, which is better
than the Next-Fit algorithm.
Analyzing the approximation ratio of Best-Fit algorithm
Recall that Best-Fit (BF) is obtained from FF by assigning the current item to the
feasible bin (if any) having the smallest residual capacity (breaking ties in favour of the lowest
indexed bin). BF satisfies the same worst-case bounds as FF.

Analysis of upper bound of Best-Fit algorithm

If z(I) is the optimal number of bins, then Best Fit never uses more than 1.7 * z(I) bins. So Best
Fit is the same as First Fit in terms of the upper bound on the number of bins.
Pseudocode
BESTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the number of items
    // and c is the capacity of the bin
    // Initialize result (count of bins)
    res = 0;
    // Create an array to store remaining space in bins; there can be at most n bins
    bin_rem[n];
    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the best bin that can accommodate size[i]
        int j;
        // Initialize minimum space left and index of best bin
        int min = c + 1, bi = 0;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i] && bin_rem[j] - size[i] < min) {
                bi = j;
                min = bin_rem[j] - size[i];
            }
        }
        // If no bin could accommodate size[i], create a new bin
        if (min == c + 1) {
            bin_rem[res] = c - size[i];
            res++;
        }
        else
            // Assign the item to the best bin
            bin_rem[bi] -= size[i];
    }
    return res;
}
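
The three heuristics can be compared on the running example with the Python sketch below. To avoid floating-point rounding, the item sizes are expressed in tenths (an assumption of this sketch), so the bin capacity is 10 and the items are 5, 7, 5, 2, 4, 2, 5, 1, 6.

```python
def next_fit(sizes, c):
    bins, remaining = 0, 0             # no bin is open yet
    for s in sizes:
        if s > remaining:              # does not fit: open a new current bin
            bins += 1
            remaining = c - s
        else:
            remaining -= s
    return bins


def first_fit(sizes, c):
    remaining = []                     # free space of every open bin
    for s in sizes:
        for j, free in enumerate(remaining):
            if free >= s:              # lowest-indexed bin that fits
                remaining[j] -= s
                break
        else:                          # no open bin fits: open a new one
            remaining.append(c - s)
    return len(remaining)


def best_fit(sizes, c):
    remaining = []
    for s in sizes:
        candidates = [j for j, free in enumerate(remaining) if free >= s]
        if candidates:
            # tightest bin; min() keeps the lowest index on ties
            best = min(candidates, key=lambda j: remaining[j])
            remaining[best] -= s
        else:
            remaining.append(c - s)
    return len(remaining)


ITEMS = [5, 7, 5, 2, 4, 2, 5, 1, 6]    # sizes in tenths of the bin capacity
CAPACITY = 10
```

On this instance Next Fit uses 6 bins, while First Fit and Best Fit both use 5, in agreement with the walkthroughs above.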

Approximation Algorithms for the Traveling Salesman Problem


We solved the traveling salesman problem by exhaustive search in Section 3.4, mentioned its
decision version as one of the most well-known NP-complete problems in Section 11.3, and
saw how its instances can be solved by a branch-and-bound algorithm in Section 12.2. Here,
we consider several approximation algorithms, a small sample of dozens of such algorithms
suggested over the years for this famous problem.

But first let us answer the question of whether we should hope to find a polynomial-time
approximation algorithm with a finite performance ratio on all instances of the traveling
salesman problem. As the following theorem [Sah76] shows, the answer turns out to be no,
unless P = NP.

THEOREM 1 If P != NP, there exists no c-approximation algorithm for the traveling
salesman problem, i.e., there exists no polynomial-time approximation algorithm for this
problem so that for all instances
f(sa) ≤ c f(s∗)
for some constant c.
Nearest-neighbour algorithm
The following well-known greedy algorithm is based on the nearest-neighbor heuristic:
always go next to the nearest unvisited city.
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been visited: go to the unvisited
city nearest the one visited last (ties can be broken arbitrarily).
Step 3 Return to the starting city.
EXAMPLE 1 For the instance represented by the graph in Figure 12.10, with a as the starting
vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian
circuit) sa: a − b − c − d − a of length 10.

The optimal solution, as can be easily checked by exhaustive search, is the
tour s∗: a − b − d − c − a of length 8. Thus, the accuracy ratio of this approximation is
r(sa) = f(sa) / f(s∗) = 10/8 = 1.25.

Unfortunately, except for its simplicity, not many good things can be said about the
nearest-neighbor algorithm. In particular, nothing can be said in general about the accuracy of
solutions obtained by this algorithm because it can force us to traverse a very long edge on the
last leg of the tour. Indeed, if we change the weight of edge (a, d) from 6 to an arbitrarily large
number w ≥ 6 in Example 1, the algorithm will still yield the tour a − b − c − d − a of length 4
+ w, and the optimal solution will still be a − b − d − c − a of length 8. Hence,
r(sa) = f(sa) / f(s∗) = (4 + w) / 8,
which can be made as large as we wish by choosing an appropriately large value of w. Hence,
RA = ∞ for this algorithm (as it should be according to Theorem 1).
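
The nearest-neighbor heuristic is short enough to sketch in Python. Since the figure for Example 1 is not reproduced here, the edge weights in EDGES are assumptions: only the weight 6 of edge (a, d) is stated in the text, and the others were chosen to be consistent with the tour lengths 10 and 8 quoted above.

```python
def make_dist(edges):
    """Build a symmetric distance table from {(u, v): weight}."""
    dist = {}
    for (u, v), w in edges.items():
        dist.setdefault(u, {})[v] = w
        dist.setdefault(v, {})[u] = w
    return dist


def nearest_neighbor_tour(dist, start):
    """Return (tour, length) produced by the nearest-neighbor heuristic."""
    tour, length = [start], 0
    unvisited = set(dist) - {start}
    while unvisited:
        last = tour[-1]
        # nearest unvisited city; ties are broken alphabetically
        nxt = min(sorted(unvisited), key=lambda v: dist[last][v])
        length += dist[last][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
    length += dist[tour[-1]][start]    # return to the starting city
    tour.append(start)
    return tour, length


# Assumed weights for the 4-city instance of Example 1.
EDGES = {("a", "b"): 1, ("a", "c"): 3, ("a", "d"): 6,
         ("b", "c"): 2, ("b", "d"): 3, ("c", "d"): 1}
```

Starting from a, the sketch returns the tour a, b, c, d, a of length 10. Replacing the weight of (a, d) by a large w leaves the tour unchanged and makes its length 4 + w.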

Twice-around-the-tree algorithm
Step 1 Construct a minimum spanning tree of the graph corresponding to a given instance of
the traveling salesman problem.
Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning tree
recording all the vertices passed by. (This can be done by a DFS traversal.)
Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated occurrences of
the same vertex except the starting one at the end of the list. (This step is equivalent to making
shortcuts in the walk.) The vertices remaining on the list will form a Hamiltonian circuit,
which is the output of the algorithm.
EXAMPLE 2 Let us apply this algorithm to the graph in Figure 12.11a. The minimum
spanning tree of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e). A
twice-around-the-tree walk that starts and ends at a is
a, b, c, b, d, e, d, b, a.
Eliminating the second b (a shortcut from c to d), the second d, and the third b (a shortcut from
e to a) yields the Hamiltonian circuit
a, b, c, d, e, a
of length 39.
The tour obtained in Example 2 is not optimal. Although that instance is small enough to find
an optimal solution by either exhaustive search or branch-and-bound, we refrained from doing
so to reiterate a general point. As a rule, we do not know what the length of an optimal tour
actually is, and therefore we cannot compute the accuracy ratio f (sa)/f (s∗). For the
twice-around-the-tree algorithm, we can at least bound it from above, provided the graph is
Euclidean: in that case the tour it produces is at most twice as long as an optimal tour.
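
The following Python sketch illustrates the twice-around-the-tree steps on a Euclidean instance. It is a sketch under stated assumptions, not the instance of Figure 12.11a: the city coordinates are hypothetical, all function names are invented for this example, Prim's algorithm is used as one possible way to build the minimum spanning tree, and the optimal tour is found by exhaustive search only because the instance is tiny.

```python
import math
from itertools import permutations


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def minimum_spanning_tree(points):
    """Step 1: Prim's algorithm on the complete Euclidean graph."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    in_tree = {0}
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        u, v = min(((u, v) for u in in_tree for v in range(n) if v not in in_tree),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
        adj[u].append(v)
        adj[v].append(u)
        in_tree.add(v)
    return adj


def twice_around_the_tree(points, start=0):
    adj = minimum_spanning_tree(points)
    tour, seen = [], set()

    def dfs(u):
        # Steps 2 and 3 combined: recording a vertex only on its first
        # visit equals walking around the tree and deleting repeats.
        seen.add(u)
        tour.append(u)
        for v in adj[u]:
            if v not in seen:
                dfs(v)

    dfs(start)
    tour.append(start)          # close the Hamiltonian circuit
    return tour


def tour_length(points, tour):
    return sum(dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))


def optimal_length(points):
    """Exhaustive search; feasible only for a handful of cities."""
    rest = range(1, len(points))
    return min(tour_length(points, (0, *perm, 0)) for perm in permutations(rest))


points = [(0, 0), (4, 0), (4, 3), (1, 5), (-2, 2), (2, 2)]  # hypothetical cities
tour = twice_around_the_tree(points)
approx, best = tour_length(points, tour), optimal_length(points)
print(tour, round(approx, 2), round(best, 2), approx <= 2 * best)
```

The final comparison checks the factor-of-two bound for this one instance; it is a sanity check, not a proof.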
Fermat's Little Theorem:
If n is a prime number, then for every a with 1 ≤ a ≤ n-1,
a^(n-1) ≡ 1 (mod n) OR
a^(n-1) % n = 1
Example: Since 5 is prime, 2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1],
3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5)
Since 7 is prime, 2^6 ≡ 1 (mod 7),
3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7),
5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7)
Algorithm (primality test for an integer n > 3)
1) Repeat the following k times:
a) Pick a randomly in the range [2, n - 2]
b) If gcd(a, n) ≠ 1, then return false
c) If a^(n-1) ≢ 1 (mod n), then return false
2) Return true [probably prime].
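
The test above can be sketched in Python as follows. The function name and the default k = 20 are choices made for this illustration, and the handling of n < 4 and of even n is an added assumption, since the steps above only make sense when the range [2, n - 2] is non-empty. The built-in three-argument pow computes a^(n-1) mod n by modular exponentiation.

```python
import math
import random


def is_probably_prime(n, k=20):
    """Fermat primality test with k random bases (illustrative sketch)."""
    # Corner cases not covered by the steps in the text (assumed handling).
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(k):
        a = random.randint(2, n - 2)    # step (a): random base in [2, n - 2]
        if math.gcd(a, n) != 1:         # step (b): a shares a factor with n
            return False
        if pow(a, n - 1, n) != 1:       # step (c): Fermat congruence fails
            return False
    return True                         # probably prime


print([n for n in range(2, 40) if is_probably_prime(n)])
```

A result of true is only probabilistic. In particular, Carmichael numbers such as 561 satisfy the congruence for every base coprime to n, so this test can reject them only through the gcd check in step (b).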

Unlike merge sort, quicksort does not need to merge two sorted arrays. Thus quicksort requires
less auxiliary space than merge sort, which is one reason it is often preferred to merge sort.
Choosing the pivot at random does not change the O(n²) worst case, but it makes the expected
running time O(n log n) for every input, since no fixed input can consistently force bad splits.
Algorithm for random pivoting

partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo                          // place for swapping
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = Random Number from lo to hi
    swap arr[r] with arr[hi]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p-1)
        quicksort(arr, p+1, hi)
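
A runnable Python version of the random-pivoting pseudocode might look like the sketch below. It keeps the names partition, partition_r and quicksort from the pseudocode; the default arguments on quicksort and the sample list are additions made for this illustration.

```python
import random


def partition(arr, lo, hi):
    """Partition arr[lo..hi] around the pivot arr[hi]; return the pivot's final index."""
    pivot = arr[hi]
    i = lo                              # next slot for an element <= pivot
    for j in range(lo, hi):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]
    return i


def partition_r(arr, lo, hi):
    """Move a randomly chosen element to position hi, then partition as usual."""
    r = random.randint(lo, hi)          # inclusive on both ends
    arr[r], arr[hi] = arr[hi], arr[r]
    return partition(arr, lo, hi)


def quicksort(arr, lo=0, hi=None):
    """Sort arr[lo..hi] in place."""
    if hi is None:
        hi = len(arr) - 1
    if lo < hi:
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)


data = [10, 3, 6, 9, 2, 4, 15, 23]
quicksort(data)
print(data)                             # [2, 3, 4, 6, 9, 10, 15, 23]
```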

Finding kth smallest element


Problem Description: Given an array A[] of n elements and a positive integer K, find the Kth
smallest element in the array. It is given that all array elements are distinct.
For Example :
Input : A[] = {10, 3, 6, 9, 2, 4, 15, 23}, K = 4
Output: 6
Input : A[] = {5, -8, 10, 37, 101, 2, 9}, K = 6
Output: 37
Quick-Select : Approach similar to quick sort
This approach is similar to the quick sort algorithm where we use the partition on the input
array recursively. But unlike quicksort, which processes both sides of the array recursively,
this algorithm works on only one side of the partition. We recur for either the left or right side
according to the position of pivot.
Solution Steps
1. Partition the array A[left .. right] around a pivot so that the pivot ends up at index pos,
every element of A[left .. pos-1] is smaller than A[pos], and every element of A[pos + 1 .. right]
is larger than A[pos].
2. Compute the number of elements in the subarray A[left .. pos], i.e. count = pos - left + 1.
3. If (count == K), then A[pos] is the Kth smallest element.
4. Otherwise, determine in which of the two subarrays A[left .. pos-1] and A[pos + 1 .. right] the
Kth smallest element lies.
• If (count > K), then the desired element lies on the left side of the partition.
• If (count < K), then the desired element lies on the right side of the partition. Since we already
know count values that are smaller than the Kth smallest element of A[left .. right], the desired
element is the (K - count)th smallest element of A[pos + 1 .. right].
• The base case is a single-element array, i.e. left == right: return A[left].
Pseudo-Code
// Initial call: left = 0 and right = n-1

int kthSmallest(int A[], int left, int right, int K)
{
    if (left == right)
        return A[left]

    int pos = partition(A, left, right)
    int count = pos - left + 1          // number of elements in A[left .. pos]

    if (count == K)
        return A[pos]
    else if (count > K)
        return kthSmallest(A, left, pos-1, K)
    else
        return kthSmallest(A, pos+1, right, K-count)
}

int partition(int A[], int l, int r)
{
    int x = A[r]                        // pivot: last element
    int i = l-1
    for (j = l to r-1)
    {
        if (A[j] <= x)
        {
            i = i + 1
            swap(A[i], A[j])
        }
    }
    swap(A[i+1], A[r])
    return i+1
}
Complexity Analysis
Time Complexity: The worst-case time complexity for this algorithm is O(n²), but it can be
improved if we choose the pivot element randomly. If we randomly select the pivot, the
expected time complexity would be linear, O(n).
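
A minimal Python sketch of quick-select with a randomly chosen pivot is given below. The function name kth_smallest is an illustrative choice, and the sketch uses a loop instead of recursion, which is a deliberate deviation that follows the same one-sided narrowing as the steps above. It assumes distinct elements and a 1-based K, as in the problem description.

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest element (1-based) of a list of distinct values."""
    if not 1 <= k <= len(values):
        raise ValueError("k out of range")
    a = list(values)                    # work on a copy; the input is left unchanged
    left, right = 0, len(a) - 1
    while True:
        if left == right:               # base case: single-element range
            return a[left]
        # Move a random pivot to the end, then partition a[left..right].
        r = random.randint(left, right)
        a[r], a[right] = a[right], a[r]
        pivot, i = a[right], left
        for j in range(left, right):
            if a[j] <= pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[right] = a[right], a[i]  # pivot is now at index i (pos)
        count = i - left + 1             # elements in a[left..pos]
        if count == k:
            return a[i]
        if count > k:
            right = i - 1                # continue on the left side only
        else:
            left, k = i + 1, k - count   # continue on the right side only


print(kth_smallest([10, 3, 6, 9, 2, 4, 15, 23], 4))     # 6
print(kth_smallest([5, -8, 10, 37, 101, 2, 9], 6))      # 37
```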
