Unacademy - Algorithms

1 Asymptotic Analysis

Introduction

● An algorithm is a sequence of finite steps to solve a problem. All steps should be finite and compulsory (if any one step is removed, the algorithm is affected).
● Which algorithm is best suited for a given application depends on the input size, the architecture of the computer, and the kind of storage to be used: disks, main memory, or even tapes.
● An algorithm should terminate after a finite time and produce one or more outputs by taking zero or more inputs.

Need for Analysis

● A problem may have many candidate solutions, the majority of which do not solve the problem at hand. Finding a solution that does, or one that is 'best', can be quite a challenge.
● Estimating how much time a program will take and how much space it will occupy while running is known as analysis of the program, or analysis of the algorithm.
● Algorithm analysis mostly depends on time and space complexity.

Generally, we perform the following types of analysis:
⚪ Worst case: The maximum number of steps or resources required to solve a problem.
⚪ Best case: The minimum number of steps or resources required to solve a problem.
⚪ Average case: The average number of steps or resources required to solve a problem.
⚪ Amortized: The cost of a sequence of operations applied on an input of size n, averaged over the sequence.

● The order of growth of the running-time curve as the input size varies helps us to know the efficiency of an algorithm, and lets us compare its relative performance against an alternative algorithm.
● We study the asymptotic efficiency of an algorithm to know how its running time increases with the size of the input.

Asymptotic Notations

● We use asymptotic notation primarily to describe the running times of algorithms.
● With the help of asymptotic notation, we compare two positive functions in terms of running time.

θ-Notation

Let f(n) and g(n) be two positive functions.
f(n) = θ(g(n))
if and only if
f(n) ≤ c1 · g(n) and f(n) ≥ c2 · g(n), ∀ n ≥ n0
such that there exist three positive constants c1 > 0, c2 > 0 and n0 ≥ 1.

[Fig. 1.2 (A): the curve f(n) sandwiched between c1 g(n) above and c2 g(n) below, for all n ≥ n0.]

O-Notation [Pronounced "big-oh"]

Let f(n) and g(n) be two positive functions.
f(n) = O(g(n))
if and only if
f(n) ≤ c · g(n), ∀ n ≥ n0
such that there exist two positive constants c > 0, n0 ≥ 1.

[Fig. 1.2 (B): the curve f(n) lying below c g(n) for all n ≥ n0.]

e.g.,
f(n) = 3n + 2, g(n) = n
f(n) = O(g(n)), since 3n + 2 ≤ cn holds with c = 4 and n0 ≥ 2.

Ω-Notation [Pronounced "big-omega"]

● Ω-notation provides an asymptotic lower bound for a given function, denoted by Ω(g(n)).
f(n) = Ω(g(n))
if and only if
f(n) ≥ c · g(n), ∀ n ≥ n0
such that there exist two positive constants c > 0, n0 ≥ 1.

[Fig. 1.2 (C): the curve f(n) lying above c g(n) for all n ≥ n0.]

Theorem to Remember

Two functions f(n) and g(n) satisfy f(n) = θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

o-Notation [Pronounced "little-oh"]

For a function g(n), we denote by o(g(n)) the set of functions:
f(n) = o(g(n))
iff for every constant c > 0 there exists n0 ≥ 1 such that
f(n) < c · g(n) whenever n ≥ n0.

ω-Notation [Pronounced "little-omega"]

● By analogy, ω-notation is to Ω-notation as o-notation is to O-notation.
For a function g(n), we denote by ω(g(n)) the set of functions:
f(n) = ω(g(n))
iff for every constant c > 0 there exists n0 ≥ 1 such that
f(n) > c · g(n) whenever n ≥ n0.

Some of the important properties of asymptotic notation are:

Transitivity:
● If f(n) = θ(g(n)) and g(n) = θ(h(n)), then f(n) = θ(h(n))
● If f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n))
● If f(n) = Ω(g(n)) and g(n) = Ω(h(n)), then f(n) = Ω(h(n))
● If f(n) = o(g(n)) and g(n) = o(h(n)), then f(n) = o(h(n))
● If f(n) = ω(g(n)) and g(n) = ω(h(n)), then f(n) = ω(h(n))

Reflexivity:
● f(n) = θ(f(n))
● f(n) = O(f(n))
● f(n) = Ω(f(n))

Symmetry:
f(n) = θ(g(n)) if and only if g(n) = θ(f(n))

Transpose symmetry:
f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
f(n) = o(g(n)) if and only if g(n) = ω(f(n))
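As a small worked illustration of these definitions (added here; the constants shown are one valid choice, not the only one): take f(n) = 3n + 2 and g(n) = n. Since 3n + 2 ≤ 4n for all n ≥ 2, we have f(n) = O(n) with c = 4 and n0 = 2; since 3n + 2 ≥ 3n for all n ≥ 1, we have f(n) = Ω(n) with c = 3 and n0 = 1. Both bounds together give 3n + 2 = θ(n) with c1 = 4, c2 = 3 and n0 = 2, which is exactly the "Theorem to Remember" above in action.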



Comparison of Functions

1. Which is asymptotically larger: log(log*n) or log*(log n)?

log*n (read "log star" of n) denotes the iterated logarithm.

Note:
● The growth of the iterated logarithm function is very slow:
log*2 = 1,
log*4 = 2,
log*16 = 3,
log*65536 = 4,
log*(2^65536) = 5

lim_{n→∞} log(log* n) / log*(log n)
= lim_{n→∞} log(log* 2^n) / log*(log 2^n)    [we have log* 2^n = 1 + log* n]
= lim_{n→∞} log(1 + log* n) / log* n
= lim_{n→∞} log(1 + n) / n
= lim_{n→∞} 1 / (1 + n)
= 0
[∴ log*(log n) is asymptotically larger.]

Rack Your Brain

Consider the following C program:

unacademy(int n)
{
  int n = 2^(2^k);
  for (i = 1; i <= n; i++)
  {
    j = 2;
    while (j <= n)
    {
      j = j * j;
      printf("prepladder");
    }
  }
}

Express the number of times "prepladder" is printed using O notation.

2. The table below gives whether A is O, o, Ω, ω or θ of B. Assume k ≥ 1, ε > 0 and c > 1 are constants.

A           B           O     o     Ω     ω     θ
log^k n     n^ε         Yes   Yes   No    No    No
n^k         c^n         Yes   Yes   No    No    No
√n          n^(sin n)   No    No    No    No    No
2^n         2^(n/2)     No    No    Yes   Yes   No
n^(log c)   c^(log n)   Yes   No    Yes   No    Yes
log(n!)     log(n^n)    Yes   No    Yes   No    Yes

Note:
● log(n!) = θ(n log n) (by Stirling's approximation).
● Fibonacci numbers are related to the golden ratio φ and to its conjugate φ̂, which are the two roots of the equation x² = x + 1 and are given by the following formulas:
φ = (1 + √5)/2 = 1.61803
φ̂ = (1 − √5)/2 = −0.61803
Specifically, we have
Fi = (φ^i − φ̂^i)/√5
where F0 = 0, F1 = 1 and Fi = Fi−1 + Fi−2 in the Fibonacci series.

3. Example 1:
Determine which is the faster growing function between 2^n and n².
Solution:
Apply log:
log 2^n , log n²
n log 2 , 2 log n
Substitute a large value for n (e.g., n = 2^100):
2^100 log 2 , 2 log 2^100
2^100 , 100 · 2
∴ 2^n grows faster than n².

Example 2:
Determine which is the faster growing function between √n and log n · log n.
Solution:
√n , log n · log n
Substitute n = 2^16:
√(2^16) , log(2^16) · log(2^16)
2^8 , 16 · 16
The two values are equal at n = 2^16, but at a larger value (e.g., n = 2^64: 2^32 against 64 · 64) the square root dominates.
∴ √n grows faster than log n · log n.

Note:
In the exam, do not apply logarithm blindly; first cancel the common terms, then apply logarithm.

4. Following are some functions in increasing order of asymptotic (big-O) growth:
4^(log log n) < 4^(log n) < n^(1/2) log⁴(n) < 5n < n⁴ < (log n)^(5 log n) < n^(log n) < n^(n/5) < 5^n < 5^(5n) < (n/4)^(n/4) < n^(n/4) < 4^(4^n) < 5^(5^n)
Verify the above given order by applying logarithm and substituting a large value for n.

Rack Your Brain

Arrange the following functions in asymptotically decreasing order:
● f1 = 2^(log₂ n)
● f2 = n^(2024/16)
● f3 = n log n
● f4 = n^(log n)
Recurrences

Recurrences go hand in hand with the divide and conquer paradigm, because they give us a natural way to characterize the running times of divide and conquer algorithms.

Definition:
A recursive program written in the form of a mathematical relation is called a recurrence relation.

Example: T(n) = T(2n/3) + T(n/3) + θ(n)

There are three methods of solving a recurrence relation:
1. Substitution method
2. Recursion (recursive)-tree method
3. Master's theorem

a) Substitution method:
Guess a bound and use mathematical induction to prove whether the guess is correct.
b) Recursion (recursive)-tree method:
When more than one recursive call is present in the given recurrence relation, we go for the recursive tree method.
c) Master's theorem:
Applies when the recurrence relation is of the form
T(n) = aT(n/b) + f(n)
where a and b are positive constants with a ≥ 1 and b > 1.

The substitution method for solving recurrences

The substitution method comprises two steps:
1. Write the recurrence relation, which is nothing but the recursive program expressed as a mathematical relation, and guess the asymptotic (upper or lower) bound on it.
2. Use mathematical induction to find the constants and show that the guess is correct.

For example:
T(n) = 2T(⌊n/2⌋) + n
Let's guess that the solution is T(n) = O(n log n).
The substitution method requires us to prove
T(n) ≤ c n log n.
Let's start by assuming that this bound holds for all positive m < n, in particular for m = ⌊n/2⌋:
T(⌊n/2⌋) ≤ c ⌊n/2⌋ log(⌊n/2⌋)



Substituting into the recurrence yields
T(n) ≤ 2(c ⌊n/2⌋ log(⌊n/2⌋)) + n
     ≤ c n log(n/2) + n
     = c n log n − c n log 2 + n
     = c n log n − c n + n
     ≤ c n log n,
where the last step holds as long as c ≥ 1.

Examples:

1. T(n) = T(n−1) + n for n > 1; T(n) = 1 if n = 1
T(n) = T(n−1) + n
     = T(n−2) + (n−1) + n
     = T(n−3) + (n−2) + (n−1) + n
     = T(n−k) + (n−(k−1)) + … + n
[T(n−k) becomes T(1) for k = n−1]
     = T(1) + 2 + 3 + … + n
     = 1 + 2 + 3 + … + n
     = n(n + 1)/2
     = O(n²)

2. function()
   {
     if (n > 1)
       return(function(n−1))
   }
The above function can be written in equation form as
T(n) = 1 + T(n−1) if n > 1; 1 if n = 1
T(n) = 1 + 1 + T(n−2)
     = 2 + T(n−2)
     = 3 + T(n−3)
     = 4 + T(n−4)
     …
     = k + T(n−k)
T(n−k) will eventually become T(1) when k = n−1, so
T(n) = n − 1 + T(n − (n−1)) = n − 1 + 1 = n
∴ T(n) = O(n)

Note:
Sometimes, a small algebraic manipulation can make an unknown recurrence similar to a familiar one.
Example:
T(n) = 2T(⌊√n⌋) + log n, which looks difficult.
● We can simplify this recurrence by changing the variables.
● For convenience, renaming m = log n yields
T(2^m) = 2T(2^(m/2)) + m
● Let's say T(2^m) = S(m) to produce the new recurrence
S(m) = 2S(m/2) + m    // S(m) = O(m log m)
● After back substitution from S(m) to T(n), we obtain
T(n) = T(2^m) = S(m) = O(m log m) = O(log n · log log n)

Rack Your Brain

Consider the recurrence relation
T(n) = √n T(√n) + 100n
Express T(n) in θ notation.

The recursion (recursive)-tree method for solving recurrences:

● The recursive tree method produces an N-ary tree (where N is the number of recursive calls).
● We calculate the recursive terms at each level of the N-ary tree.
● At every level, we also calculate the non-recursive terms.

Examples:

i) T(n) = 3T(n/4) + cn²

T(n) = T(n/4) + T(n/4) + T(n/4) + cn²

● At every level, the subproblem size decreases by a factor of 4, and we must go down to the boundary condition.
● The subproblem size for a node at depth i is n/4^i.
● At each level, every node makes three recursive calls.
● At depth i, every node has a cost of c(n/4^i)².
● Multiplying, we can see that the total cost over all nodes at depth i, for i = 0, 1, 2, …, log₄n − 1, is 3^i c(n/4^i)² = (3/16)^i cn².
● The bottom level, at depth log₄n, has 3^(log₄ n) = n^(log₄ 3) nodes, each contributing cost T(1), for a total cost of n^(log₄ 3) T(1), which is θ(n^(log₄ 3)), since we assume that T(1) is a constant.
● To find the total complexity, we find the cost at each level and sum all the costs:

T(n) = cn² + (3/16)cn² + (3/16)²cn² + … + (3/16)^(log₄n − 1) cn² + θ(n^(log₄ 3))
     = Σ_{i=0}^{log₄n − 1} (3/16)^i cn² + θ(n^(log₄ 3))
     < Σ_{i=0}^{∞} (3/16)^i cn² + θ(n^(log₄ 3))
     = 1/(1 − 3/16) · cn² + θ(n^(log₄ 3))
     = (16/13) cn² + θ(n^(log₄ 3))
     = O(n²)
∴ T(n) = O(n²)
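As a quick cross-check (added here; the master's theorem itself is stated in the next section), the same bound follows directly: for T(n) = 3T(n/4) + cn² we have a = 3, b = 4 and n^(log₄ 3) ≈ n^0.793, so f(n) = cn² = Ω(n^(log₄ 3 + ε)); the regularity condition also holds, since a·f(n/b) = 3c(n/4)² = (3/16)cn² ≤ (3/16)f(n) with 3/16 < 1. Case 3 therefore gives T(n) = θ(n²), matching the recursion-tree answer.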



ii) T(n) = T(n/3) + T(2n/3) + O(n)

[Recursion tree: the root costs cn, and each full level of internal nodes again contributes a total of about cn; the longest root-to-leaf path follows the 2n/3 branch.]

● The largest simple path from the root to a leaf is
n → (2/3)n → (2/3)²n → … → 1
When k = log_{3/2} n this path reaches 1, so the height of the tree is log_{3/2} n.
● By intuition, we expect the solution to the recurrence to be at most the number of levels multiplied by the cost of each level, or O(cn log_{3/2} n) = O(n log n).
● Not every level in the tree contributes a cost of cn (the lower levels are incomplete).
● Indeed, we can use the substitution method to verify that O(n log n) is an upper bound for the solution to the recurrence. We have
T(n) ≤ T(n/3) + T(2n/3) + cn
     ≤ d(n/3)log(n/3) + d(2n/3)log(2n/3) + cn,
where d is a suitable positive constant
     = (d(n/3)log n − d(n/3)log 3) + (d(2n/3)log n − d(2n/3)log(3/2)) + cn
     = dn log n − d((n/3)log 3 + (2n/3)log(3/2)) + cn
     = dn log n − d((n/3)log 3 + (2n/3)log 3 − (2n/3)log 2) + cn
     = dn log n − dn(log 3 − (2/3)log 2) + cn
     ≤ dn log n,
as long as d ≥ c/(log 3 − (2/3)log 2).

iii) T(n) = 2T(n/2) + c if n > 1; c if n = 1

[Recursion tree: the level costs are c, 2c, 4c, 8c, …, down to the T(1) leaves, whose level contributes nc.]

● The total cost over all nodes at depth i, for i = 0, 1, 2, …, log n, is 2^i c.
T(n) = Σ_{i=0}^{log n} 2^i c
     = c(1 + 2 + 4 + … + n), assume n = 2^k
     = c(1 + 2¹ + 2² + … + 2^k)
     = c(2^(k+1) − 1)/(2 − 1)
     = c(2^(k+1) − 1)
     = c(2n − 1)
     = O(n)

Master's theorem

The recurrence
T(n) = aT(n/b) + f(n)    …(1)
describes the running time of an algorithm that divides a problem of size n into 'a' subproblems, each of size n/b.
● The 'a' subproblems are solved recursively, each in time T(n/b).
● The function f(n) covers the cost of dividing the problem and combining the results of the subproblems.

Master's theorem:
Let f(n) be a positive function and let T(n) be defined by the recurrence relation
T(n) = aT(n/b) + f(n)
where a ≥ 1 and b > 1 are two positive constants. Then T(n) obeys the following asymptotic bounds:

Case 1:
If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = θ(n^(log_b a)).

Case 2:
If f(n) = θ(n^(log_b a)), then T(n) = θ(n^(log_b a) log n).

Case 3:
If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = θ(f(n)).

The master's theorem procedure for solving recurrences:
The master theorem can be used to solve a recurrence relation if and only if it is in the form
T(n) = aT(n/b) + f(n)
where a ≥ 1 and b > 1 are positive constants and f(n) is a positive function.

Rack Your Brain

Consider the recurrence relation
T(n) = T(n/2) + 2T(n/5) + T(n/10) + 4n
Express T(n) in 'big-oh' (O) notation.
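To connect the shape T(n) = aT(n/b) + f(n) to actual code, here is a small illustrative C sketch (added for this text; the function and parameter names are invented). It makes a = 2 recursive calls on halves (b = 2) and does a θ(n) pass that stands in for a merge/combine step, so its running time satisfies T(n) = 2T(n/2) + θ(n), which case 2 resolves to θ(n log n):

#include <stdio.h>

/* Toy divide-and-conquer: returns the maximum of arr[low..high].
   The linear scan stands in for a combine step such as merge sort's MERGE. */
int solve(const int arr[], int low, int high)
{
    if (low == high)                        /* base case: theta(1) */
        return arr[low];
    int mid = (low + high) / 2;
    int left  = solve(arr, low, mid);       /* a = 2 subproblems ...    */
    int right = solve(arr, mid + 1, high);  /* ... of size n/b, b = 2   */
    int best = (left > right) ? left : right;
    for (int i = low; i <= high; i++)       /* f(n) = theta(n) combine cost */
        if (arr[i] > best) best = arr[i];
    return best;
}

int main(void)
{
    int a[] = {5, -4, 3, 4, -2, 1};
    printf("%d\n", solve(a, 0, 5));         /* prints 5 */
    return 0;
}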



● In case 1, the function n^(log_b a) is the larger, and the solution is given by T(n) = θ(n^(log_b a)).
● In case 3, the function f(n) is the larger, and the solution is T(n) = θ(f(n)).
● In case 2, the two functions are the same size, so we multiply by a logarithmic factor, and the solution is T(n) = θ(n^(log_b a) log n) = θ(f(n) log n).

Note:
The three cases of master's theorem do not cover all the possibilities for f(n):
● There is a gap between cases 1 and 2 when f(n) is smaller than n^(log_b a) but not polynomially smaller.
● Similarly, there is a gap between cases 2 and 3 when f(n) is larger than n^(log_b a) but not polynomially larger.
● If the function f(n) falls into one of these gaps, you cannot use the master method to solve the recurrence.

Examples:

1. T(n) = 9T(n/3) + n
For this recurrence, we have a = 9, b = 3, f(n) = n, and thus n^(log_b a) = n^(log₃ 9) = θ(n²).
Since f(n) = O(n^(log₃ 9 − ε)) with ε = 1, we can apply case 1 of master's theorem.
∴ T(n) = θ(n²)

2. T(n) = T(2n/3) + 1
a = 1, b = 3/2, f(n) = 1, and n^(log_b a) = n^(log_{3/2} 1) = n⁰ = 1.
Case 2 of master's theorem applies, since f(n) = θ(n^(log_b a)) = θ(1).
∴ T(n) = θ(log n)

3. T(n) = 3T(n/4) + n log n
We get a = 3, b = 4, f(n) = n log n, and n^(log_b a) = n^(log₄ 3) = O(n^0.793).
Since f(n) = Ω(n^(log₄ 3 + ε)), where ε ≈ 0.2, case 3 of master's theorem applies, provided the regularity condition holds. For sufficiently large n, we have that
a f(n/b) = 3(n/4)log(n/4) ≤ (3/4) n log n = c f(n), for c = 3/4.
Consequently, by case 3,
T(n) = θ(n log n)

4. T(n) = 2T(n/2) + n log n
Even though it appears to have the proper form: a = 2, b = 2, f(n) = n log n, and n^(log_b a) = n.
f(n) is not polynomially larger, so this does not belong to case 3:
f(n)/n^(log_b a) = (n log n)/n = log n
is asymptotically less than n^ε for any positive constant ε.
Consequently, the recurrence falls into the gap between case 2 and case 3.

5. T(n) = 8T(n/2) + θ(n²)
a = 8, b = 2, f(n) = θ(n²), and n^(log_b a) = n^(log₂ 8) = n³.
Since n³ is polynomially larger than f(n), i.e., f(n) = O(n^(3 − ε)) for ε = 1, case 1 applies.
T(n) = θ(n³)

6. T(n) = 7T(n/2) + θ(n²)
a = 7, b = 2, f(n) = θ(n²), and n^(log_b a) = n^(log₂ 7).
f(n) = O(n^(log₂ 7 − ε)) for ε ≈ 0.8, so case 1 applies.
∴ T(n) = θ(n^(log₂ 7))

Rack Your Brain

Give the tight asymptotic bound for the recurrence relation
T(n) = 2T(n/4) + n

Previous Years' Question

Consider the following three functions.
f1 = 10^n    f2 = n^(log n)    f3 = n^√n
Which one of the following options arranges the functions in the increasing order of asymptotic growth rate?
(A) f3, f2, f1    (B) f2, f1, f3
(C) f1, f2, f3    (D) f2, f3, f1
Solution: (D)    [GATE CS 2021 (Set-1)]

Previous Years' Question

There are n unsorted arrays: A1, A2, ..., An. Assume that n is odd. Each of A1, A2, ..., An contains n distinct elements. There are no common elements between any two arrays. The worst-case time complexity of computing the median of the medians of A1, A2, ..., An is
[GATE CS 2019]
(A) O(n)        (B) O(n log n)
(C) O(n²)       (D) O(n² log n)
Solution: (C)

Previous Years' Question

Consider the following functions from positive integers to real numbers:
10, √n, n, log₂n, 100/n
The correct arrangement of the above functions in increasing order of asymptotic complexity is:
[GATE CSE 2017 (Set-1)]
(A) log₂n, 100/n, 10, √n, n
(B) 100/n, 10, log₂n, √n, n
(C) 10, 100/n, √n, log₂n, n
(D) 100/n, log₂n, 10, √n, n
Solution: (B)

Previous Years' Question

Consider the following C function:
int fun (int n) {
   int i, j;
   for (i = 1; i <= n; i++) {
     for (j = 1; j < n; j += i) {
       printf("%d %d", i, j);
     }
   }
}
The asymptotic complexity of fun in terms of θ notation is
[GATE CSE 2017 (Set-2)]
(A) θ(n√n)       (B) θ(n²)
(C) θ(n log n)   (D) θ(n² log n)
Solution: (C)

Previous Years' Question

In an adjacency list representation of an undirected simple graph G = (V, E), each edge (u, v) has two adjacency list entries: [v] in the adjacency list of u, and [u] in the adjacency list of v. These are called twins of each other. A twin pointer is a pointer from an adjacency list entry to its twin. If |E| = m and |V| = n, and the memory size is not a constraint, what is the time complexity of the most efficient algorithm to set the twin pointer in each entry in each adjacency list?
[GATE CSE 2016 (Set-2)]
(A) Θ(n²)       (B) Θ(n + m)
(C) Θ(m²)       (D) Θ(n⁴)
Solution: (B)



Chapter Summary

● Types of analysis
  ⚪ Worst case
  ⚪ Best case
  ⚪ Average case
  ⚪ Amortized
● Asymptotic notations
  ⚪ Big O notation:
    f(n) = O(g(n)) if there exist constants n0 and c such that f(n) ≤ c g(n) for all n ≥ n0
  ⚪ Big omega notation:
    f(n) = Ω(g(n)) if there exist constants n0 and c such that f(n) ≥ c g(n) for all n ≥ n0
  ⚪ Theta (θ) notation:
    If f(n) = O(g(n)) and g(n) = O(f(n)), then f(n) = θ(g(n))
● Master's theorem
  T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is a positive function.
  ⚪ If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = θ(n^(log_b a))
  ⚪ If f(n) = θ(n^(log_b a)), then T(n) = θ(n^(log_b a) log n)
  ⚪ If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = θ(f(n))

2 Divide and Conquer

The divide and conquer strategy involves three main steps:
● Step 1 (Divide): The problem is divided into some subproblems that are smaller instances of the same problem.
● Step 2 (Conquer): Solve the subproblems in a recursive manner. If a subproblem is small enough, solve it directly.
● Step 3 (Combine): Combine the solutions of all the subproblems into the solution of the original problem.
When a subproblem is not small enough to be solved directly (without recursion), i.e., it has not reached the base case, it is divided into smaller subproblems.

The Maximum Subarray Problem

Problem:
Given an array A[1...n], find a contiguous subarray that sums up to the maximum value.
Input: Array A[1...n]
Output: The maximum sum of values possible in a subarray A[i...j], and the indices i and j.

Note:
The maximum subarray might not be unique.

Example:
Array A:  5  −4   3   4  −2   1
index:    1   2   3   4   5   6
Maximum subarray: A[1…4] (i = 1, j = 4) with sum = 8.

Divide and Conquer Strategy:
1. Divide: The array A[l…h] is divided into two subarrays of as equal size as possible by calculating the midpoint m.
2. Conquer:
   a) Find the maximum subarrays of A[l…m] and A[m+1…h] by the algorithm given below.
   b) Find a maximum subarray that crosses the midpoint.
3. Combine: Return the maximum of the above three.
This process works because any subarray must lie completely on one side of the midpoint, completely on the other side, or cross the midpoint.

Algorithm:
FIND-MAXIMUM-CROSSING-SUBARRAY (Arr, l, m, h)
  l-sum ← −∞
  sum ← 0
  for i ← m downto l
    sum ← sum + Arr[i]
    if sum > l-sum
      l-sum ← sum
      max-left ← i
  r-sum ← −∞
  sum ← 0
  for j ← m + 1 to h
    sum ← sum + Arr[j]
    if sum > r-sum
      r-sum ← sum
      max-right ← j
  return (max-left, max-right, l-sum + r-sum)

The algorithm returns the indices max-left and max-right and the sum of the two subarrays.
● If the subarray Arr[l…h] contains n entries, then n = h − l + 1.
● Since each iteration of the two for loops takes θ(1) time, the number of iterations is counted to get the total time complexity.



● The first for loop makes m − l + 1 iterations, and the second for loop makes h − m iterations.
Total number of iterations = (m − l + 1) + (h − m)
                           = h − l + 1
                           = n
∴ FIND-MAXIMUM-CROSSING-SUBARRAY (Arr, l, m, h) takes θ(n) time.
● With the FIND-MAXIMUM-CROSSING-SUBARRAY procedure in hand, which runs in linear time, pseudocode for the divide and conquer solution of the maximum subarray problem can be written.

FIND-MAXIMUM-SUBARRAY (Arr, l, h)
1. if h == l
2.   return (l, h, Arr[l])
3. else m = ⌊(l + h)/2⌋
4.   (l-low, l-high, l-sum) = FIND-MAXIMUM-SUBARRAY (Arr, l, m)
5.   (r-low, r-high, r-sum) = FIND-MAXIMUM-SUBARRAY (Arr, m+1, h)
6.   (cross-low, cross-high, cross-sum) = FIND-MAXIMUM-CROSSING-SUBARRAY (Arr, l, m, h)
7.   if l-sum ≥ r-sum and l-sum ≥ cross-sum
8.     return (l-low, l-high, l-sum)
9.   else if r-sum ≥ l-sum and r-sum ≥ cross-sum
10.    return (r-low, r-high, r-sum)
11.  else return (cross-low, cross-high, cross-sum)

Remarks:
1. Initial call: FIND-MAXIMUM-SUBARRAY (Arr, 1, n).
2. The base case is reached when the subarray cannot be divided further, i.e., it has 1 element.
3. Divide is based on the value of m. Conquer is done by the two recursive calls to FIND-MAXIMUM-SUBARRAY and a call to FIND-MAXIMUM-CROSSING-SUBARRAY. Combine the solutions by finding which of the three results gives the maximum sum.
4. Complexity:
T(n) = 2T(n/2) + θ(n) + θ(1)
     = θ(n log n) by master's theorem.

Previous Years' Question

Consider a sequence of 14 elements:
A = [−5, −10, 6, 3, −1, −2, 13, 4, −9, −1, 4, 12, −3, 0].
The subsequence sum S(i, j) = Σ_{k=i}^{j} A[k].
Determine the maximum of S(i, j), where 0 ≤ i ≤ j < 14.
(Divide and conquer approach may be used.)
Solution: Sum = 6 + 3 − 1 − 2 + 13 + 4 − 9 − 1 + 4 + 12 = 29

Strassen's Algorithm for Matrix Multiplication

Given 2 square matrices A and B of size n×n each, find the product of the two matrices.

Naive method:
SQUARE-MATRIX-MULTIPLY (A, B)
1. n ← A.rows
2. let C be a new n×n matrix
3. for i ← 1 to n
4.   for j ← 1 to n
5.     c_ij ← 0
6.     for k ← 1 to n
7.       c_ij ← c_ij + a_ik · b_kj
8. return C

Because each of the triply-nested loops runs for exactly n iterations, and each execution of line 7 takes constant time,
∴ Time complexity = O(n³) for the naive method.

A simple divide and conquer algorithm:
● Divide matrices A and B into four n/2 × n/2 matrices:
A = [A11 A12; A21 A22], B = [B11 B12; B21 B22], C = [C11 C12; C21 C22]    …(1)
so that we can rewrite the equation C = A·B as
[C11 C12; C21 C22] = [A11 A12; A21 A22] · [B11 B12; B21 B22]
● Using these equations, create a straightforward recursive divide-and-conquer algorithm:

SQUARE-MATRIX-MULTIPLY-RECURSIVE (A, B)
1. n = A.rows
2. let C be a new n×n matrix
3. if n == 1
4.   c11 = a11 · b11
5. else partition A, B, and C as in equations (1)
6.   C11 = SQUARE-MATRIX-MULTIPLY-RECURSIVE (A11, B11)
       + SQUARE-MATRIX-MULTIPLY-RECURSIVE (A12, B21)
7.   C12 = SQUARE-MATRIX-MULTIPLY-RECURSIVE (A11, B12)
       + SQUARE-MATRIX-MULTIPLY-RECURSIVE (A12, B22)
8.   C21 = SQUARE-MATRIX-MULTIPLY-RECURSIVE (A21, B11)
       + SQUARE-MATRIX-MULTIPLY-RECURSIVE (A22, B21)
9.   C22 = SQUARE-MATRIX-MULTIPLY-RECURSIVE (A21, B12)
       + SQUARE-MATRIX-MULTIPLY-RECURSIVE (A22, B22)
10. return C

● In the above method, we do eight multiplications of matrices of size n/2 × n/2 and 4 additions.
● Addition of 2 matrices takes O(n²) time.
So the time complexity can be given as
T(n) = θ(1) if n = 1; 8T(n/2) + θ(n²) if n > 1
By master's theorem, the time complexity is O(n³), which is the same as the naive method.

Strassen's method:
● Strassen's method makes the recursion tree sparse.
● Instead of eight recursive multiplications of n/2 × n/2 matrices, it performs only seven.
● Strassen's method also uses the divide and conquer strategy, i.e., it divides the matrix into smaller matrices of order n/2 × n/2.
● In Strassen's matrix multiplication, the result is assembled from the four sub-matrices using the formula below:
[A11 A12; A21 A22] × [B11 B12; B21 B22] = [P5 + P4 − P2 + P6    P1 + P2;    P3 + P4    P5 + P1 − P3 − P7]
So, a total of 7 multiplications are required.

Time complexity of Strassen's method:
Adding and subtracting two matrices takes O(n²) time. Hence, the time complexity of the algorithm can be written as
T(n) = θ(1) if n = 1; 7T(n/2) + θ(n²) if n > 1
By master's theorem, the time complexity is O(n^(log₂ 7)), which is approximately O(n^2.8074).

NOTE:
Strassen's method is not suggested for practical purposes, due to the following reasons:
● Larger constant factors are involved, and the naive method performs better in most of the cases.
● There are better methods available for sparse matrices in particular.
● The sub-matrices occupy more space in the recursion.
● Strassen's algorithm accumulates larger errors due to precision limitations of computer arithmetic on non-integer values.
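The combination formula above uses the products P1 … P7 without listing them; for reference, the seven standard Strassen products (in the same block notation) are:
P1 = A11 (B12 − B22)
P2 = (A11 + A12) B22
P3 = (A21 + A22) B11
P4 = A22 (B21 − B11)
P5 = (A11 + A22)(B11 + B22)
P6 = (A12 − A22)(B21 + B22)
P7 = (A11 − A21)(B11 + B12)
Each product needs one recursive n/2 × n/2 multiplication, while all the additions and subtractions cost only θ(n²).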



Problem of Sorting

Input: An array A of size n.
Output: A reordering of the array such that A[0] ≤ A[1] ≤ … ≤ A[n−1].

Properties of sorting algorithms:

In place:
A sorting algorithm is said to be in place if at most O(1) elements (in some cases slightly more) are stored outside the array at any moment.
E.g., insertion sort is in place.

Stability:
A sorting algorithm is said to be stable if the relative order of repeated elements is not changed.
E.g., insertion sort, merge sort, etc.

Adaptive:
A sorting algorithm falls into the adaptive family if it takes advantage of the existing order in its input.
E.g., insertion sort.

Merge sort:
The merge sort algorithm uses the divide and conquer approach.

Divide:
Divide the array of n elements into two subarrays of n/2 elements each.

Conquer:
Sort the subarrays recursively using merge sort.

Combine:
Merge the two sorted subarrays to produce the sorted array.

● The main operation of the merge sort algorithm is to merge two sorted arrays.
● MERGE(A, p, q, r), with p ≤ q < r, is used to merge the subarrays. It assumes the two subarrays A[p...q] and A[q+1...r] are each in sorted order.

Algorithm
MERGE_FUNCTION (arr, low, mid, high)
1. let n1 ← (mid − low) + 1
2. let n2 ← high − mid
3. let arr1[1...(n1 + 1)] and arr2[1...(n2 + 1)] be new arrays
4. for i ← 1 to n1
5.   arr1[i] ← arr[low + i − 1]
6. for j ← 1 to n2
7.   arr2[j] ← arr[mid + j]
8. arr1[n1 + 1] ← ∞
9. arr2[n2 + 1] ← ∞
10. i ← 1
11. j ← 1
12. for k ← low to high
13.   if arr1[i] ≤ arr2[j]
14.     arr[k] ← arr1[i]
15.     i ← i + 1
16.   else arr[k] ← arr2[j]
17.     j ← j + 1

● Line 1 finds the length n1 of the subarray arr[low...mid], and line 2 finds the length n2 of the subarray arr[mid+1...high].
● The arrays arr1 and arr2 are created with lengths n1 + 1 and n2 + 1, respectively.
● The extra position in each array holds the sentinel ∞.
● The for loops of lines 4-5 and 6-7 copy the subarray arr[low…mid] into arr1[1…n1] and arr[mid+1…high] into arr2[1…n2].
● Lines 12-17 traverse arr1 and arr2 simultaneously from left to right, and write the smaller of the two current elements back to arr.

Example:
[Figure: MERGE combining two sorted halves of a sample array step by step; the original figure is not reproduced here.]



MERGE-SORT (arr, low, high)
1. if low < high
2.   mid = ⌊(low + high)/2⌋
3.   MERGE-SORT (arr, low, mid)
4.   MERGE-SORT (arr, mid+1, high)
5.   MERGE (arr, low, mid, high)

Example:
[Figure: the recursion tree of merge sort, splitting the array down to single elements and merging back up in sorted order; the original figure is not reproduced here.]

Analysis:
● Merge sort works correctly even on odd-length arrays. The recurrence analysis is simplified if we assume the input array length is a power of 2.

Breakdown of running time:
Divide:
This step finds the middle of the subarray in constant time, i.e., θ(1).
Conquer:
Solving two subproblems of size n/2 each recursively contributes 2T(n/2) to the running time.
Combine:
The MERGE_FUNCTION procedure applied on an n-sized subarray takes θ(n) time.

Recurrence for the worst-case running time T(n) of merge sort:
T(n) = θ(1) if n = 1; 2T(n/2) + θ(n) if n > 1
By master's theorem:
T(n) = θ(n log n)

Rack Your Brain

Think about the properties of merge sort and about other kinds of implementations of merge sort.
Hint: There is an in-place merge sort implementation.

Previous Years' Question

Assume that a merge sort algorithm in the worst case takes 30 seconds for an input of size 64. Which of the following most closely approximates the maximum input size of a problem that can be solved in 6 minutes?
(A) 256    (B) 512
(C) 1024   (D) 2048
Solution: (B)
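As a concrete, runnable rendering of the pseudocode above, here is a compact C sketch of merge sort (written for this text; it uses 0-based indexing and avoids sentinels by testing the remaining lengths instead):

#include <stdio.h>
#include <string.h>

static void merge(int arr[], int tmp[], int low, int mid, int high)
{
    int i = low, j = mid + 1, k = low;
    while (i <= mid && j <= high)              /* pick the smaller head element */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= mid)  tmp[k++] = arr[i++];     /* copy leftovers */
    while (j <= high) tmp[k++] = arr[j++];
    memcpy(arr + low, tmp + low, (size_t)(high - low + 1) * sizeof(int));
}

static void merge_sort(int arr[], int tmp[], int low, int high)
{
    if (low >= high) return;                   /* base case: one element */
    int mid = low + (high - low) / 2;
    merge_sort(arr, tmp, low, mid);
    merge_sort(arr, tmp, mid + 1, high);
    merge(arr, tmp, low, mid, high);
}

int main(void)
{
    int a[] = {5, -4, 3, 4, -2, 1}, tmp[6];
    merge_sort(a, tmp, 0, 5);
    for (int k = 0; k < 6; k++) printf("%d ", a[k]);   /* -4 -2 1 3 4 5 */
    return 0;
}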

Quick sort:
Quick sort uses the 3-step divide and conquer approach to sort an array A[p...r].

Divide:
Rearrange (partition) the array A into two subarrays A[p...q−1] and A[q+1...r]. The partition procedure computes the index q such that the elements of A[p...q−1] are smaller than or equal to A[q] and the elements of A[q+1...r] are greater.

Conquer:
Sort the 2 subarrays A[p…q−1] and A[q+1…r] recursively with quicksort.

Combine:
Since the subarrays are sorted in place, no extra work is needed to combine them: the whole array A[p...r] is now sorted.

Algorithm:
QUICKSORT_ALGO (Arr, low, high)
1. if low < high
2.   mid ← PARTITION_OF_ARRAY (Arr, low, high)
3.   QUICKSORT_ALGO (Arr, low, mid−1)
4.   QUICKSORT_ALGO (Arr, mid+1, high)

The main part of the algorithm is the PARTITION_OF_ARRAY procedure, which rearranges the subarray Arr[low...high] in place.

PARTITION_OF_ARRAY (Arr, low, high)
1. x ← Arr[high]
2. i ← low − 1
3. for j ← low to high − 1
4.   if Arr[j] <= x
5.     i ← i + 1
6.     swap Arr[i] with Arr[j]
7. swap Arr[i + 1] with Arr[high]
8. return i + 1

Example:
[Figure: steps (A)-(I) showing PARTITION_OF_ARRAY acting on a sample array; the original figure is not reproduced here.]

PARTITION_OF_ARRAY applied on an array:
● The array entry A[r] is the pivot element x.
● Orange-coloured elements are all in the first partition, with values no greater than x.
● Pink-coloured elements are in the second partition, with values greater than x.
● Blue-coloured elements have not yet been put in one of the first two partitions.

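The following small C sketch (added here for concreteness; 0-based indexing, sample array invented for the demo) implements the same last-element-pivot partition and quicksort as the pseudocode above:

#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition with the last element as pivot; returns the pivot's final index. */
static int partition(int arr[], int low, int high)
{
    int x = arr[high], i = low - 1;
    for (int j = low; j < high; j++)
        if (arr[j] <= x)
            swap(&arr[++i], &arr[j]);
    swap(&arr[i + 1], &arr[high]);
    return i + 1;
}

static void quicksort(int arr[], int low, int high)
{
    if (low < high) {
        int mid = partition(arr, low, high);
        quicksort(arr, low, mid - 1);
        quicksort(arr, mid + 1, high);
    }
}

int main(void)
{
    int a[] = {2, 8, 7, 1, 3, 5, 6, 4};
    quicksort(a, 0, 7);
    for (int k = 0; k < 8; k++) printf("%d ", a[k]);   /* 1 2 3 4 5 6 7 8 */
    return 0;
}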


Analysing quicksort:

The choice of pivot is most crucial:
● A bad pivot may lead to the worst-case time complexity O(n²), whereas a pivot that divides the array into 2 equal-sized subarrays gives the best-case time complexity, i.e., O(n log n).

The worst-case choice:
The pivot may be the largest or the smallest of all the elements in the array.
● When the first iteration is done, one subarray contains 0 elements, and the other subarray has n−1 elements.
● Quicksort is applied recursively on this second subarray of n−1 elements.
● The partition step needs at least n−1 comparisons.
● The recurrence for the running time is
T(n) = T(n−1) + θ(n), T(1) = θ(1)
T(n) = T(n−1) + cn    (replacing θ with a constant)
     = T(n−2) + c(n−1) + cn
     = T(n−3) + c(n−2) + c(n−1) + cn
     …
     = T(1) + c(2 + 3 + … + n)
     ≤ c(1 + 2 + 3 + … + n)
     = c · n(n+1)/2
     = O(n²)
However, quicksort is fast on "randomly scattered" pivots.

Average case:
Let T(n) be the average number of comparisons during quicksort on n elements, with T(1) = θ(1), T(0) = θ(1). Since each pivot position is equally likely, leaving subproblems of sizes i−1 and n−i, and the partition itself needs n−1 comparisons,
T(n) = (1/n) Σ_{i=1}^{n} (T(i−1) + T(n−i) + n − 1)
nT(n) = 2 Σ_{i=1}^{n} T(i−1) + n(n−1)    …(1)
For n−1:
(n−1)T(n−1) = 2 Σ_{i=1}^{n−1} T(i−1) + (n−1)(n−2)    …(2)
Subtracting equation (2) from (1), we get
nT(n) − (n−1)T(n−1) = 2T(n−1) + 2(n−1)
nT(n) − (n+1)T(n−1) = 2(n−1)
T(n)/(n+1) − T(n−1)/n = 2(n−1)/(n(n+1))
g(n) − g(n−1) = 2(n−1)/(n(n+1)), where g(n) = T(n)/(n+1)
Taking the right-hand side:
2(n−1)/(n(n+1)) = (2(n+1) − 4)/(n(n+1))
                = 2/n − 4/(n(n+1))
                = 2/n − 4/n + 4/(n+1)
                = 4/(n+1) − 2/n
Now,
g(n) − g(n−1) = 4/(n+1) − 2/n
Hence, summing,
g(n) = 4/(n+1) + 2 Σ_{j=2}^{n} 1/j − 2
     = 4/(n+1) + 2 Σ_{j=1}^{n} 1/j − 4
     = 4/(n+1) + 2H(n) − 4
Now,
T(n) = (n+1)(4/(n+1) + 2H(n) − 4)
T(n) = 2(n+1)H(n) − 4n
Using H(n) ≈ ln n + 0.58,
T(n) = 2(n+1) ln n + 1.16(n+1) − 4n
T(n) = 2n ln n − 2.84n + O(1)
T(n) ≈ 2n ln n
The average number of comparisons during quicksort on n elements therefore approaches
2n ln n = 1.39 n log₂n − O(n)
So, the average-case time complexity of quicksort is O(n log n).

Previous Years' Question

Let P be a quicksort program to sort numbers in ascending order using the first element as pivot. Let t1 and t2 be the number of comparisons made by P for the inputs {1, 2, 3, 4, 5} and {4, 1, 5, 3, 2}, respectively. Which one of the following holds?
[GATE CS 2014 (Set-1)]
(A) t1 = 5
(B) t1 < t2
(C) t1 > t2
(D) t1 = t2
Solution: (C)

Previous Years' Question

In quicksort, for sorting n elements, the (n/4)th smallest element is selected as pivot using an O(n) time algorithm. What is the worst-case time complexity of the quicksort?
[GATE CS 2009]
(A) θ(n)        (B) θ(n log n)
(C) θ(n²)       (D) θ(n² log n)
Solution: (B)

Previous Years' Question

Randomised quicksort is an extension of quicksort where the pivot is chosen randomly. What is the worst-case complexity of sorting n numbers using randomised quicksort?
[CS 2001]
(A) O(n)        (B) O(n log n)
(C) O(n²)       (D) O(n!)
Solution: (C)



Randomised quicksort:

● Assume all elements are distinct.
● Partition around a random element, i.e., the pivot is chosen randomly from the set of elements.
● This implies that all splits 0 : n−1, 1 : n−2, …, n−1 : 0 have the same probability 1/n.

Note:
Randomisation is a general tool to improve algorithms with bad worst-case but good average-case complexity.

Randomised_Partition (Arr, p, r)
{
  i ← Random(p, r)   /* generates a random number between p and r, including p and r */
  exchange Arr[r] ↔ Arr[i]
  return Partition (Arr, p, r)
}

Randomised_Quicksort (Arr, p, r)
{
  if p < r
    q ← Randomised_Partition (Arr, p, r)
    Randomised_Quicksort (Arr, p, q−1)
    Randomised_Quicksort (Arr, q+1, r)
}

Note:
● The running time of randomised quicksort does not depend on the input order. It depends on the random numbers provided by the generator.
● Thus, for the same input, the program might take 2 sec today and 5 sec tomorrow.
● The average time taken over many different runs of the program gives us the expected time.
● This is the same as saying that we are averaging over all possible random number sequences provided by the generator.

Time complexity:
● Let T(n) be the expected number of comparisons needed to sort n numbers using randomised quicksort. Each split occurs with probability 1/n.
● Every partition requires at least n−1 comparisons, because n−1 elements have to be compared with the pivot element.
● Let the pivot be the i-th smallest element. Then we get i−1 elements on the left side and n−i elements on the right side, and we have to run randomised quicksort on both parts, taking T(i−1) and T(n−i) time, respectively. So
T(n) = T(i−1) + T(n−i) + (n−1), each i with probability 1/n.

Solution of the quicksort recurrence relation:

Taking the expectation (replacing i with j for simplicity),
T(n) = (1/n) Σ_{j=1}^{n} (T(j−1) + T(n−j) + n − 1)
● Σ_{j=1}^{n} T(j−1) varies from T(0) to T(n−1).
● Σ_{j=1}^{n} T(n−j) varies from T(n−1) to T(0).
So every term T(0) to T(n−1) appears twice, and we write it as:
T(n) = (2/n) Σ_{i=0}^{n−1} T(i) + n − 1    …(1)
Note that T(0) = 0 and T(1) = 0.
Replacing n by n−1:
T(n−1) = (2/(n−1)) Σ_{i=0}^{n−2} T(i) + n − 2
or Σ_{i=0}^{n−2} T(i) = ((n−1)/2)(T(n−1) − n + 2)    …(2)
From (1) and (2):
T(n) = (2/n)(T(n−1) + ((n−1)/2)(T(n−1) − n + 2)) + n − 1
So we got
T(n) = ((n+1)/n) T(n−1) + 2(n−1)/n
     < ((n+1)/n) T(n−1) + 2    [since 2(n−1)/n < 2; e.g., for n = 5, 2 × (4/5) = 1.6]
Unrolling:
T(n) < ((n+1)/n)((n/(n−1)) T(n−2) + 2) + 2
     = ((n+1)/(n−1)) T(n−2) + 2(n+1)/n + 2
     < ((n+1)/(n−2)) T(n−3) + 2(n+1)(1/n + 1/(n−1)) + 2
     < ((n+1)/(n−3)) T(n−4) + 2(n+1)(1/n + 1/(n−1) + 1/(n−2)) + 2
     …
     < ((n+1)/3) T(2) + 2(n+1)(1/4 + 1/5 + … + 1/n) + 2
Continuing down to the boundary,
T(n) < 2(n+1)(1/2 + 1/3 + 1/4 + … + 1/n) + 2
[1 + 1/2 + 1/3 + 1/4 + … + 1/n is the harmonic series, which is θ(log n)]
T(n) < 2(n+1)(log n − 1) + 2
     = O(n log n) = O(n log₂n)
∴ The expected number of comparisons is O(n log n). But the worst-case time complexity of randomised quicksort is still O(n²).
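To see the ≈ 1.39 n log₂n estimate in practice, the following small C experiment (added for illustration; link with -lm) counts the comparisons made by random-pivot quicksort on a random input and prints the count next to 1.39 n log₂n:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static long comparisons = 0;

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

static int rand_partition(int a[], int p, int r)
{
    swap(&a[p + rand() % (r - p + 1)], &a[r]);    /* random pivot moved to the end */
    int x = a[r], i = p - 1;
    for (int j = p; j < r; j++) {
        comparisons++;                            /* one comparison with the pivot */
        if (a[j] <= x) swap(&a[++i], &a[j]);
    }
    swap(&a[i + 1], &a[r]);
    return i + 1;
}

static void rand_quicksort(int a[], int p, int r)
{
    if (p < r) {
        int q = rand_partition(a, p, r);
        rand_quicksort(a, p, q - 1);
        rand_quicksort(a, q + 1, r);
    }
}

int main(void)
{
    enum { N = 100000 };
    static int a[N];
    for (int i = 0; i < N; i++) a[i] = rand();
    rand_quicksort(a, 0, N - 1);
    printf("comparisons = %ld, 1.39*n*log2(n) = %.0f\n",
           comparisons, 1.39 * N * log2(N));
    return 0;
}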

Heap sort:

● A heap is a data structure used to store and manage information. It is the design technique used by heapsort.
● A binary heap is an almost complete binary tree.
● A tree is said to be almost complete if all the levels are filled except possibly the last level, and the nodes in the last level are filled from left to right.
● For the heap sort algorithm, we use a max-heap.
● Min-heaps commonly implement priority queues.

Binary tree representation:
[Figure: a full binary tree of height 3, and a complete binary tree with 10 nodes and height 3.]

Heap as a tree (array representation):
● Root: the first element in the array.
● parent(i) = ⌊i/2⌋: returns the index of the node's parent.
● left(i) = 2i: returns the index of the node's left child.
● right(i) = 2i + 1: returns the index of the node's right child.
● The height of a binary heap is O(log n).

There are two types of binary heaps: max-heaps and min-heaps. In both heaps, the values in the nodes satisfy a heap property.
● In a max-heap, the max-heap property is that for every node i other than the root node,
A[PARENT(i)] ≥ A[i]
● The min-heap property is that for every node i other than the root,
A[PARENT(i)] ≤ A[i]
The smallest valued node in a min-heap is at the root.

[Figure: the max-heap A = 16 14 10 8 7 9 3 2 4 1 drawn both as a tree and as an array.]

Heap operations:
● Build-Max-Heap: produce a max-heap from an unordered array A.
● Max-Heapify: corrects a single violation of the heap property in a subtree at its root.
● Insert, Extract-Max, heap sort.

Max-Heapify:
● Assume that the trees rooted at left(i) and right(i) are max-heaps.
● If element A[i] violates the max-heap property, correct the violation by "trickling" element A[i] down the tree, making the subtree rooted at index i a max-heap.

● Max-Heapify pseudocode:
MAX-HEAPIFY (Arr, i)
1. l ← left_node(i)
2. r ← right_node(i)
3. if (l ≤ Arr.size_of_heap && Arr[i] < Arr[l])
4.   maximum ← l
5. else maximum ← i
6. if (r ≤ Arr.size_of_heap && Arr[maximum] < Arr[r])
7.   maximum ← r
8. if (maximum ≠ i)
9.   swap Arr[i] and Arr[maximum]
10.  MAX-HEAPIFY (Arr, maximum)

Max-Heapify (example):
[Figure: MAX-HEAPIFY trickling a small root value down the max-heap 16 14 10 8 7 9 3 2 4 1; the original figure is not reproduced here.]

Build-Max-Heap:
● Converts an array A[1…n] into a max-heap, where n = A.length.
● The elements in the subarray A[(⌊n/2⌋+1)...n] are all leaves of the tree, and so each is a 1-element heap to begin with.

BUILD-MAX-HEAP (A):
1. A.size_of_heap = A.length
2. for i = ⌊A.length/2⌋ down to 1
3.   MAX-HEAPIFY (A, i)

Build-Max-Heap (example):
A = 4 1 3 2 16 9 10 14 8 7
[Figure: the tree for A, with MAX-HEAPIFY applied at i = 5, 4, 3, 2, 1 in turn: MAX-HEAPIFY(A, 5) makes no change; MAX-HEAPIFY(A, 4) swaps A[4] and A[8]; MAX-HEAPIFY(A, 3) swaps A[3] and A[7]; MAX-HEAPIFY(A, 2) swaps A[2] and A[5], then A[5] and A[10]; MAX-HEAPIFY(A, 1) swaps A[1] with A[2], A[2] with A[4], and A[4] with A[9].]

Note:
The number of nodes of height h in a complete binary tree with n nodes in total is at most ⌈n/2^(h+1)⌉.

Running time of BUILD-MAX-HEAP:
● Observe, however, that MAX-HEAPIFY takes O(1) time for nodes that are one level above the leaves and, in general, O(L) for nodes that are L levels above the leaves.
● We have n/4 nodes one level above the leaves, n/8 nodes two levels above, …, and 1 root node that is log n levels above the leaves.
● The total amount of work in the BUILD-MAX-HEAP for loop can be summed as
n/4 (1c) + n/8 (2c) + n/16 (3c) + … + 1(log n · c)
Substituting n/4 = 2^k,
c 2^k (1/2⁰ + 2/2¹ + 3/2² + … + (k+1)/2^k)
The term in brackets is bounded by a constant (the series Σ_{i≥0} (i+1)/2^i converges to 4).
● This means that BUILD-MAX-HEAP runs in O(n) time.

Heap-sort demo:
[Figure: heapsort run on the max-heap A = 16 14 10 8 7 9 3 2 4 1. Each round swaps the root with the last in-heap element (Swap A[10] and A[1], then A[9] and A[1], then A[8] and A[1], and so on), reduces the heap size by one (the swapped-out elements 16, 14, … at the end of the array are "not part of heap"), and calls MAX-HEAPIFY(A, 1) to restore the max-heap property.]

Heap Sort

Sorting strategy:
1. A max-heap is built from the unordered array.
2. The maximum element A[1] is found at the root.
3. Exchange the values of A[n] and A[1], which moves the maximum to the end of the array.
4. Delete node n from the heap.
5. The new root may violate the max-heap property, but its children are still max-heaps. Run MAX-HEAPIFY to fix this.
6. Go to step 2 if the heap is not empty.

Heapsort (A)
{
1. BUILD-MAX-HEAP (A)
2. for (i = A.length() down to 2)
3.   swap A[1] with A[i]
4.   A.heap-size = A.heap-size − 1
5.   MAX-HEAPIFY (A, 1)
}

Running time:
● The heap is empty after n iterations.
● Each iteration does a swap and a MAX-HEAPIFY operation, which takes O(log n) time.
● So heapsort takes O(n log n) time in total.

Note:
● Heapsort is an efficient algorithm and has its own uses, but quicksort beats it if implemented correctly.
● The most commonly used application of heaps is priority queues.

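Before moving on to priority queues, here is a self-contained C sketch of the whole heapsort pipeline described above (added for this text; it uses 1-based heap indices, leaving slot 0 unused, and an iterative sift-down doing the job of MAX-HEAPIFY):

#include <stdio.h>

static void sift_down(int a[], int heap_size, int i)   /* same job as MAX-HEAPIFY */
{
    for (;;) {
        int l = 2 * i, r = 2 * i + 1, largest = i;
        if (l <= heap_size && a[l] > a[largest]) largest = l;
        if (r <= heap_size && a[r] > a[largest]) largest = r;
        if (largest == i) break;
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        i = largest;                                   /* trickle down */
    }
}

static void heapsort(int a[], int n)
{
    for (int i = n / 2; i >= 1; i--)                   /* BUILD-MAX-HEAP */
        sift_down(a, n, i);
    for (int i = n; i >= 2; i--) {                     /* repeatedly extract max */
        int t = a[1]; a[1] = a[i]; a[i] = t;
        sift_down(a, i - 1, 1);
    }
}

int main(void)
{
    int a[] = {0, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7};    /* a[0] unused (1-based) */
    heapsort(a, 10);
    for (int i = 1; i <= 10; i++) printf("%d ", a[i]); /* 1 2 3 4 7 8 9 10 14 16 */
    return 0;
}

The iterative sift-down is a deliberate design choice here: it avoids the O(log n) recursion stack of the recursive MAX-HEAPIFY while doing the same work.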


● As with heaps, priority queues are of two types:
max-priority queue and min-priority queue.
● A max-priority queue supports the following operations:
Insert(S, x): adds an element to set S, i.e., S = S ∪ {x}.
Maximum(S): returns the largest element in the set.
Extract-Max(S): returns the largest element in S and deletes it from the set.
Increase-Key(S, x, k): increases the value of x to k; it is assumed that k ≥ x.
● Correspondingly, a min-priority queue supports the operations MINIMUM, EXTRACT-MIN, INSERT and DECREASE-KEY.

● The procedure HEAP-MAXIMUM implements the MAXIMUM operation on the max-priority queue in θ(1) time.

Heap-Maximum (A):
1. return A[1]

● The procedure HEAP-EXTRACT-MAX implements the EXTRACT-MAX operation.

Heap-Extract-Max (A):
1. if A.heap-size < 1
2.   error "heap underflow"
3. max = A[1]
4. A[1] = A[A.heap-size]
5. A.heap-size = A.heap-size − 1
6. MAX-HEAPIFY (A, 1)
7. return max

● HEAP-EXTRACT-MAX takes O(log n) time, since it performs only O(1) work on top of the O(log n) time for MAX-HEAPIFY.

● The procedure HEAP-INCREASE-KEY implements the INCREASE-KEY operation.

Heap_increase_element (Arr, i, key):
1. if key < Arr[i]
2.   ERROR   /* the new key is smaller than the current element */
3. Arr[i] ← key
4. while (i > 1) && (Arr[PARENT(i)] < Arr[i])
5.   exchange Arr[i] with Arr[PARENT(i)]
6.   i ← PARENT(i)

HEAP-INCREASE-KEY takes O(log n) time on an n-element heap: the path from the node updated in line 3 to the root has length O(log n).

● The procedure MAX-HEAP-INSERT implements the INSERT operation.

Max-Heap-Insert (A, key):
1. A.heap-size = A.heap-size + 1
2. A[A.heap-size] = −∞
3. HEAP-INCREASE-KEY (A, A.heap-size, key)

The running time of MAX-HEAP-INSERT on an n-element heap is O(log n).

Note:
Time taken to perform some operations on a max-heap is given below:

Find max       O(1)
Delete max     O(log n)
Insert         O(log n)
Increase key   O(log n)
Decrease key   O(log n)
Find min       O(n)
Search         O(n)
Delete         O(n)

Rack Your Brain

● Give an O(n log k)-time algorithm to merge k sorted lists into one sorted list, where n is the total number of elements in all the input lists.
Hint: Use a min-heap for k-way merging.
● Write pseudocode for the procedures HEAP-MINIMUM, HEAP-EXTRACT-MIN, HEAP-DECREASE-KEY, and MIN-HEAP-INSERT that implement a min-priority queue with a min-heap.

Previous Years' Question

The number of elements that can be sorted in O(log n) time using heap sort is
[CS 2013]
(A) θ(1)
(B) θ(√(log n))
(C) θ(log n / log log n)
(D) θ(log n)
Solution: (C)

Previous Years' Question

Consider a max heap, represented by the array: 40, 30, 20, 10, 15, 16, 17, 8, 4. Now consider that a value 35 is inserted into this heap. After insertion, the new heap is
[CS 2015]
(A) 40, 30, 20, 10, 15, 16, 17, 8, 4, 35
(B) 40, 35, 20, 10, 30, 16, 17, 8, 4, 15
(C) 40, 30, 20, 10, 35, 16, 17, 8, 4, 15
(D) 40, 35, 20, 10, 15, 16, 17, 8, 4, 30
Solution: (B)

Previous Years' Question

Consider the following array of elements:
(89, 19, 50, 17, 12, 15, 2, 5, 7, 11, 6, 9, 100).
The minimum number of interchanges needed to convert it into a max-heap is:
[CS 2015]
(A) 4    (B) 5
(C) 2    (D) 3
Solution: (D)

Insertion Sort

Insertion sort is an efficient algorithm to sort arrays with a small number of elements.
● It is the same as sorting playing cards in your hand.
● Start with an empty left hand and pick one card at a time from the cards that are face down on the table.
● Compare the card in the right hand with the cards in the left hand, from right to left, and place it in its correct position.
● At any given time, the cards in the left hand are sorted, and they were the top cards of the pile.

Pseudocode for insertion sort:

Insertion-Sort (A):
1. for j from 2 to A.length
2.   key = A[j]
3.   // insert A[j] into the sorted sequence A[1…j−1]
4.   i = j − 1
5.   while i > 0 and A[i] > key
6.     A[i+1] = A[i]
7.     i = i − 1
8.   A[i+1] = key

● INSERTION-SORT takes as a parameter an array A[1…n] containing a sequence of length n that is to be sorted.
● The algorithm sorts the input within the array A, with at most a constant number of elements stored outside the array at any time.
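Before tracing the algorithm on an example, here is a direct C translation of the pseudocode (added for this text; 0-based indexing):

#include <stdio.h>

static void insertion_sort(int a[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = a[j], i = j - 1;
        while (i >= 0 && a[i] > key) {   /* shift larger elements one slot right */
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;                  /* drop the key into its place */
    }
}

int main(void)
{
    int a[] = {5, 2, 4, 6, 1, 3};        /* the example array used below */
    insertion_sort(a, 6);
    for (int k = 0; k < 6; k++) printf("%d ", a[k]);   /* 1 2 3 4 5 6 */
    return 0;
}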


● The operation of INSERTION-SORT on the array A = {5, 2, 4, 6, 1, 3}:

     1 2 3 4 5 6
(A)  5 2 4 6 1 3
(B)  2 5 4 6 1 3
(C)  2 4 5 6 1 3
(D)  2 4 5 6 1 3
(E)  1 2 4 5 6 3
(F)  1 2 3 4 5 6

Analysis of Insertion Sort

Worst-case behaviour on an array of length n:
● The inner loop could be executed i times for iteration i.
● Up to i swaps per iteration ⇒ O(n²) total swaps.

Rack Your Brain

What sort of input leads to the worst-case time complexity in insertion sort?

Best-case behaviour on an array of length n:
● When the input array is already sorted.
● The inner loop is executed 0 times ⇒ 0 swaps.
● The while condition is an entry condition (always evaluated at least once).
So O(n) comparisons are needed in the best case to verify the array is indeed sorted.

NOTE:
The running time of INSERTION-SORT depends on how nearly sorted the input sequence already is. It is better for almost sorted sequences.

Rack Your Brain

Rewrite the INSERTION-SORT procedure to sort into non-increasing instead of non-decreasing order.

Binary Insertion Sort

BINARY-INSERTION-SORT (A, n)
for j ← 2 to n
  insert key = A[j] into the (already sorted) subarray A[1...j−1];
  use binary search to find its right position

● Binary search will take O(log n) time. However, shifting the elements after insertion will still take O(n) time.
Complexity:
O(n log n) comparisons and O(n²) swaps, so the overall time complexity is O(n²).

Previous Years' Question

Consider the following array:
23 32 45 69 72 73 89 97
Which algorithm out of the following options uses the least number of comparisons (among the array elements) to sort the above array in ascending order?
[CS 2021 (Set-1)]
(A) Selection sort
(B) Merge sort
(C) Insertion sort
(D) Quicksort using the last element as pivot
Solution: (C)

Bubble sort:
Bubble sort is a popular yet inefficient sorting algorithm.
● It works by repeatedly swapping adjacent elements that are out of order.

Bubble-Sort (Arr)
1. for i ← 1 up to Arr.length − 1
2.   for j ← Arr.length down to i + 1
3.     if Arr[j] < Arr[j − 1]
4.       swap Arr[j] and Arr[j − 1]

Analysis:
Worst case:
● The i-th iteration of the for loop of line 1 causes n − i iterations of the for loop of lines 2-4, each with constant-time execution, so the worst-case running time of bubble sort is O(n²).

Selection Sort

Selection sort repeatedly finds the minimum of the unsorted part and places it at the beginning.
● It divides the array into two subarrays: one sorted and one unsorted.
● The minimum element of the unsorted subarray is found and appended to the end of the sorted subarray.

Example:
A = {1, 20, 6, 30, 42}
Find the minimum element in A[0...4] and place it at the beginning.
Find the minimum element in A[1...4] and place it at the beginning of A[1...4].
Find the minimum element in A[2...4] and place it at the beginning of A[2...4].
Find the minimum element in A[3...4] and place it at the beginning of A[3...4].
[The intermediate array states from the original figure are not reproduced here.]

Pseudocode of selection sort:

Selection-Sort (A):
for j ← 1 to n−1
  smallest ← j
  for i ← j + 1 to n
    if A[i] < A[smallest]
      smallest ← i
  exchange A[j] ↔ A[smallest]

Time complexity:
● (n−1) comparisons are needed to find the minimum element in an array of n elements.
● The size of the unsorted part decreases to n−1 after the minimum element is placed in its proper position, and then (n−2) comparisons are required to find the minimum of the unsorted part.
Therefore,
(n−1) + (n−2) + … + 1 = n(n−1)/2 comparisons, and n swaps.
∴ Time complexity = O(n²)

Counting Sort

● Counting sort assumes that every element in the n-sized array is in the range 0 to k, where k is an integer.

The algorithm:
Input:
Array A[1…n], where A[j] ∈ {0, 1, 2, …, k}
Output:
The array B[1…n] holds the sorted output, and the array count[0…k] provides temporary working storage.

Counting_sort (array, array_size):
1. max ← maximum element in the array; create an array count of size max + 1 and initialize it with zeroes
2. for j ← 1 to size
3.   count the occurrences of each distinct value, storing the tally of value v at count[v]
4. for i ← 1 to max
5.   compute the cumulative sum and store it in the count array itself
6. for j ← size down to 1, place the elements into the output array:
7.   decrement count[array[j]] and set B[count[array[j]]] = array[j]

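A compact runnable C version of these steps (added for this text; values assumed to lie in 0…k, 0-based arrays, and the right-to-left placement loop is what makes it stable):

#include <stdio.h>
#include <string.h>

static void counting_sort(const int a[], int b[], int n, int k)
{
    int count[k + 1];
    memset(count, 0, sizeof count);
    for (int j = 0; j < n; j++) count[a[j]]++;              /* histogram */
    for (int i = 1; i <= k; i++) count[i] += count[i - 1];  /* cumulative sums */
    for (int j = n - 1; j >= 0; j--)                        /* right to left: stable */
        b[--count[a[j]]] = a[j];
}

int main(void)
{
    int a[] = {2, 5, 3, 0, 2, 3, 0, 3}, b[8];               /* the example below */
    counting_sort(a, b, 8, 5);
    for (int k = 0; k < 8; k++) printf("%d ", b[k]);        /* 0 0 2 2 3 3 3 5 */
    return 0;
}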


Counting sort example:

Input array A (1-based), with elements in the range 0…5:
A = 2 5 3 0 2 3 0 3
After the counting pass, the count array is C[0…5] = 2 0 2 3 0 1.
After the cumulative-sum pass, C = 2 2 4 7 7 8.
Walking A from right to left, each element A[j] is placed at position C[A[j]] of the output array B and the corresponding count is decremented (for example, the first placement puts a 3 at B[7] and changes C to 2 2 4 6 7 8; the next puts a 0 at B[2] and changes C to 1 2 4 6 7 8; and so on), until
B = 0 0 2 2 3 3 3 5

● Finding the maximum element takes O(n) time.
● Creating the count array and initializing it to zeroes takes O(k). Therefore, the total time complexity is O(n + k), and the space complexity is O(k) for the count array, plus the output array of size n.
● Since it copies elements into a separate output array, counting sort is stable but not an in-place algorithm.

Rack Your Brain

Why don't we always use counting sort?
Hint: It depends on the range k of the elements.

Bucket Sort

● Bucket sort assumes that the input is generated by a random process that distributes elements uniformly and independently over an interval [0, 1).
● Bucket sort divides the [0, 1) interval into n equal-sized intervals (known as buckets) and distributes the n input elements into the buckets.
● To get the output, sort the elements in each bucket and list the buckets in order to get the final sorted order.
● Bucket sort assumes that the input is an array A of n elements and that each element A[i] satisfies 0 ≤ A[i] < 1.
● The code requires an auxiliary array B[0...n−1] of linked lists (buckets) and assumes that there is a mechanism for maintaining such lists.

Bucket-Sort (A):
1. let B[0....size−1] be a new array
2. n = A.length
3. for itr ← 0 to size−1
4.   make B[itr] an empty list
5. for itr ← 1 to size
6.   insert A[itr] into list B[⌊n · A[itr]⌋]
7. for itr ← 0 to size−1
8.   sort list B[itr] with insertion sort
9. concatenate the lists B[0], B[1], …, B[size−1] together in order

Example:
Let array A have 10 elements.
[Figure: a 10-element input array and the resulting array B[0…9] of sorted lists (buckets) after line 8 of the algorithm; the original figure is not reproduced here.]

● Like counting sort, bucket sort is fast because it assumes something about the input: counting sort assumes that the input consists of n integers in a small range, whereas bucket sort assumes that the input is drawn uniformly and independently over [0, 1).
● Under the uniform-distribution assumption, bucket sort runs in O(n) time.

Time Complexity

Best case:
● If the elements are uniformly distributed, then every insertion takes O(1) time (assuming every number we insert next goes at the beginning of its list).
● We insert all n numbers, so the total insertion cost is O(n), and then we scan all the buckets once, again O(n).
● Therefore, T(n) = O(n).

Worst case:
● Let us assume that we have k buckets. If the elements are uniformly distributed, each bucket gets about n/k elements in its list, and insertion sort within a bucket compares each new element with up to n/k others. But if the distribution degenerates and the elements pile up in one bucket, the cost of sorting that bucket is O((n/k) · n) = O(n²).
● Therefore, O(n²) is the worst-case time complexity.

Space complexity:
● If we assume that the number of buckets is k, then k cells are dedicated to the buckets, and all n elements need space of size n.
● So, the space complexity = O(n + k).

Radix sort:

● Like the counting and bucket sorting algorithms, radix sort also assumes some information about the input.
● It assumes that the input values are to be sorted digit by digit, i.e., all numbers in the array have d digits.
● In radix sort, sorting is done on each digit, from the last (least significant) to the first.
● A stable sort algorithm is used to sort all the numbers by the least significant digit, then by the next one, and so on.
● Counting sort gives a time complexity of O(nd) ≈ O(n) for this procedure in radix sort (when d is constant).

36 Divide and Conquer Divide and Conquer 37


Algorithm: Searching Algorithms y After that, employ the binary search y Every call to the binary search algorithm is
algorithm. dividing the array into two equal half’s and
y Radix sort (A, d) /* Each key in
Binary search (A, low, high, key)
A[1…n] is a d digit comparing the key with middle element.
integer, and if the Linear search Binary search if high < low Based on the comparison with middle
y For (i=1 to d) do  keys are not of return (low-1) element, recursing to (left/right) half.
Linear Search
d-digit, then we ∴ T(n) = T(n/2) + C
have to make ‘d’ y Given an array arr[ ] of n elements and a   high − low  
mid ← low +  
search element ‘x’ to be searched in arr[ ]   2  By master method
y Use a stable sorting  digit number by
appending zero. y Begin with the leftmost element of arr[ ] if key = A[mid] T(n) = Θ(logn)
*/ and compare x with each element of arr[
return mid
y Algorithm to sort A ] one by one. y O(1) space in case of iterative
y On digit i. y Return the index if x matches an element. else if key < A[mid]
y /* digits are numbers 1 to d from right to return Binary search (A, low, mid-1, key) implementation, and O(logn) recursion
y Return -1 if x does not match any of the
left */ elements. else call stack space in case of recursive
Example: Example: return Binary search (A, mid+1, high, key) implementation.
y The operation of radix sort on a list of Find 20, i.e., x = 20.
seven 3-digit numbers is shown below: Chapter Summary
0 1 2 3 4 5 6 7
10 30 50 70 60 80 20 40
Sorting Time Complexcity Space
Best Case Average Case Worst Case Complexity
return '6' Algorithms
Worst Case
Time complexity of the linear search is
written as O(n) Bubble Sort 0(n) q(n2) 0(n2) 0(1)
Insertion Sort 0(n) q(n2) 0(n2) 0(1)
Note:
Binary search and hash tables are used Selection Sort 0(n2) q(n2) 0(n2) 0(1)
more significantly in practice compared Merge Sort 0(nlogn) q(nlogn) 0(nlogn) 0(n)
y Let n d-digited numbers be given, in which to Linear search since they allow faster Quick Sort 0(nlogn) q(nlogn) 0(n2) 0(n)
each digit can take k values possible. searching.
Heap Sort 0(nlogn) q(nlogn) 0(nlogn) 0(n)
y RADIX-SORT sorts the input numbers
Binary Search
correctly in (d(n+k)) time. If the
intermediate stable algorithm used to sort y Binary search is one of the most efficient Sorting Algorithms In-place Stable
the numbers takes d(n+k). search algorithms available. Bubble Sort Yes Yes
y When each digit is between 0 to k-1 and k is y It is based on the divide-conquer strategy. Insertion Sort Yes Yes
not too large, counting sort is the best choice.
Note: Selection Sort Yes No
Then each pass over n d-digited numbers
Binary search algorithms can be applied Merge Sort No Yes
takes (n+k). Such passes are d. Therefore,
only on sorted arrays.
radix sort takes (d(n+k)) time in total. Quick Sort Yes No
y As a result, the elements must be
y When d is constant and K = O(n), we can y The algorithm that can sort a dynamic input added while sorting is called online
arranged in a certain order.
make radix sort run in linear time. More sorting algorithm.
y If the elements are numbers, use either
generally, we have some flexibility in how Eg: Insertion sort algorithm.
ascending or descending order.
to break each key into digits.
y If the elements are strings, use y The application and implementation details play a big role in determining which
Searching dictionary order. algorithm is best.
y Searching is a process of locating a specific To use binary search on an array that is not y Quicksort out performs heapsort, it is practically used for sorting large input arrays
element among a set of elements. sorted. since its worst case time complexity is 0(nlogn).
y If the required element is found, the search y If stability and space issues are considered, then merge sort is the best choice.
y Sort the array first using the sorting
is considered successful. Otherwise, it is y When the input array is nearly sorted or the input size is small, insertion sort is preferable.
technique.
unsuccessful.

38 Divide and Conquer Divide and Conquer 39


3 Greedy Techniques Constructing a Huffman code:
y Huffman designed the Huffman code,
y In the pseudocode above, we assume that
C is a character set of size n.
a greedy approach for constructing an y Where character c ∈ C denotes an item
optimised prefix code. with the c.freq attribute indicating its
Greedy Algorithms higher frequency and more number of bits frequency.
to characters with lower frequency. Pseudocode: y In a bottom-up approach, the algorithm
Algorithms for optimisation often follow a
Huffman(C) // C represents the set of n constructs the tree T that corresponds to
number of steps, each with a set of options. Example: character the optimal code.
y Dynamic programming is used to solve Let’s say there exists a file with 130 characters. y To produce the final tree, it starts with a
1. n = |C|
numerous optimisation challenges. The characters and their respective frequencies set of |C| leaves and executes a series of
2. Q = Create a min priority queue and make
y Greedy Algorithms select a path that is in the file are given below: |C|-1 “merging” operations.
the string C inserted into it
optimal at that moment, expecting it y All the frequencies are in min priority
3. for i = 1 to n–1
leads to an optimal solution for the entire Character A B C D E F G Queue Q. The algorithm identifies two
4. allocate a new node z
problem in future. characters with the least frequency,
5. z.left = x = EXTRACT_MIN(Q)
y A greedy algorithm won’t work all the time. Frequency 7 11 12 19 21 25 35 removes their freq in min priority Queue,
6. z.right = y = EXTRACT_MIN(Q)
7. z.freq = x.freq + y.freq sums up the two frequencies then adds
Greedy algorithms have two main Solution: them to the Priority queue, which works
8. INSERT (Q,z)
components: Greedy choice property and based on the freq attribute.
For 7 characters, we need a minimum of 3 9. return EXTRACT_MIN(Q) // returns the root
optimal substructure.
bits to represent. of the tree.
Example: a : 45 55
Greedy choice property:
Character A B C D E F G f:5 e:9 c : 12 b : 13 d : 16 a : 45
0 1
The greedy-choice property is the first
essential component. Frequency 7 11 12 19 21 25 35
25 30


Selecting locally optimal solutions which Fixed 0 1 0 1
may lead us to global optimal solutions. length 000 001 010 011 100 101 111 c : 12 b : 13 14 d : 16 a : 45
Codeword c : 12 b : 13 d : 16
In other words, we choose the choice that 0 1
14

appears to be the best in the current solving 0 1


problem by not taking into account the Variable length codeword: f:5 e:9

results of subproblems. f:5 e:9


Character A B C D E F G


Optimal substructure: Frequency 7 11 12 19 21 25 35


14 d : 16 25 a : 45

y If the global optimal solution includes Fixed 0 1 0 1 100


the local optimal solutions, the problem length 0110 0111 010 110 111 00 10
0 1
has an optimal substructure. It is very Codeword f:5 e:9 c : 12 b : 13
important in both dynamic programming a : 45 55
and greedy paradigms. For fixed length number of total bits = (7 + 11 + 12


+ 19 + 21 + 25 + 35) × 3 = 390 bits 25 30 a : 45 0 1
Huffman Codes
For variable length number of total 0 1 0 1 25 30
y Huffman coding compresses data very bits = 4×7 + 4×11 + 3×12 + 3×19 + 3×21
effectively. + 2×25 + 2×35 = 348 bits. c : 12 b : 13 14 d : 16 0 1 0 1

y Consider the information as a string of


Prefix codes: 0 1 c : 12 b : 13 14 d : 16
characters.
y Huffman’s algorithm finds an optimal way y Prefix codes are codes which do not have the f:5 e:9 0 1
of expressing a character in binary. It gives same prefix. These are used to encode and
less number of bits to characters with decode the characters in an optimal way. f:5 e:9


Greedy Techniques 41 42 Greedy Techniques
y The ideal prefix code is represented by the Knapsack holds, i.e (w) = 4 pounds
final tree. Previous Years’ Question y Since, the thief can take a fraction of an t Previous Years’ Question
y The straightforward path from the root to item
the letter is the letter’s codeword.
Consider the string abbccddeee. Each letter Consider the weights and values of items
2 pounds of item A
Analysis: in the string must be assigned a binary  listed below. Note that there is only one
Solution =  +
code satisfying the following properties: 2 pounds of item C
unit of each item.
y We assume that the Queue is implemented 
as a binary min heap in the algorithm. 1. For any two letters, the code assigned Item Weight Value
to one letter must not be a prefix of the Value = 100 + 80 number
y In line 2, the Queue Q is intialised in O(n) (in kgs) (in Rupees)
code assigned to the other letter. = 180
for a character set of n characters (using
BUILD_MIN_HEAP). 2. For any two letters of the same 1 10 60
y From lines 3-8, the for loop iterates for frequency, the letter which occurs earlier Greedy solution for fractional knapsack: 2 7 28
exactly n-1 times, and each heap operation in the dictionary order is assigned a code 3 4 20
y Given a set of items I.
takes O(logn), which results in O(nlogn) whose length is at most the length of
4 2 24
time. the code assigned to the other letter.
Among the set of all binary code Weight w1 w2 … wn
y Hence, the total time taken is O(nlogn), The task is to pick a subset of these
where n is the number of characters assignments which satisfy the above two items such that their total weight is no
Cost C1 C2 … Cn
present in the character set C. properties, what is the minimum length more than 11 kgs and their total value is
of the encoded string?  [2021 (Set-2)] maximised. Moreover, no item may be
Let P denote the problem of selecting items split. The total value of items picked by
(A) 21 (B) 23
from I, with weight limit K, such that the an optimal algorithm is denoted by Vopt A
(C) 25 (D) 30
Previous Years’ Question resulting cost (value) is maximum. greedy algorithm sorts the items by their
Solution: (B)
Ci value to weight ratios in descending
A message is made up entirely of (1) Calculate vi = , for i = 1, 2…, n
wi order and packs them greedily, starting
characters from the set X = (P, Q, R, S, Fractional Knapsack Problem
(2) Sort the items by decreasing vi. Let the from the first item in the ordered list.
T). The table of probabilities of each y Let’s assume a thief went to rob a store The total value of items picked by the
sorted item sequence be 1,2,…,i,…n and
character is shown below: containing n items. Let wi be the weight of greedy algorithm is denoted by Vgreedy’
the corresponding v and weight be vi and
the ith item and vi is the value of the ith wi, respectively. The value of Vopt-Vgreedy is
Character Probability
item.He has a knapsack that can handle a (3) Let k be the current weight limit (initially, Range [16 to 16] [NAT]
P 0.22 weight of w (integer). k = K)
Q 0.34 y The items can be picked in fractions, and y In each iteration, we choose item i from
R 0.17 the problem is to find the set of items the head of the unselected list; if k > = wi, Time complexity:
whose weight sums up to a weight w and we take item i and update k = k–wi, then
S 0.19 y The sorting of all items in decreasing order
whose value is maximum. consider the next unselected item.
T 0.08 of their cost/weight ratio is the most time-
Example: y If k < wi, we take a fraction f of item i, i.e.,
Total 1.00 consuming step. It takes O(n logn) time.
k
y When a thief walks into a store, he notices we only take f = (< 1) of item I, which y Iterating through every item in step (3)
A message of 100 characters over X is wi
the following items. takes O(n) time if the items are already
encoded using Huffman coding. Then weights exactly k. Then the algorithm is arranged in the required order.
the excepted length of the encoded finished. Total time complexity = O(nlogn) + O(n) =
Items A B C
message in bits is __  [2017 (Set-2)] y The algorithm can take an item in fraction, O(nlogn)
(A) 225 (B) 226 i.e., an item need not be considered
Cost 100 10 120
(C) 227 (D) 228 completely.
Solution: (A)
Weight 2 pounds 2 pounds 3 pounds

Greedy Techniques 43 44 Greedy Techniques


Job Sequencing Problem Representation of graphs: lengths of entries in the adjacency list is 1
1
2
3
|E| for the directed graph and 2|E| for the
y Given a list of jobs, each with its own There are two common ways to express a 6 2 4 3
undirected graph.
deadline and associated profit if the jobs graph G=(V,E): adjacency lists and adjacency 6
matrices. y Since an edge (u,v) is represented both in 5
8
4
are completed before the deadline. u and v entry for the undirected graph and
y Given that each job takes a single unit of only in u entry for the directed graph.
Adjacency list representation:


time, the shortest deadline possible for y Both the directed and undirect graphs
any job is 1. The Graph G(V,E) is expressed as an array Adj
need Θ(V+E) memory for their adjacency 1 2 3 4 5
y When only one job can be scheduled at a of size |V|, one for each element of Adj is a
list representation. 1 0 1 0 0 6
time, how can the total profit be maximized? list for each vertex in V. 2 1 0 3 4 2
y One of the disadvantages of the adjacency
3 0 3 0 6 0
Example: y For each vertex u in the set of vertices V, list is its O(V) time complexity to search 4 0 4 6 0 8
Input: there exists an entry in the array Adj i.e. whether an edge (u,v) exists in the graph 5 6 2 0 8 0
Adj[u] with list of all vertices adjacent to u. or not.
Jobs A B C D
i.e. Adj[u] has links of all the vertices one by Minimum Spanning Trees
Adjacency-Matrix representation:
Deadline 1 4 1 1 one adjacent to u in G. Minimum spanning tree T is a subset of the
y In an adjacency matrix, a graph G(V,E) is
represented as a matrix represented in |V| set of Edges E in a connected and undirected
Profit 10 20 30 40 1 2
x |V| size. graph G(V, E) such that it forms an acyclic
3 // An undirected graph G with
1 if (i, j) ∈ E, graph with
5 vertices and 7 edges aij = 
Output: 5 4 0 otherwise
The following is a list of jobs in order of
w(T) = ∑
(u,v)∈T
w(u, v) , where w(T) is minimised.

maximum profit.
1


y Since T is acyclic and covers(spans) all the


D, B 1 2 vertices in the graph, it is known as the
1 2 5 /
3 minimum spanning tree of Graph G.
Algorithm:
2 1 5 3 4 /
6 2 4 3
y The problem of finding subset Tree T in a
1. First sort (in decreasing order) the profit 3 2 4 / set of Edges E such that it covers all the
of given the input. vertices and gives minimum total weight
4 2 5 3 /
2. Iterate on jobs in order of decreasing is called the minimum spanning tree
profit. 5 4 1 2 /
5 4 6 problem.
For each job follow this process:
8
y We can readily adapt adjacency lists to
Note:
y Find a time slot i, such that slot is empty represent weighted graphs.
A complete graph can have maximum nn-2
and i < deadline and i is a job with the y Simply store weight w(u,v) of the edge (u,v)
spanning trees, where n is the number of


current greatest profit. Keep the job in ∈ E with vertex v in u’s adjacency list.
The adjacency-matrix representation of G. nodes.
this slot and mark this slot filled.
Weighted graphs: y An adjacency matrix of a graph requires
Time complexity:
Θ(V2) space, independent of the number
y For sorting the jobs in decreasing order of Definitions
profit O(nlogn) Definitions 1 2 3
of edges in the graph. 4 5
y An adjacency matrix can represent a A complete graph is a graph in which
y For each job, checking the open time slot
between deadline and 0 is O(n)
Graphs in which a weight w is given to 1 0 1 0
weighted graph also. 0 6 each pair of graph vertices is connected
each edge (u, v). y For example, if G(V,E) is a weighted graph by an edge.
for ‘n’ jobs → n * O(n) 2 1 0 3 4
with edge-weight function w, we can2 A complete graph with n vertices is
= O(n2) y If the length of an entry in an adjacency simply store the weight w(u,v) of the edge represented by symbol Kn.
∴ Total time complexity = O(nlogn) + O(n2) list is given by the number of vertices 3 0 3 0 6 0
(u,v) ∈ E as the entry in a row u and column
= O(n2) linked in that entry, then the sum of all v of the adjacency matrix.
4 0 4 6 0 8
5 6 2 0 8 0
Greedy Techniques 45 46 Greedy Techniques
y For non–complete graphs, we can use The two commonly used algorithms to solve y Sorting edges takes O(ElogE) in line 4. ⇓
Kirchhoff’s theorem to find number of minimum spanning tree problems are: y For loop takes O(V) in 2nd and 3rd lines.
8 7
spanning trees. y For loop takes O(ElogV) to perform O(E) b c d
1. Kruskal’s algorithm 4 9
2. Prim’s algorithm UNION and FIND_SET operations on
2
Kirchhoff’s theorem: disjoint_set forest in 5th and 8th lines. q 11 4 14 e

1. For the given graph, create an adjacency Kruskal’s algorithm: ∴ T(n) = O(1) + O(V) + O(ElogE) + O(ElogV) 7 6
8 10
matrix. = O(ElogE) + O(VlogV) = O(ElogE) =
MST_KRUSKAL (G,w) h
1
g
2
f
2. Substitute the degree of nodes for all the O(Elog(V(V-1)/2)) = O(ElogV)
diagonal elements. 1. A=φ ⇓
Example:
3. Substitute -1 for all non-diagonal 1’s 2. for each vertex v ∈ G.V
4. Calculate the co-factor for any element. 3. MAKE_SET(v) Applying Kruskal’s algorithm on this graph b
8
c
7
d

4. For every edge (u, v) ∈ G.E (Mentioned in gives, 4 9


a) 
The total number of spanning trees for 2
increasing order by weight(u, v)): 11 4
that graph is the cofactor we get. 8 7 q 14 e
5. if FIND-SET(u) ≠ FIND-SET(v): 4
b c d
7 6
9
6. A = A ∪ {(u, v)} 8 10
Example: 2 h g f
7. UNION(u, v) return A q 11 p 4 14 e 1 2
7
1 2 y Set A is a set containing all the edges 8
6
10 ⇓
in the MST in the given graph, which is h
1
g
2
f
// given
// given graphgraph
G G initially empty. b
8
c
7
d
8 7 4 9
y The edge added first is always the least b c d
2
3 4 weighted edge in the entire graph.
4 9 11 4
q 14 e
2
y Kruskal’s algorithm comes under the q 11 4 14 e
7 6
1. greedy algorithm since it chooses the 7
8 10
6 h g f
1 2 3 4 8 10 1 2
1 0 0 1 1
least possible weight edge to add to the h g f
2 0 0 1 1 MST, at each step. 1 2 ⇓
// Adjacency matrix representation
3 1 1 0 1 of graph G y FIND_SET(u): Operation that returns ⇓
8 7
4 1 1 1 0 an element representing the set that is b
8
c
7
d
b c d
4 9
containing the vertex u. 4 9
2. y This operation helps in comparing whether 2 q 11
2
4 14 e
q 11 4 14 e
1 2 3 4 two vertices u, v belongs to the same tree. 7 6
7 8 10
1 2 0 1 1 (By equalising FIND_SET(u) and FIND_ 6
// Substituting degree of 8 10 h g f
2 0 2 1 1 the node for all the diagonal SET(v)). h
1
g
2
f 1 2

3 1 1 3 1 elements y UNION: Operation used to combine 2 trees. ⇓


4 1 1 1 3 y Line 1 and Line 3 initialise the set A to 8 7
empty and creates |V| individual trees for b c d
3. 4 9
each vertex v ∈ G.V. 2
1 2 3 4
y The for loop in the 5th line sorts edges in q 11 4 14 e
1 2 0 -1 -1 non-decreasing order. 7
// Substituting -1 for all 8
6
10
2 0 2 -1 -1 non diagonal 1's y An edge (u,v) is not added to forest A if h g f
1 2
3 -1 -1 3 -1 u and v belong to the same tree since it
4 -1 -1 -1 3 forms a cycle. ⇓
y Otherwise, the edge is added to the forest
4. Co-factor of (1,1) (done by the 7th line), and the two trees of b
8
c
7
d
= 2(9-1) – (-1) (-3-1) -1 (1+3) vertices u and v are merged.
4 9

= 16 – 4 – 4 2
Time complexity analysis: q 11 4 14 e
=8
y Initializing set A to empty in Line 1 takes 7 6
∴ Total 8 spanning trees possible for given 8 10
O(1) time. h g f
graph. 1 2

Greedy Techniques 47 48 Greedy Techniques


6. while Q is not empty y Initialise each vertex’s parent to NIL. y EXTRACT_MIN operation takes O(logV) time
Previous Years’ Question 7. u = Edge with minimum weight in Q y Initialise the minimum priority queue Q and is repeated for |V| times in the while
8. for each v ∈ G.Adj[u] with the set of all vertices in the graph. loop.Therefore the total time is O(VlogV).
9. if v ∈ Q and w(u,v) < v.key y 7th Line finds the vertex u ∈ Q such that y The for loop in lines 8-11 executes O(E)
Consider the following graph:
10.v.π = u it is one of the vertices at the end of the times altogether since the sum of the
11. v.key = w(u,v) light-weight edge that crosses the cut (V- lengths of all adjacency lists in 2|E|
y Prim’s algorithm always assumes that the Q, Q), except in the first iteration, in which y Within the for loop, line 9 takes constant
edges in set A form a single tree. u = r because of the 4th line. time.
y The tree starts with a vertex v and grows y Deleting u from the set Q adds it to the y The assignment in line 11 involves an
till it spans all the vertices in the given set V-Q vertices in the tree, resulting in implicit DECREASE-KEY operation on the
Graph. the addition of (u, u.π) to A. min-heap which is O(logv) time
Which one of the following is not the
y At each step, a new edge is added to Tree y The for loop of the 8th to 11th lines updates y Thus, the total time for Prim’s algorithms
sequence of edges added to the minimum
A, which is the lightest among all the the key and π attributes of each vertex v is with vertices ‘V’ and edges ‘E’ O(vlogv +
spanning tree using Kruskal’s algorithm?
edges that are connected to the vertices adjacent to vertex u but not the key and π Elogv) = O(Elogv)
[2009]
in Tree A. This edge results in connecting attributes in the tree.
(A) (b,e) (e,f) (a,c) (b,c) (f,g) (c,d)
(B) (b,e) (e,f) (a,c) (f,g) (b,c) (c,d) an isolated vertex that is not included in Running time of prim’s algorithm: Note:
(C) (b,e) (a,c) (e,f) (b,c) (f,g) (c,d) Tree A. y If we implement Q as a binary min-heap, The running time of prim’s algorithm by
(D) (b,e) (e,f) (b,c) (a,c) (f,g) (c,d) y Prim’s algorithm comes under the greedy BUILD-MIN-HEAP procedure to perform using the fibonacci heap is O(ElogV).
Solution: (D) algorithm since it selects a vertex that lines 1-5 in O(v) time.
adds the minimum weight possible to the
Example:
resulting minimum spanning tree.
y A connected graph G and a root r to be
grown as a minimum spanning tree are
Previous Years’ Question the inputs given to the pseudocode above.
y The minimum priority queue Q contains all
What is the weight ofa minimum spanning the vertices that are not in Tree A, based
tree of the following graph? on key attributes.
y For a vertex v, v.key is the value of the
minimum weighted edge among the edges
connecting v to any other vertex in the
tree. v.key=-∞, if no such edge exists.
y The attribute v.π names the parent of v in
the tree
y Algorithm maintains the set A

=A {(v, v ⋅ π) : v ∈ V − {r} − Q}
(A) 29 (B) 31
(C) 38 (D) 41 y When the algorithm terminates, the min-
Solution: (B) priority queue Q is empty
y The minimum spanning tree A for G is thus
Prim’s algorithm: =A {(v, v ⋅ π) : v ∈ V − {r}}
PRIM_MST(G, w, r)
y Lines 1-5 in pseudocode, make the key
1. for each u ∈ G.V
attribute of each vertex initialised to ∞
2. u. key = ∞
except the root. Since the root is the first
3. u.π = NIL
vertex to be processed, its key is initialised
4. r.key = 0
to 0.
5. Q = G.V

Greedy Techniques 49 50 Greedy Techniques


The problem: Dijkstra’s algorithm: vertex in V-S to add to set S, it falls
into a greedy strategy.
Previous Years’ Question Let G (V,E) be the given digraph with positive y Dijkstra’s algorithm is a single source
edge weights. There exists a source vertex shortest path algorithm. b) Min-priority queue Q of vertices, keyed
(distinguished from other vertices) S∈V from by their d values, is used in the above
Consider the undirected graph below: y It maintains a set of vertices S apart from
which the shortest path to every other vertex algorithm.
the set of all vertices V, that keeps track of
in the Graph needs to be found out. vertices to which the shortest path is found.
Analysis of Dijkstra’s algorithm:
y The algorithm selects a vertex u ∈ V-S
with the minimum path estimate, adds y The initialisation uses only O(n) time,
Note:
the vertex u to S and relaxes all the edges y Each vertex is processed exactly once so
Any subpath of a shortest path must also leaving from u. condition in line 4 and EXTRACT-MIN are
be a shortest path DIJKSTRA (G, w, s) called exactly once for every vertex, i.e. V
eg: times in total.
1. Initialise the source s
Using Prim’s algorithm to construct a y The inner loop for each V∈Adj[u] is called
minimum spanning tree starting with 2. S = φ once for each edge in the graph.
node A, which one of the following 3. Q = G.V y Each call of the inner loop does O(1) work
sequences of edges represents a possible 4. While Q ≠ φ plus, possibly, one decrease-key operation.
order in which the edges would be added y Recalling that all the priority queue
5. u = EXTRACT-MIN(Q)
to construct the minimum spanning tree? operations require O(log|Q|) = O(logV) time
 [2004] 6. S = S U {u}
Relaxation: y Thus, we have
(A) (E, G), (C, F), (F, G), (A, D), (A, B), (A, C) 7. For each vertex v ∈ G.adj[u]
= O(V+V+VlogV +ElogV)
(B) (A, D), (A, B), (A, C), (C, F), (G, E), (F, G) y For each vertex v ∈ V, maintain an attribute 8. RELAX (u, v, w)
(C) (A, B), (A, D), (D, F), (F, G), (G, E), (F, C) v.d which is an upper bound on the weight a) Because Dijkstra’s algorithms always = O((V+E) logV)
(D) (A, D), (A, B), (D, F), (F, C), (F, G), (G, E)
of the shortest path from source s to v, choose the “lightest” or “closest”
Solution: (D)
called as the shortest-path estimate.
y Initialise the estimates of the shortest
Single Source Shortest Path path and predecessors by the following
Example:

y Let G = (V, E) be a weighted digraph, with takes θ(v) time.


real-valued weight W assigned to every INITIALISE-SINGLE-SOURCE (G,S)
edge E. 1. For each vertex v ∈ G.V
y The weight w(p) of path P = <v0, v1, v…..,vk> 2. v.d = ∞
is the sum of the weights of its constituent 3. v.π = NIL
edges 4. s.d = 0
k
a)  After initialisation, we have v.π = NIL
w(P) = ∑ w(v
i= 1
i− 1 , v i )
for all v ∈ V, s.d = 0, and v.d = ∞ for
b) v ∈ V – {s}
c)  An edge (u,v) is said to be relaxed after
y The shortest-path weight δ(u,v) from u to v
testing given that particular edge can
p improve the shortest path found so
min{w(p):u v}, if there is a far. If yes, then v.d and v.π are updated.
path from u to v, d)  The following code performs a relaxation
(u,v)=
step on edge (u,v) in O(1) time
, otherwise RELAX (u, v, w)
1. if v.d > u.d + w (u,v)
y A shortest path from vertex u to vertex v 2. v.d = u.d + w(u,v)
is then defined as path P with weight w(P) 3. v.π = u
= δ(u,v).

Greedy Techniques 51 52 Greedy Techniques


y If there is no negative weighted cycle, then
the algorithm produces the shortest path Previous Years’ Question
Previous Years’ Question
weights.
y Bellman–Ford algorithm assumes that
Consider the directed graph shown in What is the time complexity of Bellman-Ford single-source shortest path algorithm on a
a vertex v can be reached from vertex u
figure below. There are multiple shortest complete graph of n vertices?
with at most |V|-1 edges which result, in
paths between vertices S and T. Which (A) 0(n2) (B) 0(n2logn) (C) 0(n3) (D) 0(n3logn)
no negative edge cycle.
one will be reported by Dijkstra’s shortest Solution: (C)
y It finds the shortest path involving 1 edge
path algorithm? Assume that, in any
at first, then two and so on |V| -1 edges.
iteration, the shortest path to a vertex
y If the cost found in involving |V| passes
v is updated only when a strictly shorter Solved Examples
and |V|-1 passes are the same, then there
path to vis discovered.
is no negative edge cycle present.
[2012] y The edges are relaxed repeatedly, 1. Consider the weights and values of items 2. Consider the weights and values of
decreasing the estimate v.d on the weight listed below. items listed below.
of the shortest path to all the vertices from Weight Value Weight Value
the source s until the actual shortest-path Item number Item number
(in kgs) (in Rupees) (in kgs) (in Rupees)
weight δ(s,v) is achieved.
1 1 2 1 4 8
BELLMAN-FORD (G, w, s) 2 4 28 2 8 4

1. INITIALISE_SINGLE_SOURCE(G, s) 3 5 25 
The task is to pick a subset of these
2. For i = 1 to (number of vertices) - 1 4 3 18 items by using greedy method such that
3. For every edge (u, v) ∈ G.E their total weight is no more than 4 kgs
5 3 9
(A) SDT (B) SBDT 4. RELAX (u, v, w) and their total value is maximized.
(C) SACDT (D) SAC ET 5. For every edge (u, v) ∈ G.E (Edges of graph) The task is to pick a subset of these The total value of items picked by the
Solution: (D)
6. If v.d > u.d + w (u, v) items by using greedy method such that greedy algorithm is denoted by Vgreedy•
7. Return FALSE their total weight is no more than 15 kgs The value of Vgreedy is ----
The Bellman-Ford algorithm: and their total values is maximized.
8. Return TRUE Solution: 8
y The Bellman-Ford algorithm is a single- The total value of items picked by the
y The Bellman-ford algorithm runs in time Here, the value / weight ratio of both the
source shortest path algorithm. It works greedy algorithms is denoted by Vgreedy”
O(VE), since the initialisation in line 1 takes item is 2. So, we can take any item 1st.
on digraphs with negatively weighted The value of Vgreedy is ------
θ(v) time, each of the |V|-1 passes over the Solution: 80 Let’s take item 1 first. Hence, the total
edges, also.
edges in lines 2-4 takes θ(E) time, and the capacity is going to be 4, and the profit is 8.
y The Bellman-Ford algorithm results in a Now, sort the object in non-increasing order
for loop of lines 5-7 takes O(E) time.
boolean value indicating whether a negative of value/weight ratio.
weight cycle is present or not in the graph. 3. The characters a to f have the set of
Example: Item Weight Value (in Value/ frequencies as shown below:
No. (in kg) rupees) weight a:40, b:30, c:20, d:5, e:3, f:2
A Huffman code is used to represent
2 4 28 7
the characters.
4 3 18 6 What is the weighted external path length?
3 5 25 5 Solution: 205
a b c d e f
5 3 9 3 Step 1: 40, 30, 20, 5, 3, 2
1 1 2 2
First we pick item 2 then item 4 then item
3 and at last item 5. Total capacity of this
4 object is 15 kg and total value is 80. So,
overall profit is 80.

Greedy Techniques 53 54 Greedy Techniques


a b c d 4. The characters a to h have the set of 5. If job J = (J1, J2 , J3 , J4) are given their
Step 2: 40, 30, 20, 5, 5 frequencies given below: processing time Ti = (1, 1, 1, 1) profit p; = (6,
a:2, b:2, c:3, d:4, e:6, f:9, g:14, h:22 8, 5, 10) and deadline are Di = (2, 1, 1, 2).
A Huffman code is used to represent the Maximum how many job can be done?
characters. (A) 1
What is the sequence of characters (B) 2
corresponding to the following code?
h (C) 3
110000010110110001001010011 16 24 22
a b c
(A) hdegfcab (D) All
Step 3: 40, 30, 20, 10
(B) abcdefgh Solution: (B)
(C) hdegcfab The gantt chart is shown below:
(D) hdgefcab
Solution: (A)
a b c d e f g h Profit = 8 + 10 = 18
2 2 3 4 6 9 14 22
6. If job J = (J1, J2 , J3 , J4 , J5, J6 ) are given
their processing time T; = (1, 1, 1, 1, 1, 1)
a b
Step 4: 40, 30, 30 profit p, = (200, 180, 190, 300, 120, 100)
and deadline are D, = (5, 3, 3, 2, 4, 2).
What is the maximum profit?
c d e f g h
4 3 4 6 9 14 22 Solution: 990

J1 J2 J3 J4 J5 J6
Deadline 5 3 3 2 4 2
Profit 200 180 190 300 120 100
The gantt chart is shown below:

d e f g h
7 4 6 9 14 22
Step 5:
Profit = 990
So, the Huffman code is
7. What is the number of spanning tree
d: 000 possible from the given graph?
e: 001
g: 01
h: 11
Weighted external path length = 1 × 40 + 30 f g h f: 101
× 2 + 4 × 5 + 20 × 3 + 2 × 5 + 3 × 5 7 10 9 14 22
c: 1000
= 40 + 60 + 20 + 60 + 10 + 15 a: 10010
= 205 b: 10011
Group the string into characters from right Solution: 125
to left:
Since it is a complete graph. So, the
11 000 001 01 101 1000 10010 10011 number of spanning trees for a complete
h d e g f c a b graph kn is n(n-2), (where n = no. of vertices)

g h
10 16 14 22

Greedy Techniques 55 56 Greedy Techniques


Here, n = 5 Step 2: Connecting (a, c) as shown below. Step 2: Connect (a, b)
So, the number of spanning trees = 5(5-2)
= 5(3)
= 125

Solution: 40 Step 3: Connect (c, d)


8. What is the number of spanning tree
possible from the given graph? The minimum spanning tree in tree is shown Step 3: Connect (c, d)
below:

Step 4: Connect (d, f)


Step 4: Connect (c, f)
Solution: 1
The given graph is itself a spanning tree. So, Minimum cost of spanning tree = 40.
only 1 spanning tree in possible.
11. Consider the undirected graph below:
9. Consider the undirected graph given
below. What is the minimum possible
weight of a spanning tree T ?
Step 5: Connect (d, e)
Step 5: Connect (d, e)

Using Prim’s algorithms to construct a


minimum spanning tree starting with
node A, which one of the following
Step 6: Connect (e, g)
Solution: 26 sequences of node A, which one of the
Similarly, we will get another sequence as
following sequences of edges represents
a possible order in which the edges would (a, b), (b, c), (c, d), (d, f), (d, e), (e, g).
be added to construct the minimum
spanning tree? (MSQ) 12. What is the time complexity of Prim’s
(A) (a, b), (a, c), (c, d), (c, f), (d, e), (e, g) algorithm with and without min heap
(B) (a, c), (a, b), (c, d), (c, f), (d, e), (e, g) respectively, (where E = edge and V =
(C) (a, b), (a, c), (c, d), (d, f), (d, e), (e, g) vertices)
In this graph, we can get another sequences (A) O(ElogV2) and O(V2)
(D) (a, b), (a, c), (b, c), (d, f), (d, e), (e, g) of edges as well to construct a minimum
This is a minimum cost spanning tree. (B) O(V2) and O(V3)
Solution: (A), (B) & (C) spanning tree, as shown below:
(C) O(ElogV) and O(V3)
Step 1: We can either connect (a, b) or (a, c). Step 1: First connect (a, c)
10. Consider a complete undirected graph (D) O(VlogV + E) and O(V2)
We are first connecting ab.
with vertex set {1, 2, 3, 4}. Entry W;i in the Solution: (A)
matrix W below is the weight of the edge
The time complexity of Prim’s algorithm with
{i,j}. What is the minimum possible weight
and without min-heap is O(ElogV) and O(V2).
of a spanning tree T?

Greedy Techniques 57 58 Greedy Techniques


13. The number of distinct minimum 14. A complete, undirected, weighted graph This type of graph is called a star graph. Here, Solution:
spanning trees for the weighted graph ‘G’ is given on the vertex {O, 1, .... , n-1} for we will always get a star graph as a minimum
below is --- any fixed ‘n’. Draw the minimum spanning spanning tree.
a b c d e f g h
tree of G if a  ∞ ∞ ∞ ∞ ∞ ∞ ∞
(A) The weight of the edge (u, v) is lu - vi ∞ ∞
15. Find the single source shortest path e 2 -3 ∞ ∞ ∞
(B) The weight of the edge (u, v) is (u + v) node a to all other vertices. The graph is
f 2 ∞ ∞ -1 ∞ ∞
Solution: (A) given below:
Let’s take n = 4. So, the complete, undirected g 2 ∞ ∞ 2 ∞
and weighted graph is shown below.
b 2 ∞ ∞ 4
Solution: 2 -3
h 5 ∞
By using Kruskal’s algorithm we can add first
(A, B) as show below: c 5
Step 1: d
The order in which single source shorted
path computed is

Step 2: Now add either (B, C) or (A, E). We are The minimum spanning tree is:
adding (A, E).

So, this is a line-graph. In this case, we will


always get line graph.
Step 3: Now, add (B, C) (b) Similarly, for n = 4

Step 4: Now, we can add either (B, D) or (D, The minimum spanning tree for this graph is
E). So, we will get two minimum spanning
trees as shown below:

Greedy Techniques 59 60 Greedy Techniques


Chapter Summary
4 Graph-Based Algorithms
Huffman Coding:
y H
 uffman's greedy algorithm builds an optimum manner of expressing each character Breadth-First Search Approach:
as a binary string using a table that shows how often each character appears i.e., its y The algorithm assumes that the graph
y BFS is a simple algorithm used to traverse G(V,E) is represented in an adjacency list,
frequency. a graph, and it is a model used in many and the color of each vertex is stored in
Applications of Huffman Coding: important graph algorithms. the u colour and the predecessors of u in
y They're used by compression formats like gzip, and other. y BFS can compute the distance(smallest u.π.(If there are no predecessors present
y It is useful when there's a string of characters that appear frequently. number of edges) from s to other reachable for a vertex, S for example, then S.π =NIL).
Fractional knapsack problem: vertices. It is the idea used behind Prim’s y A queue Q is used to store the grey vertices.
minimum spanning tree and Dijkstra’s y For a vertex u ∈ V, u.d stores the distance
y Given n items worth value V; each and weight W; select items whose weight sums to single-source shortest path algorithms. from the source to the vertex u.
a given value is w. The items can be taken in fractions. y BFS for a Graph G(V, E) explores the BFS (G, s)
Job sequencing problem: edges of G to “discover” each edge that is //initializing all vertices to white color,
reachable from S, where V= set of vertices, infinite distance and connected to null
y Given a list of jobs, each with its own deadline and associated profit if completed 1. For every vertex u ∈ G.V – {s}
E=Set of edges, S belongs to V, and S is
before the deadline. 2. u.color = WHITE
the source vertex, and is distinguishable
Minimum spanning trees: from other vertices. 3. u.d = ∞
4. u.π = NULL
y MST is a subset of the set of Edges E in a connected and undirected graph G(V, E) such
5. s.color = GREY
that it forms an acyclic graph.
6. s.d = 0
y Difference between Prim’s Algorithm and Kruskal’s algorithm. 7. s.π = NULL
Prim’s algorithm Kruskal’s algorithm 8. Q = ∅
9. ENQUEUE(Q,s):
y It begins to construct the minimum y The minimum spanning tree is built from
10. While Q is empty
spanning tree from any vertex in the graph. the vertex in the graph with the least edge 11. u = DEQUEUE(Q)
y The time complexity of Prim’s connected to it. y BFS uses white, grey, and black colours for 12. For each v ∈ G.Adj[u]
algorithm is O(V2) where V is the number y The time complexity of Kruskal’s algorithm the vertices to keep track the progress. 13. If (v.color == WHITE)
of vertices, and it can be improved to is O(ElogV), where V is the number of White – not discovered and not explored 14. v.color = GREY
O(ElogV) using Fibonacci heaps. vertices Grey – discovered but not explored 15. v.d = u.d + 1
y Prim’s algorithm returns a connected y Kruskal’s algorithm can both generate Black – discovered and explored 16. v.π = u
component and only works with forests and work on disconnected Discovered – traversed the vertex 17. ENQUEUE(Q, v)
connected graphs. components at any time Explored – the vertices that are reachable 18. u.color = BLACK
y In dense graph, Prim’s algorithm y In sparse graphs, Kruskal’s algorithm from a vertex V are discovered. y Line 1-4 except the source vertex all the
performs better performs better. y The vertices that are adjacent to a black other vertex’s colours are initialised to
vertex will either be in black or grey. white, u.d vector to infinity, and u.π to null.
Single-source shortest path: y Grey vertices may have adjacent white y Line 5 colours the source vertex to grey,
line 6 initialises s.d to zero, and in line 7
y It is a problem of finding shortest path from a source to all the vertices in a graph. vertices, and at a point, it acts as a
s.π is initialised to null.
y Difference between Dijkstra’s algorithm and Bellman Ford’s algorithm. border between white and black-coloured
y In line 8, the queue Q is initially empty.
vertices
y Line 9 adds source vertex to the queue.
y BFS starts with source vertex S and then
Prim’s algorithm Kruskal’s algorithm y Line 11 takes the first element in the queue
includes edges that connect S to its
y When there is a negative weight edge, y When there is a negative weight edge, into u, and for every vertex v adjacent to u
adjacent white vertices.
Dijkstra’s Algorithms does not work Bellman Ford’s algorithm detects the (by means of for loop) it marks it into grey
y For an edge (u,v) that is added to the BFS, (line 14) if it is white (line 13) and intialises
properly all the time. negative weight cycle. u is the predecessor or parent to v in the all the vertices v.d vector to u.d+1 (line 15),
y The time complexity is O(ElogV). y ts time complexity is O(VE). Breadth-first tree. and predecessor of v, i.e., v.π as u(line 16).
y It follows greedy approach, performs better y It follows dynamic programming approach.

Greedy PB
Techniques <Ch_Name>
61 Graph-Based Algorithms 63
y Line 17 adds each vertex discovered into algorithm, and vertices are enqueued and y We can initialise the visited array of all y For initialisation of visited [ ] array, it takes
queue and line 18 marks the explored dequeued only once, which is tested in the vertex to be zero (i.e., visited [i to n] = O(V) time.
vertex into black, i.e., u.colour = black. line 13. 0) and then run the breadth-first search y Therefore, the total time complexity
y This procedure (from line 10 to line 18) is y The enqueuing process and dequeuing starting from one of the vertexes, and after = O(V + 2E)
done until the queue gets empty. processes of a vertex take O(1) time which the search finishes, we should examine = O(V + E)
y The order of vertices resulting from BFS in turn takes O(V) time. the visited array. y The time complexity is the same for
depends on the order of adjacent vertices y The algorithm scans the adjacency list of y In case if visited [ ] array indicates that both the directed graph as well as for
visited in line 12: the BFS may be different, each vertex which takes O(E) time. some of the vertices are not visited, then undirected graph.
but the distances computed by the y Since initialisation takes O(V) time, the the reason is definitely that the graph is y Space complexity is also going to be the
algorithm do not. total time taken is O(V+E). not connected. same as Breadth-First Search.
Analysis y Hence, BFS runs in time equal to the size y Therefore, by using breadth-first search, Conclusion
y All the vertices in the graph are white at of the adjacency-list representation of G, we can say that whether the graph is y The time and space complexity of breadth-
first except the source S, and the vertices connected or not. first traversal is the same as breadth- first
i.e., O(V+E).
are not changed into white further in the Breadth-First Traversal search.
BFT can be executed using BFS. y For a given graph,BFT calls BFS on every
Example: node.When BFS is called on a single node
Algorithm:
BFT (G, n) /*G is the graph, and ‘n’ is the ,that means we are working on smaller
number of vertices */ part of the graph and then continue with
{ the remaining part.
for i = 1 to n do y So, it is as good as running the Breadth-First
  visited [i] = 0; /* visited[] is a search on the entire graph exactly once.
global array of vertices. ‘0’ value Applications of Breadth First Traversal
indicate it is not visited and ‘1’ y Shortest path and minimum spanning tree
indicate it is visited.*/ for weighted graph
for i = 1 to n do y Path finding
if(visited [i] == 0) then y To test if a graph is bipartite
  BFS(i); y Finding all nodes within one connected
} component
y Cycle detection in an undirected graph
y For the time complexity of breadth-first y Social networking websites
traversal (BFT), we have to do an aggregate y Crawlers in search engines
analysis.
y Aggregate analysis considers overall work
done. Previous Years’ Question
y In case if we are going to use adjacency
list representation, then every node is The Breadth-First Search (BFS) algorithm
going to have an adjacency list. has been implemented using the
y Whenever we call BFS, then some part queue data structure. Which one of the
of the adjacency list will be completely following is a possible order of visiting
visited, exactly one. the nodes in the graph below?
y Next time, when we call BFS on the
M N O
remaining nodes, then the remaining
nodes which are present on this list will
R Q P
Traversal vs. Search y Traversal is a search, which involves all be also visited exactly one.
y Traversal of a graph or tree involves the nodes in the graph. y Therefore, overall, on average, all the
(A) MNOPQR (B) NQMPOR
examining every node in the graph or tree. y By using Breadth-First Search (BFS), we nodes will be visited exactly one.
(C) QMNPRO (D) QMNPOR
y Search may or may not involve all the nodes can even find out whether a graph is y In case of the undirected graph, the
Solution: (D) [2017 (Set-2)]
in visiting the graph in systematic manner. connected or not. adjacency list contains 2E nodes.

64 64 Graph-Based
Graph-Based
Algorithms Algorithms Graph-Based
Graph-Based
Algorithms Algorithms 65 65
Depth First Search (DFS) y The code below is DFS with Graph y After all the vertices adjacent to u are BFS = BFT = DFS = DFT = O(V2)
y Depth-first search (DFS) is an algorithm G(directed or undirected) as input. Time is explored,8th line to 10th line in algorithm y Space complexity of
for searching the vertices of a graph a global variable. colours the vertex to black,increments BFS = BFT = DFS = DFT = O(V).
(traversing a graph to be specific), that DFS (G): time, and u.f is noted. Applications of Depth First Search
works by exploring as far into the graph as 1. For each vertex u Î G.V Note: y Detecting cycle in a graph:
possible before backtracking. 2. u.color ← WHITE If there is a back edge in DFS, it has a
The order of vertices returned by
y Input: 3. u.π ← NULL cycle. Hence DFS can detect a cycle in a
DFS depends on the order of vertices
Graph G(V,E), where V is the set of vertices 4. Time ← 0 graph.
discovered in line 5 of DFS algorithm,
and E is the set of edges. 5. For each vertex u Î G.V y Path finding:
and line 4 of DFS-VISIT algorithm.
y To keep track of progress, a depth-first 6. If u.color is WHITE The DFS algorithm can be tweaked to
search colours each vertex 7. DFS-VISIT(G,u) y Apart from the time to execute calls of find a path between two given vertices,
⚪ White: Vertices are white before they end DFS_VISIT,the loops in lines 1-3, and 5-7 in u and z.
are discovered DFS-VISIT(G,u) DFS gives Θ (V ) time complexity. y Used as logic behind topological sorting
1. Time ← time + 1 // white vertex u has y The algorithm DFS_VISIT discovers every
⚪ Gray: Vertices are grey when they are y To test if a graph is bipartite
just been discovered vertex exactly once, and the vertex on
discovered, but their outgoing edges are y Finding strongly connected components
2. u.d ← time
still in the process of being explored. which DFS_VISIT is called should be a of a graph
3. u.color ← GREY
⚪ Black: Vertices are black when the white vertex, and the DFS_VISIT will colour Triee edge, Back edge and Cross edges in
4. For each v Î G.Adj[u] // explore edge
entire subtree starting at the vertex (u,v) it to grey at the very first step. DFS of graph
has been explored 5. if v.color is WHITE y The loop lines 4-7 execute |Adj[v]| times in
y The moment DFS discovers a vertex v in 6 v.π ← u the execution of DFS-VISIT algorithm.
an adjacency list of already visited u,the 7. DFS-VISIT(G,v) ∑ | Adj[v] |= Θ(E) , the total time for
algorithm marks the predecessor of 8. u.color ← BLACK // blacken u; it is v∈V

attribute v. π=u. finished executing lines 4–7 of DFS-VISIT is Θ (E).


y Depending upon the source vertex, there 9. Time ← time + 1 The total time taken by DFS is therefore Θ
will be many subtrees possible for a DFS. 10. u.f ← time
(V +E).
y The predecessor subgraph of a DFS is end
Depth-First Traversal
defined as Analyzing DFS(G): y Depth-first traversal is also exactly the
y All the vertices are coloured white and
Gπ = (V,Eπ ), where same as Breadth-first traversal.
their π attributes are initialised to null in
E=
π {(V.π, V) : v ∈ V and v.π ≠ NULL} lines 1-3 of DFS (G).
y Here, instead of calling BFS inside that
y The time variable is reset in line 4. traversal function, we will call DFS.
y The predecessor subgraph of DFS forms
y From line 5 to line 7, For any vertex u Î V is y The time and space complexity of depth-
a forest with several depth-first trees.The
applied DFS-VISIT(G,u) if it is white. first Traversal and depth-first Search is y Tree Edge [Red edges]
edges in E.π are tree edges.
y For each time DFS_VISIT(G,u) is called on same. Formed after applying DFS on a graph.
y DFS timestamps each vertex apart
a vertex u, then u becomes the new root. DFT (G, n)/* G is the graph & n is the
from creating a depth-first tree.The
y This DFS-VISIT(G,u) returns the vertex number of vertices */ y Forward Edge [vertex A to vertex H]
two timestamps given to a vertex:v.d
with u.d and u.f initialised. { Edge (u,v), in which v is a descendent of u.
is used to record when the vertex is
Analyzing DFS-VISIT(G,u): for i = 1 to n do
discovered(changes v to grey colour) and
y The global variable time is incremented visited[i] = 0; // visited [] is an array y Back Edge [vertex F to vertex B]
v.f is used to assign appropriate finishing
in line, 1, and the new value of discovery for i = 1 to n do Edges (u,v) in which v is the ancestor of u.
time after examining v’s adjacency list
time u.d is updated in line 2. The vertex is if(visited[i] == 0) then
(changes v to black). y Cross Edge [vertex E to vertex D]
coloured grey in line 3. DFS(i);
y Since the adjacency matrix has at most |V|2 Edges(u,v) in which v is neither ancestor
y From the 4th to 6th line, the vertices that }
entries,the timestamps range between 1 nor descendent to u.
are adjacent to input vertex u are checked.
and |V|2. The timestamp of discovering the Conclusion Undirected graph
If they are white, their predecessor, i.e.,
vertex and finishing the vertex are v.d and y Time complexity in case of adjacency list ⚪ In an undirected graph, forward, and
v.p, is initialised to u.
v.f such that v.d<v.f, for every vertex v Î V. y Since every vertex adjacent to u is BFS = BFT = DFS = DFT = O(V + E) backward edges are the same.
y Vertex u is WHITE before u.d, GREY considered in line 4, DFS explores the y Time complexity in case of adjacency ⚪ Cross edges are not possible in an
between u.d and u.f, and BLACK after u.f. edge (u,v) for v Î Adj[u]. matrix undirected graph.

66 66 Graph-Based
Graph-Based
Algorithms Algorithms Graph-Based
Graph-Based
Algorithms Algorithms 67 67
Difference between DFS and BFS
Previous Years’ Question
Depth-First Search Breadth-First Search
Consider the DAG with consider V={1, 2, 3, 4, 5, 6}, shown below. Which of the following is
1. Backtracking is possible from a dead end. 1. Backtracking is not possible. NOT a topological ordering? [2007]
2. Stack is used to traverse the vertices in 2. Queue is used to traverse the vertices
LIFO order. in FIFO order. 2 5

3. Search is done in one particular 3. The vertices at the same level are
1 4
direction. maintained in parallel.
3 6
Topological sort: 1. Therefore, the topological sort takes
O(V + E) time in total. (A) 1 2 3 4 5 6 (B) 1 3 2 4 5 6
y DFS is the logic behind topological sort in
DAG-SHORTEST-PATHS(G, w, s) (C) 1 3 2 4 6 5 (D) 3 2 4 1 6 5
a directed acyclic graph (DAG).
a) sort the vertices of G topologically Solution: (D)
y A topological sort of a DAG G=(V, E) is a
b) start with source
linear ordering of all its vertices, such that
c) for each vertex u, taken in topologically
if G contains an edge (u, v), then u appears
sorted order
before v in the ordering. d) for each vertex v ∈ G.Adj[u] Solved Examples
y If a cycle exists in a graph, there will be no e) RELAX(u, v, w)
linear ordering possible. 1. Consider the following graph (G) (C) Directed acyclic graph
Analysis
y The topological sort of line 1 takes Θ (V +E) C D F (D) None of the above
Topological Sorting(G)
time. Solution: (C)
Step 1: C
all DFS on the graph, so that it
calculates the finishing time of each y The call of INITIALISE-SINGLE-SOURCE in Topological sort can be applied to directed
vertex. line 2 takes Θ (V) time. acyclic graphs.
G
Step 2: Based on the finishing time, insert y The for loop of lines 3–5 makes one
them into a linked list and return the iteration per vertex.
3. Consider the directed acyclic graph with
list of vertices. y Altogether, the for loop of lines 4–5 relaxes B A E
V={V1, V2, V3, V5, V6, V7} shown below.
each edge exactly once.
Analysis Number of cut vertex or articulation
y Because each iteration of the inner for V2 V5
points is _______.
y DFS takes Θ (V + E) time, and O(V) to add loop takes Θ (1) time.
Solution: 2
all vertices one by one into the linked list. y The total time is Θ (V +E).
In an undirected graph, a cut vertex (or
Examples of a directed acyclic graph (DAG): articulation point) is a vertex and if we V1
6 1 6 1 remove it then the graph splits into two
r
5
s
2
t
7
x
-1
y
-2
z r
5
s
2
t
7
x
-1
y
-2
z disconnected components.
 0      0 2 6  
Removal of “D” vertex divides the graph into V3 V7
3 4 2 3 4 2

(a) (b) two connected components ({E, F} and {A, B,


C, G}).
6 1 6 1
r s t x y z r s t x y z Similarly, removal of “C” vertex divides the V6
5 2 7 -1 -2 5 2 7 -1 -2
 0 2 6 6 4  0 2 6 5 7
graph into ({G} and {A, B, C, F}).
3 4 2 3 4 2 Which of the following is not a topological
(c ) (d)
For this graph D and C are the cut vertices. ordering?
(A) V2V5V1V7V3V6
6 1
r s t x y z
2. Topological sort can be applied to (B) V5V2V1V7V3V6
-1 -2

5
0
2
2
7
6 5 3 (A) Undirected graph (C) V1V5V2V7V3V6
3 4 2 (B) All types of graphs (D) None of the above
(e)

68 68 Graph-Based
Graph-Based
Algorithms Algorithms Graph-Based
Graph-Based
Algorithms Algorithms 69 69
Solution: (C) Which all of the above is/are not possible
output(s)? Chapter Summary
Here, every edge has a dependency, like this
(A) 1,3, and 4 only
V2 (B) 1 and 3 only y BFS – Simple algorithm for traversing a graph (Breadth-wise).
(C) 1, 2, 3, and 4 only y Traversal Vs. Search – Traversal goes through each vertex, but the search may or may
not.
(D) None of the above
y DFS – Algorithm used for traversing a graph (depth-wise).
V1 Solution: (D) Difference between DFS and BFS
All the sequences of nodes are the possible
this edge means that V2 comes before V1 Depth First Search Breadth-First Search
output of Depts First Search (DFS).
in topological ordering. Initially, V2 and V5
doesn’t have any dependency. So any one of 1. Backtracking is possible 1. Backtracking is not possible.
5. The Breadth-First Search algorithm has from a dead end. 2. The vertices to be explored are
them can be done independently.
been implemented using the queue data 2. Vertices from which organised as a FIFO queue.
So, either start with V2 or V5. structure. The possible order of visiting exploration is incomplete 3. The vertices at the same level
So, the topological ordering is given below: the nodes of the following graph is (MSQ) are processed in a LIFO are maintained in parallel.
order.
V2V5V1V7V3V6 a 3. Search is done in one
Or particular direction.
V5V2V1V7V3V6
y Tree edge, Back edge, and Cross edge in DFS of a Graph:
Another topological ordering is also possible, b c Tree edge: Formed after applying DFS on a graph.Red edges are tree edges.
but V1V5V2V7V3V6 is not correct topological Forward edge: Edge (u,v) in which v is a descendent of u.
ordering because it starts with V1 before V2 eg: 9.
or V5. Back edge: Edges (u,v) in which v is the ancestor of u.
Hence, (C) the correct option. eg : 10.
d e f g
4. Consider the following sequence of nodes
for the undirected graph given below.
1. a b d h e f c g 2. a c g h f e b d
3. a b d h f c g e 4. a b d h g c f e
h
If a Depth-First Search (DFS) is started
(A) abcdefgh
at node a using stack data structure. The
(B) acbfgdeh
nodes are listed in the order they are
first visited. (C) abcedgfh
(D) abchdefg
a
Solution: (A), (B), and (C)
The sequence of nodes given in options (a), Cross edge : Edges(u,v) in which v is neither ancestor nor descendent to u.
b c (b), and (c) are the correct possible order eg : 5
of visiting the nodes by using breadth-first y Topological Sort : “A topological sort of a DEG G=(V, E) is a linear ordering of all its
search, because breadth-first search visits vertices, such that if G contains an edge (u, v), then u appears before v in the ordering.”
d e f g the “breadth” first, i.e., if it is visiting a node,
then after visiting that node, it will visit
the neighbour nodes (children) first before
h moving on to the next level neighbours.

70 70 Graph-Based
Graph-Based
Algorithms Algorithms Graph-Based
Graph-Based
Algorithms Algorithms 71 71
5 Dynamic Programming y In the above case, fib(2) was evaluated
three-time (over-lapping of subproblems)
int fib (int n)
{
y If the n is large, then other values of fib
(subproblems) are appraised which leads if(n == 0) return 0;
y By using dynamic Programming, we ο Build a suitable solution from the to an exponential-time algorithm. if(n == 1) return 1;
can avoid recalculating the same result calculated result. y It is better, instead of solving the same if(fib[n]! = 0) /*check if already
multiple times. y Steps 1 -3 form the foundation of a DP problem, again and again, to store the calculated*/
y Using the divide and conquer technique; technique. result once and reuse it; it is a way to
we divide bigger problems into simpler y If we need only the value of a solution, return fib[n];
reduce the complexity.
subproblems as the algorithm requires. then skip step 4. y Memoization works like this: Begin with return fib[n] ¬ fib (n-1) + fib (n-2);
y Dynamic programming works on both y During step 4, we store additional a recursive function and add a table that }
top-down and bottom-up approach/ information throughout step 3 so that we maps the parameters of the function to
technique. can easily calculate a suitable solution. y In both methods, Fibonacci Series
the results calculated by the function.
y We usually start with dividing the large complexity is reduced to O(n).
y When the function is called with identical
task into smaller subtasks, storing the
Understanding Dynamic Programming y Because we use already computed values
parameter values that have already been
results of those subtasks to be used by from the table.
Before jumping to problems, let us try to called once, the stored value is returned
another subtask at the later stages of the TC: O(n).
understand how dynamic programming from the table.
problem solving, until we get the desired SC: O(n).
works through examples. Improving:
solution. Note:
Fibonacci series: y Now, we see how DP bring down this
y Using this technique complexity of various Both the approaches can be derived for the
algorithms can be reduced. A series of numbers in which the present problem complexity from exponential to dynamic programming problems.
y The greedy method only generates one number is the sum of the last two numbers. polynomial.
decision sequence at a time, and dynamic y We have two methods for this: Matrix Chain Multiplication
The Fibonacci series is defined as follows:
programming takes care of all the y 1. bottom-up approach: Start with the
possibilities. Fib(n) = 0, if n = 0 lower value of input and slowly calculate Problem:
y The problems with overlapping subproblems 1, if n = 1 a higher value. Let’s say a series of matrices are given, A1
are generally solved by the dynamic int fib[n]; × A2 × A3 × ... × An, with their value, what is
Fib(n-1) + Fib(n-2), if n > 1
programming technique; problems that do int fib(int n) the right way to parenthesize them so that
not have a similar type of pattern are difficult Recurrence relation and time complexity: it provides the minimum the number of total
{
to solve using the dynamic programming y The recurrence call T(n) is divided into multiplication. Assume that we are using
technique. if (n < = 1) standard matrix multiplication and not
T(n-1) and T(n-2) till the base conditions
y Dynamic programming follows four steps: T(1) and T(0). return n; Strassen’s matrix multiplication algorithm.
ο Identify the pattern of a suitable solution. y Therefore, the depth of the tree will be O(n). fib[0] ¬ 0 Input:
ο Recursively calculate the solution. y The number of leaves in a full binary tree fib[1] ¬ 1 Chain of matrices A1 × A2 × A3 × ... × An, where
ο Use the bottom-up technique to of depth n gives 2n. Since each recursion
calculate the value of the solution. for ¬ 2 to n Ai has dimensions or size Pi-1 × Pi. The value
takes O(1) time, and it takes O(2n).
is specified in an array P.
fib(5) fib[i] ¬ fib[i-1] + fib[i-2];
Goal:
return fib[n];
} Finding a way of multiplying matrices(in
fib(4) fib(3)
parenthesized form) such that it gives the
y 2. Top-down approach: We just save the optimal number of multiplications.
fib(3) fib(2) fib(2) fib(1) values of recursive calls and use them in Solution:
future.
y The implementation y We know the fact of mathematics that
fib(2) fib(1) fib(1) fib(0) fib(1) fib(0) matrix multiplication is associative.
int fib[n];
y Parenthesizing does not have any effect
for (i ¬ 0 to n)
fib(1) fib(0) on the result.
fib[i] ¬ 0; /* initialisation */

Dynamic Programming 73 74 Dynamic Programming


y For example – 4 matrices A, B, C, and D, 0 , if i = j Top-Down Dynamic Programming Matrix dynamic programming is that the number

the number of ways could be: M[i, j] =  Chain Multiplication of function, calls in top-down dynamic
(=
A(BC) ) D (=
(AB)C ) D (AB)(CD)
{ }
Min M[i,k] + M[k + 1, j] + pi−1pkp j if i < j
i≤k < j programming and the number of shells in
y Top-down method is also called bottom-up are almost the same.
(B(CD) ) A ((BC)(D) )
= A= y By using the above recursive equation, we
memoization or memoized.
find point k that helps to minimise the y In the top-down method, we are not going
y Number of ways we can parenthesis the y Both the top-down method and bottom- to compute any function, twice. Whenever
number of Scalar multiplications.
(2n) ! up method are going to use the same we compute a function, for the 1st time, we
matrix = , y After calculating all the possible values unique problem.
(n + 1) !n! of k, the value which gives the minimum save it in the table and next time, when we
y The basic similarity between top-down want that function we take it from the table.
Where n = number of matrix - 1 number of scalar multiplications is
dynamic programming and bottom-up
y Multiplying A(p×q) matrix with B(q×r) requires selected.
p*q*r multiplications, i.e. the number of y For reconstructing the optimal MEMOIZED_MATRIX_CHAIN(P)
scalar multiplication. parenthesization, we use a table (say, S[i, j]). {
y Different ways to parenthesize matrix y By using the bottom-up technique, we can
produce a different number of scalar 1. n = p . length – 1 /* p  is a sequence of all dimensions, and n is equal to the number
evaluate the M[ i, j ] and S[ i, j].
multiplication. of matrixces */
/* p is the size of the matrices, Matrix i
2. Let m[1….n, 1….n] be a new table
y Choosing the best parenthesizations using has the dimension p[i-1] × p[i].
/* m[i, j] represents the least number of scalar
brute force metthod gives O(2n). M[i,j] is the least cost of multiplying
multiplication needed to multiply Ai….Aj */
y This time, complexity can be improved matrices i through j.
3. for i = 1 to n
using Dynamic Programming. S[i ,j] saves the multiplication point, and
4. for j = 1 to n
y Let M[i, j] represent the minimum number we use this for backtracking, and length is
5. m[i, j] = ∞
of scalar multiplications required to the size of p array.*/
multiply Ai…Aj. Void MatrixChainOrder (int p[], int length) 6. return LOOKUP_CHAIN (m, p, 1, n)
}
{ LOOKUP_CHAIN (m, p, i, j)
int n = length – 1, M[n] [n], S[n] [n]; {
for i ¬ 1 to n 7. if m[i, j] < ∞ /* these two line check whether it is visited for
M[i] [i] ¬ 0; // fills in matrix by diagonals the 1st time or not */
for l ¬ 2 to n // l is chain length 8. return m[i, j]
{ 9. if i == j
for i ¬ 1 to n – l + 1 { 10. m[i, j] = 0
int j ¬ i + l – 1; 11. else for k = i to j – 1
M[i] [j] ¬ MAX VALUE; 12. q = LOOKUP_CHAIN(m, p, i, k) + LOOKUP_CHAIN (m, p, k+1, j) + pi-1 pk pj
/* Try all possible division points i…k and k…j */ 13. if q < m[i, j]
for k ¬ i to j – 1 14. m[i, j] = q
{ 15. return m[i, j]
 int this.cost ¬ M[i] [k] + M[k+1] [j] + p[i-1] * p[k] * p[j]; }
if(this.cost <M[i] [j]) Analysis:
{ distinct sub-problem, the for-loop in the
y In lines 3 to 5, initially, it is placing infinity
M[i] [j] ¬ this.cost; worst-case run for ‘n’ times.
in the entire table.
S[i] [j] ¬ k; y Therefore, the time complexity is O(n3)
y It is done so to know whether it is the 1st
} same as the bottom-up method.
time program has visited some entry, or it
} y Actual space complexity is more because
has already computed.
} of recursion, but the order of space
y If the entry is ∞ , then the it means that the
} complexity is not going to change.
program has visited that shell for the 1st time. y The depth of the tree is going to be O(n).
}
Time complexity: When the depth of the recursion tree
Time complexity = O(n3) y Number of distinct sub-problem is O(n2), is O(n), then the stack size required is
Space complexity = O(n2). O(n) because, at any time, the maximum
and whenever the program call any

Dynamic Programming 75 76 Dynamic Programming


number of elements that will be present in Recursive solution: y If xm = yn, then we solve 1 subproblem. y b[i, j] points to the table entry
the stack will be equal to the depth of the Evaluating longest common subsequence corresponding to the optimal subproblem
y Before DP Solution, we form a recursive
recursion tree and O(n2) for table m[n, n]. on Xm-1 and Yn-1. computed and selected when evaluating
solution for this, and later examine the
Therefore, space complexity = O(n2) + O(n) y If Xm ! = Yn, then we must solve two c[i, j].
LCS problem.
= O(n2) subproblems. finding an LCS of Xm-1 and Y y The procedure returns the table b and c,
y Given two strings, “ABCBDAB” and
y Space complexity is also same as bottom- and finding the LCS X and Yn-1. c[m, n] contains the length of an LCS of X
“BDCABA”, draw the lines from the letters
up method. y Whichever of these two LCSs is longer is and Y.
in the first string to the corresponding
the LCS of X and Y; these cases exhaust LCS_LENGTH(X, Y)
letter in the second, no two lines should
cross each other. all possibilities. 1. m ← X ⋅ length
Previous Years’ Question y One of the optimal subproblem solutions
A B C B D A B 2. n ← Y ⋅ length
seems like inside a longest common
Four matrices M1, M2, M3 and M4 of BDCAB A subsequence of X and Y. 3. Let b[1…m, 1…n] and c[0…m, 0…n] be new
dimensions p x q, q x r, r x s and s x t y Given X and Y, characters may or may not y As in the matrix chain multiplication tables
respectively can be multiplied in several match. problem, a recursive solution to the LCS
4. for i ¬ 1 to m
ways with different number of total y We can see that the first character of both problems requires to initiate a recurrence
scalar multiplications. For example, strings are not same. for the value of an optimal solution. 5. c[i,0] ¬ 0
when multiplied as ((M1 x M2) x (M3 x M4)) y Next, go to the second character in one of y c[i, j] = length of the LCS of the sequences 6. for j ¬ 0 to n
the total number of multiplications is the strings and check. <x1, x2, ...xi> and <y1, y2, ...yj>
pqr + rst + prt. y Lets go to the second character in the y If (i = 0 or j = 0) then one of the sequence 7. c[0,j] ¬ 0
When multiplied as (((M1 x M2) x M3) x M4), first string, and the remaining subproblem has length 0, and LCS has length 0. 8. for i ¬ 1 to m
the total number of scalar multiplications is followed this way, solving it recursively. y The recursive formula comes from the 9. for j ¬ 1 to n
is pqr + prs + pst. optimal substructure of the LCS problem.
If p = 10, q = 100, r = 20, s = 5 and t = 80, Optimal Substructure of An Lcs 10. if xi == yj
then the minimum number 0 =if i 0= or j 0,
y Let X <x1, x2, …, xm> and Y = <Y1, Y2, …, Yn>  11. c[i,j] ¬ c[i-1, j-1] + 1
of scalar multiplications needed is: c[i, j]= c[i − 1, j − 1] + 1 if i, j > 0 and xi = y j ,
be sequences, and let Z = <Z1, Z2, …., Zk>  12. b[i,j] ¬ “  ”
(A) 248000 (B) 44000 max(c[i, j − 1], c[i − 1, j]) if i, j > 0 and xi ≠ y j
be any LCS of X and Y.
(C) 19000 (D) 25000 13. elseif c[i-1, j] ≥ c[i, j-1]
y If xm = yn, then zk = xm = yn and zk-1 is an LCS
Solution: (C) [GATE 2011] When Xi = Yj, we consider the subproblem
of Xm-1 and Yn-1. 14. c[i,j] ¬ c[i-1, j]
of evaluating an LCS of Xi-1 and Yj-1 else we
y If xm ≠ yn , then zk ≠ xm implies that Z is an evaluate two subproblems, LCS of Xi and 15. b[i,j] ¬ “ ↑ ”
Longest common subsequence:
LCS of Xm−1 and Y. Y(j-1), LCS of X(i-1) and Yj
Given two strings X and Y of length m and 16. else c[i,j] ¬ c[i, j-1]
In the LCS problem; we have only O(mn)
n, respectively. Finding the longest common y If xm ≠ yn , then zk ≠ yn implies that Z is an different subproblems, we can use DP to 17. b[i,j] ¬ “ ← ”
subsequence in both the strings from left evaluate the problem.
LCS of X and Yn-1. 18. return c and b
to right, which need not be continuous. For Method LCS-LENGTH grab 2 sequences
y This characterises that the LCS of two
example, if X = “ABCBDAB” and Y = “BDCABA”, X = <x1, x2, …, xm> and Y = <Y1, Y2, …, Yn> as y The running time of the procedure is θ(mn),
series contains within it an LCS of prefixes
then LCS(X,Y) = {“BCBA”, “BCAB”}. As we can inputs. since each table entry takes θ(1) time to
of the two series.
see, there are several optimal solutions. y We store c[i, j] values in a table c[0..m,
y LCS has optimal-substructure property as compute.
Brute force approach: a problem. 0..n], and evaluate the entries in Row- y Space complexity is θ(mn) because of
Major order. (the procedure fills in the
Subsequence X[1...m] (with m being the y In recursive solutions, there are tables.
first row of C from left to right, then the
length of subsequence X) and if it is a overlapping subproblems. Example:
second row, and so on).
subsequence of Y[1...n] (with n being the Recursive solution: y b[1...m, 1...n] helps us in constructing an Let X = <P, Q, R, Q, S, P, Q> and Y <Q, S, R, P,
length of subsequence Y) checking this takes
y The optimal substructure implies that we optimal solution. Q, P> be two sequences.
O(n) time, and there are 2m subsequences
possible for X. The time complexity thus should examine either 1 or 2 subproblems
sums up to O(n2m), which is not good for when discovery LCS of X = <x1, x2, ….,xm >
large sequence. and Y = <y1, y2, …., yn>.

Dynamic Programming 77 78 Dynamic Programming


2. Return y Let ‘s’ and ‘t’ be the source and destination
3. If b[i, j] == “  ” respectively.
Previous Years’ Question
4. PRINT-LCS(b, X, i-1, j-1) y The sum of the costs of the edges on a
5. Print Xi path from source (s) to destination (t) is
Consider the data given in the above
6. Elseif b[i, j] == “ ↑ ” the path’s cost.
question. The value of l(i,j) could be
y The objective of the MULTISTAGE GRAPH
7. PRINT-LCS(b, X, i-1, j) obtained by dynamic programming
problem is to find the minimum path from
8. Else PRINT-LCS(b, X, i, j-1) based on the correct recursive definition
‘s’ to ‘t’.
of l(i,j) of the form given above, using an
y Each set Vi defines a stage in the graph.
y This procedure takes time O(m+n), since array L[M, NJ, where M = m + 1 and N =
Every path from ‘s’ to ‘t’ starts in stage-1,
it decrements at least one of i and j each n + 1, such that L[i,jJ = l(i,j).
goes to stage-2 then to stage-3, then to
recursive call. Which one of the following statements
stage-4, and so on, and terminates in
would be true regarding the dynamic
stage-k.
programming solution for the recursive
y This MULISTAGE GRAPH problem can be
Previous Years’ Question definition of l(i,j)?
solved in 2 ways.
(A) All elements L should be initialized to
y The c and b are two dimensional matrices ο Forward Method
A sub-sequence of a given sequence O for the value of l(i, j) to be properly
that stores the length of subsequence ο Backward Method
is just the given sequence with some computed.
and the appropriate arrows respectively. elements (possibly none or all) left out. (B) The values of l(i, j) may be computed Forward method:
y Initialise the X row and Y column to zero We are given two sequences X[m] and in a row major order or column major Algorithm FGraph (G,k,n,p)
and start with c[1,1] onwards. Y[n] of length m and n, respectively with order of L[M, N] // The input is a k-stage graph G=(V,E) with
y The length can be computed row wise or indexes of X and Y starting from 0. (C) The values of l(i, j) cannot be ‘n’ vertex.
column wise. We wish to find the length of longest computed in either row major order // Indexed in order of stages and E is a set
y Let us consider row wise here: common sub-sequence (LCS) of X[m] and or column major order of L[M, NJ of edges.
Starting with c[1, 1] P ¹ Q therefore Y[n] is as l(m, n), where an incomplete (D) L[p, qJ needs to be computed before // and c[i,j] is the cost of edge <i,j> is a
c[1, 1] = max {c[0, 1], c[1, 0]} = 0 recursive definition for the function L[r, sJ if either p < r or q < s. minimum-cost path.
b [1, 1] = “ ↑ ” l(i, j) to compute the length of the LCS of Solution: (B) [GATE 2009] {
c[1, 2] = max {c[1, 1], c[0, 2]} = 0 X[m] and Y[n] is given below: cost[n]=0.0;
b[1, 2] = “ ↑ ” since P ¹ S. l(i,j) = 0, if either i = 0 or j = 0 for j=n-1 to 1 do
= expr1, if i, j>0 and X[i-1] = Y[j-1] {
c[1, 3] = max {c[0, 3], c[1, 2]} = 0
Previous Years’ Question // compute cost[j],
b[1, 3] = “ ↑ ” since P ¹ R. = expr2, if i,
// let ‘r’ be the vertex such that
c[1, 4] = c[0, 3] + 1 = 1, j>0 and X[i-1] * Y[j-1]
<j,r> is an edge of ‘G’ &
b[1, 4] = “  ” since P = P. Which one of the following options is Consider Two Strings A = “Qpqrr” And B // c[j,r]+cost[r] is minimum.
correct? = “Pqprqrp”. Let X Be The Length Of The cost[j] = c[j,r] + cost[r];
y Similiarly all the values in each row are
(A) expr1 = l(i-1,j)+1 Longest Common Subsequence (Not d[j] = r;
calculated until c[7, 6].
(B) expr1 = l(i,j-1) Necessarily Contiguous) Between A And B }
y To find the possible subsequences
(C) expr2 = max (l(i-1,j), l(i,j-1)) And Let Y Be The Number Of Such Longest // find a minimum cost path.
possible, b[i, j] is used to reconstruct the
(D) expr2 = max(l(i-1,j-1),l(l,j)) Common Subsequences Between A And B. P[1] = 1;
subsequence from the table.
Solution: (C) [GATE 2009] Then X + 10y = ____ P[k]=n;
PRINT-LCS(b, X, i, j) For j=2 to k-1 do
Solution: 34 [Gate 2014 (Set-2)]
1. If i == 0 or j == 0 P[j]=d[P[j-1]];
}
Multistage Graph:
y Assume that there are ‘k’ stages in a graph.
y A multistage graph G = (V,E) is a directed y In this FORWARD approach, we find out
graph in which the vertices are partitioned the cost of each and every node starting
into K ≥ 2 disjoint sets of Vi, 1 ≤ i ≤ k from the ‘k’ th stage to the 1st stage.
y In addition, if < u, v > is an edge in E, then y We will find out the path (i.e.) minimum
u is in Vi and v belongs to Vi + 1 for some i, cost path from source to the destination
1 ≤ i < k. (i.e,) [ Stage-1 to Stage-k ].

Dynamic Programming 79 80 Dynamic Programming


y Maintain a cost matrix cost[n] which cost(8) = min {c (8,10) + cost (10), c(8,11) + D (7) = 10
stores the distance from any vertex to the Cost(11)} D (10) = 12
destination. = min (5+2, 6+5) So, the minimum-cost path is,
y If a vertex is having more than one path, =7 1 2 7 10 12
then we have to choose the minimum cost(8) = 7 =>D(8) = 10
distance path and the intermediate cost(7) = min(c (7,9) + cost(9), c(7,10) +
vertex, which gives the minimum distance Backward method:
cost(10))
path and will be stored in the distance y It is similar to forward approach, but
= min(4+4, 3+2)
array ‘d’. differs only in two or three ways.
= min(8,5)
y Thus, we can find out the minimum cost y Maintain a cost matrix to store the cost
=5 of every vertices and a distance matrix to
path from each and every vertex.
cost(7) = 5 => D(7) = 10 store the minimum distance vertex. Fig. 5.1
y Finally cost(1) will give the shortest
cost(6) = min(c (6,9) + cost(9), c(6,10) + y Find out the cost of each and every vertex
distance from source to destination. Cost(1) =0 = > D(1)= 0
cost(10)) starting from vertex 1 up to vertex k.
y For finding the path, start from vertex-1 then Cost(2) =9 = > D(2)=1
= min(6+4, 5+2) y To find out the path start from vertex ‘k’,
the distance array D(1) will give the minimum Cost(3) =7 = > D(3)=1
cost neighbour vertex which in turn give the = min(10, 7) then the distance array D (k) will give the Cost(4) =3 = > D(4)=1
next nearest vertex and proceed in this way =7 minimum cost neighbour vertex which Cost(5) =2 = > D(5)=1
till we reach the destination. cost(6) = 7 =>D(6) = 10 in turn gives the next nearest neighbour Cost(6) = min(c (2,6) + cost(2), c(3,6)
y For a ‘k’ stage graph, there will be ‘k’ vertex cost(5) = min(c (5,7) + cost(7), c(5,8) + cost(8)) vertex and proceed till we reach the + cost(3))
in the path. = min(11+5, 8+7) destination. = min(13,9)
Example: = min(16,15) Algorithm: backward method: cost(6) = 9 => D(6)=3
= 15 Cost(7) = min(c (3,7) + cost(3), c(5,7)
Algorithm BGraph (G,k,n,p)
cost(5) = 15 =>D(5) = 8 + cost(5), c(2,7) + cost(2))
cost(4) = min(c (4,8) + cost(8)) // The input is a k-stage graph G=(V,E) with = min(14,13,11)
= min(11+7) ‘n’ vertex. cost(7) = 11 =>D(7)=2
= 18 // Indexed in order of stages and E is a set Cost(8) = min(c (2,8) + cost(2), c(4,8)
cost(4) = 18 =>D(4) = 8 of edges. + cost(4), c(5,8) + cost(5))
cost(3) = min(c (3,6) + cost(6), c(3,7) +cost(7)) // and c[i,j] is the cost of edge <i,j> (i,j are = min(10,14,10)
= min(2+7, 7+5) the vertex number), p[k] is a minimum cost(8) = 10 =>D(8)=2
= min(9, 12) cost path. Cost(9) = min(c (6,9) + cost(6), c(7,9)
=9 + cost(7))
{
= min(15, 15)
cost(3) = 9 =>D(3) = 6
bcost[1]=0.0; cost(9) = 15 =>D(9)=6
cost(2) = min(c (2, 6) + cost(6), c(2,7) + for j=2 to n do Cost(10) = min(c(6,10) + cost(6), c(7,10)
cost(7), c(2,8) + cost(8)) { + cost(7)), c(8,10) + cost(8))
= min(4+7, 2+5, 1+7) // compute bcost[j]
y In the above graph V1…V5 represent the = min(14,14,15)
= min(11, 7, 8) // let ‘r’ be the vertex such that cost(10) = 14 =>D(10)=6
stages. This 5 stage graph can be solved
=7 <r,j> is an edge of ‘G’ & Cost(11) = min(c (8,11) + cost(8))
by using forward approach as follows,
cost(2) = 7=>D(2) = 7 // bcost[r]+c[r,j] is minimum. cost(11) = 16 =>D(11)=8
STEPS: DESTINATION, D
cost(1) = min(c (1,2)+cost(2), c(1,3) + cost(3), Cost(12) = min(c(9,12) + cost(9), c(10,12)
y Cost (12) = 0 D (12) = 0 bcost[j] = bcost[r] + c[r,j];
c(1,4) + cost(4), c(1,5) + cost(5)) d[j]=r; + cost(10), c(11,12) + cost(11))
y Cost (11) = 5 D (11) = 12
y Cost (10) = 2 D (10) = 12 = min(9+7, 7+9, 3+18, 2+15) } = min(19,16,21)
y Cost (9) = 4 D (9) = 12 = min(16, 16, 21, 17) // find a minimum cost path. cost(12) = 16 = >D(12) = 10
For forward approach, = 16 P[1]=1; Start from vertex-12:
cost(1) = 16 =>D(1) = 2 P[k]=n; D(12) = 10
Cost (i,j) = min {c (j,l) + Cost (i+1,l) }
Start from vertex -2 For j= k-1 to 2 do D(10) = 6
l Î Vi + 1
D (1) = 2 P[j]=d[P[j+1]]; D(6) = 3
(j,l) Î E
D (2) = 7 } D(3) = 1

Dynamic Programming 81 82 Dynamic Programming


So the minimum cost path is, length of the shortest path visiting each y Lets start from node 1 y Thus, to take a bottom-up approach, we
7 2 5 2 node in S exactly once, starting at 1 and C(4, 1) = 20 C(3,1) = 15 C(2,1) = 10 must solve the 0-1 knapsack problem for
1 
→ 3 
→ 6 
→ 10 
→ 12
ending at j. C({3},2) = d(2,3) + C(3,1) = 50 all items and possible weights smaller
The cost is 16. When |S| > 1, we define C(S,1) = ∞ since the C({4},2) = d(2,4) + C(4,1) = 45 than W.
path cannot both start and end at 1. C({2},3) = d(3,2) + C(2,1) = 45 y We’ll build an n + 1 by W + 1 table of values
Travelling Salesman Problem
Now, let’s express C(S,j) in terms of smaller C({4},3) = d(3,4) + C(4,1) = 50 where the rows are indexed by item, and
y The traveling salesman problem (TSP) is to subproblems. We need to start at 1 and C({2},4) = d(4,2) + C(2,1) = 35 the columns are indexed by total weight.
find the shortest possible route that visits end at j; what should we pick as the C({3},4) = d(4,3) + C(3,1) = 45 y For row i column j, we decide whether or
each city exactly once and returns to the second-to-last city? It has to be some C({3,4,2}) = min(d(2,3) + C({4},3), d(2,4) + not it would be advantageous to include
starting point given a set of cities and the i ∈ S , so the overall path length is the C({3},4)) item i in the knapsack by comparing the
distance between each pair of cities. distance from 1 to i, namely, C(S – {j}, i), = min(85,70) total value of a knapsack including items
y Take note of the distinction between plus the length of the final edge. dij. We = 70 1 through i − 1 with max weight j, or the
the Hamiltonian cycle and the TSP. must pick the best such i: C({2,4},3) = min(d(3,2) + C({4},2), d(3,4) + total value of including items 1 through i
The Hamiltonian cycle problem entails C({2},4)) − 1 with max weight j − i.weight and also
j) min C(S − { j}, i) + dij .
C(S,=
determining whether a tour exists that i∈S:i≠ j = min(80,65) item i.
visits each city exactly once. The problem The subproblems are ordered by |S|. Here’s = 65 y To solve the problem, we simply examine
is to find a minimum weight Hamiltonian the code. C({2,3},4) = min(d(4,2) + C({3},2), d(4,3) + the [n,W] entry of the table to determine
cycle. We know that Hamiltonian tours C({1},1) = 0 C({2},3)) the maximum value we can achieve.
exist (because the graph is complete), and for s = 2 to n: = min(75, 75) y To read the items we include, start with
there are many of them. for all subsets S ⊆ {1, 2, ...,n} of size s and = 75 entry [n,W]. In general, proceed as follows:
y Let the number of vertices in the given set containing 1: y Finally If entry [i,j] equals entry [i−1, j], don’t
be 1, 2, 3, 4,...n. Let’s use 1 as the starting C(S,1) = ∞ C({2,3,4},1) = min(d(1,2) + C({3,4},2), d(1,3) + include item i, and examine entry [i−1, j],
and ending points for the output. We find for all j ∈ S, j ≠ 1 : C({2,4},3), d(1,4)+C({2,3},4)) next.
the minimum cost path with 1 as the = min(80,80,95) y If entry [i,j] doesn’t equal entry [i − 1, j],
C(S,
= j) min{C(S − { j}, i) + dij : i ∈ S, i ≠ j}
starting point, i as the ending point, and = 80 include item i and examine entry [i − 1, j −
all vertices appearing exactly once. return min j C({1, ...,n}, j) + dj1 ∴ Optimal tour length is 80 i].weight next.
y Let’s say the cost of this path is cost(i), Optimal tour 1-2-4-3-1 Algorithm 0-1 Knapsack(n,W)
There are at most 2n⋅n subproblems, and
then the cost of the corresponding cycle each one takes linear time to solve. The 0-1 Knapsack problem:
1. Initialize a 2D matrix KS of size (n+1) x
is cost(i) + dist(i, 1), where dist(i, 1) is the total running time is, therefore O(n22n). The 0-1 knapsack problem is as follows. A (W+1)
distance between from i to 1. Finally, we
Example: thief robbing a store finds n items. The ith 2. Initialize 1st row and 1st column with zero
return the value that is the smallest of all
y Consider the graph item is worth vi dollars and weighs wi pounds, 3. For itr<- 1 to n
[cost(i) + dist(i, 1)]. So far, this appears to
where vi and wi are integers. The thief wants
be straightforward. The question now is 1 4. For j<-1 to W
to take as valuable a load as possible, but he
how to obtain cost(i). 5. If(j< W[itr])
can carry at most W pounds in his knapsack,
y We need a recursive relation in terms of 6. KS[itr,j]<- K[itr-1][j]
10
20
15 for some integer W. Which items should he
sub-problems to calculate the cost(i) 7. Else
take? (We call this the 0-1 knapsack problem
using dynamic programming. 4 because, for each item, the thief must either 8. KS[itr,j]<-max(KS[itr-1,j] , KS[itr-1 , j])
y Let’s say C(S, i) is the cost of the minimum 25 30 take it or leave it behind; he cannot take a 9. End for
cost path visiting each vertex in set S fractional amount of an item or take an item
exactly once, starting at 1 and ending at i. 2 35 3 10. End for
more than once.)
y We begin with all subsetds of size 2 and End
y Matrix representation of the above graph
calculate C(S, i) for all subsets where S is Dynamic Programming Approach Analysis:
the subset, then calculate C(S, i) for all 1 2 3 4 y Since the time to calculate each entry
y Suppose we know that a particular item
subsets of size 3, and so on. It’s worth 1  0 10 15 20 in the table KS[i,j] is constant; the time
of weight w is in the solution. Then we
noting that 1 must appear in each subset. complexity is Θ(n x W). Where n is the
2  10 0 35 25  must solve the subproblem on n − 1 items
For a subset of cities S ⊆ {1, 2, ...,n} that number of items and W is the capacity of
3  15 35 0 30 with maximum weight W − w.
includes 1, and j ∈ S, let C(S, j) be the   the Knapsack.
4 20 25 30 0 

Dynamic Programming 83 84 Dynamic Programming


Recurrence relation: KS
= ( 2, 1) KS
= ( 1, 1) 10 Example: y First, we check the base condition. If
y In a Knapsack problem, we are given a set Let set = {6, 2, 3, 1}. Check whether there is a there are no elements i.e., i = 0, and we
P2 + KS ( 1, 0 ) , ( 15 + 0 ) ,
of n items where each item i is specified= KS ( 2, 2) max = max =  15 subset whose sum is 5. want to produce some sum S then it is
by a size/weight
 max (Piw+i KS (
andi –a1, w
value )
– wi P, i. We KS ( 1, 2 ) 10
Solution:
not possible. So, it is false.
are also given the size bound W, the size y If there are no elements i.e., i = 0, and sum
 KS (i – 1, w ));if wi ≤ w P2 + KS ( 1, 1) , ( 15 + 10 ) , Here, there is a subset {3, 2} whose sum is 5
(capacity) of our Knapsack. = KS ( 2, 3) max= max
=  25 S = 0 is possible. So, it is true because
  KS = (i, w )  KS (i − 1, w ) ;if wi > w KS ( 1, 3) 10 Here greedy method fails; if we try to be
Sum = 0 is always possible.
 greedy on a smaller number, then it will take
0 = ;if i 0=or w 0 P2 + KS ( 1, 2) , ( 15 + 10 ) , y This above two are base conditions.
 max (P
 
(
 i + KS i – 1, w – wi , ) = KS ( 2, 4 ) max= max
=  25 1, and so if we take 1, then the remaining
y Using ith element if this sum ‘S’ has to be
 KS (i – 1, w ));if wi ≤ w KS ( 1, 4 ) 10 sum is going to be 4. So, there is no number
 possible, then there are two cases:
P2 + KS ( 1, 3 ) , ( 15 + 10 ) , or subset which makes to 4.
KS
= (i, w )  KS (i − 1, w ) ;if wi > w = KS ( 2, 5 ) max=  max
=  25 So, greedy doesn’t work in subset-sum.
Case 1: If we include the ith element in our
 0 = ;if i 0= or w 0 KS ( 1, 5 ) 10 subset, then a sum of (S – ai) should be
 Brute-fore method: possible with the remaining (i-1) elements.
 P2 + KS ( 1, 4 ) , ( 15 + 10 ) ,
 = KS ( 2, 6 ) max= max
=  25 y If there are ‘n’ numbers, then any number Case 2: If we don’t include the ith element
KS ( 1, 6 ) 10 can be present or absent in subset. So, in our subset, then a sum of (S) should be
Where KS (i,w) is the best value that can KS
= ( 3, 1) KS
= ( 2, 1) 10 every a number has 2 options. Hence, the possible with the remaining (i-1) elements.
be achieved, for instance, with only the number of subsets is equal to 2n. So, the recursive equation look like as
first i items and capacity w.
KS
= ( 3, 2) KS
= ( 2, 2) 15 y Brute force method examines every shown below:
Example: P3 + KS ( 2, 0 ) , ( 40 + 0 ) , subset, i.e, equal to 2n. To examine each
KS ( 3, 3) max = max
= = 45 SS(i − 1, S); S < ai
KS ( 2, 3) 25 sub-problem, it will take O(n) time. 
Item 1 2 3 SS(i − 1, S − ai ) V S S(i − 1, S); S ≥ ai
y Therefore, time complexity = number of SS(i, s) = 
P3 + KS ( 2, 1) , ( 40 + 10 ) ,
Weight (in kgs) 1 2 3 KS ( 3, 4 ) max=
=  max
=  50 subproblem * time taken by each sub- true; S = 0
KS ( 2, 4 ) 25 problem. = 0(2n) * O(n) = O(n2n) False;i
 = 0, S ≠ 0
Values (In rupees) 10 15 40
P3 + KS ( 2, 2) , ( 40 + 15 ) , Recursive equation: y If the problem is a subset-sum of SS(n,w)
Capacity of bag = W = 6 kgs KS ( 3, 5 ) max=
=  max
=  55
y Let us assume that SS(i, S) denote a (where n positive numbers A1A2…An and w
KS ( 2, 5 ) 25
0 1 2 3 4 5 6 subset-sum from a1 to ai whose sum is is a sum). Then the number of unique sub-
P3 + KS ( 2, 3) , ( 40 + 25 ) , problem is O(nw).
0 0 0 0 0 0 0 0 KS ( 3, 6 ) max=
=  max
=  65 equal to some number ‘S’.
1 0 10 10 10 10 10 10 KS ( 2, 6 ) 25
2 0 10 15 25 25 25 25 Subset Sum Problem Solved Examples
3 0 10 15 40 50 55 65
Problem:
1. Given the set S = {6, 3, 2, 1}. Is there any subset possible whose sum is equal to 5.
Given a sequence of n positive numbers Solution:
P1 + KS ( 0, 0 ) , ( 10 + 0 ) ,
KS ( 1, 1) max = max
= =  10 A1, A2…..An, give an algorithm which checks
KS ( 0, 1) 0 whether there exists a subset of A whose Since here, the number of elements = 4 and the sum is going to be 5 then the number of
sum of all numbers is T. problem = 4 × 5 = 20.
P1 + KS ( 0, 1) , ( 10 + 0 ) ,
KS ( 1, 2) max = max
= =  10
y This is a variation of the Knapsack problem. Sum
KS ( 0, 2) 0
As an example, consider the following 0 1 2 3 4 5
P1 + KS ( 0, 2) , ( 10 + 0 ) , array: A = [3, 2, 4, 19, 3, 7, 13, 10, 6, 11]
KS ( 1, 3) max = max
= = 10
KS ( 0, 3) 0 Suppose if we want to check whether 0 T F F F F F
there is any subset whose sum is 17. The
P1 + KS ( 0, 3 ) , ( 10 + 0 ) , 1 T F F F F F
KS ( 1, 4 ) max = max
= = 10 answer is yes, because the sum of 4 + 13
KS ( 0, 4 ) 0 = 17 and therefore {4, 13} is such a subset. number
2 T F F T F F
P1 + KS ( 0, 4 ) , ( 10 + 0 ) , y Greedy method is not applicable in subset- of element
KS ( 1, 5 ) max = max
= =  10 sum. If we try to be greedy, then we
KS ( 0, 5 ) 0
don’t know whether to take the smallest 3 T F T T F T
P1 + KS ( 0, 5 ) , ( 10 + 0 ) , number or the highest number. So, there
KS ( 1, 6 ) max = max
= = 10 4 T T T T T T
is no way we could go for greedy.
KS ( 0, 6 ) 0

Dynamic Programming 85 86 Dynamic Programming


y Whenever we want the sum = 0 then it Note: y The Floyd-Warshall algorithm considers stronger statement. Because vertex k is
is always possible with whatever element the intermediate vertices of a shortest not an intermediate vertex of path p1, all
y Either to use dynamic programming or
because a null set is going to be sum = 0. path, where an intermediate vertex of a intermediate vertices of p1 are in the set
not depends on the value of w. simple path P = <V1, V2, …,Vl> is any vertex
y Index (i, j) indicate, with ith element is {1,2,…,k-1}. Therefore, P1 is a shortest path
y If ‘w’ a is big number, then the brute force of P other than V1 or Vl, that is, any vertex from i to k with all intermediate vertices
sum j possible.
method gives better time complexity i.e. in the set {V2, V3,…,Vl-1} in the set {1,2,…,k-1}.
SS(1,1) = SS(0,1) = False
O(2n) otherwise dynamic programming.
[Since the 1st element weight is ‘6’. So, it y The Floyd-Warshall algorithm relies on y Similarly, P2 is a shortest path from vertex
can’t lead to sum of 1 then we have to go Conclusion: the following observation. k to vertex j with all intermediate vertices
for SS(i-1,S) i.e., SS(0, 1)] O(2n )
Time complexity = min  y Under our assumption that the vertices of in the set {1,2,…,k-1}.
Similarly, O(nw) G are V = {1,2,…,n}, let us consider a subset
SS(2,1) = SS(1,1) = False {1,2,…,k} of vertices for some k. Recurrence relation:
y If w is n! then 2n is going to be better than
SS(2,2) = SS(1,2) = False y Let dij(k) be the weight of a shortest path
O(nw). y For any pair of vertices i, j ∈ V , consider all
SS(2,3) = SS(1,0) VSS(1,3) = True V False =
Space complexity: paths from i to j whose intermediate from vertex i to vertex j for which all
True
Space complexity is required for the table. vertices are all drawn from {1,2,…k} and intermediate vertices are in the set {1,
SS(2,4) = SS(1,1) V SS(1,4) = False V False
So, O(nw) is the space complexity. Let p be a minimum-weight path from 2,…,k}.
= False
among them. (Path P is simple)
SS(2,5) = SS(1,2) V SS(1,5) = False V False All Pairs Shortest Path Floyd Warshall y When k=0, a path from vertex i to vertex j
= False y The Floyd-Warshall algorithm exploits with no intermediate vertex.
SS(3,1) = SS(2,1) (Here, S < i) = False Problem: a relationship between path p and
SS(3,2) = SS(2,0) V SS(2,2) = True V False Given a weighted directed graph G = (V, E), shortest paths from i to j with all y Such a path has at most one edge, and
= True where V = {1, 2, …,n}. Find the shortest path intermediate vertices in the set {1,2,…,k-1}. hence dij(0) = wij .
SS(3,3) = SS(2,1) V SS(2,3) = False V True between all pair of nodes in the graph. y The relationship depends on whether or
We define dij(k) recursively by
= True y We can solve an all-pairs shortest-paths not k is an intermediate vertex of path p.
SS(3,4) = SS(2,2) V SS (2,4) = False V False problem by running a single-source  w if k = 0
y If k is not an intermediate vertex of path  ij
dij(k) = 
= False
SS(3,5) = SS(2,3) V SS (2,5) = True V False
shortest-paths algorithms |V| times, once
for each vertex as the source.
p, then all intermediate vertices of path p 
 (
min dij(k −1) , dik
(k − 1) (k–1)
+ dkj ) if k ≥ 1
are in the set {1,2,…,k-1}. Thus, a shortest
= True path from vertex i to vertex j with all y Because for any path, all intermediate
y If all the edges of the graph contain the
SS(4,1) = SS(3, 0) V SS(3,1) = True V False intermediate vertices in the set {1,2,…,k-1} vertices are in the set {1,2,…,n}, the matrix
positive weight, then apply Dijkstra’s
= True
algorithm. is also the shortest path from i to j with all ( )
D(n) = dij(n) gives the final answer:
SS(4,2) = SS(3,1) V SS(3,2) = False V True intermediate vertices in the set {1,2,…,k}.
= True y If we use the linear-array implementation dij(n) = δ(i, j) for i, j ∈ V .
SS(4,3) = SS(3,2) V SS (3,3) = True V True of the min-priority queue, the running y If k is an intermediate vertex of path p, then
time is O(V3 + VE), and we know, E = V2 we decompose p into i  P1 P2
→ k  → j, as
y Based on recurrence relation, we can us
= True the following bottom-up procedure to
so, O(V3).
SS(4,4) = SS(3,3) V SS(3,4) = True V False shown below: compute the value dij(k) in order of
= True y The binary min-heap implementation of the
all intermediate all intermediate
SS(4,5) = SS(3,4) V SS(3,5) = False V True min-priority queue yields a running time increasing values of k.
vertices in {1,2,...,k-1} vertices in {1,2,...,k-1}
= True of O(V+ElogV), which is an improvement if
y Since, the final answer is in shell (4,5). So, the graph is sparse. K Floyd-Warshall(W)
the final answer is true. The final answer y Alternatively, we can implement the min- P1 P2
1. =
n W ⋅ rows
will always be present in (n, w). priority queue with a Fibonacci heap,
yielding a running time of O(V2logV+VE). 2. D(0) = W
y The subset sum can be computed either j
in row major order or column major order. 3. For k = 1 to n
y Instead, we must run the slower Bellman-
Time complexity: ford algorithm once from each vertex.
P: all intermediate vertices in {1,2,...,k}
( )
4. Let D(k) = dij(k) be a new n × n size matrix

y Here, the number of the subproblem is y The resulting running time is O(V E), which
2 5. For i = 1 to n
Fig. 5.2
(nw), and the time required to calculate for the dense graph is O(V4) 6. For j=1 to n
each subproblem is O(1). Hence, time y P1 is a shortest path from vertex i to vertex
complexity = (nw) × O(1) = O(nw).
y In the Floyd-Warshall algorithm,
negative-weight edge is allowed.
the =
k with all intermediate vertices in the set
7. dij(k) min dij(k −1) , dik
(k − 1)
( (k − 1)
+ dkj )
{1,2,….,k}. In fact, we can make a slightly 8. Return D (n)

Dynamic Programming 87 88 Dynamic Programming


y Its input is an n × n matrix W. The procedure 1 2 3 4 5 y The search time for an element depends y If we separate the left subtree time
returns the matrix D(n) of shortest-path 1 0 3 -1 4 -4 which level node is present. and right subtree time, then the above
weights. 2 3 0 -4 1 -1 y The average number of comparisons for expressions can be written as:
r−1 n
y The running time of the Floyd-Warshall D =
(4)
3 7 4 0 5 3 1 + 2 + 2 + 3 + 3 11
algorithm is determined by the triply 4 2 -1 -5 0 -2
the first tree is:
5
=
5
and = S(root) ∑ (depth(root, i) + 1) × F[i]) + F[i] +

i= 1 r
nested for loops of lines 3-7 because each 5 8 5 1 6 0 for the second tree, the average number n
execution of line 7 takes O(1) time, the
algorithm runs in time θ(n3 ).
1 2 3 4 5 of comparisons:
1 + 2 + 3 + 3 + 4 13
= . ∑ (depth(root, i) + 1) × F[i])
1 0 1 -3 2 -4 5 5 i= r + 1

eg: 2 3 0 -4 1 -1 Among the two, the first tree is giving Where “r” indicates the positions of the
better results. root element in the array.
D =
(5)
3 7 4 0 5 3
y Here, frequencies are not given, and if y If we replace the left subtree and right
4 2 -1 -5 0 -2 subtree times with their corresponding
we want to search all elements, then the
5 8 5 1 6 0 recursive calls, then the expression
above simple calculation is enough for
Space complexity: deciding the best tree. becomes:

Here, space complexity is O(n3), but we can y If the frequencies are given, then the n

reduce the space complexity to O(n2). selection depends on the frequencies of


S(root)= S(root → left) + S(root → right) + ∑ F[i]
i= 1
the elements and also the depth of the
elements. Implementation:
Previous Years’ Question y For simplicity, let us assume that, the Node optimalBST(int keys[], int freq[])
given array is A and the corresponding {
Fig. 5.3
The Floyd-Warshall algorithm for all-pair frequencies are in array F. F[i] indicates int n = keys.length;
1 2 3 4 5 shortest paths computation is based on: the frequency of ith element A[i]. int cost[][] = new int[n][n];
1 0 3 8  -4 int root[][] = new int[n][n];
(A) Greedy paradigm.
2  0  1 7 y With this, the total search time S(root) of for(int i=0;i<n;i++)
(B) Divide-and-Conquer paradigm
3 4 0 the tree with root can be defined as
D =   
(0)
(C) Dynamic Programming paradigm for(int j=0;j<n;j++)
4 2  -5 0  (D) Neither Greedy nor Divide-and
n
cost[i][j] = inf;
5    6 0 Conquer nor Dynamic programming
S(root)
= ∑ (depth(root, i) + 1) × F[i])
i= 1
/*
paradigm cost[i][j] is the minimum cost of optimal
1 2 3 4 5
1 0 3 8  -4 Solution: (C) [GATE 2016 (Set-2)] 12 35 subtree formed by the vertices[i to j]
root[i][j] represents the index of the root
2  0  1 7
in the subtree with vertices [i to j]
D =
(1)
3  4 0   12
Optimal Binary Search Trees 3 32 */
4 2 5 -5 0 -2 // cost of the optimal binary search tree
5    6 0 Problem: // index starts at 0
3 21
1 2 3 4 5 y Given a set of n(sorted) keys A[1…n], 21 35 int minCost = optimalCost(0,n-1,freq,cost,root);
1 0 3 8 4 -4 build the best binary search tree for the Node optimalTreeRoot = buildOptimalTree(0,n-
2  0  1 7 elements of A. Also, assume that, each 35 1,keys,root);
element is associated with the frequency, return optimalTreeRoot;
D =
(2)
3  4 0 5 11
which indicates the number of times that Fig. 5.4 }
4 2 5 -5 0 -2 particular item is searched in the binary
y In the above expression, depth (root, i) + 1 int optimalCost(int i, int j , int freq[], int cost[]
5    6 0 search tree.
indicates the number of comparisons for [], int root[][]){
1 2 3 4 5 y To reduce the total search time, we need searching the ith element. // base conditions
1 0 3 8 4 -4 to construct BST (Binary search tree). if(i<j) return 0;
2  0  1 7 y Since we are trying to create a binary
y Understand the problem by taking 5 else if(i==j) return freq[i];
search tree, the left subtree elements are
D =
(3)
3  4 0 5 11 elements in the array. Let us assume that // using stored values
less than the root element, and the right
4 2 -1 -5 0 -2 the given array is A=[3, 12, 21, 32, 35]. To subtree elements are greater than the if(cost[i][j] < inf) return cost[i][j];
5    6 0 represent these elements, there are many root element. int minCost = inf;
ways and below are two of them. int minRoot = i;

Dynamic Programming 89 90 Dynamic Programming


for(int r=i; r<=j; r++){
Solved Examples
// root can be any vertex from i to j
int c = optimalCost(i,r-1,freq,cost,root) + optimalCost(r+1,j,freq,cost,root); 1. Let A1, A2, A3 and A4 be four matrices of 2. Let us define AiAi+1 as an explicitly
if(c<minCost){ dimensions 1 × 2, 2 × 1, 1 × 4 and 4 × computed pair for a given parenthesization
minCost=c; 1 respectively. The minimum number if they are directly multiplied. For
minRoot = r; of scalar multiplications required to find example, in the matrix multiplication
} the product A1A2A3A4 using the basic chain A1A2A3A4 using parenthesization
} matrix multiplication method is ________ A 1 ( ( A 2 A 3 ) A 4 ) , A2 A3 are only explicitly
Solution: 7
int freqSum = 0; computed pairs.
for(int k=i;k<=j;k++) The unique function call which are made in Consider the matrix given in question
freqSum+=freq[k]; m[1, 4] are given below: number 1 that minimises the total
number of scalar multiplication, the
cost[i][j] = minCost + freqSum; 0 0 0 0
explicitly computed pairs is/are
root[i][j] = minRoot; (1, 1) (2, 2) (3, 3) (4, 4)
2 8 4 (A) A1A2 and A3A4 only
return cost[i][j]; (1, 2) (2, 3) (3, 4) (B) A2A3 only
} 6 6
(1, 3) (2, 4) (C) A1A2 only
Node buildOptimalTree(int i,int j,int keys[], int root[][]){ 7
// base conditions (1, 4) (D) A3A4 only
if(i<j) return null; Solution: (A)
0 , if i = j

if(i==j) return new Node(keys[i]); M[i, j] = 
{ }
Min M[i,k] + M[k + 1, j] + pi−1pkp j if i < j
i≤k < j
In question 1, we got the optimal scalar as 7
// getting the index of optimal root of subtree[i,j] stored in the matrix in (A1A2), (A3A4). So, A1A2, and A3A4 are explicitly
int rindex = root[i][j]; Number of scalar multiplication computed pairs.
Node node = new Node(keys[rindex]); A1A2 = 1 × 2 × 1 = 2
node.left = buildOptimalTree(i,rindex-1,keys,root); 3. Consider two strings X = “ABCBDAB”
A2A3 = 2 × 1 × 4 = 8
node.right = buildOptimalTree(rindex+1,j,keys,root); and Y = “BDCABA”. Let u be the length
A3A4 = 1 × 4 × 1 = 4 of the longest common subsequences
return node;
m(1, 1) + m(2, 3) + 1 × 2 × 4 (not necessarily contiguous) between X
} A 1=
A2 A3 m(1,
= 3) min  and Y and let V be the number of
Conclusion: m(1, 2) + m(3, 3) + 1 × 1 × 4
such longest common subsequences
0 + 8 + 8 between X and Y. then V + 10u is______
= min
=  6
2 + 0 + 4 Solution: 43
y We can determine whether the given sub-problems we determine the solution
m(2, 2) + m(3, 4) + 2 × 1 × 1
problem can be solved using a dynamic to the larger problem. A=
2A3A4 m(2,
= 4) min 
approach based on the two properties:- y In top-down programming, the recursive m(2, 3) + m(4, 4) + 2 × 4 × 1
ο Optimal substructure: An optimal structure of the original code is preserved, 0 + 4 + 2
= min
=  6
solution to a problem contains the but unnecessary recalculation is avoided. 8 + 0 + 8
optimal solution to subproblem. The problem is broken into subproblems,
m(1, 1) + m(2, 4) + 1 × 2 × 1
ο Overlapping subproblems: A recursive and these subproblems are solved, and 
A 1 A2 A=
3 A 4 (1, 4) min m(1, 2) + m(3, 4) + 1 × 1 × 1
solution the contains a small number the solutions are remembered, in case m(1, 3) + m(4, 4) + 1 × 4 × 1
of similar subproblems repeated many they need to be solved again. 
times. 0 + 6 + 2 8
y Bottom-up programming depend on values  
Note: = min 2 + 4 =
+ 1 min =
7 7
to calculate and order of evaluation. In Some problems can be solved with both 6 + 0 + 4 
 10
this approach, we solve the sub-problems techniques.
first only and based on the solution of the Hence, 7 is the minimum number of scalar
multiplication required to find the product
A1A2A3A4.

Dynamic Programming 91 92 Dynamic Programming


Here, the subsequence are shown below: Solution: 9 cost [2, 3] + T[3] ∞ + 187. Consider the weights and values of the
cost [2, 4] + T[4] ∞ + 4 items listed below.
X(2 3 4 6) X(4 5 6 7) X(2 3 6 7) The path is from 1 →− 4 →− 7 →− 8, which  
incur a minimum cost as 9. cost [2, 5] + T[5] 4 + 18 Item Number Weight Values (in
Y(1 3 5 6) Y(1 2 4 5) Y(1 3 4 5) = T[2] minimum = minimum =  22
T[i] denotes the minimum cost from ith node cost [2, 6] + T[6] 11 + 13 (in kgs) Rupees)
BCBA BDAB BCAB cost [2, 7] + T[7] ∞ + 2
to node 8.   1 1 10
The length of the longest common cost [2, 8] + T[8] ∞ + 0
= So, T [i] minimum {cost (i, j) + T [ j]} 2 2 12
subsequence is 4 and there are 3 subsequence ( for all j= (i+ 1) ton)
cost [2, 3] + T[3] ∞ + 18
of length 4. T[8] = 0 =j (i + 1) to n   3 4 28
cost [2, 4] + T[4] ∞ + 4
So, u = 4 and v = 3 cost [2, 5] + T[5] 4 + 18 The task is to pick a subset of these
= T[7] minimum {cost [7, 8] + T[8]} = T[2] minimum = minimum =  22
So, v + 10u = 3 + 40 = 43 cost [2, 6] + T[6] 11 + 13 items such that their total weight is no
cost [2, 7] + T[7] ∞ + 2 more than 6kg and their total value is
4. Consider two string X = “AAB” and Y =
= minimum = {2 + 0} 2  
cost [2, 8] + T[8] ∞ + 0 maximised. Moreover, no item may be
“ACA”. Let u be the length of the longest  
cost [6, 7] + T[7] split. The total values of items picked by
common subsequence (not necessarily T[6] = minimum  cost [1, 2] + T[2] 1 + 22 0/1 knapacks is______
 cost [6, 8] + T[8]
contiguous) between X and Y and let v cost [1, 3] + T[3] 2 + 18 Solution: 40
be the number of such longest common ∞ + 2  
 cost [1, 4] + T[4] 5 + 4
subsequences between X and Y. = T[6] minimum
=  13   weight
13 + 0 = T[1] minimum cost [1,= 5] + T[5] minimum= ∞ + 18 9
Then u + 10v is_______  0 1 2 3 4 5 6
cost [1, 6] + T[6] ∞ + 13
Solution: 12  
cost [5, 6] + T[6] cost [1, 7] + T[7] ∞ + 2 0 0 0 0 0 0 0 0
  
= T[5] minimum cost [5, 7] + T[7] cost [1, 8] + T[8] ∞+0
cost [5, 8] + T[8]   1 0 10 10 10 10 10 10

cost [1, 2] + T[2] 1 + 22
cost [1, 3] + T[3] 2 + 18 object 2 0 10 12 22 22 22 22
∞ + 13
  
= T[5] minimum = ∞ + 2 18 cost [1, 4] + T[4] 5 + 4 3 0 10 12 22 28 38 40
18 + 0  
 = T[1] minimum cost [1,= 5] + T[5] minimum = ∞ + 18 9
cost [1, 6] + T[6] ∞ + 13 If there is no object, then the maximum profit
cost [4, 5] + T[5]  
cost [4, 6] + T[6] (value) is going to be 0. This is indicated by
 cost [1, 7] + T[7] ∞ + 2
T[4] = minimum    1st row.
cost [4, 7] + T[7] cost [1, 8] + T[8]  ∞ + 0
Here, the subsequence is AA. So, the length 
Similarly, if the weight is 0, then also
cost [4, 8] + T[8]
of subsequence is 2, and there is only 1 maximum profit is 0. This is indicated by 1st
Hence, 9 is the minimum cost path from
occurrence of subsequence. ∞ + 18 column.
node 1 to node 8.
∞ + 13 = 4
Hence, u = 2, v = 1 
T[4] = minimum  Let ks(i,w) indicate maximum profit if
u + 10v = 2 + 10*1 = 12. 2 + 2 6. Shortest path in the multistage graph considering i’s number of elements and
∞ + 0
can be found by using maximum weight occupied is w.
5. Consider the multistage graph K = 4 then
cost [3, 4] + T[4] ∞ + 4 (A) Greedy method By using recurrence equation:
find out the minimum cost path from cost [3, 5] + T[5] 9 + 18
node 1 to node 8? (B) Dynamic method
  max(pi + ks(i − 1, w − wi ),ks(i − 1, w)), wi ≤ w
= T[3] minimum cost [3, = 6] + T[6] minimum
= 5 + 13 18 (C) Either by greedy method or dynamic 
4 ks(i,=
w) = 0;i 0 or= w 0
2 5 cost [3, 7] + T[7] 16 + 2 method
9   ks(i − 1, w); w > w
 i
1 18 cost [3, 8] + T[8] ∞ + 0 (D) None of above
11 p1 + ks(0, 0) 10 + 0
cost [3, 4] + T[4] ∞ + 4 Solution: (B) = ks(1, 1) max  = max =  10
13 
1
2
3
5
6 cost8  ks(0, 1) 0
[3, 5] + T[5] 9 + 18 Greedy method fails in finding the shortest
  p1 + ks(0, 1) 10 + 0
= T[3] minimum 2 cost [3, = 6] + T[6] minimum = 5 + 13 18 path in the multistage graph. = ks(1, 2) max = max
=  10
16
5 cost [3, 7] + T[7] 16 + 2 ks(0, 2) 0
  By using dynamic programming, we can get
4 7 cost [3, 8] + T[8] ∞ + 0 p1 + ks(0, 2) 10 + 0
2 the shortest path in the multistage graph. = ks(1, 3) max  = max
=  10
ks(0, 3) 0

Dynamic Programming 93 94 Dynamic Programming


p1 + ks(0, 3) 10 + 0 1 2 3 4 1 2 3 4
ks(1, 4) max  =
= max
=  10 1 0 6 1 3 1 0 11 1 6
ks(0, 4) 0
(A) 2 6 0 5 3 2 11 0 7 3
p1 + ks(0, 4) 10 + 0 D=
1

ks(1, 5) max  =
= max
=  10 3 1 5 0 2 3 1 7 0 2
ks(0, 5) 0 4 3 3 2 0 4 6 3 2 0
p1 + ks(0, 5) 10 + 0
ks(1, 6) max  =
= max
=  10 1 2 3 4
ks(0, 6) 0 1 0 11 1 3 1
Solution: 5 2 3 4
ks(2,
= 1) ks(1,
= 1) 10 (B) 2 11 0 7 3 1 0 11 1 6
(1, 2) + T(2, {3, 4})
p2 + ks(1, 0) 12 + 0  3 1 7 0 2 2 11 0 7 3
ks(2, 2) max  = max
= =
 12 T(1,=
{2, 3, 4}) min (1, 3) + T(3, {2, 4}) D=
2

ks(1, 2) 10  4 3 3 2 0 3 1 7 0 2
(1, 4) + T(4, {2, 3})
p2 + ks(1, 1) 12 + 10 1 2 3 4 4 6 3 2 0
ks(2, 3) max =
=  max
=  22 1 + 4
1 0 11 1 6
ks(1, 3) 10 
= min 2 + 7
3 + 4 (C) 2 11 0 7 3 1 2 3 4
p2 + ks(1, 2) 12 + 10 
ks(2, 4) max = max
= =  22 3 1 7 0 2 1 0 8 1 3
ks(1, 4)  10 =5 4 6 3 2 0 2 8 0 7 3
p2 + ks(1, 3) 12 + 10 (2, 3) + T(3, {4}) 4 + 8 D3=
ks(2, 5) max = max
= =
 22 T(2,
= {3, 4}) min  = min =
 4 (D) None of above 3 1 7 0 2
ks(1, 5) 10 (2, 4) + T(4, {3}) 2 + 2 4 3 3 2 0
p2 + ks(1, 4) 12 + 10 (3, 4) + T(4, {2}) 5 + 5 Solution: (A)
ks(2, 6) max =
= max
=  22 T(3,
= {2, 4}) min  = min =
 7
ks(1, 6) 10 (3, 2) + T(2, {4}) 2 + 5 Let Di = set of all shortest path between every 1 2 3 4
ks(3,
= 1) ks(2,
= 1) 10 (4, 2) + T(2, {3}) 4 + 5 pair in such a way that the path is allowed to 1 0 6 1 3
T(4,
= {2, 3}) min  = min =
 4
ks(3,
= 2) ks(2,
= 2) 12 (4, 3) + T(3, {2}) 1 + 3 go through node 0 to node i. 2 6 0 5 3
D4=
ks(3,
= 3) ks(2,
= 3) 22 T(3, {4}) = (3, 4) + T(4, 1) = 5 + 3 = 8 So, 3 1 5 0 2
T(4, {3}) = (4, 3) + T(3, 1) = 1 + 1 = 2 1 2 3 4 4 3 3 2 0
p3 + ks(2, 0) 28 + 0
ks(3, 4) max  = max
= =
 28 1 0 11 1 6
ks(2, 4) 22 T(2, {4}) = (2, 4) + T(4, 1) = 2 + 3 = 5
2 11 0 7 3
D=
0
p3 + ks(2, 1) 28 + 10 T(4, {2}) = (4, 2) + T(2, 1) = 4 + 1 = 5
3 1 7 0 2
ks(3, 5) max =
=  max
=  38
ks(2, 5) 22 T(2, {3}) = (2, 3) + T(3, 1) = 4 + 1 = 5 4 6 3 2 0
p3 + ks(2, 2) 28 + 12 T(3, {2}) = (3, 2) + T(2, 1) = 2 + 1 = 3
ks(3, 6) max = max
= =
 40
ks(2, 6) 22 So, minimum cost is 5 and the path is
8. Subset-sum can be computed by 1 2 1
(A) Row-major order only 1 2 4 3

(B) Column-major order only


(C) Either by row-major for column-
major only 10. Find the length of the shortest path
between all pair vertices for the given
(D) None of the above
graph G.
Solution: (C)
11
It can be computed by eighter row-major or 1 2
column-major only. 7
1 3
9. What is the minimum cost of the
travelling salesman problem, if starting 6
vertex is 1? 3 4
2

Dynamic Programming 95 96 Dynamic Programming


Matrix Chain Multiplication
Chapter Summary
Problem: Given a series of matrices: A1 × A2 × A3 × …×An with their dimensions, what is the
y The major components of dynamic programming are: best way to parenthesize them so that it produces the minimum scalar multiplication.
Recursion: Solves subproblems recursively. (2n) !
Number of ways we can parenthesis the matrix =
Memoization: Stores the result of sub-problems. (n + 1) !n!
Dynamic programming = Recursion + Memoization Where n = number of matrix – 1
y The two dynamic programming properties, which tells whether it can solve the given
problem or not are: Optimal substructure: An optimal solution to a problem contains Let M[i, j] represents the least number of multiplications needed to multiply Ai….Aj.
optimal solutions to subproblems. 0, if i = j
M[i, j] = 
Overlapping subproblems: When a large problem is break down into smaller sub- Min{M[i,K] + M[K + 1, j] + Pi−1PKPj } if i < j
problems, then there are many some sub-problems which are the same and repeated
y Bottom-up matrix chain multiplication
many times; these are known as overlapping sub problem.
Time complexity = O(n3)
y Basically, there are two approaches for solving DP problems:
Space Complexity = O(n2).
ο Bottom-up dynamic programming
Top down dynamic programming of matrix chain multiplication
ο Top-down dynamic programming
Time complexity = O(n3)
Bottom-up dynamic programming: In this method, we evaluate the function starting
Space Complexity = O(n2)
with the smallest possible input argument value, and then we step through possible
y Longest common subsequence: Given two strings: string X of length m [X(1...m)], and
values, and slowly increase the input argument value.
string Y of length n [Y(1…n)], find longest common subsequence: The longest sequence
While computing the values, we store all computed values in a table (memory).
of characters that appear left-to-right (but not necessarily in a contiguous block) in both
As larger arguments are evaluated, pre-computed values for smaller arguments can be
strings for example X = “ABCBDAB” and Y = “BDCABA”, the LCS(X,Y) = {“BCBA”, “BDAB”,
used.
“BCAB”}
y Top-down dynamic programming: In this method, the problem is broken into
y Brute force approach of longest common subsequence is O(n2m)
subproblems, and these subproblems are solved, and the solutions are remembered,
y Recursive equation of the LCS is
in case they need to be again solved. Also, we save the each computed value as
the final action of a recursive function, and as the first action we check if the pre- 0 = if i 0= or j 0,

computed value exists. C[i, j]= C[i − 1, j − 1] + 1 if i, j > 0 and xi= y j ,
y Example of dynamic programming 
max ( C[i, j − 1], C[i − 1, j]) , if i, j > 0 and xi ≠ y j
ο Fibonacci series
ο Matrix chain multiplication y Dynamic programming approach of longest common subsequence takes θ(mn) as both
ο Longest common subsequence time complexity and space complexity.
ο Subset sum problem y Multistage graph: It is a directed graph in which the nodes can be divided into a set of
ο 0/1 knapnack stages such that all edges are from one stage to the next stage only, and there will be no
ο Multistage graph edges between vertices of the same stage.
ο Travelling salesman problem Greedy methods fails to find the shortest path from source to destination in multistage
ο All-pairs shortest path Floyd–Warshall graph.
y Fibonacci Series: y Let ‘S’ be the source node and ‘T’ be the target node and let at stage 2 have nodes A, B
Recurrence equation: and C. So, the recursive equation look like as shown below:
Fib(n) = 0, if n = 0
M(N − 1, i) =i→T
= 1, if n = 1
= Fib(n-1) + Fib(n-2), if n>1 S → A + M(2, A)

Solving the fibonacci series by using dynamic programming takes time and space M(1, S) minimum S → B + M(2,B)
=
S → C + M(2, C)
complexity as O(n). 
y For all problems, it may not be possible to find both top-down and bottom-up
programming solution. M is a function which represents the shortest path with minimum cost.
y Time complexity of bottom-up dynamic programming of multistage graph problem is
O(E).

Dynamic Programming 97 98 Dynamic Programming


0/1 knapsack problem: In this problem, ‘n’ distinct objects are there, which are associated y The time complexity of all pairs shortest path Floyd–Warshall in is O(V3).
with two integer value s and v where, s represents the size of the object and v represents (where V = number of vertices)
the value of the object. We need to fill a knapsack of total capacity w with items of y The space complexity of all pairs shortest pat Floyd–Warshall is O(V3), but it can be
maximum value. Hence, we are not allowed to take partial of items. reduced to O(V2). (where V = number of vertices).
Let us say KS(i,w) represent knapsack. Here, i is the ith elements to be considered, and
the capacity of knapsack remaining is w.
The recurrence equation is:

max (Pi + KS(i − 1, w − wi ),KS(i − 1, w) ) ; wi ≤ w



KS(i,=
w) = 0;i 0 or= w 0
 th
KS(i − 1, w); wi > w(wi is the weight of i element)

O(2n )
Time complexity of 0/1 knapsack = minimum  (n is the number of object and
O(nw)
w is the total capacity of the knapsack.)
y Subset sum problem: Given a sequence of n positive numbers A1, A2,….An, give an
algorithm which checks whether there exists a subset of A whose sum of all numbers is
T.
y Brute force method time complexity of subset sum problem is O(2n).
y Let us assume that SS(i,S) denote sum from a1 to ai whose sum is equal to some number
‘S’.
y Recursive equation of subset-sum is given below:

SS(i − 1, S); S < ai



SS(i − 1, S − ai ) VSS(i − 1, S); S ≥ ai
SS(i, S) = 
true; S = 0
False;i
= 0, S ≠ 0

O(2n )
y Time complexity of subset-sum = minimum 
O(nw)
All Pair Shortest Path Floyd Warshall
Problem: Given a weighted directed graph G=(V,E), where V = {1, 2, …,n}. Find the shortest
path between all pair of nodes in the graph.
y The running time of Dijkstra’s algorithm if we use to find all pairs shortest path is O(VElogV).
y The running time of Bellman-Ford algorithm if we use to find all pair shortest path is
O(V2E).
y Let dij(k) be the weight of a shortest path from vertex i to vertex j for which all intermediate
vertices are in the set {1, 2, …,k}
The recurrence equation is:

wij , if k = 0
dij(k) = 
( (k − 1) (k − 1)
min dij , dik + dkj
(k − 1)
) , if k ≥ 1

Dynamic Programming 99 100 Dynamic Programming

You might also like