Algorithm and Design Complexity (CRC Press, 2023)
Computational complexity is central to the analysis of algorithms and to the ability to
select algorithms for efficiency and solvability. Algorithm and Design
Complexity opens with a discussion of algorithm analysis, the time–space trade-off,
asymptotic notations, and so forth. It further covers algorithms that are definite
and effective, known as computational procedures. Further topics explored include
divide-and-conquer, dynamic programming, and backtracking.
Acknowledgments
First and foremost, praise and thanks to God, the Almighty, for the showers of
blessings that helped us in prewriting, researching, drafting, revising, editing, and proof-
reading, and, finally, in completing a successful book to share the knowledge.
“Being deeply loved by someone gives you strength, while loving someone deeply
gives you courage.” Anli Sherine and Dr. Geno Peter would like to express our love
for each other. We wish to thank our sweethearts, Hadriel Peter and Hanne Geona,
for giving us the love and space during the writing process.
Mary Jasmine wishes to thank all her lovable and sweet family members, who
have given her a lot of support and encouragement to complete this book.
Dr. S. Albert Alexander would like to take this opportunity to acknowledge those
people who helped me in completing this book. I am thankful to all my research
scholars and students who are doing their project and research work with me. But the
writing of this book is possible mainly because of the support of my family mem-
bers, parents, and brothers. Most important, I am very grateful to my wife, A. Lincy
Annet, for her constant support during writing. Without her, all these things would
not be possible. I would like to express my special gratitude to my son, A. Albin
Emmanuel, for his smiling face and support; it helped a lot in completing this work.
About the Authors
Anli Sherine graduated with a Bachelor of Technology in information technology
from Anna University, India, and subsequently completed her Master of Engineering
in computer science engineering from Anna University, India. Currently, she works
in the School of Computing and Creative Media of the University of Technology
Sarawak, Malaysia. She is a member of the Malaysian Board of Technologists. Her
research interests include, but are not limited to, cryptography, mobile computing,
and digital image processing.
1 Algorithm Analysis
What Is an Algorithm?
The word algorithm comes from the name of the Persian author Abu Ja'far Muhammad
ibn Musa al-Khwarizmi, who wrote a textbook on mathematics. The word has taken
on a special significance in computer science, where 'algorithm' has come to refer to
a method that can be used by a computer to solve a problem.
Definition: An algorithm is a finite set of instructions that accomplishes a par-
ticular task.
All algorithms must satisfy the following criteria:

1. Input: zero or more quantities are externally supplied.
2. Output: at least one quantity is produced.
3. Definiteness: each instruction is clear and unambiguous.
4. Finiteness: the algorithm terminates after a finite number of steps for all cases.
5. Effectiveness: every instruction is basic enough to be carried out exactly and in finite time.
Example 1
Euclid's algorithm computes gcd(m, n) by repeatedly replacing the pair (m, n) with (n, m mod n). For example,
gcd(60, 24) = gcd(24, 60 mod 24)
= gcd(24, 12)
= gcd(12, 24 mod 12)
= gcd(12, 0)
= 12
Euclid's algorithm can be stated as follows:
Step 1: If n = 0, return the value of m as the answer and stop; otherwise, pro-
ceed to step 2.
Step 2: Divide m by n and assign the value of the remainder to r.
Step 3: Assign the value of n to m and the value of r to n. Go to step 1.
How do we know that Euclid’s algorithm eventually comes to a stop? This follows
from the observation that the second integer of the pair gets smaller with each itera-
tion and that it cannot become negative. Indeed, the new value of n on the next
iteration is m mod n, which is always smaller than n (why?). Hence, the value of the
second integer eventually becomes 0, and the algorithm stops.
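These steps translate directly into code. A minimal C++ sketch of Euclid's algorithm (the function name and the small driver are illustrative, not the book's listing):

#include <iostream>

// Euclid's algorithm: repeatedly replace (m, n) with (n, m mod n)
// until n becomes 0; m is then the gcd.
int gcd(int m, int n) {
    while (n != 0) {        // Step 1: stop when n == 0
        int r = m % n;      // Step 2: remainder of m divided by n
        m = n;              // Step 3: assign n to m
        n = r;              //         and r to n
    }
    return m;               // m is the answer
}

int main() {
    std::cout << gcd(60, 24) << '\n';  // prints 12
    return 0;
}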
So we can check whether t divides both m and n. If it does, t is the answer; if it does
not, we simply decrease t by 1 and try again. We follow this method until it
reaches the answer.
Consider m = 12, n = 8
t = min(12, 8)
We will set a value of t = 8 initially.
Then we go on checking whether m mod t and n mod t are both 0. If not, we repeat
the process, each time decrementing t by 1 and computing m mod t and n mod t.
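A short C++ sketch of this consecutive integer checking method (assuming positive m and n):

// Consecutive integer checking: start t at min(m, n) and
// decrease it until t divides both m and n.
int gcdConsecutive(int m, int n) {
    int t = (m < n) ? m : n;            // t = min(m, n)
    while (m % t != 0 || n % t != 0)
        t--;                            // try the next smaller candidate
    return t;                           // first t dividing both is the gcd
}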
A third way to compute the gcd uses prime factorizations. For example, for gcd(120, 72):
120 = 2 × 2 × 2 × 3 × 5
72 = 2 × 2 × 2 × 3 × 3
gcd(120, 72) = 2 × 2 × 2 × 3 = 24.
The choice of data structure matters for both the design and the analysis of an algorithm. The data
structure and the algorithm work together, and they are interdependent. Hence, choosing the proper data
structure is required before designing the actual algorithm.
Algorithm design techniques: An algorithm design technique is a general
approach to solve problems algorithmically. These problems may belong to different
areas of computing.
Various algorithm design techniques are
• Divide and conquer: In this strategy, the problem is divided into smaller sub-
problems; the subproblems are solved to obtain the solution to the main problem.
• Dynamic programming: The problem is divided into smaller instances,
and the results of smaller reoccurring instances are obtained to solve the
problem.
• Greedy technique: From a set of obtained solutions, the best possible solu-
tion is chosen each time to obtain the final solution.
• Backtracking: In this method, in order to obtain the solution, a trial-and-
error method is followed.
For example:
ALGORITHM Sum(a,b)
//Problem Description: This algorithm performs addition of two numbers
//Input: Two integers a and b
// Output: Addition of two integers
c←a+b
return c.
3. Flowchart: A flowchart is a visual representation of the sequence of
steps for solving a problem. Instead of descriptive steps, we use pictorial
representations of the steps.
• Time efficiency
• Space efficiency
• Simplicity
• Generality
Time efficiency: Time efficiency indicates how fast the algorithm runs.
Space efficiency: Space efficiency indicates how much extra memory the algo-
rithm needs to complete its execution.
Simplicity: Simplicity of an algorithm means generating a sequence of instruc-
tions that are easy to understand.
Generality: Generality refers to the range of problems the algorithm solves and the
range of inputs it accepts.
If you are not satisfied with the algorithm’s efficiency, simplicity, or generality, you
must return to the drawing board and redesign the algorithm.
Study of Algorithms
The study of algorithms includes many important areas, such as
• Debugging
• Profiling

Important problem types include the following:

1. Sorting
2. Searching
3. String processing
4. Graph problems
5. Combinatorial problems
6. Geometric problems
7. Numerical problems
For searching, there is no single algorithm that fits all situations best. Some algo-
rithms work faster than others but require more memory. Some are very fast but
applicable only to sorted arrays. Unlike sorting, searching raises no stability
problem, but different issues arise. Specifically, in applications in which the underlying
data may change frequently relative to the number of searches, searching has to be
considered in conjunction with two other operations: the addition of an item to, and the
deletion of an item from, the data set. In such situations, data structures and algorithms should be
chosen to strike a balance among the requirements of each operation. Also, organiz-
ing very large data sets for efficient searching poses special challenges with impor-
tant implications for real-world applications.
• As the problem size grows, the number of combinatorial objects grows rapidly
and reaches huge values.
• No known algorithm solves most of these problems in an acceptable
amount of time.
• Many of these problems are among the most difficult problems in computing.
6. Geometric problems: Geometric algorithms deal with geometric objects
such as points, lines, and polygons. Today, people are interested in geo-
metric algorithms with quite different applications such as applications to
computer graphics, robotics, and topography.
7. Numerical problems: Numerical problems are problems that involve
mathematical objects of continuous nature, solving equations and sys-
tems of equations, computing definite integrals, evaluating functions,
and so on. A majority of such mathematical problems can be solved only
approximately.
Example
Integer is an ADT. The implementation of an integer may take any one of
several forms, such as unsigned or signed. Instances of this type are used in pro-
grams. In C++, an instance of the data type int is created by declaring
int i;
Here, the programmer simply uses the properties of the int by creating an instance
without seeing how it has been implemented. Therefore, int can be said to be an
INTEGER ADT.
Example
As an example, let us take the data structure stack. If the stack is implemented using
arrays, it is possible to modify any data in the array without going through
the proper rule, LIFO (last in, first out). Figure 1.4 demonstrates this direct
access of data in the non-object-oriented approach: it is an illegal operation, but
no program error occurs. Such direct access is not possible in an object-oriented approach.
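A minimal sketch of the contrast (illustrative, not the book's code): with the array hidden behind a class interface, elements can enter and leave only via push and pop, so the LIFO rule cannot be bypassed.

class Stack {
    int data[100];            // private by default: no direct access
    int top = -1;
public:
    bool push(int x) {
        if (top == 99) return false;   // stack full
        data[++top] = x;
        return true;
    }
    bool pop(int& x) {
        if (top < 0) return false;     // stack empty
        x = data[top--];
        return true;
    }
};
// s.data[5] = 99; would not compile: the illegal direct write of
// Figure 1.4 is rejected at compile time.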
Performance Analysis: The analysis of an algorithm deals with the amount of
time and space it consumes. An efficient algorithm completes its task with a mini-
mum requirement of time and space.
Space Complexity: Most of what we will be discussing is going to be how efficient
various algorithms are in terms of time, but some forms of analysis could be done
based on how much space an algorithm needs to complete its task. This space com-
plexity analysis was critical in the early days of computing when storage space on
a computer (both internal and external) was limited. When considering space com-
plexity, algorithms are divided into those that need extra space to do their work and
those that work in place. It was not unusual for programmers to choose an algorithm
that was slower just because it worked in place, because there was not enough extra
memory for a faster algorithm. The space complexity of an algorithm is the amount
of memory it needs to run to completion.
Looking at software on the market today, it is easy to see that space analysis is not
being done. Programs, even simple ones, regularly quote space needs in a number
of megabytes. Software companies seem to feel that making their software space
efficient is not a consideration because customers who don’t have enough computer
memory can just go out and buy another 32 megabytes (or more) of memory to run
the program or a bigger hard disk to store it. This attitude drives computers into
obsolescence long before they really are obsolete.
1. Instruction space
2. Data space
3. Environment stack space
1. Instruction space: The space needed to store the compiled version of the
program instructions
2. Data space: The space needed to store all constant and variable values
3. Environment stack space: The space needed to store information to resume
execution of partially completed functions
The total space needed by an algorithm can be simply divided into two parts from
the three components of space complexity:
1. Fixed
2. Variable
Fixed Part
A fixed part is independent of the characteristics (e.g., number, size) of the inputs and
outputs. This part typically includes the instruction space (i.e., space for the code),
space for simple variables and fixed-size component variables (also called aggre-
gate), space for constants, and so on.
Variable Part
A variable part that consists of the space needed by component variables whose
size is dependent on the particular problem instance being solved, the space needed
by referenced variables (to the extent that this depends on instance characteris-
tics), and the recursion stack space (insofar as this space depends on the instance
characteristics).
Examples
Find the space complexity of the following algorithms:
For algorithm abc, the problem instance is characterized by the specific values of
a, b, and c. Assume that one word is adequate to store the values of each of a, b, and c,
and the result. The space needed by abc is then independent of the instance char-
acteristics, so Sp(instance characteristics) = 0.
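The algorithm abc itself is not shown in this excerpt; a common textbook version consistent with this analysis (three scalar inputs and a scalar result, so no instance-dependent space) is sketched below. The exact expression computed is an assumption.

// Uses only the parameters a, b, c and the return value: one word each,
// so the space requirement is a constant and Sp(instance characteristics) = 0.
float abc(float a, float b, float c) {
    return a + b + b * c + (a + b - c) / (a + b) + 4.0f;
}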
Time Complexity
The time complexity of an algorithm is the amount of computer time it needs to run
to completion.
The time T(P) taken by a program P is the sum of the compile time and the run (or
execution) time. The compile time does not depend on the instance characteristics.
A compiled program will run several times without recompilation. The run time is
denoted by tp (instance characteristics).
Because many of the factors tp depends on are not known at the time a program
is conceived, it is reasonable to attempt only to estimate tp. If we knew the charac-
teristics of the compiler to be used, we could proceed to determine the number of
additions, subtractions, multiplications, divisions, compares, loads, stores, and so on
that would be made by the code for P. We could then obtain an expression for tp(n) of
the form

tp(n) = ca·ADD(n) + cs·SUB(n) + cm·MUL(n) + cd·DIV(n) + · · · ,

where n denotes the instance characteristics; ca, cs, cm, cd, and so on denote the times for an addition, subtraction, multiplication, and division; and ADD, SUB, MUL, DIV are functions whose values are the numbers of additions, subtractions, multiplications, and divisions performed when the code for P is run on an instance with characteristic n.
a. Comments—zero step
b. Assignment statement—one step
c. Iterative statement—finite number of steps (for, while, repeat–until)
Method 1
A new global variable count is assigned with an initial value of zero. Next, intro-
duce into the program statements to increment the count by the appropriate amount.
Therefore, each time a statement in the original program or function is executed, the
count is incremented by the step count of that statement.
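As an illustration, Method 1 applied to a function that sums n array elements might look like the following sketch (the function and the placement of the increments are typical textbook choices, not reproduced from this excerpt):

int count = 0;   // global step counter

float Sum(float a[], int n) {
    float s = 0.0f;
    count++;                       // for the assignment s = 0.0
    for (int i = 0; i < n; i++) {
        count++;                   // for each test of the for loop
        s += a[i];
        count++;                   // for the addition/assignment
    }
    count++;                       // for the final (failing) loop test
    count++;                       // for the return
    return s;
}
// After Sum runs, count has increased by 2n + 3.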
Method 2
To determine the step count of an algorithm, a table is built in which we list the total
number of steps contributed by each statement. The final total step count is obtained
by combining three quantities for each statement: the steps per execution (s/e), the
frequency (the number of times the statement is executed), and their product, the total steps.
Step Count
The change in the value of count by the time this program terminates is the number
of steps executed by the algorithm sum.
• The first method is by taking two arrays, one for the input and the other for
the output.
• Now, read the elements of the first array in reverse linear order and place
them in the second array linearly from the beginning.
FIGURE 1.5 (a) Reversing array elements using two arrays. (b) Reversing array elements
using the swap method.
for (int i = 0; i < n; i++)
    ary2[i] = ary1[(n - 1) - i];
• Another approach is to just swap the first and last elements, then swap the
next two immediate elements as one from each end, and so on.
• This process is repeated until all the elements of the array get swapped.
• Here, no extra array is used. The input array alone gets modified into the
output array.
int ary1[n];
int k = n / 2;   // floor(n/2)
for (int i = 0; i < k; i++)
    swap(&ary1[i], &ary1[(n - 1) - i]);
In the first method, an extra array of size n, if n is the size of the input array, is used.
The output is obtained by simply assigning values of ary1 into ary2, in reverse order.
Therefore, the total space required for the first algorithm is 2n.
Let us now state the time complexity based on the number of assignment
statements.
• Inside the for loop, there are n assignment statements, and hence, the time
complexity of this algorithm is n units of time.
• In the second method, only one array of size n, and a temporary variable
temp, inside the swap function, is used.
• For each swap, three assignments are required, but the number of swaps
needed is at most half the size of the array.
Big-oh notation (O): This notation is used to define the worst-case running time
of an algorithm and is concerned with very large values of n.
Definition
Let f and g be two functions defined from the set of natural numbers to the set of nonneg-
ative real numbers; that is, f, g: N → R≥0. We say that f(n) = O(g(n))
(read as 'f of n is big oh of g of n') if there exist two positive constants C ∈ R and
n0 ∈ N such that f(n) ≤ C·g(n) for all n ≥ n0. That is, f(n) grows no faster than g(n);
g(n) is an upper bound. The big-oh notation provides an upper bound for the function
f. The definition is illustrated in Figure 1.6, where, for the sake of visual clarity, n is
extended to be a real number.
Example 1
Consider f(n) = 3n + 2. For n ≥ 2, 3n + 2 ≤ 3n + n = 4n.
So f(n) = O(n);
that is, f(n) ≤ C·g(n) with C = 4 and n0 = 2.
Example 2
Consider f(n) = 10n² + 4n + 2. For n ≥ 5, 4n + 2 ≤ n², so 10n² + 4n + 2 ≤ 11n².
Hence f(n) = O(n²), with C = 11 and n0 = 5.
Big-omega notation (Ω): This notation is used to describe the best-case running time
of algorithms and is concerned with very large values of n.
Definition
Let f and g be two functions defined from the set of natural numbers to the set of nonneg-
ative real numbers; that is, f, g: N → R≥0. We say that f(n) = Ω(g(n))
(read as 'f of n is omega of g of n') if there exist two positive constants C ∈ R and
n0 ∈ N such that f(n) ≥ C·g(n) for all n ≥ n0. The definition is illustrated in Figure 1.7.
The Omega notation provides the lower bound for the function f.
Example 1
f(n) = 3n + 2 for all n, 3n + 2 > 3n.
So f(n) = Ω(n).
Example 2
f(n) = 10n² + 4n + 2. For all n, 10n² + 4n + 2 > 10n², so f(n) = Ω(n²).
Weaker lower bounds also hold: 3n + 2 > 1, so 3n + 2 = Ω(1); and
10n² + 4n + 2 > n, so 10n² + 4n + 2 = Ω(n).
Big-theta notation (θ): This notation is used to define the exact order of growth, in which
the function f(n) is bounded both above and below by another function g(n).

Definition
Let f and g be two functions defined from the set of natural numbers to the set of non-
negative real numbers; that is, f, g: N → R≥0. We say that f(n) = θ(g(n))
(read as 'f of n is theta of g of n') if there exist three positive constants C1,
C2 ∈ R and n0 ∈ N such that C1·g(n) ≤ f(n) ≤ C2·g(n) for all n ≥ n0. The definition is
illustrated in Figure 1.8.
Example
Consider f(n) = 3n + 2. For n ≥ 2,
3n ≤ 3n + 2 ≤ 4n,
which matches C1·g(n) ≤ f(n) ≤ C2·g(n) with C1 = 3, C2 = 4, and g(n) = n.
Therefore, 3n + 2 = θ(n).
Little-oh notation (o): This notation describes an upper bound that is not asymptotically
tight; f grows strictly slower than g. The function f(n) = o(g(n)) (read as 'f of n is little oh of g of n') iff

lim(n→∞) f(n)/g(n) = 0.
Little-omega notation (ω): This notation describes a lower bound that is not asymptotically
tight; f grows strictly faster than g. The function f(n) = ω(g(n)) iff

lim(n→∞) g(n)/f(n) = 0.
Statement 2
f(n) = O(g(n)) and g(n) ≤ h(n) implies f(n) = O(h(n)).

Statement 3
Any function is big oh of itself; that is, f(n) = O(f(n)).
Proof
Proof of this property trivially follows from the fact that f(n) ≤ 1 × f(n).
Statement 4
Any constant value is equivalent to O(1); that is, C = O(1), where C is a constant.
The asymptotic notations are defined only in terms of the size of the input.
For example, the size of the input for sorting n numbers is n. So the constants cannot
be applied blindly to the stated facts to obtain the results. For instance, suppose we
have the summation 1² + 2² + · · · + n². Using the property explained
in Statement 1 directly on each term would suggest O(n²), since no term exceeds n².
Actually, the asymptotic value of the sum is O(n³) and is obtained as follows:

1² + 2² + · · · + n² = n(n + 1)(2n + 1)/6 = O(n³).
Statement 5
If lim(n→∞) f(n)/g(n) = c for some constant c > 0, then f(n) = θ(g(n)).

Statement 6
If lim(n→∞) f(n)/g(n) = 0, then f(n) = O(g(n)) but f(n) ≠ θ(g(n)).
That is, O(f(n)) ⊂ O(g(n)).

Statement 7
If lim(n→∞) f(n)/g(n) = ∞, then f(n) = Ω(g(n)) but f(n) ≠ θ(g(n)).
That is, O(g(n)) ⊂ O(f(n)).

Sometimes L'Hospital's rule is useful in obtaining the limit value. The rule states
that

lim(n→∞) f(n)/g(n) = lim(n→∞) f′(n)/g′(n),

where f′ and g′ are the derivatives of f and g.
Problem
Prove that log n = O(√n) but √n ≠ O(log n).

Proof
lim(n→∞) (log n)/√n = lim(n→∞) (1/n) / (1/(2√n))   [by L'Hospital's rule]
                    = lim(n→∞) 2√n/n
                    = lim(n→∞) 2/√n
                    = 0

Since the limit is 0, by Statement 6, log n = O(√n) but √n ≠ O(log n).
f(2^k) = 2f(2^(k−1)) + 2^k
= 2²f(2^(k−2)) + 2^k + 2^k
= 2²f(2^(k−2)) + 2(2^k)
.............
.............
= 2^k·f(1) + k·2^k,
f(n) ≤ C·g(n) for all n ≥ n0 that are powers of 2. We know that f(n) = 2f(n/2) + n = O(n log n | n is a power of 2),
which is a conditional asymptotic bound; that is, f(n) = O(n log n) for all powers of 2 with n ≥ n0, for some
n0 ∈ N.
A function f: N → R≥0 is eventually nondecreasing if there exists n0 ∈ N such that f(n) ≤
f(n + 1) for all n ≥ n0.
Theorem
Let p ≥ 2 be an integer. Let f, g: N → R ≥ 0. Also, f be an eventually nondecreasing
function and g be a p-smooth function. If f(n) ∈ O(g(n)| n is a power of p), then f(n)
∈ O(g(n)).
Proof
Apply the theorem for proving f(n) = O(n log n), ∀n.
To prove this, we have to prove that f(n) is eventually nondecreasing and n log n is
2-smooth.
Claim 1: f(n) is eventually nondecreasing.
Using mathematical induction, the proof is as follows
f(n) = 2 f(n/2) + n
≤ 2f((n + 1)/2) + (n + 1)
= f(n + 1).
T(n) = { c,                  if n ≤ n0
       { a·T(f(n)) + g(n),   otherwise.
The constants, which are bis, are now converted to ais for simplicity.
Let us denote T(i) as xi. Hence, the equation becomes
a0·x_n + a1·x_(n−1) + · · · + ak·x_(n−k) = 0. (2)

Substituting x^n for x_n and dividing by x^(n−k) gives

a0·x^k + a1·x^(k−1) + · · · + ak = 0.
The preceding equation is known as a characteristic equation and can have k roots.
Let the roots be r1, r2, . . . . . , rk. The roots may not be the same.
x2 – 5x + 6 = 0
⇒ (x − 3)(x − 2) = 0
⇒ the roots are 3 and 2.
Case 2: Suppose p of the k roots are equal and the remaining are distinct.
Let us assume that the first p roots are equal to r1. Then the general solution can
be stated as
T(n) = Σ(i=1..p) ci·n^(i−1)·r1^n + Σ(i=p+1..k) ci·ri^n.
Case 2 can be extended to cases like p roots are equal to r1, q roots are equal to r2,
and so on.
where ai’s and bi’s are constants. Each Pi(n) is a polynomial in n of degree di.
Here the characteristic equation tends to be
(x − b1)^(d1+1) = 0
(x − b2)^(d2+1) = 0
. . . . . . . . . . . . . . .
These are a set of homogeneous equations, and hence, the general solution can be
obtained as before.
Let the general solutions for the earlier equations be S1, S2, . . . Then the general
solution of the inhomogeneous equation is given by
T(n) = S1 + S2 + . . .
Example
The recurrence equation for the complexity of the merge-sort algorithm is
f(n) = 2f(n/2) + n, with f(1) = 1.
Let n = 2^k. For simplicity, we write f(2^k) = t_k. Therefore, t_k = 2t_(k−1) + 2^k, and the characteristic equation is
(x − 2) (x − 2) = 0
⇒ (x − 2)2 = 0
t_k = c1·2^k + c2·k·2^k
⇒ f(n) = c1·n + c2·n·log n (3)
Given that f(1) = 1, equation (3) gives c1 = 1. Similarly, f(2) = 2f(1) + 2 = 2 + 2 = 4;
substituting n = 2 into (3) gives 2c1 + 2c2 = 4, so c2 = 1. Hence
f(n) = n + n log n
= O (n log n| n is a power of 2).
We have already proved that n log n is 2-smooth and that f(n) is eventually
nondecreasing.
So we say f(n) = O(n log n).
Recurrences can also be solved by the following methods:

• Substitution
• Iteration
• Master
Substitution Method
In the substitution method, make intelligent guesswork about the solution of a recur-
rence and then prove by induction that the guess is correct.
Given a recurrence relation T(n), we expand it by repeated substitution until a pattern emerges and then verify the guess by induction.

Example
Consider the following recurrence (from a simple multiplication algorithm), defined for all n > 1:
T(n) = T(n − 1) + d
T(n) = T(n − 1) + d
T(n − 1) = T(n − 2) + d
.
.
T(2) = T(1) + d
T(1) = c.
Repeated substitution
T(n) = T(n − 1) + d
= (T(n − 2) + d) + d
= T(n − 2) + 2d
= (T(n − 3) + d) + 2d
= T(n − 3) + 3d
T(n) = T(n − i) + i·d

With i = n − 1, this gives T(n) = T(1) + (n − 1)d = c + (n − 1)d = θ(n).
Iteration Method
In the iteration method, the basic idea is to expand the recurrence and express it as a summation of terms that depend only on n and the initial condition.

Example
Solve T(n) = T(n − 1) + 1, with T(1) constant.

Solution
T(n) = T(n − 1) + 1
T(n − 1) = T (n − 2) + 1
T(n) = (T(n − 2) + 1) + 1
= T(n − 2) + 2
T(n − 2) = T(n − 3) + 1
Thus, T(n) = (T(n − 3) + 1) + 2
= T(n − 3) + 3
T(n) = T(n − k) + k
k=n−1
T(n − k) = T(n − (n − 1))
= T(n − n + 1)
= T(1)
T(n) = T(1) + k
= T(1) + n − 1
T(n) = θ(n)
Master Method
The master method is used for solving the following type of recurrence:
In the preceding recurrence, the problem is divided into a subproblems, each of size
at most n/b. The subproblems are solved recursively, each in T(n/b) time. The cost of
splitting the problem or combining the solutions of the subproblems is given by the function
f(n). It should be noted that the number of leaves in the recursion tree is n^E, with
E = log a/log b = log_b a.
Master Theorem
Let T(n) be defined on the nonnegative integers by the recurrence T(n) = aT(n/b) + f(n),
where a ≥ 1 and b > 1 are constants, f(n) is a function, and n/b can be interpreted
as ⌊n/b⌋. Let E = log_b a. Then T(n) can be bounded asymptotically as follows:

Case 1: If f(n) = O(n^(E−ε)) for some constant ε > 0, then T(n) = θ(n^E).
Case 2: If f(n) = θ(n^E), then T(n) = θ(n^E·log n).
Case 3: If f(n) = Ω(n^(E+ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = θ(f(n)).
Example
Consider the following recurrence: T(n) = 4T(n/2) + n.
Using the master method,
a = 4, b = 2, and f(n) = n.
n^E = n^(log a/log b) = n^(log_b a) = n^(log₂4) = n².
Since f(n) = n = O(n^(2−ε)) for ε = 1, Case 1 applies, and T(n) = θ(n²).
Proof
The characteristic equation is (x − 2)(x − 3) = 0.
Therefore, the roots are 2 and 3.
Proof
The characteristic equation is (x − 2)(x − 1) = 0.
Therefore the roots are 2 and 1.
Proof
The characteristic equation is (x − 2)(x − 2)²(x − 1)³ = 0; that is,
(x − 2)³(x − 1)³ = 0.
Best-Case Analysis
• The best-case efficiency of an algorithm is its efficiency for the best-case
input of size n, which is an input of size n for which the algorithm runs fast-
est among all possible inputs of that size.
• Determine the kind of inputs for which the count c(n) will be the smallest
among all the possible inputs of size n.
• This gives the minimum computed time of the algorithm with respect to all
instances from the respective domain.
Mathematically, it can be stated as
B(n) = min{T(I) | I ∈ Dn}.
• If the best-case efficiency of an algorithm is unsatisfactory, we can immedi-
ately discard it without further analysis.
Worst-Case Analysis
• The worst-case efficiency of an algorithm is its efficiency for the worst-case
input of size n, which is an input of size n for which the algorithm runs the
longest among all possible inputs of that size.
• Analyze the algorithm to see what kind of inputs yield the largest value of
basic operation’s count c(n) among all the possible inputs of size n and then
compute this worst-case value Cworst(n).
• This gives the maximum computation time of the algorithm with respect to
all instances from the respective domain.
• Mathematically, it can be stated as
• W(n) = max{T(I) | I ∈ Dn}.
Average-Case Analysis
• Neither the worst-case analysis nor its best-case counterpart yields the
necessary information about an algorithm's behavior on a 'typical' or 'ran-
dom' input.
• To analyze the algorithm’s average-case efficiency, make some assumptions
about possible inputs of size n.
• Mathematically, the average case value can be stated as
A ana a a P a I a T a I a.
I aDn
Example
• Take an array of some elements as shown in Figure 1.9. The array is to be
searched for the existence of an element x.
• If the element x is in the array, the corresponding location of the array
should be returned as an output.
• The element may or may not be in the array. If the element is in the array,
it may be in any location, that is, in the first location, the second location,
anywhere in the middle, or at the end of the array.
• To find the location of the element x, we use the following algorithm, which
is said to be the linear search algorithm.
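The linear search algorithm itself is not reproduced in this excerpt; a minimal C++ sketch matching the description (sequential scan returning the location of x, or −1 on failure):

// Scan locations 0 .. n-1 in order; return the index of x,
// or -1 if x is not in the array.
int linearSearch(const int a[], int n, int x) {
    for (int i = 0; i < n; i++)
        if (a[i] == x)
            return i;   // found: i + 1 comparisons were made
    return -1;          // unsuccessful: n comparisons were made
}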
TABLE 1.1
Number of Comparisons to Find the Element of the Array
Location of the element Number of comparisons required
0 1
1 2
2 3
. .
. .
. .
n–1 n
Not in the array n
• The searching process done here is sequential as one location after the other
is searched beginning from the first location to the end of the array, and
hence, it is named a linear search.
• The program terminates as soon as it finds the element or fails to find the
element in the array after searching the whole array.
• The return value for the successful search is the location of the element x,
and for an unsuccessful search, it is −1.
• In this program, the comparison is made between the array element and the
element that is to be searched for.
Table 1.1 provides information on the number of comparisons required to find the
element present in various locations of the array.
Best Case
The best case arises when the searching element appears in the first location (0).
B(n) = min{1, 2, . . . . . . . , n}
=1
= O(1)
Worst Case
In the worst case, the element could either be in the last location or could not be in
the array.
W(n) = max{1, 2, . . . , n}
=n
= O(n)
Average Case
Let k be the probability of x being in the array. If x is in the array, it is equally
likely to be in any of the n locations, so the probability of its being at location i is k/n,
and a search that finds it there takes i + 1 comparisons. An unsuccessful search takes n
comparisons and occurs with probability 1 − k. Therefore, the average number of comparisons is

A(n) = Σ(i=0..n−1) P(I_i)·T(I_i) + (1 − k)·n
     = Σ(i=0..n−1) (k/n)(i + 1) + (1 − k)·n
     = (k/n)·(n(n + 1)/2) + (1 − k)·n
     = k(n + 1)/2 + (1 − k)·n

For example, with k = 1/2, A(n) = (n + 1)/4 + n/2 = (3n + 1)/4 = O(n).
Loop Controls
The loop control statements in C++ can be classified into two categories, exit-control
and entry-control, named for the place where the condition is verified. As
shown in Figure 1.10, the do-while loop comes under exit control, as the condition is
verified at the end of the loop in each iteration. Similarly, the while loop and the for loop
come under the category of entry control, as the condition is verified at the begin-
ning of each iteration.
TABLE 1.2
Computation Time for Linear Searching
Algorithm Best case Worst case Average case
Linear Search O(1) O(n) O(n)
Let us see the asymptotic running time of the following programming statements.
int count = 0;
for (int i = 0; i < n; i++)
    count++;
Here, the rule states that the running time of the for loop = max {running time of the
statements inside the for loop} × number of iterations.
Therefore, the ‘for’ loop in the preceding code takes O(n) time. This set of state-
ments can be written using either do-while or while loop, which also takes the same
complexity:
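For instance, a while-loop version with the same O(n) behavior (a minimal sketch):

int count = 0;
int i = 0;
while (i < n) {   // entry-controlled: condition checked before each pass
    count++;
    i++;
}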
Suppose one for loop lies inside another for loop as in the following example:
int count = 0;
for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
        count++;
Here, the maximum running time of the statements inside the outer loop is O(m).
Therefore, the total time needed to run this code is O(nm).
Another rule says that in the case of a sequence of instructions, we should add the
computation time for each statement in that sequence.
Example
int count = 0;
for (int i = 0; i < n; i++)
    count++;
for (int j = 0; j < m; j++)
    count--;
O(n + m) = O(max{m,n})
if (condition)
    statement1;
else
    statement2;

Here, the running time is the time to evaluate the condition plus the larger of the running times of statement1 and statement2.
M(n) = 2M(n − 1) + 1
M(n − 1) = 2M(n − 2) + 1
Thus, M(n) = 2[2M(n − 2) + 1] + 1
= 2²M(n − 2) + 2 + 1
M(n − 2) = 2M(n − 3) + 1
Thus, M(n) = 2²[2M(n − 3) + 1] + 2 + 1
= 2³M(n − 3) + 2² + 2 + 1.
In general, M(n) = 2^i·M(n − i) + 2^i − 1. If i = n − 1, then M(n) = 2^(n−1)·M(1) + 2^(n−1) − 1 = 2^n − 1, taking M(1) = 1.
Example
Consider the matrix multiplication program.
An input’s sizes by matrix order n consider multiplication as the algorithm’s basic
operation.
Let us set up a sum for the total number of multiplications m(n) executed by the
algorithm.
M(n) = Σ(i=0..n−1) Σ(j=0..n−1) Σ(k=0..n−1) 1
     = Σ(i=0..n−1) Σ(j=0..n−1) n
     = Σ(i=0..n−1) n²
     = n³
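The program being counted is the standard triple-loop multiplication; a C++ sketch (names are illustrative):

#include <vector>
using Matrix = std::vector<std::vector<double>>;

// C = A * B for n x n matrices. The innermost multiplication executes
// exactly n * n * n times, matching the count M(n) = n^3 derived above.
Matrix matMul(const Matrix& A, const Matrix& B) {
    int n = A.size();
    Matrix C(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                C[i][j] += A[i][k] * B[k][j];   // basic operation
    return C;
}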
2 Divide and Conquer
Divide: Break the problem into several subproblems that are similar to the
original problem but smaller in size.
Conquer: Solve the subproblems recursively.
Combine: Combine the solutions of the subproblems to create the solution to
the original problem.
general method
Given a function with n inputs, the divide-and-conquer strategy splits the input into k dis-
tinct subsets, 1 < k ≤ n, which yields k subproblems. If the subproblems are still relatively
large, then the divide-and-conquer strategy can possibly be reapplied. Now smaller and
smaller subproblems of the same kind are generated until the subproblems that are small
enough to be solved without splitting are produced. These subproblems must be solved,
and then a method must be found to combine sub-solutions into a solution of the whole.
Type DAndC(P)
{
    if Small(P) return S(P);
    else {
        divide P into smaller instances P1, P2, . . . , Pk, k ≥ 1;
        apply DAndC to each of these subproblems;
        return Combine(DAndC(P1), DAndC(P2), . . . , DAndC(Pk));
    }
}
T(n) = { g(n),                                  n small
       { T(n1) + T(n2) + · · · + T(nk) + f(n),  otherwise,

where T(n) is the time for DAndC on any input of size n,
g(n) is the time to compute the answer directly for small inputs, and
f(n) is the time for dividing P and combining the solutions to subproblems.
When the subproblems are all of size n/b, the recurrence takes the form

T(n) = { T(1),              n = 1
       { a·T(n/b) + f(n),   n > 1.
• In step 1, we divide the original instance into two or more smaller instances.
Let us divide our 16-coin instance into two 8-coin instances by arbitrarily
selecting 8 coins for the first instance (say A) and the remaining 8 coins for
the second instance (say B).
• In step 2, we need to determine whether A or B has a counterfeit coin. For
this step, we use our machine to compare the weights of the coin sets A and
B. If both sets have the same weight, a counterfeit coin is not present in the
16-coin set. If A and B have different weights, a counterfeit coin is present,
and it is in the lighter set.
• Finally, in step 3 we take the results from step 2 and generate the answer
for the original 16-coin instance. For the counterfeit-coin problem, step 3 is
easy. The 16-coin instance has a counterfeit coin iff either A or B has one. So
with just one weight comparison, we can complete the task of determining
the presence of a counterfeit coin.
Solution
• Now suppose we need to identify the counterfeit coin. The 16-coin instance
is a large instance. So it is divided into two 8-coin instances A and B as
earlier.
• By comparing the weights of these two instances, we determine whether a
counterfeit coin is present. If not, the algorithm terminates. Otherwise, we
continue with the sub-instance known to have the counterfeit coin.
• Suppose B is the lighter set. It is divided into two sets of four coins each.
Call these sets B1 and B2. The two sets are compared. One set of coins must
be lighter.
• If B1 is lighter, the counterfeit coin is in B1, and B1 is divided into two sets
of two coins each. Call these sets B1a and B1b. The two sets are compared,
and we continue with the lighter set.
• Since the lighter set has only two coins, it is a small instance. Comparing
the weights of the two coins in the lighter set, we can determine which is
lighter. The lighter one is the counterfeit coin.
Example 1
Consider a = 2, b = 2, T(1) = 2, and f(n) = n; that is, T(n) = 2T(n/2) + n. Repeated substitution gives T(n) = 2n + n log n = O(n log n).
Example 2
a = 1, b = 2, f(n) = cn, c = constant

T(n) = T(n/2) + cn
     = T(n/4) + (3/2)cn
     = T(n/8) + (7/4)cn
     .
     .
     = T(n/2^i) + ((2^i − 1)/2^(i−1))·cn
For recurrences of the form T(N) = aT(N/b) + θ(N^k),

T(N) = { θ(N^(log_b a)),   if a > b^k
       { θ(N^k·log N),     if a = b^k
       { θ(N^k),           if a < b^k.
This theorem can be used to determine the running time of most divide-and-conquer
algorithms.
Example
Find the solution for the following recurrences:

1. T(N) = 4T(N/2) + N, T(1) = 1
Here a = 4, b = 2, k = 1, so a > b^k.
T(N) = θ(N^(log₂4)) = θ(N²)
2. T(n) = 4T(n/2) + n², T(1) = 1
Compare with T(n) = aT(n/b) + f(n):
a = 4, b = 2, f(n) = n² = θ(n^k), so k = 2.
b^k = 2² = 4, so a = b^k.
T(N) = θ(N^k·log N)
T(N) = θ(N²·log N)
Divide-and-Conquer Examples
• Binary search
• Sorting: merge sort and quick sort
• Binary tree traversals
• Matrix multiplication: the Strassen algorithm
• Closest-pair algorithm
• Convex-hull algorithm
• Tower of Hanoi
• Exponentiation
Let P = (n, ai, ai+1, . . . , al, x) denote an arbitrary instance of this search problem.
Here n = the number of elements in the list; ai, ai+1, . . . , al = the list of elements; and x = the element to be searched for.
Divide and conquer can be used to solve this problem.
Case 1
Let Small(P) be true if n = 1.
In this case, S(P) will take the value i if x = ai.
Otherwise, it will take the value 0. Then g(n) = θ(1).
Case 2
If P has more than one element, it can be divided (or reduced) into new subproblems
as follows. Pick an index q in the range [i, l] and compare x with a[q].
There are three possibilities:

1. x = a[q]: the problem P is immediately solved.
2. x < a[q]: x has to be searched for only in the sublist ai, ai+1, . . . , aq−1, so P reduces to (q − i, ai, . . . , aq−1, x).
3. x > a[q]: x has to be searched for only in the sublist aq+1, . . . , al, so P reduces to (l − q, aq+1, . . . , al, x).
Any given problem P gets divided (reduced) into one new subproblem. After a com-
parison with a[q], the instance remaining to be solved (if any) can be solved by using
this divide-and-conquer scheme again. If q is always chosen such that a[q] is the
middle element (i.e., q = ⌊(n + 1)/2⌋, or q = ⌊(first + last)/2⌋), then the resulting search
algorithm is known as binary search. The answer to the new subproblem is also
the answer to the original problem P, so there is no need for any combining.
Example
Let us select the 14 entries −15, −6, 0, 7, 9, 23, 54, 82, 101, 112, 125, 131, 142, and
151; place them in a [1: 14] and simulate the steps that BinSearch goes through as it
searches for different values of x. Only the variables low, high, and mid need to be
traced when we simulate the algorithm.
We try the following values for x: 151, −14, and 9 for two successful searches and
one unsuccessful search. The following table shows the traces of BinSearch on these
three inputs.
Do likewise for all the numbers in the list and find the total number of comparisons.
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Element −15 −6 0 7 9 23 54 82 101 112 125 131 142 151
Comparisons 3 4 2 4 3 4 1 4 3 4 2 4 3 4
Example 2
Suppose k = 18 and we have the following array:
10, 12, 13, 14, 18, 20, 25, 27, 30, 35, 40, 45, 47
↑
Middle term
Solution
1. To obtain the middle element, we have to apply the formula:
Mid = (low + high)/2
Mid = (0 + 12)/2 = 6
Mid = 6
2. Check a[6] = 25 against the key. Since k < 25, we divide the array and
search the left sub-array.
10, 12, 13, 14, 18, 20
↑
Middle term
3. Conquer the sub-array by determining whether K is in the sub-array. This is
accomplished by recursively dividing the sub-array. K is present in the left
sub-array.
4. Compare K = 18 with the middle element 13. Since K > 13, K is present in the right
sub-array.
14, 18, 20
↑
Middle term
5. Compare K with the middle element, K = 18. Yes. The element is present in
the array.
Here the key comparison is done with half the size of the array.
Let T(n) be the worst-case number of comparisons used by procedure search on an
array of n numbers. Suppose n is the power of 2:
T(n) = { 0,            if n = 1
       { T(n/2) + 1,   otherwise
T(n) = T(n/2) + 1
= (T(n/4) + 1) + 1
= T(n/4) + 2
= (T(n/8) + 1) + 2
= T(n/8) + 3
.
.
= T(n/2i) + i
.
= T(1) + log₂n (i.e., i = log₂n)
= 0 + log₂n
Tw(n) = log₂n
T(n) = θ(log n)
Average-Case Analysis
The number of key comparisons in an average case is slightly smaller than that in the
worst case. To obtain the average-case efficiency of a binary search, we will consider
some samples of input n.
C = log₂n + 1.
For instance, if n = 2, then C = log₂2 + 1 = 1 + 1 = 2. Similarly, if n = 16, then C = log₂16 + 1 = 4 + 1 = 5.
Tavg(n) = log₂n + 1
T(n) = θ(log₂n).
Best-Case Analysis
The number of comparisons used by procedure search on an array of n numbers is
only one.
Tbest(n) = 1
T(n) = θ(1)
Theorem
Function BinSearch(a, n, x) works correctly.
Proof
int BinSearch(Type a[], int n, Type x)
// Given an array a[1:n] of elements in nondecreasing order, n >= 0, determine
// whether x is present; if so, return j such that x == a[j]; else return 0.
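The body of BinSearch is not reproduced in this excerpt; the following is a minimal sketch consistent with the correctness argument below (Type is made a template parameter here only so the fragment compiles on its own):

template <class Type>
int BinSearch(Type a[], int n, Type x)
{
    int low = 1, high = n;                   // search range within a[1:n]
    while (low <= high) {
        int mid = (low + high) / 2;
        if (x < a[mid]) high = mid - 1;      // continue in the left half
        else if (x > a[mid]) low = mid + 1;  // continue in the right half
        else return mid;                     // x found at position mid
    }
    return 0;                                // x is not present
}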
We assume that all statements work as expected and that comparisons such as x > a
[mid] are appropriately carried out.
Initially, low = 1, high = n, n >= 0, and a[1] <= a[2] <= . . . <= a[n].
If n==0, then the while loop is not entered, and 0 is returned. Otherwise, we observe
that each time through the loop the possible elements to be checked for equality with
x are a[low], a[low + 1] , . . . , a [mid] , . . . , a [high].
If x == a[mid], then the algorithm terminates successfully. Otherwise, the range is
narrowed by either increasing low to mid + 1 or decreasing high to mid − 1. Clearly
this narrowing of the range does not affect the outcome of the search. If low becomes
greater than high, then x is not present, and hence, the loop is exited.
Suppose we begin by determining the time for BinSearch on the previous data
set. We concentrate on comparisons between x and the elements in a [], recognizing
that the frequency count of all other operations is of the same order as that for these
comparisons. Comparisons between x and elements of a [] are referred to as element
comparisons. The number of element comparisons needed to find each of the 14
elements is
Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Element −15 −6 0 7 9 23 54 82 101 112 125 131 142 151
Comparisons 3 4 2 4 3 4 1 4 3 4 2 4 3 4
On unsuccessful search, there are 15 possible ways that an unsuccessful search may
terminate depending on the value x.
• If x < a [1], then the algorithm needs three element comparisons to deter-
mine that x is not present.
• For all the remaining possibilities, BinSearch requires four element
comparisons.
• Thus the average number of comparisons for an unsuccessful search is
= (3 + [14 * 4])/15 = 59/15 = 3.93.
Theorem
If n is in the range [2k−1, 2k], then BinSearch makes at most k element compari-
sons for a successful search and either k − 1 or k comparisons for an unsuccess-
ful search. (In other words, the time for a successful search is O(log n) and for
an unsuccessful search is θ (log n)).
Proof
Consider the binary decision tree describing the action of BinSearch on n elements.
All successful searches end at a circular node whereas all unsuccessful searches end
at a square node.
If 2^(k−1) ≤ n < 2^k, then all circular nodes are at levels 1, 2, . . . , k, whereas all square
nodes are at levels k and k + 1 (note that the root is at level 1). The number of element
comparisons needed to terminate at a circular node on level i is i, whereas the number
of element comparisons needed to terminate at a square node at level i is only i − 1.
This theorem states the worst-case time for binary search. To determine the aver-
age behavior, we need to look more closely at the binary decision tree and equate its
size to the number of element comparisons in the algorithm.
The distance of a node from the root is one less than its level. The internal path
length I is the sum of the distances of all internal nodes from the root. The external
path length E is the sum of the distances of all external nodes from the root.
It is easy to show by induction that for any binary tree with n internal nodes, E and
I are related by the formula
E = I + 2n.
It turns out that there is a simple relationship between E, I, and the average num-
ber of comparisons in binary search.
Let As (n) be the average number of comparisons in a successful search, and Au
(n) the average number of comparisons in an unsuccessful search. The number of
comparisons needed to find an element represented by an internal node is one more
than the distance of this node from the root. Hence,
As(n) = 1 + I/n.
The number of comparisons on any path from the root to an external node is equal
to the distance between the root and the external node. Since every binary tree with
n internal nodes has n + 1 external nodes, it follows that
Au(n) = E/(n + 1).
Using these three formulas for E, As(n), and Au(n), we find that

Au(n) = E/(n + 1)
      = (I + 2n)/(n + 1)
      = (n(As(n) − 1) + 2n)/(n + 1)
      = (n·As(n) + n)/(n + 1),

so that

1 + As(n) = (n + 1)·Au(n)/n

and

As(n) = (1 + 1/n)·Au(n) − 1.
In the straightforward algorithm for finding the maximum and minimum, the best case occurs when the elements are in increasing order: the number
of element comparisons is n − 1. The worst case occurs when the elements are in
decreasing order; in this case, the number of element comparisons is 2(n − 1).
Let P = (n, a[i], . . . , a[j]) denote an arbitrary instance of the problem. Here n is the
number of elements in the list a[i], . . . , a[j]. We are interested in finding the maxi-
mum and minimum of this list.
• Let Small(P) be true when n ≤ 2. In this case, the maximum and minimum
are both a[i] if n = 1.
• If n = 2, the problem can be solved by making one comparison.
• If the list has more than two elements, P has to be divided into smaller
instances.
For example, divide the problem P into two instances P1 = (⌊n/2⌋, a[1], . . . ,
a[⌊n/2⌋]) and P2 = (n − ⌊n/2⌋, a[⌊n/2⌋ + 1], . . . , a[n]).
After subdividing the problem into smaller subproblems, we can solve them by
recursively invoking the same divide-and-conquer algorithm, and finally, the results
can be combined to give the main solutions.
If MAX (P) and MIN (P) are the maximum and minimum of the elements in P,
then MAX (P) = larger value of MAX (P1) and MAX (P2).
Also, MIN (P) = smaller value of MIN (P1) and MIN (P2).
MaxMin is a recursive function that finds the maximum and minimum of the
set of elements {a(i), a(i + 1), . . . , a(j)}. The situation of set sizes one (i = j) and
two (i = j − 1) are handled separately. For sets containing more than two elements,
the midpoint is determined (just as in binary search) and two new subproblems are
generated.
When the maxima and minima of these subproblems are determined, the two
maxima are compared and the two minima are compared to achieve the solution for
the entire set.
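The MaxMin function itself is not reproduced in this excerpt; a C++ sketch following the description above (sizes one and two handled separately; otherwise split at the midpoint and combine):

// Finds the maximum and minimum of a[i..j] (inclusive).
void MaxMin(int a[], int i, int j, int& mx, int& mn) {
    if (i == j) {                        // set of size one
        mx = mn = a[i];
    } else if (i == j - 1) {             // set of size two: one comparison
        if (a[i] < a[j]) { mn = a[i]; mx = a[j]; }
        else             { mn = a[j]; mx = a[i]; }
    } else {
        int mid = (i + j) / 2;           // split as in binary search
        int mx1, mn1, mx2, mn2;
        MaxMin(a, i, mid, mx1, mn1);
        MaxMin(a, mid + 1, j, mx2, mn2);
        mx = (mx1 > mx2) ? mx1 : mx2;    // compare the two maxima
        mn = (mn1 < mn2) ? mn1 : mn2;    // compare the two minima
    }
}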
The procedure is initially invoked by the statement
MaxMin (1, n, x, y)
A good way of keeping track of recursive calls is to build a tree by adding a node
each time a new call is made. For this program, each node has four items of informa-
tion: i, j, max, and min.
We see that the root node contains 1 and 9 as the values of i and j corresponding
to the initial call to MaxMin. This execution produces two new calls to MaxMin,
where i and j have the values 1, 5 and 6, 9, respectively, and, thus, split the set into two
subsets of approximately the same size. From the tree, we can immediately see that
the maximum depth of recursion is four (including the first call). The circled num-
bers shown in Figure 2.3 in the upper left corner of each node represent the orders in
which max and min are assigned values.
T(n) = { T(⌊n/2⌋) + T(⌈n/2⌉) + 2,   n > 2
       { 1,                          n = 2
       { 0,                          n = 1,
where n is a power of 2; that is, n = 2^k for some positive integer k.
T(n) = 2T(n/2) + 2
= 2[2T(n/4) + 2] + 2
= 4T(n/4) + 4 + 2
= 8T(n/8) + 8 + 4 + 2
:
:
T(n) = 2^(k−1)·T(2) + Σ(1≤i≤k−1) 2^i,   where T(2) = 1
     = 2^(k−1) + 2^k − 2
     = n/2 + n − 2
     = 3n/2 − 2

3n/2 − 2 is the best-, average-, and worst-case number of comparisons when n is a
power of 2; that is, n = 2^k.
Merge sort splits a list into two halves, sorts each half, and then merges the
sorted halves of the list back into one sorted list. We see that merge sort breaks a list
in halves on the way down in the recursive process and then puts the sorted halves
together on the way back up.
Example
Consider the array of ten elements a [1: 10] = (310, 285, 179, 652, 351, 423, 861, 254,
450, 520).
• Function Merge Sort begins by splitting a [ ] into two sub-arrays each of size
5 (a [1: 5] and a [6: 10]).
• The elements in a [1: 5] are then split into two sub-arrays of size 3 (a [1: 3])
and 2 (a [4: 5]).
• Then the items in a [1: 3] are split into sub-arrays of size 2 (a [1: 2]) and 1
(a [3: 3]).
• The two values in a [1: 2] are split a final time into one-element sub-arrays,
and now the merging begins.
• Note that no movement of data has yet taken place.
• A record of the sub-arrays is implicitly maintained by the recursive
mechanism.
(310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520).
(285, 310 | 179 | 652, 351 | 423, 861, 254, 450, 520).
(179, 285, 310| 652, 351 | 423, 861, 254, 450, 520).
(179, 285, 310| 351, 652 | 423, 861, 254, 450, 520).
(179, 285, 310, 351, 652| 423, 861, 254, 450, 520).
At this point, the algorithm has returned to the first invocation of Merge Sort and
is about to process the second recursive call. Repeated recursive calls are invoked
producing the following sub-arrays:
(179, 285, 310, 351, 652| 423 | 861 | 254 | 450, 520)
(179, 285, 310, 351, 652 | 423, 861 | 254 | 450, 520)
(179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520)
(179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520)
(179, 285, 310, 351, 652 | 254, 423, 450, 520, 861)
At this point, there are two sorted sub-arrays, and the final merge produces the fully
sorted result:
(179, 254, 285, 310, 351, 423, 450, 520, 652, 861).
Figure 2.4 is a tree that represents the sequence of recursive calls that are pro-
duced by Merge Sort when it is applied to ten elements. The pair of values in each
node are the values of the parameters low and high. Notice how the splitting contin-
ues until sets containing a single element are produced.
For example, the node containing 1, 2, and 3 represents the merging of a [1:2] with
a [3]. The program for merge sort is given as follows:
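The merge sort listing is not reproduced in this excerpt; the following is a minimal sketch consistent with the description (recursive halving on the index range low..high; the merging function is given next):

// Sorts a[low..high] (inclusive). Merge is defined below.
void Merge(int a[], int low, int mid, int high);

void MergeSort(int a[], int low, int high) {
    if (low < high) {                  // more than one element
        int mid = (low + high) / 2;    // split the range in half
        MergeSort(a, low, mid);        // sort the first half
        MergeSort(a, mid + 1, high);   // sort the second half
        Merge(a, low, mid, high);      // merge the sorted halves
    }
}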
The program for merging two sorted sub-arrays using auxiliary storage is given as follows:
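Again, this is a sketch rather than the book's listing:

#include <vector>

// Merges sorted sub-arrays a[low..mid] and a[mid+1..high] into sorted
// order using the auxiliary array b.
void Merge(int a[], int low, int mid, int high) {
    std::vector<int> b(high + 1);          // auxiliary storage
    int h = low, j = mid + 1, i = low;
    while (h <= mid && j <= high)          // copy the smaller front element
        b[i++] = (a[h] <= a[j]) ? a[h++] : a[j++];
    while (h <= mid) b[i++] = a[h++];      // copy any remaining elements
    while (j <= high) b[i++] = a[j++];
    for (int k = low; k <= high; k++)      // copy the merged result back
        a[k] = b[k];
}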
If the time for the merging operation is proportional to n, then the computing time
for merge sort is described by the recurrence relation

T(n) = { a,              n = 1, a a constant
       { 2T(n/2) + cn,   n > 1, c a constant.
• The function Select selects an input from a [] and removes it. The selected
input’s value is assigned to x.
• Feasible is a Boolean-valued function that determines whether x can be
included in the solution vector.
• The function Union combines x with the solution and updates the objective
function.
• The function Greedy describes the essential way that a greedy algorithm
will look, once a particular problem is chosen and the functions Select,
Feasible, and Union are properly implemented.
3. There is a function that checks whether the particular set of candidates pro-
vides a solution to our problem by ignoring the questions of optimality for
the time needed.
4. The second function checks whether the set of candidates is feasible or not.
5. The third function is a selection function that indicates at any time, which
of the remaining candidates is the most promising one.
6. An objective function that does not appear explicitly gives the value of a
solution.
Advantages
• Greedy algorithms are simple, and they run fast when they work.
• They are easy to implement.
Disadvantages
• Greedy algorithms don't always work.
• They are hard to prove correct.
Algorithm
function makechange(n)
{
    C = {100, 25, 10, 5, 1}   // candidate coin denominations
    S = ∅                     // solution set
    s = 0                     // sum of the items in S
    while s ≠ n do {
        x = largest item in C such that s + x ≤ n
        if there is no such item, return "failure"
        S = S ∪ {a coin of value x}
        s = s + x
    }
    return S
}
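A runnable C++ rendering of this pseudocode might look as follows (returning the chosen coins in a vector; an empty vector signals failure):

#include <vector>

// Greedy coin changing: repeatedly add the largest coin that
// does not overshoot the remaining amount.
std::vector<int> makeChange(int n) {
    const int C[] = {100, 25, 10, 5, 1};   // candidate denominations
    std::vector<int> S;                    // solution: coins chosen
    int s = 0;                             // sum of the coins in S
    while (s != n) {
        bool found = false;
        for (int x : C) {
            if (s + x <= n) {              // largest feasible coin
                S.push_back(x);
                s += x;
                found = true;
                break;
            }
        }
        if (!found) return {};             // failure (cannot happen with a 1-cent coin)
    }
    return S;
}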
• For example, a child buys candy valued at less than $1 and gives a $1 bill
to the cashier.
• The cashier wishes to return the change using the fewest number of coins.
Assume that an unlimited supply of quarters, dimes, nickels, and pennies
is available.
• The cashier constructs the change in stages. At each stage, a coin is added
to the change.
• This coin is selected using the greedy criterion: At each stage, increase the
total amount of change constructed by as much as possible.
• To assure feasibility (i.e., the change given exactly equals the desired
amount) of the solution, the selected coin should not cause the total amount
of change given so far to exceed the final desired amount.
• Suppose that 67 cents in change is due to the child. The first two coins
selected are quarters.
• A quarter cannot be selected for the third coin because such a selection
results in an infeasible selection of coins (change exceeds 67 cents).
• The third coin selected is a dime, then a nickel is selected, and finally two
pennies are added to the change.
A feasible assignment is one in which no machine is assigned two tasks that overlap in time. An optimal assignment is a feasible assignment that utilizes the fewest
number of machines.
Suppose we have n = 7 tasks labeled a through g and that their start and finish
times are as shown in Figure 2.5.
A greedy way to obtain an optimal task assignment is to assign the tasks in stages.
From the given set of jobs and machines, allocate jobs to a machine in which one is
free as shown in Figure 2.6. Once a machine has been allocated, then it is called old
machine. If a machine is not old, it is new.
For machine selection, use the greedy criterion: If an old machine becomes
available by the start time of the task to be assigned, assign the task to this
machine; if not, assign it to a new machine.
Σ(i=1..n) Wi·Xi ≤ c and Xi ∈ {0, 1}, 1 ≤ i ≤ n.

The optimization function is Σ(i=1..n) Xi. Every set of Xi's that satisfies the constraints is a
feasible solution. Every feasible solution that maximizes Σ(i=1..n) Xi is an optimal solution.
The ship may be loaded in stages, one container per stage. At each stage, decide
which container to load.
greedy solutIon
• From the remaining containers, select the one with the least weight. This
order of selection will keep the total weight of the selected containers mini-
mum and, hence, leave the maximum capacity for loading more containers.
• Select the container that has the least weight, then the one with the next-
smallest weight, and so on until either all containers have been loaded or
there isn’t enough capacity for the next one.
for (int i = 0; i < n && c[i].weight <= capacity; i++) {
    // load container i (containers are considered in order of increasing weight)
    x[c[i].id] = 1;
    capacity -= c[i].weight;  // remaining capacity
}
Example
Suppose that n = 8, [w1, w2, w3, . . . w8] = [100, 200, 50, 90, 150, 50, 20, 80] and c = 400.
Solution
When the greedy algorithm is used, the containers are considered for loading in the
order 7, 3, 6, 8, 4, 1, 5, 2.
[w7, w3, w6, w8, w4, w1, w5, w2] = [20, 50, 50, 80, 90, 100, 150, 200]
Containers 7, 3, 6, 8, 4, and 1 together weigh 390 units and are loaded. The avail-
able capacity is now 10 units, which is inadequate for any of the remaining containers.
In the greedy solution, we have [x1, x2, x3, . . . , x8] = [1, 0, 1, 1, 0, 1, 1, 1] and
ΣXi = 6.
Objective Function
The objective is to obtain a filling of the knapsack that maximizes the total profit
earned. Since the knapsack capacity is m, we require the total weight of all chosen
objects to be at most m. Formally, the problem can be stated as

maximize Σ(1≤i≤n) Pi·xi (1)

subject to Σ(1≤i≤n) Wi·xi ≤ m (2)

and 0 ≤ xi ≤ 1, 1 ≤ i ≤ n. (3)
Any solution that satisfies (2) and (3) is called a feasible solution. Any feasible
solution that maximizes (1) is called an optimal solution.
The greedy strategies for the knapsack problem fall into three categories, illustrated with the following instance:
n = 3, m = 20, (P1, P2, P3) = (25, 24, 15), and (W1, W2, W3) = (18, 15, 10)
Solution
1. Choose the object with the highest profit first.
P1 = 25, W1 = 18
X1 = 1: X1W1 = 1 × 18 = 18, X1P1 = 1 × 25 = 25
The remaining capacity is 20 − 18 = 2, so X2 = 2/15:
X2W2 = (2/15) × 15 = 2, X2P2 = (2/15) × 24 = 3.2

X1      X2      X3      ΣXiWi      ΣXiPi
1       2/15    0       20         28.2
2. Choose the object with the lowest weight first.
X3 = 1: X3W3 = 1 × 10 = 10, X3P3 = 1 × 15 = 15
The remaining capacity is 10, so X2 = 10/15 = 2/3:
X2W2 = (2/3) × 15 = 10, X2P2 = (2/3) × 24 = 16

X1      X2      X3      ΣXiWi      ΣXiPi
0       2/3     1       20         31
3. Choose objects in order of the profit-to-weight ratio Pi/Wi.
P1/W1 = 25/18 = 1.39
P2/W2 = 24/15 = 1.6
P3/W3 = 15/10 = 1.5
Select the object with the highest ratio first; here that is object 2.
X2 = 1: X2W2 = 1 × 15 = 15, X2P2 = 1 × 24 = 24
The remaining capacity is 5, so X3 = 5/10 = 1/2:
X3W3 = (1/2) × 10 = 5, X3P3 = (1/2) × 15 = 7.5

X1      X2      X3      ΣXiWi      ΣXiPi
0       1       1/2     20         31.5
Method 1
• Try to fill the sack by considering an object with the largest profit.
• If an object under consideration does not fit into the sack, then a fraction of
it is included.
• Thus, each time an object is included in the knapsack, we obtain the largest
possible increase in profit value.
• Thus, each time an object is included in the knapsack, the profit increases.
Method 2
• Consider the objects in the order of nondecreasing weights (i.e., choose the
object that has the lowest weight) to fill the knapsack first. Since this fits into
the sack completely, X1 = 1.
• Choose the next-lowest weight to fill into the knapsack. Since this size can-
not be completely included in sack, take a fraction of it.
• This method is also suboptimal only.
Method 3
• This algorithm balances the rate at which the profit increases against the rate at which the capacity is used.
• At each step, we include the object that has the maximum profit per unit of
capacity used.
• It means that the objects are considered in the order of ratio (Pi/Wi). So the
object with the highest ratio is added first.
• This method gives the optimal solution.
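A C++ sketch of Method 3 (greedy by profit-to-weight ratio; the struct and function names are illustrative, not the book's code):

#include <algorithm>
#include <vector>

struct Object { double profit, weight; };

// Greedy fractional knapsack: fill in order of decreasing profit/weight.
// Returns the maximum total profit for capacity m.
double fractionalKnapsack(std::vector<Object> obj, double m) {
    std::sort(obj.begin(), obj.end(),
              [](const Object& a, const Object& b) {
                  return a.profit / a.weight > b.profit / b.weight;
              });
    double total = 0.0;
    for (const Object& o : obj) {
        if (m >= o.weight) {                  // take the whole object
            total += o.profit;
            m -= o.weight;
        } else {                              // take a fraction and stop
            total += o.profit * (m / o.weight);
            break;
        }
    }
    return total;
}
// For the instance above: profits (25, 24, 15), weights (18, 15, 10),
// m = 20, the result is 31.5.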
The knapsack problem has two variants:

1. Fractional knapsack
2. 0–1 knapsack
Dynamic programming:
• In dynamic programming, many decision sequences may be generated.
• However, sequences containing suboptimal subsequences are discarded
because they cannot be optimal due to the principle of optimality.
1. Forward
2. Backward
Forward Approach
A dynamic programming formulation for a k-stage graph problem is obtained by
first noticing that every s-to-t path is the result of a sequence of k − 2 decisions. The
ith decision involves determining which vertex in Vi+1, 1 ≤ i ≤ k − 2, is to be on the
path. It is easy to see that the principle of optimality holds.
Let p(i, j) be a minimum-cost path from vertex j in Vi to vertex t, and let cost(i, j) be
the cost of this path. Then, using the forward approach, we obtain

cost(i, j) = min { c(j, l) + cost(i + 1, l) },

where the minimum is taken over all l ∈ Vi+1 with ⟨j, l⟩ ∈ E.
A minimum-cost s-to-t path has a cost of 16. This path can be determined easily if
we record the decision made at each state (vertex).
Let d(i, j) be the value of l (where l is a node) that minimizes c(j, l) + cost(i + 1, l).
In the figure,
d(3, 6) = 10
d(3, 7) = 10
d(3, 8) = 10
d(2, 2) = 7
d(2, 3) = 6
d(2, 4) = 8
d(2, 5) = 8
d(1, 1) = 2.
Backward Approach
The multistage graph problem can also be solved using the backward approach. Let
bp(i, j) be a minimum-cost path from vertex s to a vertex j in Vi. Let bcost(i, j) be
the cost of bp(i, j). From the backward approach, we obtain

bcost(i, j) = min { bcost(i − 1, l) + c(l, j) }, taken over all l ∈ Vi−1 with ⟨l, j⟩ ∈ E.
The minimum-cost s-to-t path can then be read off from the recorded decisions as s = v1, v2, . . . , vk−1, vk = t, where
vk = t
vk−1 = d(k, t)
and, in general, vi = d(i + 1, vi+1).
A minimum-cost s-to-t path has a cost of 16. This path can be determined easily if
we record the decision made at each state (vertex) as shown in Figure 3.2.
d(3, 6) = 3
d(3, 7) = 2
d(3, 8) = 2
d(4, 9) = 6
d(4, 10) = 7
d(4, 11) = 8
d(5, 12) = 10
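When the vertices are numbered by stage (s first, t last) so that every edge goes from a lower to a higher number, the forward recurrence becomes a short program. The sketch below makes that assumption; the names MultistageForward, cost[], and d[] are ours, mirroring cost(i, j) and d(i, j) with the stage index dropped:

#include <iostream>
#include <vector>
using namespace std;

const int INF = 1000000;   // stands in for "no edge"

// A sketch of the forward approach. Vertices are 0..n-1 in stage order,
// with s = 0 and t = n-1; c[j][l] is the edge cost, INF if <j, l> is not
// an edge. cost[j] = min over successors l of { c[j][l] + cost[l] }.
int MultistageForward(const vector<vector<int>>& c, vector<int>& d) {
    int n = c.size();
    vector<int> cost(n, INF);
    cost[n - 1] = 0;                      // cost at t is zero
    for (int j = n - 2; j >= 0; j--)      // work backward from t
        for (int l = j + 1; l < n; l++)   // edges go to higher-numbered vertices
            if (c[j][l] != INF && c[j][l] + cost[l] < cost[j]) {
                cost[j] = c[j][l] + cost[l];
                d[j] = l;                 // record the decision, as in d(i, j)
            }
    return cost[0];                       // minimum s-to-t cost
}

int main() {
    // A tiny assumed example: vertex 0 -> {1, 2} -> 3.
    vector<vector<int>> c(4, vector<int>(4, INF));
    c[0][1] = 2; c[0][2] = 5; c[1][3] = 7; c[2][3] = 3;
    vector<int> d(4, -1);
    cout << MultistageForward(c, d) << endl;  // prints 8 (path 0, 2, 3)
}

The minimum-cost path is then read off as 0, d[0], d[d[0]], . . . , n − 1, just as with the recorded decisions above.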
Procedure
• The matrix A can be obtained by solving n single-source problems using
the algorithm ShortestPaths.
• Let us examine a shortest i to j path in G, i ≠ j. This path originates at
vertex i, goes through some intermediate vertices (possibly none), and ter-
minates at vertex j.
• This path contains no cycles. If there is a cycle, then this can be deleted
without increasing the path length (no cycle has a negative length).
• If k is an intermediate vertex on this shortest path, then the sub-paths from i
to k and from k to j must be the shortest paths from i to k and k to j, respec-
tively. Otherwise, the i-to-j path is not of minimum length. So, the prin-
ciple of optimality holds. This alerts us to the prospect of using dynamic
programming.
• If k is the intermediate vertex with the highest index, then the i-to-k path is
a shortest i-to-k path in G going through no vertex with index greater than
k − 1.
• Similarly, the k-to-j path is the shortest k-to-j path in G going through no
vertex of index greater than k − 1.
Using A^k(i, j) to represent the length of the shortest path from i to j going through no
vertex of index greater than k, we obtain
A(i, j) = min { min_{1≤k≤n} { A^{k−1}(i, k) + A^{k−1}(k, j) }, cost(i, j) }.
The shortest path from i to j going through no vertex higher than k either goes
through vertex k or it does not. If it does,
A^k(i, j) = A^{k−1}(i, k) + A^{k−1}(k, j).
If it does not, then no intermediate vertex has an index greater than k − 1. Hence,
A^k(i, j) = A^{k−1}(i, j).
Combining, we get
A^k(i, j) = min { A^{k−1}(i, j), A^{k−1}(i, k) + A^{k−1}(k, j) }, k ≥ 1.
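The recurrence can be computed in place, overwriting a single matrix for k = 1, 2, . . . , n. A minimal C++ sketch (0-indexed; INF is chosen so that INF + INF does not overflow, and the matrix used is the one from the example that follows):

#include <iostream>
using namespace std;

const int N = 4, INF = 1000000;

// A sketch of the all-pairs algorithm: A starts as the cost matrix and,
// after the kth outer iteration, A[i][j] holds A^k(i, j).
void AllPairs(int A[N][N]) {
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (A[i][k] + A[k][j] < A[i][j])
                    A[i][j] = A[i][k] + A[k][j];
}

int main() {
    int A[N][N] = { {0, INF, 3, INF},
                    {2, 0, INF, INF},
                    {INF, 7, 0, 1},
                    {6, INF, INF, 0} };
    AllPairs(A);
    for (int i = 0; i < N; i++) {          // prints the resultant matrix
        for (int j = 0; j < N; j++) cout << A[i][j] << ' ';
        cout << endl;
    }
}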
Example
A^0 is the cost adjacency matrix:

A^0 =
[ 0  ∞  3  ∞ ]
[ 2  0  ∞  ∞ ]
[ ∞  7  0  1 ]
[ 6  ∞  ∞  0 ]

A^0(i, j) equals cost(i, j).
A^1 = the length of the shortest path with intermediate vertices numbered no higher
than 1. Row 1 and column 1 are unchanged:

A^1 =
[ 0  ∞  3  ∞ ]
[ 2  .  .  . ]
[ ∞  .  .  . ]
[ 6  .  .  . ]

The remaining entries are found from

A^k(i, j) = min { A^{k−1}(i, j), A^{k−1}(i, k) + A^{k−1}(k, j) }, k ≥ 1.
With k = 1,

A^1(2, 2) = min { A^0(2, 2), A^0(2, 1) + A^0(1, 2) } = min { 0, 2 + ∞ } = 0
A^1(2, 3) = min { A^0(2, 3), A^0(2, 1) + A^0(1, 3) } = min { ∞, 2 + 3 } = 5
A^1(2, 4) = min { A^0(2, 4), A^0(2, 1) + A^0(1, 4) } = min { ∞, 2 + ∞ } = ∞
A^1(3, 2) = min { A^0(3, 2), A^0(3, 1) + A^0(1, 2) } = min { 7, ∞ + ∞ } = 7
A^1(3, 3) = min { A^0(3, 3), A^0(3, 1) + A^0(1, 3) } = min { 0, ∞ + 3 } = 0
A^1(3, 4) = min { A^0(3, 4), A^0(3, 1) + A^0(1, 4) } = min { 1, ∞ + ∞ } = 1
A^1(4, 2) = min { A^0(4, 2), A^0(4, 1) + A^0(1, 2) } = min { ∞, 6 + ∞ } = ∞
A^1(4, 3) = min { A^0(4, 3), A^0(4, 1) + A^0(1, 3) } = min { ∞, 6 + 3 } = 9
A^1(4, 4) = min { A^0(4, 4), A^0(4, 1) + A^0(1, 4) } = min { 0, 6 + ∞ } = 0

So

A^1 =
[ 0  ∞  3  ∞ ]
[ 2  0  5  ∞ ]
[ ∞  7  0  1 ]
[ 6  ∞  9  0 ]

A^2 = the length of the shortest path with intermediate vertices numbered no higher
than 2.
Row 2 and column 2 are unchanged from A^1; computing the remaining entries with k = 2 gives

A^2 =
[ 0  ∞  3  ∞ ]
[ 2  0  5  ∞ ]
[ 9  7  0  1 ]
[ 6  ∞  9  0 ]
Similarly,
A^3 =
[ 0  10  3  4 ]
[ 2   0  5  6 ]
[ 9   7  0  1 ]
[ 6  16  9  0 ]

and the resultant matrix is

A^4 =
[ 0  10  3  4 ]
[ 2   0  5  6 ]
[ 7   7  0  1 ]
[ 6  16  9  0 ]
General Definition
Let {a1, a2, . . . , an} be a set of identifiers with a1 < a2 < · · · < an. Let p(i) be the
probability with which we search for ai. Let q(i) be the probability that the identifier
x being searched for is such that ai < x < ai+1, 0 ≤ i ≤ n. Then

∑_{i=0}^{n} q(i)

is the probability of an unsuccessful search, and

∑_{i=1}^{n} p(i) + ∑_{i=0}^{n} q(i) = 1.
Cost Function
To obtain the cost function for BSTs, it is useful to add a fictitious node in place of
every empty subtree in the search tree. Such nodes are called external nodes. All
other nodes are internal nodes.
If a binary search tree represents n identifiers, then there will be exactly n internal
nodes and n + 1 (fictitious) external nodes. Every internal node represents a point
where a successful search may terminate. Every external node represents a point
where an unsuccessful search may terminate.
Successful search: If a successful search terminates at an internal node at level l,
then l iterations are needed. Hence, the expected cost contribution from the internal
node for ai is p(i) * level(ai).
Unsuccessful search: An unsuccessful search terminates at an external node.
The identifiers not in the BST can be partitioned into n + 1 equivalence classes Ei,
0 ≤ i ≤ n, since there are n + 1 external nodes in a binary tree with n identifiers.
For all identifiers in the same class Ei, the search terminates at the same external
node; for identifiers in different classes, the search terminates at different external nodes.
If the failure node for Ei is at level l, then only l − 1 iterations are made. Hence, the
cost contribution of this node is q(i) * (level(Ei) − 1).
We define an optimal binary search tree for the identifier set {a1, a2, . . . , an} to be a
BST for which

∑_{1≤i≤n} p(i) * level(ai) + ∑_{0≤i≤n} q(i) * (level(Ei) − 1)

is minimum.
General Formula
w(i, j) = p(j) + q(j) + w(i, j − 1)
c(i, j) = w(i, j) + min { c(i, k − 1) + c(k, j) }, i < k ≤ j
Example
Obtain the OBST for the following:
n = 4, (a1, a2, a3, a4) = (do, if, int, while) p (1:4) = (3, 3, 1, 1) q (0:4) = (2, 3, 1, 1, 1)
Solution
Initially 0 ≤ i ≤ n
w(i, i) = q[i]
w(0, 0) = q[0] = 2
w(1, 1) = q[1] = 3
w(2, 2) = q[2] = 1
w(3, 3) = q[3] = 1
w(4, 4) = q[4] = 1
c(i, i) = 0
c(0, 0) = 0
c(1, 1) = 0
c(2, 2) = 0
c(3, 3) = 0
c(4, 4) = 0
r(i, i) = 0
r(0, 0) = 0
r(1, 1) = 0
r(2, 2) = 0
r(3, 3) = 0
r(4, 4) = 0
Formula
w(i, j) = p(j) + q(j) + w(i, j − 1)
c(i, j) = w(i, j) + min { c(i, k − 1) + c(k, j) }, i < k ≤ j

w(0, 1) = p(1) + q(1) + w(0, 0) = 3 + 3 + 2 = 8
c(0, 1) = w(0, 1) + min { c(0, 0) + c(1, 1) } = 8 + min { 0 + 0 } = 8 (k = 1)
r(0, 1) = 1
w(1, 2) = p(2) + q(2) + w(1, 1) = 3 + 1 + 3 = 7
c(1, 2) = w(1, 2) + min { c(1, 1) + c(2, 2) } = 7 + min { 0 + 0 } = 7 (k = 2)
r(1, 2) = 2
w(2, 3) = p(3) + q(3) + w(2, 2) = 1 + 1 + 1 = 3
c(2, 3) = w(2, 3) + min { c(2, 2) + c(3, 3) } = 3 + min { 0 + 0 } = 3 (k = 3)
r(2, 3) = 3
w(3, 4) = p(4) + q(4) + w(3, 3) = 1 + 1 + 1 = 3
c(3, 4) = w(3, 4) + min { c(3, 3) + c(4, 4) } = 3 + min { 0 + 0 } = 3 (k = 4)
r(3, 4) = 4
w(0, 2) = p(2) + q(2) + w(0, 1) = 3 + 1 + 8 = 12
c(0, 2) = w(0, 2) + min { c(0, 0) + c(1, 2), c(0, 1) + c(2, 2) } = 12 + min { 0 + 7, 8 + 0 } = 12 + 7 = 19 (k = 1)
r(0, 2) = 1
w(1, 3) = p(3) + q(3) + w(1, 2) = 1 + 1 + 7 = 9
c(1, 3) = w(1, 3) + min { c(1, 1) + c(2, 3), c(1, 2) + c(3, 3) } = 9 + min { 0 + 3, 7 + 0 } = 9 + 3 = 12 (k = 2)
r(1, 3) = 2
w(2, 4) = p(4) + q(4) + w(2, 3) = 1 + 1 + 3 = 5
c(2, 4) = w(2, 4) + min { c(2, 2) + c(3, 4), c(2, 3) + c(4, 4) } = 5 + min { 0 + 3, 3 + 0 } = 5 + 3 = 8 (k = 3)
r(2, 4) = 3
w(0, 3) = p(3) + q(3) + w(0, 2) = 1 + 1 + 12 = 14
c(0, 3) = w(0, 3) + min { c(0, 0) + c(1, 3), c(0, 1) + c(2, 3), c(0, 2) + c(3, 3) } = 14 + min { 0 + 12, 8 + 3, 19 + 0 } = 14 + 11 = 25 (k = 2)
r(0, 3) = 2
w(1, 4) = p(4) + q(4) + w(1, 3) = 1 + 1 + 9 = 11
c(1, 4) = w(1, 4) + min { c(1, 1) + c(2, 4), c(1, 2) + c(3, 4), c(1, 3) + c(4, 4) } = 11 + min { 0 + 8, 7 + 3, 12 + 0 } = 11 + 8 = 19 (k = 2)
r(1, 4) = 2
w(0, 4) = p(4) + q(4) + w(0, 3) = 1 + 1 + 14 = 16
c(0, 4) = w(0, 4) + min { c(0, 0) + c(1, 4), c(0, 1) + c(2, 4), c(0, 2) + c(3, 4), c(0, 3) + c(4, 4) } = 16 + min { 0 + 19, 8 + 8, 19 + 3, 25 + 0 } = 16 + 16 = 32 (k = 2)
r(0, 4) = 2
The computation of c(0, 4), w(0, 4), and r(0, 4) is shown in Figure 3.5.
Thus, c(0, 4) = 32 is the minimum cost of a BST for (a1, a2, a3, a4).
Since r(0, 4) = 2, the root is a2 ('if'); its left subtree holds a1 ('do'), and since
r(2, 4) = 3 and r(3, 4) = 4, its right subtree has root a3 ('int') with right child a4 ('while').
Now the OBST is given as shown in Figure 3.6.
The total time to evaluate all the c(i, j)'s and r(i, j)'s is

∑_{1≤m≤n} (nm − m²) = O(n³).
We know that

c(i, j) = w(i, j) + min { c(i, k − 1) + c(k, j) }, i < k ≤ j.

The time complexity can be reduced to O(n²) using D. E. Knuth's result, which limits the
search to the range r(i, j − 1) ≤ k ≤ r(i + 1, j) instead of i < k ≤ j.
The tree t0n can be constructed from the values r(i, j) in O(n) time.
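The computation of the tables above can be written directly from the recurrences. The sketch below (our own, with array bounds sized for the n = 4 example) uses the straightforward O(n³) inner loop; replacing the k-range with r(i, j − 1) ≤ k ≤ r(i + 1, j) would give Knuth's O(n²) version:

#include <iostream>
using namespace std;

const int MAXN = 5;

// A sketch of the OBST tables: p[1..n], q[0..n];
// w(i, j) = p(j) + q(j) + w(i, j - 1),
// c(i, j) = w(i, j) + min over i < k <= j of { c(i, k-1) + c(k, j) }.
void OBST(int n, int p[], int q[],
          int w[MAXN][MAXN], int c[MAXN][MAXN], int r[MAXN][MAXN]) {
    for (int i = 0; i <= n; i++) {
        w[i][i] = q[i]; c[i][i] = 0; r[i][i] = 0;
    }
    for (int m = 1; m <= n; m++)              // m = j - i
        for (int i = 0; i + m <= n; i++) {
            int j = i + m;
            w[i][j] = p[j] + q[j] + w[i][j - 1];
            int best = 1000000, bestk = i + 1;
            for (int k = i + 1; k <= j; k++)  // Knuth's result narrows this range
                if (c[i][k - 1] + c[k][j] < best) {
                    best = c[i][k - 1] + c[k][j];
                    bestk = k;
                }
            c[i][j] = w[i][j] + best;
            r[i][j] = bestk;
        }
}

int main() {
    int p[] = {0, 3, 3, 1, 1}, q[] = {2, 3, 1, 1, 1};
    int w[MAXN][MAXN], c[MAXN][MAXN], r[MAXN][MAXN];
    OBST(4, p, q, w, c, r);
    cout << c[0][4] << ' ' << r[0][4] << endl;  // prints 32 2
}

Keeping the first (smallest) k on ties reproduces the r(i, j) values of the worked example.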
The number of distinct binary search trees on n identifiers is

c(n) = (1/(n + 1)) C(2n, n) for n > 0, with c(0) = 1.
For example:
When key = 2, that is, n = 2,
c(2) = (1/3) C(4, 2) = (1/3) × (4 × 3)/(2 × 1) = (1/3) × 6 = 2
Thus, there are two possible BSTs for a1 < a2, as shown in Figure 3.7.
If we have three nodes, where a1 < a2 < a3, then the number of possible binary
search trees is five, as shown in Figure 3.8.
c(3) = (1/4) C(6, 3) = (1/4) × (6 × 5 × 4)/(3 × 2 × 1) = (1/4) × 20 = 5
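The count can also be computed iteratively from the equivalent recurrence c(i + 1) = c(i) · 2(2i + 1)/(i + 2); a small C++ sketch (the function name is ours):

#include <iostream>
using namespace std;

// A sketch of c(n) = C(2n, n)/(n + 1), the number of distinct binary
// search trees on n keys; the division at each step is always exact.
long long CountBST(int n) {
    long long c = 1;                  // c(0) = 1
    for (int i = 0; i < n; i++)
        c = c * 2 * (2 * i + 1) / (i + 2);
    return c;                         // CountBST(2) = 2, CountBST(3) = 5
}

int main() {
    cout << CountBST(3) << endl;      // prints 5
}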
Example Problem
Find the possible binary search trees for the identifiers (a1, a2, a3) = (do, if, while),
assuming equal probabilities p(i) = q(i) = 1/7 for all i.
Solution
We have three nodes, so the number of possible BSTs is c(3) = (1/4) C(6, 3) = 5.
The trees all have equal probabilities p(i) = q(i) = 1/7.
Let us find the optimal binary search tree out of the five trees shown in Figure 3.9.
cost(tree) = ∑_{i=1}^{n} p(i) * level(ai) + ∑_{i=0}^{n} q(i) * (level(Ei) − 1)
cost(tree a) = (1/7)(1 + 2 + 3) + (1/7)[(4 − 1) + (4 − 1) + (3 − 1) + (2 − 1)]
= 6/7 + (1/7)(3 + 3 + 2 + 1)
= 6/7 + 9/7 = 15/7
cost(tree b) = (1/7)(1 + 2 + 2) + (1/7)[(3 − 1) + (3 − 1) + (3 − 1) + (3 − 1)]
= 5/7 + (1/7)(2 + 2 + 2 + 2)
= 5/7 + 8/7 = 13/7
By the same computation,
cost(tree c) = cost(tree d) = cost(tree e) = 6/7 + 9/7 = 15/7.
Tree (b) is optimal.
Example 2
Consider p(1) = 0.5, p(2) = 0.1, p(3) = 0.05, q(0) = 0.15, q(1) = 0.1, q(2) = 0.05, and
q(3) = 0.05 for the identifiers (a1, a2, a3) = (do, if, while).
Solution
cost(tree) = ∑_{i=1}^{n} p(i) * level(ai) + ∑_{i=0}^{n} q(i) * (level(Ei) − 1)
Exercise
Compute w(i, j), r(i, j), and c(i, j), 0 ≤ i < j ≤ 4 for the identifier set (a1, a2, a3, a4) =
(cout, float, if, while) with p(1) = 1/20, p(2) = 1/5, p(3) = 1/10, p(4) = 1/20, q(0) = 1/5,
q(1) = 1/10, q(2) = 1/5, q(3) = 1/20, and q(4) = 1/20. Using the r(i, j)'s, construct
the OBST.
0/1 Knapsack
The solution to the knapsack problem can be obtained by making a sequence of deci-
sions on the variables x1, x2, x3, . . . . . . . , xn, and the decision on variable xi involves
determining which of the values 0 or 1 is to be assigned to it.
Example
Consider the knapsack instance n = 3 (W1, W2, W3) = (2, 3, 4), (P1, P2, P3) = (1, 2, 5),
and m = 6.
Pi = 1 2 5
Wi = 2 3 4
Solution
Step 1: Initialize S^0.
i = 0: S^0 = {(0, 0)}, where each tuple is a (p, w) pair.
Step 2: Find S1^0 and S^1.
S1^0 = {(1, 2)}; merging gives S^1 = {(0, 0), (1, 2)}.
Step 3: Find S1^1 and S^2.
S1^1 = {(2, 3), (3, 5)}; S^2 = {(0, 0), (1, 2), (2, 3), (3, 5)}.
Step 4: Find S1^2 and S^3.
S1^2 = {(5, 4), (6, 6), (7, 7), (8, 9)}. The pairs (7, 7) and (8, 9) exceed the capacity
m = 6 and are discarded, and (3, 5) is purged because it is dominated by (5, 4).
Thus S^3 = {(0, 0), (1, 2), (2, 3), (5, 4), (6, 6)}.
The best pair is (6, 6), so the maximum profit is 6, obtained with (x1, x2, x3) = (1, 0, 1).
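The S^i construction translates directly into code built on (profit, weight) pairs. The routine below is an illustrative C++ sketch of the set method (names are ours), and it prints 6 for the instance above:

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// A sketch of the set method: S holds (profit, weight) pairs ordered by
// weight; S1^i shifts S^i by (p[i], w[i]); dominated pairs are purged.
int DPKnapsack(int n, int m, const vector<int>& p, const vector<int>& w) {
    vector<pair<int, int>> S = {{0, 0}};           // S^0 = {(0, 0)}
    for (int i = 0; i < n; i++) {
        vector<pair<int, int>> merged;
        for (auto& pr : S) {
            merged.push_back(pr);                   // pair from S^i
            if (pr.second + w[i] <= m)              // pair from S1^i, if feasible
                merged.push_back({pr.first + p[i], pr.second + w[i]});
        }
        sort(merged.begin(), merged.end(),
             [](const pair<int, int>& a, const pair<int, int>& b)
             { return a.second < b.second; });      // order by weight
        S.clear();
        for (auto& pr : merged)                     // purge dominated pairs
            if (S.empty() || pr.first > S.back().first)
                S.push_back(pr);
    }
    return S.back().first;                          // best profit with weight <= m
}

int main() {
    cout << DPKnapsack(3, 6, {1, 2, 5}, {2, 3, 4}) << endl;  // prints 6
}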
Applications
1. Routing a postal van to pick up mail from mailboxes located at n different
sites
• Suppose we have to route a postal van to pick up mail from mailboxes
located at n different sites.
• An n + 1 vertex graph can be used to represent the situation.
• One vertex represents the post office from which the postal van starts
and to which it must return. Edge (i, j) is assigned a cost equal to the
distance from site i to site j.
• The route taken by the postal van is a tour, and we have to find a tour of
minimum length.
2. Planning a robot arm to tighten the nuts on different positions
• Suppose we wish to use a robot arm to tighten the nuts on some piece of
machinery on an assembly line.
• The arm will start from its initial position (which is over the first nut
to be tightened), successively move to each of the remaining nuts, and
return to the initial position.
• The path of the arm is clearly a tour on a graph in which vertices repre-
sent the nuts.
• A minimum-cost tour will minimize the time needed for the arm to complete its task.
3. Planning production in which n different commodities are manufactured on
the same sets of machines
g(1, V − {1}) = min_{2≤k≤n} { c1k + g(k, V − {1, k}) }
g(i, S) = min_{j∈S} { cij + g(j, S − {j}) }
Procedure
g(i, ∅) = ci1, 2 ≤ i ≤ n
g(i, S) for all S with |S| = 1, 2 ≤ i ≤ n, 1 ∉ S, i ∉ S
g(i, S) for all S with |S| = 2, 2 ≤ i ≤ n, 1 ∉ S, i ∉ S
.
.
g(i, S) for all S with |S| = n − 2, 2 ≤ i ≤ n, 1 ∉ S, i ∉ S
g(1, S) for |S| = n − 1, S = V − {1}
Consider the following directed graph shown in Figure 3.10 and the edge length
given by the matrix.
[ 0  10  15  20 ]
[ 5   0   9  10 ]
[ 6  13   0  12 ]
[ 8   8   9   0 ]
|S| = 0:
g(i, ∅) = ci1:
g(2, ∅) = c21 = 5
g(3, ∅) = c31 = 6
g(4, ∅) = c41 = 8
|S| = 1, i = 2 to n:
g(i, S) = min_{j∈S} { cij + g(j, S − {j}) }
If i = 2,
g(2, {3}) = c23 + g(3, ∅) = 9 + 6 = 15
g(2, {4}) = c24 + g(4, ∅) = 10 + 8 = 18.
If i = 3,
g(3, {2}) = c32 + g(2, ∅) = 13 + 5 = 18
g(3, {4}) = c34 + g(4, ∅) = 12 + 8 = 20.
If i = 4,
g(4, {2}) = c42 + g(2, ∅) = 8 + 5 = 13
g(4, {3}) = c43 + g(3, ∅) = 9 + 6 = 15.
|S| = 2, i = 2 to n:
g(2, {3, 4}) = min { c23 + g(3, {4}), c24 + g(4, {3}) } = min { 9 + 20, 10 + 15 } = min { 29, 25 } = 25
g(3, {2, 4}) = min { c32 + g(2, {4}), c34 + g(4, {2}) } = min { 13 + 18, 12 + 13 } = min { 31, 25 } = 25
g(4, {2, 3}) = min { c42 + g(2, {3}), c43 + g(3, {2}) } = min { 8 + 15, 9 + 18 } = min { 23, 27 } = 23
|S| = 3:
g(1, {2, 3, 4}) = min { c12 + g(2, {3, 4}), c13 + g(3, {2, 4}), c14 + g(4, {2, 3}) }
= min { 10 + 25, 15 + 25, 20 + 23 } = min { 35, 40, 43 } = 35
Analysis
Time complexity = O(n² 2ⁿ)
Space complexity = O(n 2ⁿ)
Note
Edge from 1 to 2 = 10
Edge from 2 to 4 = 10
Edge from 4 to 3 = 9
Edge from 3 to 1 = 6. Total cost = 35 (tour of minimum cost).
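The g(i, S) recurrence is commonly implemented with S encoded as a bitmask over vertices 2..n. The sketch below is an illustrative C++ version (dp indexing and names are ours) and reproduces the tour cost 35 for the matrix above:

#include <iostream>
#include <vector>
using namespace std;

const int INF = 1000000;

// A sketch of the dynamic programming TSP: bit j-2 of S set means vertex
// j is in S; dp[S][i] = g(i, S), the shortest path from i back to 1
// passing exactly through the vertices of S.
int TSP(int n, const vector<vector<int>>& c) {     // c is 1-indexed
    int FULL = 1 << (n - 1);
    vector<vector<int>> dp(FULL, vector<int>(n + 1, INF));
    for (int i = 2; i <= n; i++) dp[0][i] = c[i][1];   // g(i, {}) = c[i][1]
    for (int S = 1; S < FULL; S++)
        for (int i = 2; i <= n; i++) {
            if (S & (1 << (i - 2))) continue;          // i must not be in S
            for (int j = 2; j <= n; j++)
                if (S & (1 << (j - 2))) {              // try each j in S
                    int sub = dp[S ^ (1 << (j - 2))][j];   // g(j, S - {j})
                    if (c[i][j] + sub < dp[S][i])
                        dp[S][i] = c[i][j] + sub;
                }
        }
    int best = INF;                                    // g(1, V - {1})
    for (int j = 2; j <= n; j++) {
        int S = (FULL - 1) ^ (1 << (j - 2));
        if (c[1][j] + dp[S][j] < best) best = c[1][j] + dp[S][j];
    }
    return best;
}

int main() {
    vector<vector<int>> c = {{}, {0, 0, 10, 15, 20}, {0, 5, 0, 9, 10},
                             {0, 6, 13, 0, 12}, {0, 8, 8, 9, 0}};
    cout << TSP(4, c) << endl;   // prints 35
}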
4 Backtracking
• For some problems, the only way to solve is to check all possibilities.
• To apply the backtracking method, the solution must be expressed as an
n-tuple (x1, x2, . . . , xn), where each xi ∈ Si, with Si being a finite set.
Constraints
• Solutions must satisfy a set of constraints.
• Constraints can be divided into two categories:
• Explicit
• Implicit
Explicit Constraints: Explicit constraints are rules that restrict each xi to take on
values only from a given set.
Example
All tuples satisfying the explicit constraints define a possible solution space for I (I =
problem instance).
Implicit Constraints: The implicit constraints are rules that determine which of
the tuples in the solution space of I satisfy the criterion function. Thus, implicit con-
straints describe the way in which the xi must relate to each other.
void Backtrack(int k)
// This is a schema that describes the backtracking process using recursion.
// On entering, the first k − 1 values x[1], x[2], . . . , x[k − 1] of the solution
// vector x[1:n] have been assigned. x[] and n are global.
{
  for (each x[k] such that x[k] ∈ T(x[1], . . . , x[k−1])) {
    if (Bk(x[1], x[2], . . . , x[k])) {
      if (x[1], x[2], . . . , x[k] is a path to an answer node)
        output x[1:k];
      if (k < n) Backtrack(k+1);
    }
  }
}
termInology
Solution space: Tuples that satisfy the explicit constraints define a solution
space. The solution space can be organized into a tree.
Problem state: Problem state is each node in the depth-first search tree.
State space: State space is the set of all paths from the root node to the other
nodes.
Solution states: Solution states are the problem states s for which the path
from the root node to s defines a tuple in the solution space.
• In variable-tuple size formulation tree, all nodes are solution states.
• In fixed-tuple size formulation tree, only the leaf nodes are solution states.
• The solution space is partitioned into disjoint sub-solution spaces at each internal node.
Answer states: Answer states are those solution states s for which the path
from the root node to s defines a tuple that is a member of the set of solutions.
• These states satisfy implicit constraints.
State-space trees: State-space trees are the tree organization of the solution
space.
Static trees: Static trees are ones for which tree organizations are independent
of the problem instance being solved.
• Fixed-tuple size formulation
• The tree organization is independent of the problem instance being solved.
Dynamic trees: Dynamic trees are ones for which the organization is depen-
dent on the problem instance.
Live nodes: Live nodes are the generated nodes for which all the children have
not yet been generated.
E-Node: E-node is a live node whose children are currently being generated
and expanded.
Dead nodes: Dead nodes are the generated nodes that are not to be expanded
any further or for which all their children have been generated.
• All the children of a dead node are already generated.
• Live nodes are killed using a bounding function to make them dead nodes.
• These states satisfy the implicit constraints.
Bounding function: The bounding function is a function used to kill live
nodes without generating all their children.
Backtracking: Backtracking is depth-first node generation with bounding
functions.
Branch and bound: Branch and bound is a state generation method in which
E-node remains E-node until it is dead.
Breadth-first search: Branch and bound with each new node placed in a
queue. The front of the queue becomes the new E-node.
Depth search (D-search): New nodes are placed into a stack. The last node
added is the first to be explored.
8-queens Problem
It is a classic combinatorial problem that places eight queens on an 8 × 8 chessboard
so that no two queens attack each other. That is, no two queens should be placed on
same row, column, or diagonal. The 8 × 8 chessboard is numbered from 1 to 8 (rows
and columns). The queens are numbered from 1 to 8 as shown in Figure 4.1. Since
each queen must be on a different row, we can assume queen i is placed on row i.
Explicit criteria: S = {1, 2, 3, . . . , 8}. Define set ‘S,’ which has 8 values since
it is an 8 × 8 matrix.
Implicit criteria: No two queens must attack each other. No two xi’s can be
same.
• For example, consider the queen at a[4][2]. The squares that are diagonal to
this queen are a[3][1], a[5][3], a[6][4], a[7][5], and a[8][6].
• All these squares have a row–column value of 2.
• Every element on the same diagonal that goes from the upper right to the
lower left has the same ‘row + column’ value.
Suppose two queens are placed at positions (i, j) and (k, l). Then they are on the
same diagonal only if i − j = k − l or i + j = k + l.
bool Place(int k, int i)
// Returns true if the kth queen can be placed in column i; x[] is global.
{
  for (int j = 1; j < k; j++)
    if ((x[j] == i)                        // Two in the same column
        || (abs(x[j] - i) == abs(j - k)))  // or in the same diagonal
      return (false);
  return (true);
}
Place(k,i) is a function that returns a Boolean value that is true if kth queen can
be placed in column i. It tests both whether i is distinct from all previous values
x[1], . . . , x[k − 1] and whether there is no other queen on the same diagonal. Its
computing time is O (k − 1).
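Combining Place with the recursive backtracking schema gives a complete solver. The following is a self-contained C++ sketch (the solution counter is our own addition):

#include <cstdlib>
#include <iostream>
using namespace std;

int n = 8, x[20], solutions = 0;

// Place(k, i): can queen k go in column i, given x[1..k-1]?
bool Place(int k, int i) {
    for (int j = 1; j < k; j++)
        if (x[j] == i                          // two in the same column
            || abs(x[j] - i) == abs(j - k))    // or in the same diagonal
            return false;
    return true;
}

// NQueens(k): try every column for queen k; recurse when Place succeeds.
void NQueens(int k) {
    for (int i = 1; i <= n; i++)
        if (Place(k, i)) {
            x[k] = i;
            if (k == n) solutions++;           // x[1..n] is an answer node
            else NQueens(k + 1);
        }
}

int main() {
    NQueens(1);
    cout << solutions << endl;                 // prints 92 for n = 8
}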
For an 8 × 8 chessboard, there are C(64, 8) possible ways to place eight pieces, as shown
in Figure 4.2. We can use Estimate to estimate the number of nodes that will be
generated by N-queens.
Note that the assumptions that are needed for Estimate do hold for N-queens.
The bounding function is static. No change is made to the function as the search
proceeds. In addition, all nodes on the same level of the state-space tree have the
same degree. In Figure 4.2, we see five 8 × 8 chessboards that were created using
Estimate. The total number of nodes in the 8-queens state-space tree is
1 + ∑_{j=1}^{7} ∏_{i=0}^{j−1} (8 − i) = 1 + 8 + 8 × 7 + · · · + 8 × 7 × 6 × 5 × 4 × 3 × 2 = 69,281.
So the estimated number of unbounded nodes is only about 2.34% of the total
number of nodes in the 8-queens state-space tree.
4-queens Problem
It is a classic combinatorial problem that places four queens on a 4 × 4 chessboard so
that no two queens attack each other. That is, no two queens should be placed on the
same row, column, or diagonal.
State space: Each node in the tree defines the problem state. All paths from the
root to the nodes are called the state space of the problem.
Answer states: Answer states are those solution states s for which the path
from the root to ‘s’ defines a tuple that is a member of the set of solutions
of the problem.
State-space tree: The tree organization of the solution space is referred to as
the state-space tree.
Live node: A node that has been generated and whose children have not yet all
been generated is called a live node.
E-node: The live node whose children are currently being generated is called
the E-node.
Dead node: A dead node is a generated node that is not to be expanded further
or whose children have not all been generated.
Figure 4.4 shows how backtracking works on the 4-queens problem, and the solution
is shown in Figure 4.5.
• Node 2 is an E-node.
• Now, node 3 is generated from the second node with a path length of 2 (Path =
1, 2). So this corresponds to placing the second queen in the second column.
• Now, node 3 is killed; it becomes a dead node.
• Backtrack to node 2; node 2 is an E-node, so node 8 is generated with a
path = [1, 3].
• Now, node 9 is generated with a path = [1, 3, 2].
• So, node 9 is killed and is assumed to be a dead node.
• Backtrack to node 8; from this, node 11 is generated with a path = [1, 3, 4].
• So, node 11 is also killed; it is the dead node. Finally, node 8 is also killed
because we can’t place a third queen with reference to the position of the
second queen.
• Now backtrack to node 2.
• Now from node 2, node 13 is generated with a path = [1, 4].
• From node 13, node 14 is generated with a path = [1, 4, 2].
• From node 14, node 15 is generated with a path = [1, 4, 2, 3].
• So node 15 is killed; it is called a dead node. Now backtrack to node 14. So
kill node 14 also because no other possibilities to place the fourth queen
from node 14.
• Backtrack to node 13, and node 16 is generated with a path = [1, 4, 3].
• So kill node 16 and kill node 13, and the path we arrive at is wrong, so
finally node 2 is also killed.
• Backtrack to node 1, which generates node 18 with a path = [2].
• Now node 19 is generated from node 18 with a path [2, 1].
• So node 19 is killed. Backtrack to node 18.
• Now node 24 is generated with a path = [2, 3].
• So node 24 is killed. Backtrack to node 18.
• Now node 29 is generated with a path = [2, 4].
• Node 30 is generated from node 29 with a path = [2, 4, 1].
• Now node 31 is generated from node 30 with a path = [2, 4, 1, 3].
n-queens Problem
Consider an n × n chessboard and try to find all ways to place n nonattacking queens.
Here (Xl, . . . . . . . , Xn) represents a solution in which Xi is the column of the ith row
where the ith queen is placed. All Xis will be distinct since no two queens can be
placed in the same column.
If we imagine a chessboard’s squares being numbered as the indices of the two-
dimensional array a [l: n] [l: n], then every element on the same diagonal that runs
from the upper left to the lower right has the same ‘row–column’ value.
The solution space can be organized using one of two tuple-size formulations:
1. Variable-sized
2. Fixed-sized
Explanation
The tree of Figure 4.6 corresponds to the variable tuple size formulation. Here, start
from node 1. Initially, x varies from 1 to 4. x1 = 1, 2, 3, 4. From x1, other variations
are x2 = 2, 3, 4, and from x2 = 2, other variations are x3 = 3, 4. From x3, next x4 = 4.
And here, only the increasing order of tree construction is used.
Given m = 31:
Disadvantage: Backtracking will report only the first solution it finds. It will not see
other solutions.
The tree is called a static tree; the left child has the value 1, and the right child
has the value 0. For any fixed-sized representation, the state space has
2^{n+1} − 1 nodes. An example of the binary method is shown in Figure 4.7.
For example, for n = 4, 2^{4+1} − 1 = 31.
Method of Construction
Conditions
• If selected weight > m, then stop.
• If (selected weight + sum of weight) < m, then stop.
Explanation
• Initially, no weight was selected and level = 1; the remaining weights = all
object weights (i.e., 11 + 13 + 24 + 7 = 55).
• So a node contains [0, 1, 55]
• Now from the node [24, 4, 7], x4 = 1 means w4 = 7 was selected, so the total
weight = 24 + 7 = 31, level = 5, and the sum of weights = 7 – 7 = 0. So the
node becomes [31, 5, 0].
• From the node [24, 4, 7], x4 = 0 means w4 = 7 was not selected, so the weight
selected = 24, level = 5, and the sum of weights = 7 − 7 = 0. So the node
becomes [24, 5, 0].
• From [11, 3, 31] x3 = 1 or x3 = 0. If x3 = 1 means w3 = 24 was selected, but
11 + 24 = 35. 35 > 31. So stop, and it creates the node [35, 4, 7].
• From [11, 3, 31], if x3 = 0 means w3 = 24 was not selected. So the selected
weight = 11 and level = 4, and the sum of weight = 7. It creates the node
[11, 4, 7].
Algorithm
void SumOfSub(float s, int k, float r)
// Find all subsets of w[1:n] that sum to m. The values of x[j], 1 <= j < k,
// have already been determined; s = ∑_{j<k} w[j]*x[j] and r = ∑_{j>=k} w[j].
// The w[j]'s are in nondecreasing order. It is assumed that w[1] <= m and
// ∑_{i=1}^{n} w[i] >= m.
{
  // Generate left child. Note that s + w[k] <= m because Bk-1 is true.
  x[k] = 1;
  if (s + w[k] == m) { // Subset found
    for (int j = 1; j <= k; j++) cout << x[j] << ' ';
    cout << endl;
    // There is no recursive call here as w[j] > 0, 1 <= j <= n.
  }
  else if (s + w[k] + w[k+1] <= m)
    SumOfSub(s + w[k], k+1, r - w[k]);
  // Generate right child and evaluate Bk.
  if ((s + r - w[k] >= m) && (s + w[k+1] <= m)) {
    x[k] = 0;
    SumOfSub(s, k+1, r - w[k]);
  }
}
Bounding Functions
A simple choice for the bounding functions is Bk(x1, . . . , xk) = true iff

∑_{i=1}^{k} wi xi + ∑_{i=k+1}^{n} wi ≥ m.

Clearly x1, . . . , xk cannot lead to an answer node if this condition is not satisfied.
The bounding functions can be strengthened if we assume the wi's are initially in a
nondecreasing order. In this case, x1, . . . , xk cannot lead to an answer node if

∑_{i=1}^{k} wi xi + wk+1 > m.

Therefore, the bounding functions we use are

Bk(x1, . . . , xk) = true iff ∑_{i=1}^{k} wi xi + ∑_{i=k+1}^{n} wi ≥ m and ∑_{i=1}^{k} wi xi + wk+1 ≤ m.
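A small driver for SumOfSub might look as follows. It assumes the function above is in scope together with <iostream>, and the instance w = (7, 11, 13, 24), m = 31 (weights in nondecreasing order) is our own illustration:

// Globals assumed by SumOfSub (sizes are illustrative).
float w[] = {0, 7, 11, 13, 24};   // w[1..4]; index 0 unused
int x[5];
float m = 31;
int n = 4;

int main() {
    float r = 7 + 11 + 13 + 24;   // r = total of all weights
    SumOfSub(0, 1, r);
    // Expected output: 1 1 1 (7 + 11 + 13) and 1 0 0 1 (7 + 24).
}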
Planar Graph
A graph is said to be planar iff it can be drawn in a plane in such a way that no two
edges cross each other.
A famous special case of the m-colorability decision problem is the 4-color prob-
lem for planar graphs. This turns out to be a problem for which graphs are very use-
ful, because a map can easily be transformed into a graph. Each region of the map
becomes a node, and if two regions are adjacent, then the corresponding nodes are
joined by an edge. Figure 4.9 shows a map with five regions and its corresponding
graph. This map requires four colors.
Given any map, the regions can be colored in such a way that no two adjacent
regions have the same color.
Algorithm
• Represent a graph by its adjacency matrix G[1:n, 1:n], where G[i, j] = 1 if
(i, j) is an edge of ‘G’ and G[i, j] = 0 if (i, j) is not an edge of G.
• The colors are represented by the integers 1, 2, . . . . . . . , m, and the solu-
tions are given by the n-tuple (x1, x2, . . . . . . . . . . , xn), where xi is the color
of node i.
• Finding all m-colorings of the graph.
void mColoring(int k)
// This program was formed using the recursive backtracking schema.
// The graph is represented by its Boolean adjacency matrix G[1:n][1:n].
// All assignments of 1, 2, . . . , m to the vertices of the graph such that
// adjacent vertices are assigned distinct integers are printed.
// k is the index of the next vertex to color.
{
  do { // Generate all legal assignments for x[k].
    NextValue(k); // Assign to x[k] a legal color.
    if (!x[k]) break; // No new color is possible; backtrack.
    if (k == n) { // At most m colors have been used to color the n vertices.
      for (int i = 1; i <= n; i++) cout << x[i] << ' ';
      cout << endl;
    }
    else mColoring(k+1); // Color the remaining vertices.
  } while (1);
}
void NextValue(int k)
// x[1], . . . , x[k-1] have been assigned integer values in the range [1, m]
// such that adjacent vertices have distinct integers. A value for x[k] is
// determined in the range [0, m]. x[k] is assigned the next-highest-numbered
// color while maintaining distinctness from the adjacent vertices of vertex k.
// If no such color exists, then x[k] is zero.
{
  do {
    x[k] = (x[k]+1) % (m+1); // Next highest color
    if (!x[k]) return; // All colors have been used.
    int j;
    for (j = 1; j <= n; j++) { // Check if this color is
      // distinct from adjacent colors.
      if (G[k][j] // If (k, j) is an edge
          && (x[k] == x[j])) // and if adjacent vertices
        break; // have the same color
    }
    if (j == n+1) return; // New color found
  } while (1); // Otherwise try to find another color.
}
Example 1
m = 3. Consider the square graph with vertices 1, 2, 3, 4 and edges (1, 2), (2, 3),
(3, 4), and (4, 1).
Method of Construction
• First we color node 1; we are given three colors. We can color node 1 with
one of the three colors (1/2/3).
• So we have three children from root node in the tree.
• If we select color 1 for node 1, then node 2 can be colored with either 2 or 3
so that we have two children from node 2.
• Now assume node 2 colored with color 2.
• Then node 3 can be colored with either color 1 or 3.
• Again assume we selected color 1 for node 3. Then node 4 can be colored
with 2 or 3.
• Now backtrack to node 3; if it was colored with color 3, then node 4 can be
colored with color 2 alone only because node 1 and node 3 were colored
with color 1 and color 3. So node 4 can be colored with color 2 only.
• Now backtrack to node 2; if it is colored with color 3, then node 3 can be
colored with color 1 or 2.
• Now we assume node 3 is colored with color 1; then node 4 can be colored
with color 2 or 3, and likewise, if node 3 is colored with color 2, then node
4 can be colored with color 3 only.
• Likewise, continue for the full solution tree as shown in Figure 4.10.
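A driver for the square graph of Example 1 might be set up as follows; it assumes the mColoring and NextValue functions above together with their global variables (the adjacency matrix here is our encoding of the example graph):

#include <iostream>
using namespace std;

int n = 4, m = 3;                    // four vertices, three colors
int G[5][5] = { {0},                 // row/column 0 unused
                {0, 0, 1, 0, 1},     // vertex 1 is adjacent to 2 and 4
                {0, 1, 0, 1, 0},     // vertex 2 is adjacent to 1 and 3
                {0, 0, 1, 0, 1},     // vertex 3 is adjacent to 2 and 4
                {0, 1, 0, 1, 0} };   // vertex 4 is adjacent to 1 and 3
int x[5];

int main() {
    for (int i = 1; i <= n; i++) x[i] = 0;
    mColoring(1);   // prints all 18 proper 3-colorings of the square
}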
There is no known easy way to determine whether a given graph contains a Hamiltonian
cycle. A backtracking algorithm is used to find all the Hamiltonian cycles in a graph.
The graph may be directed or undirected. Only distinct cycles are output.
The backtracking solution vector (x1, . . . . . . . . , xn) is defined so that Xi represents
the ith visited vertex of the proposed cycle. The function NextValue (k) is used to
determine a possible next vertex for the proposed cycle.
Example
Consider the graph shown in Figure 4.12.
Start at vertex a; make vertex a the root of the state-space tree. Select vertex b.
Next select vertex c. Then select d, then e, and finally f, which proves to be a dead
end as shown in Figure 4.13.
So the algorithm backtracks from f to e; there are no useful alternatives there. It
backtracks from e to d and then to c, which provides the first alternative for the
algorithm to pursue. Going from c to e eventually proves useless, as shown in Figure 4.14.
The algorithm backtracks from e to c. There are no other alternatives. Then back-
track from c to b as shown in Figure 4.15. From there, it goes to the vertices f, e, c,
and d, from which it returns to a, yielding the Hamiltonian circuit a, b, f, e, c, d, a.
Note
The traveling salesperson problem is used to find a tour that has minimum cost. This
tour is a Hamiltonian cycle. For the simple case of a graph whose edge costs are all
identical, Hamiltonian will find a minimum-cost tour if a tour exists. If the common
edge cost is c, the cost of a tour is cn since there are n edges in a Hamiltonian cycle.
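A compact backtracking sketch for Hamiltonian cycles follows. This is our own simplified version, not the book's NextValue-based routine: vertex 1 is fixed as the start, and the closing edge is checked once all n vertices have been placed.

#include <iostream>
using namespace std;

const int N = 5;
int G[N + 1][N + 1];    // adjacency matrix, 1-indexed
int x[N + 1];           // x[k] = kth vertex of the proposed cycle
bool used[N + 1];

void Hamiltonian(int k) {
    if (k > N) {
        if (G[x[N]][x[1]]) {                    // closing edge exists: a cycle
            for (int i = 1; i <= N; i++) cout << x[i] << ' ';
            cout << x[1] << endl;
        }
        return;
    }
    for (int v = 2; v <= N; v++)                // vertex 1 is fixed as x[1]
        if (!used[v] && G[x[k - 1]][v]) {
            x[k] = v; used[v] = true;
            Hamiltonian(k + 1);
            used[v] = false;                    // backtrack
        }
}

int main() {
    // An assumed example: the cycle 1-2-3-4-5-1 plus the chord (2, 5).
    int edges[][2] = {{1,2},{2,3},{3,4},{4,5},{5,1},{2,5}};
    for (auto& e : edges) G[e[0]][e[1]] = G[e[1]][e[0]] = 1;
    x[1] = 1; used[1] = true;
    Hamiltonian(2);     // prints each cycle once in each direction
}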
In the 0/1 knapsack problem, we seek a vector (x1, . . . , xn) such that

∑_{1≤i≤n} wi xi ≤ m and ∑_{1≤i≤n} pi xi is maximized.
The Xi’s constitute a zero-one-valued vector. The solution space for this problem
consists of the 2n distinct ways to assign zero or one values to the Xi’s.
A Bounding Function
float Bound(float cp, float cw, int k)
// cp is the current profit total, cw is the current
// weight total; k is the index of the last removed
// item; and m is the knapsack size.
{
  float b = cp, c = cw;
  for (int i = k+1; i <= n; i++) {
    c += w[i];
    if (c < m) b += p[i];
    else return (b + (1 - (c - m)/w[i]) * p[i]);
  }
  return (b);
}
From Bound, it follows that the bound for a feasible left child of a node Z is the same
as that for Z. Hence, the bounding function need not be used whenever the backtrack-
ing algorithm makes a move to the left child of a node. The resulting algorithm is
BKnap. It was obtained from the recursive backtracking schema.
So far, all our backtracking algorithms have worked on a static state-space tree.
We now see how a dynamic state-space tree can be used for the knapsack problem.
One method for dynamically partitioning the solution space is based on trying to
obtain an optimal solution using the greedy algorithm.
We first replace the integer constraint Xi = 0 or 1 by the constraint 0 ≤ xi ≤ 1. This
yields the relaxed problem
max ∑_{1≤i≤n} pi xi subject to ∑_{1≤i≤n} wi xi ≤ m.
Example
Consider the knapsack instance n = 3, w = [20, 15, 15], P = [40, 25, 25], and C = 30.
State-Space Tree
The best solution is node L, with a profit of 50.
• The root is the only live node at this time. It is also the E-node (expansion
node).
• From here, move to either B or C. Suppose the move is to B. The live nodes
are A and B.
• B is the current E-node. At node B the remaining capacity r is 10, and the
profit earned cp is 40. From B, move to either D or E. The move to D is
infeasible, as the capacity needed to move there is w2 = 15. The move to E is
feasible, as no capacity is used in this move.
• E becomes the new E-node. The live nodes at this time are A, B, and E. At
node E, r = 10 and cp = 40. From E, move to either J or K. The move to node
J is infeasible.
5.1 INTRODUCTION
Graphs are formal descriptions for a wide range of familiar situations. The most
common example is a road map, which shows the location of intersections and the
roads that run between them. In graph terminology, the intersections are the nodes of
the graph, and the roads are the edges. Sometimes our graphs are directed (like one-
way streets) or weighted with a travel cost associated with each edge (like toll roads).
As we give more details on the terminology and concepts in graphs, the similarity to
road maps will be even clearer.
There are times when information needs to be distributed to a large number of
people or to the computers on a large network. We would like this information to get
everywhere, but we also don’t want it going any place twice. Some groups of people
will accomplish this by setting up a ‘phone tree’ where each person has a small num-
ber of people to call to pass on news. If everyone appears once in the tree and if the
tree is not very deep, information will travel to everyone very quickly. For graphs,
this is a bit more complicated, because there are typically many more connections
between nodes than in a tree. We will look at two graph traversal methods, depth-
first and breadth-first, to accomplish this.
Graph G consists of two sets V and E. The set V is a finite, nonempty set of ver-
tices. The set E is a set of pairs of vertices; these pairs are called edges. A graph is
a finite set of vertices (nodes) with edges between nodes, that is, Graph G = (V, E),
where
Types of Graphs
There are two types of graphs:
1. Undirected
2. Directed
Undirected graph: In an undirected graph the pair of vertices representing any edge
is unordered. A graph G is called undirected if every edge in it is undirected. Thus,
the pairs (u, v) and (v, u) represent the same edge. The graph depicted in Figure 5.1
has six vertices and seven undirected edges:
V = {a, b, c, d, e, f }, E = {(a, c), (a, d), (b, c), (b, f), (c, e), (d, e), (e, f)}.
Directed graph: In a directed graph the pair of vertices representing an edge is ordered. For example, a directed graph on the same vertices is
V = {a, b, c, d, e, f }, E = {(a, c), (b, c), (b, f), (c, e), (d, a), (d, e), (e, c), (e, f)}.
• A graph may not have an edge from a vertex v back to itself. That is, the
edges of the form (v, v) are not legal. Such edges are known as self-edges
or self-loops.
• A graph may not have multiple occurrences of the same edge.
Note: The number of distinct unordered pairs (u, v) with u ≠ v in a graph with n
vertices is n(n − 1)/2. This is the maximum number of edges in any n-vertex undirected
graph. An n-vertex undirected graph with exactly n(n − 1)/2 edges is said to be complete.
Terminology
Adjacency: Two vertices are adjacent if they are connected with an edge. When
(x, y) is an edge, we say that x is adjacent to y, and y is adjacent from x.
Path: A path in graph G is a sequence of vertices (nodes) V1, V2, . . . , Vk
such that there is an edge from each node to the next one in the sequence,
that is, (Vi, Vi+1) ∈ E for 1 ≤ i < k.
Length of path: The length of a path is defined as the number of edges in a path.
Degree: The degree of a vertex is the number of edges incident to that vertex.
Graph Representation
There are two standard ways to represent a graph:
• Adjacency matrix
• Adjacency list
Advantages
• Simple to implement
• Easy and fast to tell if a pair (i, j) is an edge: simply check if A[i][j] is 1 or 0.
Disadvantage
• No matter how few edges the graph has, the matrix takes O(n²) in memory.
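Both representations can be set up for the undirected graph of Figure 5.1; the following C++ sketch is illustrative, with vertices a..f mapped to 0..5 by us:

#include <vector>
using namespace std;

const int n = 6;
int A[n][n];            // adjacency matrix: O(n^2) memory
vector<int> adj[n];     // adjacency lists: O(n + e) memory

// Store an undirected edge in both representations (both directions).
void AddEdge(int u, int v) {
    A[u][v] = A[v][u] = 1;
    adj[u].push_back(v);
    adj[v].push_back(u);
}

int main() {
    // Edges of Figure 5.1 with a=0, b=1, c=2, d=3, e=4, f=5.
    int E[][2] = {{0,2},{0,3},{1,2},{1,5},{2,4},{3,4},{4,5}};
    for (auto& e : E) AddEdge(e[0], e[1]);
}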
Program: BFS
void BFS(int v)
// A breadth-first search of G is carried out beginning at vertex v.
// For any node i, visited[i] == 1 if i has already been visited. The
// graph G and array visited[] are global; visited[] is initialized to zero.
{
  int u = v; Queue q(SIZE);
  // q is a queue of unexplored vertices.
  visited[v] = 1;
  do {
    for (all vertices w adjacent from u) {
      if (visited[w] == 0) {
        q.AddQ(w); // w is unexplored.
        visited[w] = 1;
      }
    }
    if (q.Qempty()) return; // No unexplored vertex
    q.DeleteQ(u); // Get first unexplored vertex.
  } while (1);
}
If a BFS is used on a connected undirected graph G, then all vertices in G get visited,
and the graph is traversed. However, if G is not connected, then at least one vertex of
G is not visited. A complete traversal of the graph can be made by repeatedly calling
BFS each time with a new unvisited starting vertex. The resulting traversal algorithm
is known as a BFT. Consider the example graph shown in Figure 5.5 and how to find
out the adjacency matrix and BFS in Figure 5.6.
Adjacency Matrix
C L N B P
C 0 1 1 0 0
L 0 0 1 0 0
N 0 0 0 0 0
B 1 1 1 0 1
P 0 0 1 0 0
Procedure
1. Begin with any node and mark it as visited.
2. Proceed to the next node having any edge connection to the node in step 1
and mark it as visited.
3. Come back to the node in step 1; descend along an edge toward an unvisited
node and mark the new node as visited.
4. Repeat step 3 until all the nodes adjacent to the node in step 1 have been
marked as visited.
5. Repeat steps 1 through 4 from the node visited in step 2; then start from the
node visited in step 3.
6. Keep this up as long as possible.
• At this time, the exploration of the new vertex u begins. When this new
vertex has been explored, the exploration of v continues.
• The search terminates when all reached vertices have been fully explored.
• A DFT of a graph is carried out by repeatedly calling DFS, with a new
unvisited starting vertex each time.
Adjacency Matrix
C L N B P
C 0 1 1 0 0
L 0 0 1 0 0
N 0 0 0 0 0
B 1 1 1 0 1
P 0 0 1 0 0
Procedure
1. Choose any node in the graph designated as a search node and mark it as
visited.
2. Using the adjacency matrix of the graph, find a node adjacent to the search
node that has not been visited, designate it as a new search node, and mark
it as visited.
3. Repeat step 2 using the new search node. If no node satisfying step 2
can be found, then return to the previous search node and continue from
there.
4. When a return to a previous search node in step 3 is impossible, then the
search from the originally chosen node is complete.
5. If the graph still contains unvisited nodes, then choose any node that has not
been visited and repeat steps 1 through 4.
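The procedure above translates naturally into a recursive routine. A minimal C++ sketch on the same adjacency matrix (C, L, N, B, P mapped to 0..4 by us):

#include <iostream>
using namespace std;

const int n = 5;
int G[5][5] = { {0,1,1,0,0},    // C
                {0,0,1,0,0},    // L
                {0,0,0,0,0},    // N
                {1,1,1,0,1},    // B
                {0,0,1,0,0} };  // P
int visited[5];

void DFS(int v) {
    visited[v] = 1;
    cout << v << ' ';
    for (int w = 0; w < n; w++)         // pick an unvisited adjacent node,
        if (G[v][w] && !visited[w])     // make it the new search node (step 2);
            DFS(w);                     // returning here is step 3
}

int main() {
    DFS(3);     // starting from B visits B C L N P (prints 3 0 1 2 4)
}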
1. Tree edge
Edge (u, v) is a tree edge if vertex v is first discovered by exploring edge (u,
v) as shown in Figure 5.9.
2. Back edge
Edge (u, v) is a back edge that connects a vertex u to ancestor v in a depth-
first tree as shown in Figure 5.10.
3. Forward edge
Edge (u, v) is a forward edge (or non-tree edge) that connects a vertex u to
descendent v in a depth-first tree as shown in Figure 5.11.
4. Cross edge
Cross edges are other edges that go between
• vertices of the same tree, as long as one vertex is not an ancestor of the
other.
• vertices of different trees as shown in Figure 5.12.
Applications
1. A BFS is used to determine whether G is a connected graph or not because
• if G is connected, then all vertices will be visited on the first call to BFS.
• if G is not connected, then at least two calls to BFS will be needed.
2. A BFT is used to obtain the connected component of a graph.
All newly visited vertices on a call to BFS from BFT represent the vertices in a con-
nected component of G.
If adjacency lists are used, a BFT will obtain the connected components in
θ (n + e) time.
1. Prim’s algorithm
2. Kruskal’s algorithm
PrIm’s algorIthm
Prim’s algorithm constructs a minimum spanning tree through a sequence of expand-
ing subtrees. The initial subtree in such a sequence consists of a single vertex selected
arbitrarily from the set V of the graph’s vertices. On each iteration, the algorithm
expands the current tree in a greedy manner by simply attaching to it the nearest
vertex that is not in that tree. (By the nearest vertex, we mean a vertex that is not
in the tree connected to a vertex in the tree by an edge of the smallest weight. Ties
can be broken arbitrarily.) The algorithm stops after all the graph’s vertices have
been included in the tree being constructed. Since the algorithm expands a tree by
exactly one vertex on each of its iterations, the total number of such iterations is n − 1,
where n is the number of vertices in the graph. The tree generated by the algorithm
is obtained as the set of edges used for the tree expansions. One way to compute an
MST is to grow the tree in successive stages. In each stage, one node is picked as the
root and an edge is added; that is, an associated vertex is added to the tree until all
the vertices are present in the tree with |V − 1| edges. Figure 5.16 gives an example
of this algorithm in operation.
A greedy method to obtain a minimum-cost spanning tree builds this tree edge by
edge. The next edge to include is chosen according to some optimization criterion.
The Strategy
1. One node is picked as a root node (u) from the given connected graph.
2. At each stage choose a new vertex v from u, by considering an edge (u, v)
with minimum cost among all the edges from u, where u is already in the
tree and v is not in the tree.
3. The Prim’s algorithm table is constructed with three parameters:
• Known—the vertex is added in the tree or not.
v known dv pv
A 0 0 0
B 0 ∞ 0
C 0 ∞ 0
D 0 ∞ 0
E 0 ∞ 0
F 0 ∞ 0
G 0 ∞ 0
v known dv pv
A 1 0 0
B 0 28 A
C 0 ∞ 0
D 0 ∞ 0
E 0 ∞ 0
F 0 10 A
G 0 ∞ 0
v known dv pv
A 1 0 0
B 0 28 A
C 0 ∞ 0
D 0 ∞ 0
E 0 25 F
F 1 10 A
G 0 ∞ 0
v known dv pv
A 1 0 0
B 0 28 A
C 0 ∞ 0
D 0 22 E
E 1 25 F
F 1 10 A
G 0 24 E
v known dv pv
A 1 0 0
B 0 28 A
C 0 12 D
D 1 22 E
E 1 25 F
F 1 10 A
G 0 24 E
v known dv pv
A 1 0 0
B 0 16 C
C 1 12 D
D 1 22 E
E 1 25 F
F 1 10 A
G 0 24 E
Here the dv value of B is updated using an update rule. That is, B can be reached
through either A or C from the edge with the minimum cost that is chosen.
v known dv pv
A 1 0 0
B 1 16 C
C 1 12 D
D 1 22 E
E 1 25 F
F 1 10 A
G 0 14 B
v known dv pv
A 1 0 0
B 1 16 C
C 1 12 D
D 1 22 E
E 1 25 F
F 1 10 A
G 1 14 B
The edges in the spanning tree can be read from the table: (A, F), (F, E), (E, D), (D, C), (C, B), and (B, G).
The total number of edges is 6, which is equivalent to |V| − 1 (number of vertices − 1).
The sequence of the edges added at each stage starts from vertex A and ends at vertex
G. Figures 5.17 and 5.18 show the application of Prim's algorithm.
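The known/dv/pv table is exactly the state of the following C++ sketch (our own names; the edge list in main is read off the tables above):

#include <iostream>
using namespace std;

const int INF = 1000000;

// A sketch of Prim's algorithm: dv[v] is the cheapest edge from v into
// the tree, pv[v] its endpoint in the tree; returns the MST cost.
int Prim(int n, int cost[][7]) {        // cost[i][j] = INF if no edge
    int dv[7], pv[7]; bool known[7] = {false};
    for (int v = 0; v < n; v++) { dv[v] = INF; pv[v] = -1; }
    dv[0] = 0;                          // start from vertex A = 0
    int total = 0;
    for (int step = 0; step < n; step++) {
        int u = -1;
        for (int v = 0; v < n; v++)     // pick the unknown vertex with min dv
            if (!known[v] && (u < 0 || dv[v] < dv[u])) u = v;
        known[u] = true;
        total += dv[u];
        for (int v = 0; v < n; v++)     // the update rule for neighbors of u
            if (!known[v] && cost[u][v] < dv[v]) {
                dv[v] = cost[u][v];
                pv[v] = u;
            }
    }
    return total;
}

int main() {
    int c[7][7];
    for (int i = 0; i < 7; i++)
        for (int j = 0; j < 7; j++) c[i][j] = INF;
    // A..G = 0..6; the edges read off the tables above.
    int E[][3] = {{0,1,28},{0,5,10},{1,6,14},{1,2,16},{2,3,12},
                  {3,4,22},{4,6,24},{4,5,25}};
    for (auto& e : E) c[e[0]][e[1]] = c[e[1]][e[0]] = e[2];
    cout << Prim(7, c) << endl;   // prints 99, the MST cost of the example
}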
KrusKal’s algorIthm
This is another greedy strategy for selecting edges in order of the smallest weight
and accepting an edge if it does not cause a cycle. Kruskal’s algorithm maintains a
forest collection of trees. It is named Kruskal's algorithm after Joseph Kruskal, who
discovered this algorithm when he was a second-year graduate student. Kruskal's
algorithm looks at an MST of a weighted connected graph G = (V, E) as an acyclic
subgraph with |V| − 1 edges for which the sum of the edge weights is the smallest.
(It is not difficult to prove that such a subgraph must be a tree.) Consequently, the
algorithm constructs an MST as an expanding sequence of subgraphs that are always
acyclic but are not necessarily connected in the intermediate stages of the algorithm.
The algorithm begins by sorting the graph’s edges in a nondecreasing order of
their weights. Then, starting with the empty subgraph, it scans this sorted list, adding
the next edge on the list to the current subgraph if such an inclusion does not create
a cycle and simply skipping the edge otherwise.
Initially there are |V| single-node trees. Adding an edge merges two trees into
one. When the algorithm terminates there is only one tree, and this is the MST. The
algorithm terminates when enough edges are accepted. It turns out to be simple to
decide whether edge (u, v) should be accepted or rejected. Two vertices belong to
the same set if and only if they are connected in the current spanning forest. Each
vertex is initially in its own set. If u and v are in the same set, the edge is rejected
because they are already connected or because adding (u, v) would form a cycle.
Otherwise, the edge is accepted, and a union is performed on the two sets containing
u and v. Figure 5.19 gives an example of this algorithm in operation, and the stages
in Kruskal’s algorithm are shown in Figure 5.20.
The Strategy
1. The edges are built into a min-heap structure, and each vertex is considered
a single-node tree.
2. The Deletemin operation is utilized to find the minimum-cost edge (u, v).
3. The vertices u and v are searched in the spanning tree set S, and if the
returned sets are not the same, then (u, v) is added to the set S with the con-
straint that adding (u, v) will not create a cycle in the spanning tree set S.
4. Repeat steps 2 and 3 until a spanning tree is constructed with |V| − 1 edges.
  if (j != k) {
    i++;
    t[i][1] = u; t[i][2] = v;
    mincost += cost[u][v];
    Union(j, k);
  }
}
if (i != n-1) cout << "No spanning tree" << endl;
else return (mincost);
}
Analysis
The worst-case running time of this algorithm is O(|E|log|E|) with heap operations.
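A self-contained C++ sketch of Kruskal's algorithm with a simple union-find (our own version, sorting the edges instead of using a heap, on the same 7-vertex graph as the Prim example):

#include <algorithm>
#include <array>
#include <iostream>
#include <vector>
using namespace std;

int parent[7];
int Find(int x) { return parent[x] < 0 ? x : parent[x] = Find(parent[x]); }
void Union(int a, int b) { parent[Find(a)] = Find(b); }

int main() {
    for (int i = 0; i < 7; i++) parent[i] = -1;   // each vertex is its own tree
    // (weight, u, v) triples for the graph; sort gives nondecreasing weight.
    vector<array<int,3>> E = {{28,0,1},{10,0,5},{14,1,6},{16,1,2},
                              {12,2,3},{22,3,4},{24,4,6},{25,4,5}};
    sort(E.begin(), E.end());
    int total = 0, accepted = 0;
    for (auto& e : E) {
        if (Find(e[1]) != Find(e[2])) {   // accept: no cycle is created
            Union(e[1], e[2]);
            total += e[0];
            accepted++;
        }                                  // else reject the edge
    }
    if (accepted != 7 - 1) cout << "No spanning tree" << endl;
    else cout << total << endl;            // prints 99
}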
Biconnected Graph
A graph G is biconnected iff it contains no articulation points. The graph in
Figure 5.23 is biconnected.
Biconnected Component
A maximal biconnected subgraph is a biconnected component. Biconnected compo-
nents are shown in Figure 5.24.
Note: Two biconnected components can have at most one vertex in common, and
this vertex is an articulation point.
• Two biconnected components of the same graph can have at most one vertex
in common.
• No edge can be in two or more biconnected components.
• The biconnected components of G partition the edges of G.
• The biconnected components of a connected, undirected graph G can be
found by using any depth-first spanning tree of G.
Steps:
Tree edges: The edges that form the depth-first spanning tree are called tree
edges. A solid line in Figure 5.26 represents a tree edge.
Back edges: The remaining edges other than the tree edge in the graph are
called back edges. A broken line in Figure 5.26 represents a back edge.
The root node of a depth-first spanning tree is an articulation point iff it has at
least two children. Furthermore, if u is any other vertex, then it is not an articulation
point iff from every child w of u it is possible to reach an ancestor of u using only a
path made up of descendants of w and a back edge.
• For each vertex u, define L[u] as the lowest dfn that can be reached from u
using a path of descendants followed by at most one back edge.
• If u is not the root, then u is an articulation point iff u has a child w such
that L[w] ≥ dfn[u].
u = 1: L[4] ≥ dfn[1]? 1 ≥ 1, but u = 1 is the root, so the root rule applies instead
(the root is an articulation point iff it has at least two children).
u = 2: L[5] ≥ dfn[2]? 6 ≥ 6. Therefore, 2 is an articulation point.
u = 3: L[10] ≥ dfn[3]? 4 ≥ 3. Therefore, 3 is an articulation point.
u = 4: L[3] ≥ dfn[4]? 1 < 2, so 4 is not an articulation point.
u = 5: L[6] ≥ dfn[5]? 8 ≥ 7. Therefore, 5 is an articulation point.
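The dfn and L values can be computed in a single DFS. The following is a minimal C++ sketch of that computation (names and the tiny example graph are ours):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// u (non-root) is an articulation point iff some child w has
// L[w] >= dfn[u]; the root is one iff it has at least two children.
vector<int> adj[11];               // 1-indexed vertices
int dfn[11], L[11], num = 1;
bool art[11];

void Art(int u, int parent) {
    dfn[u] = L[u] = num++;
    int children = 0;
    for (int w : adj[u]) {
        if (dfn[w] == 0) {                    // tree edge (u, w)
            children++;
            Art(w, u);
            L[u] = min(L[u], L[w]);
            if (parent != 0 && L[w] >= dfn[u]) art[u] = true;
        }
        else if (w != parent)                 // back edge
            L[u] = min(L[u], dfn[w]);
    }
    if (parent == 0 && children >= 2) art[u] = true;   // root rule
}

int main() {
    // Tiny assumed example: the path 1-2-3; vertex 2 is the articulation point.
    adj[1] = {2}; adj[2] = {1, 3}; adj[3] = {2};
    Art(1, 0);
    for (int u = 1; u <= 3; u++)
        if (art[u]) cout << u << endl;   // prints 2
}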
In both BFS and D-search, the exploration of a new node cannot begin until
the node currently being explored is fully explored. Just as in backtracking,
bounding functions are used to avoid generating subtrees that do not contain an
answer node.
Like backtracking, branch and bound uses the same terminology, as follows:
Live node: A live node is a node that has been generated but whose children
have not yet been generated.
E-node: An E-node is a live node whose children are currently being explored.
In other words, an E-node is a node currently being expanded.
Dead node: A dead node is a generated node that is not to be expanded or
explored any further. All children of a dead node have already been
expanded.
Example: 4-Queens
FIFO branch-and-bound algorithm
• Initially, there is only one live node; no queen has been placed on the
chessboard.
• The only live node becomes the E-node.
• Expand and generate all its children, children being a queen in columns 1,
2, 3, and 4 of row 1 (only live nodes left).
• The next E-node is the node with the queen in row 1 and column 1.
• Expand this node and add the possible nodes to the queue of live nodes.
Least-Cost Search
In both LIFO and FIFO branch and bound, the selection rule for the next E-node is
rigid. The selection rule does not give preference to nodes that will lead to an
answer quickly; it simply queues them behind the current live nodes.
1. In the 4-queens problem, if three queens have been placed on the board, it
is obvious that the answer may be reached in one more move.
2. The rigid selection rule requires that other live nodes be expanded, and
then, the current node be tested.
Ranking Function
The search for an answer node can be speeded up by using a ranking function ĉ(·) for
live nodes. The next E-node is selected on the basis of this ranking function.
Let ĉ(x) be the estimated additional computational effort (the additional cost)
needed to reach an answer node from the live node x. For any node x, this cost could
be measured, for example, by the number of nodes in the subtree of x that must be
generated before an answer node is reached or by the number of levels from x to the
nearest answer node.
Problem
The problem with the previously discussed techniques for computing the cost at node
x is that they involve the search of the subtree at x, implying the exploration of the
subtree.
• By the time the cost of a node is determined, that subtree has been searched,
and there is no need to explore x again.
• The preceding point can be avoided by using an estimate function g(x)
instead of actually expanding the nodes.
Estimate Function
• Let ĝ(x) be an estimate of the additional effort needed to reach an answer
from node x.
• h(x) is the cost of reaching x from the root.
• f(·) is any nondecreasing function.
• The ranking function is then ĉ(x) = f(h(x)) + ĝ(x).
Cost Function
The actual cost of x is denoted by c(x).
• If x is an answer node, c(x) is the cost of reaching x from the root of the
state-space tree.
Bounding
A branch-and-bound method searches a state-space tree using any search mechanism in
which all children of the E-node are generated before another node becomes the E-node.
Each answer node x has a cost c(x), and we have to find a minimum-cost answer
node.
Search Strategies
The three common search strategies include
1. LC,
2. FIFO, and
3. LIFO.
Cost Function
A cost function ĉ(·) such that ĉ(x) ≤ c(x) is used to provide the lower bound on the
solution obtainable from any node x.
Upper Bound
If upper is the upper bound on the cost of a minimum-cost solution, then all live
nodes with c(x) > upper may be killed.
• All answer nodes reachable from x have cost c(x) ≥ ĉ(x) > upper.
• The starting value for upper can be obtained by some heuristic or set to ∞.
• Each time a new answer node is found, the value of upper can be updated.
Optimization/Minimization Problem
• The maximization is converted to minimization by changing the sign of the
objective function.
• Formulate the search for an optimal solution as a search for an LC answer
node in a state-space tree.
• Define the cost function c(.) such that c(x) is the minimum for all nodes,
representing an optimal solution.
Solution: A state-space tree is used to solve the knapsack problem using the
branch-and-bound technique.
Construction of state-space tree: The state-space tree is constructed as a
binary tree in which each node on the ith level, 0 ≤ i ≤ n, represents all the
subsets of n items that include a particular selection made from the first
i-ordered items.
This selection is uniquely determined by a path from the root to the node. A branch
going to the left indicates that the next item is included. A branch going to the right
indicates that the next item is not included. The total weight W and the total value V
of this selection in the node are recorded along with some upper bound, Ub, on the
value of any subset that can be obtained by adding zero or more items in this selection.
The upper bound Ub is computed as
Ub = v + (W − w)(vi+1/wi+1).
Here, v = total value of items, W = knapsack capacity, w = the total weight of the
items, and vi+1/wi+1 = the best payoff per weight unit of the next item.
Example: Consider four items with the following weights and values, and knapsack
capacity W = 10: item 1 (w1 = 7, v1 = 49), item 2 (w2 = 3, v2 = 12),
item 3 (w3 = 6, v3 = 18), and item 4 (w4 = 5, v4 = 30).
Step 1:
The items are ordered in descending order by their value-to-weight ratio:
item 1 (49/7 = 7), item 4 (30/5 = 6), item 2 (12/3 = 4), and item 3 (18/6 = 3).
Step 2:
The upper bound Ub is computed as
Ub = v + (W − w)(vi+1/wi+1).
The state-space tree can be constructed as a binary tree like that in Figure 5.27. The
root of the tree represents the starting point, with no decisions about the given ele-
ments made as yet.
Explanation
1. Construction of node 0 at level 0.
No items have been selected.
Consider w = 0, v = 0
vi+1/wi+1 = 0
W = 10
Ub = v + (W − w)(vi+1/wi+1)
= 0 + (10 − 0)(7) = 70
2. Construction of node 1 that includes item 1
The total weight w = 7, v = 49, W = 10, vi+1/wi+1 = 6.
Ub = v + (W − w)(vi+1/wi+1)
= 49 + (10 – 7)(6) = 67
3. Construction of node 2 that does not include item 1
The total weight w = 0, v = 0, W = 10, vi+1/wi+1 = 6
Ub = v + (W − w)(vi+1/wi+1)
= 0 + (10 − 0)(6) = 60
Here the upper bound Ub is 60, which is inferior to node 8.
At this level, node 1 is a promising node, so further calculations are done
from that node.
4. Construction of node 3 that includes item 1 and item 4, respectively
The total weight w = 7 + 5 = 12, v = 79, W = 10, vi+1/wi+1 = 4
Since the total weight (w = 12) exceeds the knapsack capacity (W = 10), it is
not feasible.
5. Construction of node 4 with item 1 and without item 4
The total weight w = 7, v = 49, W = 10, vi+1/wi+1 = 4
Ub = v + (W − w)(vi+1/wi+1)
= 49 + (10 − 7)(4) = 61
6. Construction of node 5 that includes item 1 and item 2 but not item 4,
respectively
The total weight w = weight of item 1 + weight of item 2
= 7 + 3 = 10
v = 61, W = 10, vi+1/wi+1 = 3
Ub = v + (W − w)(vi+1/wi+1)
= 61 + (10 − 10)(3) = 61
7. Construction of node 6 that includes item 1 but not item 4 or item 2
The total weight w = 7, v = 49, W = 10, vi+1/wi+1 = 3
Ub = v + (W − w)(vi+1/wi+1)
= 49 + (10 − 7)(3) = 58
At level 3, the most promising node is 5, so further calculations are done
from that node.
8. Construction of node 7, which has the subset items—item1, item 2, and
item 3 and without item 4
Therefore, the total weight w = weight of item 1 + weight of item 2 + weight
of item 3
= 7 + 3 + 6 = 16, which exceeds the knapsack capacity.
It is not feasible.
9. Construction of node 8 that includes item 1 and item 2.
Therefore, the total weight w = 10, v = 61, W = 10, vi+1/wi+1 = 3
Ub = v + (W − w)(vi+1/wi+1)
= 61 + (10 − 10)(3) = 61
There is no next item. So the upper bound is equal to the total value of these items.
FIFO Solution
The FIFO version uses a queue to keep track of live nodes, as these nodes are to be
extracted in FIFO order.
Example
Consider the knapsack instance n = 3, w = [20, 15, 15], p = [40, 25, 25], and
C = 30.
The FIFO branch-and-bound search begins with the root A as the E-node. At this
time, the live node queue is empty. When node A is expanded, nodes B and C are
generated. As both are feasible (w = 20 when item 1 is included, w = 0 when item
1 is not included), they are added to the live-node queue, and node A is discarded.
Live-node queue: B C
The next E-node is node B. It is expanded to get nodes D and E. At D, w = 35; at E,
w = 20. D is infeasible (w > C, since 35 > 30) and is discarded, while E is added to
the queue.
Live-node queue: C E
Next, C becomes the E-node; when expanded, it leads to nodes F and G. Both are
feasible and are added to the queue.
Live-node queue: E F G
The next E-node is node E; expanding it gives nodes J and K. J is infeasible
(w = 35 > 30) and is discarded. K is a feasible leaf and represents a possible
solution to the instance. Its profit value is 40.
The next E-node is node F. Its children L and M are generated.
L represents a feasible packing with a profit value of 50, while M represents a
feasible packing with a profit value of 25.
G is the last node to become the E-node. Its children N and O are both feasible.
The search now terminates because the live-node queue is empty. The best solution
found has a value of 50.
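A compact C++ sketch of this FIFO search, using a standard queue of live nodes (names are illustrative, not the book's code):

#include <queue>
#include <vector>
using namespace std;

// Expands live nodes in first-in, first-out order; infeasible children
// are discarded, and complete packings update the best profit found.
int FifoKnapsack(const vector<int>& w, const vector<int>& p, int C)
{
    struct Node { int level, weight, profit; };
    queue<Node> live;
    live.push({0, 0, 0});                    // root: nothing decided yet
    int best = 0;
    while (!live.empty()) {
        Node nd = live.front(); live.pop();
        if (nd.level == (int)w.size()) {     // leaf: a complete packing
            if (nd.profit > best) best = nd.profit;
            continue;
        }
        // Left child: include the next item if it remains feasible.
        if (nd.weight + w[nd.level] <= C)
            live.push({nd.level + 1, nd.weight + w[nd.level],
                       nd.profit + p[nd.level]});
        // Right child: exclude the next item (always feasible).
        live.push({nd.level + 1, nd.weight, nd.profit});
    }
    return best;
}

For the instance above, FifoKnapsack({20, 15, 15}, {40, 25, 25}, 30) returns 50.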
Classification of Problems
The distinction is between problems that can be solved by a polynomial-time
algorithm and problems for which no polynomial-time algorithm is known.
Accordingly, problems can be classified into one of two groups:
1. Tractable: solvable by a polynomial-time algorithm
2. Intractable: no polynomial-time algorithm is known
Examples of tractable problems:
• Searching: O(log n)
• Polynomial evaluation: O(n)
• Sorting: O(n log n)
• String editing: O(mn)
Examples of intractable problems:
• Traveling salesperson problem: O(n²2ⁿ)
• Knapsack problem: O(2^(n/2))
Theory of NP-Completeness
The theory of NP-completeness shows that many of the problems with no known
polynomial-time algorithms are computationally related. This group of problems is
further subdivided into two classes:
1. NP-complete
2. NP-hard
All NP-complete problems are NP-hard, but some NP-hard problems are not known
to be NP-complete. The relationship among P, NP, and NPC is shown in Figure 5.28.
Nondeterministic Algorithms
An algorithm that has the property that the result of every operation is uniquely
defined is termed a deterministic algorithm. Such algorithms agree with the way
programs are executed on a computer.
Nondeterministic algorithms are allowed to contain operations whose outcomes
are not uniquely defined but are limited to specified sets of possibilities. The
machine executing such operations is allowed to choose any one of these outcomes
subject to a termination condition to be defined later. This leads to the concept of a
nondeterministic algorithm.
This algorithm is specified with the help of three new functions:
1. Choice (S)
• This function arbitrarily chooses one of the elements of set S.
• The assignment statement x = Choice (1, n) could result in x being
assigned any one of the integers in the range [1, n].
• There is no rule specifying how this choice is to be made.
2. Failure()
• This function signals an unsuccessful completion.
• It cannot be used to effect a return.
3. Success()
• This function signals a successful completion.
• It cannot be used to effect a return.
• Whenever there is a set of choices that leads to a successful completion,
then one such set of choices is always made and the algorithm terminates
successfully.
A nondeterministic algorithm operates in two phases:
Phase 1: guessing
Phase 2: checking
Examples
• Searching
• MST
• Sorting
• Satisfiability problem
• Traveling salesperson problem
For instance, nondeterministic searching for x in an array A of n elements can be
written as follows:
void Nd_Search(int A[], int n, int x)
{
    int j = Choice(1, n);   // guessing phase: pick an index
    if (A[j] == x)
    {                       // checking phase: verify the guess
        cout << j;
        Success();
    }
    cout << 0;
    Failure();
}
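On a real (deterministic) machine, the effect of Choice can only be simulated by systematically trying every possible choice. A minimal sketch of such a simulation of Nd_Search, assuming 1-based indexing as in the pseudocode above:

// Deterministic simulation: enumerate all n outcomes of Choice(1, n).
// The nondeterministic O(1) search becomes an O(n) deterministic scan.
int Det_Search(const int A[], int n, int x)
{
    for (int j = 1; j <= n; j++)   // try each possible choice of j
        if (A[j] == x)
            return j;              // some choice leads to Success
    return 0;                      // every choice leads to Failure
}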
Definition: Any problem for which the answer is either zero or one is called a deci-
sion problem. An algorithm for a decision problem is termed a decision algorithm.
Any problem that involves the identification of an optimal (either minimum or maxi-
mum) value of a given cost function is known as an optimization problem. An opti-
mization algorithm is used to solve an optimization problem.
Examples
Maximum Clique
A maximal complete subgraph of a graph G = (V, E) is a clique. The size of the
clique is the number of vertices in it. The max clique problem is an optimization
problem that has to determine the size of a largest clique in G. The corresponding
decision problem is to determine whether G has a clique of size at least k for some
given k.
Let DClique(G, k) be a deterministic decision algorithm for the clique decision
problem. If the number of vertices in G is n, the size of a maximum clique in G can
be found by making several applications of DClique: DClique is used once for each
k, k = n, n − 1, n − 2, . . ., until the output from DClique is 1.
If the time complexity of DClique is f(n), then the size of a maximum clique can
be found in time g(n) ≤ nf(n). Also, if the size of a maximum clique can be determined
in time g(n), then the decision problem can be solved in time g(n). Hence, the max
clique problem can be solved in polynomial time if and only if the clique decision
problem can be solved in polynomial time.
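The first half of this argument can be made concrete with a short loop. A sketch, assuming a hypothetical Graph type and a decision procedure DClique(G, k) that returns 1 iff G has a clique of size at least k:

struct Graph;                        // assumed graph representation
int DClique(const Graph& G, int k);  // assumed decision procedure

// Finds the size of a maximum clique with at most n calls to DClique,
// so its running time g(n) satisfies g(n) <= n * f(n).
int MaxCliqueSize(const Graph& G, int n)
{
    for (int k = n; k >= 1; k--)    // k = n, n - 1, n - 2, ...
        if (DClique(G, k) == 1)
            return k;               // largest k answered 'yes'
    return 0;                       // graph with no clique
}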
0/1 Knapsack
The knapsack decision problem is to determine whether there is a 0/1 assignment of
values to xi, 1 ≤ i ≤ n, such that ∑ pixi ≥ r and ∑ wixi ≤ m for given m and r. Here
r is a given number, and the pi's and wi's are nonnegative numbers. If the knapsack
decision problem cannot be solved in deterministic polynomial time, then the
optimization problem cannot be either.
Max Clique
The input to the max clique decision problem can be provided as a sequence of edges
and an integer k. Each edge in E(G) is a pair of numbers (i, j). The size of the input
for each edge (i, j) in binary representation is ⌊log₂ i⌋ + ⌊log₂ j⌋ + 2.
The input size of any instance is
n = ∑(i,j)∈E(G) (⌊log₂ i⌋ + ⌊log₂ j⌋ + 2) + ⌊log₂ k⌋ + 1.
Note that if G has only one connected component, then n ≥ |V|. Thus, if this deci-
sion problem cannot be solved by an algorithm of complexity p(n) for some polyno-
mial p(), then it cannot be solved by an algorithm of complexity p(|V|).
0/1 Knapsack
The input size q for the knapsack decision problem is, by the same binary-
representation argument,
q = ∑1≤i≤n (⌊log₂ pi⌋ + ⌊log₂ wi⌋ + 2) + ⌊log₂ m⌋ + ⌊log₂ r⌋ + 2.
A nondeterministic polynomial-time algorithm for the knapsack decision problem
is given below:
void DKP(int p[], int w[], int n, int m, int r, int x[])
{
    int W = 0, P = 0;
    for (int i = 1; i <= n; i++)
    {
        x[i] = Choice(0, 1);            // guess whether item i is taken
        W += x[i] * w[i];
        P += x[i] * p[i];
    }
    if ((W > m) || (P < r)) Failure();  // overweight or profit too small
    else Success();
}
Similarly, a nondeterministic algorithm for the clique decision problem first
guesses k distinct vertices and then checks that every pair of them is joined
by an edge:
void DCK(Graph G, int n, int k)
{
    S = {};                          // S is initially the empty set
    for (int i = 1; i <= k; i++)
    {
        int t = Choice(1, n);
        if (t is in S) Failure();    // guessed vertices must be distinct
        S = S U {t};                 // add t to set S
    }
    // At this point S contains k distinct vertex indices.
    for (all pairs (i, j) such that i is in S,
         j is in S and i != j)
        if ((i, j) is not an edge of G) Failure();
    Success();
}
Satisfiability
Let x1, x2, . . . denote a set of Boolean variables, and let x̄i denote the complement
of xi. A literal is either a variable or its complement. A formula in the propositional
calculus is an expression constructed from literals and the operations ∧ (and) and
∨ (or).
Examples:
(x1 ∧ x2) ∨ (x3 ∧ x̄4)
(x3 ∨ x̄4) ∧ (x1 ∨ x̄2)
Conjunctive Normal Form
A formula is in conjunctive normal form (CNF) iff it is represented as
c1 ∧ c2 ∧ . . . ∧ ck, where the ci are clauses, each of the form li1 ∨ li2 ∨ . . .,
where the lij are literals.
Example: (x3 ∨ x̄4) ∧ (x1 ∨ x̄2)
Disjunctive Normal Form
A formula is in disjunctive normal form (DNF) iff it is represented as
c1 ∨ c2 ∨ . . . ∨ ck, where the ci are clauses, each of the form li1 ∧ li2 ∧ . . .,
where the lij are literals.
Example: (x1 ∧ x2) ∨ (x3 ∧ x̄4)
Satisfiability problem: A satisfiability problem is to determine whether a formula
is true for some assignment of truth values to the variables. CNF-satisfiability is the
satisfiability problem for CNF formulas.
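For intuition, a CNF formula can be checked for satisfiability by brute force, trying all 2ⁿ truth assignments; this is exactly the exponential search that the nondeterministic guessing phase avoids. A C++ sketch with an illustrative clause encoding (literal +i stands for xi, and -i for x̄i):

#include <vector>
using std::vector;

// Returns true iff some assignment to x_1..x_n satisfies every clause.
// Running time is O(2^n * formula length): exponential in n.
bool SatisfiableCNF(const vector<vector<int>>& clauses, int n)
{
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        bool formulaTrue = true;
        for (const auto& clause : clauses) {
            bool clauseTrue = false;
            for (int lit : clause) {
                int v = (lit > 0 ? lit : -lit) - 1;  // variable index
                bool val = (mask >> v) & 1UL;        // truth value of x_{v+1}
                if ((lit > 0) == val) { clauseTrue = true; break; }
            }
            if (!clauseTrue) { formulaTrue = false; break; }
        }
        if (formulaTrue) return true;   // a satisfying assignment exists
    }
    return false;
}

For example, SatisfiableCNF({{3, -4}, {1, -2}}, 4) checks the CNF formula (x3 ∨ x̄4) ∧ (x1 ∨ x̄2) and returns true.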
• Reducibility is used to show that one problem is no harder or no easier than
another, even when both problems are decision problems.
• Consider a decision problem A which we would like to solve in polynomial
time.
• Instance of the problem:
• Input to the particular problem at hand
• Given a graph G, vertices u and v, and an integer k, determine if a path
exists from u to v consisting of at most k edges.
• Consider a different decision problem B that we already know how to solve
in polynomial time.
• Suppose that we have a deterministic procedure that transforms an instance
α of A into an instance β of B with the following characteristics:
• The transformation takes polynomial time.
• The answers are the same: the answer for α is ‘yes’ if and only if the answer
for β is also ‘yes’.
• We then say that A reduces to B, written A ≤ B.
NP-Hard
A problem L is NP-hard if and only if satisfiability reduces to L (satisfiability ∝ L).
A problem L is NP-complete if and only if L is NP-hard and L ∈ NP. The relationship
among these classes is shown in Figure 5.29.
The halting problem is a classic example of an NP-hard problem that is not in NP.
To show that satisfiability ∝ halting, construct from a propositional formula X an
algorithm A as follows:
• If X has n variables, A tries out all 2ⁿ possible truth assignments and
verifies whether X is satisfiable.
• If X is satisfiable, A stops; otherwise, A enters an infinite loop. Hence, A
halts on input X iff X is satisfiable.
If we had a polynomial-time algorithm for the halting problem, then we could
solve the satisfiability problem in polynomial time by using A and X as input to the
algorithm for the halting problem. Hence, the halting problem is an NP-hard problem
that is not in NP.
Polynomially Equivalent
Two problems L1 and L2 are said to be polynomially equivalent if and only if
L1 ∝ L2 and L2 ∝ L1.