DAA MODULE-5-Notes
DAA MODULE-5-Notes
Module-5-Notes
Contents
Backtracking:
General method, solution using back tracking to N-Queens problem, Sum of subsets problem,
Graph coloring, Hamiltonian cycles Problems.
Basic concepts, non- deterministic algorithms, P, NP, NP Complete, and NP-Hard classes.
MODULE – 5
5.1. Backtracking
Many problems which deal with searching for a set of solutions or which ask for an optimal
solution satisfying some constraints can be solved using the backtracking formulation.
Name was first coined by D H Lehmer in 1950s.
The desired solution is expressible as an n-tuple (x1,….,xn) where, xi are chosen from
some finite set Si.
The problem to be solved calls for finding one vector that maximizes (or minimizes or
satisfies) a criterion function P(x1,….,xn).Sometimes it seeks all vectors that satisfy P.
For example, sorting the array of integers in a[1:n] is a problem whole solution is
expressible by an n-tuple, where xi is the index in a of the ith smallest solution.
o The criterion function P is the inequality a[xi]<=a[xi+1] for 1<=i<n.
o The set Si is finite and includes integers 1 through n.
Consider that miis the size of the set Si. There are m=m1m2…mn n-tuples that are
possible candidates for satisfying the function P. The backtracking algorithm
yields the same answer with far fewer than m trials.
Basic Idea
Build up solution vector one component at a time.
Use modified criterion functions Pi(x1,….,xi) called bounding functions to test
whether the vector being formed has any chance of success.
Advantage
If it is realized that the pal vector (x1,x2,….xi) can in no way lead to an optimal solution, then
mi+1…..mn possible test vectors can be ignored entirely.
Definition 5.1
Explicit constraints are rules that restrict each xi to take on values only from a given set.
Common examples of explicit constraints are
The explicit constraints depend on the particular instance I of the problem being solved. All
tuples that satisfy the explicit constraints define a possible solution space for I.
Definition 5.2
The implicit constraints are rules that determine which of the tuples in the solution space of I
satisfy the criterion function. Thus, implicit constraints describe the way in which the xi must
relate to each other.
Place 8 queens on an 8×8 chessboard so that no two “attack” no two are on the same
row, column, or diagonal.
- Number the rows and columns of the chessboard 1 through 8.
- Number the queens also 1 through 8.
- Assume that given queen i is to be placed on row i, since each queen must be
on different row.
All solutions for 8-queens problem can be represented as 8-tuples (x1,….,x8). Here,
xi represents column on which queen i is placed.
The explicit constraints are Si={1,2,3,4,5,6,7,8}, 1 ≤ i ≤ 8.
o The solution space consists of 88 8-tuples.
The implicit constraints are that
o No two xi s can be the same.
o All queens must be on different columns,
o No two queens can be on the same diagonal.
The first of these two constraint implies that all solutions are permutations of the 8-
tuple (1,2,3,4,5,6,7,8). Size of the solution space is reduced from 88 tuples to 8!
tuples.
The solution in Figure 5.1 is (4,6,8,2,7,1,3,5) expresses as 8-tuple
Given the positive numbers wi, 1 ≤ i ≤ n, and m, find all the subsets of the wi whose sums are
m.
For examples, consider n=4, (w1,w2,w3,w4)=(11,13,24,7) and m=31,
The desired subsets are (11,13,7) and (24,7).
Rather than representing the solution vector by wi, represent the solution vector by giving
indices these wi which sum to m. Hence the solution vectors are (1,2,4) and (3,4). Therefore,
the solutions are k-tuples (x1,x2,….,xk), 1≤k≤n and different solutions have different sized
tuples.
Formulation 1
The explicit constraints require xiϵ{j|j is an integer and 1≤j≤n}
The implicit constraints require that
o No two be the same and
o The sum of the corresponding wi’s be m.
Avoid generating multiple instances of the same subset (example, (1,2,4) &
(1,4,2) represent same solution). Hence, another implicit constraint is
xi<xi+1,1≤i≤k.
Formulation 2
Each solution subset is represented by an n-tuple (x1,x2,…,xn) such that
xiϵ{0,1},1≤i≤n.
o xi=0 if wi is not chosen
o xi=1 if wi is chosen.
Therefore, the solutions for the above instances are (1,1,0,1) and (0,0,1,1).
Solution using a fixed tuple.
The solution space consists of 2n distinct tuples.
Problem: place n queens on an n×n chessboard so that no two queens attack each other by
being in the same row or in the same column or on the same diagonal.
For n=1,the problem has a trivial solution
There is no solution for n=2 and n=3.
Consider 4-queens problem and solve by backtracking.
Since each of the four queens has to be placed in its own row, all we need to do
is to assign a column for each queen on the board presented in Figure 5.2.
To find other solution, the algorithm can simply resume its operations at the leaf at which it
stopped.
NOTE: A single solution to the n-queens problem for any n ≥ 4 can be found in linear time.
Problem: find a subset of a given set S = {S1, S2,….., ,Sn } of n positive integers whose sum is
equal to a given positive integer d.
For example, A = {1, 2, 5, 6, and 8} and d = 9, there are two solutions:
{1, 2, 6} and
{1, 8}
Procedure:
Sort the set’s elements in increasing order.
Assume, a1<a2<….<an
The state-space tree can be constructed as a binary tree like that in Figure 5.4 for the
instance A= {3, 5, 6, 7} and d = 15.
Figure 5.4: Complete state-space tree of the backtracking algorithm applied to the instance
A={3, 5, 6, 7} and d=15of the subset-sum problem. The number inside a node is the sum of
the elements already included in the subsets represented by the node. The inequality below a
leaf indicates the reason for its termination.
The root of the tree represents the starting point, with no decisions about the given
elements made
Its left and right children represent, respectively, inclusion and exclusionofa1 in a set
being sought.
Going to the left from a node of the first level corresponds to inclusion ofa2while
going to the right corresponds to its exclusion, and so on.
A path from the root to a node on the ith level of the tree indicates which of the first i numbers
have been included in the subsets represented by that node.
Record the value of s, the sum of these numbers, in the node.
If s is equal to d, we have a solution to the problem.
If all the solutions need to be found, continue by backtracking to the node’s parent.
If s is not equal to d terminate the node as non-promising if either of the following
two inequalities hold.
Backtracking Algorithm
If such a tuple (x1,x2,...,xi) is not a solution, the algorithm finds the next element in
Si+1 that is consistent with the values of (x1,x2,...,xi) and the problem’s constraints,
and adds it to the tuple as its (i+1)st coordinate.
If such an element does not exist, the algorithm backtracks to consider the next
value of xi , and so on.
To start a backtracking algorithm, the following pseudocode can be called for i=0;
X[1..0] represents the empty tuple.
Generating a random path from the root to a leaf and using the information about the number
of choices available during the path generation to estimate the size of the tree.
NOTE on BACKTRACKING
i. It is typically applied to difficult combinatorial problems for which no efficient
algorithms for finding exact solutions possibly exist
ii. Backtracking at least holds a hope for solving some instances of nontrivial sizes
in an acceptable amount of time.
iii. Even if backtracking does not eliminate any elements of a problem’s state space
and ends up generating all its elements, it provides a specific technique for doing
so, which can be of value in its own right.
If d is the degree of the given graph, then it can be colored with d+1 colors. m-color ability
optimization problemconsiders smallest integer m for which the graph G can be colored.
Where m is chromatic number of the graph.
Problem: Given any map, can the regions be colored in such a way that no two adjacent
regions have the same color, yet only 4 colors are needed.
Each region of the map becomes a node, and if two regions are adjacent, then the
corresponding nodes are joined by an edge.
NOTE: No map that required more than 4 colors had ever been found. Consider, representing
a graph by its adjacency matrix G[1:n, 1:n], where G[i,j]=1 if (i,j) is an edge of G, and
G[i,j]=0 otherwise.
The colors are represented by the integers 1,2,…,m and the solutions are given by the n-tuple
(x1,x2,…,xn), where xi is the color of the node i. Using the recursive backtracking algorithm,
the resulting algorithm is mColoring as given below.
Each node at level i has m children corresponding to m possible assignments to xi, 1≤i≤n.
Nodes at level n+1 are leaf nodes
Figure 5.7: State space tree for m Coloring when n=3 and m=3.
M Coloring
First assign the graph to its adjacency matrix, setting the array x[ ] to 0.
Invoke the statement mColoring(1).
The main loop of mColoring
o Repeatedly picks an element from the set of possibilities,
o Assigns it to xk, and
o Calls mColoring recursively.
Let G=(V,E) be a connected graph with n vertices. A Hamilton Cycle is a round-trip path
along ‘n’ edges of ‘G’ that visits every vertex once and returns to its starting position. Sir
William Hamilton
If a Hamilton Cycle begins at some vertexv1ϵG and the vertices of G are visited in order v1,
v2,…,vn+1, then the edges (vi,vi+1) are in E, 1in and the vi are distinct except for v1 and
vn+1which are equal.
The backtracking solution vector (x1,….,xn) is defined so that xi represents the ith visited vertex
of the proposed cycle.
Problem: assigning n people to n jobs so that the total cost of the assignment is as small as
possible.
An instance of the assignment problem is specified by an n×n cost matrix C so that we can
state the problem as follows:
“Select one element in each row of the matrix so that no two selected elements are in the same
column and their sum is the smallest possible.”
Consider C as follows:
To find a lower bound on the cost of an optimal selection without actually solving the
problem:
The cost of any solution, including an optimal one, cannot be smaller than the sum of
the smallest elements in each of the matrix’s rows.
For the instance sum is 2+3+1+4=10.
For any legitimate selection that selects 9 from the first row, the lower bound will be
9+3+1+4=17.
The problem’s state space tree deals with the order in which the tree nodes will be generated.
Rather than generating a single child of the last promising node as in backtracking, generate
all the children of the most promising node among on terminated leaves in the current tree.
Which of the nodes is most promising?
Compare the lower bounds of the live nodes.
Consider a node with the best bound as most promising,
Figure 5.10: Levels 0 and 1 of the state-space tree for the instance of the assignment
problem being solved with the best-first branch-and-bound algorithm. The number above a
node shows the order in which the node was generated. A node’s fields indicate the job
number assigned to person a and the lower bound value, lb, for this node.
There are four live leaves—nodes 1 through 4—that may contain an optimal
solution
The most promising of them is node 2 because it has the smallest lower-bound value.
Using best-first search strategy, branch out from that node first by considering the
three different ways of selecting an element from the second row and not in the
second column—the three different jobs that can be assigned to person b (Figure
5.11)
Figure 5.11: Levels 0, 1, and 2 of the state-space tree for the instance of the
assignment problem being solved with the best-first branch-and-bound algorithm.
o First, we consider selecting the third column’s element from c’s row, this leaves us
with no choice but to select the element from the fourth column of d’s row.
o This yields leaf 8 (Figure 5.12) which corresponds to the feasible solution
{a→2,b→1,c→3,d→4}with the total cost of 13.
Figure 5.12: Complete state-space tree for the instance of the assignment problem
solved with the best-first branch-and-bound algorithm.
The lower-bound values of nodes1, 3, 4, 6, and 7 are not smaller than 13, the value
of the best selection seen so far (leaf 8).
Hence, terminate all of them and recognize the solution represented by leaf 8 as
the optimal solution to the problem
Figure 5.13: (a) weighted graph. (b) State-space tree of the branch-and-bound
algorithm to find a shortest Hamiltonian circuit in this graph. The list of vertices in a
node specifies a beginning part of the Hamiltonian circuits represented by the node.
v. For any subset of tours that must include particular edges of a given graph,
modify lower bound (lb) accordingly.
For example, for all the Hamiltonian circuits of the graph in Figure 5.12a that must include
edge (a, d) by summing up the lengths of the two shortest edges incident with each of the
vertices, with the required inclusion of edges (a, d)and(d, a), the lower bound is:
Apply the branch-and-bound algorithm; with the bounding function given by formula 5.1.to
find the shortest Hamiltonian circuit for the graph in Figure 5.12a.
Problem: given items of known weights wi and values vi, i=1,2,. . .,n, and a knapsack of
capacity W, find the most valuable subset of the items that fit in the knapsack.
Order the items of a given instance in descending order by their value-to-weight
ratios.
The first item gives the best payoff per weight unit and the last one gives the worst
payoff per weight unit, with ties resolved arbitrarily:
Structure the state-space tree for this problem as a binary tree constructed as follows:
Each node on the ith level of this tree, 0≤i≤n, represents all the subsets of n items that
include a particular selection made from the first i ordered items.
This is the path from root to the node: a branch going to the left indicates the
inclusion of the next item, and a branch going to the right indicates its exclusion.
Figure 5.14: State-space tree of the best-first branch-and-bound algorithm for the
instance of the knapsack problem.
Record the
o Total weight w and
o The total value v of this selection in the node,
o Upper bound ub on the value of any subset that can be obtained by adding
zero or more items to this selection.
To compute the upper bound ub: add to v,
o The total value of the items already selected,
o the product of the remaining capacity of the knapsack W−wand
o the best per unit payoff among the remaining items, which isvi+1/wi+1
(5.2)
For example, apply the branch-and-bound algorithm to the following
instance of knapsack problem:
In Figure 5.14,
At the root no items have been selected as yet. Hence, both the total weight of the
items already selected w and their total value v are equal to 0.
The value of the upper bound computed by formula (5.2) is $100.
Node 1, the left child of the root, represents the subsets that include item 1.
o The total weight and value of the items already included are 4 and $40,
respectively; the value of the upper bound is 40 + (10−4)∗6 = $76.
Node 2 represents the subsets that do not include item 1.
o w=0, v=$0, and ub=0+(10−0)∗6 = $60.
Since node 1 has a larger upper bound than the upper bound of node 2,
it is more promising for this maximization problem and hence, branch from
node 1 first.
Its children—nodes 3 and 4—represent subsets with item 1 and with and without item
2, respectively
o The total weight w of every subset represented by node 3 exceeds the
knapsack’s capacity; node 3 can be terminated immediately.
o Node 4 has the same values of w and v as its parent; the upper bound ub is
equal to 40 + (10−4)∗5 = $70.
o Selecting node 4 over node 2 for the next branching, nodes 5 and 6 by
respectively are obtained including and excluding item3.
The total weights and values as well as the upper bounds for these nodes are
computed in the same way as for the preceding nodes.
Branching from node 5 yields node 7, which represents no feasible solutions, and node
8, which represents just a single subset {1, 3} of value $65.
The remaining live nodes 2 and 6 have smaller upper bound values than the value of
the solution represented by node 8.
Hence, both can be terminated making the subset {1, 3} of node 8 the optimal solution to the
problem.
o The E-node is expanded and two children 2 and 3 are generated. live nodes.
o The cost c(2)=-32, c(3)=-32, u(2)=-32, and u(3)=-27.
o Node-2 is next E-node. Expand and generate 4 and 5. live nodes.
o 4 has least c value and is next E-node. Generate 6 and 7. live nodes.
o Upper is -38.
o Assume 7is next E-node. Generate 8 and 9.
o Node 8 is a solution node. Update upper to -38. Set 8 as live node.
o Node 9 with c(9) > upper is terminated.
o Node 6 and 8 are two live nodes with least c.
o Search terminates at node 8 as answer node.
o Value is -38 with the path 8,7,4,2,1
Assign values to xi such that .
To extract values of xi:
o Associate with each node, a one bit field, tag, such that the sequence of the tag bits
from answer node to the root gives xi values.
o Example, tag(2)=tag(4)=tag(6)=tag(8)=1
o Tag sequence for the path 8, 7, 4, 2, 1 is 1011. x4=1,x3=0,x2=1, and x1=1.
To solve knapsack problem using LCBB, specify
i. The structure of nodes in the state space tree being searched.
ii. How to generate the children of a given node.
iii. How to recognize a solution node
iv. A representation of a list of live nodes and a mechanism for adding anode into
the list as well as identifying the least cost node.
To generate x’s children:
Identify the level of x in state space tree. field used level.
Set xlevel(x)=1 for left child and xlevel(x)=0 for right child.
Retain the value cu (capacity unused)
The evaluation of c(x) and u(x) requires value of profit.
This value can be retained in the field pe.
In order to determine the live node, with least c value , compute c(x).
The c(x) is explicitly stored in ub.
We need nodes with six fields: parent, level, tag, cu, pe, and ub using which the
children of any node can be easily determined.
The left child is feasible iff cu(x)>= wlevel(x).
o Parent(y)=x
o Level(y)=level(x)+1
o cu(y)=cu(x)-wlevel(x).
o pe(y)=pe(x)+Plevel(x).
o tag(y)=1, and
o ub(y)=ub(x)
The functions to be performed on list of live nodes:
i. test if the list is empty
ii. add nodes, and
iii. Delete a node with a least ub.
Initially, node 1 is the E-node and queue of live nodes is empty. upper is
initialized to u(1)= -32.
Generate nodes from left to right.
Generate 2 and 3 and add to the queue. Upper value unchanged.
Node 2 is next E-node. Generate 4 and 5 and add into queue.
Expand node 3, the next E-node. Generate 6 and 7. C (7) > upper so node 7
immediately killed.
Expand node 4.generate 8 and 9 and add to queue.
Update upper to -38.
5 and 6 are next. Both are not expanded since there c value is > upper.
Node 8 is next E-node. Generate nodes 10 and 11.
o Node 10 is killed due to infeasibility.
o Node 11’s c value is > upper so killed.
Expand node 9. Generate 12and 13.
o 13 is killed. its c value > upper.
o Add 12 in queue. It has no children and the search is terminated.
Output the value of upper and path from node 12 to root.
The algorithms for which the result of every operation is uniquely defined are called
deterministic algorithms.
Algorithms may be made to contain operations whose outcomes are not uniquely defined but
are limited to specified sets of possibilities. The machine executing such operations is allowed
to choose any one of these outcomes subject to a termination condition. Such algorithms are
named as nondeterministic algorithms.
Let n be the length of the input to the algorithm. Consider that all inputs are integers.
The length of the input is measured assuming a binary representation. Example,
10 is represented as 1010 whose length is 4.
A positive integer k has a length of bits when represented in
binary. The length of binary representation of 0 is 1.
The size n of the input to an algorithm is the sum of the lengths of the individual
numbers being input.
Consider a radix r for an input with different representation. Then the length of the
positive number kis .
Since, , the length of any input using radix r (r>1)
representation is c(r)n.
Note: Input is unary form if r=1. Length of positive integer k is k.
Example: Max Clique
Definition:
The time required by a nondeterministic algorithm performing on any given input is the
minimum number of steps needed to reach a successful completion if there exists a
sequence of choices leading to such a completion. If not, the time required is O(1).
The complexity of nondeterministic algorithm is O(f(n)) for the inputs of size n.
Figure 5.15: Commonly believed relationship among P, NP, NP-Complete and NP-
Hard problems
There are NP-hard problems that are not NP-complete.
Only a decision problem can be NP-complete.
An optimization problem can be NP-hard.
Optimization problem cannot be NP-complete whereas decision problem can.
There also exist NP-hard decision problems that are not NP-complete.
Example:
Definition:
Two problems L1 and L2 are said to be polynomials equivalent if and only if L1αL2 and
L2αL1.