KEMBAR78
Daa Notes | PDF | Time Complexity | Graph Theory
0% found this document useful (0 votes)
112 views193 pages

Daa Notes

The document discusses algorithms and data structures including disjoint sets, connected components, biconnected graphs, and the divide and conquer algorithm design paradigm. It provides pseudocode for disjoint set operations like find and union, and algorithms for finding connected components and biconnected graphs using depth-first search. It also gives examples of divide and conquer algorithms like binary search, quicksort, and matrix multiplication.

Uploaded by

viswanath kani
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
112 views193 pages

Daa Notes

The document discusses algorithms and data structures including disjoint sets, connected components, biconnected graphs, and the divide and conquer algorithm design paradigm. It provides pseudocode for disjoint set operations like find and union, and algorithms for finding connected components and biconnected graphs using depth-first search. It also gives examples of divide and conquer algorithms like binary search, quicksort, and matrix multiplication.

Uploaded by

viswanath kani
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 193

DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 1

Introduction:
Algorithm
Pseudo Code for Expressing Algorithms
Performance Analysis, Space Complexity
Time Complexity
Asymptotic Notation-
Big Oh Notation
Omega Notation
Theta Notation
Little Oh Notation
probabilistic analysis
Amortized analysis.

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
Scanned with CamScanner
DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 2

Disjoint Sets:
Disjoint Sets
Disjoint Set Operations
Union and Find Algorithms
Connected Components and Bi-Connected Components.

Divide and Conquer:


General method
Applications :
Binary Search
Quick sort
Merge Sort
Strassen‘s Matrix Multiplication.

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
Unit - 2
Disjoint set:

A disjoint-set data structure is a data structure that keeps track of a set of elements
partitioned into a number of disjoint (non-overlapping) subsets. A union-find algorithm is
an algorithm that performs two useful operations on such a data structure.
Operations on disjoint sets:
Find: Determine which subset a particular element is in. This can be used for
determining if two elements are in the same subset.
Union: Join two subsets into a single subset.
Find :
Find(x) follows the chain of parent pointers from x upwards through the tree until an
element is reached whose parent is itself. This element is the root of the tree and is the
representative member of the set to which x belongs, and may be x itself.
Path compression, is a way of flattening the structure of the tree whenever Find is used
on it. Since each element visited on the way to a root is part of the same set, all of these
visited elements can be reattached directly to the root. The resulting tree is much flatter,
speeding up future operations not only on these elements, but also on those referencing
them.
Pseudo code:
function Find(x)
if x.parent != x
x.parent := Find(x.parent)
return x.parent
Tarjan and Van Leeuwen also developed one-pass Find algorithms that are more efficient
in practice while retaining the same worst-case complexity.
Union:
Union (x, y) uses Find to determine the roots of the trees x and y belong to. If the roots
are distinct, the trees are combined by attaching the root of one to the root of the other.
If this is done naively, such as by always making x a child of y, the height of the trees
can grow as to prevent this union by rank is used.
Union by rank always attaches the shorter tree to the root of the taller tree. Thus, the
resulting tree is no taller than the originals unless they were of equal height, in which
case the resulting tree is taller by one node.
To implement union by rank, each element is associated with a rank. Initially a set has
one element and a rank of zero. If two sets are unioned and have the same rank, the
resulting set's rank is one larger; otherwise, if two sets are unioned and have different
ranks, the resulting set's rank is the larger of the two. Ranks are used instead of height
or depth because path compression will change the trees' heights over time.
Pseudo code:
function Union (x, y)
xRoot:= Find(x)
yRoot:= Find(y)
// x and y are already in the same set
if xRoot == yRoot
return
// x and y are not in same set, so we merge them
if xRoot.rank < yRoot.rank
xRoot.parent := yRoot
else if xRoot.rank > yRoot.rank
yRoot.parent := xRoot
else
xRoot.parent := yRoot
yRoot.rank := yRoot.rank + 1

Connected Components:
A connected component, of an undirected graph is a subgraph in which any two
vertices are connected to each other by paths, and which is connected to no additional
vertices in the super graph. For example, the graph shown in the illustration has three
components. A vertex with no incident edges is itself a component. A graph that is itself
connected has exactly one component, consisting of the whole graph.
It is straightforward to compute the components of a graph in linear time (in
terms of the numbers of the vertices and edges of the graph) using either breadth-first
search or depth-first search. In either case, a search that begins at some particular
vertex v will find the entire component containing v (and no more) before returning. To
find all the components of a graph, loop through its vertices, starting a new breadth first
or depth first search whenever the loop reaches a vertex that has not already been
included in a previously found component

Depth First Search (DFS)


The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It
involves exhaustive searches of all the nodes by going ahead, if possible, else by
backtracking
Here, the word backtrack means that when you are moving forward and there are no
more nodes along the current path, you move backwards on the same path to find nodes
to traverse. All the nodes will be visited on the current path till all the unvisited nodes
have been traversed after which the next path will be selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as
follows:
1. Pick a starting node and push all its adjacent nodes into a stack.
2. Pop a node from stack to select the next node to visit and push all its adjacent
nodes into a stack.
3. Repeat this process until the stack is empty. However, ensure that the nodes that
are visited are marked. This will prevent you from visiting the same node more
than once. If you do not mark the nodes that are visited and you visit the same
node more than once, you may end up in an infinite loop.
Algorithm:
DFS-iterative (G, s): //Where G is graph and s is source vertex
let S be stack
S. push (s) //Inserting s in stack
mark s as visited.
while (S is not empty):
//Pop a vertex from stack to visit next
v = S.top( )
S.pop( )
//Push all the neighbours of v in stack that are not visited
for all neighbours w of v in Graph G:
if w is not visited:
S.push( w )
mark w as visited

DFS-recursive(G, s):
mark s as visited
for all neighbours w of s in Graph G:
if w is not visited:
DFS-recursive(G, w)
Time complexity is O(V + E), when implemented using the adjacency list
Biconnected Graphs:
Let us assume that G is an undirected connected graph. An articulation point is a
vertex v of G such that the deletion of v, together with all edges incident on v, produces
a graph, G', that has at least two connected components. For example, the connected
graph of Figure(a) has four articulation points, vertices 1,3,5, and 7. A biconnected
graph is a connected graph that has no articulation points.
In many graph applications, articulation points are undesirable. For instance,
suppose that the graph of Figure (a) represents a communication network. In such
graphs, the vertices represent communication stations and the edges represent
communication links. Now suppose that one of the stations that is an articulation point
fails. The result is a loss of communication not just to and from that single station, but
also between certain other pairs of stations. A biconnected component of a connected
undirected graph is a maximal biconnected subgraph, H, of G. By maximal, we mean
that G contains no other subgraph that is both biconnected and properly contains H. For
example, the graph of Figure (a) contains the six biconnected components shown in
Figure(b). It is easy to verify that two biconnected components of the same graph have
no more than one vertex in common. This means that no edge can be in two or more
biconnected components of a graph. Hence, the biconnected components of G partition
the edges of G. We can find the biconnected components of a connected undirected
graph, G, by using any depth first spanning tree of G.

Divide and Conquer Introduction


Divide and Conquer is an algorithmic pattern. In algorithmic methods, the design is to
take a dispute on a huge input, break the input into minor pieces, decide the problem on
each of the small pieces, and then merge the piecewise solutions into a global solution.
This mechanism of solving the problem is called the Divide & Conquer Strategy.
Divide and Conquer algorithm consists of a dispute using the following three steps.
1. Divide the original problem into a set of subproblems.
2. Conquer: Solve every subproblem individually, recursively.
3. Combine: Put together the solutions of the subproblems to get the solution to
the whole problem.

Generally, we can follow the divide-and-conquer approach in a three-step process.


Examples: The specific computer algorithms are based on the Divide & Conquer
approach:
1. Maximum and Minimum Problem
2. Binary Search
3. Sorting (merge sort, quick sort)
4. Tower of Hanoi.

Fundamental of Divide & Conquer Strategy:


There are two fundamental of Divide & Conquer Strategy:
1. Relational Formula
2. Stopping Condition
1.Relational Formula: It is the formula that we generate from the given technique.
After generation of Formula we apply D&C Strategy, i.e. we break the problem
recursively & solve the broken subproblem.
2. Stopping Condition: When we break the problem using Divide & Conquer Strategy,
then we need to know that for how much time, we need to apply divide & Conquer. So
the condition where the need to stop our recursion steps of D&C is called as Stopping
Condition.

BINARY SEARCH:
1) Let ai , 1<=i<=n be a list of elements that are sorted in non-descending order.
2) Suppose if we have 'n' number of elements then x is element searched.
3) If n = 1, small(p) be true then there is no requirement of applying divide and conquer
, if n > 1 then it can be divided into new such problems.
4) Pick an index 'q' in the range [i,l] and compare x with a q then the following 3
possibilities will occur,
a) x = aq ,then search element is found,
b) x < aq, then x is searched in left sublist,
c)x > aq, then x is searched in right sublist,
q

a0, a1,…………………………..aq-1 aq+1 , ……….an

0 1 2 3 4 5
1 2 3 4 5 6

EXAMPLE:
Consider the list n = 6 where a = { 1,2,3,4,5,6}
X = 4(key element which has to be find)
First find the mid value of the given array
Mid = low + high / 2
Low = first element in the array (array index)
High = last element in the array (array index)
Mid = 0 + 5 / 2
= 2.5 = 2
Cases:-
1) Compare the key element with a[mid]
X == a[mid] 4 == a[2] 4 == 3(false)
2) Check if the key element is bigger or smaller than mid element
X <= a[mid] X >= a[mid]
4 <= a[2] 4 >= a[2]
4<3 4>3
As said in the procedure key element in bigger than respective array element so the key
element will be in right sub list of the array
Again find the mid for the right sublist
Now low = mid + 1
=2+1=3
High will not be changed because we are searching in the right sublist
Mid = (3 + 5) / 2
=4
4 5 6
3 4 5

a[mid] = a[4]
=5
Cases:-
1) Compare key element with new mid value
X = a[mid] X <= a[mid]
4 = 5(false) 4 <= 5(true)
The key element is less than mid value so now we have the calculate mid again to find
key element
Now low = 3
High value changes now
High = mid - 1
=4–1=3
Mid = low + high / 2
= (3 + 3) / 2 =3
a[mid] = a[3] = 4
cases:-
4 == a[3]
4 == 4(true)
ANALYSIS:-
TIME COMPLEXITY:
T(n) = T(n/2) + 1. It is in the form of T(n) = aT(n/b) + f(n)
Here a=1, b=2, f(n)=1
By using master’s theorem
1) f(n) = g(n) => O(f(n)logn)
2) f(n) < g(n) => O(g(n))
3) f(n) > g(n) => O(n)
g(n) = nlogba
g(n) = nlog21
= n0
=1
f(n) = g(n) => 1 = 1
Time Complexity T(n) = O(f(n)logn)
= O(logn)
ALGORITHM FOR BINARY SEARCH:
Binary Search (A, x, l, r)
Begin:
If l > r then
Return -1 (or) not found
End if
M = (l + r) / 2
If A[m] == x then
Return m;
Else if x < A[m] then
Return BinarySearch (A, x, l, m - 1)
else
return BinarySearch (A, x, m + 1, r)
End if
End

MERGE SORT:
Merge sort is an efficient, general-purpose, comparison-based sorting algorithm. Most
implementations produce a stable sort. Merge sort is a divide and conquer algorithm that
was invented by John von Neumann in 1945. It is one of the most popular sorting
algorithms and a great way to develop confidence in building recursive algorithms.
Divide and Conquer Strategy:
Using the Divide and Conquer technique, we divide a problem into sub problems. When
the solution to each sub problem is ready, we 'combine' the results from the sub
problems to solve the main problem.
Suppose we had to sort an array A. A sub problem would be to sort a sub-section of this
array starting at index p and ending at index r, denoted as A[p..r].
Divide:
If q is the half-way point between p and r, then we can split the subarray A[p..r] into
two arrays A[p..q] and A[q+1, r].
Conquer:
In the conquer step, we try to sort both the sub arrays A[p..q] and A[q+1, r]. If we
haven't yet reached the base case, we again divide both these subarrays and try to sort
them.
Combine:
When the conquer step reaches the base step and we get two sorted sub arrays A[p..q]
and A[q+1, r] for array A[p..r], we combine the results by creating a sorted array A[p..r]
from two sorted sub arrays A[p..q] and A[q+1, r]
EXAMPLE –
The MergeSort Algorithm:
The MergeSort function repeatedly divides the array into two halves until we reach a
stage where we try to perform MergeSort on a sub array of size 1 i.e. p == r.
After that, the merge function comes into play and combines the sorted arrays into
larger arrays until the whole array is merged
MergeSort (A, p, r) { while (i < n1 && j < n2)
If p > r {
return; if (L[i] <= M[j])
q = (p+r)/2; {
mergeSort (A, p, q) arr[k] = L[i];
mergeSort (A, q+1, r) i++;
merge (A, p, q, r) }
} else
void merge (int A[], int p, int q, int r) {
{ arr[k] = M[j];
/* Create L ← A[p..q] and M ← j++;
A[q+1..r] */ }
int n1 = q - p + 1; k++;
int n2 = r - q; }

int L[n1], M[n2]; /* When we run out of elements in


either L or M, pick up the remaining
for (i = 0; i < n1; i++) elements and put in A[p..r] */
L[i] = A[p + i]; while (i < n1)
for (j = 0; j < n2; j++) {
M[j] = A[q + 1 + j]; A[k] = L[i];
i++;
/* Maintain current index of sub-arrays k++;
and main array */ }
int i, j, k;
i = 0; while (j < n2)
j = 0; {
k = p; A[k] = M[j];
/* Until we reach either end of either L j++;
or M, pick larger among elements L and M k++;
and place them in the correct position at }
A[p..r] */ }
Time Complexity:
Sorting arrays on different machines. Merge Sort is a recursive algorithm and time
complexity can be expressed as following recurrence relation.
T(n) = 2T(n/2) + O(n)
The above recurrence can be solved either using Recurrence Tree method or Master
method. It falls in case II of Master Method and solution of the recurrence is O(nlogn).
Time complexity of Merge Sort is O(nlogn) in all 3 cases (worst, average and best) as
merge sort always divides the array into two halves and take linear time to merge two
halves.
QUICK SORT AND ITS COMPLEXITY:
It is used on the principle of divide-and-conquer. Quick sort is an algorithm of choice in
many situations as it is not difficult to implement. It is a good general-purpose sort and
it consumes relatively fewer resources during execution.
Quick Sort is also based on the concept of Divide and Conquer, just like merge sort.
But in quick sort all the heavy lifting (major work) is done while dividing the array into
subarrays, while in case of merge sort, all the real work happens during merging the
subarrays. In case of quick sort, the combine step does absolutely nothing.
It is also called partition-exchange sort. This algorithm divides the list into three main
parts:
1. Elements less than the Pivot element
2. Pivot element (Central element)
3. Elements greater than the pivot element
Pivot element can be any element from the array, it can be the first element, the last
element or any random element. In this tutorial, we will take the rightmost element or
the last element as pivot

Function: Partition (A, p, r) Partition Algorithm:


x ← A[p] Partition algorithm rearranges the sub
i ← p-1 arrays in a place.
j ← r+1 PARTITION (array A, int m, int n)
while TRUE do 1 x ← A[m]
Repeat j ← j - 1 2o←m
until A[j] ≤ x 3 for p ← m + 1 to n
Repeat i← i+1 4 do if (A[p] < x)
until A[i] ≥ x 5 then o ← o + 1
if i < j then 6 swap A[o] with A[p]
exchange A[i] ↔ A[j] 7 swap A[m] with A[o]
else 8 return o
return j
Complexity Analysis of Quick Sort:
For an array, in which partitioning leads to unbalanced subarrays, to an extent where
on the left side there are no elements, with all the elements greater than the pivot,
hence on the right side.
And if keep on getting unbalanced subarrays, then the running time is the worst case,
which is O(n2)
Whereas if partitioning leads to almost equal subarrays, then the running time is the
best, with time complexity as O (n*log n).
Worst Case Time Complexity [ Big-O]: O(n2)
Best Case Time Complexity [Big-omega]: O (n*log n)
Average Time Complexity [Big-theta]: O (n*log n)
Space Complexity: O (n*log n)
As we know now, that if subarrays partitioning produced after partitioning are
unbalanced, quick sort will take more time to finish. If someone knows that you pick the
last index as pivot all the time, they can intentionally provide you with array which will
result in worst-case running time for quick sort.
To avoid this, you can pick random pivot element too. It won't make any difference in
the algorithm, as all you need to do is, pick a random element from the array, swap it
with element at the last index, make it the pivot and carry on with quick sort.
Space required by quick sort is very less, only O(n*log n) additional space is required.
Quick sort is not a stable sorting technique, so it might change the occurence of two
similar elements in the list while sorting.
Advantages:
It is in-place since it uses only a small auxiliary stack.
It requires only n (log n) time to sort n items.
It has an extremely short inner loop.
This algorithm has been subjected to a thorough mathematical analysis, a very
precise statement can be made about performance issues.
Disadvantages:
It is recursive. Especially, if recursion is not available, the implementation is
extremely complicated.
It requires quadratic (i.e., n2) time in the worst-case.
It is fragile, i.e. a simple mistake in the implementation can go unnoticed and
cause it to perform badly.
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 3

Greedy Method:
General Method
Applications:
Job Sequencing with Deadlines
Knapsack Problem
Minimum Cost Spanning Trees
Single Source Shortest Path Problem.

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
UNIT – III

GREEDY METHODOLOGY:-
Every problem have some constraints and objective functions
OBJECTIVE FUNCTION:
It is an attempt to express a business goal (or) a function that is desired to be aximized
or minimized.
Any subset that specifies the given constraints is called feasible solution.
FEASIBLE SOLUTION:
If the given problem has ‘n’ inputs then the subset of inputs satisfies the constraints of
particular problem is called feasible solution.
OPTIMAL SOLUTION:
A feasible solution that either maximizes or minimizes the objective function is called
optimal solution.
In greedy method, we works in stages. At each stage, we take one input at a time and
make a decision, either it gives optimal solution or not.
A decision made in one stage cannot be changed in later stages. i.e, there is no
backtracking.
ALGORITHM:
Algorithm greedy(a,n) || a[1:n] contains ‘n’ inputs
{
Solution:= ¶; || initialize the solution
For i:=1 to n do
{
x :=select(a);
if feasible(solution,x)then
solution:=union(solution,x)
}
return solution
}

KNAPSACK PROBLEM:-
We have 'n' objects and a knapsack or bag.Each object has weight 'Wi' and profit 'Pi' and
knapsack has capacity 'm'. Objective is filling of Knapsack that maximises the total profit
earned.So the problem can be stated as maximise

Σ Pi Xi subject to Σ Wi Xi <= m
1<=i<=n 1<=i<=n and 0<=Xi<=1 , 1 <= i <= n To Compute maximum profit, we take
some solution factor i.e Xi.
If object directly placed Xi = 1 (If Enough space is available)
Otherwise Xi=0.
If object does not fit in the knapsack but some amount of space is available, Xi =
Remaining Space/Actual Weight of Object.
ALGORITHM :
Algorithm knapsack(p,w,n)
{
//p[1:n] & w[1:n] are profits and weights of objects such that
Pi/Wi > P(i+1)/W(i+1) > ......
{
for i := 1 to n do
x[i] := 0 (u -> capacity of bag)
m = u;
for i := 1 to n do {
if(w[i] > m) break;
x[i] := 1;
m = u - w[i];
}
if(i <= n) then
x[i] = m / w[i] ;
}
*Time Complexity is O(n).
EXAMPLE:
1. No. of objects n = 3, m = Capacity of knapsack=20
(p1,p2,p3) = (25,24,15)
(w1,w2,w3)=(18,15,10)
Fill the bag with Maximum Profit using Knapsack greedy method
sol: Given, n=3, m=20
Our main aim is to fill the bag with Maximum Profit
Case 1: (Maximum Profit)
First we place Maximum profit object
p1 = 25 w1=18 x1=1
After placing First Object
M = 20 - 18 = 2 p2 = 24 w2 = 15 x2 = 2/15
There is no space available => x3 = 0
Maximize ΣPiXi => p1x1 + p2x2 + p3x3
1<=i<=n =>25(1)+24(2/15)+15(0) =>25+3.2 =>28.2
Case 2: (Minimum Weight)
Among the above three onjects W3 weight is minimum
w3=10 M=20-10=10 x3=1 [Fitted]
Next object is w2=15
x2 = 10/15 Bag is Full
So,x1=0
Here,Σ Pixi = p1x1 + p2x2 + p3x3
1<=i<=n => 25(0)+24(10/15)+15(1) =>0+16+15 =>31

Case 3: (Maximum Profit per unit weight)


p1/w1 = 25/18 = 1.4
p2/w2 = 24/15 = 1.6
p3/w3 = 15/10 = 1.5
We place an item in bag whose profit per weight (p/w) is maximum i.e.
p2/w2 is maximum so that x2=1
M= 20-15=5 x3 = 5/10=0.5 x1=0
=>25(0)+24(1)+15(0.5) =>24+7.5 =>31.5

Therefore,
The Maximum profitable feasible solution that maximises profit is 31.5
The optimal solution is (0,1,1/2)
Time Complexity is O(n).

MINIMUM COST SPANNING TREE:


It is a connected undirected acyclic graph.let G = ( V , E) be an connected graph , then
the subgraph t = (V , E') of G is a spanning tree iff 't' is a tree.
WEIGHTED GRAPH: A collection of vertices , edges and also weights on the edges then
the graph is said to be weighted graph.
SPANNING TREE: Any tree consisting of all vertices of a graph , then it is called a
spanning tree.
MINIMUM COST SPANNING TREE: A spanning tree with minimum cost is called minimum
cost spanning tree. In case of complete graph the possible number of spanning trees
are n^(n – 2).
To find the minimum cost spaning tree we will use two standard algorithms.
1. prims Algorithm
2. 2. krushkal's Algorithm
1.PRIMS ALGORITHM:
Step 1: Take minimum cost edges from all the edges
Step 2: Take another edge having minimum weight among those vertices which are
adjacent to previous edge.
Step 3: Repeat the above process untill spanning tree contains (n – 1) edges.
EXAMPLE:

Total cost = 4.0 + 8.0 + 1.0 + 2.0 + 2.0 + 11.0 + 7.0 + 4.0 = 39

ALGORITHM PRIMS (G):

//INPUT: A weighted connected graph G = <V , E>


//OUTPUT: T minimum spanning tree of G
{
V = { V0}
T = { } // empty graph
for i = 1 to n – 2
{
choose nearest neighbour Vj of V that is adjacent to Vi
Vi belongs to V and for which edge e i j = (Vi ,Vj ) does not form a
cycle with members of T.
V = V union { Vj }
T = T union { e i j }
}
return T;
}

2.)KRUSHKAL'S ALGORITHM:

Kruskal's algorithm is a minimum-spanning-tree algorithm which finds an edge of the


least possible weight that connects any two trees in the forest.
It is a greedy algorithm in graph theory as it finds a minimum spanning tree for
a connected weighted graph adding increasing cost arcs at each step. This means it finds
a subset of the edges that forms a tree that includes every vertex, where the total
weight of all the edges in the tree is minimized. If the graph is not connected, then it
finds a minimum spanning forest
Below are the steps for finding MST using Kruskal’s algorithm:
Step 1: Take minimum cost edges from all the edges
Step 2: Take minimum weight edge among all remaining (n – 1 ) edges
Step 3: Repeat the above process untill spanning tree contains (n – 1) edges.
EXAMPLE:

Total cost = 4 + 8 + 7 + 9 + 2 + 4 + 1 + 2 = 37
ALGORITHM:
ALGORITHM PRIMS ( G ):
//INPUT: A weighted connected graph G = <V , E>
//OUTPUT: T minimum spanning tree of G
{
T = { } // empty graph
for i = 1 to n – 2 do {
e := any edge in G with smallest weight that does not form a cycle when added
to T
T := T union { e }
}
return T; }
SINGLE SOURCE SHORTEST PATH PROBLEM:
Graphs can be used to represent distance between states or country with vertices
representing cities and edge representing sections of highway. The edges can be assign
weights which may be either distances between two cities connected by edge or average
time to drive along that section of highway.

Start from source vertex :

->Set s[1] = true; and set[2] = true; {1,2,3,4} = ∞;


{1,2,3,5} = 70;
{1,2} = 10; {1,2,3} = 60;
Now shortest distance from
{1,3} = ∞; {1,2,4} = ∞; vertex 3 is 10 i.e{1,2,3,5}
and selected vertex is 5;
{1,4} = 30; {1,2,5} =∞;
set s[5] = true;
{1,5} = 100; Now shortest distance from destination vertex 5 is
achieved through shortest
Now shortest distance from vertex 2 is 50 i.e{1,2,3}
path 1-2-3-5 with path
vertex 1 is10 i.e{1,2} and set[3] = true; length 70.
ANALYSIS :- single source shortest path O(N^2) times
ALGORITHM:-
Algorithm shortest path(v,cost,dist,n) for num:= 2 to n do{
{ // determine n-1 paths from v
For i:= 1 to n do Choose u from among those vertices not in
{ s(not visible)
s[i] := false; Such that dist[u] is minimum
dist[i] := cost[v,i]; dist[u] := min{dist[i]}
} s[u]:= true;
s[v] = true; for (each w adjacent to u with s[w] =
dist[v] =0.0; false) do
if(dist[w]>dist[u]+cost[u,w]) then
dist[w] = dist[u]+cost[u,w];
}
}
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
Scanned by CamScanner
DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 4

Dynamic Programming:
General Method
Applications:
Matrix Chain Multiplication
0/1 Knapsack Problem
All Pairs Shortest Path Problem
Travelling Sales Person Problem
Reliability Design

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 5

Backtracking:
General Method
Applications:
N-Queen Problem
Sum of Subsets Problem
Graph Coloring
Hamiltonian Cycles.
Branch and Bound:
General Method
Applications:
Travelling Sales Person Problem
0/1 Knapsack Problem
LC Branch and Bound Solution
FIFO Branch and Bound Solution.

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
DESIGN AND ANALYSIS OF ALGORITHMS

(DAA)

UNIT – 6

NP-Hard and NP-Complete Problems:


Basic Concepts
Non Deterministic Algorithms
NP Hard and NP Complete Classes
Cook‘s Theorem.

SHRI VISHNU ENGINEERING COLLEGE FOR WOMEN : : BHIMAVARAM

(AUTONOMOUS)
UNIT – 6
NP Hard and NP Complete
Polynomial Time Exponential Time
Linear search-n 0/1 knapsack - 2^n
Binary search-log n Travelling sp -2^n
Insertion sort- n^2 sum of subsets – 2^n
Merge sort – n log n Graph colouring – 2^n
Matrix multiplication -n^3 Hamiltonian cycle -2^n
Actually this topic is research topic. The algorithms are categorized into two types:
1. Polynomial time taking algorithms
2. Exponential time taking algorithms
Here we have linear search algorithm which takes O(n) time and faster than that is binary
search algorithm which takes O(log n) time , and for sorting algorithm merge sort which
takes O(n log n) which takes less time and we need faster algorithms which is faster than this
one.
Similarly for these exponential time algorithms we need polynomial time taking
algorithms. All these algorithms which taking exponential time we want faster and easy
method to solve them in just polynomial time because 2^n, n^n , etc.. are much bigger than
polynomial time.
As these exponential algorithms are very time consuming so we need polynomial time for
them, this is our requirement.
This is a research area and the person from computer science and mathematics can solve
these problems so people have been doing research on this one but there is no particular
solution found so far yet for solving these problems in polynomial time.
Then ,when the research work is going fruitless we want something such that whatever the
work we done should be useful so these are the guidelines or framework made for doing
research on these type of problems i.e., exponential type problems and that framework is
NP Hard and NP Complete.
Let us see the basic idea behind that so there are two points on which the entire topic is
based on these two points only.
1. When you are unable to solve these exponential problems, that you are unable to get
solution in polynomial time algorithm for them at least you do the work such that you
try to show the similarities between them so that if one problem is solved the other can
also be solved. We will not be doing research work individually on each and every
problem like one is working on knapsack problem and the other on travelling sp problem.
Let us combine them collectively and put some effort such that if one problem is solved all
the other problems should be solved. So far that we have to show the relationship between
them.
“Try to relate the problem either solve it or at least related”
2. When we are unable to write algorithm for them that is deterministic, why don’t we
write non deterministic algorithm.

Deterministic Algorithm:
Each and every statement how it works we know it clearly. We are sure how they work
i.e., we know the working of the algorithm. This type of algorithms are called deterministic
algorithms.

Non deterministic Algorithms


Here, we don’t know how the algorithm works. Then how can we write the algorithms
which doesn’t know? While writing an algorithm some of the statements may not be figuring
out,how to make them polynomial so leave them blanks and say this is non deterministic and
when we know how it has to be filled ,we fill up that statements.
In a non-deterministic algorithm also most of the statements may be deterministic, but
some of the statements are nondeterministic
By writing this non-deterministic algorithms we can preserve our research work so that in
future somebody else work on this same problem. He/she can make the nondeterministic part
as deterministic. This is the main idea in writing non deterministic algorithms.
Let us take an example :
Non deterministic Search Algorithm
Algorithm Nsearch (A, n, key)
{
J=choice();
If(key=A[i])
{
Write(j);
Success();
}
else{
write(0);
failure();
}
}
Assume that all these statements ( choice(), success(), failure()) takes just one unit of time.
This algorithm is about searching a key from an array of N elements.
While seeing this algorithm we can say that the algorithm takes O(1 ) time.this is the constant
time algorithm for searching and the fastest searching algorithm is binary search that takes O(
log n) time. And this algorithm takes O( 1) and is really faster , we need this one but it is non
deterministic.
In the above algorithm choice() gives the index of key element and we are checking the
key element is present at index j ,then search is successful otherwise failure and if key
element is not present at index ‘j’ then key element is not present in the array then search is
unsuccessful
Now the question is how this choice() knows that the key element is present at j th
index.that’s what it is non deterministic.Once we know the position of key element in
constant time then we will fill up the choice() line. Choice() line is like a blank statement for
us,now once we fill this we have an algorithm of O(1) for searching.
Example: we have an array .
10 8 5 9 4 2

If key=5. Then choice() will give directly index 2.but how we doesn’t know
Similarly, we write non deterministic algorithms for exponential problems with this we
define two classes

P and NP classes:
P - ‘ P ’ is a set of those deterministic algorithms which are taking polynomial time
Example linear search, binary search, bubble sort etc…
NP -These algorithms are non deterministic but they take polynomial time. We actually
write these algorithms for exponential time taking algorithms.
NP- Non deterministic Polynomial time taking algorithms.

NP
P

‘ P ’ is shown as subset of ‘ NP ’ which means the deterministic algorithms that we know


today were the part of non deterministic algorithms.
We completed the first thing that is writing non deterministic polynomial time taking
algorithms. How to write is completed
Now the second part if you are unable to solve , atleast show the relationship between them
such that if one problem it shoulb be easy for solving the other problems also in polynomial
time .we will not work on each problem individually so how to relate them that is the next
thing we are going to discuss
So for relating them together we need some problem as a base problem and the base problem
is satisfiability problem
satisfiability:-

• In polynomial time, if you unable to solve at least show the relationship between them
such that if one problem is solved, then it should be easy for solving other problems also
in polynomial time.
• We will not work on each and every problem individually.so for relating them together
we need some problem as base problem.
• The base problem is Satisfiability.
• Let x1, x2, x3…. xn denotes Boolean variables
• .Let xi denotes the relation of xi.
• A literal is either a variable or its negation
• .A formula in the prepositional calculus is an expression that can be constructed using
literals and the operators and Ʌ or v.
• A clause is a formula with at least one positive literal.
• The satisfiability problem is to determine if a formula is true for some assignment of truth
values to the variables.
• It is easy to obtain a polynomial time non determination algorithm that terminates
successfully if and only if a given prepositional formula E (x1, x2……xn) is satiable.
• Such an algorithm could proceed by simply choosing (non deterministically) one of the 2
n possible assignments of truth values to (x1, x2…xn) and verify that E (x1, x2…xn) is true
for that assignment.

The satisfiability problem:-


The logical formula:
X1 v X2 v X3
& - x1
& - x2
The assignment:
x 1← F, x2← F, x3 ← T
will make the above formula ꓯ≠true.
(-x1, -x2, x3) represents x1 ← F, x2 ← F, x3 ← T
• If there is at least one assignment which satisfies a formula, then we say that this formula
is satisfiable; otherwise, it is unsatisfiable.
• An unsatisfiable formula:
x1 v x2
& x1 v –x
& -x1 v x2
& -x1 v -x2
• Definition of the satisfiability problem:
Given a Boolean formula, determine whether this formula is satisfiable or not.

• A literal: xi or -xi
• A clause: x1 v x2 v -x3 Ci
• A formula: conjunctive normal form (CNF) C1& C2& … & Cm

NP hard and NP complete Classes:


A problem is in the class NPC if it is in NP and is as hard as any problem in NP. A
problem is NP-hard if all problems in NP are polynomial time reducible to it, even though it
may not be in NP itself.

If a polynomial time algorithm exists for any of these problems, all problems in NP would
be polynomial time solvable. These problems are called NP-complete. The phenomenon of
NP-completeness is important for both theoretical and practical reasons.

Definition of NP-Completeness:
A language B is NP-complete if it satisfies two conditions
• B is in NP
• Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the
language B is known as NP-Hard. Informally, a search problem B is NP-Hard if there
exists some NP-Complete problem A that Turing reduces to B.

The problem in NP-Hard cannot be solved in polynomial time, until P = NP. If a


problem is proved to be NPC, there is no need to waste time on trying to find an efficient
algorithm for it. Instead, we can focus on design approximation algorithm.
NP-Complete Problems

Following are some NP-Complete problems, for which no polynomial time algorithm is
known.

• Determining whether a graph has a Hamiltonian cycle


• Determining whether a Boolean formula is satisfiable, etc.

NP-Hard Problems

The following problems are NP-Hard

• The circuit-satisfiability problem


• Set Cover
• Vertex Cover
• Travelling Salesman Problem
In this context, now we will discuss TSP is NP-Complete

TSP is NP-Complete

The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to the
same city. The challenge of the problem is that the traveling salesman wants to minimize the
total length of the trip

Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. In
TSP, we find a tour and check that the tour contains each vertex once. Then the total cost of
the edges of the tour is calculated. Finally, we check if the cost is minimum. This can be
completed in polynomial time. Thus, TSP belongs to NP.
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NPcomplete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where
E’={(i ,j): i , j ∈ V and i ≠ j
Thus, the cost function is defined as follows −
t( i, j)= {0 if(I ,j)∈E
1 Otherwise,
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge
in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus, if
graph G has a Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The cost of edges
in E' are 0 and 1 by definition. Hence, each edge must have a cost of 0 as the cost of h' is 0.
We therefore conclude that h' contains only edges in E.
We have thus proven that G has a Hamiltonian cycle, if and only if G' has a tour of cost at
most 0. TSP is NP-complete.

NP-HARD GRAPH AND SCHEDULING PROBLEMS


Some NP-hard Graph Problems:
The strategy to show that a problem L2 is NP-hard is
I. Pick a problem L1 already known to be NP-hard.

II. Show how to obtain an instance I 1 of L2 from any instance I of L1 such that from the
solution of I 1 - We can determine (in polynomial deterministic time) the solution to
instance I of L1.

III. Conclude from (ii) that L1α L2.

IV. Conclude from (i), (ii), and the transitivity of α that Satisfiabilityα L1
L1 αL2
Therefore, Satisfiability L2
and L2 is NP-hard

1. Chromatic Number Decision Problem (CNP)

• A coloring of a graph G = (V, E) is a function f: V → {1,2, …, k} ꓯ I ϵ V.

• If (U, V) ϵ E then f(u) ≠f(v).

• The CNP is to determine if G has a coloring for a given K.

• Satisfiability with at most three literals per clause α chromatic number problem.
Therefore, CNP is NP-hard.

2. Directed Hamiltonian Cycle (DHC): -

• Let G = (V, E) be a directed graph and length n = 1V1

• The DHC is a cycle that goes through every vertex exactly once and then returns to
the starting vertex.

• The DHC problem is to determine if G has a directed Hamiltonian Cycle.


Theorem: CNF (Conjunctive Normal Form) satisfiabilityα DHC.
Therefore, DHC is NP-hard.

3. Problem (TSP) :Travelling Salesperson Decision:


• The problem is to determine if a complete directed graph G = (V, E) with edge costs
C(u, v) has a tour of cost at most M.
Theorem: Directed Hamiltonian Cycle (DHC) αTSP
• But from problem (2) satisfiabilityα DHC
Therefore, Satisfiability α TSP
and TSP is NP-hard.

NP-HARD SCHEDULING PROBLEMS

Sum of subsets: -
The problem is to determine if A= {a1, a2 ……., an} (a1, a2 ………,an are positive
integers) has a subset S that sums to a given integer M. 2.

Schedule identical processors: -

• Let Pi 1≤i≤m be identical processors or machines Pi.

• Let Ji 1≤i≤n be n jobs.

• Jobs Jirequires ti processing time.

• A schedule S is an assignment of jobs to processors.

• For each job Ji, S specifies the time intervals and the processors on which this job is to be

processed.

• A job cannot be processed by more than one processor at any given time. The problem is

to find a minimum finish time non-preemptive schedule.

• The finish time of S is FT(S) = max {Ti} 1≤i≤m.

• Where Ti is the time at which processor Pi finishes processing all jobs (or job segments)

assigned to it.

Cook’s Theorem:
• The Cook-Levin theorem, also known as Cook’s theorem, states that the Boolean
satisfiability problem is NP-complete. That is, any problem in NP can be reduced in
polynomial time by a deterministic Turing machine to the problem whether a Boolean
formula is satisfiable.
• The theorem is named after Stephen Cook and Leonid Levin.
• An important consequence of this theorem is that if there exists a deterministic
polynomial time algorithm for solving Boolean satisfiability, then every NP problem can
be solved by a deterministic polynomial time algorithm.

The work shows that Cook’s theorem is the origin of the loss of non-determinism in terms
of the equivalence of the two definitions of NP, the one defining NP as the class of problems
solvable by a nondeterministic Turing machine in polynomial time, and the other defining N
P as the class of problems verifiable by a deterministic Turing machine in polynomial time.
Therefore, we argue that fundamental difficulties in understanding P versus NP lie firstly at
cognitionlevel, then logic level.

Following are the four theorems by Stephen Cook −

Theorem-1

If a set S of strings is accepted by some non-deterministic Turing machine within


polynomial time, then S is P-reducible to {DNF tautologies}.

Theorem-2

The following sets are P-reducible to each other in pairs (and hence each has the same
polynomial degree of difficulty): {tautologies}, {DNF tautologies}, D3, {sub-graph pairs}.

Theorem-3

• For any TQ(k) of type Q, TQ(k)k√(logk)2TQ(k)k(logk)2 is unbounded


• There is a TQ(k) of type Q such that TQ(k)⩽2k(logk)2TQ(k)⩽2k(logk)2

Theorem-4

If the set S of strings is accepted by a non-deterministic machine within time T(n) = 2n, and
if TQ(k) is an honest (i.e. real-time countable) function of type Q, then there is a constant K,
so S can be recognized by a deterministic machine within time TQ(K8n).
• First, he emphasized the significance of polynomial time reducibility. It means that if
we have a polynomial time reduction from one problem to another, this ensures that
any polynomial time algorithm from the second problem can be converted into a
corresponding polynomial time algorithm for the first problem.
• Second, he focused attention on the class NP of decision problems that can be solved
in polynomial time by a non-deterministic computer. Most of the intractable
problems belong to this class, NP.
• Third, he proved that one particular problem in NP has the property that every other
problem in NP can be polynomially reduced to it. If the satisfiability problem can be
solved with a polynomial time algorithm, then every problem in NP can also be
solved in polynomial time. If any problem in NP is intractable, then satisfiability
problem must be intractable. Thus, satisfiability problem is the hardest problem in
NP.
• Fourth, Cook suggested that other problems in NP might share with the satisfiability
problem this property of being the hardest member of NP.

In order to prove this, we require a uniform way of representing NP problems.


Remember that what makes a problem NP is the existence of a polynomial-time algorithm
more specifically, a Turing machine for checking candidate certificates.
Assume, then, that we are given an NP decision problem D. By the definition of NP,
there is a polynomial function P and a Turing machine M which, when given any instance I
of D, together with a candidate certificate c, will check in time no greater than P(n), where n
is the length of I, whether or not c is a certificate of I
Let us assume that M has q states numbered 0, 1, 2, . . . , q − 1, and a tape alphabet a1,
a2, . . . , as. We shall assume that the operation of the machine is governed by the functions
T, U, and D. We shall further assume that the initial tape is inscribed with the problem
instance on the squares 1, 2, 3, . . . , n, and the putative certificate on the squares −m, . . . , −2,
−1. Square zero can be assumed to contain a designated separator symbol. We shall also
assume that the machine halts scanning square 0, and that the symbol in this square at that
stage will be a1 if and only if the candidate certificate is a true certificate. Note that we must
have m ≤ P(n). This is because with a problem instance of length n the computation is
completed in at most P(n) steps; during this process, the Turing machine head cannot move
more than P(n) steps to the left of its starting point. We define some atomic propositions with
their intended interpretations as follows:
1. For i = 0, 1, . . . , P(n) and j = 0, 1, . . . , q − 1, the proposition Qij says that after i
computation steps, M is in state j.
2. For i = 0, 1, . . . , P(n), j = −P(n), . . . , P(n), and k = 1, 2, . . . , s, the proposition Sijk says
that after i computation steps, square j of the tape contains the symbol ak.
3. i = 0, 1, . . . , P(n) and j = −P(n), . . . , P(n), the proposition Tij says that after i computation
steps, the machine M is scanning square j of the tape.
Next, we define some clauses to describe the computation executed by M:
1. At each computation step, M is in at least one state. For each i = 0, . . . , P(n) we have the
clause
Qi0 ∨ Qi1 ∨ · · · ∨ Qi(q−1),
giving (P(n) + 1)q = O(P(n)) literals altogether.

2. At each computation step, M is in at most one state. For each i = 0, . . . , P(n) and for each
pair j, k of distinct states, we have the clause
¬(Qij ∧ Qik),
giving a total of q(q − 1)(P(n) + 1) = O(P(n)) literals altogether
3. At each step, each tape square contains at least one alphabet symbol. For each i = 0, . . . ,
P(n) and −P(n) ≤ j ≤ P(n) we have the clause
Sij1 ∨ Sij2 ∨ · · · ∨ Sijs,
giving (P(n) + 1)(2P(n) + 1)s = O(P(n) 2 ) literals altogether.

4. At each step, each tape square contains at most one alphabet symbol. For each i = 0, . . . ,
P(n) and −P(n) ≤ j ≤ P(n), and each distinct pair ak, al of symbols we have the clause
¬(Sijk ∧ Sijl),
giving a total of (P(n) + 1)(2P(n) + 1)s(s − 1) = O(P(n) 2 ) literals altogether

5. At each step, the tape is scanning at least one square. For each i = 0, . . . , P(n), we have
the clause
Ti(−P(n)) ∨ Ti(1−P(n)) ∨ · · · ∨ Ti(P(n)−1) ∨ TiP(n) ,
giving (P(n) + 1)(2P(n) + 1) = O(P(n) 2 ) literals altogether.

6. At each step, the tape is scanning at most one square. For each i = 0, . . . , P(n), and
each distinct pair j, k of tape squares from −P(n) to P(n), we have the clause
¬(Tij ∧ Tik),
giving a total of 2P(n)(2P(n) + 1)(P(n) + 1) = O(P(n) 3 ) literals.

7. Initially, the machine is in state 1 scanning square 1. This is expressed by the two clauses
Q01, T01,
giving just two literals.

8. The configuration at each step after the first is determined from the configuration at the
previous step by the functions T, U, and D defining the machine M. For each i = 0, . . . ,
P(n), −P(n) ≤ j ≤ P(n), k = 0, . . . , q − 1, and l = 1, . . . , s, we have the clauses

Tij ∧ Qik ∧ Sijl → Q(i+1)T(k,l)


Tij ∧ Qik ∧ Sijl → S(i+1)jU(k,l)
Tij ∧ Qik ∧ Sijl → T(i+1)(j+D(k,l))
Sijk → Tij ∨ S(i+1)jk
The fourth of these clauses ensures that the contents of any tape square other than the
currently scanned square remains the same (to see this, note that the given clause is
equivalent to the formula Sijk ∧ ¬Tij → S(i+1)jk). These clauses contribute a total of
(12s + 3)(P(n) + 1)(2P(n) + 1)q = O(P(n) 2 ) literals.
9. Initially, the string ai1 ai2 . . . ain defining the problem instance I is inscribed on squares
1, 2, . . . , n of the tape. This is expressed by the n clauses
S01i1 , S02i2 , . . . , S0nin ,
a total of n literals.

10. By the P(n)th step, the machine has reached the halt state, and is then scanning square 0,
which contains the symbol a1. This is expressed by the three clauses
QP(n)0, SP(n)01, TP(n)0,
giving another 3 literals.

Altogether the number of literals involved in these clauses is O(P(n) 3 ) (in working
this out, note that q and s are constants, that is, they depend only on the machine and do not
vary with the problem instance; thus they do not contribute to the growth of the the number
of literals with increasing problem size, which is what the O notation captures for us). It is
thus clear that the procedure for setting up these clauses, given the original machine M and
the instance I of problem D, can be accomplished in polynomial time.
We must now show that we have succeeded in converting D into SAT. Suppose first
that I is a positive instance of D. This means that there is a certificate c such that when M is
run with inputs c, I, it will halt scanning symbol a1 on square 0. This means that there is some
sequence of symbols that can be placed initially on squares −P(n), . . . , −1 of the tape so that
all the clauses above are satisfied. Hence those clauses constitute a positive instance of SAT.
Conversely, suppose I is a negative instance of D. In that case there is no certificate
for I, which means that whatever symbols are placed on squares −P(n), . . . , −1 of the tape,
when the computation halts the machine will not be scanning a1 on square 0. This means that
the set of clauses above is not satisfiable, and hence constitutes a negative instance of SAT.
Thus from the instance I of problem D we have constructed, in polynomial time, a set
of clauses which constitute a positive instance of SAT if and only I is a positive instance of
D. In other words, we have converted D into SAT in polynomial time. And since D was an
arbitrary NP problem it follows that any NP problem can be converted to SAT in polynomial
time.

You might also like