DAA 18CS42 Module1

The document outlines the syllabus for a course on Design and Analysis of Algorithms, covering topics such as algorithm specifications, performance analysis, and various algorithmic techniques including divide and conquer, greedy methods, dynamic programming, and backtracking. It includes detailed modules on these techniques, their applications, and relevant data structures, along with course outcomes that emphasize problem-solving and algorithm analysis skills. Additionally, it lists textbooks and references for further study in the field.


DESIGN AND ANALYSIS OF

ALGORITHMS

Semester – IV
Course Code: 18CS42
Credits-4
Syllabus
Module -1 10 Hours
• Introduction: What is an Algorithm? (T2:1.1), Algorithm Specification (T2:1.2),
Analysis Framework (T1:2.1), Performance Analysis: Space complexity, Time
complexity (T2:1.3). Asymptotic Notations: Big-Oh notation (O), Omega
notation (Ω), Theta notation (Θ), and Little-oh notation (o), Mathematical
analysis of Non-Recursive and recursive Algorithms with Examples (T1:2.2, 2.3,
2.4). Important Problem Types: Sorting, Searching, String processing, Graph
Problems, Combinatorial Problems. Fundamental Data Structures: Stacks,
Queues, Graphs, Trees, Sets and Dictionaries.

Module – 2 10 Hours
• Divide and Conquer: General method, Binary search, Recurrence equation for
divide and conquer, Finding the maximum and minimum (T2:3.1, 3.3, 3.4),
Merge sort, Quick sort (T1:4.1, 4.2), Strassen’s matrix multiplication (T2:3.8),
Advantages and Disadvantages of divide and conquer. Decrease and Conquer
Approach: Topological Sort. (T1:5.3).
Module – 3 10 Hours
Greedy Method: General method, Coin Change Problem, Knapsack Problem,
Job sequencing with deadlines (T2:4.1, 4.3, 4.5). Minimum cost spanning
trees: Prim’s Algorithm, Kruskal’s algorithm (T1:9.1, 9.2). Single source
shortest paths: Dijkstra's Algorithm (T1:9.3). Optimal Tree problem: Huffman
Trees and Codes (T1:9.4). Transform and Conquer Approach: Heaps and Heap
Sort (T1:6.4).

Module-4 10 Hours
Dynamic Programming: General method with Examples, Multistage Graphs
(T2:5.1, 5.2). Transitive Closure: Warshall’s Algorithm, All Pairs Shortest Paths:
Floyd's Algorithm, Optimal Binary Search Trees, Knapsack problem ((T1:8.2,
8.3, 8.4), Bellman-Ford Algorithm (T2:5.4), Travelling Sales Person problem
(T2:5.9), Reliability design (T2:5.8)
Module-5 10 Hours
Backtracking: General method (T2:7.1), N-Queens problem (T1:12.1), Sum of
subsets problem (T1:12.1), Graph coloring (T2:7.4), Hamiltonian cycles
(T2:7.5). Branch and Bound: Assignment Problem, Travelling Sales Person
problem (T1:12.2), 0/1 Knapsack problem (T2:8.2, T1:12.2): LC Branch and
Bound solution (T2:8.2), FIFO Branch and Bound solution (T2:8.2). NP-
Complete and NP-Hard problems: Basic concepts, non-deterministic
algorithms, P, NP, NP-Complete, and NP-Hard classes (T2:11.1)
• Text Books
1. Introduction to the Design and Analysis of Algorithms,
Anany Levitin, 2nd Edition, 2009, Pearson.
2. Computer Algorithms/C++, Ellis Horowitz, Sartaj Sahni and
Sanguthevar Rajasekaran, 2nd Edition, 2014, Universities Press

• Reference Books
• Introduction to Algorithms, Thomas H. Cormen, Charles E.
Leiserson, Ronald L. Rivest, Clifford Stein, 3rd Edition, PHI
• Design and Analysis of Algorithms, S. Sridhar, Oxford (Higher
Education)
COURSE OUTCOMES
• Able to mathematically analyze given complex problems,
apply various computational problem-solving
techniques, analyze their complexity, and
learn the fundamental data structures.
• Apply and analyze divide and conquer approaches and
decrease and conquer approaches in solving the problems
• Able to choose the appropriate algorithmic design
technique like greedy method, transform and conquer
approaches and compare the efficiency of algorithms to
solve the given appropriate problem.
• Able to apply and analyze dynamic programming
approaches.
• Able to apply and analyze backtracking, branch and bound, and
approximation algorithms to solve complex problems, and
to describe the P, NP, and NP-Complete problem classes
MODULE - 1
Introduction - The notion of the algorithm
• An algorithm is a sequence of unambiguous instructions
for solving a problem, i.e., for obtaining a required
output for any legitimate input in a finite amount of
time.
• This definition can be illustrated by a simple diagram.

• The reference to “instructions” in the definition implies
that there is something or someone capable of
understanding and following the instructions given. We
call this a “computer.”
Important points
• The nonambiguity requirement for each step of an algorithm
cannot be compromised.
• The range of inputs for which an algorithm works has to be
specified carefully.
• The same algorithm can be represented in several different
ways.
• There may exist several algorithms for solving the same
problem.
• Algorithms for the same problem can be based on very
different ideas and can solve the problem with dramatically
different speeds.
• Example: the greatest common divisor of two nonnegative,
not-both-zero integers m and n, denoted gcd(m, n), is defined
as the largest integer that divides both m and n evenly, i.e.,
with a remainder of zero.
Algorithms: Properties
• Input : Zero or more quantities are externally supplied
• Output : At least one quantity is produced
• Definiteness : Each instruction is clear and unambiguous
• Finiteness : If we trace out the instructions of an algorithm,
then for all cases ,the algorithm terminates after a finite
number of steps
• Effectiveness : Every instruction must be very basic so that it
can be carried out, in principle, by a person using only pencil
and paper
• Computational Procedures – Algorithms which are Definite
and Effective.
Ex: Operating System of a digital computer
To achieve definiteness, use programming languages.

• A Program is the expression of an algorithm in a programming
language.
Program – synonyms – procedure, function, subroutine.
Important Areas of Research
• How to devise algorithms
• Understand the design techniques – dynamic, linear,
non-linear, integer programming.
• How to validate algorithms (2 phases)
• Process to check whether algorithm computes answer
for all possible legal inputs – Algorithm Validation.
• program proving or program verification – done using
assertions and specifications – complete proof to prove
each and every operation is correct .
Important Areas of Research
• How to analyze algorithms(performance
analysis)
• Task of determining how much computing time and
storage an algorithm requires.
• Can compare 2 algorithms quantitatively
• Check whether it meets the constraints
• How to test a program
• Debugging – process of executing programs on sample
inputs to determine whether errors occur and, if so,
to correct them.
• Profiling ( performance measurement )- process of
executing a correct program on inputs and measuring
the time & space it takes to compute the results.
Algorithm Specification
• Difference between an algorithm and a program : The
program does not have to satisfy the fourth condition
(Finiteness).
– Ex – Operating System
• Describing an algorithm in 3 ways:
– natural language such as English.
– Using graphical representation called Flowchart – feasible
for small problems.
– Pseudocode – combination of English + programming
language like C, C++ ….
General Approach of writing pseudocode
1. Comments begin with //
2. Compound statements or the body of a procedure are shown as blocks,
indicated with indentation or matching braces { and }.
3. An identifier begins with a letter. The data types of variables are not
explicitly declared. Compound data types can be formed with records:
4. Node = record
{
    data type data_1;
    ………..
    data type data_n;
}
5. Assignment of values to a variable is written as <variable> := <expression>
or <variable> ← <expression>.
6. Array elements are accessed using a[i]; in a 2-D array the
(i, j)th element is denoted by A[i, j].
7. Looping statements employed are for, while and repeat-until
while<condition> do
{
< statement 1 >
……………..
……………..
< statement n >
}
for variable := value 1 to value 2 step step do
{
< statement 1 >
……………..
……………..
< statement n >
}
repeat
< statement 1 >
……………..
……………..
< statement n >
until< condition >
The break instruction may be used within a loop to force an exit. Return may be
used to exit from a procedure.

8. Conditional statements have the following forms:

if <condition> then <statement>
if <condition> then <statement 1> else <statement 2>

Here condition is a boolean expression, and each statement may be simple or
compound.
A case statement may be used for multiple choice:
case
{
    :<condition 1>: <statement 1>
    ………………….
    :<condition n>: <statement n>
    :else: <statement n+1>
}
9. Input and output are done using read and write Instructions.
10. There is only one type of procedure, consisting of a heading and a body.
The heading takes the form Algorithm Name(<parameter list>).
The body has simple or compound statements.
Example: Algorithm to find the maximum among n given numbers
Algorithm Max(A, n)
// A is an array of size n
{
    Result := A[1];
    for i := 2 to n do
        if A[i] > Result then Result := A[i];
    return Result;
}
Example -Translating a Problem
into Algorithm
• Problem – Selection Sort
– Devise a program that sorts a set of n integers, where n>=
1, from smallest to largest.

• Solution I:
– From those integers that are currently unsorted, find the
smallest and place it next in the sorted list.

• It describes the sorting problem, but it is not an
algorithm (questions are left unanswered)
Example -Translating a Problem
into Algorithm
• Solution II: we assume elements are stored in an array a, ith
element is stored at position i, a[i].
• An algorithm may be written partly in C/C++ and partly in English:

for (i = 1; i <= n; i++) {
    examine a[i] to a[n] and suppose
    the smallest integer is at a[min];
    interchange a[i] and a[min];
}
• Solution III (a is treated as 1-based, a[1..n], as in the textbook)

1. void SelectionSort(int a[], int n)
2. // sort the array into nondecreasing order
3. {
4.   for (int i = 1; i < n; i++) {
5.     int min = i;
6.     for (int k = i + 1; k <= n; k++)
7.       if (a[k] < a[min]) min = k;
8.     int temp = a[i]; a[i] = a[min]; a[min] = temp;
9.   }
10. }
• Theorem 1.1
Function SelectionSort(a, n) correctly sorts a set of n >= 1
integers. The result remains in a[1], …, a[n] such that
a[1] <= a[2] <= … <= a[n].

• Proof:
• We first note that for any i, following the
execution of lines 5 to 8, it is the case that
a[i] <= a[k] for i < k <= n, and a[1:i] is unchanged by
later iterations.
• Hence, following the last execution of these lines (that
is, i = n − 1),
• we have a[1] <= a[2] <= … <= a[n].
Recursive Algorithms
• Recursion is a process in which a procedure calls itself. In C
the function that calls itself is called as recursive function and
the algorithm is called as recursive algorithm.
• Two types
1. Direct recursion – if the function A calls function A itself
2. Indirect recursion – if function A calls function B, function B
calls function A ( recursive chain)
• A recursive procedure must have the following 2 properties
1. There must be a base criterion (terminating condition) for
which the procedure does not call itself
2. Successive recursive calls should lead towards the base criterion
Recursion - Examples
• Factorial function – f(n) = n*f(n-1) for n ≠ 0
f(n) = 1 for n = 0
Algorithm: FACTORIAL (FACT, N)
// This procedure calculates N! and returns the value in the
variable FACT
1) If N = 0, then Set FACT := 1 and Return
2) Call FACTORIAL (FACT, N-1)
3) Set FACT := N * FACT
4) Return
Example :
• 5!=5*4*3*2*1
• 5!=5*4!
• n!=n*(n-1)!
Example : Factorial(Fact,5)
• Factorial(Fact,5)
Factorial(Fact,4)
Factorial(Fact,3)
Factorial(Fact,2)
Factorial(Fact,1)
Factorial(Fact,0)
Fact=1*1
Fact=2*1
Fact=3*2
Fact=4*6
Fact=5*24 =120
GCD
• GCD – gcd(m, n) = gcd(n, m mod n) for n ≠ 0
gcd(m, n) = m for n = 0
Algorithm: GCD(m, n)
// This procedure calculates the GCD of two numbers m, n
and Returns the value
1. If n=0 then, Set gcd := m; Return gcd
2. Call GCD ( n, m mod n)
3. End
Tower of Hanoi
• In this puzzle, we have 3 towers or pegs and n disks of
different sizes .
• Initially, all the disks are on the first peg called source peg in
proper order of size, that is the largest on the bottom and
the smallest on top.
• The goal is to move all the disks to the third peg called
destination peg using the second one as an auxiliary, if
necessary.
• We can move only one disk at a time, and it is forbidden to
place a larger disk on top of a smaller one.
Solution to Tower of Hanoi Problem
• Move top n-1 disks from source peg A to peg B using peg C as Auxpeg
• Transfer the remaining disk from peg A to peg C
• Move n-1 disks from peg B to peg C using peg A as Auxpeg
Algorithm:
TOWER (N, Src, Aux, Dest)
//This procedure gives a recursive solution to the Towers of Hanoi problem for N
disks
1) If N=1 then :
Write : move disk 1 from Src to Dest and Return
[ End of If structure ]
2) [ Move N-1 disks from peg Src to peg Aux ]
Call TOWER( N-1, Src, Dest, Aux)
3) Write : move disk N from Src to Dest
4) [ Move N-1 disks from peg Aux to peg Dest ]
Call TOWER( N-1, Aux,Src, Dest)
5) Return
Recursive solution - Traces for n=3
Tower(3,A,B,C)
Tower(2,A,C,B)
Tower(1,A,B,C)
Move disk 1 from A to C
Move disk 2 from A to B
Tower(1,C,A,B)
Move disk 1 from C to B
Move disk 3 from A to C
Tower(2,B,A,C)
Tower(1,B,C,A)
Move disk 1 from B to A
Move disk 2 from B to C
Tower(1,A,B,C)
Move disk 1 from A to C
Permutation of list
• Problem is to list out all possible permutations
• For example if list is {a,b,c} ,
• Permutation is {(a, b,c),(a, c,b),(b, a,c),(b,c,a),(c,a,b),(c,b,a)}
The algorithm for permutation of (a,b,c) is
i) a followed by permutation of(b,c)
ii) b followed by permutation of (a,c)
iii) c followed by permutation of(a,b)
• Algorithm: Perm(list, k, n)
// list[1:n] is an array with n objects; initially k = 1
if (k = n) then write (list[1:n])
else
    for i := k to n do
    {
        swap(list[k], list[i]);
        Perm(list, k + 1, n);
        swap(list[k], list[i]);
    }
Tracing for list={a,b,c}, n=3, k=1
Analysis Framework
• The framework for analyzing the efficiency of algorithms covers time
efficiency and space efficiency
• Algorithm efficiency:
• Time efficiency: indicates how fast
the algorithm in question runs.
• Space efficiency: deals with the
extra space the algorithm requires.
• Both efficiencies depend on the input size.
Framework
• Measuring an input’s size
Almost all algorithms run longer on larger inputs. Therefore, we
investigate an algorithm’s efficiency as a function of some
parameter n indicating the algorithm’s input size.
• Efficiency → a function of input size
• Units for measuring running time
The drawbacks to measuring time in seconds are
1)The dependency on the speed of a particular computer
2)Dependency on the quality of a program implementing the
algorithm and of the compiler used in generating the machine
code,
• Since we are after a measure of an algorithm’s efficiency, we
would like to have a metric that does not depend on these
factors.
Solution to this problem
1)Count the number of times each of the algorithm’s operations
is executed.
2)Identify the most important operation of the algorithm, called
the basic operation, the operation contributing the most to
the total running time
• The basic operation of an algorithm is usually the most time-
consuming operation in the algorithm's innermost loop.

• Algorithm efficiency → number of times the basic
operation is executed
• Let cop be the execution time of an algorithm’s basic operation
on a particular computer
• Let C(n) be the number of times this operation needs to be
executed for this algorithm.
• Then we can estimate the running time
T(n) ≈ cop · C(n).
If C(n) = n, then doubling the input size doubles the time taken, whereas
if C(n) = (1/2)n(n − 1), how much longer will the algorithm run if
we double its input size? The answer is about four times longer:
• C(n) = (1/2)n(n − 1) = (1/2)n² − (1/2)n ≈ (1/2)n²
Consider T(2n)/T(n) ≈ (cop C(2n))/(cop C(n)) ≈ ((1/2)(2n)²)/((1/2)n²) = 4.
• Note that we were able to answer the last question without
actually knowing the value of cop: it was neatly cancelled out in
the ratio.
• That is running time depends on order of growth
• Orders of growth
• The framework ignores multiplicative constants
• and concentrates on the order of growth of the basic
operation count for large input sizes
Worst-case, Best-case, Average-case Efficiency
• There are many algorithms for which running time depends not only
on an input size but also on the specifics of a particular input.
• An example, sequential search.
• searches for a given search key(item) in a list of n elements by
checking successive elements of the list until either a match with the
search key is found or the list is exhausted.
• Consider list is implemented as an array.
• ALGORITHM: SequentialSearch(A[0..n − 1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n − 1] and a search key K
//Output: The index of the first element in A that matches K, or −1 if
//there are no matching elements
i ← 0
while i < n and A[i] ≠ K do
    i ← i + 1
if i < n return i
else return −1
• Clearly, the running time of this algorithm can be quite
different for the same list size n.
• The worst-case efficiency of an algorithm is its efficiency for
the worst-case input of size n, which is an input (or inputs) of
size n for which the algorithm runs the longest among all
possible inputs of that size.
• For the worst-case input, the algorithm makes the
largest number of key comparisons among all possible inputs
of size n: Cworst(n) = n for this algorithm.
• The worst case occurs
• when there are no matching elements or the first matching
element happens to be the last one on the list.
• The best-case efficiency of an algorithm is its efficiency for
the best-case input of size n, for which the algorithm runs the
fastest among all possible inputs of that size.
• Cbest(n) = 1 for this algorithm.
• The algorithm’s behavior on a “typical” or “random” input
leads to the average-case efficiency.

• To analyze the algorithm’s average-case efficiency, some
assumptions about possible inputs of size n are made:
• The probability of a successful search is equal to p (0 ≤ p ≤ 1)
• The probability of the first match occurring in the ith position of
the list is the same for every i.
• Now we can find the average number of key comparisons Cavg(n)
as follows.
• In the case of a successful search, the probability of the first
match occurring in the ith position of the list is p/n for every i,
and the number of comparisons made by the algorithm in such a
situation is obviously i. In the case of an unsuccessful search, the
number of comparisons will be n with the probability of such a
search being (1− p).
• Therefore,
Cavg(n) = [1 · p/n + 2 · p/n + … + i · p/n + … + n · p/n] + n · (1 − p)
        = (p/n)[1 + 2 + … + i + … + n] + n(1 − p)
        = (p/n) · n(n + 1)/2 + n(1 − p)
        = p(n + 1)/2 + n(1 − p)
• That is, if p = 1(the search must be successful), the average number of
key comparisons made by sequential search is (n + 1)/2
• If p = 0 (the search must be unsuccessful), the average number of key
comparisons will be n because the algorithm will inspect all n
elements on all such inputs.
• In some situations a single operation can be expensive, but the total
time for an entire sequence of n such operations is always
significantly better than the worst-case efficiency of that single
operation multiplied by n. Such type of efficiency is called amortized
efficiency
Conclusion
• Efficiency of an algorithm for the worst-case input of size
‘n’ → the algorithm runs for the longest period of time

• Efficiency of an algorithm for the best-case input of size ‘n’
→ the algorithm runs for the shortest period of time

• Efficiency for a random input is the average-case
efficiency
Performance Analysis
• Algorithms can be judged upon these criteria:
• Does it do the work we expect of it?
• Does it work according to the original specification of
the task?
• Is there any documentation regarding how to use it and
how it works?
• Are procedures created in such a way that they
perform logical sub-functions?
• Is the code readable?
These are important when we write software.
Other Criteria…
• Space complexity → amount of memory the
algorithm needs to run to completion

• Time complexity → amount of time it needs
to run to completion

These criteria have a direct relationship with
algorithm performance
Phases of performance evaluation
• A priori estimates
→ Performance analysis

• A posteriori testing
→ Performance measurement
Space Complexity
S(P)=C+SP(I)
• Fixed Space Requirements (C)
Independent of the characteristics of the inputs and
outputs
– instruction space
– space for simple variables, fixed-size structured variables,
constants
• Variable Space Requirements (SP(I))
Depend on the instance characteristic I
– number, size, values of inputs and outputs associated with I
– recursive stack space  formal parameters, local variables,
return address
• For a simple non-recursive algorithm, Sp(instance characteristics) = 0
• For a recursive algorithm of depth n + 1, where each stack frame
holds the actual and formal parameters and the return address
(3 words per call),
recursion stack space >= 3(n + 1)
Time complexity
Time complexity of an algorithm is the amount of time it needs to
run to completion
T(P)=C+TP(I)
C → compile time
– independent of instance characteristics
– once compiled, can be executed any number of
times
TP → run (execution) time
– depends on instance characteristics
– we can actually get only an estimate of the run time
Different methods for run time tp(n)
Method I
• Find the number of additions, subtractions, multiplications,
divisions, compares, loads, stores, etc.:
tp(n) = Ca·ADD(n) + Cs·SUB(n) + Cm·MUL(n) + Cd·DIV(n) + …
• n → instance characteristic
• Ca, Cs, Cm, Cd → time required for one add, subtract, multiply, divide, …
• ADD(n), SUB(n), MUL(n), DIV(n) → functions that give the
number of adds, subtracts, multiplies, divides performed

• But the times needed (Ca, Cs, Cm, …) depend on the numbers being used
Method II
• The program is typed, compiled and run on a particular machine
• Then the execution time is physically clocked; thus tp(n) is
obtained
• But in a multiuser system the time depends on
    • system load
    • the number of other programs running
    • the characteristics of those other programs

Therefore, for analyzing an algorithm for time efficiency we
go by 1) the step count method
2) the operation count method
Step Count Method
• Counting the total number of program steps

• A program step is a syntactically or semantically meaningful
program segment whose execution time is independent of
the instance characteristics.

• To count the number of program steps:
    • Use a count variable in the program that is
incremented for each program step; at the end of the program,
the count variable will hold the total number of program steps.
    • Or build a table in which we list the total number of steps
contributed by each statement.
Count variable examples:

Total Count =2n+3


Step count using tables:
• s/e → number of steps per execution of the statement
• Frequency → total number of times each statement is executed
An example of the step count method for a recursive algorithm

Recursive formula for the step count

• Recursive formulas of this kind are called recurrence relations,
solved here by repeated substitution:

• The step count implies that if n is doubled, then the run time
is also doubled.
• So the run time grows linearly in n
Tabular method for recursive algorithms
Matrix Addition
With counting Statements…
Simplified matrix addition with count statements…
Matrix addition- step table
Fibonacci Numbers
Fibonacci sequence
0 1 1 2 3 5 8 13 21 ……….

• The nth value in the series is given by the formula

fn = f(n−1) + f(n−2)

i.e., each new term is obtained as the sum of the previous 2 terms

Number of steps when
• n = 0/1 → 2 steps
• n > 1 → 4n + 1 steps
Number of steps when
• n = 0/1 → 2 steps
• n > 1 → 2n + 3 steps
Asymptotic Notations and Basic Efficiency Classes (O, Ω, Θ)
• The order of growth of the basic operation count is the principal
indicator of an algorithm's efficiency
• Scientists use 3 notations to compare & rank orders of growth:
• O (big oh)
• Ω (big omega)
• Θ (big theta)
These notations describe the behaviour of time and space
efficiencies for large instance characteristics
• Consider 2 programs with complexities
c1n² + c2n and c3n respectively.
• The program with complexity c3n will be faster than the one with
complexity c1n² + c2n for sufficiently large values of n.
• For small values of n either program could be faster, depending
on c1, c2 and c3.
That is, if c1 = 1, c2 = 2 and c3 = 10 then c1n² + c2n >= c3n for n >= 8;
if c1 = 1, c2 = 2 and c3 = 100 then c1n² + c2n >= c3n for n >= 98.
That is, whatever the values of c1, c2, c3, there is a critical value of
n beyond which the program with complexity c3n will be faster
than the program with complexity c1n² + c2n. This value of n is
called the break-even point.
If the break-even point is 0, then the program with complexity c3n is
always faster.
• Here 2 functions are used:
– t(n) & g(n) → any nonnegative functions defined
on the set of natural numbers

• t(n) → the algorithm’s running time (usually indicated by
its basic operation count C(n))

• g(n) → some simple function to compare the count
with
O notation
• A function t(n) is said to be in O(g(n)), denoted t(n)
∈ O(g(n)), if there exist some positive constant c
and some nonnegative integer n0 such that
t(n) ≤ c·g(n) for all n ≥ n0
i.e., the function t(n) is at most c times the function g(n) for all
n >= n0.
Then g(n) is an upper bound on the value of t(n), or t(n) is bounded
above by some constant multiple of g(n) for all large n.
Informally, O(g(n)) → the set of all functions with a smaller or same
order of growth as g(n)
i.e., 2n + 3 ∈ O(n) is true,
2n² + 3 ∈ O(n²) and 2n² + 3 ∈ O(n³) are true, whereas
2n² + 3 ∈ O(n) is false
Pictorial representation
Prove that 100n + 5 ∈ O(n)
• t(n) = 100n + 5 can be written as
• 100n + 5 ≤ 100n + n for all n ≥ 5
           ≤ 101n for all n ≥ 5
This is of the form
t(n) ≤ c·g(n) for n ≥ n0, so t(n) ∈ O(g(n)),
where t(n) = 100n + 5, c = 101, g(n) = n and n0 = 5.
Therefore, from the definition of O,
t(n) = 100n + 5 ∈ O(n)
• t(n) = 10n² + 4n + 2
       <= 10n² + 4n + n for all n >= 2
       <= 10n² + 5n
       <= 10n² + n² (since n² >= 5n for n >= 5)
       <= 11n² for n >= 5
This is of the form
t(n) ≤ c·g(n) for n ≥ n0, so t(n) ∈ O(g(n)),
where t(n) = 10n² + 4n + 2, c = 11, g(n) = n² and n0 = 5.
Therefore, from the definition of O,
t(n) = 10n² + 4n + 2 ∈ O(n²)
Indicate the class O(g(n)) to which the following function belongs:

• t(n) = 6·2ⁿ + n²
We can write 6·2ⁿ + n² ≤ 6·2ⁿ + 2ⁿ for all n ≥ 4
                       ≤ 7·2ⁿ for all n ≥ 4
because 2ⁿ ≥ n² for all n ≥ 4.

Here c = 7 and g(n) = 2ⁿ.
Hence t(n) = 6·2ⁿ + n² ∈ O(2ⁿ).

Let t(n) = 5
5 ≤ 5·1
Here c = 5 and g(n) = 1
t(n) = 5 ∈ O(1)
Ω Notation
• A function t(n) is said to be in Ω(g(n)), denoted t(n) ∈ Ω(g(n)), if
there exist some positive constant c and some nonnegative
integer n0 such that
t(n) ≥ c·g(n) for all n ≥ n0

i.e., the function t(n) is at least c times the function g(n) for all n >= n0.
Then g(n) is a lower bound on the value of t(n), or t(n) is bounded below by
some constant multiple of g(n) for all large n.
Informally, Ω(g(n)) → the set of all functions with a larger or same order
of growth as g(n)
i.e., 2n + 3 ∈ Ω(n) is true,
2n² + 3 ∈ Ω(n²) and 2n² + 3 ∈ Ω(n) are true, whereas
2n² + 3 ∈ Ω(n³) is false
Pictorial representation
Prove that 100n + 5 ∈ Ω(n)
• t(n) = 100n + 5 can be written as
• 100n + 5 ≥ 100n for all n ≥ 0
This is of the form
t(n) ≥ c·g(n) for n ≥ n0, so t(n) ∈ Ω(g(n)),
where t(n) = 100n + 5, c = 100, g(n) = n and n0 = 0.
Therefore, from the definition of Ω,
t(n) = 100n + 5 ∈ Ω(n)
Similarly, t(n) = 100n − 5 ≥ 100n − n = 99n for all n ≥ 5, so
t(n) = 100n − 5 ∈ Ω(n)
Ө Notation
• A function t(n) is said to be in Ө(g(n)), denoted
t(n) ∈ Ө(g(n)), if there exist some positive constants c1
and c2 and some nonnegative integer n0 such that
c2·g(n) ≤ t(n) ≤ c1·g(n) for all n ≥ n0.
Informally,
Ө(g(n)) → the set of all functions that have the same order of
growth as g(n)
t(n) is bounded both above and below by some positive
constant multiples of g(n) for all large n
i.e., 2n + 3 ∈ Ө(n) is true,
2n² + 3 ∈ Ө(n²) is true, whereas
2n² + 3 ∈ Ө(n³) and 2n² + 3 ∈ Ө(n) are false
Pictorial representation
Prove that (1/2)n(n−1) ∈ Ө(n²)
• We are asked to prove c2·n² ≤ (1/2)n(n−1) ≤ c1·n².
First let us prove the right inequality:
(1/2)n(n−1) = (1/2)n² − (1/2)n ≤ (1/2)n² for all n ≥ 0
Then let us prove the left inequality:
(1/2)n(n−1) = (1/2)n² − (1/2)n ≥ (1/2)n² − (1/2)n·(1/2)n for all n ≥ 2
            = (1/4)n²
Hence we can select c2 = 1/4, c1 = 1/2, n0 = 2, g(n) = n²:

(1/4)n² ≤ (1/2)n(n−1) ≤ (1/2)n² for all n ≥ 2

Hence, by the definition of big Ө,
(1/2)n(n−1) ∈ Ө(n²)
Theorem: An algorithm's overall efficiency is determined by the part
with the larger order of growth:
if t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then
t1(n) + t2(n) ∈ O(max{g1(n), g2(n)})

PROOF: Consider four arbitrary real numbers a1, b1, a2, b2:
if a1 ≤ b1 and a2 ≤ b2, then a1 + a2 ≤ 2·max{b1, b2}.
This simple fact extends to orders of growth.
Since t1(n) ∈ O(g1(n)), there exist some constant c1 and some nonnegative
integer n1 such that
t1(n) ≤ c1·g1(n) for all n ≥ n1.
Similarly, since t2(n) ∈ O(g2(n)),
t2(n) ≤ c2·g2(n) for all n ≥ n2.
Let us denote c3 = max{c1, c2} and consider n ≥ max{n1, n2} so that we can use both
inequalities.
Now we can write t1(n) + t2(n) ≤ c1·g1(n) + c2·g2(n)
                               ≤ c3·g1(n) + c3·g2(n) = c3[g1(n) + g2(n)]
                               ≤ c3·2·max{g1(n), g2(n)}.
Hence t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}),
with c = 2·c3 and n0 = max{n1, n2}.
Using Limits for Comparing Orders of
Growth
• Limits can be used for comparing the orders of
growth of two specific functions.
• 3 principal cases:

lim n→∞ t(n)/g(n) = 0     → t(n) has a smaller order of growth than g(n)
lim n→∞ t(n)/g(n) = c > 0 → t(n) has the same order of growth as g(n)
lim n→∞ t(n)/g(n) = ∞     → t(n) has a larger order of growth than g(n)

• The first 2 cases → t(n) ∈ O(g(n))
• The last 2 cases → t(n) ∈ Ω(g(n))
• The second case → t(n) ∈ Ө(g(n))
Compare the order of growth of 100n + 5 with n²

• Let n = 1/y; then y = 1/n, and as n → ∞, y → 1/∞ = 0.

• Now
lim n→∞ (100n + 5)/n² = lim y→0 (100·(1/y) + 5)/(1/y²) = lim y→0 (100y + 5y²) = 0

• Therefore 100n + 5 ∈ O(n²)
Little-oh Notation
• A function t(n) = o(g(n)) iff t(n) ∈ O(g(n)) and t(n) ∉ Ω(g(n))

• Example: if t(n) = 100n + 5, then t(n) ∈ o(n) is false because t(n)
∈ O(n) and t(n) ∈ Ω(n) are both true,
• whereas t(n) = 100n + 5 satisfies t(n) ∈ O(n²) with t(n) ∈ Ω(n²)
false; hence t(n) ∈ o(n²) is true.
• The difference between big oh and little oh is that
• t(n) ∈ O(g(n)) means t(n) ≤ c·g(n) holds for some constant c > 0,
• t(n) = o(g(n)) means t(n) ≤ c·g(n) holds for all constants c > 0.
• Hence if
lim n→∞ t(n)/g(n) = 0
then t(n) = o(g(n))

Mathematical Analysis of Non-recursive Algorithms
General Plan
• Decide on a parameter (or parameters)
indicating an input’s size.

• Identify the algorithm’s basic operation. (As a
rule, it is located in the innermost loop.)

• Check whether the number of times the basic
operation is executed depends only on the size
of an input.
• If it also depends on some additional property,
the worst-case, average-case, and, if necessary,
best-case efficiencies have to be investigated
separately.

• Set up a sum expressing the number of times
the algorithm’s basic operation is executed.

• Using standard formulas and rules of sum
manipulation, either find a closed-form formula
for the count or, at the very least, establish its
order of growth.
Two Basic Rules Of Sum Manipulation
Σ c·ai = c · Σ ai
Σ (ai ± bi) = Σ ai ± Σ bi
Two Summation Formulas
Σ (i = l to u) 1 = u − l + 1
Σ (i = 1 to n) i = 1 + 2 + … + n = n(n + 1)/2 ≈ (1/2)n² ∈ Ө(n²)
Algorithm to find the value of the largest element in a list
• Input size → number of elements in the array, i.e., n
• Basic operation → comparison
• The number of comparisons will be the same for all arrays of size n; therefore there is no need to distinguish among the worst, average, and best cases here.
• Let C(n) denote the number of times this comparison is executed:
C(n) = Σ(i=1..n−1) 1 = n − 1 ∈ Θ(n)
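The scan can be sketched in Python; the explicit counter (added here only for illustration) confirms that the comparison runs exactly n − 1 times regardless of the input’s contents:

```python
# Max-element scan: the comparison in the loop body is the basic
# operation, executed exactly n - 1 times for any array of size n.
def max_element(a):
    comparisons = 0
    max_val = a[0]
    for i in range(1, len(a)):      # i = 1 .. n-1
        comparisons += 1            # basic operation: a[i] > max_val
        if a[i] > max_val:
            max_val = a[i]
    return max_val, comparisons

val, c = max_element([3, 9, 2, 7, 5])
assert val == 9
assert c == 4                       # C(n) = n - 1 for n = 5
```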
Element Uniqueness Problem
• Input size → number of elements in the array, i.e., n
• Basic operation → comparison
• The number of element comparisons depends not only on n but also on whether there are equal elements in the array and, if there are, which array positions they occupy; hence the worst case must be analysed separately.
• In the worst case there are no equal elements (or the only equal pair is the last one compared), and
Cworst(n) = Σ(i=0..n−2) Σ(j=i+1..n−1) 1 = n(n − 1)/2 ∈ O(n^2)
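A sketch of the check with a comparison counter (the counter is only for illustration); with all elements distinct, the inner comparison runs the full n(n − 1)/2 times:

```python
# Element uniqueness: compare every pair; return as soon as a
# duplicate is found. Worst case performs n(n-1)/2 comparisons.
def unique_elements(a):
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1        # basic operation: a[i] == a[j]
            if a[i] == a[j]:
                return False, comparisons
    return True, comparisons

ok, c = unique_elements([1, 2, 3, 4])
assert ok
assert c == 6                       # n(n-1)/2 for n = 4
```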
Matrix Multiplication
• Input size → matrix order n
• Two arithmetic operations, addition and multiplication, are in the inner loop, both executed exactly once on each iteration. So by counting one of them, we automatically count the other. Hence let us consider
• Basic operation → multiplication
• The multiplication count depends only on the size of the input matrices.
The total number of multiplications M(n) is expressed by the following triple sum:
M(n) = Σ(i=0..n−1) Σ(j=0..n−1) Σ(k=0..n−1) 1 = n^3
• To estimate the running time of the algorithm on a particular machine, we can use the product
T(n) ≈ cm M(n) = cm n^3
where cm is the time of one multiplication on the machine in question.
• If we take into account the time spent on the additions, too:
T(n) ≈ cm M(n) + ca A(n) = cm n^3 + ca n^3 = (cm + ca) n^3
where ca is the time of one addition.
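The triple loop can be sketched directly; the counter (for illustration only) confirms that the multiplication in the innermost loop executes n^3 times:

```python
# Straightforward n x n matrix multiplication; the multiplication in
# the innermost loop is the basic operation, executed n^3 times.
def matrix_multiply(A, B):
    n = len(A)
    mults = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                mults += 1          # basic operation
                C[i][j] += A[i][k] * B[k][j]
    return C, mults

I = [[1, 0], [0, 1]]
M = [[2, 3], [4, 5]]
C, m = matrix_multiply(I, M)
assert C == M                       # identity leaves M unchanged
assert m == 8                       # n^3 for n = 2
```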
Algorithm in which the loop variable changes in a different manner
Example: the number of binary digits in the binary representation of a positive decimal integer
Binary value of decimal 18 = 10010; hence count = 5

Decimal Number    Count
18/2              1
9/2               2
4/2               3
2/2               4
1/2               5
0                 stop
Analysis
• Basic operation → the comparison n > 1
• Since the value of n is about halved on each repetition of the loop, the answer should be about log2 n.
• The exact count of the comparison n > 1:

Iteration    1   2     3       4       ...   k           failed at k + 1
Value of n   n   n/2   n/2^2   n/2^3   ...   n/2^(k−1)   n/2^k

That is, the comparison fails when n/2^k ≤ 1.
Solving n/2^k ≤ 1 to find k:
n ≤ 2^k
Applying log on both sides:
log2 n ≤ k log2 2
Hence k ≥ log2 n, and the count is in Θ(log2 n)
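The halving loop above can be sketched as follows; the loop runs about log2 n times, giving floor(log2 n) + 1 binary digits:

```python
# Count the binary digits of n by repeated halving; the comparison
# n > 1 is the basic operation, so the count is floor(log2 n) + 1.
import math

def bit_count(n):
    count = 1
    while n > 1:                    # basic operation: comparison n > 1
        count += 1
        n = n // 2
    return count

assert bit_count(18) == 5           # 18 = 10010 in binary: 5 digits
assert bit_count(1) == 1
assert bit_count(16) == math.floor(math.log2(16)) + 1
```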
Mathematical Analysis of Recursive Algorithms
General Plan
• Decide on a parameter (or parameters) indicating an
input’s size.
• Identify the algorithm’s basic operation.
• Check whether the number of times the basic operation
is executed can vary on different inputs of the same
size; if it can, the worst-case, average-case, and best-
case efficiencies must be investigated separately.
• Set up a recurrence relation, with an appropriate initial
condition, for the number of times the basic operation
is executed.
• Solve the recurrence or, at least, ascertain the order of
growth of its solution.
Factorial Function
• Input size → value of n
• Basic operation → multiplication
• The number of multiplications M(n) needed satisfies
M(n) = M(n − 1) + 1 for n > 0,
M(0) = 0
Recurrence Relation
• Solving the recurrence by the method of backward substitutions:
M(n) = M(n − 1) + 1
     = [M(n − 2) + 1] + 1 = M(n − 2) + 2
     = [M(n − 3) + 1] + 2 = M(n − 3) + 3
• A general formula for the pattern:
M(n) = M(n − i) + i
• Since the initial condition is specified for n = 0, we have to substitute i = n in the pattern’s formula to get the ultimate result of our backward substitutions:
M(n) = M(n − 1) + 1 = ... = M(n − i) + i = ... = M(n − n) + n = n.
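A recursive sketch with a multiplication counter (the counter is added here purely for illustration) confirms M(n) = n:

```python
# Recursive factorial; each call for n > 0 performs one
# multiplication, so the total count is M(n) = n.
def factorial(n, counter=None):
    if counter is None:
        counter = [0]
    if n == 0:                      # initial condition: M(0) = 0
        return 1, counter[0]
    counter[0] += 1                 # basic operation: multiplication
    sub, _ = factorial(n - 1, counter)
    return n * sub, counter[0]

val, mults = factorial(5)
assert val == 120
assert mults == 5                   # M(n) = n
```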
Tower of Hanoi
• Input size → number of disks, n
• Basic operation → moving one disk
• The number of moves M(n) needed satisfies the recurrence relation
M(n) = M(n − 1) + 1 + M(n − 1) for n > 1
• With the initial condition M(1) = 1:
M(n) = 2M(n − 1) + 1 for n > 1,
M(1) = 1
Solve the recurrence by the method of backward substitutions:
M(n) = 2M(n − 1) + 1
     = 2[2M(n − 2) + 1] + 1 = 2^2 M(n − 2) + 2 + 1
     = 2^2[2M(n − 3) + 1] + 2 + 1 = 2^3 M(n − 3) + 2^2 + 2 + 1
• The pattern: M(n) = 2^i M(n − i) + 2^(i−1) + ... + 2 + 1 = 2^i M(n − i) + 2^i − 1
• Since the initial condition is specified for n = 1, we substitute i = n − 1 in the pattern’s formula to get the ultimate result of our backward substitutions:
M(n) = 2^(n−1) M(1) + 2^(n−1) − 1 = 2^(n−1) + 2^(n−1) − 1 = 2^n − 1
• Thus, we have an exponential algorithm, which will run for an unimaginably long time even for moderate values of n.
• When a recursive algorithm makes more than a single call to itself, it can be useful for analysis purposes to construct a tree of its recursive calls. In this tree, nodes correspond to recursive calls.
• By counting the number of nodes in the tree, we can get the total number of calls made by the Tower of Hanoi algorithm:
C(n) = Σ(l=0..n−1) 2^l = 2^n − 1
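The recursive solution can be sketched as follows; collecting the moves confirms M(n) = 2^n − 1:

```python
# Tower of Hanoi: move n disks from source to target via aux.
# Moving one disk is the basic operation, done M(n) = 2^n - 1 times.
def hanoi(n, source, target, aux, moves):
    if n == 1:
        moves.append((source, target))   # base case: move one disk
        return
    hanoi(n - 1, source, aux, target, moves)
    moves.append((source, target))       # move the largest disk
    hanoi(n - 1, aux, target, source, moves)

moves = []
hanoi(4, 'A', 'C', 'B', moves)
assert len(moves) == 2 ** 4 - 1          # M(4) = 15
```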
The number of binary digits in n’s binary representation (recursive algorithm BinRec)
• Basic operation → addition
• The number of additions made in computing BinRec(⌊n/2⌋) is A(⌊n/2⌋), plus one more addition is made by the algorithm to increase the returned value by 1. This leads to the recurrence
A(n) = A(⌊n/2⌋) + 1 for n > 1
• Since the recursive calls end when n is equal to 1 and there are no additions made then, the initial condition is
A(1) = 0
• The presence of ⌊n/2⌋ in the function’s argument makes the method of backward substitutions stumble on values of n that are not powers of 2. Therefore, the standard approach to solving such a recurrence is to solve it only for n = 2^k.
• For n = 2^k the recurrence takes the form
A(2^k) = A(2^(k−1)) + 1 for k > 0, A(2^0) = 0
Backward substitutions give A(2^k) = k, i.e., A(n) = log2 n ∈ Θ(log n).
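The recursive algorithm can be sketched as follows; each call for n > 1 performs one addition, so the digit count equals floor(log2 n) + 1:

```python
# Recursive bit count (BinRec): the +1 is the basic operation,
# performed A(n) = floor(log2 n) times.
import math

def bin_rec(n):
    if n == 1:                      # initial condition: A(1) = 0
        return 1
    return bin_rec(n // 2) + 1      # basic operation: the addition

assert bin_rec(18) == 5             # 18 = 10010: five binary digits
assert bin_rec(16) == math.floor(math.log2(16)) + 1
```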
Important Problem Types
The most important problem types:
• Sorting
• Searching
• String processing
• Graph problems
• Combinatorial problems
• Geometric problems
• Numerical problems
Sorting
• Rearrange the items of a given list in non-decreasing order.
• We usually need to sort lists of numbers, characters from an alphabet, or character strings.
• Examples: rearranging student and library records in a school, or employee data in companies.
• In the case of records, we need to choose a piece of information, called a key, to guide the sorting.
  Ex: student names or grade point
• Importance of sorting: ranking, searching, and use within other algorithms.
2 properties:
• Stable: a sort is stable if it preserves the relative order of any two equal elements in its input.
• In-place: a sort is in-place if it does not require extra memory, except, possibly, for a few memory units.
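Stability can be demonstrated with hypothetical student records (the names and grades below are made up for illustration); Python’s built-in sort is stable, so records with equal keys keep their original relative order:

```python
# sorted() in Python is a stable sort: "bob" and "cal" share the
# key 85, so bob still precedes cal after sorting by grade.
records = [("bob", 85), ("amy", 90), ("cal", 85)]
by_grade = sorted(records, key=lambda r: r[1])
assert by_grade == [("bob", 85), ("cal", 85), ("amy", 90)]
```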
Searching
• Finding a given value, called a search key, in a given set or multiset.
• There are plenty of searching algorithms, but no single algorithm fits all situations best.
• Examples:
  – Linear search
  – Binary search
Application: dictionary lookup
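One of the examples named above, binary search over a sorted list, can be sketched as:

```python
# Binary search: repeatedly halve the search interval of a sorted
# list; returns the index of the key, or -1 if it is absent.
def binary_search(a, key):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid
        elif a[mid] < key:
            lo = mid + 1            # key must lie in the right half
        else:
            hi = mid - 1            # key must lie in the left half
    return -1

assert binary_search([2, 5, 8, 12, 16], 12) == 3
assert binary_search([2, 5, 8, 12, 16], 7) == -1
```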
String Processing
• A string is a sequence of characters from an alphabet.
• Strings of particular interest are text strings, which comprise letters, numbers, and special characters, and bit strings, which comprise zeros and ones.
• One particular problem, that of searching for a given word in a text, has attracted special attention from researchers. They call it string matching.
Graph Problems
• A graph can be thought of as a collection of points
called vertices, some of which are connected by line
segments called edges.
• Graphs can be used for modelling a wide variety of
applications, including transportation, communication,
social and economic networks, project scheduling,
games, modern applications such as circuit board and
VLSI chip fabrication, X-ray crystallography, and genetic
engineering.
• Algorithms: finding the shortest path between two vertices, finding a path that reaches all nodes, etc.
Combinatorial Problems
• These are problems that ask, explicitly or implicitly, to find a combinatorial object, such as a permutation, a combination, or a subset, that satisfies certain constraints.
• Ex:
  – The travelling salesman problem
  – The graph colouring problem
• They are difficult to solve:
  – The number of combinatorial objects typically grows extremely fast with a problem’s size, reaching unimaginable magnitudes even for moderate-sized instances.
  – There are no known algorithms for solving most such problems exactly in an acceptable amount of time.
Geometric Problems
• Geometric algorithms deal with geometric objects such as
points, lines, and polygons.
• Different application area includes computer graphics,
robotics, and tomography.
• Two classic problems of computational geometry:
• The closest-pair problem: given n points in the plane, find the
closest pair among them.
• The convex-hull problem asks to find the smallest convex
polygon that would include all the points of a given set.
Numerical Problems
• Numerical problems, another large special area of
applications, are problems that involve mathematical objects
of continuous nature: solving equations and systems of
equations, computing definite integrals, evaluating functions,
and so on.
• Difficulties:
– The majority of such mathematical problems can be solved
only approximately.
– problems typically require manipulating real numbers,
which can be represented in a computer only
approximately. Moreover, a large number of arithmetic
operations performed on approximately represented
numbers can lead to an accumulation of the round-off
error to a point where it can drastically distort an output
produced by a seemingly sound algorithm
Fundamental Data Structures
• Linear Data Structures
• Graphs
• Trees
• Sets and Dictionaries
Linear Data Structures
Array
• An array is a sequence of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value of the array’s index.
• Each and every element of an array can be accessed
in the same constant amount of time regardless of
where in the array the element in question is
located.
Linked list
• It is a sequence of zero or more elements called nodes.
• Each node contains two kinds of information: some data
and one or more links called pointers to other nodes of
the linked list.
• In a singly linked list, each node except the last one
contains a single pointer to the next element
• A special node called the header may contain
information about the linked list itself, such as its
current length; it may also contain, in addition to a
pointer to the first element, a pointer to the linked list’s
last element.
• In a doubly linked list, every node, except the first and the last, contains pointers to both its successor and its predecessor.
Stack
• A stack is a list in which insertions and deletions can be done only at one end, called the top.
• When elements are added to (pushed onto) a stack and deleted from (popped off) it, the structure operates in a “last-in–first-out” (LIFO) fashion.
Queue
• A queue is a list from which elements are deleted from one end of the structure, called the front (this operation is called dequeue), and new elements are added to the other end, called the rear (this operation is called enqueue). It operates in a “first-in–first-out” (FIFO) fashion.
• A priority queue is a collection of data items from a totally ordered universe; its principal operations are finding the largest element, deleting the largest element, and adding a new element.
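The three structures can be sketched with Python’s standard library: a list as a stack, collections.deque as a queue, and heapq as a priority queue (heapq is a min-priority queue, i.e., the smallest element comes out first):

```python
# Stack (LIFO), queue (FIFO), and min-priority queue using the stdlib.
from collections import deque
import heapq

stack = []
stack.append(1); stack.append(2)    # push
assert stack.pop() == 2             # LIFO: last in, first out

queue = deque()
queue.append(1); queue.append(2)    # enqueue at the rear
assert queue.popleft() == 1         # dequeue at the front (FIFO)

pq = []
heapq.heappush(pq, 5); heapq.heappush(pq, 2); heapq.heappush(pq, 9)
assert heapq.heappop(pq) == 2       # smallest item comes out first
```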
Graphs
• A graph G =(V,E) is defined by a pair of two sets: a finite
nonempty set V of items called vertices and a set E of pairs of
these items called edges.

• A graph G is called undirected if every edge in it is undirected.
• If a pair of vertices (u, v) is not the same as the pair (v, u), we say that the edge (u, v) is directed from the vertex u, called the edge’s tail, to the vertex v, called the edge’s head. Graphs whose every edge is directed are called directed graphs, or digraphs.
• A graph with every pair of its vertices connected by an edge is called complete. A standard notation for the complete graph with |V| vertices is K|V|.
• A graph with relatively few possible edges missing is called dense.
• A graph with few edges relative to the number of its vertices is called sparse.
Graph Representations
Adjacency Matrix
• The adjacency matrix of a graph with n vertices is an n
× n boolean matrix with one row and one column for
each of the graph’s vertices, in which the element in the
ith row and the jth column is equal to 1 if there is an
edge from the ith vertex to the jth vertex, and equal to 0
if there is no such edge.
Adjacency lists
• The adjacency lists of a graph or a digraph is a collection
of linked lists, one for each vertex, that contain all the
vertices adjacent to the list’s vertex (i.e., all the vertices
connected to it by an edge).
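The two representations above can be sketched for a small undirected graph with vertices 0..3 and edges (0,1), (0,2), (1,2), (2,3):

```python
# Build the adjacency matrix and adjacency lists of one small graph.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: matrix[i][j] = 1 iff there is an edge between i, j
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = 1
    matrix[v][u] = 1                # undirected: matrix is symmetric

# Adjacency lists: one list of neighbours per vertex
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

assert matrix[0][1] == 1 and matrix[1][0] == 1
assert matrix[0][3] == 0            # no edge between 0 and 3
assert sorted(adj[2]) == [0, 1, 3]
```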
• A weighted graph (or weighted digraph) is a graph (or digraph) with numbers assigned to its edges. These numbers are called weights or costs; the adjacency matrix of such a graph is called its weight matrix.
Path
• A path from vertex u to vertex v of a graph G can be
defined as a sequence of adjacent (connected by an
edge) vertices that starts with u and ends with v.
• If all vertices of a path are distinct, the path is said to be
simple.
• The length of a path is the total number of vertices in
the vertex sequence defining the path minus 1, which is
the same as the number of edges in the path.
• A directed path is a sequence of vertices in which every
consecutive pair of the vertices is connected by an edge
directed from the vertex listed first to the vertex listed
next.
2 properties of a graph:

Connectivity
• A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v.

Cyclicity
• A cycle is a path of positive length that starts and ends at the same vertex and does not traverse the same edge more than once.
Trees
• Tree is a connected acyclic graph
• A graph that has no cycles but is not necessarily
connected is called a forest
• The number of edges in a tree is always one less than
the number of its vertices:
|E| = |V| − 1.
Rooted Trees
• For every two vertices in a tree, there always exists exactly one simple path from one of these vertices to the other. This makes it possible to select an arbitrary vertex in a free tree and consider it the root of a so-called rooted tree.
• A rooted tree is usually depicted by placing its root on the top (level 0 of the tree), the vertices adjacent to the root below it (level 1), the vertices two edges apart from the root still below (level 2), and so on.
Properties of Trees
• For any vertex v in a tree T , all the vertices on the simple path
from the root to that vertex are called ancestors of v. The
vertex itself is usually considered its own ancestor
• The set of ancestors that excludes the vertex itself is referred
to as the set of proper ancestors
• If (u, v) is the last edge of the simple path from the root to vertex v (and u ≠ v), u is said to be the parent of v and v is called a child of u; vertices that have the same parent are said to be siblings.
• A vertex with no children is called a leaf; a vertex with at least one child is called parental.
• All the vertices for which a vertex v is an ancestor are said to be descendants of v; the proper descendants exclude the vertex v itself.
• The depth of a vertex v is the length of the simple path from the root to v.
• The height of a tree is the length of the longest simple path from the root to a leaf.
• An ordered tree is a rooted tree in which all the children
of each vertex are ordered.
• A binary tree can be defined as an ordered tree in which
every vertex has no more than two children and each
child is designated as either a left child or a right child of
its parent; a binary tree may also be empty.
Sets and Dictionaries
• A set can be described as an unordered collection (possibly
empty) of distinct items called elements of the set
• A specific set is defined:
– by an explicit listing of its elements
e.g., S = {2,3, 5, 7}
– by specifying a property that all the set’s elements and
only they must satisfy
e.g., S = {n: n is a prime number smaller than 10}
Set operation can be:
• Checking membership of a given item in a given set;
• Finding the union of two sets, which comprises all the
elements in either or both of them
• Finding the intersection of two sets, which comprises all the
common elements in the sets.
2 ways of implementing sets in computer applications:
• The first considers sets that are subsets of some large set U, called the universal set. If set U has n elements, then any subset S of U can be represented by a bit string of size n, called a bit vector, in which the ith element is 1 if and only if the ith element of U is included in set S.
• The second and more common way to represent a set for computing purposes is to use the list structure to indicate the set’s elements.
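The bit-vector idea can be sketched as follows, assuming the universal set U = {0, 1, ..., 7}; a subset is a single integer whose ith bit is 1 iff i is in the subset, and union/intersection become bitwise OR/AND:

```python
# Bit-vector representation of subsets of U = {0, ..., 7}.
def to_bit_vector(s):
    bits = 0
    for element in s:
        bits |= 1 << element        # set the bit for each member
    return bits

a = to_bit_vector({1, 3, 5})
b = to_bit_vector({3, 4, 5})

assert a & (1 << 3)                 # membership test: 3 is in A
union = a | b                       # set union is bitwise OR
intersection = a & b                # set intersection is bitwise AND
assert union == to_bit_vector({1, 3, 4, 5})
assert intersection == to_bit_vector({3, 5})
```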
Two distinctions between sets and lists:
• First, a set cannot contain identical elements; a list can. (A multiset, or bag, is an unordered collection of items that are not necessarily distinct.)
• Second, a set is an unordered collection of items; therefore, changing the order of its elements does not change the set. A list, defined as an ordered collection of items, is exactly the opposite.
• In computing, the operations performed on a set or a multiset are searching for a given item, adding a new item, and deleting an item from the collection. A data structure that implements these three operations is called a dictionary.