DAA MODULE-1 Reference Material
1 Introduction 8 Hours
Notion of an Algorithm – Fundamentals of Algorithmic Problem Solving – Important
Problem Types – Fundamentals of the Analysis of Algorithm Efficiency – Analysis
Framework – Asymptotic Notations and its properties – Mathematical analysis for
Recursive and Non-recursive algorithms.
Notion of an Algorithm
An algorithm is the best way to represent the solution of a particular problem in a very
simple and efficient way. If we have an algorithm for a specific problem, then we can
implement it in any programming language, meaning that the algorithm is independent
from any programming languages.
Algorithm Design
Problem Development Steps
Just as one would follow a standard recipe rather than an arbitrary set of written instructions to cook a dish, not every set of written instructions for programming is an algorithm. For a set of instructions to be an algorithm, it must have the following characteristics:
Properties of Algorithm:
Input − an algorithm takes zero or more inputs.
Output − it produces at least one output.
Definiteness − each instruction is clear and unambiguous.
Finiteness − it terminates after a finite number of steps.
Effectiveness − every instruction is basic enough to be carried out exactly.
Advantages of Algorithms:
It is easy to understand.
An algorithm is a step-wise representation of a solution to a given problem.
In an algorithm, the problem is broken down into smaller pieces or steps; hence it is easier for the programmer to convert it into an actual program.
Disadvantages of Algorithms:
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
II. Decision Making
→In some newer computers, operations are executed concurrently, i.e., in parallel.
Algorithms that take advantage of this capability are called parallel algorithms.
→The choice of computational resources, such as processor and memory, is mainly based on space and time efficiency.
→The next principal decision is to choose between solving the problem exactly or
solving it approximately.
→An algorithm that solves the problem exactly and produces a correct result is called an exact algorithm.
→If the problem is so complex that an exact solution cannot be obtained, then we have to choose an algorithm called an approximation algorithm, i.e., one that produces an approximate answer. E.g., extracting square roots, solving nonlinear equations, and evaluating definite integrals.
III. Methods of Specifying an Algorithm
a. Natural language
b. Pseudocode
c. Flowchart
Pseudocode and flowchart are the two options that are most widely used nowadays for
specifying algorithms.
a. Natural Language
It is very simple and easy to specify an algorithm using natural language. But a natural-language specification of an algorithm is often unclear, and we end up with an ambiguous description.
Such a specification creates difficulty during actual implementation. Hence many programmers prefer to specify an algorithm by means of pseudocode.
b) Pseudocode
Pseudocode is a mixture of a natural language and programming language
constructs.
Pseudocode is usually more precise than natural language.
For the assignment operation the left arrow "←" is used, for comments two slashes "//", and constructs such as if-conditions and for and while loops are used as in a programming language.
ALGORITHM Sum(a, b)
//Problem Description: This algorithm performs addition of two numbers
//Input: Two integers a and b
//Output: Addition of the two integers
c ← a + b
return c
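A direct Python rendering of this pseudocode (an illustrative sketch, not part of the original notes):

```python
def algorithm_sum(a, b):
    """Performs addition of two numbers, as in ALGORITHM Sum(a, b)."""
    c = a + b      # corresponds to c <- a + b in the pseudocode
    return c

print(algorithm_sum(3, 4))  # prints 7
```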
c) Flowchart
In the earlier days of computing, the dominant method for specifying algorithms was the flowchart, but this representation technique has proved to be inconvenient.
Flowchart is a graphical representation of an algorithm. It is a method of expressing
an algorithm by a collection of connected geometric shapes containing descriptions
of the algorithm’s steps.
IV. Proving an Algorithm's Correctness
Once an algorithm has been specified, its correctness must be proved.
An algorithm must yield a required result for every legitimate input in a finite amount
of time.
For example, the correctness of Euclid's algorithm for computing the greatest common divisor stems from the correctness of the equality gcd(m, n) = gcd(n, m mod n).

DAA PROF. G. SRINIVASA RAO
A common technique for proving correctness is to use mathematical induction
because an algorithm’s iterations provide a natural sequence of steps needed for
such proofs.
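As an illustration (a sketch, not from the notes), Euclid's algorithm based on the equality gcd(m, n) = gcd(n, m mod n) can be written as:

```python
def gcd(m, n):
    """Euclid's algorithm: repeatedly replace (m, n) by (n, m mod n)."""
    while n != 0:
        m, n = n, m % n
    return m

print(gcd(60, 24))  # prints 12
```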
The notion of correctness for approximation algorithms is less straightforward than
it is for exact algorithms. The error produced by the algorithm should not exceed a
predefined limit.
V. Analyzing an Algorithm
Important Problem-Types
The two motivating forces for any problem are its practical importance and some specific characteristics. The different types of problems are:
Sorting
Searching
String processing
Graph problems
Combinatorial problems
Geometric problems
Numerical problems
We use these problems to illustrate different algorithm design techniques and methods of
algorithm analysis.
Sorting:
The sorting problem is to rearrange the items of a given list in ascending order.
We usually sort lists of numbers, characters, strings, and records, such as college information about students, library information, and company information. A specially chosen piece of information guides the sorting; for example, in students' information we can sort either by register number or by name. Such a piece of information is called a key.
Sorting is most important when we search records. There are different types of sorting algorithms.
There are some algorithms that sort an arbitrary array of size n using about n log2 n comparisons; on the other hand, no algorithm that sorts by key comparisons can do better than that.
Although some algorithms are better than others, there is no algorithm that would
be the best in all situations.
Some algorithms are simple but relatively slow while others are faster but more
complex.
Some are suitable only for lists residing in the fast memory while others can be
adapted for sorting large files stored on a disk, and so on.
Some of the sorting techniques are Bubble Sort, Insertion Sort, Quick Sort, and Merge Sort.
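To illustrate the idea of a key (the record fields below are invented for the example), Python's built-in sorting can order the same records by different keys:

```python
# Hypothetical student records; "regno" and "name" are illustrative field names.
students = [
    {"regno": 103, "name": "Asha"},
    {"regno": 101, "name": "Ravi"},
    {"regno": 102, "name": "Meena"},
]

# Sort the same list by two different keys.
by_regno = sorted(students, key=lambda s: s["regno"])
by_name = sorted(students, key=lambda s: s["name"])

print([s["regno"] for s in by_regno])  # prints [101, 102, 103]
print([s["name"] for s in by_name])    # prints ['Asha', 'Meena', 'Ravi']
```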
Searching:
The searching problem deals with finding a given value, called a search key, in a
given set.
The searching can be done either by a straightforward sequential search algorithm or by the binary search algorithm, which works quite differently.
These algorithms play an important role in real-life applications because they are
used for storing and retrieving information from large databases.
Some algorithms work faster but require more memory, some are very fast but
applicable only to sorted arrays.
Searching also deals with the addition and deletion of records. In such cases, the data structures and algorithms are chosen to balance among the required set of operations.
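As a sketch of the second approach mentioned above, binary search on a sorted array (illustrative, not taken from the notes):

```python
def binary_search(a, key):
    """Search a sorted list a for key; return its index or -1.
    Assumes a is sorted in ascending order."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == key:
            return mid
        elif a[mid] < key:
            low = mid + 1
        else:
            high = mid - 1
    return -1

print(binary_search([3, 14, 27, 31, 39, 42, 55, 70], 31))  # prints 3
```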
String processing:
A string is a sequence of characters, and string handling algorithms operate on them. The most common strings are text strings, which consist of letters, numbers, and special characters; bit strings consist of zeros and ones.
The most important problem of this type is string matching, which is used for searching for a given word in a text, e.g., by the brute-force string matching algorithm.
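A minimal sketch of brute-force string matching (illustrative; it slides the pattern one position at a time):

```python
def brute_force_match(text, pattern):
    """Brute-force string matching: return the index of the first
    occurrence of pattern in text, or -1 if it is absent."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:          # the whole pattern matched at shift i
            return i
    return -1

print(brute_force_match("NOBODY_NOTICED_HIM", "NOT"))  # prints 7
```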
Graph problems:
One of the interesting areas in algorithmics is graph algorithms.
A graph is a collection of points called vertices which are connected by line
segments called edges.
Graphs are used for modelling a wide variety of real-life applications such as
transportation and communication networks.
Graph algorithms include graph traversal, shortest-path, and topological sorting algorithms. Some graph problems are very hard: only very small instances can be solved in a realistic amount of time even with the fastest computers.
Two well-known hard problems are the traveling salesman problem and the graph-colouring problem. The traveling salesman problem asks for the shortest tour through n cities that visits every city exactly once.
The graph-colouring problem is to assign the smallest number of colours to
vertices of a graph so that no two adjacent vertices are of the same colour.
It arises in the event-scheduling problem, where events are represented by vertices that are connected by an edge if the corresponding events cannot be scheduled at the same time; a solution to this graph problem gives an optimal schedule.
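As a small illustration of the vertices-and-edges model and of graph traversal (the graph below is invented for the example), breadth-first search over an adjacency list:

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal; returns vertices in visit order."""
    visited, order = {start}, [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                order.append(v)
                queue.append(v)
    return order

# A small undirected graph as an adjacency list (illustrative).
g = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(bfs(g, "a"))  # prints ['a', 'b', 'c', 'd']
```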
Combinatorial problems:
The traveling salesman problem and the graph-colouring problem are
examples of combinatorial problems.
These are problems that ask us to find a combinatorial object, such as a permutation, combination, or subset, that satisfies certain constraints and has some desired property (e.g., maximizes a value or minimizes a cost).
These problems are difficult to solve for the following reasons.
First, the number of combinatorial objects grows extremely fast with a problem's size.
Second, there are no known algorithms that solve these problems exactly in an acceptable amount of time.
Geometric problems:
Geometric algorithms deal with geometric objects such as points, lines and
polygons.
It also includes various geometric shapes such as triangles, circles, etc. The applications of these algorithms are in computer graphics, robotics, etc.
The two most widely studied problems are the closest-pair problem and the convex-hull problem. The closest-pair problem asks, given n points in the plane, to find the closest pair among them.
The convex-hull problem is to find the smallest convex polygon that would include
all the points of a given set.
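A brute-force sketch of the closest-pair problem (illustrative; it simply compares every pair of points, an O(n²) approach):

```python
from math import dist, inf

def closest_pair(points):
    """Brute-force closest pair: compare every pair of points."""
    best, pair = inf, None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])  # Euclidean distance
            if d < best:
                best, pair = d, (points[i], points[j])
    return best, pair

d, p = closest_pair([(0, 0), (5, 5), (1, 0), (9, 2)])
print(d)  # prints 1.0
```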
Numerical problems:
This is another large special area of applications, where the problems involve mathematical objects of continuous nature: solving equations, computing definite integrals, evaluating functions, and so on.
These problems can typically be solved only approximately.
They require real numbers, which can be represented in a computer only approximately. This can also lead to an accumulation of round-off errors.
The algorithms designed for them are mainly used in scientific and engineering applications.
Fundamentals of the Analysis of Algorithm Efficiency
Worst-Case, Best-Case, and Average-Case Efficiencies
ALGORITHM SequentialSearch(A[0..n − 1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n − 1] and a search key K
//Output: The index of the first element in A that matches K, or −1 if there are no matching elements
i ← 0
while i < n and A[i] ≠ K do
i ← i + 1
if i < n
return i
else
return −1
Clearly, the running time of this algorithm can be quite different for inputs of the same size n.
In the worst case, there is no matching element, or the first matching element is the last one on the list. In the best case, the first element of the list matches the search key.
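The best and worst cases can be observed directly by counting key comparisons (an illustrative sketch):

```python
def sequential_search(a, key):
    """Sequential search returning (index, number of key comparisons).
    The index is -1 if the key is absent."""
    comparisons = 0
    for i, x in enumerate(a):
        comparisons += 1        # one key comparison per loop iteration
        if x == key:
            return i, comparisons
    return -1, comparisons

a = [9, 4, 7, 2, 5]
print(sequential_search(a, 9))  # best case: (0, 1), i.e. Cbest(n) = 1
print(sequential_search(a, 8))  # worst case: (-1, 5), i.e. Cworst(n) = n
```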
Worst-case efficiency
The worst-case efficiency of an algorithm is its efficiency for the worst case input
of size n.
The algorithm runs the longest among all possible inputs of that size.
For the input of size n, the running time is Cworst(n) =n.
The best-case efficiency of an algorithm is its efficiency for the best case input of
size n.
The algorithm runs the fastest among all possible inputs of that size n.
In sequential search, if we search for the first element in a list of size n (i.e., the first element equals the search key), then the running time is Cbest(n) = 1.
The Average case efficiency lies between best case and worst case.
To analyze the algorithm’s average case efficiency, we must make some
assumptions about possible inputs of size n.
The standard assumptions are that (a) the probability of a successful search is equal to p (0 ≤ p ≤ 1), and (b) the probability of the first match occurring in the ith position of the list is the same for every i. Under these assumptions, Cavg(n) = p(n + 1)/2 + n(1 − p).
The efficiency of an algorithm depends on the amount of time, storage and other
resources required to execute the algorithm. The efficiency is measured with the
help of asymptotic notations.
An algorithm may not have the same performance for different types of inputs. With
the increase in the input size, the performance will change.
The study of change in performance of the algorithm with the change in the order
of the input size is defined as asymptotic analysis.
Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the
running time of an algorithm when the input tends towards a particular value or a
limiting value.
For example: In bubble sort, when the input array is already sorted, the time taken
by the algorithm is linear i.e. the best case.
But when the input array is in reverse order, the algorithm takes the maximum time (quadratic) to sort the elements, i.e., the worst case.
When the input array is neither sorted nor in reverse order, then it takes average
time. These durations are denoted using asymptotic notations.
Big-O notation
Omega notation
Theta notation
Big-O Notation (O-notation)
Big-O notation represents the upper bound of the running time of an algorithm. Thus, it gives the worst-case complexity of an algorithm. Formally, f(n) = O(g(n)) if there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0; that is, for large n, f(n) does not grow faster than the bound provided by O(g(n)).
Since it gives the worst-case running time of an algorithm, it is widely used to analyze an algorithm, as we are always interested in the worst-case scenario.
Omega Notation (Ω-notation)
Omega notation represents the lower bound of the running time of an algorithm. Thus, it provides the best-case complexity of an algorithm. Formally, f(n) = Ω(g(n)) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
Theta Notation (Θ-notation)
Theta notation encloses the function from above and below. Since it represents the upper
and the lower bound of the running time of an algorithm, it is used for analyzing the
average-case complexity of an algorithm.
Formally, a function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1, c2, and n0 such that it can be sandwiched between c1·g(n) and c2·g(n): 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
If a function f(n) lies anywhere in between c1·g(n) and c2·g(n) for all n ≥ n0, then f(n) is said to be Θ(g(n)).
Mathematical analysis for Recursive and Non-recursive algorithms
1.1 General Plan for Analysing the Time Efficiency of Non-recursive Algorithms:
1. Decide on a parameter (or parameters) indicating an input's size.
2. Identify the algorithm's basic operation (in the innermost loop).
3. Check whether the number of times the basic operation is executed depends only on the size of an input. If it also depends on some additional property, the worst-case, average-case, and, if necessary, best-case efficiencies have to be investigated separately.
4. Set up a sum expressing the number of times the algorithm’s basic operation
is executed.
5. Using standard formulas and rules of sum manipulation either find a closed
form formula for the count or at the least, establish its order of growth.
EXAMPLE 1: Consider the problem of finding the value of the largest element in a list
of n numbers. Assume that the list is implemented as an array for simplicity.
ALGORITHM MaxElement(A[0..n − 1])
//Determines the value of the largest element in a given array
//Input: An array A[0..n − 1] of real numbers
//Output: The value of the largest element in A
maxval ← A[0]
for i ← 1 to n − 1 do
if A[i] > maxval
maxval ← A[i]
return maxval
Algorithm analysis
• The measure of an input’s size here is the number of elements in the array, i.e., n.
• The comparison operation is considered as the algorithm’s basic operation, because the
comparison is executed on each repetition of the loop and not the assignment.
• The number of comparisons will be the same for all arrays of size n; therefore, there is no
need to distinguish among the worst, average, and best cases here.
• Let C(n) denote the number of times this comparison is executed. The algorithm makes one comparison on each execution of the loop, which is repeated for each value of the loop's variable i within the bounds 1 and n − 1, inclusive. Therefore, the sum for C(n) is calculated as follows:
C(n) = Σ (i = 1 to n − 1) 1 = n − 1 ∈ Θ(n).
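The same count can be observed in a Python sketch of the algorithm (illustrative):

```python
def max_element(a):
    """Return the largest element of a non-empty list,
    also counting executions of the basic operation (the comparison)."""
    maxval = a[0]
    comparisons = 0
    for x in a[1:]:
        comparisons += 1        # the basic operation: x > maxval
        if x > maxval:
            maxval = x
    return maxval, comparisons

print(max_element([3, 9, 4, 7]))  # prints (9, 3), i.e. n - 1 comparisons
```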
MATHEMATICAL ANALYSIS FOR RECURSIVE ALGORITHMS:
EXAMPLE 1: Compute the factorial function F(n) = n! for an arbitrary nonnegative integer n. Since n! = 1 · . . . · (n − 1) · n = (n − 1)! · n for n ≥ 1, and 0! = 1 by definition, we can compute F(n) = F(n − 1) · n with the following recursive algorithm.
ALGORITHM F(n)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0 return 1
else return F(n − 1) * n
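A Python sketch of this recursive algorithm, instrumented to also count multiplications (illustrative):

```python
def factorial(n):
    """Compute n! recursively, returning (value, M(n)) where M(n) is the
    number of multiplications; it satisfies M(n) = M(n - 1) + 1, M(0) = 0."""
    if n == 0:
        return 1, 0
    value, mults = factorial(n - 1)
    return value * n, mults + 1

print(factorial(5))  # prints (120, 5)
```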
Algorithm analysis
• For simplicity, we consider n itself as an indicator of this algorithm's input size.
• The number of multiplications M(n) needed to compute F(n) must satisfy the equality M(n) = M(n − 1) + 1, since M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication is needed to multiply the result by n.
Recurrence relations
The last equation defines the sequence M(n) that we need to find. This equation defines
M(n) not explicitly, i.e., as a function of n, but implicitly as a function of its value at another
point, namely n − 1. Such equations are called recurrence relations or recurrences.
We have to solve the recurrence relation M(n) = M(n − 1) + 1, i.e., to find an explicit formula for M(n) in
terms of n only. To determine a solution uniquely, we need an initial condition that tells us
the value with which the sequence starts. We can obtain this value by inspecting the
condition that makes the algorithm stop its recursive calls:
if n = 0 return 1.
First, since the calls stop when n = 0, the smallest value of n for which this algorithm is executed, and hence M(n) defined, is 0.
Second, by inspecting the pseudocode’s exiting line, we can see that when n
= 0, the algorithm performs no multiplications.
Thus, the recurrence relation and initial condition for the algorithm’s number of
multiplications
M(n) are: M(n) = M(n − 1) + 1 for n > 0,
M(0) = 0.
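Solving this recurrence by backward substitution gives:

```latex
M(n) = M(n-1) + 1 = [M(n-2) + 1] + 1 = M(n-2) + 2
     = [M(n-3) + 1] + 2 = M(n-3) + 3 = \cdots = M(n-i) + i
     = \cdots = M(0) + n = n.
```

So the algorithm performs exactly n multiplications.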
Master Method
The Master Method is used for solving recurrences of the following type:
T(n) = a T(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is a function. Let T(n) be defined on the non-negative integers by the recurrence
T(n) = a T(n/b) + f(n).
When applied to the analysis of a recursive algorithm, the constants and the function take on the following significance:
o n is the size of the problem.
o a is the number of subproblems in the recursion.
o n/b is the size of each subproblem. (Here it is assumed that all subproblems are
essentially the same size.)
o f (n) is the sum of the work done outside the recursive calls, which includes the
sum of dividing the problem and the sum of combining the solutions to the
subproblems.
o It is not always possible to bound the function as required, so we distinguish three cases that tell us what kind of bound we can apply to the function.
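The three cases referred to above (the worked pages are missing from this copy; what follows is the standard statement of the master theorem) are:

```latex
T(n) =
\begin{cases}
\Theta\left(n^{\log_b a}\right) & \text{if } f(n) = O\left(n^{\log_b a - \epsilon}\right) \text{ for some } \epsilon > 0,\\
\Theta\left(n^{\log_b a} \log n\right) & \text{if } f(n) = \Theta\left(n^{\log_b a}\right),\\
\Theta\left(f(n)\right) & \text{if } f(n) = \Omega\left(n^{\log_b a + \epsilon}\right) \text{ for some } \epsilon > 0\\
& \text{and } a\, f(n/b) \le c\, f(n) \text{ for some } c < 1 \text{ and all large } n.
\end{cases}
```

For example, merge sort's recurrence T(n) = 2T(n/2) + Θ(n) has a = b = 2, so n^(log_b a) = n; case 2 applies and T(n) = Θ(n log n).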
Examples on Asymptotic Notation – Upper, Lower and Tight Bound
Example: Find upper bound of running time of constant function f(n) = 6993.
To find the upper bound of f(n), we have to find c and n0 such that 0 ≤ f (n) ≤ c.g(n) for
all n ≥ n0
0 ≤ f (n) ≤ c × g (n)
0 ≤ 6993 ≤ c × g (n)
0 ≤ 6993 ≤ 6993 x 1
So, c = 6993 and g(n) = 1
Any value of c which is greater than 6993, satisfies the above inequalities, so all such
values of c are possible.
f(n) = O(g(n)) = O(1) for c = 6993, n0 = 1
f(n) = O(g(n)) = O(1) for c = 8000, n0 = 1 and so on.
Tabular Approach
Example: Find the upper bound of the running time of a linear function f(n) = 6n + 3.
0 ≤ 6n + 3 ≤ c × g(n)
0 ≤ 6n + 3 ≤ 7n
Now, manually find a proper n0 such that f(n) ≤ c·g(n):
n    f(n) = 6n + 3    c·g(n) = 7n
1          9               7
2         15              14
3         21              21
4         27              28
5         33              35
From the table, for n ≥ 3, f(n) ≤ c × g(n) holds true. So c = 7, g(n) = n, and n0 = 3. There can be multiple such pairs (c, n0):
f(n) = O(g(n)) = O(n) for c = 9, n0 = 1
f(n) = O(g(n)) = O(n) for c = 7, n0 = 3
and so on.
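A quick check (illustrative) confirms the table: 6n + 3 ≤ 7n holds exactly from n = 3 onwards:

```python
def f(n):
    return 6 * n + 3       # the given function

def cg(n):
    return 7 * n           # c * g(n) with c = 7, g(n) = n

holds = [n for n in range(1, 11) if f(n) <= cg(n)]
print(holds)  # prints [3, 4, 5, 6, 7, 8, 9, 10]
```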
Tabular approach:
Example: Find the upper bound of the running time of a quadratic function f(n) = 3n² + 2n + 4.
0 ≤ 3n² + 2n + 4 ≤ c·g(n)
0 ≤ 3n² + 2n + 4 ≤ 4n²
Now, manually find a proper n0 such that f(n) ≤ c·g(n):
n    f(n) = 3n² + 2n + 4    c·g(n) = 4n²
1             9                   4
2            20                  16
3            37                  36
4            60                  64
5            89                 100
From the table, for n ≥ 4, f(n) ≤ c × g(n) holds true. So c = 4, g(n) = n², and n0 = 4. There can be multiple such pairs (c, n0):
f(n) = O(g(n)) = O(n²) for c = 9, n0 = 1
f(n) = O(g(n)) = O(n²) for c = 4, n0 = 4
and so on.
Example: Find the upper bound of the running time of a cubic function f(n) = 2n³ + 4n + 5.
To find the upper bound of f(n), we have to find c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0:
0 ≤ 2n³ + 4n + 5 ≤ 11n³ → true, for all n ≥ 1
0 ≤ 2n³ + 4n + 5 ≤ 3n³ → true, for all n ≥ 3
f(n) = O(g(n)) = O(n³) for c = 11, n0 = 1
f(n) = O(g(n)) = O(n³) for c = 3, n0 = 3, and so on.
Lower Bound
The lower bound of any function is defined as follows:
Let f(n) and g(n) be two nonnegative functions indicating the running time of two algorithms. We say the function g(n) is a lower bound of the function f(n) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0. It is denoted as f(n) = Ω(g(n)).
Example: Find lower bound of running time of constant function f(n) = 23.
To find the lower bound of f(n), we have to find c and n0 such that 0 ≤ c × g(n) ≤ f(n) for all n ≥ n0.
0 ≤ c × g(n) ≤ f(n)
0 ≤ c × g(n) ≤ 23
0 ≤ 23 × 1 ≤ 23 → true
0 ≤ 12 × 1 ≤ 23 → true
0 ≤ 5 × 1 ≤ 23 → true
All three inequalities above hold, and infinitely many such inequalities exist.
So c = 23, c = 12, or c = 5, with g(n) = 1. Any value of c that is less than or equal to 23 satisfies the above inequality, so all such values of c are possible. The function f(n) is constant, so it does not depend on the problem size n. Hence n0 = 1.
f(n) = Ω (g(n)) = Ω (1) for c = 23, n0 = 1
f(n) = Ω (g(n)) = Ω (1) for c = 12, n0 = 1 and so on.
Example: Find the lower bound of the running time of a linear function f(n) = 6n + 3.
To find the lower bound of f(n), we have to find c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
0 ≤ c × g(n) ≤ f(n)
0 ≤ c × g(n) ≤ 6n + 3
0 ≤ 6n ≤ 6n + 3 → true, for all n ≥ n0
0 ≤ 5n ≤ 6n + 3 → true, for all n ≥ n0
Both inequalities above hold, and infinitely many such inequalities exist. So,
f(n) = Ω(g(n)) = Ω(n) for c = 6, n0 = 1
f(n) = Ω(g(n)) = Ω(n) for c = 5, n0 = 1
and so on.
Example: Find the lower bound of the running time of a quadratic function f(n) = 3n² + 2n + 4.
To find the lower bound of f(n), we have to find c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
0 ≤ c × g(n) ≤ f(n)
0 ≤ c × g(n) ≤ 3n² + 2n + 4
0 ≤ 3n² ≤ 3n² + 2n + 4 → true, for all n ≥ 1
0 ≤ n² ≤ 3n² + 2n + 4 → true, for all n ≥ 1
Both inequalities above hold, and infinitely many such inequalities exist.
So, f(n) = Ω(g(n)) = Ω(n²) for c = 3, n0 = 1
f(n) = Ω(g(n)) = Ω(n²) for c = 1, n0 = 1
and so on.
Example: Find the lower bound of the running time of a cubic function f(n) = 2n³ + 4n + 5.
To find lower bound of f(n), we have to find c and n 0 such that 0 ≤ c.g(n) ≤ f(n) for all n ≥
n0
0 ≤ c × g (n) ≤ f(n)
0 ≤ c × g(n) ≤ 2n³ + 4n + 5
0 ≤ 2n³ ≤ 2n³ + 4n + 5 → true, for all n ≥ 1
0 ≤ n³ ≤ 2n³ + 4n + 5 → true, for all n ≥ 1
Both inequalities above hold, and infinitely many such inequalities exist.
So, f(n) = Ω(g(n)) = Ω(n³) for c = 2, n0 = 1
f(n) = Ω(g(n)) = Ω(n³) for c = 1, n0 = 1
and so on.
Tight Bound
The tight bound of any function is defined as follows:
Let f(n) and g(n) be two nonnegative functions indicating the running time of two algorithms. We say the function g(n) is a tight bound of the function f(n) if there exist positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0. It is denoted as f(n) = Θ(g(n)).
Example: Find tight bound of running time of constant function f(n) = 23.
To find tight bound of f(n), we have to find c1, c2 and n0 such that, 0 ≤ c1× g(n) ≤ f(n) ≤
c2 × g(n) for all n ≥ n0
0 ≤ c1× g(n) ≤ 23 ≤ c2 × g(n)
0 ≤ 22 ×1 ≤ 23 ≤ 24 × 1, → true for all n ≥ 1
0 ≤ 10 ×1 ≤ 23 ≤ 50 × 1, → true for all n ≥ 1
Both inequalities above hold, and infinitely many such inequalities exist.
So, (c1, c2) = (22, 24) and g(n) = 1, for all n ≥ 1
(c1, c2) = (10, 50) and g(n) = 1, for all n ≥ 1
f(n) = Θ(g(n)) = Θ(1) for c1 = 22, c2 = 24, n0 = 1
f(n) = Θ(g(n)) = Θ(1) for c1 = 10, c2 = 50, n0 = 1
and so on.
Example: Find tight bound of running time of a linear function f(n) = 6n + 3.
To find the tight bound of f(n), we have to find c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
0 ≤ c1× g(n) ≤ 6n + 3 ≤ c2 × g(n)
0 ≤ 5n ≤ 6n + 3 ≤ 9n, for all n ≥ 1
The above inequality holds, and infinitely many such choices exist.
So, f(n) = Θ(g(n)) = Θ(n) for c1 = 5, c2 = 9, n0 = 1
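The sandwich c1·g(n) ≤ f(n) ≤ c2·g(n) can also be checked numerically (an illustrative sketch):

```python
def f(n):
    return 6 * n + 3

# Check 5n <= 6n + 3 <= 9n from n0 = 1 onwards.
ok = all(5 * n <= f(n) <= 9 * n for n in range(1, 101))
print(ok)  # prints True
```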
General Problems
(i) 3n + 2 = Θ(n)
To prove above statement, we have to find c1, c2 and n0 such that, 0 ≤ c1× g(n) ≤ f(n) ≤
c2 g(n) for all n ≥ n0
0 ≤ c1× g(n) ≤ 3n + 2 ≤ c2 × g(n)
0 ≤ 2n ≤ 3n + 2 ≤ 5n, for all n ≥ 1
So, f(n) = Θ(g(n)) = Θ(n) for c1 = 2, c2 = 5 n0 = 1
(ii) 6 × 2ⁿ + n² = Θ(2ⁿ)
To prove the above statement, we have to find c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0:
0 ≤ 6 × 2ⁿ ≤ 6 × 2ⁿ + n² ≤ 7 × 2ⁿ, for all n ≥ 4
So, f(n) = Θ(g(n)) = Θ(2ⁿ) for c1 = 6, c2 = 7, n0 = 4
Example: Let f(n) and g(n) be asymptotically positive functions. Prove or disprove the following: f(n) + g(n) = Θ(min(f(n), g(n))).
Since f(n) and g(n) are nonnegative for all n ≥ n0, we have f(n) + g(n) ≥ f(n) ≥ 0 and, similarly, f(n) + g(n) ≥ g(n) ≥ 0.
Hence f(n) + g(n) ≥ min(f(n), g(n)), i.e., min(f(n), g(n)) ≤ c·(f(n) + g(n)) for all n ≥ n0 with c = 1.
From the definition of big oh, min(f(n), g(n)) = O(f(n) + g(n)).
However, the converse bound fails. Take f(n) = n and g(n) = n², so min(f(n), g(n)) = n. For any constant c there is no n0 such that n + n² ≤ c·n for all n ≥ n0, since (n + n²)/n = 1 + n grows without bound. So f(n) + g(n) = O(min(f(n), g(n))) is false.
Hence, the given statement is false.
Example: Find the big theta and big omega notation of f(n) = 14 × 7 + 83.
Here f(n) = 98 + 83 = 181, a constant. Taking g(n) = 1, the inequality 0 ≤ 181 × 1 ≤ 181 gives f(n) = Ω(1) with c = 181, n0 = 1, and the sandwich 0 ≤ 180 × 1 ≤ 181 ≤ 182 × 1 gives f(n) = Θ(1) with c1 = 180, c2 = 182, n0 = 1.