Introduction to Data Structures
Unit 1: Asymptotic analysis (Big-O notation)
Algorithm: an outline, the essence of a computational procedure, given as step-by-step
instructions
Program: an implementation of an algorithm in some programming language
Algorithmic Problem: a specification of the input, and a specification of the output as a
function of the input.
(There are infinitely many correct algorithms for the same algorithmic problem.)
Correctness of the algorithm is not the only concern. We also need to:
• compare algorithms
• find an efficient algorithm
• analyse the algorithm (complexity analysis)
What is an Efficient Algorithm?
• Takes less running time (analysis of time complexity)
• Uses less space (analysis of space complexity)
Time-Space Trade-off
A situation where memory use can be reduced at the cost
of slower program execution (e.g. using linear search for searching
records on multiple criteria),
or
the computation time can be reduced at the cost of increased
memory use (e.g. using multiple indexes for searching).
As the relative costs of CPU cycles, RAM space, and hard drive space have changed, the
appropriate choices for the time-space trade-off have changed radically.
Time Complexity
The running time of an algorithm is the total number of primitive operations executed.
The exact number of steps depends on exactly what machine or language is being used.
To avoid that problem, asymptotic notation is used.
Algorithm complexity is a rough estimate of the number of steps performed by a given
computation, as a function of the size of the input data, measured through asymptotic
notation.
Usually, the time required by an algorithm falls under three types −
• Best Case − Minimum time required for program execution.
• Average Case − Average time required for program execution.
• Worst Case − Maximum time required for program execution.
Rate of Growth
log₂n < √n < n < n log₂n < n² < n³ < 2ⁿ
Constant Ο(1)
Logarithmic Ο(log n)
Linear Ο(n)
Log Linear Ο(n log n)
Quadratic Ο(n²)
Cubic Ο(n³)
Polynomial n^Ο(1)
Exponential 2^Ο(n)
Asymptotic Notation
Asymptotic notation of an algorithm is a mathematical representation of
its complexity
An asymptote of a curve is a line such that the distance between the
curve and the line approaches zero as one or both of the x and y
coordinates tend to infinity.
It is a method of describing limiting behaviour.
e.g. If f(n) = n² + 3n, then as n becomes very large, the term 3n becomes
insignificant compared to n².
The function f(n) is said to be "asymptotically equivalent to n², as n → ∞".
Commonly used asymptotic notations to calculate the running time
complexity of an algorithm are
Ο Notation (Big Oh Notation)
Ω Notation (Omega Notation)
θ Notation (Theta Notation)
Big Oh Notation - Asymptotic upper bound
It is used to express the worst-case time complexity: the longest amount of time an
algorithm can possibly take to complete.
Ω Notation - Asymptotic lower bound
It is used to express the best-case time complexity: the least amount of time an algorithm
can possibly take to complete.
Since the best-case performance of an algorithm is generally not useful, Omega notation is
the least used of the three.
Θ Notation – bounds a function from above and below (a tight bound)
It is used to express the average-case time complexity.
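The three notations have standard formal definitions, which can be stated as follows:

```latex
f(n) = O(g(n))      \iff \exists\, c > 0,\ n_0 : \forall n \ge n_0,\ f(n) \le c \cdot g(n)
f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : \forall n \ge n_0,\ f(n) \ge c \cdot g(n)
f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n))
```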
Basic Rules for Calculating Complexity
1. Nested loops are multiplied together.
2. Sequential loops are added.
3. Only the largest term is kept, all others are dropped.
4. Constants are dropped.
5. Conditional checks are constant (i.e. 1).
Measuring Efficiency of an Algorithm
We are usually most interested in finding the worst-case complexity, i.e. the Big O.
Analysis of Algorithms
// Here c is a constant
for (int i = 1; i <= c; i++)
{
    // some O(1) expressions
}
The loop runs a constant number of times, so it is O(1).
Reviewing Complexity
#include <iostream>
int main() {
    int n;
    int c = 4;        // 1 step
    std::cin >> n;    // 1 step
    int d = c * 5;    // 1 step
}
O(1+1+1) = constant time, O(1)
Reviewing Complexity
for (int i = 0; i < n; i++) {
    c = c + n;
}
The loop body executes n times, so this is O(n).
Reviewing Complexity
for (int i = 1; i < n; i += 2) {
    stmt;
}
The loop runs about n/2 times; dropping the constant gives O(n).
Reviewing Complexity
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        d = d * 6;
    }
}
Nested loops multiply: n · n = O(n²).
Reviewing Complexity
int a = 0;                        // 1
int b = 0;                        // 1
for (i = 0; i < N; i++) {         // N+1
    a = a + rand();               // N
}
for (j = 0; j < M; j++) {         // M+1
    b = b + rand();               // M
}
Total steps: 4 + 2N + 2M = 2(N + M) + 4
Ignoring constants → O(N + M)
Reviewing Complexity
int a = 0;
for (i = 0; i < N; i++) {
    for (j = N; j > i; j--) {
        a = a + i + j;
    }
}
The inner loop runs N − i times, so the total is N + (N−1) + ... + 1 = N(N+1)/2 → O(N²).
Reviewing Complexity
for (i = 1; i <= n; i = i * 2) {
    stmt;
}
• After 1 iteration: i = 2
• After 2 iterations: i = 2 · 2 = 2²
• After 3 iterations: i = 2² · 2 = 2³
• After k iterations: i = 2ᵏ
The loop stops once i exceeds n, i.e. when 2ᵏ ≥ n, so k = O(log n).
Reviewing Complexity
for (i = n; i >= 1; i = i / 2) {
    stmt;
}
Here i is halved on every iteration, so by the same argument the loop runs O(log n) times.
Reviewing Complexity
p = 0;
for (i = 1; p <= n; i++) {
    p = p + i;
}
• After 1 iteration: p = 1
• After 2 iterations: p = 1 + 2
• After 3 iterations: p = 1 + 2 + 3
• After k iterations: p = 1 + 2 + ... + k = k(k+1)/2
The loop stops once p exceeds n, i.e. roughly when k² > n, so k = O(√n).
Reviewing Complexity
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {
        stmt;
    }
}
The inner loop runs i times, so the total is 0 + 1 + ... + (n−1) = n(n−1)/2 → O(n²).
Suppose we had an algorithm that took in an array of strings,
sorted each string, and then sorted the full array. What would the
runtime be?
1. Sorting Each String
Suppose the array has n strings, and the average length of each string is m.
Sorting each string individually takes O(m log m) time per string.
Since there are n strings, the total time to sort all the strings is O(n · m log m).
2. Sorting the Full Array
After each string has been sorted, we sort the full array.
Each comparison between two strings takes O(m) time, since we may need to compare up to m
characters.
Sorting the full array of n strings requires O(n log n) comparisons, with each comparison taking
O(m) time.
Hence, the time to sort the entire array is O(n · m log n).
Overall Runtime
Combining both steps, the total runtime of the algorithm is the sum of the two main operations:
O(n · m log m) + O(n · m log n)
Factoring out n · m, the overall runtime is: O(n · m (log m + log n))
Since log m + log n = log(mn), this simplifies to: O(n · m log(mn))
• Given a number n, find the sum of the first n natural numbers.