VR 17 1
Introduction
Algorithm Specification:
The word algorithm comes from the name of a Persian and Muslim mathematician Abu
Abdullah Muhammad ibn Musa Al-Khwarizmi, he was mathematician, astronomer and
geographer during the Abbasid Caliphate, a scholar in the House of Wisdom in Baghdad. This
word has taken a special significance in computer Science as it is used as a tool for solving a
well specified computational problem.
• Definition:
An Algorithm is a finite set of instructions that if followed, accomplishes a particular task
Or
An Algorithm is any well defined computational procedure that takes some value or set of values
as input and produces some value or set of values as output.
Or
An Algorithm is step by step process to solve the particular problem.
• Specification of Algorithm:
1. Input: There are zero or more quantities that are externally supplied.
2. Output: At least one quantity is produced.
3. Definiteness: Each instruction is clear and unambiguous.
4. Finiteness: If we trace out the instructions of an algorithm, then for all cases, the
algorithm terminates after a finite number of steps.
5. Effectiveness: Every instruction must be basic enough to be carried out, in principle, by a
person using only pencil and paper. It is not enough that each operation be definite as in
(3); it also must be feasible.
In computational theory, one distinguishes between an algorithm and a program, the latter
of which does not have to satisfy the fourth condition. For example, we can think of an
operating system that continues in a wait loop until more jobs are entered. Such a program
does not terminate unless the system crashes. Since our programs will always terminate.
We can describe an algorithm in many ways. We can use a natural language like English,
although, if we select this option, we must make sure that the resulting instructions are
definite. Graphic representations called flowcharts are another possibility, but they work well
only if the algorithm is small and simple.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 2
• Example -1: Selection sort
Suppose we must devise a program that sorts a set of n ≥ 1 integers. A simple solution is given
by the following:
From those integers that are currently unsorted, find the smallest and place it next in the sorted
list.
Although this statement adequately describes the sorting problem, it is not an algorithm since it
leaves several unanswered questions. For example, it does not tell us where and how the integers
are initially stored, or where we should place the result. We assume that the integers are stored in
an array, list, such that the ith integer is stored in the ith position, list [i ], 0 ≤ i < n.
for (i = 0; i < n; i++) {
Examine list[i] to list[n-l] and suppose that the
smallest integer is at list[min];
Interchange list[i] and list[min];
}
Algorithm 1: Selection Sort
Selection Sort - finding the smallest integer in the given list and interchanging it with list [i].
void sort(int list[],int n) {
int i, j, min, temp;
for (i = 0; i < n-1; i++) {
min = i;
for (j = i+1; j < n; j++)
if (list[j] < list[min])
min = j;
SWAP(list[i],list[min],temp);
}
}
Example 1.2: Binary search
Assume that we have n ≥ 1 distinct integers that are already sorted and stored in the array list.
That is, list [0] ≤ list [1] ≤ ... ≤ list [n-l]. We must figure out if an integer searchnum is in this list.
If it is we should return an index, i, such that list[i] = searchnum. If searchnum is not present, we
should return -1. Sincethe list is sorted we may use the following method to search for the value.
Let left and right, respectively, denote the left and right ends of the list to be searched.
Initially, left = 0 and right = n-1. Let middle = (left+right)/2 be the middle position in the list. If
we compare list [middle] with searchnum, we obtain one of three results:
1. searchnum < list[middle]. In this case, if searchnum is present, it must be in the
positions between 0 and middle - 1. Therefore, we set right to middle - 1.
2. searchnum = list[middle]. In this case, we return middle.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 3
3. searchnum > list[middle]. In this case, if searchnum is present, it must be in the
positions between middle + 1 and n - 1. So, we set left to middle + 1.
while (there are more integers to check ) {
middle = (left + right) / 2;
if (searchnum < list [middle] )
right = middle - 1;
else if (searchnum == list[middle])
return middle;
else
left = middle + 1;
}
Algorithm 2: Binary Search
Data Abstraction
We are familiar with the basic data types of C. These include char, int, float, and double.
Some of these data types may be modified by the keywords short, long, and unsigned.
Ultimately, the real world abstractions we wish to deal with must be represented in terms of
these data types.
In addition to these basic types, C helps us by providing two mechanisms for grouping
data together. These are the array and the structure. Arrays are collections of elements of the
same basic data type. They are declared implicitly, for example, int list[5] defines a five-element
array of integers whose legitimate subscripts are in the range 0 ... 4. Structs are collections of
elements whose data types need not be the same. They are explicitly defined. For example,
struct student {
char studentName;
int studentId;
char grade;
};
Defines a structure with three fields, two of type character and one of type integer. The structure
name is student. All programming languages provide at least a minimal set of predefined data
types, plus the ability to construct new, or user-defined types. It is appropriate to ask the
question, "What is a data type?"
Definition: A data type is a collection of objects and a set of operations that act on those
objects.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 4
Whether your program is dealing with predefined data types or user-defined data
types,these two aspects must be considered: objects and operations. For example, the data type
intconsists of the objects {0, +1, -1, +2, -2, ..., INT-MAX, INT-MIN}, where INT_MAX and
INT_MIN are the largest and smallest integers that can be represented on your machine. (They
are defined in limits.h.) The operations on integers are many, and would certainly include the
arithmetic operators +, -, *, /, and %. There is also testing for equality/inequality and the
operation that assigns an integer to a variable.
In addition to knowing all of the facts about the operations on a data type, we might also
want to know about how the objects of the data type are represented. For example on most
computers a char is represented as a bit string occupying 1 byte of memory, where as an int
might occupy 2 or possibly 4 bytes of memory. If 2 eight-bit bytes are used, then INT_MAX is
215 -1 = 32,767.
Knowing the representation of the objects of a data type can be useful and dangerous. By
knowing the representation we can often write algorithms that make use of it. However, if we
ever want to change the representation of these objects, we also must change the routines that
make use of it. It has been observed by many software designers that hiding the representation of
objects of a data type from its users is a good design strategy. In that case, the user is constrained
to manipulate the objects solely through the functions that are provided. The designer may still
alter the representation as long as the new implementations of the operations do not change the
user interface.
Definition: An Abstract Data Type (ADT) is a data type that is organized in such a way that the
specification of the objects and the specification of the operations on the objects is separated
from the representation of the objects and the implementation of the operations.
Some programming languages provide explicit mechanisms to support the distinction
between specification and implementation. For example, C++ has a concept called a class. Class
assists the programmer in implementing abstract data types. Although C does not have an
explicit mechanism for implementing ADTs, it is still possible and desirable to design your data
types using the same notion.
Specification of the operations of an ADT differs from the implementation of the
operations?
The specification consists of the names of every function, the type of its arguments, and
the type of its result. There should also be a description of what the function does, but without
appealing to internal representation or implementation details. This requirement is quite
important, and it implies that an abstract data type is implementation-independent.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 5
Example - Abstract Data Type of NaturaINumber:
ADT NaturalNumber is
Objects: an ordered subrange of the integers starting at zero and ending at the maximum integer
(lNT -MAX) on the computer.
Functions:
for all x, y ∈ NaturalNumber; TRUE, FALSE ∈ Boolean
where +, -, <, and == are the usual integer operations.
NaturalNumber Zero() :: = 0
if (x) return FALSE
Boolean IsZero(x) :: =
else return TRUE
if (x == y) return TRUE
Boolean Equal(x, y) :: =
else return FALSE
if (x == INT_MAX) return x
NaturalNumber Successor(x) :: =
else return x + 1
if ((x + y) <= INT_MAX) return x + y
NaturalNumber Add(x, y) :: =
else return INT –MAX
if (x < y) return 0
NaturalNumber Subtract(x, y) :: =
else return x – y
end NaturalNumber
Abstract data type NaturalNumber
The ADT definition begins with the name of the ADT. There are two main sections in the
definition: the objects and the functions.
✓ The objects are defined in terms of the integers, but we make no explicit reference to
their representation.
✓ The function definitions are a bit more complicated. First, the definitions use the symbols
x and y to denote two elements of the data type NaturalNumber, while TRUE and FALSE
are elements of the data type Boolean. In addition, the definition makes use of functions
that are defined on the set of integers, namely, plus, minus, equals, and less than. For
each function, we place the result type to the left of the function name and a definition of
the function to the right. The symbols "::=" should be read as "is defined as."
Functions:
• The first function, Zero, has no arguments and returns the natural number zero. This is a
constructor function.
• The function Successor(x) returns the next natural number in sequence. This is an
example of a transformer function. Notice that if there is no next number in sequence,
that is, if the value of x is already INT_MAX, then we define the action of Successor to
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 6
return INT_MAX. Some programmers might prefer that in such a case Successor return
an error flag.
• Other transformer functions are Add and Subtract. They might also return an error
condition, although here we decided to return an element of the set NaturalNumber.
Performance Analysis:
If we want to travel from 'City - A' to 'City - B', there can be many ways of doing this.
We can go by flight, by bus, by train and also by bicycle. Depending on the availability and
convenience, we choose the one which suits us. Similarly, in computer science, there are
multiple algorithms to solve a problem. When we have more than one algorithm to solve a
problem, we need to select the best one. Performance analysis helps us to select the best
algorithm from multiple algorithms to solve a problem. When there are multiple alternative
algorithms to solve a problem, we analyze them and pick the one which is best suitable for our
requirements. To compare algorithms, we use a set of parameters or set of elements like memory
required by that algorithm, the execution speed of that algorithm i.e., Space Complexity and
Time Complexity.
Space complexity: The amount of computer memory required to run an application. Space
complexity can be broadly classified into two types. Fixed or constant space complexity and
Linear space complexity.
✓ In Fixed or constant space complexity the size of space consumed will not be increased
with the increase of input size. Consider the following example “ Square of the given
Number”
int getSquare (int n) {
return (n * n);
}
Here we can solve the problem without consuming any extra space, hence the space
complexity is constant even if the input size increases.
✓ In Linear space complexity the space consumed will be increased with the increase of
input size. Consider the following example “Sum of Array elements”
int calculate_sum(int arr[], int len){
int total_sum = 0;
int i = 0;
for( i= 0; i< len; i++) {
total_sum = total_sum + arr[i];
}
}
From the above code, we can see that:
A variable “total_sum” that takes constant sum of 1 unit
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 7
A variable “length” that takes constant sum of 1 unit
A variable “i” that takes constant sum of 1 unit
But the variable arr[] is an array, it’s space consumption increases with the increase of
input size.
Time complexity: The amount of CPU time required to run/execute an application.
Calculate the time complexity of sum of N array elements.
int calculate_sum(int arr[], int length) {
int total_sum = 0, i;
for( i= 0; i< len; i++) {
total_sum = total_sum + arr[i];
}
return total_sum;
}
From the above code,
✓ The statement “int total_sum = 0;” will be executed 1 time. Hence 1 unit time-->
Constant Time
✓ The statement “for( i= 0; i< len; i++)” will be executed “len+1” times. This is
because, it will also calculate till, after the condition is false. --> Variable Time
(depends on len)
✓ The statement “total_sum = total_sum + arr[i];” will execute “len” times. -->
Variable Time (depends on len)
✓ The statement “return total_sum;” will be executed 1 time. Hence 1 unit time-->
Constant Time
Statement Total No. of Times
int total_sum = 0; 1
for( i= 0; i< len; i++) len+1 times (N+1 times)
total_sum = total_sum + arr[i]; len times (N-times)
int i = 0; 1
Total Time (2N + 3)
Hence the total time taken to execute is = “2N+3”. For larger value of N, we ignore the
constants, hence the final time complexity is “O(N)” times.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 8
Example -2: Time complexity for matrix addition
int add_matrix( int a[m] [n], int b [m] [n], int m, int n) {
for (i = 0; i< n; i++) {
for(j = 0; j<n; j++){
c[i][j] = a[i][j] + b[i][j]
}
}
}
From the above code, we can see that:
The first for loop “for (i = 0; i< n; i++)” will be executed (n+1) time. Hence n+1 units.
The second for loop “for(j = 0; j<n; j++)will be executed n*(n+1) time. Hence n(n+1) units.
The third statement “c[i][j] = a[i][j] + b[i][j]” will be executed “n * n” times. Hence n^2 units.
Total time taken to run this code is,
=> (n+1) + nx(n+1) + n^2
=> n + 1 + n^2 + n + n^2
=> 2x(n^2 + n) + 1
For larger value of “n” we ignore the constants, hence total time taken to execute is O(n^2).
Asymptotic Notations
Asymptotic analysis of an algorithm refers to defining the mathematical bound of its run-
time performance. Using asymptotic analysis, we can very well conclude the best case, average
case, and worst case scenario of an algorithm. Asymptotic analysis is input bound i.e., if there's
no input to the algorithm, it is concluded to work in a constant time. Other than the "input" all
other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in
mathematical units of computation. For example, the running time of one operation is computed
as f(n) and may be for another operation it is computed as g(n2). This means the first operation
running time will increase linearly with the increase in n and the running time of the second
operation will increase exponentially when n increases. Similarly, the running time of both
operations will be nearly the same if n is significantly small.
Usually, the time required by an algorithm falls under three types −
• Best Case − Minimum time required for program execution.
• Average Case − Average time required for program execution.
• Worst Case − Maximum time required for program execution.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 9
Asymptotic Notations:
Big Oh Notation: (O)
The notation Ο(n) is the formal way
to express the upper bound of an algorithm's
running time. It measures the worst case
time complexity or the longest amount of
time an algorithm can possibly take to
complete.
Consider function f(n) as time complexity of an algorithm and g(n) is the most significant term.
If f(n) <= C g(n) for all n >= n0, C > 0 and n0 >= 1. Then we can represent f(n) as O(g(n)).
Omega Notation, Ω
The notation Ω(n) is the formal way to
express the lower bound of an algorithm's
running time. It measures the best case time
complexity or the best amount of time an
algorithm can possibly take to complete.
Consider function f(n) as time complexity of
an algorithm and g(n) is the most significant
term. If f(n) >= C g(n) for all n >= n0, C > 0
and n0 >= 1. Then we can represent f(n) as
Ω(g(n)).
Theta Notation, θ
The notation θ(n) is the formal way to
express both the lower bound and the upper
bound of an algorithm's running time.
Consider function f(n) as time complexity of
an algorithm and g(n) is the most significant
term. If C1 g(n) <= f(n) <= C2 g(n) for all n
>= n0, C1 > 0, C2 > 0 and n0 >= 1. Then we
can represent f(n) as Θ(g(n)).
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 10
Common Asymptotic Notations
Following is a list of some common asymptotic notations −
Constant − Ο(1)
Logarithmic − Ο(log n)
Linear − Ο(n)
n log n − Ο(n log n)
Quadratic − Ο(n2)
Cubic − Ο(n3)
Polynomial − nΟ(1)
Exponential − 2Ο(n)
The increasing order of time consumption:
O(1) < O( log n) < O(n)< O( nlog n) < O(n^2) < O(n^3) < O(2^n) < O(3^n) < O(n^n)
Searching: In computer science, searching is the process of finding an item with specified
properties from a collection of items. The items may be stored as records in a database, simple
data elements in arrays, text in files, nodes in trees, vertices and edges in graphs, or they may be
elements of other search spaces.
Why do we need Searching?
Searching is one of the core computer science algorithms. We know that today’s computers store
a lot of information. To retrieve this information proficiently we need very efficient searching
algorithms. There are certain ways of organizing the data that improves the searching process.
That means, if we keep the data in proper order, it is easy to search the required element. Sorting
is one of the techniques for making the elements ordered. In this chapter we will see different
searching algorithms.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 11
Types of Searching
✓ Linear/Sequential Search
✓ Binary Search
• Linear Search: Linear Search is also called as Sequential Search. In Linear search, Search
for a key within an array one by one until it finds in. In an unordered array, linear search
would have to check the entire array until it found the desired value.
procedure LINEAR_SEARCH (array, key)
for each item in the array
if match element == key
return element's index
end if
end for
end procedure
Algorithm: Linear Search
Implementation of Linear Search in ‘C’:
#include <stdio.h>
int main() {
int a[20], n, i, key, search_result;
printf("Number of Elements in the array:");
scanf("%d",&n);
printf("Enter Array Elements:");
for(i=0;i<n;i++){
printf("a[%d]=",i);
scanf("%d",&a[i]);
}
printf("What is your Search Key?:");
scanf("%d",&key);
search_result = linear_search(a,n,key);
if(search_result>=0)
printf("Search Key %d is found @ %d Location",key, search_result);
else
printf("Search Key %d is Not found ",key);
return 0;
}
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 12
int linear_search(int a[], int n, int key){
int i;
for(i=0;i<n;i++){
if(a[i]==key)
return i;
}
return -1;
}
Output:
Number of Elements in the array:5 Number of Elements in the array:5
Enter Array Elements: a[0]=34 Enter Array Elements: a[0]=34
a[1]=23 a[1]=23
a[2]=56 a[2]=56
a[3]=32 a[3]=32
a[4]=40 a[4]=40
What is your Search Key?:40 What is your Search Key?:10
Search Key 40 is found @ 4 Location Search Key 10 is Not found
Linear Search Complexity analysis:
Time complexity: O(n), in the worst case we need to scan the complete array.
Space complexity: O(1).
Binary Search Techniques:
Let us consider the problem of searching a word in a dictionary. Typically, we directly go
to some approximate page [say, middle page] and start searching from that point. If the name that
we are searching is the same then the search is complete. If the page is before the selected pages
then apply the same process for the first half; otherwise apply the same process to the second
half. Binary search also works in the same way. The algorithm applying such a strategy is
referred to as binary search algorithm. Binary Search follows Divide and Conquer technique.
Binary Search, Search a sorted array by repeatedly dividing the search interval in half. Begin with
an interval covering the whole array. If the value of the search key is less than the item in the middle of
the interval, narrow the interval to the lower half. Otherwise, narrow it to the upper half. Repeatedly
check until the value is found or the interval is empty.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 13
Procedure BinarySearch(a, n, key){
low =0, high = n-1;
while( low <= high ){
mid = (low +how) / 2 ;
if(a[mid] == key)
return mid;
else if(a[mid] < key)
low = mid + 1;
else
high = mid - 1;
}
return -1;
}
Algorithm: Binary Search
Binary Search Implementation in C:
#include <stdio.h>
int main()
{
int a[20],n,i,key,search_result=-1;
printf("Number of Elements in the array:");
scanf("%d",&n);
printf("Enter Array Elements:");
for(i=0;i<n;i++){
printf("a[%d]=",i);
scanf("%d",&a[i]);
}
printf("What is your Search Key:");
scanf("%d",&key);
search_result = binary_search(a,n,key);
if(search_result>=0)
printf("Search Key %d is found @ %d Location",key, search_result);
else
printf("Search Key %d is Not found ",key);
return 0;
}
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 14
int binary_search(int a[], int n, int key){
int low=0,high=n-1,mid;
while(low<=high){
mid = (low + high)/2;
if(a[mid]==key)
return mid;
else if(a[mid]<key)
low = mid +1;
else
high = mid -1 ;
}
return -1;
}
OUTPUT:
Number of Elements in the array:5 Number of Elements in the array:5
Enter Array Elements: a[0]=23 Enter Array Elements: a[0]=23
a[1]=32 a[1]=32
a[2]=34 a[2]=34
a[3]=40 a[3]=40
a[4]=56 a[4]=56
What is your Search Key:40 What is your Search Key:10
Search Key 40 is found @ 3 Location Search Key 10 is Not found
Binary Search Complexity analysis:
• Let say the iteration in Binary Search terminates after k iterations. In the above example, it
terminates after 2 iterations, so here k = 2
• At each iteration, the array is divided by half. So let’s say the length of array at any
iteration is n
• At Iteration 1,
Length of array = n
• At Iteration 2,
Length of array = n⁄2
• At Iteration 3,
Length of array = (n⁄2)⁄2 = n⁄22
• Therefore, after Iteration k,
Length of array = n⁄2k
• Also, we know that after
After k divisions, the length of array becomes 1
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures
VR 17 15
• Therefore
Length of array = n⁄2k = 1
=> n = 2k
• Applying log function on both sides:
=> log2 (n) = log2 (2k)
=> log2 (n) = k log2 (2)
• As (loga (a) = 1)
Therefore,
=> k = log2 (n)
Hence, the time complexity of Binary Search is log2 (n)
Questions:
1. Define Algorithm. What are the different characteristics of Algorithm?
2. Define ADT.
3. Define Time and Space complexity; find the time complexity of matrix multiplication.
4. Discuss about Asymptotic notation.
5. Write a linear search algorithm using recursion and calculate the time complexity of the
linear search.
6. Write a Binary search algorithm using recursion and calculate the time complexity of the
linear search.
7. Compare Linear and Binary search.
V V N V PHANI KUMAR, CSE, VRSEC. 17CS3303 Data Structures