KEMBAR78
Data Structures Full Notes | PDF | Queue (Abstract Data Type) | Algorithms
92% found this document useful (12 votes)
58K views90 pages

Data Structures Full Notes

The document discusses data structures and algorithms. It defines data structures as organizing data in a way that allows for efficient use and manipulation. It provides examples of common data structures like arrays, linked lists, stacks, queues, trees, and graphs. It also discusses the differences between primitive and non-primitive data structures, categories of data structures, common operations on data structures, applications of different data structures, and characteristics and writing of algorithms.

Uploaded by

shahalameenu2003
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
92% found this document useful (12 votes)
58K views90 pages

Data Structures Full Notes

The document discusses data structures and algorithms. It defines data structures as organizing data in a way that allows for efficient use and manipulation. It provides examples of common data structures like arrays, linked lists, stacks, queues, trees, and graphs. It also discusses the differences between primitive and non-primitive data structures, categories of data structures, common operations on data structures, applications of different data structures, and characteristics and writing of algorithms.

Uploaded by

shahalameenu2003
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 90

DATA STRUCTURES

UNIT I
Data structure is a particular way of organizing data in a computer so that it
can be used effectively.

Data Structures are widely used in almost every aspect of Computer Science i.e.
Operating System, Compiler Design, Artificial intelligence, Graphics and many
more.

Data Structures are the main part of many computer science algorithms as they
enable the programmers to handle the data in an efficient way. It plays a vital
role in enhancing the performance of a software or a program as the main
function of the software is to store and retrieve the user's data as fast as possible

Example of data structures

1. Arrays

An array is a structure of fixed-size, which can hold items of the same


data type. It can be an array of integers, an array of floating-point
numbers, an array of strings or even an array of arrays (such as 2-
dimensional arrays). Arrays are indexed, meaning that random access is
possible.

2. Linked Lists

A linked list is a sequential structure that consists of a sequence of items


in linear order which are linked to each other. Hence, you have to access
data sequentially and random access is not possible.
3. Stacks

A stack is a LIFO (Last In First Out — the element placed at last can be
accessed at first) structure which can be commonly found in many
programming languages. This structure is named as ―stack‖ because it
resembles a real-world stack — a stack of plates.

4. Queues

A queue is a FIFO (First In First Out — the element placed at first can be
accessed at first) structure which can be commonly found in many
programming languages. This structure is named as ―queue‖ because it
resembles a real-world queue — people waiting in a queue.

5. Trees

A tree is a hierarchical structure where data is organized hierarchically


and are linked together. This structure is different than a linked list
whereas, in a linked list, items are linked in a linear order.

6. Graphs

A graph consists of a finite set of vertices or nodes and a set


of edges connecting these vertices.

Elementary Data Organization


Data: Data can be defined as an elementary value, for example, student's name
and his roll number are the data about the student.

Group Items: Data items which have subordinate data items are called Group
item, for example, name of a student can have first name and the last name.
Record: Record can be defined as the collection of various data items, for
example, if we talk about the student entity, then his name, address, course and
marks can be grouped together to form the record for the student.

File: A File is a collection of various records of one type of entity, for example,
if there are 60 employees in a Bank, then there will be 60 records in the related
file where each record contains the data about each employee.

Entity: An entity is a person, place, thing, event or concept about which


information recorded.

What is the difference between a data structure and a data type?

Data type Data structure

A data type describes pieces of data A data structure is a way of describing


that all share a common property. a certain way to organize pieces of
data so that operations and algorithms
can be more easily applied.

For example an integer data type For example tree type data structures
describes every integer that the often allow for efficient searching
computer can handle. algorithms.

Values can directly be assigned to the The data is assigned to the data
data type variables. structure object using some set of
algorithms and operations like push,
pop and so on.

No problem of time complexity Time complexity comes into play


when working with data structures.
Examples: int, float, double Examples: stacks, queues, tree

Categories of data structure


Data structures can be classified as follows

Primitive Data Structures

These are the structures which are supported at the machine level, they can be

used to make non-primitive data structures. These are integral and are pure in

form. They have predefined behavior and specifications.

Examples: Integer, float, character etc


Non-primitive Data Structures

The non-primitive data structures cannot be performed without the primitive

data structures. Although, they too are provided by the system itself yet they are

derived data structures and cannot be formed without using the primitive data

structures.

The Non-primitive data structures are further divided into the following

categories:

1. Linear data structure

Data structure where data elements are arranged sequentially or linearly

where the elements are attached to its previous and next adjacent in what

is called a linear data structure. In linear data structure, single level is

involved. Therefore, we can traverse all the elements in single run only.

Linear data structures are easy to implement because computer memory is

arranged in a linear way. Its examples are array, stack, queue, linked list

2. Non Linear data structure

Data structures where data elements are not arranged sequentially or


linearly are called non-linear data structures. In a non-linear data
structure, single level is not involved. Therefore, we can’t traverse all the
elements in single run only. Non-linear data structures are not easy to
implement in comparison to linear data structure. It utilizes computer
memory efficiently in comparison to a linear data structure. Its examples
are trees and graphs.

Data structure operations

1. Traversing- It is used to access each data item exactly once so that it can

be processed.

2. Searching- It is used to find out the location of the data item if it exists in

the given collection of data items.

3. Inserting- It is used to add a new data item in the given collection of data

items.

4. Deleting- It is used to delete an existing data item from the given

collection of data items.

5. Sorting- It is used to arrange the data items in some order i.e. in ascending

or descending order in case of numerical data and in dictionary order in case

of alphanumeric data.

6. Merging- It is used to combine the data items of two sorted files into

single file in the sorted form.


Applications of data structures

 To search and find a specified target value, such as maximum, minimum,

mean, median, average, count from a group of data objects the arrays are

very efficient.

 A stack can be used as an "undo" mechanism in text editors; this

operation has been accomplished by keeping all text changes in a stack.

 Queues are used for CPU job scheduling and in disk scheduling

 Linked lists are used in Symbol Tables for balancing parenthesis and in

representing Sparse Matrix.

 An operating system maintains a disk's file system as a tree, where file

folders act as tree nodes. The tree structure is useful because it easily

accommodates the creation and deletion of folders and files.

 Shortest Path algorithm is the commonly used algorithm for determining

the shortest path between any two points. This is practically used in

applications such as network stations, computer games, maps, satellite

navigation system etc.

Algorithms

Algorithm is a step-by-step procedure, which defines a set of instructions to be

executed in a certain order to get the desired output. Algorithms are generally
created independent of underlying languages, i.e. an algorithm can be

implemented in more than one programming language.

From the data structure point of view, following are some important categories

of algorithms −

 Search − Algorithm to search an item in a data structure.

 Sort − Algorithm to sort items in a certain order.

 Insert − Algorithm to insert item in a data structure.

 Update − Algorithm to update an existing item in a data structure.

 Delete − Algorithm to delete an existing item from a data structure.

Characteristics of an Algorithm

Not all procedures can be called an algorithm. An algorithm should have the

following characteristics −

 Unambiguous − Algorithm should be clear and unambiguous. Each of

its steps (or phases), and their inputs/outputs should be clear and must

lead to only one meaning.

 Input − An algorithm should have 0 or more well-defined inputs.

 Output − An algorithm should have 1 or more well-defined outputs, and

should match the desired output.

 Finiteness − Algorithms must terminate after a finite number of steps.


 Feasibility − Should be feasible with the available resources.

 Independent − An algorithm should have step-by-step directions, which

should be independent of any programming code.

How to Write an Algorithm?

There are no well-defined standards for writing algorithms. Rather, it is

problem and resource dependent. Algorithms are never written to support a

particular programming code.

As we know that all programming languages share basic code constructs like

loops (do, for, while), flow-control (if-else), etc. These common constructs can

be used to write an algorithm.

Example

Let's try to learn algorithm-writing by using an example.

Problem − Design an algorithm to add two numbers and display the result.

Step 1 − START

Step 2 − declare three integers a, b & c

Step 3 − define values of a & b

Step 4 − add values of a & b

Step 5 − store output of step 4 to c

Step 6 − print c

Step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the

algorithm can be written as −

Step 1 − START ADD

Step 2 − get values of a & b

Step 3 − c ← a + b

Step 4 − display c

Step 5 − STOP

In design and analysis of algorithms, usually the second method is used to

describe an algorithm. It makes it easy for the analyst to analyze the algorithm

ignoring all unwanted definitions. He can observe what operations are being

used and how the process is flowing.

Writing step numbers, is optional.

We design an algorithm to get a solution of a given problem. A problem can be

solved in more than one ways.


Algorithm Complexity

Suppose X is an algorithm and n is the size of input data, the time and space

used by the algorithm X are the two main factors, which decide the efficiency

of X.

 Time Factor − Time is measured by counting the number of key

operations such as comparisons in the sorting algorithm.

 Space Factor − Space is measured by counting the maximum memory

space required by the algorithm.

The complexity of an algorithm f(n) gives the running time and/or the storage

space required by the algorithm in terms of n as the size of input data.

Space Complexity

Space complexity of an algorithm represents the amount of memory space

required by the algorithm in its life cycle. The space required by an algorithm

is equal to the sum of the following two components −

 A fixed part that is a space required to store certain data and variables,

that are independent of the size of the problem. For example, simple

variables and constants used, program size, etc.

 A variable part is a space required by variables, whose size depends on

the size of the problem. For example, dynamic memory allocation,

recursion stack space, etc.


Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the

fixed part and S(I) is the variable part of the algorithm, which depends on

instance characteristic I. Following is a simple example that tries to explain the

concept −

Algorithm: SUM(A, B)

Step 1 - START

Step 2 - C ← A + B + 10

Step 3 - Stop

Here we have three variables A, B, and C and one constant. Hence S(P) = 1 +

3. Now, space depends on data types of given variables and constant types and

it will be multiplied accordingly.

Time Complexity

Time complexity of an algorithm represents the amount of time required by the

algorithm to run to completion. Time requirements can be defined as a

numerical function T(n), where T(n) can be measured as the number of steps,

provided each step consumes constant time.

Time-Memory Tradeoff

In computer science, a space-time or time-memory tradeoff is a way of solving

a problem or calculation in less time by using more storage space (or memory),

or by solving a problem in very little space by spending a long time. Most


computers have a large amount of space, but not infinite space. Also, most

people are willing to wait a little while for a big calculation, but not forever. So

if your problem is taking a long time but not much memory, a space-time

tradeoff would let you use more memory and solve the problem more quickly.

Or, if it could be solved very quickly but requires more memory than you have,

you can try to spend more time solving the problem in the limited memory.

Big O Notation

Big O notation is the language we use for talking about how long an algorithm

takes to run. It's how we compare the efficiency of different approaches to a

problem. It's like math except it's an awesome, not-boring kind of math.

With big O notation we express the runtime in terms of—brace yourself—how

quickly it grows relative to the input, as the input gets arbitrarily large.

Big O notation is used to classify algorithms according to how their run time or

space requirements grow as the input size grows.

Strings

A string in C is an array of characters. The length of a string is determined by a


terminating null character: '\0'. So, a string with the contents, say, "abc" has
four characters: 'a', 'b', 'c', and the terminating null ( '\0') character.
The terminating null character has the value zero.
Rithun
String operations

Function Work of Function

strlen() computes string's length

strcpy() copies a string to another

strcat() concatenates(joins) two strings

strcmp() compares two strings

strlwr() converts string to lowercase

strupr() converts string to uppercase

Strings handling functions are defined under "string.h" header file.

#include <string.h>

UNIT II
Array

Array is a container which can hold a fixed number of items and these items
should be of the same type. Most of the data structures make use of arrays to
implement their algorithms. Following are the important terms to understand
the concept of Array.

 Element − Each item stored in an array is called an element.


 Index − Each location of an element in an array has a numerical index,
which is used to identify the element.

Representation of linear array in memory

Arrays can be declared in various ways in different languages. For illustration,


let's take C array declaration.

As per the above illustration, following are the important points to be


considered.

 Index starts with 0.

 Array length is 10 which means it can store 10 elements.

 Each element can be accessed via its index. For example, we can fetch an
element at index 6 as 27.

Basic Operations

Following are the basic operations supported by an array.

 Traverse − print all the array elements one by one.

 Insertion − Adds an element at the given index.

 Deletion − Deletes an element at the given index.

 Search − Searches an element using the given index or by the value.


Traverse Operation
This operation is to traverse through the elements of an array.

Example
Following program traverses and prints the elements of an array:

#include <stdio.h>

int main()

int age[5]={20,21,19,18,25};

int i;

for(i=0;i<5;i++)

printf("%d ", age[i]);

return 0;

When we compile and execute the above program, it produces the following
result −
Output
20 21 19 18 25

Insertion operation
Given an array arr of size n, this article tells how to insert an element x in this
array arr at a specific position pos.
Approach:
Here’s how to do it.
1. First get the element to be inserted, say x
2. Then get the position at which this element is to be inserted, say pos
3. Then shift the array elements from this position to one position forward,
and do this for all the other elements next to pos.
4. Insert the element x now at the position pos, as this is now empty.

Deletion operation

A user will enter the position at which the array element deletion is required.
Deleting an element does not affect the size of the array. It also checks whether
deletion is possible or not, for example, if an array contains five elements and
user wants to delete the element at the sixth position, it isn't possible.

we need to shift array elements which are after the element to be deleted, it is
very inefficient if the size of the array is large or we need to remove elements
from an array repeatedly. In linked list data structure shifting isn't required only
pointers are adjusted. If frequent deletion is required and the number of
elements is large, it is recommended to use a linked list.
Multi-Dimensional Array

When the number of dimensions specified is more than one, then it is called as a
multi-dimensional array. Multidimensional arrays include 2D arrays and 3D
arrays.

A two-dimensional array will be accessed by using the subscript of row and


column index. For traversing the two-dimensional array, the value of the rows
and columns will be considered. In the two-dimensional array face [3] [4], the
first index specifies the number of rows and the second index specifies the
number of columns and the array can hold 12 elements (3 * 4).

Similarly, in a three-dimensional array, there will be three dimensions.


The array face [5] [10] [15] can hold 750 elements (5 * 10 * 15).

Declaration/Initialization of Arrays

// A sample program for Array Declaration


#include <stdio.h>
int main()
{
int one_dim [10]; # declaration of 1D array
int two_dim [2][2]; #declaration of 2D array
int three_dim [2][3][4] = {
{ {3, 4, 2, 3}, {0, -3, 9, 11}, {23, 12, 23, 2} },
{ {13, 4, 56, 3}, {5, 9, 3, 5}, {3, 1, 4, 9} }
}; #declaration of 3D array. Here the elements are also defined.
return 0;
}

Parallel arrays

A parallel array is a structure that contains multiple arrays. Each of these arrays
are of the same size and the array elements are related to each other. All the
elements in a parallel array represent a common entity.
An example of parallel arrays is as follows −

employee_name = { Harry, Sally, Mark, Frank, Judy }


employee_salary = {10000, 5000, 20000, 12000, 5000}
In the above example, the name and salary of 5 different employees is stored in
2 arrays.

Sparse Matrix
In computer programming, a matrix can be defined with a 2-dimensional array.
Any array with 'm' columns and 'n' rows represent a m X n matrix. There may
be a situation in which a matrix contains more number of ZERO values than
NON-ZERO values. Such matrix is known as sparse matrix.

Sparse matrix is a matrix which contains very few non-zero elements.

When a sparse matrix is represented with a 2-dimensional array, we waste a lot


of space to represent that matrix. For example, consider a matrix of size 100 X
100 containing only 10 non-zero elements. In this matrix, only 10 spaces are
filled with non-zero values and remaining spaces of the matrix are filled with
zero. That means, totally we allocate 100 X 100 X 2 = 20000 bytes of space to
store this integer matrix. And to access these 10 non-zero elements we have to
make scanning for 10000 times. To make it simple we use the following sparse
matrix representation.

Sparse Matrix Representations

A sparse matrix can be represented by using Triplet Representation (Array


Representation)

Triplet Representation (Array Representation)

In this representation, we consider only non-zero values along with their row
and column index values. In this representation, the 0 th row stores the total
number of rows, total number of columns and the total number of non-zero
values in the sparse matrix.

For example, consider a matrix of size 5 X 6 containing 6 number of non-zero


values. This matrix can be represented as shown in the image...

In above example matrix, there are only 6 non-zero elements ( those are 9, 8, 4,
2, 5 & 2) and matrix size is 5 X 6. We represent this matrix as shown in the
above image. Here the first row in the right side table is filled with values 5, 6 &
6 which indicates that it is a sparse matrix with 5 rows, 6 columns & 6 non-zero
values. The second row is filled with 0, 4, & 9 which indicates the non-zero
value 9 is at the 0th-row 4th column in the Sparse matrix. In the same way, the
remaining non-zero values also follow a similar pattern.

Linked List
A linked list is a linear data structure, in which the elements are not stored at

contiguous memory locations. The elements in a linked list are linked using pointers

as shown in the below image:

In simple words, a linked list consists of nodes where each node contains a data field

and a reference(link) to the next node in the list.

Array Vs Linked List

Array Linked List

Linked List is an ordered collection of


An array is a collection of elements of a elements of the same type in which each
similar data type. element is connected to the next using
pointers.

Random accessing is not possible in linked


Array elements can be accessed randomly
lists. The elements will have to be accessed
using the array index.
sequentially.

New elements can be stored anywhere and a


Data elements are stored in contiguous
reference is created for the new element using
locations in memory.
pointers.

Insertion and Deletion operations are


Insertion and Deletion operations are fast and
costlier since the memory locations are
easy in a linked list.
consecutive and fixed.
Memory is allocated during the compile time Memory is allocated during the run-time
(Static memory allocation). (Dynamic memory allocation).

Size of the array must be specified at the Size of a Linked list grows/shrinks as and
time of array declaration/initialization. when new elements are inserted/deleted.

How can you represent Linear Linked List in memory

Memory Representation of Linear Linked List:

Let LIST is linear linked list. It needs two linear arrays for memory representation. Let these linear arrays
are INFO and LINK. INFO[K] contains the information part and LINK[K] contains the next pointer field of
node K. A variable START is used to store the location of the beginning of the LIST and NULL is used as
next pointer sentinel which indicates the end of LIST. It is shown below:

Traversing in singly linked list


Traversing is the most common operation that is performed in almost every scenario of
singly linked list. Traversing means visiting each node of the list once in order to perform
some operation on that. This will be done by using the following statements.

ptr = head;
while (ptr!=NULL)
{
ptr = ptr -> next;
}

Algorithm
o STEP 1: SET PTR = HEAD
o STEP 2: IF PTR = NULL

WRITE "EMPTY LIST"


GOTO STEP 7
END OF IF

o STEP 4: REPEAT STEP 5 AND 6 UNTIL PTR != NULL


o STEP 5: PRINT PTR→ DATA
o STEP 6: PTR = PTR → NEXT

[END OF LOOP]

o STEP 7: EXIT

Insertion Operation
Adding a new node in linked list is a more than one step activity. We shall learn this
with diagrams here. First, create a node using the same structure and find the
location where it has to be inserted.

Imagine that we are inserting a node B (NewNode), between A (LeftNode)


and C (RightNode). Then point B.next to C −
NewNode.next −> RightNode;
It should look like this −

Now, the next node at the left should point to the new node.
LeftNode.next −> NewNode;

This will put the new node in the middle of the two. The new list should look like this

Similar steps should be taken if the node is being inserted at the beginning of the
list. While inserting it at the end, the second last node of the list should point to the
new node and the new node will point to NULL.

Deletion Operation
Deletion is also a more than one step process. We shall learn with pictorial
representation. First, locate the target node to be removed, by using searching
algorithms.
The left (previous) node of the target node now should point to the next node of the
target node −
LeftNode.next −> TargetNode.next;

This will remove the link that was pointing to the target node. Now, using the
following code, we will remove what the target node is pointing at.
TargetNode.next −> NULL;

We need to use the deleted node. We can keep that in memory otherwise we can
simply deallocate memory and wipe off the target node completely.

Searching in Linked List


Sequential search is the most common search used on linked list structures.

Algorithm
Step-1: Initialise the Current pointer with the beginning of the List.

Step-2: Compare the KEY value with the Current node value;

if they match then quit there

else go to step-3.

Step-3: Move the Current pointer to point to the next node in the list and go
to step-2, till the list is not over or else quit

Header Linked List


A Header linked list is one more variant of linked list. In
Header linked list, we have a special node present at the
beginning of the linked list. This special node is used to store
number of nodes present in the linked list. In other linked list
variant, if we want to know the size of the linked list we use
traversal method. But in Header linked list, the size of the
linked list is stored in its header itself.

Two Way Linked List (Doubly Linked List)


Doubly Linked List is a variation of Linked list in which navigation is possible in both
ways, either forward and backward easily as compared to Single Linked List.
Following are the important terms to understand the concept of doubly linked list.
 Link − Each link of a linked list can store a data called an element.
 Next − Each link of a linked list contains a link to the next link called Next.
 Prev − Each link of a linked list contains a link to the previous link called
Prev.
 LinkedList − A Linked List contains the connection link to the first link called
First and to the last link called Last.

Doubly Linked List Representation

As per the above illustration, following are the important points to be considered.
 Doubly Linked List contains a link element called first and last.
 Each link carries a data field(s) and two link fields called next and prev.
 Each link is linked with its next link using its next link.
 Each link is linked with its previous link using its previous link.
 The last link carries a link as null to mark the end of the list.

Basic Operations
Following are the basic operations supported by a list.
 Insertion − Adds an element at the beginning of the list.
 Deletion − Deletes an element at the beginning of the list.
 Insert Last − Adds an element at the end of the list.
 Delete Last − Deletes an element from the end of the list.
 Insert After − Adds an element after an item of the list.
 Delete − Deletes an element from the list using the key.
 Display forward − Displays the complete list in a forward manner.
 Display backward − Displays the complete list in a backward manner.

Circular Linked List


Circular Linked List is a variation of Linked list in which the first element points to
the last element and the last element points to the first element. Both Singly Linked
List and Doubly Linked List can be made into a circular linked list.

Singly Linked List as Circular


In singly linked list, the next pointer of the last node points to the first node.

Doubly Linked List as Circular


In doubly linked list, the next pointer of the last node points to the first node and the
previous pointer of the first node points to the last node making the circular in both
directions.

As per the above illustration, following are the important points to be considered.
 The last link's next points to the first link of the list in both cases of singly as
well as doubly linked list.
 The first link's previous points to the last of the list in case of doubly linked
list.

Basic Operations
Following are the important operations supported by a circular list.
 insert − Inserts an element at the start of the list.
 delete − Deletes an element from the start of the list.
 display − Displays the list.
Applications of linked list
1. Implementation of stacks and queues
2. Implementation of graphs : Adjacency list representation of graphs is most
popular which is uses linked list to store adjacent vertices.
3. Dynamic memory allocation : We use linked list of free blocks.
4. Maintaining directory of names
5. Performing arithmetic operations on long integers
6. Manipulation of polynomials by storing constants in the node of linked list
7. representing sparse matrices
8. Image viewer – Previous and next images are linked, hence can be
accessed by next and previous button.
9. Previous and next page in web browser – We can access previous and next
url searched in web browser by pressing back and next button since, they
are linked as linked list.
10. Music Player – Songs in music player are linked to previous and next song.
you can play songs either from starting or ending of the list.

Algorithm of insertion and deletion in Singly Linked


List (SLL)
Insertion
In a single linked list, the insertion operation can be performed in three ways. They are as
follows...

1. Inserting At Beginning of the list


2. Inserting At End of the list
3. Inserting At Specific location in the list

Inserting At Beginning of the list


We can use the following steps to insert a new node at beginning of the single linked list...

 Step 1 - Create a newNode with given value.


 Step 2 - Check whether list is Empty (head == NULL)
 Step 3 - If it is Empty then, set newNode→next = NULL and head = newNode.
 Step 4 - If it is Not Empty then, set newNode→next = head and head = newNode.

Inserting At End of the list


We can use the following steps to insert a new node at end of the single linked list...

 Step 1 - Create a newNode with given value and newNode → next as NULL.
 Step 2 - Check whether list is Empty (head == NULL).
 Step 3 - If it is Empty then, set head = newNode.
 Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
 Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
 Step 6 - Set temp → next = newNode.

Inserting At Specific location in the list (After a


Node)
We can use the following steps to insert a new node after a node in the single linked list...

 Step 1 - Create a newNode with given value.


 Step 2 - Check whether list is Empty (head == NULL)
 Step 3 - If it is Empty then, set newNode → next = NULL and head = newNode.
 Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head.
 Step 5 - Keep moving the temp to its next node until it reaches to the node after which
we want to insert the newNode (until temp1 → data is equal to location, here location is
the node value after which we want to insert the newNode).
 Step 6 - Every time check whether temp is reached to last node or not. If it is reached to
last node then display 'Given node is not found in the list!!! Insertion not
possible!!!' and terminate the function. Otherwise move the temp to next node.
 Step 7 - Finally, Set 'newNode → next = temp → next' and 'temp → next = newNode'

Deletion
In a single linked list, the deletion operation can be performed in three ways. They are as
follows...

1. Deleting from Beginning of the list


2. Deleting from End of the list
3. Deleting a Specific Node

Deleting from Beginning of the list


We can use the following steps to delete a node from beginning of the single linked list...

 Step 1 - Check whether list is Empty (head == NULL)


 Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
 Step 3 - If it is Not Empty then, define a Node pointer 'temp' and initialize with head.
 Step 4 - Check whether list is having only one node (temp → next == NULL)
 Step 5 - If it is TRUE then set head = NULL and delete temp (Setting Empty list
conditions)
 Step 6 - If it is FALSE then set head = temp → next, and delete temp.

Deleting from End of the list


We can use the following steps to delete a node from end of the single linked list...

 Step 1 - Check whether list is Empty (head == NULL)


 Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
 Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and
initialize 'temp1' with head.
 Step 4 - Check whether list has only one Node (temp1 → next == NULL)
 Step 5 - If it is TRUE. Then, set head = NULL and delete temp1. And terminate the
function. (Setting Empty list condition)
 Step 6 - If it is FALSE. Then, set 'temp2 = temp1 ' and move temp1 to its next node.
Repeat the same until it reaches to the last node in the list. (until temp1 →
next == NULL)
 Step 7 - Finally, Set temp2 → next = NULL and delete temp1.

Deleting a Specific Node from the list


We can use the following steps to delete a specific node from the single linked list...

 Step 1 - Check whether list is Empty (head == NULL)


 Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
 Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and
initialize 'temp1' with head.
 Step 4 - Keep moving the temp1 until it reaches to the exact node to be deleted or to the
last node. And every time set 'temp2 = temp1' before moving the 'temp1' to its next
node.
 Step 5 - If it is reached to the last node then display 'Given node not found in the list!
Deletion not possible!!!'. And terminate the function.
 Step 6 - If it is reached to the exact node which we want to delete, then check whether
list is having only one node or not
 Step 7 - If list has only one node and that is the node to be deleted, then
set head = NULL and delete temp1 (free(temp1)).
 Step 8 - If list contains multiple nodes, then check whether temp1 is the first node in the
list (temp1 == head).
 Step 9 - If temp1 is the first node then move the head to the next node (head = head →
next) and delete temp1.
 Step 10 - If temp1 is not first node then check whether it is last node in the list (temp1 →
next == NULL).
 Step 11 - If temp1 is last node then set temp2 → next = NULL and
delete temp1 (free(temp1)).
 Step 12 - If temp1 is not first node and not last node then set temp2 → next = temp1 →
next and delete temp1 (free(temp1)).

UNIT III

Stack Data Structure


Stack is a linear data structure which follows a particular order in which the
operations are performed. The order may be LIFO(Last In First Out) or FILO(First In
Last Out).
There are many real-life examples of a stack. Consider an example of plates stacked
over one another in the canteen. The plate which is at the top is the first one to be
removed, i.e. the plate which has been placed at the bottommost position remains in
the stack for the longest period of time. So, it can be simply seen to follow LIFO(Last
In First Out)/FILO(First In Last Out) order.
This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here,
the element which is placed (inserted or added) last, is accessed first. In stack
terminology, insertion operation is called PUSH operation and removal operation is
called POP operation.
The following diagram depicts a stack and its operations −

A stack can be implemented by means of Array, Structure, Pointer, and Linked List.
Stack can either be a fixed size one or it may have a sense of dynamic resizing.
Here, we are going to implement stack using arrays, which makes it a fixed size
stack implementation.
Primitive operations on stack
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basic stuffs, a stack is used for the following two primary
operations −
 push() − Pushing (storing) an element on the stack.
 pop() − Removing (accessing) an element from the stack.
When data is PUSHed onto stack.
To use a stack efficiently, we need to check the status of stack as well. For the
same purpose, the following functionality is added to stacks −
 peek() − get the top data element of the stack, without removing it.
 isFull() − check if stack is full.
 isEmpty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack. As this
pointer always represents the top of the stack, hence named top. The top pointer
provides top value of the stack without actually removing it.

Stack Push & Pop


When you add an item (often called an element) to the top of a stack, that's called a push. Push
is a good word for this, because the new item pushes all the old ones down. If you've ever
worked in a cafeteria or restaurant kitchen, spring-loaded shafts for plates are super common.
The weight of the plates on top push the lower plates down.
When you remove an item from a stack, that's called a pop. In a cafeteria, when you remove the
topmost plate, the ones beneath it pop up because of the spring beneath them. Well, a computer
stack is kind of like that.

Stack Underflow & Overflow


Now that we have a basic picture in mind of what a stack conceptually looks like, we can define
what underflow and overflow are. Stack underflow happens when we try to pop (remove) an
item from the stack, when nothing is actually there to remove. This will raise an alarm of sorts in
the computer because we told it to do something that cannot be done.
Stack overflow happens when we try to push one more item onto our stack than it can actually
hold. You see, the stack usually can hold only so much stuff. Typically, we allocate (set aside)
where the stack is going to be in memory and how big it can get. So, when we stick too much
stuff there or try to remove nothing, we will generate a stack overflow condition or stack
underflow condition, respectively.

Algorithms for push and pop


Push Operation
The process of putting a new data element onto stack is known as a Push
Operation. Push operation involves a series of steps −
 Step 1 − Checks if the stack is full.
 Step 2 − If the stack is full, produces an error and exit.
 Step 3 − If the stack is not full, increments top to point next empty space.
 Step 4 − Adds data element to the stack location, where top is pointing.
 Step 5 − Returns success.

If the linked list is used to implement the stack, then in step 3, we need to allocate
space dynamically.
Algorithm for PUSH Operation
A simple algorithm for Push operation can be derived as follows −
begin procedure push: stack, data

if stack is full
return null
endif

top ← top + 1
stack[top] ← data

end procedure

Pop Operation
Accessing the content while removing it from the stack, is known as a Pop
Operation. In an array implementation of pop() operation, the data element is not
actually removed, instead top is decremented to a lower position in the stack to
point to the next value. But in linked-list implementation, pop() actually removes
data element and deallocates memory space.
A Pop operation may involve the following steps −
 Step 1 − Checks if the stack is empty.
 Step 2 − If the stack is empty, produces an error and exit.
 Step 3 − If the stack is not empty, accesses the data element at which top is
pointing.
 Step 4 − Decreases the value of top by 1.
 Step 5 − Returns success.

Algorithm for Pop Operation


A simple algorithm for Pop operation can be derived as follows −
begin procedure pop: stack

if stack is empty
return null
endif

data ← stack[top]
top ← top - 1
return data

end procedure

Array implementation of Stack


In array implementation, the stack is formed by using the array. All the operations
regarding the stack are performed using arrays. Lets see how each operation can be
implemented on the stack using array data structure.

Adding an element onto the stack (push operation)


Adding an element into the top of the stack is referred to as push operation. Push
operation involves following two steps.

1. Increment the variable Top so that it can now refere to the next memory location.
2. Add element at the position of incremented top. This is referred to as adding new
element at the top of the stack.
Stack is overflown when we try to insert an element into a completely filled stack
therefore, our main function must always avoid stack overflow condition.

Algorithm:

begin
if top = n then stack full
top = top + 1
stack (top) : = item;
end

Time Complexity : o(1)

implementation of push algorithm in C language


void push (int val,int n) //n is size of the stack
{
if (top == n )
printf("\n Overflow");
else
{
top = top +1;
stack[top] = val;
}
}

Deletion of an element from a stack (Pop operation)


Deletion of an element from the top of the stack is called pop operation. The value of the
variable top will be incremented by 1 whenever an item is deleted from the stack. The
top most element of the stack is stored in an another variable and then the top is
decremented by 1. the operation returns the deleted value that was stored in another
variable as the result.

The underflow condition occurs when we try to delete an element from an already empty
stack.

Algorithm :

begin
if top = 0 then stack empty;
item := stack(top);
top = top - 1;
end;

Time Complexity : o(1)

Implementation of POP algorithm using C language


int pop ()
{
if(top == -1)
{
printf("Underflow");
return 0;
}
else
{
return stack[top - - ];
}
}

Linked List implementation of Stack


Linked List-based implementation provides the best (from the efficiency point
of view) dynamic stack implementation.

Stack applications
1. Stacks can be used for expression evaluation.
2. Stacks can be used to check parenthesis matching in an expression.
3. Stacks can be used for Conversion from one form of expression to
another.
4. Stacks can be used for Memory Management.
5. Stack data structures are used in backtracking problems.

Polish Notation

Polish notation is a notation form for expressing arithmetic, logic and algebraic
equations. Its most basic distinguishing feature is that operators are placed on
the left of their operands. If the operator has a defined fixed number of
operands, the syntax does not require brackets or parenthesis to lessen
ambiguity.
Polish notation is also known as prefix notation, prefix Polish notation, normal
Polish notation, Warsaw notation and Lukasiewicz notation.
Polish notation was invented in 1924 by Jan Lukasiewicz, a Polish logician and
philosopher, in order to simplify sentential logic. The idea is simply to have a
parenthesis-free notation that makes each equation shorter and easier to parse in
terms of defining the evaluation priority of the operators.
Example:
Infix notation with parenthesis: (3 + 2) * (5 – 1)
Polish notation: * + 3 2 – 5 1
Recursion

 "Recursion" is technique of solving any problem by calling same


function again and again until some breaking (base) condition where
recursion stops and it starts calculating the solution from there on. For
eg. calculating factorial of a given number
 Thus in recursion last function called needs to be completed first.
 Now Stack is a LIFO data structure i.e. ( Last In First Out) and hence
it is used to implement recursion.
 The High level Programming languages, such as Pascal , C etc. that
provides support for recursion use stack for book keeping.

Queue
A Queue is a linear structure which follows a particular order in which the operations
are performed. The order is First In First Out (FIFO). A good example of a queue is
any queue of consumers for a resource where the consumer that came first is served
first. The difference between stacks and queues is in removing. In a stack we
remove the item the most recently added; in a queue, we remove the item the least
recently added.

Queue Representation
As we now understand that in queue, we access both ends for different reasons.
The following diagram given below tries to explain queue representation as data
structure −

As in stacks, a queue can also be implemented using Arrays, Linked-lists, Pointers


and Structures. For the sake of simplicity, we shall implement queues using one-
dimensional array.

Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
 enqueue() − add (store) an item to the queue.
 dequeue() − remove (access) an item from the queue.
Few more functions are required to make the above-mentioned queue operation
efficient. These are −
 peek() − Gets the element at the front of the queue without removing it.
 isfull() − Checks if the queue is full.
 isempty() − Checks if the queue is empty.
In queue, we always dequeue (or access) data, pointed by front pointer and while
enqueing (or storing) data in the queue we take help of rear pointer.

Circular Queue
Circular Queue is a linear data structure in which the operations are performed
based on FIFO (First In First Out) principle and the last position is connected back to
the first position to make a circle. It is also called ‘Ring Buffer’.
In a normal Queue, we can insert elements until queue becomes full. But once
queue becomes full, we can not insert the next element even if there is a space in
front of queue.

Operations on Circular Queue:


 Front: Get the front item from queue.
 Rear: Get the last item from queue.
 enQueue(value) This function is used to insert an element into the circular
queue. In a circular queue, the new element is always inserted at Rear
position.

Priority Queue
Priority Queue is an extension of queue with following properties.
1. Every item has a priority associated with it.
2. An element with high priority is dequeued before an element with low priority.
3. If two elements have the same priority, they are served according to their order in the
queue.

Memory Representation of Queues


Like Stacks, Queues can also be represented in memory in two ways.

 Using the contiguous memory like an array


 Using the non-contiguous memory like a linked list

Using the Contiguous Memory like an Array


In this representation the Queue is implemented using the array. Variables used in this case
are

 QUEUE- the name of the array storing queue elements.


 FRONT- the index where the first element is stored in the array representing the queue.
 REAR- the index where the last element is stored in array representing the queue.
 MAX- defining that how many elements (maximum count) can be stored in the array
representing the queue.

Using the Non-Contiguous Memory like a Linked List


In this representation the queue is implemented using the dynamic data structure Linked List.
Using linked list for creating a queue makes it flexible in terms of size and storage. You don’t
have to define the maximum number of elements in the queue.

Pointers (links) to store addresses of nodes for defining a queue are.

 FRONT- address of the first element of the Linked list storing the Queue.
 REAR- address of the last element of the Linked list storing the Queue.
Queue Applications
Queues are used in various applications in Computer Science-

 Job scheduling tasks of CPU.


 Printer’s Buffer to store printing commands initiated by a user.
 Input commands sent to CPU by devices like Keyboard and Mouse.
 Document downloading from internet.
 User Requests for call center services.
 Order Queue for Online Food Delivery Chains.
 Online Cab Booking applications.

Algorithm on insertion and deletion in simple queue

Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, its operations are
comparatively difficult to implement than that of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
 Step 1 − Check if the queue is full.
 Step 2 − If the queue is full, produce overflow error and exit.
 Step 3 − If the queue is not full, increment rear pointer to point the next
empty space.
 Step 4 − Add data element to the queue location, where the rear is pointing.
 Step 5 − return success.
Sometimes, we also check to see if a queue is initialized or not, to handle any
unforeseen situations.
Algorithm for enqueue operation
procedure enqueue(data)

if queue is full
return overflow
endif

rear ← rear + 1
queue[rear] ← data
return true

end procedure

Implementation of enqueue() in C programming language −


Example
int enqueue(int data)
if(isfull())
return 0;

rear = rear + 1;
queue[rear] = data;

return 1;
end procedure

Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data
where front is pointing and remove the data after access. The following steps are
taken to perform dequeue operation −
 Step 1 − Check if the queue is empty.
 Step 2 − If the queue is empty, produce underflow error and exit.
 Step 3 − If the queue is not empty, access the data where front is pointing.
 Step 4 − Increment front pointer to point to the next available data element.
 Step 5 − Return success.

Algorithm for dequeue operation


procedure dequeue

if queue is empty
return underflow
end if

data = queue[front]
front ← front + 1
return true

end procedure

Implementation of dequeue() in C programming language −


Example
int dequeue() {
if(isempty())
return 0;

int data = queue[front];


front = front + 1;

return data;
}

Algorithm to insert an element in circular queue


o Step 1: IF (REAR+1)%MAX = FRONT
Write " OVERFLOW "
Goto step 4
[End OF IF]
o Step 2: IF FRONT = -1 and REAR = -1
SET FRONT = REAR = 0
ELSE IF REAR = MAX - 1 and FRONT ! = 0
SET REAR = 0
ELSE
SET REAR = (REAR + 1) % MAX
[END OF IF]
o Step 3: SET QUEUE[REAR] = VAL
o Step 4: EXIT

Algorithm to delete an element from a circular


queue
To delete an element from the circular queue, we must check for the three
following conditions.

1. If front = -1, then there are no elements in the queue and therefore this
will be the case of an underflow condition.
2. If there is only one element in the queue, in this case, the condition rear =
front holds and therefore, both are set to -1 and the queue is deleted
completely.
3. If front = max -1 then, the value is deleted from the front end the value of
front is set to 0.
4. Otherwise, the value of front is incremented by 1 and then delete the
element at the front end.

Algorithm
o Step 1: IF FRONT = -1
Write " UNDERFLOW "
Goto Step 4
[END of IF]
o Step 2: SET VAL = QUEUE[FRONT]
o Step 3: IF FRONT = REAR
SET FRONT = REAR = -1
ELSE
IF FRONT = MAX -1
SET FRONT = 0
ELSE
SET FRONT = FRONT + 1
[END of IF]
[END OF IF]
o Step 4: EXIT

UNIT IV
Tree
What are trees?

 Tree is a hierarchical data structure which stores the information naturally


in the form of hierarchy style.
 Tree is one of the most powerful and advanced data structures.
 It is a non-linear data structure compared to arrays, linked lists, stack and
queue.
 It represents the nodes connected by edges.

The above figure represents structure of a tree. Tree has 2 subtrees. A is a parent
of B and C.

B is called a child of A and also parent of D, E, F.


Tree is a collection of elements called Nodes, where each node can have
arbitrary number of children.

Field Description

Root Root is a special node in a tree. The entire tree is referenced through it. It does not have a
parent.

Parent Node Parent node is an immediate predecessor of a node.

Child Node All immediate successors of a node are its children.

Siblings Nodes with the same parent are called Siblings.

Path Path is a number of successive edges from source node to destination node.

Height of Height of a node represents the number of edges on the longest path between that node
Node and a leaf.

Height of Height of tree represents the height of its root node.


Tree

Depth of Depth of a node represents the number of edges from the tree's root node to the node.
Node

Degree of Degree of a node represents a number of children of a node.


Node

Edge Edge is a connection between one node to another. It is a line between two nodes or a node and
a leaf.

In the above figure, D, F, H, G are leaves. B and C are siblings. Each node excluding a root is
connected by a direct edge from exactly one other node

parent → children.

Levels of a node
Levels of a node represents the number of connections between the node and the root. It represents generation
of a node. If the root node is at level 0, its next node is at level 1, its grand child is at level 2 and so on. Levels
of a node can be shown as follows:
- If node has no children, it is called Leaves or External Nodes.

- Nodes which are not leaves, are called Internal Nodes. Internal nodes have at
least one child.

- A tree can be empty with no nodes or a tree consists of one node called the Root.

Height of a Node

As we studied, height of a node is a number of edges on the longest path between that node and a leaf. Each
node has height.

In the above figure, A, B, C, D can have height. Leaf cannot have height as there will be no path starting from a
leaf. Node A's height is the number of edges of the path to K not to D. And its height is 3.

Note:

- Height of a node defines the longest path from the node to a leaf.
- Path can only be downward.

Depth of a Node

While talking about the height, it locates a node at bottom where for depth, it is located at top
which is root level and therefore we call it depth of a node.

In the above figure, Node G's depth is 2. In depth of a node, we just count how many edges between the
targeting node & the root and ignoring the directions.

Note: Depth of the root is 0.

Advantages of Tree

 Tree reflects structural relationships in the data.


 It is used to represent hierarchies.
 It provides an efficient insertion and searching operations.
 Trees are flexible. It allows to move subtrees around with minimum effort.

Tree Representation

Left-Child Right-Sibling Representation of Tree

It is a different representation of an n-ary tree where instead of holding a reference


to each and every child node, a node holds just two references, first a reference to
it’s first child, and the other to it’s immediate next sibling. This new transformation
not only removes the need of advance knowledge of the number of children a node
has, but also limits the number of references to a maximum of two, thereby making
it so much easier to code.
Binary trees

A tree whose elements have at most 2 children is called a binary tree. Since each
element in a binary tree can have only 2 children, we typically name them the left
and right child.

Operations on binary trees


Searching: For searching element 2, we have to traverse all elements (assuming we do breadth first
traversal). Therefore, searching in binary tree has worst case complexity of O(n).

Insertion: For inserting element as left child of 2, we have to traverse all elements. Therefore,
insertion in binary tree has worst case complexity of O(n).

Deletion: For deletion of element 2, we have to traverse all elements to find 2 (assuming we do
breadth first traversal). Therefore, deletion in binary tree has worst case complexity of O(n).

Traversal of binary trees


Tree traversal is the process of visiting each node in the tree exactly once. Visiting each
node in a graph should be done in a systematic manner. If search result in a visit to all the
vertices, it is called a traversal. There are basically three traversal techniques for a
binary tree that are

1. Preorder traversal
2. Inorder traversal
3. Postorder traversal
In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later
the right sub-tree. We should always remember that every node may represent
a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values
in an ascending order.

We start from A, and following in-order traversal, we move to its left


subtree B. B is also traversed in-order. The process goes on until all the nodes
are visited. The output of inorder traversal of this tree will be −
D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and
finally the right subtree.

We start from A, and following pre-order traversal, we first visit A itself and
then move to its left subtree B. B is also traversed pre-order. The process goes
on until all the nodes are visited. The output of pre-order traversal of this tree
will be −
A→B→D→E→C→F→G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we
traverse the left subtree, then the right subtree and finally the root node.
We start from A, and following Post-order traversal, we first visit the left
subtree B. B is also traversed post-order. The process goes on until all the
nodes are visited. The output of post-order traversal of this tree will be −
D→E→B→F→G→C→A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.

Linked List Representation of Binary Tree


We use a double linked list to represent a binary tree. In a double linked list,
every node consists of three fields. First field for storing left child address,
second for storing actual data and third for storing right child address.
In this linked list representation, a node has the following structure.

The above example of the binary tree represented using Linked list
representation is shown as follows...
Binary Search Tree (BST)
Binary Search Tree is a node-based binary tree data structure which has the
following properties:

 The left subtree of a node contains only nodes with keys lesser than the
node’s key.
 The right subtree of a node contains only nodes with keys greater than the
node’s key.
 The left and right subtree each must also be a binary search tree.
Operations on binary search tree
Following are the basic operations −
 Search − Searches an element in a tree.
 Insert − Inserts an element in a tree.
 Pre-order Traversal − Traverses a tree in a pre-order manner.
 In-order Traversal − Traverses a tree in an in-order manner.
 Post-order Traversal − Traverses a tree in a post-order manner.

Expression Tree
The expression tree is a binary tree in which each internal node corresponds to
the operator and each leaf node corresponds to the operand so for example
expression tree for 3 + ((5+9)*2) would be:

Inorder traversal of expression tree produces infix version of given postfix


expression (same with preorder traversal it gives prefix expression)
Construction of Expression Tree:

Now For constructing an expression tree we use a stack. We loop through input
expression and do the following for every character.
1. If a character is an operand push that into the stack
2. If a character is an operator pop two values from the stack make them its
child and push the current node again.
In the end, the only element of the stack will be the root of an expression tree.
Forest

A collection of disjoint trees is called a forest.

Creating forest from a tree

You can create a forest by cutting the root of a tree.

Applications of trees

 Binary Search Trees(BSTs) are used to quickly check whether an


element is present in a set or not.
 Heap is a kind of tree that is used for heap sort.
 A modified version of a tree called Tries is used in modern routers
to store routing information.
 Most popular databases use B-Trees and T-Trees, which are
variants of the tree structure we learned above to store their data
 Compilers use a syntax tree to validate the syntax of every program
you write.

UNIT V
Graph

A Graph is a non-linear data structure consisting of nodes and edges. The nodes
are sometimes also referred to as vertices and the edges are lines or arcs that
connect any two nodes in the graph. More formally a Graph can be defined as,
A Graph consists of a finite set of vertices(or nodes) and set of Edges which
connect a pair of nodes.

In the above Graph, the set of vertices V = {0,1,2,3,4} and the set of edges E = {01,
12, 23, 34, 04, 14, 13}.
Graphs are used to solve many real-life problems. Graphs are used to represent
networks. The networks may include paths in a city or telephone network or circuit
network. Graphs are also used in social networks like Instagram, Facebook. For
example, in Facebook, each person is represented with a vertex(or node). Each
node is a structure and contains information like person id, name, gender, locale etc.

All of facebook is then a collection of these nodes and edges. This is


because facebook uses a graph data structure to store its data.

More precisely, a graph is a data structure (V, E) that consists of

 A collection of vertices V
 A collection of edges E, represented as ordered pairs of
vertices (u,v)

In the graph,
V = {0, 1, 2, 3}

E = {(0,1), (0,2), (0,3), (1,2)}

G = {V, E}

Graph Terminology

 Adjacency: A vertex is said to be adjacent to another vertex if


there is an edge connecting them. Vertices 2 and 3 are not adjacent
because there is no edge between them.
 Path: A sequence of edges that allows you to go from vertex A to
vertex B is called a path. 0-1, 1-2 and 0-2 are paths from vertex 0
to vertex 2.
 Directed Graph: A graph in which an edge (u,v) doesn't
necessarily mean that there is an edge (v, u) as well. The edges in
such a graph are represented by arrows to show the direction of the
edge.

Graph Representation

Graphs are commonly represented in two ways:

1. Adjacency Matrix

An adjacency matrix is a 2D array of V x V vertices. Each row and column


represent a vertex.

If the value of any element a[i][j] is 1, it represents that there is an edge


connecting vertex i and vertex j.
The adjacency matrix for the graph we created above is
Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0);
making the adjacency matrix symmetric about the diagonal.

Edge lookup(checking if an edge exists between vertex A and vertex B) is


extremely fast in adjacency matrix representation but we have to reserve space
for every possible link between all vertices(V x V), so it requires more space.

2. Adjacency List

An adjacency list represents a graph as an array of linked lists.

The index of the array represents a vertex and each element in its linked list
represents the other vertices that form an edge with the vertex.

The adjacency list for the graph we made in the first example is as follows:
An adjacency list is efficient in terms of storage because we only need to store
the values for the edges. For a graph with millions of vertices, this can mean a
lot of saved space.

Graph Operations

The most common graph operations are:

 Check if the element is present in the graph


 Graph Traversal
 Add elements(vertex, edges) to graph
 Finding the path from one vertex to another

Directed Graphs vs. Undirected Graphs

Directed Graphs

A directed graph is a set of vertices (nodes) connected by edges, with each


node having a direction associated with it.

Edges are usually represented by arrows pointing in the direction the graph can
be traversed.

In the example on the right, the graph can be traversed from vertex A to B, but
not from vertex B to A.

Undirected Graphs

In an undirected graph the edges are bidirectional, with no direction associated


with them. Hence, the graph can be traversed in either direction. The absence of
an arrow tells us that the graph is undirected.
In the example on the left, the graph can be traversed from node A to B as well
as from node B to A.

Some more complex directed and undirected graphs might look like the
following:

Weighted graph:

 Weighted graph = a graph whose edges have weights

Example:
 The weight of an edge can represent:

 Cost or distance = the amount of effort needed


to travel from one place to another

 Capacity = the maximim amount of flow that can


be transported from one place to another

Representing weighted graphs using an adjacency list

 Representing a weighted graph using an adjacency list::

 Each node in the adjacency graph will contain:

 A neighbor node ID (this field was already discussed


previously)

 A cost field (this field is new)

 Example:
 Graph:


 Representation:

 Explanation:

 Row 0 contains the linked list with the following 3 elements:

 (NodeId = 1, link cost = 3): this represent


the link (0,1) in the figure above.
 (NodeId = 3, link cost = 2): this represent
the link (0,3) in the figure above.
 (NodeId = 8, link cost = 4): this represent
the link (0,8) in the figure above.

 And so on....

Graph Traversal

Graph traversal is a technique used for a searching vertex in a graph. The graph
traversal is also used to decide the order of vertices is visited in the search
process. A graph traversal finds the edges to be used in the search process
without creating loops. That means using graph traversal we visit all the vertices
of the graph without getting into looping path.

There are two graph traversal techniques and they are as follows...

1. DFS (Depth First Search)


2. BFS (Breadth First Search)

DFS (Depth First Search)

DFS traversal of a graph produces a spanning tree as final result. Spanning


Tree is a graph without loops. We use Stack data structure with maximum
size of total number of vertices in the graph to implement DFS traversal.

We use the following steps to implement DFS traversal...

 Step 1 - Define a Stack of size total number of vertices in the graph.


 Step 2 - Select any vertex as starting point for traversal. Visit that vertex
and push it on to the Stack.
 Step 3 - Visit any one of the non-visited adjacent vertices of a vertex
which is at the top of stack and push it on to the stack.
 Step 4 - Repeat step 3 until there is no new vertex to be visited from the
vertex which is at the top of the stack.
 Step 5 - When there is no new vertex to visit then use back tracking and
pop one vertex from the stack.
 Step 6 - Repeat steps 3, 4 and 5 until stack becomes Empty.
 Step 7 - When stack becomes Empty, then produce final spanning tree by
removing unused edges from the graph

Back tracking is coming back to the vertex from which we reached the current
vertex.
BFS (Breadth First Search)

BFS traversal of a graph produces a spanning tree as final result. Spanning


Tree is a graph without loops. We use Queue data structure with maximum
size of total number of vertices in the graph to implement BFS traversal.

We use the following steps to implement BFS traversal...

 Step 1 - Define a Queue of size total number of vertices in the graph.


 Step 2 - Select any vertex as starting point for traversal. Visit that vertex
and insert it into the Queue.
 Step 3 - Visit all the non-visited adjacent vertices of the vertex which is
at front of the Queue and insert them into the Queue.
 Step 4 - When there is no new vertex to be visited from the vertex which
is at front of the Queue then delete that vertex.
 Step 5 - Repeat steps 3 and 4 until queue becomes empty.
 Step 6 - When queue becomes empty, then produce final spanning tree by
removing unused edges from the graph
Applications of Graph Data Structure

 In Computer science graphs are used to represent the flow of computation.


 Google maps uses graphs for building transportation systems, where
intersection of two(or more) roads are considered to be a vertex and the
road connecting two vertices is considered to be an edge, thus their
navigation system is based on the algorithm to calculate the shortest path
between two vertices.
 In Facebook, users are considered to be the vertices and if they are friends
then there is an edge running between them. Facebook’s Friend suggestion
algorithm uses graph theory. Facebook is an example of undirected
graph.
 In World Wide Web, web pages are considered to be the vertices. There is
an edge from a page u to other page v if there is a link of page v on page u.
This is an example of Directed graph. It was the basic idea behind Google
Page Ranking Algorithm.
 In Operating System, we come across the Resource Allocation Graph
where each process and resources are considered to be vertices. Edges are
drawn from resources to the allocated process, or from requesting process
to the requested resource. If this leads to any formation of a cycle then a
deadlock will occur.

Hashing

Hashing is a technique that is used to uniquely identify a specific object from a


group of similar objects. Some examples of how hashing is used in our lives
include:

 In universities, each student is assigned a unique roll number that can be


used to retrieve information about them.
 In libraries, each book is assigned a unique number that can be used to
determine information about the book, such as its exact position in the
library or the users it has been issued to etc.

In both these examples the students and books were hashed to a unique number.
Assume that you have an object and you want to assign a key to it to make
searching easy. To store the key/value pair, you can use a simple array like a
data structure where keys (integers) can be used directly as an index to store
values. However, in cases where the keys are large and cannot be used directly
as an index, you should use hashing.
In hashing, large keys are converted into small keys by using hash functions.
The values are then stored in a data structure called hash table. The idea of
hashing is to distribute entries (key/value pairs) uniformly across an array. Each
element is assigned a key (converted key). By using that key you can access the
element in O(1) time. Using the key, the algorithm (hash function) computes an
index that suggests where an entry can be found or inserted.

Hash Table
Hash Table is a data structure which stores data in an associative manner. In a
hash table, data is stored in an array format, where each data value has its own
unique index value. Access of data becomes very fast if we know the index of
the desired data.
Thus, it becomes a data structure in which insertion and search operations are
very fast irrespective of the size of the data. Hash Table uses an array as a
storage medium and uses hash technique to generate an index where an
element is to be inserted or is to be located from.
Hashing
Hashing is a technique to convert a range of key values into a range of indexes
of an array. We're going to use modulo operator to get a range of key values.
Consider an example of hash table of size 20, and the following items are to be
stored. Item are in the (key,value) format.

 (1,20)
 (2,70)
 (42,80)
 (4,25)
 (12,44)
 (14,32)
 (17,11)
 (13,78)
 (37,98)
Sr.No. Key Hash Array Index

1 1 1 % 20 = 1 1

2 2 2 % 20 = 2 2

3 42 42 % 20 = 2 2
4 4 4 % 20 = 4 4

5 12 12 % 20 = 12 12

6 14 14 % 20 = 14 14

7 17 17 % 20 = 17 17

8 13 13 % 20 = 13 13

9 37 37 % 20 = 17 17

Infix, Postfix and Prefix

Infix, Postfix and Prefix notations are three different but equivalent ways of
writing expressions. It is easiest to demonstrate the differences by looking at
examples of operators that take two operands.

Infix notation: X + Y

Operators are written in-between their operands. This is the usual way we
write expressions. An expression such as A * ( B + C ) / D is usually
taken to mean something like: "First add B and C together, then multiply
the result by A, then divide by D to give the final answer."

Infix notation needs extra information to make the order of evaluation of


the operators clear: rules built into the language about operator
precedence and associativity, and brackets ( ) to allow users to override
these rules. For example, the usual rules for associativity say that we
perform operations from left to right, so the multiplication by A is
assumed to come before the division by D. Similarly, the usual rules for
precedence say that we perform multiplication and division before we
perform addition and subtraction. (see CS2121 lecture).
Postfix notation (also known as "Reverse Polish notation"): X Y +

Operators are written after their operands. The infix expression given
above is equivalent to A B C + * D /
The order of evaluation of operators is always left-to-right, and brackets
cannot be used to change this order. Because the "+" is to the left of the
"*" in the example above, the addition must be performed before the
multiplication.
Operators act on values immediately to the left of them. For example, the
"+" above uses the "B" and "C". We can add (totally unnecessary)
brackets to make this explicit:
( (A (B C +) *) D /)
Thus, the "*" uses the two values immediately preceding: "A", and the
result of the addition. Similarly, the "/" uses the result of the
multiplication and the "D".

Prefix notation (also known as "Polish notation"): + X Y

Operators are written before their operands. The expressions given above
are equivalent to / * A + B C D
As for Postfix, operators are evaluated left-to-right and brackets are
superfluous. Operators act on the two nearest values on the right. I have
again added (totally unnecessary) brackets to make this clear:
(/ (* A (+ B C) ) D)

Although Prefix "operators are evaluated left-to-right", they use values to


their right, and if these values themselves involve computations then this
changes the order that the operators have to be evaluated in. In the
example above, although the division is the first operator on the left, it
acts on the result of the multiplication, and so the multiplication has to
happen before the division (and similarly the addition has to happen
before the multiplication).
Because Postfix operators use values to their left, any values involving
computations will already have been calculated as we go left-to-right, and
so the order of evaluation of the operators is not disrupted in the same
way as in Prefix expressions.

In all three versions, the operands occur in the same order, and just the operators
have to be moved to keep the meaning correct. (This is particularly important
for asymmetric operators like subtraction and division: A - B does not mean the
same as B - A; the former is equivalent to A B - or - A B, the latter to B A - or -
B A).

Examples:

Infix Postfix Prefix Notes

multiply A and B,
A * B + C / D A B * C D / + + * A B / C D divide C by D,
add the results

add B and C,
A * (B + C) / D A B C + * D / / * A + B C D multiply by A,
divide by D

divide C by D,
A * (B + C / D) A B C D / + * * A + B / C D add B,
multiply by A

Converting between these notations


The most straightforward method is to start by inserting all the implicit brackets
that show the order of evaluation e.g.:

Infix Postfix Prefix

( (A * B) + (C / D) ) ( (A B *) (C D /) +) (+ (* A B) (/ C D) )

((A * (B + C) ) / D) ( (A (B C +) *) D /) (/ (* A (+ B C) ) D)

(A * (B + (C / D) ) ) (A (B (C D /) +) *) (* A (+ B (/ C D) ) )

You can convert directly between these bracketed forms simply by moving the
operator within the brackets e.g. (X + Y) or (X Y +) or (+ X Y). Repeat this for
all the operators in an expression, and finally remove any superfluous brackets.

You can use a similar trick to convert to and from parse trees - each bracketed
triplet of an operator and its two operands (or sub-expressions) corresponds to a
node of the tree.

The corresponding parse trees are:


/ *
+ / \ / \
/ \ * D A +
/ \ / \ / \
* / A + B /
/ \ / \ / \ / \
A B C D B C C D

((A*B)+(C/D)) ((A*(B+C))/D) (A*(B+(C/D)))

Linear search (Sequential Search)

Linear search is a very simple search algorithm. In this type of search, a


sequential search is made over all items one by one. Every item is checked and
if a match is found then that particular item is returned, otherwise the search
continues till the end of the data collection.

Binary Search

Binary search is a fast search algorithm with run-time complexity of Ο(log n).
This search algorithm works on the principle of divide and conquer. For this
algorithm to work properly, the data collection should be in the sorted form.
Binary search looks for a particular item by comparing the middle most item of
the collection. If a match occurs, then the index of item is returned. If the
middle item is greater than the item, then the item is searched in the sub-array
to the left of the middle item. Otherwise, the item is searched for in the sub-
array to the right of the middle item. This process continues on the sub-array as
well until the size of the subarray reduces to zero.
How Binary Search Works?
For a binary search to work, it is mandatory for the target array to be sorted.
We shall learn the process of binary search with a pictorial example. The
following is our sorted array and let us assume that we need to search the
location of value 31 using binary search.

First, we shall determine half of the array by using this formula −


mid = low + (high - low) / 2
Here it is, 0 + (9 - 0 ) / 2 = 4 (integer value of 4.5). So, 4 is the mid of the array.

Now we compare the value stored at location 4, with the value being searched,
i.e. 31. We find that the value at location 4 is 27, which is not a match. As the
value is greater than 27 and we have a sorted array, so we also know that the
target value must be in the upper portion of the array.

We change our low to mid + 1 and find the new mid value again.
low = mid + 1
mid = low + (high - low) / 2
Our new mid is 7 now. We compare the value stored at location 7 with our
target value 31.
The value stored at location 7 is not a match, rather it is more than what we are
looking for. So, the value must be in the lower part from this location.

Hence, we calculate the mid again. This time it is 5.

We compare the value stored at location 5 with our target value. We find that it
is a match.

We conclude that the target value 31 is stored at location 5.


Binary search halves the searchable items and thus reduces the count of
comparisons to be made to very less numbers.
Pseudocode
The pseudocode of binary search algorithms should look like this −
Procedure binary_search
A ← sorted array
n ← size of array
x ← value to be searched

Set lowerBound = 1
Set upperBound = n

while x not found


if upperBound < lowerBound
EXIT: x does not exists.

set midPoint = lowerBound + ( upperBound - lowerBound ) / 2

if A[midPoint] < x
set lowerBound = midPoint + 1

if A[midPoint] > x
set upperBound = midPoint - 1

if A[midPoint] = x
EXIT: x found at location midPoint
end while

end procedure

Sorting
Sorting refers to arranging data in a particular format. Sorting algorithm
specifies the way to arrange data in a particular order. Most common orders are
in numerical or lexicographical order.
The importance of sorting lies in the fact that data searching can be optimized
to a very high level, if data is stored in a sorted manner. Sorting is also used to
represent data in more readable formats. Following are some of the examples
of sorting in real-life scenarios −
 Telephone Directory − The telephone directory stores the telephone
numbers of people sorted by their names, so that the names can be
searched easily.
 Dictionary − The dictionary stores words in an alphabetical order so that
searching of any word becomes easy.
Bubble Sort Algorithm
Bubble sort is a simple sorting algorithm. This sorting algorithm is
comparison-based algorithm in which each pair of adjacent elements is
compared and the elements are swapped if they are not in order. This algorithm
is not suitable for large data sets as its average and worst case complexity are
of Ο(n2) where n is the number of items.
How Bubble Sort Works?
We take an unsorted array for our example. Bubble sort takes Ο(n2) time so
we're keeping it short and precise.
Bubble sort starts with very first two elements, comparing them to check which
one is greater.

In this case, value 33 is greater than 14, so it is already in sorted locations.


Next, we compare 33 with 27.

We find that 27 is smaller than 33 and these two values must be swapped.

The new array should look like this −

Next we compare 33 and 35. We find that both are in already sorted positions.

Then we move to the next two values, 35 and 10.

We know then that 10 is smaller 35. Hence they are not sorted.

We swap these values. We find that we have reached the end of the array. After
one iteration, the array should look like this −
To be precise, we are now showing how an array should look like after each
iteration. After the second iteration, it should look like this −

Notice that after each iteration, at least one value moves at the end.

And when there's no swap required, bubble sorts learns that an array is
completely sorted.

Now we should look into some practical aspects of bubble sort.


Algorithm
We assume list is an array of n elements. We further assume
that swap function swaps the values of the given array elements.
begin BubbleSort(list)

for all elements of list


if list[i] > list[i+1]
swap(list[i], list[i+1])
end if
end for

return list

end BubbleSort

Insertion Sort
This is an in-place comparison-based sorting algorithm. Here, a sub-list is
maintained which is always sorted. For example, the lower part of an array is
maintained to be sorted. An element which is to be 'insert'ed in this sorted sub-
list, has to find its appropriate place and then it has to be inserted there. Hence
the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted
into the sorted sub-list (in the same array). This algorithm is not suitable for
large data sets as its average and worst case complexity are of Ο(n2), where n is
the number of items.
How Insertion Sort Works?
We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in
sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here
we see that the sorted sub-list has only one element 14, and 27 is greater than
14. Hence, the sorted sub-list remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.
These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.

We swap them again. By the end of third iteration, we have a sorted sub-list of
4 items.

This process goes on until all the unsorted values are covered in a sorted sub-
list. Now we shall see some programming aspects of insertion sort.
Algorithm
Now we have a bigger picture of how this sorting technique works, so we can
derive simple steps by which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted. return 1;
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that is greater than the
value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted

Selection Sort
Selection sort is a simple sorting algorithm. This sorting algorithm is an in-
place comparison-based algorithm in which the list is divided into two parts,
the sorted part at the left end and the unsorted part at the right end. Initially, the
sorted part is empty and the unsorted part is the entire list.
The smallest element is selected from the unsorted array and swapped with the
leftmost element, and that element becomes a part of the sorted array. This
process continues moving unsorted array boundary by one element to the right.
This algorithm is not suitable for large data sets as its average and worst case
complexities are of Ο(n2), where n is the number of items.
How Selection Sort Works?
Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned sequentially.
The first position where 14 is stored presently, we search the whole list and
find that 10 is the lowest value.

So we replace 14 with 10. After one iteration 10, which happens to be the
minimum value in the list, appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the
list in a linear manner.

We find that 14 is the second lowest value in the list and it should appear at the
second place. We swap these values.
After two iterations, two least values are positioned at the beginning in a sorted
manner.

The same process is applied to the rest of the items in the array.
Following is a pictorial depiction of the entire sorting process −
Now, let us learn some programming aspects of selection sort.
Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted

You might also like