Module 1
Introduction to data structures and types
Data Structure definition
Data may be organized in many different ways. The logical or mathematical model of a
particular organization of data is called a data structure.
Need Of Data Structure:
The structure of the data and the synthesis of the algorithm are relative to each other. Data
presentation must be easy to understand so the developer, as well as the user, can make an
efficient implementation of the operation.
Data structures provide an easy way of organising, retrieving, managing, and storing data.
Here is a list of the needs for data.
● Data structure modification is easy.
● It requires less time.
● Save storage memory space.
● Data representation is easy.
● Easy access to the large database.
CLASSIFICATION OF DATA STRUCTURES
Data structures are generally classified into
● Primitive data Structures
● Non-primitive data Structures
1. Primitive data Structures: Primitive data structures are the fundamental data types which are
supported by a programming language. Basic data types such as integer, real, character and
Boolean are known as Primitive data Structures. These data types consists of characters that
cannot be divided and hence they also called simple data types.
2. Non- Primitive data Structures: Non-primitive data structures are those data structures
which are created using primitive data structures. Examples of non-primitive data structures is
the processing of complex numbers, linked lists, stacks, trees, and graphs.
DATA STRUCTURES OPERATIONS
The data appearing in data structures are processed by means of certain operations.
The following four operations play a major role in this text:
1. Traversing: accessing each record/node exactly once so that certain items in the record may
be processed. (This accessing and processing is sometimes called “visiting” the record.)
2. Searching: Finding the location of the desired node with a given key value, or finding the
locations of all such nodes which satisfy one or more conditions.
3. Inserting: Adding a new node/record to the structure.
4. Deleting: Removing a node/record from the structure.
The following two operations, which are used in special situations:
1. Sorting: Arranging the records in some logical order (e.g., alphabetically according to some
NAME key, or in numerical order according to some NUMBER key, such as social security
number or account number)
2. Merging: Combining the records in two different sorted files into a single sorted file.
Based on the structure and arrangement of data, non-primitive data structures is further
classified into
1. Linear Data Structure
2. Non-linear Data Structure
1. Linear Data Structure: A data structure is said to be linear if its elements form a sequence or
a linear list. There are basically two ways of representing such linear structure in memory.
1. One way is to have the linear relationships between the elements represented by means of
sequential memory location. These linear structures are called arrays.
2. The other way is to have the linear relationship between the elements represented by means of
pointers or links. These linear structures are called linked lists.
The common examples of linear data structure are Arrays, Queues, Stacks, Linked lists
2. Non-linear Data Structure: A data structure is said to be non-linear if the data are not
arranged in sequence or a linear. The insertion and deletion of data is not possible in linear
fashion. This structure is mainly used to represent data containing a hierarchical relationship
between elements.
Trees and graphs are the examples of non-linear data structure.
Arrays:
● An Array is defined as, an ordered set of similar data items. All the data items of an array
are stored in consecutive memory locations
● The data items of an array are of same type and each data items can be accessed using the
same name but different index value.
● An array is a set of pairs, , such that each index has a value associated with it. It can be
called as corresponding or a mapping
Ex: < 0 , 25 > list[0]=25
< 1 , 15 > list[1]=15
< 2 , 20 > list[2]=20
< 3 , 17 > list[3]=17
< 4 , 35 > list[4]=35
Here, list is the name of array. By using, list [0] to list [4] the data items in list can be accessed.
An array is called a data structure because it fulfills the definition of a data
structure. Specifically:
● Organization of Data:
An array organizes a collection of data items, known as elements, of the same data type
(homogeneous) in contiguous memory locations. This linear arrangement provides a structured
way to store multiple values under a single variable name.
● Defined Relationships:
The elements within an array have a defined relationship based on their position or index. Each
element can be uniquely identified and accessed using its index, which typically starts from 0.
● Operations:
Arrays support specific operations for efficient data manipulation. These include:
Accessing elements: Elements can be accessed directly and quickly using their index
(e.g., array[index]).
Storing elements: Values can be stored at specific indices.
Traversing: Elements can be iterated over in a sequential manner.
Therefore, an array provides a systematic and efficient way to store, organize, and manage data,
making it a fundamental example of a data structure.
Array in C Declaration:
one dimensional array
● A one dimensional array in C is declared by adding brackets to the name of a variable.
Ex: int list[5], *plist[5];
● The array list[5], defines 5 integers and in C array start at index 0, so list[0], list[1],
list[2], list[3], list[4] are the names of five array elements which contains an integer
value.
● The array *plist[5], defines an array of 5 pointers to integers. Where, plist[0], plist[1],
plist[2], plist[3], plist[4] are the five array elements which contains a pointer to an
integer.
Declaration
● data_type array_name[size];
● Example
● int arr[5]; // Declares an array of size 5
● Initialization
● int arr[5] = {1, 2, 3, 4, 5};
Example Calculate Average
// Program to find the average of n numbers using arrays
#include <stdio.h>
int main() {
int marks[10], i, n, sum = 0;
double average;
printf("Enter number of elements: ");
scanf("%d", &n);
for(i=0; i < n; ++i) {
printf("Enter number%d: ",i+1);
scanf("%d", &marks[i]);
// adding integers entered by the user to the sum variable
sum += marks[i];
}
// explicitly convert the sum to double
// then calculate average
average = (double) sum / n;
printf("Average = %.2lf", average);
return 0;
}
Run Code
Output
Enter number of elements: 5
Enter number1: 45
Enter number2: 35
Enter number3: 38
Enter number4: 31
Enter number5: 49
Average = 39.60
Multidimensional Array
Multidimensional arrays are arrays within arrays, allowing for data storage in a tabular or matrix
form. They extend the concept of one-dimensional arrays by adding more dimensions.
2-Dimensional Array in C
A 2D array organizes data in rows and columns, making it ideal for matrix representation.
Declaration
data_type array_name[rows][columns];
Example
int arr[3][4]; // Declares a 2D array with 3 rows and 4 columns
Initialization
int arr[2][3] = {
{1, 2, 3},
{4, 5, 6}
};
Examples of 2D Array in C
Printing a Matrix
#include <stdio.h>
int main() {
int arr[2][3] = {
{1, 2, 3},
{4, 5, 6}
};
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 3; j++) {
printf("%d ", arr[i][j]);
}
printf("\n");
}
return 0;
}
Output
123
456
3-Dimensional Array in C
A 3D array extends the concept of 2D arrays by adding another dimension. It stores elements in a
cube-like structure.
Declaration
data_type array_name[size1][size2][size3];
Example
int arr[2][3][4];
Initialization
int arr[2][2][2] = {
{
{1, 2},
{3, 4}
},
{
{5, 6},
{7, 8}
}
};
Example of 3D Array in C
#include <stdio.h>
int main() {
int arr[2][2][2] = {
{
{1, 2},
{3, 4}
},
{
{5, 6},
{7, 8}
}
};
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
for (int k = 0; k < 2; k++) {
printf("arr[%d][%d][%d] = %d\n", i, j, k, arr[i][j][k]);
}
}
}
return 0;
}
Output
arr[0][0][0] = 1
arr[0][0][1] = 2
arr[0][1][0] = 3
arr[0][1][1] = 4
arr[1][0][0] = 5
arr[1][0][1] = 6
arr[1][1][0] = 7
arr[1][1][1] = 8
Five real-world examples of arrays
● Image Processing:
Images are commonly represented as two-dimensional arrays (matrices) where each element
corresponds to a pixel and stores its color information (e.g., RGB values). Operations like
resizing, rotating, filtering, and applying effects to an image involve manipulating these arrays.
● Spreadsheets and Databases:
Data in spreadsheets like Microsoft Excel or Google Sheets, and in relational databases, is often
structured in rows and columns, which can be thought of as two-dimensional arrays. Each row
represents a record, and each column represents a specific attribute, allowing for efficient
storage, retrieval, and analysis of tabular data.
● Game Development:
In video games, arrays are used for various purposes. For example, a game board in a chess or
checkers game can be represented as a 2D array, with each element storing information about the
piece on that square. Character inventories, enemy lists, or level layouts can also be managed
using arrays.
● Audio Processing:
Digital audio signals are often represented as arrays of numerical samples, where each sample
represents the amplitude of the sound wave at a specific point in time. Audio processing
applications utilize arrays for tasks such as applying filters, adding effects, or analyzing audio
characteristics.
● Scientific Computing and Data Analysis:
In scientific and engineering fields, arrays are extensively used to store and manipulate large
datasets. For instance, in physics simulations, arrays might store positions and velocities of
particles. In data analysis, statistical operations like calculating averages, standard deviations, or
performing regressions are often applied to data stored in arrays.
Queue
A queue is an important data structure in programming. It follows the FIFO (First In, First Out)
method and is open at both ends. Data insertion is done at one end, the rear end or the tail of the
queue, while deletion is done at the other end, the front end or the head of the queue.
Types of Queues in Data Structure
There are four different types of queues in data structures:
● Simple Queue
● Circular Queue
● Priority Queue
● Double-Ended Queue (Deque)
Simple Queue
Simple Queue is a linear data structure that follows the First-In-First-Out (FIFO) principle,
where elements are added to the rear (back) and removed from the front (head).
● Ordered collection of comparable data kinds.
● Queue structure is FIFO (First in, First Out).
● When a new element is added, all elements added before the new element must be deleted to
remove the new element.
Circular Queue
A circular queue is a special case of a simple queue in which the last member is linked to the
first, forming a circle-like structure.
● The last node is connected to the first node.
● Also known as a Ring Buffer, the nodes are connected end to end.
● Insertion takes place at the front of the queue, and deletion at the end of the queue.
● Example of circular queue application: Insertion of days in a week.
Priority Queue
In a priority queue, the nodes will have some predefined priority in the priority queue. The node
with the least priority will be the first to be removed from the queue. Insertion takes place in the
order of arrival of the nodes.
Some of the applications of priority queue:
● Dijkstra’s shortest path algorithm
● Prim’s algorithm
● Data compression techniques like Huffman code
The diagram below shows how an application uses a priority queue for the items the user
consumes
Deque (Double Ended Queue)
A double-ended queue, often abbreviated as deque (pronounced "deck"), is a type of data
structure that allows elements to be added or removed from both the front and the rear, making it
a versatile option for various applications. Unlike regular queues that follow the FIFO)principle,
deques do not have this restriction.
Basic Queue Operations in Queue Data Structure
Below are the basic queue operations in data structure:
Operation Description
enqueue() Process of adding or storing an element to the end of the queue
dequeue() Process of removing or accessing an element from the front of the queue
peek() Used to get the element at the front of the queue without removing it
initialize() Creates an empty queue
isfull() Checks if the queue is full
isempty() Check if the queue is empty
Stack
Stack is a linear data structure that follows LIFO (Last In First Out) Principle, the last
element inserted is the first to be popped out. It means both insertion and deletion operations
happen at one end only.
Representation of
Stack Data Structure:
Stack follows LIFO (Last In First Out) Principle so the element which is pushed last is popped
first.
Types of Stack:
● Fixed Size Stack : As the name suggests, a fixed size stack has a fixed size and cannot grow
or shrink dynamically. If the stack is full and an attempt is made to add an element to it, an
overflow error occurs. If the stack is empty and an attempt is made to remove an element
from it, an underflow error occurs.
● Dynamic Size Stack : A dynamic size stack can grow or shrink dynamically. When the stack
is full, it automatically increases its size to accommodate the new element, and when the
stack is empty, it decreases its size. This type of stack is implemented using a linked list, as it
allows for easy resizing of the stack.
Basic Operations on Stack:
In order to make manipulations in a stack, there are certain operations provided to us.
● push() to insert an element into the stack
● pop() to remove an element from the stack
● top() Returns the top element of the stack.
● isEmpty() returns true if stack is empty else false.
● isFull() returns true if the stack is full else false.
Push Operation on Stack
Adds an item to the stack. If the stack is full, then it is said to be an Overflow condition.
Algorithm for Push Operation:
● Before pushing the element to the stack, we check if the stack is full .
● If the stack is full (top == capacity-1) , then Stack Overflows and we cannot insert the
element to the stack.
● Otherwise, we increment the value of top by 1 (top = top + 1) and the new value is inserted
at top position .
● The elements can be pushed into the stack till we reach the capacity of the stack
.
Pop Operation in Stack
Removes an item from the stack. The items are popped in the reversed order in which they are
pushed. If the stack is empty, then it is said to be an Underflow condition.
Algorithm for Pop Operation:
● Before popping the element from the stack, we check if the stack is empty .
● If the stack is empty (top == -1), then Stack Underflows and we cannot remove any element
from the stack.
● Otherwise, we store the value at top, decrement the value of top by 1 (top = top - 1) and
return the stored top value.
Top or Peek Operation on Stack
Returns the top element of the stack.
Algorithm for Top Operation:
● Before returning the top element from the stack, we check if the stack is empty.
● If the stack is empty (top == -1), we simply print "Stack is empty".
● Otherwise, we return the element stored at index = top .
isEmpty Operation in Stack Data Structure:
Returns true if the stack is empty, else false.
Algorithm for isEmpty Operation:
● Check for the value of top in stack.
● If (top == -1), then the stack is empty so return true .
● Otherwise, the stack is not empty so return false .
isFull Operation in Stack Data Structure:
Returns true if the stack is full, else false.
Algorithm for isFull Operation:
● Check for the value of top in stack.
● If (top == capacity-1), then the stack is full so return true.
● Otherwise, the stack is not full so return false
Algorithm to convert infix expression to postfix expression using stack
1. Push “(“ onto STACK, and add “)” to the end of X.
2. Scan X from left to right and REPEAT Steps 3 to 6 for each element of X UNTIL the
STACK is empty.
3. If an operand is encountered, add it to Y.
4. If a left parenthesis is encountered, push it onto STACK.
5. If an operator is encountered, then:
(a) Repeatedly pop from STACK and add to Y each operator (on the top of STACK) which has
the same precedence as or higher precedence than operator.
(b) Add operator to STACK.
/* End of If structure */
6. If a right parenthesis is encountered, then:
(a) Repeatedly pop from STACK and add to Y each operator (on the top of STACK) until a left
Parenthesis is encountered.
Linked List
A linked list is a linear data structure used for storing a sequence of elements, where each
element is stored in a node that contains both the element and a pointer to the next node in the
sequence.
Basic Operations on Singly Linked List
The following are some basic operations performed on a Single Linked List:
● Insertion: The insertion operation can be performed in three ways. They are as follows:
o Inserting At the Beginning of the list
o Inserting At End of the list
o Inserting At Specific location in the list
● Deletion: The deletion operation can be performed in three ways. They are as follows:
o Deleting from the Beginning of the list
o Deleting from the End of the list
o Deleting a Specific Node
● Traverse: This process displays the elements of a Single-linked list.
● Search: It is a process of determining and retrieving a specific node either from the front, the
end or anywhere in the list.
Operations on Doubly Linked List:
In a doubly linked list, we perform the following operations...
● Insertion: The insertion operation can be performed in three ways as follows:
o Inserting At the Beginning of the list
o Inserting after a given node.
o Inserting at the end.
o Inserting before a given node
● Deletion: The deletion operation can be performed in three ways as follows...
o Deleting from the Beginning of the list
o Deleting from the End of the list
o Deleting a Specific Node
● Display: This process displays the elements of a double-linked list.
Commonly used operations on Circular Linked List:
The following operations are performed on a Circular Linked List
● Insertion: The insertion operation can be performed in three ways:
o Insertion in an empty list
o Insertion at the beginning of the list
o Insertion at the end of the list
o Insertion in between the nodes
● Deletion: The deletion operation can be performed in three ways:
o Deleting from the Beginning of the list
o Deleting from the End of the list
o Deleting a Specific Node
● Display: This process displays the elements of a Circular linked list.
Types of linked lists:
Linked lists can be classified in the following categories
1. Singly Linked List
Singly linked list is the simplest type of linked list in which every node contains some data and a
pointer to the next node of the same data type.
The node contains a pointer to the next node means that the node stores the address of the next
node in the sequence. A single linked list allows the traversal of data only in one way. Below is
the image for the same:
2. Doubly Linked List
A doubly linked list or a two-way linked list is a more complex type of linked list that contains a
pointer to the next as well as the previous node in sequence.
Therefore, it contains three parts of data, a pointer to the next node, and a pointer to
the previous node. This would enable us to traverse the list in the backward direction as well.
3. Circular Linked List
A circular linked list is a type of linked list in which the last node's next pointer points back to
the first node of the list, creating a circular structure. This design allows for continuous traversal
of the list, as there is no null to end the list.
While traversing a circular linked list, we can begin at any node and traverse the list in any
direction forward and backward until we reach the same node we started. Thus, a circular linked
list has no beginning and no end. Below is the image for the same:
Applications
● Web Browser Navigation:
The "back" and "forward" buttons in web browsers frequently utilize a doubly linked list to
manage browsing history, allowing users to easily navigate between previously visited and
subsequent URLs.
● Music and Video Playlists:
Media players often employ linked lists to manage song or video playlists. This allows for
seamless transitions between tracks, as well as features like shuffling, repeating, and skipping.
● Image Viewers:
Similar to media players, image viewers use linked lists to enable navigation between images
(e.g., "next" and "previous" image buttons).
● Undo/Redo Functionality:
Many applications, such as text editors, graphic design software, and IDEs, implement undo and
redo features using doubly linked lists. Each action is stored as a node, allowing users to revert
or reapply changes.
Graph
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are
sometimes also referred to as nodes and the edges are lines or arcs that connect any two
nodes in the graph. More formally a Graph is composed of a set of vertices( V ) and a set
of edges( E ). The graph is denoted by G(V, E).
Graphs in data structures are classified based on various properties of their vertices and
edges. The main types include:
● Undirected Graph: Edges have no direction, meaning the connection between two vertices is
mutual.
● Directed Graph (Digraph): Edges have a direction, indicating a one-way relationship from a
source vertex to a destination vertex.
● Weighted Graph: Edges have associated numerical values (weights) representing cost, distance,
time, or other measures.
● Unweighted Graph: Edges do not have any associated values.
● Connected Graph: A path exists between every pair of vertices. All vertices are reachable from
any other vertex.
● Disconnected Graph: At least one pair of vertices does not have a path connecting them.
● Null Graph: Contains only vertices and no edges.
● Trivial Graph: Contains only one vertex and no edges.
● Simple Graph: Contains no loops (edges connecting a vertex to itself) and no multiple edges
between the same pair of vertices.
● Multigraph: Allows multiple edges between the same pair of vertices.
● Complete Graph: Every vertex is directly connected to every other vertex by a single edge.
● Regular Graph: Every vertex has the same degree (number of connections).
● Cyclic Graph: Contains at least one cycle (a path that starts and ends at the same vertex).
● Acyclic Graph: Contains no cycles.
● Directed Acyclic Graph (DAG): A directed graph with no cycles.
● Bipartite Graph: The vertices can be divided into two disjoint sets such that edges only connect
vertices from different sets, not within the same set.
Representations of Graph
1. Adjacency Matrix
2. Adjacency List
Adjacency Matrix Representation
An adjacency matrix is a way of representing a graph as a matrix of boolean (0's and 1's)
Let's assume there are n vertices in the graph So, create a 2D matrix adjMat[n][n] having
dimension n x n.
● If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
● If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph as Adjacency Matrix:
The below figure shows an undirected graph. Initially, the entire Matrix is initialized to 0. If
there is an edge from source to destination, we insert 1 to both cases
(adjMat[source][destination] and adjMat[destination][source]) because we can go either way.
Representation of Directed Graph as Adjacency Matrix:
The below figure shows a directed graph. Initially, the entire Matrix is initialized to 0. If there is
an edge from source to destination, we insert 1 for that particular adjMat[source][destination].
Adjacency List Representation
An array of Lists is used to store edges between two vertices. The size of array is equal to the
number of vertices (i.e, n). Each index in this array represents a specific vertex in the graph. The
entry at the index i of the array contains a linked list containing the vertices that are adjacent to
vertex i. Let's assume there are n vertices in the graph So, create an array of list of
size n as adjList[n].
● adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
● adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so on.
Representation of Undirected Graph as Adjacency list:
The below undirected graph has 3 vertices. So, an array of list will be created of size 3, where
each indices represent the vertices. Now, vertex 0 has two neighbours (i.e, 1 and 2). So, insert
vertex 1 and 2 at indices 0 of array. Similarly, For vertex 1, it has two neighbour (i.e, 2 and 0) So,
insert vertices 2 and 0 at indices 1 of array. Similarly, for vertex 2, insert its neighbours in array
of list.
Representation of Directed Graph as Adjacency list:
The below directed graph has 3 vertices. So, an array of list will be created of size 3, where each
indices represent the vertices. Now, vertex 0 has no neighbours. For vertex 1, it has two
neighbour (i.e, 0 and 2) So, insert vertices 0 and 2 at indices 1 of array. Similarly, for vertex 2,
insert its neighbours in array of list.
Difference Between Graph and Tree
Feature Graph Tree
A collection of nodes A hierarchical data structure consisting of
(vertices) and edges, where nodes connected by edges with a single root
Definition edges connect nodes. node.
Can have cycles (loops) and No cycles; connected structure with exactly
Structure disconnected components. one path between any two nodes.
No root node; nodes may have
Has a designated root node that has no
multiple parents or no parents
parent.
Root Node at all.
Node Relationships between nodes Parent-child relationship; each node (except
Relationship are arbitrary. the root) has exactly one parent.
Each node can have any If there is n nodes then there would
Edges number of edges. be n-1 number of edges
Traversal can be complex due
Traversal is straightforward and can be done
Traversal to cycles and disconnected
in linear time.
Complexity components.
Used in various scenarios like Commonly used for hierarchical data
social networks, maps, representation like file systems, organization
Application network optimization, etc. charts, HTML DOM, XML documents, etc.
Social networks, road File systems, family trees, HTML DOM
Examples networks, computer networks. structure.
Applications of Graph Data Structure
● In Computer science graphs are used to represent the flow of computation.
● Google maps uses graphs for building transportation systems, where intersection of two(or
more) roads are considered to be a vertex and the road connecting two vertices is considered
to be an edge, thus their navigation system is based on the algorithm to calculate the shortest
path between two vertices.
●
● In Facebook, users are considered to be the vertices and if they are friends then there is an
edge running between them. Facebook's Friend suggestion algorithm uses graph theory.
Facebook is an example of undirected graph.
● In World Wide Web, web pages are considered to be the vertices. There is an edge from a
page u to other page v if there is a link of page v on page u. This is an example of Directed
graph. It was the basic idea behind Google Page Ranking Algorithm.
● In Operating System, we come across the Resource Allocation Graph where each process
and resources are considered to be vertices. Edges are drawn from resources to the allocated
process, or from requesting process to the requested resource. If this leads to any formation
of a cycle then a deadlock will occur.
● In mapping system we use graph. It is useful to find out which is an excellent place from the
location as well as your nearby location. In GPS we also use graph
● Facebook uses graphs. Using graphs suggests mutual friends. it shows a list of the f
following pages, friends, and contact list.
● Microsoft Excel uses DAG means Directed Acyclic Graphs.
● In the Dijkstra algorithm, we use a graph. we find the smallest path between two or many
nodes.
● On social media sites, we use graphs to track the data of the users. liked showing preferred
post suggestions, recommendations, etc.
● Graphs are used in biochemical applications such as structuring of protein, DNA etc.