1.
Graph Data Structure:
A graph is an abstract data type (ADT) that consists of a set of objects connected to each other by links.
The interconnected objects are represented by points called vertices (V).
The links that connect the vertices are called edges (E).
G = (V, E)
📌 Easy Definition:
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges connecting pairs of vertices. Take a look at the following graph.
In the above graph,
V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}
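As a minimal sketch, the example graph can be written down directly as two Python sets. The names `V`, `E`, and `are_adjacent` are illustrative, and the edges are assumed to be undirected, which the text does not state explicitly.

```python
# Vertices and edges of the example graph, G = (V, E).
V = {"a", "b", "c", "d", "e"}

# Each edge is stored as a frozenset so that "ab" and "ba" are the same edge
# (assumption: the example graph is undirected).
E = {frozenset(pair) for pair in ["ab", "ac", "bd", "cd", "de"]}


def are_adjacent(u, v):
    """Return True if an edge joins u and v."""
    return frozenset((u, v)) in E


print(are_adjacent("a", "b"))  # True
print(are_adjacent("a", "e"))  # False
```

Storing each edge as an unordered pair keeps the code close to the set notation used in the definition.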
Graph Data Structure
A mathematical graph can be represented as a data structure. We can represent a graph using an array of vertices and a two-dimensional array of edges. Before we proceed further, let's familiarize ourselves with some important terms −
1. Vertex − Each node of the graph is represented as a vertex. In the following example, the labeled circles represent vertices. Thus, A to G are vertices. We can represent them using an array as shown in the following image. Here A can be identified by index 0, B by index 1, and so on.
2. Edge − An edge represents a path or a line between two vertices. In the following example, the lines from A to B, B to C, and so on represent edges. We can use a two-dimensional array to represent the edges as shown in the following image. Here AB can be represented as 1 at row 0, column 1, BC as 1 at row 1, column 2, and so on, keeping other combinations as 0.
3. Adjacency − Two nodes (vertices) are adjacent if they are connected to each other through an edge. In the following example, B is adjacent to A, C is adjacent to B, and so on.
4. Path − Path represents a sequence of edges between the two vertices. In the following example, ABCD represents a path from A to D.
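The array-based representation described in these terms can be sketched as follows. The edge set used here (A-B, B-C, C-D only, undirected) is an assumption for illustration, since the original figure is not reproduced; `add_edge` and `is_path` are hypothetical helper names.

```python
# Vertices stored in an array: A is index 0, B is index 1, and so on.
vertices = ["A", "B", "C", "D", "E", "F", "G"]
index = {name: i for i, name in enumerate(vertices)}

# Two-dimensional array of edges, initially all 0.
n = len(vertices)
edges = [[0] * n for _ in range(n)]


def add_edge(u, v):
    """Record an undirected edge between u and v as 1 in both directions."""
    i, j = index[u], index[v]
    edges[i][j] = 1
    edges[j][i] = 1


# Assumed edges for illustration; the source figure is not available.
for u, v in [("A", "B"), ("B", "C"), ("C", "D")]:
    add_edge(u, v)


def is_path(sequence):
    """Return True if every consecutive pair in the sequence is adjacent."""
    return all(edges[index[u]][index[v]] == 1
               for u, v in zip(sequence, sequence[1:]))


print(edges[0][1])      # AB is stored at row 0, column 1 -> 1
print(is_path("ABCD"))  # True
print(is_path("ACD"))   # False, because A and C are not adjacent
```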
Graph Terminology
1. Graph: A collection of nodes (or vertices) and edges that connect pairs of nodes.
2. Vertex (Node): The fundamental unit by which graphs are formed. A vertex represents a point in the graph.
3. Edge: A line connecting two vertices in a graph. It can be directed (having a direction) or undirected (no direction).
4. Directed Graph (Digraph): A graph in which the edges have a direction, indicating a one-way relationship between vertices.
5. Undirected Graph: A graph in which the edges do not have a direction, indicating a two-way relationship.
6. Weighted Graph: A graph in which edges have weights assigned to them, typically representing costs, lengths, or capacities.
7. Unweighted Graph: A graph in which edges do not have weights.
8. Adjacent (Neighbors): Two vertices are adjacent if there is an edge connecting them.
9. Degree: The degree of a vertex is the number of edges connected to it.
For directed graphs, there are two kinds of degree: in-degree (the number of edges coming into the vertex) and out-degree (the number of edges going out from the vertex).
10. Path: A sequence of edges that allows you to go from one vertex to another.
11. Cycle: A path that starts and ends at the same vertex without traversing any edge more than once.
12. Loop: An edge that connects a vertex to itself.
13. Subgraph: A graph formed from a subset of the vertices and edges of another graph.
14. Connected Graph: An undirected graph is connected if there is a path between every pair of vertices.
15. Disconnected Graph: A graph is disconnected if it is not connected, i.e., if there are at least two vertices with no path between them.
16. Complete Graph: A graph in which there is an edge between every pair of vertices.
17. Bipartite Graph: A graph whose vertices can be divided into two disjoint sets such that every edge connects a vertex in one set to a vertex in the other set.
18. Tree: A connected, undirected graph with no cycles.
19. Acyclic Graph: A graph with no cycles. In the context of directed graphs, it is often called a Directed Acyclic Graph (DAG).
20. Graph Isomorphism: Two graphs are isomorphic if there is a one-to-one correspondence between their vertex sets that preserves edge connectivity.
21. Parallel edges: In graph theory, parallel edges (also called multiple edges or a multi-edge) are two or more edges that are incident to the same two vertices. A simple graph has no parallel edges.
22. Topological Sort: Topological sorting is an algorithm that orders the vertices of a directed acyclic graph (DAG) so that, for every edge from u to v, u appears before v. In other words, the order satisfies all dependencies. This algorithm is commonly used in several domains, such as task scheduling, software engineering, and dependency resolution.
23. Spanning Tree: A spanning tree is a subgraph of a graph G that covers all the vertices with the minimum possible number of edges. Hence, a spanning tree does not have cycles and it cannot be disconnected.
By this definition, we can draw a conclusion that every connected and undirected Graph G has at least one spanning tree. A disconnected graph does not have any spanning
tree, as it cannot be spanned to all its vertices.
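Topological sorting (item 22) can be sketched with Kahn's algorithm, which repeatedly removes vertices whose in-degree (item 9) is zero. This is one common approach rather than the only one; the function name `topological_sort` and the sample task graph are illustrative assumptions.

```python
from collections import deque


def topological_sort(graph):
    """Return the vertices of a DAG in dependency order (Kahn's algorithm).

    graph maps each vertex to the list of vertices it points to.
    Raises ValueError if the graph contains a cycle.
    """
    # In-degree = number of edges coming into each vertex.
    in_degree = {v: 0 for v in graph}
    for targets in graph.values():
        for t in targets:
            in_degree[t] = in_degree.get(t, 0) + 1

    # Vertices with no incoming edges have no unmet dependencies.
    ready = deque(v for v, d in in_degree.items() if d == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for t in graph.get(v, []):
            in_degree[t] -= 1          # the edge v -> t is now satisfied
            if in_degree[t] == 0:
                ready.append(t)

    if len(order) != len(in_degree):
        raise ValueError("graph has a cycle, so no topological order exists")
    return order


# Hypothetical build steps: an edge u -> v means u must come before v.
tasks = {"compile": ["link"], "link": ["test"], "test": [], "fetch": ["compile"]}
print(topological_sort(tasks))  # ['fetch', 'compile', 'link', 'test']
```

If the input contains a cycle, some vertices never reach in-degree zero, which is why the sketch compares the length of the result with the number of vertices.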
Types of Graphs:
There are two basic types of graphs −
1. Directed graph, as the name suggests, consists of edges that possess a direction that goes either away from a vertex or towards the vertex.
2. Undirected graphs have edges that are not directed at all.
3. Other Type of Graphs
Graph Representation
Graphs are fundamental data structures used to represent relationships between entities. To store and manipulate these relationships efficiently, several different graph representations
exist.
Here's an overview of the most common ones:
1. Adjacency Matrix
2. Adjacency List
1. Adjacency Matrix
An adjacency matrix is a 2D array where each cell [i, j] represents the presence or absence of an edge between vertices i and j. If the graph is weighted, the cell can store the
weight of the edge.
Here's an overview of the key aspects of adjacency matrices:
1.1 Adjacency Matrix Structure
Let G = (V, E) be a graph with n nodes, n > 0. The adjacency matrix a of G is an n x n array whose elements are given by:
a[i][j] = 1 if there is an edge from vertex i to vertex j, and a[i][j] = 0 otherwise.
The adjacency matrix is a square matrix of size N x N, where N is the number of vertices in the graph.
Each element (i, j) of the matrix represents the presence (1) or absence (0) of an edge between vertex i and vertex j.
For undirected graphs, the matrix is symmetric, meaning a[i][j] = a[j][i].
For directed graphs, the adjacency matrix is not necessarily symmetric, meaning a[i][j] may differ from a[j][i].
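A minimal sketch of this structure, assuming vertices are numbered 0 to N-1. The helper names `build_matrix` and `is_symmetric` and the three-vertex sample are illustrative, not part of the original notes.

```python
def build_matrix(n, edge_list, directed=False):
    """Return an n x n adjacency matrix for vertices numbered 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for i, j in edge_list:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1  # mirror the entry for an undirected edge
    return a


def is_symmetric(a):
    """Return True if a[i][j] == a[j][i] for every pair (i, j)."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


sample_edges = [(0, 1), (1, 2), (2, 0)]  # hypothetical 3-vertex example
undirected = build_matrix(3, sample_edges)
directed = build_matrix(3, sample_edges, directed=True)

print(is_symmetric(undirected))  # True
print(is_symmetric(directed))    # False
```

The same edge list produces a symmetric matrix when treated as undirected and an asymmetric one when treated as directed.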
Advantages of Adjacency Matrix
Efficient for dense graphs.
Quick to check if there is an edge between two vertices.
Disadvantages of Adjacency Matrix
Requires O(N²) space regardless of the number of edges, as it allocates memory for every possible edge, which is impractical for large sparse graphs.
Adding or removing vertices involves resizing the matrix, which can be inefficient.
2. Adjacency List
An adjacency list stores, for each vertex, a list of the vertices that are directly connected to it.
The n rows of adjacency matrix are represented as n linked lists. There is one list for each vertex in the graph. The nodes in the list i represent the vertices that are
adjacent from vertex i. Each node has at least two fields: vertex and link. The vertex field contains the indices of the vertices adjacent to vertex i. The adjacency lists for
the undirected graph and directed graph, are illustrated below.
Here's an overview of the key features of adjacency lists:
2.1 Adjacency List Structure
Each vertex in the graph has a corresponding list.
This list stores the identifiers of all the vertices connected to the original vertex by an edge.
The list can be implemented using different data structures like arrays, linked lists, or even hash tables.
The implementation choice depends on the desired functionality and performance characteristics.
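One possible implementation, sketched here with a Python dictionary that maps each vertex to a list of its neighbors. The function name `build_adjacency_list` and the sample edges are assumptions for illustration.

```python
def build_adjacency_list(edge_list, directed=False):
    """Map each vertex to the list of vertices adjacent to it."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, []).append(v)
        if directed:
            adj.setdefault(v, [])       # make sure v appears even with no out-edges
        else:
            adj.setdefault(v, []).append(u)
    return adj


sample_edges = [("A", "B"), ("A", "C"), ("B", "D")]  # hypothetical example
print(build_adjacency_list(sample_edges))
# {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A'], 'D': ['B']}
print(build_adjacency_list(sample_edges, directed=True))
# {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
```

A dictionary of lists only stores the edges that exist, which is why this layout suits sparse graphs.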
Advantages of Adjacency List
Space-efficient for sparse graphs (graphs with relatively few edges compared to the number of vertices).
Easy to add vertices and edges.
Efficient in finding all adjacent vertices of a given vertex.
Disadvantages of Adjacency List
Can be less efficient for dense graphs (where the number of edges is close to the maximum possible).
Determining whether an edge exists between two vertices can be less efficient than with an adjacency matrix.
📦 Operations of Graphs
1. Breadth-First Search (BFS):
It is a graph traversal algorithm that systematically visits all nodes in a graph, exploring all the nodes at a particular level before moving on to the next level.
Definition of Breadth-First Search
Breadth-First Search starts at a specific 'source' node and explores all its neighboring nodes before moving on to their neighbors. This process continues until all nodes in the
component containing the source have been explored.
Characteristics of Breadth-First Search
Level-by-Level Traversal: BFS visits nodes in a level-wise order. It first visits all nodes at one level before moving to the next.
Queue Utilization: BFS uses a queue data structure to manage the order of node traversal. Nodes are dequeued for exploration and their unvisited neighbors are
enqueued.
Marking Visited Nodes: To avoid processing a node more than once, BFS marks each node as visited when it is enqueued.
Uniformity in Path Lengths: BFS is especially useful in finding the shortest path in unweighted graphs, as it visits nodes in order of their distance from the source, ensuring
the shortest path is found first.
Implementation of Breadth-First Search
1. Define a queue whose size is the total number of vertices in the graph.
2. Select any vertex as the starting point for the traversal. Visit that vertex and insert it into the queue.
3. Visit all the non-visited adjacent vertices of the vertex at the front of the queue and insert them into the queue.
4. When there is no new vertex to visit from the vertex at the front of the queue, delete that vertex from the queue.
5. Repeat steps 3 and 4 until the queue becomes empty.
6. When the queue becomes empty, produce the final spanning tree by removing the unused edges from the graph.
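The steps above, including the spanning tree produced in step 6, can be sketched as follows. Instead of removing unused edges afterwards, this sketch records each edge at the moment it leads to a new vertex, which gives the same tree. It also removes the front vertex before scanning its neighbors rather than after, which does not change the result. The function name `bfs_spanning_tree` and the sample graph are illustrative assumptions.

```python
from collections import deque


def bfs_spanning_tree(adj, start):
    """Return the tree edges found by BFS from start, as (parent, child) pairs."""
    visited = {start}
    queue = deque([start])
    tree_edges = []
    while queue:
        vertex = queue.popleft()          # vertex at the front of the queue
        for neighbor in adj[vertex]:
            if neighbor not in visited:   # edge leads to a new vertex: keep it
                visited.add(neighbor)
                tree_edges.append((vertex, neighbor))
                queue.append(neighbor)
    return tree_edges


# Hypothetical undirected graph: a square 1-2-4-3-1 with a diagonal 1-4.
adj = {1: [2, 3, 4], 2: [1, 4], 3: [1, 4], 4: [1, 2, 3]}
print(bfs_spanning_tree(adj, 1))  # [(1, 2), (1, 3), (1, 4)]
```

For a connected graph with n vertices, the result always has n - 1 edges; the edges 2-4 and 3-4 are the unused ones here.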
Advantages of Breadth-First Search:
Simple and easy to implement.
Guaranteed to find all reachable nodes.
Efficient for finding shortest paths in unweighted graphs.
Disadvantages of Breadth-First Search:
May use more memory than Depth-First Search (DFS) on large graphs, because the queue can hold an entire level of nodes at once.
May explore unnecessary nodes if searching for a specific target node.
Pseudo-Code
BFS(graph, start_vertex):
    Create a queue Q
    Mark start_vertex as visited and enqueue it into Q
    While Q is not empty:
        vertex = Q.dequeue() // Remove the vertex from the front of the queue
        Visit(vertex)
        For each neighbor 'n' of vertex:
            If n is not visited:
                Mark n as visited
                Enqueue n into Q
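A runnable Python version of the pseudo-code might look like this. The sample graph is hypothetical: its edges are chosen so that the visiting order agrees with the queue trace that follows, but the actual edges of the original figure are not known.

```python
from collections import deque


def bfs(graph, start_vertex):
    """Return the vertices reachable from start_vertex in BFS order."""
    visited = {start_vertex}
    queue = deque([start_vertex])
    order = []
    while queue:
        vertex = queue.popleft()   # remove the vertex from the front of the queue
        order.append(vertex)       # "visit" the vertex
        for n in graph[vertex]:
            if n not in visited:
                visited.add(n)     # mark when enqueued, so it is never added twice
                queue.append(n)
    return order


# Assumed adjacency lists (not taken from the original figure).
graph = {
    9: [3],
    3: [9, 2, 4, 10],
    2: [3, 1, 5, 7, 8],
    4: [3], 10: [3], 1: [2],
    5: [2, 6],
    7: [2], 8: [2], 6: [5],
}
print(bfs(graph, 9))  # [9, 3, 2, 4, 10, 1, 5, 7, 8, 6]
```

`collections.deque` is used because removing from the front of a plain list takes time proportional to its length.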
Queue trace (front of the queue on the left, rear on the right; "(visit)" marks the vertex being visited):
9(visit)
9 3(visit)
9 3 2 4 10
9 3 2(visit) 4 10 1 5 7 8
9 3 2 4(visit) 10 1 5 7 8
9 3 2 4 10(visit) 1 5 7 8
9 3 2 4 10 1(visit) 5 7 8
9 3 2 4 10 1 5(visit) 7 8 6
Notes:
1. 📚 BFS: Library Search
You want to find a book by topic:
You first scan all books in the first row (level),
Then move to the second row, and so on.
2. 📦 Uses a Queue (FIFO) to remember nodes to visit next.
2. Depth-First Search (DFS):
Depth-First Search (DFS) is a fundamental algorithm in graph theory used for traversing or searching through a graph or tree data structure. The algorithm starts at a
selected node (the 'source') and explores as far as possible along each branch before backtracking. This method is called "depth-first" because it tries to go as deep as
possible into the graph from the current vertex, before exploring other vertices at the same level or backing up to previous levels.
Characteristics of Depth-First Search
Exploration Strategy: Follows a single path as far as possible, backtracking if necessary, before exploring other branches.
Data Structure: Uses a stack to keep track of the vertices on the current path, so that the search can backtrack to them.
Strengths: Efficient for finding paths, identifying connected components, and topological sorting.
Weaknesses: Does not guarantee the shortest path, even in an unweighted graph, and may follow one long branch far from the source before examining vertices that are close to it.
Implementation of Depth-First Search
1. Define a stack whose size is the total number of vertices in the graph.
2. Select any vertex as the starting point for the traversal. Visit that vertex and push it onto the stack.
3. Visit any one of the non-visited adjacent vertices of the vertex at the top of the stack and push it onto the stack.
4. Repeat step 3 until there is no new vertex to visit from the vertex at the top of the stack.
5. When there is no new vertex to visit, backtrack by popping one vertex from the stack.
6. Repeat steps 3, 4 and 5 until the stack becomes empty.
7. When the stack becomes empty, produce the final spanning tree by removing the unused edges from the graph.
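The stack-based steps above can be sketched as follows. The function name `dfs` and the sample graph are illustrative assumptions. The sketch rescans the neighbor list of the top vertex on every iteration, which keeps it close to the steps at the cost of some repeated work.

```python
def dfs(adj, start):
    """Return vertices in the order DFS first visits them, using an explicit stack."""
    visited = {start}
    order = [start]
    stack = [start]
    while stack:
        top = stack[-1]
        # Find any one unvisited neighbor of the vertex on top of the stack.
        next_vertex = next((n for n in adj[top] if n not in visited), None)
        if next_vertex is None:
            stack.pop()                 # nothing new to visit: backtrack
        else:
            visited.add(next_vertex)
            order.append(next_vertex)
            stack.append(next_vertex)   # go one level deeper
    return order


# Hypothetical undirected graph, given as adjacency lists.
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(dfs(adj, 1))  # [1, 2, 4, 3, 5]
```

Note how vertex 3 is reached through 4 rather than directly from 1: the search goes as deep as it can before returning to earlier vertices.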
Example traversal orders from the figure: 1, 2, 8, 7, 5, 6, 4, 3, 9, 10 and 1, 2, 3, 4, 9, 10, 5, 6, 7, 8
Floyd-Warshall’s Algorithm
✅ Warshall’s Algorithm
Warshall’s Algorithm is used to find the transitive closure of a directed graph.
In simple words: It tells you whether you can go from one node to another — directly or through other nodes.
🔍 What is Transitive Closure?
If you can go from node A → B, and from B → C, then it means A → C is also possible — this is transitivity.
Transitive closure shows all such paths.
🔧 What does Warshall’s Algorithm Do?
Given a graph in the form of an adjacency matrix, Warshall’s Algorithm updates it to tell whether a path exists between every pair of nodes.
📥 Input: Adjacency Matrix
A matrix where:
1 means there's a direct edge from node i to j
0 means no direct edge
📤 Output: Path Matrix (Transitive Closure)
A matrix where:
1 means there's some path from i to j (direct or indirect)
0 means there's no path
🧠 How It Works – Step by Step
For a graph with n nodes:
1. Start with the adjacency matrix A
2. Repeat the following for every node k = 1 to n:
   For each pair of nodes (i, j):
   Update A[i][j] = A[i][j] OR (A[i][k] AND A[k][j])
This checks:
"Is there already a path from i to j, or can we go from i to j through k?"
🔢 Example
Suppose we have 3 nodes: A, B, C
Initial adjacency matrix:
A B C
A 1 1 0
B 0 1 1
C 0 0 1
This means:
A→B
B→C
Each node reaches itself (diagonal = 1)
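Step 1: Check paths through node A
No other node has an edge into A, so routing through A adds no new paths and the matrix is unchanged.
Step 2: Check paths through node B
Now we find:
A → B and B → C
So A → C (indirect path through B)
Update A → C to 1
Step 3: Check paths through node C
C has no edges going out to other nodes, so nothing changes.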
Final matrix (after Warshall):
A B C
A 1 1 1
B 0 1 1
C 0 0 1
✅ Now it shows all possible paths between nodes.
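The update rule and the worked example can be checked with a short sketch. The function name `warshall` is an assumption, and the loops use Python's 0-based indices (k = 0 to n-1) instead of k = 1 to n.

```python
def warshall(adjacency):
    """Return the transitive closure (path matrix) of a 0/1 adjacency matrix."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]  # copy so the input is not modified
    for k in range(n):                     # allow paths that pass through node k
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


# Adjacency matrix from the example (rows and columns in the order A, B, C).
A = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
for row in warshall(A):
    print(row)
# [1, 1, 1]
# [0, 1, 1]
# [0, 0, 1]
```

The loop over k must be the outermost one, so that every pair (i, j) has been updated for intermediate nodes up to k before node k + 1 is considered.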
🧾 Summary
Feature Description
Purpose Find transitive closure
Input Adjacency matrix of a graph
Output Path matrix (shows all reachabilities)
Time Complexity O(n³)
Works on Directed graphs
Transitive closure of a graph using the Floyd-Warshall Algorithm
Given a directed graph, determine whether a vertex j is reachable from another vertex i, for all vertex pairs (i, j) in the graph. Here, reachable means that there is a path from vertex i to vertex j.
The reachability matrix is called the transitive closure of the graph.
Simple logic will be:
A is connected to B
B is connected to C
hence A → C
Formula: A[i][j] = A[i][j] OR (A[i][k] AND A[k][j]), applied for every intermediate vertex k.
Example:
Complexity
1. Time Complexity: O(V³), where V is the number of vertices.
Reason: The main iteration consists of three nested loops, and each loop runs V times. As a result, the time complexity is O(V³).
2. Space Complexity: O(V²), where V is the number of vertices.
Reason: We allocate a single two-dimensional matrix whose number of rows and number of columns both equal the number of vertices V. Thus, the space complexity is O(V²).