4th Data Structure
4th Data Structure
A graph can be defined as group of vertices and edges that are used to connect
these vertices. A graph can be seen as a cyclic tree, where the vertices (Nodes)
maintain any complex relationship among them instead of having parent child
relationship.
Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set
of vertices and E(G) represents the set of edges which are used to connect these
vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E),
(E,D), (D,B), (D,A)) is shown in the following figure.
In a directed graph, edges form an ordered pair. Edges represent a specific path
from some vertex A to another vertex B. Node A is called initial node while node B is
called terminal node.
Closed Path
A path will be called as closed path if the initial node is same as terminal node. A
path will be closed path if V0=VN.
Simple Path
If all the nodes of the graph are distinct with an exception V0=VN, then such path P is
called as closed simple path.
Cycle
A cycle can be defined as the path which has no repeated edges or vertices except
the first and last vertices.
Connected Graph
A connected graph is the one in which some path exists between every two vertices
(u, v) in V. There are no isolated nodes in connected graph.
Complete Graph
A complete graph is the one in which every node is connected with all other nodes. A
complete graph contain n(n-1)/2 edges where n is the number of nodes in the graph.
Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or
weight. The weight of an edge e can be given as w(e) which must be a positive (+)
value indicating the cost of traversing the edge.
Digraph
A digraph is a directed graph in which each edge of the graph is associated with
some direction and the traversing can be done only in the specified direction.
Loop
An edge that is associated with the similar end points can be called as Loop.
Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called
as neighbours or adjacent nodes.
BFS algorithm
In this article, we will discuss the BFS algorithm in the data structure. Breadth-first
search is a graph traversal algorithm that starts traversing the graph from the root
node and explores all the neighboring nodes. Then, it selects the nearest node and
explores all the unexplored nodes. While using BFS for traversal, any node in the
graph can be considered as the root node.
There are many ways to traverse the graph, but among them, BFS is the most
commonly used approach. It is a recursive algorithm to search all the vertices of a
tree or graph data structure. BFS puts every vertex of the graph into two categories -
visited and non-visited. It selects a single node in a graph and, after that, visits all the
nodes adjacent to the selected node.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS
= 1) and set
their STATUS = 2
Advertisement
(waiting state)
[END OF LOOP]
Step 6: EXIT
BFS algorithm
In this article, we will discuss the BFS algorithm in the data structure. Breadth-first
search is a graph traversal algorithm that starts traversing the graph from the root
node and explores all the neighboring nodes. Then, it selects the nearest node and
explores all the unexplored nodes. While using BFS for traversal, any node in the
graph can be considered as the root node.
There are many ways to traverse the graph, but among them, BFS is the most
commonly used approach. It is a recursive algorithm to search all the vertices of a
tree or graph data structure. BFS puts every vertex of the graph into two categories -
visited and non-visited. It selects a single node in a graph and, after that, visits all the
nodes adjacent to the selected node.
o BFS can be used to find the neighboring locations from a given source
location.
o In a peer-to-peer network, BFS algorithm can be used as a traversal
method to find all the neighboring nodes. Most torrent clients, such as
BitTorrent, uTorrent, etc. employ this process to find "seeds" and
"peers" in the network.
o BFS can be used in web crawlers to create web page indexes. It is one
of the main algorithms that can be used to index web pages. It starts
traversing from the source page and follows the links associated with
the page. Here, every web page is considered as a node in the graph.
o BFS is used to determine the shortest path and minimum spanning tree.
o BFS is also used in Cheney's technique to duplicate the garbage
collection.
o It can be used in ford-Fulkerson method to compute the maximum flow
in a flow network.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS
= 1) and set
their STATUS = 2
(waiting state)
[END OF LOOP]
Step 6: EXIT
In the above graph, minimum path 'P' can be found by using the BFS that will start
from Node A and end at Node E. The algorithm uses two queues, namely QUEUE1
and QUEUE2. QUEUE1 holds all the nodes that are to be processed, while
QUEUE2 holds all the nodes that are processed and deleted from QUEUE1.
DFS (Depth First Search) algorithm
In this article, we will discuss the DFS algorithm in the data structure. It is a recursive
algorithm to search all the vertices of a tree data structure or a graph. The depth-first
search (DFS) algorithm starts with the initial node of graph G and goes deeper until
we find the goal node or the node with no children.
Because of the recursive nature, stack data structure can be used to implement the
DFS algorithm. The process of implementing the DFS is similar to the BFS algorithm.
The step by step process to implement the DFS traversal is given as follows -
1. First, create a stack with the total number of vertices in the graph.
2. Now, choose any vertex as the starting point of traversal, and push that vertex
into the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the top of the
stack) to the top of the stack.
4. Now, repeat steps 3 and 4 until no vertices are left to visit from the vertex on the
stack's top.
5. If no vertex is left, go back and pop a vertex from the stack.
6. Repeat steps 2, 3, and 4 until the stack is empty.
Algorithm
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbors of N that are in the ready state (whose
STATUS = 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Example of DFS algorithm
Now, let's understand the working of the DFS algorithm by using an example. In the
example given below, there is a directed graph having 7 vertices.
1. STACK: H
Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the
neighbors of H onto the stack that are in ready state.
1. Print: H]STACK: A
Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the
neighbors of A onto the stack that are in ready state.
1. Print: A
2. STACK: B, D
Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the
neighbors of D onto the stack that are in ready state.
Advertisement
1. Print: D
2. STACK: B, F
Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the
neighbors of F onto the stack that are in ready state.
1. Print: F
2. STACK: B
Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the
neighbors of B onto the stack that are in ready state.
1. Print: B
2. STACK: C
Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the
neighbors of C onto the stack that are in ready state.
Advertisement
1. Print: C
2. STACK: E, G
Step 8 - POP the top element from the stack, i.e., G and PUSH all the neighbors of
G onto the stack that are in ready state.
1. Print: G
2. STACK: E
Step 9 - POP the top element from the stack, i.e., E and PUSH all the neighbors of E
onto the stack that are in ready state.
1. Print: E
2. STACK:
Spanning tree
In this article, we will discuss the spanning tree and the minimum spanning tree. But
before moving directly towards the spanning tree, let's first see a brief description of
the graph and its types.
Graph
A graph can be defined as a group of vertices and edges to connect these vertices.
The types of graphs are given as follows -
A spanning tree consists of (n-1) edges, where 'n' is the number of vertices (or
nodes). Edges of the spanning tree may or may not have weights assigned to them.
All the possible spanning trees created from the given graph G would have the same
number of vertices, but the number of edges in the spanning tree would be equal to
the number of vertices in the given graph minus 1.
A complete undirected graph can have nn-2 number of spanning trees where n is the
number of vertices in the graph. Suppose, if n = 5, the number of maximum possible
spanning trees would be 55-2 = 125.
o Cluster Analysis
o Civil network planning
o Computer network routing protocol
Now, let's understand the spanning tree with the help of an example.
Some of the possible spanning trees that will be created from the above graph are
given as follows -
Properties of spanning-tree
Some of the properties of the spanning tree are given as follows -
o There can be more than one spanning tree of a connected graph G.
o A spanning tree does not have any cycles or loop.
o A spanning tree is minimally connected, so removing one edge from
the tree will make the graph disconnected.
o A spanning tree is maximally acyclic, so adding one edge to the tree
will create a loop.
o There can be a maximum nn-2 number of spanning trees that can be
created from a complete graph.
o A spanning tree has n-1 edges, where 'n' is the number of nodes.
o If the graph is a complete graph, then the spanning tree can be
constructed by removing maximum (e-n+1) edges, where 'e' is the
number of edges and 'n' is the number of vertices.
So, a spanning tree is a subset of connected graph G, and there is no spanning tree
of a disconnected graph.
The sum of the edges of the above graph is 16. Now, some of the possible spanning
trees created from the above graph are -
So, the minimum spanning tree that is selected from the above spanning trees for
the given weighted graph is -
Prim's algorithm - It is a greedy algorithm that starts with an empty spanning tree. It
is used to find the minimum spanning tree from the graph. This algorithm finds the
subset of edges that includes every vertex of the graph such that the sum of the
weights of the edges can be minimized.
Prim's Algorithm
In this article, we will discuss the prim's algorithm. Along with the algorithm, we will
also see the complexity, working, example, and implementation of prim's algorithm.
Before starting the main topic, we should discuss the basic and important terms such
as spanning tree and minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning
tree in which the sum of the weights of the edge is minimum. The weight of the
spanning tree is the sum of the weights given to the edges of the spanning tree.
Prim's Algorithm is a greedy algorithm that is used to find the minimum spanning
tree from a graph. Prim's algorithm finds the subset of edges that includes every
vertex of the graph such that the sum of the weights of the edges can be minimized.
Prim's algorithm starts with the single node and explores all the adjacent nodes with
all the connecting edges at every step. The edges with the minimal weights causing
no cycles in the graph got selected.
Step 1 - First, we have to choose a vertex from the above graph. Let's choose B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There
are two edges from vertex B that are B to C with weight 10 and edge B to D with
weight 4. Among the edges, the edge BD has the minimum weight. So, add it to the
MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other
edges. In this case, the edges DE and CD are such edges. Add them to MST and
explore the adjacent of C, i.e., E and A. So, select the edge DE and add it to the
MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE as it would
create a cycle to the graph. So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph.
The cost of the MST is given below -
Advertisement
Algorithm
1. Step 1: Select a starting vertex
2. Step 2: Repeat Steps 3 and 4 until there are fringe vertices
3. Step 3: Select an edge 'e' connecting the tree vertex and fringe vertex that has
minimum weight
4. Step 4: Add the selected edge and the vertex to the minimum spanning tree T
5. [END OF LOOP]
6. Step 5: EXIT
Kruskal's algorithm - This algorithm is also used to find the minimum spanning tree
for a connected weighted graph. Kruskal's algorithm also follows greedy approach,
which finds an optimum solution at every stage instead of focusing on a global
optimum.
Kruskal's Algorithm
In this article, we will discuss Kruskal's algorithm. Here, we will also see the
complexity, working, example, and implementation of the Kruskal's algorithm.
But before moving directly towards the algorithm, we should first understand the
basic terms such as spanning tree and minimum spanning tree.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning
tree in which the sum of the weights of the edge is minimum. The weight of the
spanning tree is the sum of the weights given to the edges of the spanning tree.
Kruskal's Algorithm is used to find the minimum spanning tree for a connected
weighted graph. The main target of the algorithm is to find the subset of edges by
using which we can traverse every vertex of the graph. It follows the greedy
approach that finds an optimum solution at every stage instead of focusing on a
global optimum.
The weight of the edges of the above graph is given in the below table -
Edge AB AC AD AE BC CD DE
Weight 1 7 10 5 3 4 2
Now, sort the edges given above in the ascending order of their weights.
Edge AB DE BC CD AE AC AD
Weight 1 2 3 4 5 7 10
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle or
loop.
Step 4 - Now, pick the edge CD with weight 4 to the MST, as it is not forming the
cycle.
Advertisement
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create the
cycle, so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle, so
discard it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the
cycle, so discard it.
Advertisement
So, the final minimum spanning tree obtained from the given weighted graph by
using Kruskal's algorithm is -
The cost of the MST is = AB + DE + BC + CD = 1 + 2 + 3 + 4 = 10.
Now, the number of edges in the above tree equals the number of vertices minus 1.
So, the algorithm stops here.
Algorithm
1. Step 1: Create a forest F in such a way that every vertex of the graph is a sep
arate tree.
2. Step 2: Create a set E that contains all the edges of the graph.
3. Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
4. Step 4: Remove an edge from E with minimum weight
5. Step 5: IF the edge obtained in Step 4 connects two different trees, then add it
to the forest F
6. (for combining two trees into one tree).
7. ELSE
8. Discard the edge
9. Step 6: END
o TimeComplexity
The time complexity of Kruskal's algorithm is O(E logE) or O(V logV),
where E is the no. of edges, and V is the no. of vertices.
Dijkstra's Algorithm
The following tutorial will teach us about Dijkstra's Shortest Path Algorithm. We will
understand the working of Dijkstra's Algorithm with a stepwise graphical explanation.
We will cover the following:
Components of a Graph
1. Vertices:Vertices are the basic units of the graph used to represent real-life
objects, persons, or entities. Sometimes, vertices are also known as Nodes.
2. Edges:Edges are drawn or used to connect two vertices of the graph.
Sometimes, edges are also known as Arcs.
In the above figure, the Vertices/Nodes are denoted with Colored Circles, and the
Edges are denoted with the lines connecting the nodes.
For example, we could use Graphs to design a transportation network model where
the vertices display the facilities that send or receive the products, and the edges
represent roads or paths connecting them. The following is a pictorial representation
of the same:
Graphs are also utilized in different Social Media Platforms like LinkedIn, Facebook,
Twitter, and more. For example, Platforms like Facebook use Graphs to store the
data of their users where every person is indicated with a vertex, and each of them is
a structure containing information like Person ID, Name, Gender, Address, etc.
Types of Graphs
The Graphs can be categorized into two types:
1. Undirected Graph
2. Directed Graph
Undirected Graph: A Graph with edges that do not have a direction is termed an
Undirected Graph. The edges of this graph imply a two-way relationship in which
each edge can be traversed in both directions. The following figure displays a simple
undirected graph with four nodes and five edges.
Directed Graph: A Graph with edges with direction is termed a Directed Graph. The
edges of this graph imply a one-way relationship in which each edge can only be
traversed in a single direction. The following figure displays a simple directed graph
with four nodes and five edges.
Figure 4: A Simple Directed Graph
Advertisement
For instance, we can observe a blue number next to each edge in the following
figure of the Weighted Graph. This number is utilized to signify the weight of the
corresponding edge.
Figure 5: An Example of a Weighted Graph
Ever wondered how does Google Maps finds the shortest and fastest route between
two places?
During an Interview with Philip L. Frana for the Communications of the ACM
journal in the year 2001, Dr. Edsger W. Dijkstra revealed:
"What is the shortest way to travel from Rotterdam to Groningen, in general: from
given city to given city? It is the algorithm for the shortest path, which I designed in
about twenty minutes. One morning I was shopping in Amsterdam with my young
fiancée, and tired, we sat down on the café terrace to drink a cup of coffee and I was
just thinking about whether I could do this, and I then designed the algorithm for the
shortest path. As I said, it was a twenty-minute invention. In fact, it was published in
'59, three years later. The publication is still readable, it is, in fact, quite nice. One of
the reasons that it is so nice was that I designed it without pencil and paper. I
learned later that one of the advantages of designing without pencil and paper is that
you are almost forced to avoid all avoidable complexities. Eventually, that algorithm
became to my great amazement, one of the cornerstones of my fame."
Dijkstra thought about the shortest path problem while working as a programmer at
the Mathematical Centre in Amsterdam in 1956 to illustrate the capabilities of a new
computer known as ARMAC. His goal was to select both a problem and a solution
(produced by the computer) that people with no computer background could
comprehend. He developed the shortest path algorithm and later executed it for
ARMAC for a vaguely shortened transportation map of 64 cities in the Netherlands
(64 cities, so 6 bits would be sufficient to encode the city number). A year later, he
came across another issue from hardware engineers operating the next computer of
the institute: Minimize the amount of wire required to connect the pins on the
machine's back panel. As a solution, he re-discovered the algorithm called Prim's
minimal spanning tree algorithm and published it in the year 1959.
1. Dijkstra's Algorithm begins at the node we select (the source node), and it
examines the graph to find the shortest path between that node and all the other
nodes in the graph.
2. The Algorithm keeps records of the presently acknowledged shortest distance
from each node to the source node, and it updates these values if it finds any
shorter path.
3. Once the Algorithm has retrieved the shortest path between the source and
another node, that node is marked as 'visited' and included in the path.
4. The procedure continues until all the nodes in the graph have been included in
the path. In this manner, we have a path connecting the source node to all other
nodes, following the shortest possible path to reach each node.
Each Vertex in this Algorithm will have two properties defined for it:
1. Visited Property
2. Path Property
Visited Property:
1. The 'visited' property signifies whether or not the node has been visited.
2. We are using this property so that we do not revisit any node.
3. A node is marked visited only when the shortest path has been found.
Path Property:
1. The 'path' property stores the value of the current minimum path to the node.
2. The current minimum path implies the shortest way we have reached this node till
now.
3. This property is revised when any neighbor of the node is visited.
4. This property is significant because it will store the final answer for each node.
Initially, we mark all the vertices, or nodes, unvisited as they have yet to be visited.
The path to all the nodes is also set to infinity apart from the source node. Moreover,
the path to the source node is set to zero (0).
We then select the source node and mark it as visited. After that, we access all the
neighboring nodes of the source node and perform relaxation on every node.
Relaxation is the process of lowering the cost of reaching a node with the help of
another node.
In the process of relaxation, the path of each node is revised to the minimum value
amongst the node's current path, the sum of the path to the previous node, and the
path from the previous node to the current node.
Let us suppose that p[n] is the value of the current path for node n, p[m] is the value
of the path up to the previously visited node m, and w is the weight of the edge
between the current node and previously visited one (edge weight between n and
m).
We then mark an unvisited node with the least path as visited in every subsequent
step and update its neighbor's paths.
We repeat this procedure until all the nodes in the graph are marked visited.
Whenever we add a node to the visited set, the path to all its neighboring nodes also
changes accordingly.
If any node is left unreachable (disconnected component), its path remains 'infinity'.
In case the source itself is a separate component, then the path to all other nodes
remains 'infinity
Since the shortest path can be calculated from single source vertex to all
the other vertices in the graph, Dijkstra’s algorithm is also called single-
source shortest path algorithm. The output obtained is called shortest path
spanning tree.
In this chapter, we will learn about the greedy approach of the dijkstra’s
algorithm.
Dijkstra’s Algorithm
The dijkstra’s algorithm is designed to find the shortest path between two
vertices of a graph. These two vertices could either be adjacent or the
farthest points in the graph. The algorithm starts from the source. The
inputs taken by the algorithm are the graph G {V, E}, where V is the set
of vertices and E is the set of edges, and the source vertex S. And the
output is the shortest path spanning tree.
Algorithm
Declare two arrays − distance[] to store the distances from the
source vertex to the other vertices in graph and visited[] to store
the visited vertices.
Set distance[S] to ‘0’ and distance[v] = ∞, where v represents all
the other vertices in the graph.
Add S to the visited[] array and find the adjacent vertices of S with
the minimum distance.
The adjacent vertex to S, say A, has the minimum distance and is
not in the visited array yet. A is picked and added to the visited
array and the distance of A is changed from ∞ to the assigned
distance of A, say d1, where d1 < ∞.
Repeat the process for the adjacent vertices of the visited vertices
until the shortest path spanning tree is formed.
Examples
To understand the dijkstra’s concept better, let us analyze the algorithm
with the help of an example graph −
Step 1
Initialize the distances of all the vertices as ∞, except the source node S.
Vertex S A B C D E
Distance 0 ∞ ∞ ∞ ∞ ∞
Now that the source vertex S is visited, add it into the visited array.
visited = {S}
Step 2
The vertex S has three adjacent vertices with various distances and the
vertex with minimum distance among them all is A. Hence, A is visited
and the dist[A] is changed from ∞ to 6.
S→A=6
S→D=8
S→E=7
Vertex S A B C D E
Distance 0 6 ∞ ∞ 8 7
Visited = {S, A}
Step 3
There are two vertices visited in the visited array, therefore, the adjacent
vertices must be checked for both the visited vertices.
S → D = 8 and S → E = 7.
S → B = S → A + A → B = 6 + 9 = 15
Vertex S A B C D E
Distance 0 6 15 ∞ 8 7
Visited = {S, A, E}
Step 4
S→D=8
S → B = 15
S → C = S → E + E → C = 7 + 5 = 12
Vertex S A B C D E
Distance 0 6 15 12 8 7
Visited = {S, A, E, D}
Step 5
S → C = S → E + E → C = 7 + 5 = 12
S → C = S → D + D → C = 8 + 3 = 11
S → B = S → A + A → B = 6 + 9 = 15
S → B = S → D + D → C + C → B = 8 + 3 + 12 = 23
Vertex S A B C D E
Distance 0 6 15 11 8 7
Visited = { S, A, E, D, C}
Step 6
Visited = {S, A, E, D, C, B}
The shortest path spanning tree is obtained as an output using the
dijkstra’s algorithm.