Graph Theory Basics, DFS, and BFS Explained
1. Basics of Graph Theory
Graph theory is a branch of mathematics that studies graphs, which are
mathematical structures used to model pairwise relations between objects. A
graph is a powerful tool for representing connections and relationships in
various domains.
What is a Graph?
A graph G is formally defined as a pair (V,E), where:
● V is a set of vertices (also called nodes). These represent the objects or
entities.
● E is a set of edges (also called links or arcs). These represent the
connections or relationships between pairs of vertices.
Types of Graphs:
● Undirected Graph: Edges have no direction. If an edge connects vertex
A to vertex B, it also connects B to A. (e.g., friendship on social media).
● Directed Graph (Digraph): Edges have a direction. An edge from A to
B does not necessarily imply an edge from B to A. (e.g., one-way streets,
Twitter followers).
● Weighted Graph: Each edge has an associated numerical value (weight
or cost), representing, for example, distance, time, or capacity. (e.g.,
road network with distances).
● Unweighted Graph: Edges have no associated weights; they simply
represent a connection.
Graph Representations
Graphs can be represented in various ways, each with its own advantages
and disadvantages depending on the specific application and operations to
be performed. The two most common representations are the Adjacency List
and the Adjacency Matrix.
Adjacency List Representation
An adjacency list represents a graph as an array or dictionary where each
index/key corresponds to a vertex, and the value at that index/key is a list
(or collection) of its adjacent vertices.
Example (Undirected Graph):
Consider a graph with vertices V=0,1,2,3 and edges E=(0,1),(0,2),(1,2),(2,3).
Adjacency List:
● 0-> [1, 2]
● 1->[0, 2]
● 2-> [0, 1, 3]
● 3-> [2]
Pros:
● Space Efficient: For sparse graphs (graphs with relatively few edges
compared to the maximum possible), it uses less memory, as it only
stores existing edges. Space complexity is O(∣V∣+∣E∣).
● Efficient for finding neighbors: Quickly lists all neighbors of a vertex.
Useful for traversal algorithms like DFS and BFS.
Cons:
● Checking for existence of an edge: To check if an edge (u,v) exists,
you might need to iterate through the adjacency list of u, which can be
O(textdegree(u)) in the worst case.
Adjacency Matrix Representation
An adjacency matrix represents a graph as a square matrix of size
∣V∣times∣V∣, where ∣V∣ is the number of vertices.
● For an unweighted graph, matrix[i][j] is 1 if there is an edge from vertex i
to vertex j, and 0 otherwise.
● For a weighted graph, matrix[i][j] stores the weight of the edge from i to
j, and often 0 or infinity if no edge exists.
Example (Undirected Graph, same as above):
Vertices V=0,1,2,3 and edges E=(0,1),(0,2),(1,2),(2,3).
0 1 2 3
_____________________
0 | 0 1 1 0
1 | 1 0 1 0
2 | 1 1 0 1
3 | 0 0 1 0
Note: For an undirected graph, the adjacency matrix is symmetric (matrix[i]
[j] == matrix[j][i]).
Pros:
● Efficient for checking edge existence: Checking if an edge (u,v)
exists takes O(1) time by simply looking up matrix[u][v].
● Easy to implement: Simple 2D array.
Cons:
● Space Inefficient: For sparse graphs, it uses a lot of memory (O(∣V∣2)),
as it stores a value for every possible pair of vertices, even if no edge
exists.
● Inefficient for finding neighbors: To find all neighbors of a vertex,
you must iterate through an entire row (or column) of the matrix, taking
O(∣V∣) time.
Applications of Graph Theory in AI
Graph theory is fundamental to many areas of Artificial Intelligence:
● Pathfinding and Navigation: Algorithms like Dijkstra's, A*, and BFS
are used extensively in robotics, game AI, and GPS systems to find the
shortest or optimal paths.
● Social Network Analysis: Representing social connections as graphs
helps in understanding community structures, influence propagation, and
recommendation systems.
● Knowledge Representation: Semantic networks and knowledge
graphs represent relationships between concepts and entities, forming
the backbone of many expert systems and natural language
understanding models.
● Machine Learning:
○ Graph Neural Networks (GNNs): A growing field that applies deep
learning to graph-structured data (e.g., molecular structures, social
networks) for tasks like node classification, link prediction, and graph
classification.
○ Clustering: Graph-based clustering algorithms group similar data
points represented as nodes in a graph.
● Computer Vision: Graph cuts are used in image segmentation, and
scene understanding can involve representing objects and their
relationships as graphs.
● Natural Language Processing (NLP): Dependency parsing and
semantic role labeling can be modeled using graphs to represent
grammatical relationships between words in a sentence.
● Game Theory: Representing game states and possible moves as a
game tree (a type of graph) is crucial for AI agents to plan strategies
(e.g., minimax algorithm).
2. Depth First Search (DFS)
Definition: Depth First Search (DFS) is an algorithm for traversing or
searching tree or graph data structures. The algorithm starts at the root (or
an arbitrary node) and explores as far as possible along each branch before
backtracking.
3. Working Principle of DFS
DFS operates like exploring a maze: you go down one path as far as you can
until you hit a dead end or a visited node. Then, you backtrack to the last
point where you had an unexplored option and try a different path.
Working Principle:
1. Start: Pick a starting node and mark it as visited. Add it to the traversal
order.
2. Explore Deeply: For the current node, pick one of its unvisited
neighbors.
3. Recurse/Push: Move to that neighbor, mark it visited, add it to the
traversal order, and repeat the process from this new node. This is often
implemented using recursion, where the call stack implicitly manages
the "backtracking" points.
4. Backtrack: If the current node has no unvisited neighbors (i.e., all its
neighbors have been visited or there are no neighbors), or if you've
explored all paths from it, you "backtrack" to the previous node in the
path and continue exploring from there.
5. Repeat: Continue until all reachable nodes from the starting point have
been visited. If the graph is disconnected, you might need to pick
another unvisited node and restart DFS from there to cover all
components.
Data Structure Used:
DFS primarily uses a Stack (either explicitly or implicitly through recursion's
call stack).
● Explicit Stack: When implementing DFS iteratively, you use a stack to
keep track of the nodes to visit. When you pop a node, you push its
unvisited neighbors onto the stack.
● Implicit Stack (Recursion): When implementing DFS recursively, the
function call stack serves as the stack. Each recursive call adds a new
frame to the stack, and when a function returns (backtracks), its frame is
popped.
Note on Iterative DFS and order: The order of neighbors pushed onto the
stack can affect the exact traversal path, but it will still be a valid DFS.
Pushing them in reverse order of how you'd normally iterate them (e.g., if
adj_list[u] is [v1, v2, v3], push v3, then v2, then v1 so v1 is processed first)
can make the iterative output match the recursive one more closely if the
recursive one processes neighbors in the order they appear in the adjacency
list.
5. Breadth First Search (BFS)
Definition: Breadth First Search (BFS) is an algorithm for traversing or
searching tree or graph data structures. It starts at the tree root (or some
arbitrary node of a graph) and explores all of the neighbor nodes at the
present depth before moving on to the nodes at the next depth level.
6. Working Principle of BFS
BFS explores the graph level by level. Imagine throwing a stone into a pond;
the ripples expand outwards. BFS works similarly, visiting all nodes at
"distance 1" from the start node, then all nodes at "distance 2", and so on.
Working Principle:
1. Start: Pick a starting node, mark it as visited, and add it to a queue.
2. Dequeue and Explore: While the queue is not empty:
○ Dequeue a node from the front of the queue. This is the current node.
Add it to the traversal order.
○ Enqueue Neighbors: For each unvisited neighbor of the current
node:
■ Mark the neighbor as visited.
■ Enqueue the neighbor.
3. Repeat: Continue until the queue is empty. This means all reachable
nodes from the starting point have been visited. If the graph is
disconnected, you might need to pick another unvisited node and restart
BFS from there to cover all components.
Data Structure Used:
BFS primarily uses a Queue.
● The queue stores nodes that have been visited but whose neighbors
have not yet been fully explored. This ensures that nodes are processed
in the order they are discovered at each "level" of the graph.
This recursive BFS is less intuitive because the "recursion" is based on levels,
not individual node paths, and it still relies on a list (acting as a temporary
queue for the current level) to collect nodes for the next recursive call. The
iterative approach with a deque is generally preferred for BFS due to its
clarity and direct mapping to the level-by-level exploration.