CSE 208: Data Structure
and Algorithms II
Introduction and Graph Basics
Dr. Mohammed Eunus Ali
Professor
CSE, BUET
Slides are from Dr Tanzima Hashem, tanzimahashem@cse.buet.ac.bd
GRAPHS
? A graph G = (V, E)
⚫ V = set of vertices
⚫ E = set of edges = subset of V × V
⚫ Thus |E| = O(|V|2)
GRAPH VARIATIONS
? Variations:
⚫ A connected graph has a path from every vertex
to every other
⚫ In an undirected graph:
? Edge (u,v) = edge (v,u)
? No self-loops
⚫ In a directed graph:
? Edge (u,v) goes from vertex u to vertex v,
notated u→v
? Self loops are allowed.
GRAPH VARIATIONS
? More variations:
⚫ A weighted graph associates weights with either
the edges or the vertices
? E.g., a road map: edges might be weighted w/
distance
⚫ A multigraph allows multiple edges between the
same vertices
? E.g., the call graph in a program (a function
can get called from multiple points in another
function)
GRAPHS
? We will typically express running times in terms of
|E| and |V| (often dropping the |’s)
⚫ If |E| ≈ |V|2 the graph is dense
⚫ If |E| ≈ |V| the graph is sparse
? If you know you are dealing with dense or sparse
graphs, different data structures may make sense
REPRESENTING GRAPHS
? Assume V = {1, 2, …, n}
? An adjacency matrix represents the graph as a n x
n matrix A:
⚫ A[i, j] = 1 if edge (i, j) ∈ E (or weight of edge)
= 0 if edge (i, j) ∉ E
GRAPHS: ADJACENCY MATRIX
? Example:
A 1 2 3 4
1
a 1
2
d
4 2
3
b c
??
3 4
GRAPHS: ADJACENCY MATRIX
? Example:
A 1 2 3 4
1
a 1 0 1 1 0
2
d
4 2 0 0 1 0
b c 3 0 0 0 0
3 4 0 0 1 0
GRAPHS: ADJACENCY MATRIX
? Space: Θ(V2).
⚫ Not memory efficient for large graphs.
? Time: to list all vertices adjacent to u: Θ(V).
? Time: to determine if (u, v) ∈ E: Θ(1).
GRAPHS: ADJACENCY MATRIX
? The adjacency matrix is a dense representation
⚫ Usually too much storage for large graphs
⚫ But can be very efficient for small graphs
? Most large interesting graphs are sparse
⚫ E.g., planar graphs, in which no edges cross,
have |E| = O(|V|) by Euler’s formula
⚫ For this reason the adjacency list is often a more
appropriate respresentation
GRAPHS: ADJACENCY LIST
? Adjacency list: for each vertex v ∈ V, store a list of
vertices adjacent to v
? Example:
⚫ Adj[1] = {2,3} 1
⚫ Adj[2] = {3}
⚫ Adj[3] = {}
⚫ Adj[4] = {3} 2 4
? Variation: can also keep
a list of edges coming into vertex 3
GRAPHS: ADJACENCY LIST
? For directed graphs:
⚫ Sum of lengths of all adj. lists is
∑out-degree(v) = |E|
v∈V
No. of edges
⚫ Total storage: Θ(V+E) leaving v
? For undirected graphs:
⚫ Sum of lengths of all adj. lists is
∑degree(v) = 2|E|
v∈V No. of edges incident on v. Edge (u,v) is
incident on vertices u and v.
⚫ Total storage: Θ(V+E)
GRAPH DEFINITIONS
? Path
⚫ Sequence of nodes n1, n2, … nk
⚫ Edge exists between each pair of nodes ni , ni+1
⚫ Example
? A, B, C is a path
GRAPH DEFINITIONS
? Path
⚫ Sequence of nodes n1, n2, … nk
⚫ Edge exists between each pair of nodes ni , ni+1
⚫ Example
? A, B, C is a path
? A, E, D is not a path
GRAPH DEFINITIONS
? Cycle
⚫ Path that ends back at starting node
⚫ Example
? A, E, A
GRAPH DEFINITIONS
? Cycle
⚫ Path that ends back at starting node
⚫ Example
? A, E, A
? A, B, C, D, E, A
? Simple path
⚫ No cycles in path
? Acyclic graph
⚫ No cycles in graph
GRAPH SEARCHING
? Given: a graph G = (V, E), directed or undirected
? Goal: methodically explore every vertex and every
edge
? Ultimately: build a tree on the graph
⚫ Pick a vertex as the root
⚫ Choose certain edges to produce a tree
⚫ Note: might also build a forest if graph is not
connected
BREADTH-FIRST SEARCH
? “Explore” a graph, turning it into a tree
⚫ One vertex at a time
⚫ Expand frontier of explored vertices across the
breadth of the frontier
? Builds a tree over the graph
⚫ Pick a source vertex to be the root
⚫ Find (“discover”) its children, then their
children, etc.
BFS - VERSION -1
BFS (s,Adj)
level ={s:0}
parent ={s:null}
i =0
frontiers = [s]
while frontiers:
next = []
for u in frontiers
for v in Adj[u]
if v is not in level
level[v] = i
paren[v] = u
next.append(v)
i = i+1
frontiers = next
BREADTH-FIRST SEARCH
? Input: Graph G = (V, E), either directed or
undirected, and source vertex s ∈ V.
? Output:
⚫ d[v] = distance (smallest # of edges, or shortest
path) from s to v, for all v ∈ V. d[v] = ∞ if v is
not reachable from s.
⚫ π[v] = u such that (u, v) is last edge on shortest
path s v.
? u is v’s predecessor.
⚫ Builds breadth-first tree with root s that
contains all reachable vertices.
BREADTH-FIRST SEARCH
? Associate vertex “colors” to guide the algorithm
⚫ White vertices have not been discovered
? All vertices start out white
⚫ Grey vertices are discovered but not fully
explored
? They may be adjacent to white vertices
⚫ Black vertices are discovered and fully explored
? They are adjacent only to black and gray
vertices
? Explore vertices by scanning adjacency list of grey
vertices
BFS(G,s)
1. for each vertex u in V[G] – {s}
2 color[u] ← white
initialization
3 d[u] ← ∝
4 π[u] ← nil white: undiscovered
gray: discovered
5 color[s] ← gray black: finished
6 d[s] ← 0 access source s
7 π[s] ← nil
Q: a queue of discovered
8 Q←Φ
vertices
9 enqueue(Q,s) color[v]: color of v
10while Q ≠ Φ d[v]: distance from s to v
π[u]: predecessor of v
11 u ← dequeue(Q)
12 for each v in Adj[u]
13 if color[v] = white
14 color[v] ← gray
15 d[v] ← d[u] + 1
16 π[v] ← u
17 enqueue(Q,v)
18 color[u] ← black
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
∞ ∞ ∞ ∞
∞ ∞ ∞ ∞
v w x y
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
∞ 0 ∞ ∞
∞ ∞ ∞ ∞
v w x y
Q: s
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 ∞ ∞
∞ 1 ∞ ∞
v w x y
Q: w r
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 ∞
∞ 1 2 ∞
v w x y
Q: r t x
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 ∞
2 1 2 ∞
v w x y
Q: t x v
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 3
2 1 2 ∞
v w x y
Q: x v u
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 3
2 1 2 3
v w x y
Q: v u y
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 3
2 1 2 3
v w x y
Q: u y
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 3
2 1 2 3
v w x y
Q: y
BREADTH-FIRST SEARCH: EXAMPLE
r s t u
1 0 2 3
2 1 2 3
v w x y
Q: Ø
ANALYSIS OF BFS
? Initialization takes O(|V|).
? Traversal Loop
⚫ After initialization, each vertex is enqueued and
dequeued at most once, and each operation takes
O(1). So, total time for queuing is O(|V|).
⚫ The adjacency list of each vertex is scanned at most
once. The total time spent in scanning adjacency
lists is O(|E|).
? Summing up over all vertices => total running
time of BFS is O(|V| + |E|)
BREADTH-FIRST TREE
? For a graph G = (V, E) with source s, the
predecessor subgraph of G is Gπ = (Vπ , Eπ)
where
⚫ Vπ ={v∈V : π[v] ≠ nil} {s}
⚫ Eπ ={(π[v], v) ∈ E : v ∈ Vπ - {s}}
? The predecessor subgraph Gπ is a breadth-first
tree if:
⚫ Vπ consists of the vertices reachable from s and
⚫ for all v ∈ Vπ , there is a unique simple path from s
to v in Gπ that is also a shortest path from s to v in
G.
? The edges in Eπ are called tree edges.
|Eπ| = |Vπ| - 1.
DEPTH-FIRST SEARCH (DFS)
? Explore edges out of the most recently discovered
vertex v.
? When all edges of v have been explored, backtrack
to explore other edges leaving the vertex from
which v was discovered (its predecessor).
? “Search as deep as possible first.”
? Continue until all vertices reachable from the
original source are discovered.
? If any undiscovered vertices remain, then one of
them is chosen as a new source and search is
repeated from that source.
DFS - 1
parent = {s:none} DFS (V, Adj)
DFS_Visit(s, Adj.s) parent ={}
for v in Adj[s] for s in V
if v is not in parent if s is not in parent
parent[v] =s parent[s] = none
DFS_Visit(v, Adj.v) DFS_Visit(s,Adj.s)
DEPTH-FIRST SEARCH
? Input: G = (V, E), directed or undirected.
No source vertex given!
? Output:
⚫ 2 timestamps on each vertex.
? d[v] = discovery time (v turns from white to gray)
? f [v] = finishing time (v turns from gray to black)
⚫ π[v] : predecessor of v = u, such that v was
discovered during the scan of u’s adjacency list.
⚫ Depth-first forest
DEPTH-FIRST SEARCH
? Coloring scheme for vertices as BFS.
⚫ A vertex is “discovered” the first time it is
encountered during the search.
⚫ A vertex is “finished” if it is a leaf node or all
vertices adjacent to it have been finished.
⚫ White before discovery, gray while processing
and black when finished processing
1 ≤ d[u] < f [u] ≤ 2 |V|
WHIT GRAY BLAC
0 E d[u] f[u] K 2
V
PSEUDOCODE
DFS(G) DFS-Visit(u)
1. for each vertex u ∈ V[G] 1. color[u] ← GRAY // White vertex u
2. do color[u] ← white has been discovered
2. time ← time + 1
3. π[u] ← NIL
3. d[u] ← time
4. time ← 0
4. for each v ∈ Adj[u]
5. for each vertex u ∈ V[G]
5. do if color[v] = WHITE
6. do if color[u] = white
6. then π[v] ← u
7. then DFS-Visit(u) 7. DFS-Visit(v)
8. color[u] ← BLACK // Blacken u;
it is finished.
Uses a global timestamp time.
9. f[u] ← time ← time + 1
DFS EXAMPLE
DFS EXAMPLE
d
f
1 | | |
| |
| | |
DFS EXAMPLE
d
f
1 | | |
2 | |
| | |
DFS EXAMPLE
d
f
1 | | |
2 | |
3 | | |
DFS EXAMPLE
d
f
1 | | |
2 | |
3 |
| |
4
DFS EXAMPLE
d
f
1 | | |
2 | |
3 |
5 | |
4
DFS EXAMPLE
d
f
1 | | |
2 | |
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 | 8 | |
2 |
|
7
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 | 8 | |
2 |
|
7
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 | 8 | |
2 |
9 |
7
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 | 8 | |
2 | 9
7 |10
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
8
1 | |
|11
2 | 9
7 |10
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 8
|
|12 |11
2 | 9
7 |10
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 8
13|
|12 |11
2 | 9
7 |10
3 | 5 |
|
4 6
DFS EXAMPLE
d
f
1 8
13|
|12 |11
2 | 9
7 |10
3 | 5 |
14|
4 6
DFS EXAMPLE
d
f
1 8
13|
|12 |11
2 | 9
7 |10
3 | 5 | 14|
4 6 15
DFS EXAMPLE
d
f
1 8 13|
|12 |11 16
2 | 9
7 |10
3 | 5 | 14|
4 6 15
ANALYSIS OF DFS
? Loops on lines 1-3 & 5-7 take Θ(V) time, excluding
time to execute DFS-Visit.
? DFS-Visit is called once for each white vertex v∈V
when it’s painted gray the first time. Lines 4-7 of
DFS-Visit is executed |Adj[v]| times. The total
cost of executing DFS-Visit is ∑v∈V|Adj[v]| = Θ(E)
? Total running time of DFS is Θ(|V| + |E|).
DEPTH-FIRST TREES
? Predecessor subgraph defined slightly different
from that of BFS.
? The predecessor subgraph of DFS is Gπ = (V, Eπ)
where Eπ ={(π[v], v) : v ∈ V and π[v] ≠ nil}.
⚫ How does it differ from that of BFS?
⚫ The predecessor subgraph Gπ forms a depth-first
forest composed of several depth-first trees. The
edges in Eπ are called tree edges.
TIME-STAMP STRUCTURE IN DFS
? There is also a nice structure to the time stamps,
which is referred to as Parenthesis Structure.
Theorem 22.7
For all u, v, exactly one of the following holds:
1. d[u] < f [u] < d[v] < f [v] or d[v] < f [v] < d[u] < f [u] and
neither u nor v is a descendant of the other.
2. d[u] < d[v] < f [v] < f [u] and v is a descendant of u.
3. d[v] < d[u] < f [u] < f [v] and u is a descendant of v.
DFS: KINDS OF EDGES
? Consider a directed graph G = (V, E). After a DFS
of graph G we can put each edge into one of four
classes:
⚫ A tree edge is an edge in a DFS-tree.
⚫ A back edge connects a vertex to an ancestor in
a DFS-tree. Note that a self-loop is a back edge.
⚫ A forward edge is a non-tree edge that
connects a vertex to a descendent in a DFS-tree.
⚫ A cross edge is any other edge in graph G. It
connects vertices in two different DFS-tree or
two vertices in the same DFS-tree neither of
which is the ancestor of the other.
EXAMPLE OF CLASSIFYING EDGES
tree in DFS forest
b a e
not in DFS
forward forest
tree tree
back cross
c d f
tree back
CLASSIFYING EDGES OF A DIGRAPH
? (u, v) is:
⚫ Tree edge – if v is white
⚫ Back edge – if v is gray
⚫ Forward or cross - if v is black
? (u, v) is:
⚫ Forward edge – if v is black and d[u] < d[v] (v
was discovered after u)
⚫ Cross edge – if v is black and d[u] > d[v] (u
discovered after v)
62
DFS: KINDS OF EDGES
DFS-Visit(u) ▷ with edge classification. G must be a
directed graph
1. color[u] ← GRAY
2. time ← time + 1
3. d[u] ← time
4. for each vertex v adjacent to u
5. do if color[v] ← BLACK
6. then if d[u] < d[v]
7. then Classify (u, v) as a forward edge
8. else Classify (u, v) as a cross edge
9. if color[v] ← GRAY
10. then Classify (u, v) as a back edge
11. if color[v] ← WHITE
12. then π[v] ← u
13. Classify (u, v) as a tree edge
14. DFS-Visit(v)
15. color[u] ← BLACK
16. time ← time + 1
17. f[u] ← time
DFS: KINDS OF EDGES
? Suppose G be an undirected graph, then we have
following edge classification:
⚫ Tree Edge an edge connects a vertex with its
parent.
⚫ Back Edge a non-tree edge connects a vertex
with an ancestor.
⚫ Forward Edge There is no forward edges
because they become back edges when
considered in the opposite direction.
⚫ Cross Edge There cannot be any cross edge
because every edge of G must connect an
ancestor with a descendant.
SOME APPLICATIONS OF BFS AND DFS
? BFS
⚫ To find the shortest path from a vertex s to a
vertex v in an unweighted graph
⚫ To find the length of such a path
⚫ Find the bipartiteness of a graph.
? DFS
⚫ To find a path from a vertex s to a vertex v.
⚫ To find the length of such a path.
⚫ To find out if a graph contains cycles
65
APPLICATION OF BFS: BIPARTITE GRAPH
? Graph G = (V,E) is bipartite iff V can be
partitioned into two sets of nodes A and B
such that each edge has one end in A and
the other end in B
Alternatively:
⚫ Graph G = (V,E) is bipartite iff all its cycles have
even length
⚫ Graph G = (V,E) is bipartite iff nodes can be
coloured using two colours
Note: graphs without cycles (trees) are bipartite
APPLICATION OF BFS: BIPARTITE GRAPH
bipartite:
non bipartite
Question: given a graph G, how to test if the graph
is bipartite?
APPLICATION OF BFS: BIPARTITE GRAPH
For each vertex u in V[G] − {s}
do color[u] ← WHITE
d[u] ← ∞
partition[u] ← 0
color[s] ← GRAY
partition[s] ← 1
d[s] ← 0
Q ← [s]
while Queue 'Q' is non-empty
do u ← head [Q]
for each v in Adj[u] do
if partition [u] = partition [v] then
return 0
else if color[v] ← WHITE then
color[v] ← gray
d[v] = d[u] + 1
partition[v] ← 3 − partition[u]
ENQUEUE (Q, v)
DEQUEUE (Q)
Color[u] ← BLACK
Return 1
APPLICATION OF DFS:
DETECTING CYCLE FOR DIRECTED GRAPH
DFS_visit(u)
color(u) ← GRAY
d[u] ← time ← time + 1
for each v adjacent to u do
if color[v] ← GRAY then
return "cycle exists"
else if color[v] ← WHITE then
predecessor[v] ← u
DFS_visit(v)
color[u] ← BLACK
f[u] ← time ← time + 1
APPLICATION OF DFS:
DETECTING CYCLE FOR UNDIRECTED GRAPH
DFS_visit(u)
color(u) ← GRAY
d[u] ← time ← time + 1
for each v adjacent to u do
if color[v] ← GRAY and π[u] ≠ v then
return "cycle exists"
else if color[v] ← WHITE then
predecessor[v] ← u
DFS_visit(v)
color[u] ← BLACK
f[u] ← time ← time + 1
TOPOLOGICAL SORT
Want to “sort” a directed acyclic graph (DAG).
A B D
C E
A B C D E
Think of original DAG as a partial order.
Want a total order that extends this partial order.
TOPOLOGICAL SORT
? Performed on a DAG.
? Linear ordering of the vertices of G such that if (u, v) ∈ E,
then u appears somewhere before v.
Topological-Sort (G)
1. call DFS(G) to compute finishing times f [v] for all v ∈ V
2. as each vertex is finished, insert it onto the front of a linked list
3. return the linked list of vertices
Time: Θ(V + E).
EXAMPLE
A B D
1/
C E
Linked List:
EXAMPLE
A B D
1/
2/
C E
Linked List:
EXAMPLE
A B D
1/
2/3
C E
Linked List:
2/3
E
EXAMPLE
A B D
1/4
2/3
C E
Linked List:
1/4 2/3
D E
EXAMPLE
A B D
5/ 1/4
2/3
C E
Linked List:
1/4 2/3
D E
EXAMPLE
A B D
5/ 1/4
6/ 2/3
C E
Linked List:
1/4 2/3
D E
EXAMPLE
A B D
5/ 1/4
6/7 2/3
C E
Linked List:
6/7 1/4 2/3
C D E
EXAMPLE
A B D
5/8 1/4
6/7 2/3
C E
Linked List:
5/8 6/7 1/4 2/3
B C D E
EXAMPLE
A B D
9/ 5/8 1/4
6/7 2/3
C E
Linked List:
5/8 6/7 1/4 2/3
B C D E
EXAMPLE
A B D
5/8 1/4
9/1
0
6/7 2/3
C E
Linked List:
9/1 5/8 6/7 1/4 2/3
0A B C D E
THE END