Efficient Graph Storage Methods

Uploaded by

Sanad Hammas

So in our previous lesson, we discussed one possible way of storing and representing a graph, in which we used two lists: one to store the vertices and another to store the edges. A record in the vertex list is the name of a node, and a record in the edge list is an object containing references to the two endpoints of an edge and also the weight of that edge, because the example graph that I am showing you here is a weighted graph. We called this kind of representation the edge list representation, but we realised that this kind of storage is not very efficient in terms of the time cost of the most frequently performed operations, like finding the nodes adjacent to a given node or finding whether two nodes are connected or not.

To perform any of these operations, we need to scan the whole edge list. We need to perform a linear search on the edge list. So the time complexity is big oh of the number of edges, and we know that the number of edges in a graph can be really, really large. In the worst case it can be close to the square of the number of vertices.
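The linear scan described above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names are made up for this example, the edges are the ones read out later in this lesson for the 8-vertex sample graph, and the weights are left out because the transcript does not state them.

```python
# Edge-list storage for the sample graph (weights omitted; the transcript
# does not give them). Each edge is a pair of vertex indices.
vertices = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]


def adjacent_in_edge_list(edges, u):
    """Return the neighbours of u by scanning every edge: O(|E|)."""
    result = []
    for a, b in edges:
        if a == u:
            result.append(b)
        elif b == u:
            result.append(a)
    return result


def connected_in_edge_list(edges, u, v):
    """Check whether u and v share an edge, again by a full scan: O(|E|)."""
    return any((a == u and b == v) or (a == v and b == u) for a, b in edges)


print(adjacent_in_edge_list(edges, 5))
print(connected_in_edge_list(edges, 0, 1))
```

Both helpers touch every record in the edge list in the worst case, which is exactly the cost the lesson is trying to get away from.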

In a graph, anything running in the order of the number of edges is considered very costly. We often want to keep the cost in the order of the number of vertices. So we should think of some other, more efficient design.

We should think of something better than this. One more possible design is that we can store the edges in a two-dimensional array, or matrix. We can have a two-dimensional array of size V*V, where V is the number of vertices. As you can see, I have drawn an 8*8 array here, because the number of vertices in my sample graph is 8. Let's name this array A.

Now let's say we want to store a graph that is unweighted. Let's just remove the weights from this sample graph, and now our graph is unweighted. We also need a value or index between 0 and V-1 for each vertex, which we have here: if we are storing the vertices in a vertex list, then we have an index between 0 and V-1 for each vertex. We can say that A is the 0th node, B is the 1st node, C is the 2nd node and so on. We are picking up the indices from the vertex list. Okay, so if the graph is unweighted and each vertex has an index between 0 and V-1, then in this matrix or 2D array we can set the ith row and jth column, that is A[i][j], as 1 or the boolean value true if there is an edge from i to j, and as 0 or false otherwise. If I have to fill this matrix for this example graph, then I'll go vertex by vertex.

Vertex 0 is connected to vertices 1, 2 and 3. Vertex 1 is connected to 0, 4 and 5. This is an undirected graph, so if we have an edge from 0 to 1, we also have an edge from 1 to 0, so the 1st row and 0th column should also be set as 1. Now let's go to node 2: it's connected to 0 and 6. 3 is connected to 0 and 7, 4 is connected to 1 and 7, 5 once again is connected to 1 and 7, 6 is connected to 2 and 7, and 7 is connected to 3, 4, 5 and 6. All the remaining positions in this array should be set as 0.
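Filling the matrix as just described might look like this in code. It is only a sketch: the variable names are chosen for this example, and a plain list of lists stands in for the 8*8 array A.

```python
# Build the 8*8 adjacency matrix for the undirected sample graph.
V = 8
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]

# Start with every position set to 0 (no edge).
A = [[0] * V for _ in range(V)]

# The graph is undirected, so each edge fills two positions.
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1

for row in A:
    print(row)
```

Each of the 10 edges sets two cells, so 20 of the 64 cells end up as 1 and the rest stay 0.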

Notice that this matrix is symmetric. For an undirected graph, this matrix would be symmetric because A[i][j] would be equal to A[j][i]. We would have two positions filled for each edge. In fact, to see all the edges in the graph, we need to go through only one of the two halves on either side of the diagonal.
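The symmetry, and the fact that one half is enough to list every edge, can be checked with a short sketch like the one below. The helper names are hypothetical and the matrix is rebuilt here so the example runs on its own.

```python
def build_matrix(num_vertices, edges):
    """Adjacency matrix of an undirected graph as a list of lists."""
    A = [[0] * num_vertices for _ in range(num_vertices)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1
    return A


def is_symmetric(A):
    """True if A[i][j] == A[j][i] for every pair of indices."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(i + 1, n))


def edges_from_upper_half(A):
    """List each undirected edge once by reading only cells with j > i."""
    n = len(A)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if A[i][j]]


edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]
A = build_matrix(8, edges)
print(is_symmetric(A))
print(edges_from_upper_half(A))
```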

Now this would not be true for a directed graph. Only one position will be filled for each edge, and we will have to go through the entire matrix to see all the edges. Okay, now this kind of representation of a graph, in which edges or connections are stored in a matrix or 2D array, is called the adjacency matrix representation. This particular matrix that I have drawn here is an adjacency matrix. Now with this kind of storage or representation, what do you think would be the time cost of finding all nodes adjacent to a given node? Let's say, given this vertex list and adjacency matrix, we want to find all nodes adjacent to the node named F.

If we are given the name of a node, then we first need to know its index, and to know the index we will have to scan the vertex list. With only the vertex list, there is no other way. Once we have figured out the index, like for F the index is 5, we can go to the row with that index in the adjacency matrix and scan this complete row to find all the adjacent nodes. Scanning the vertex list to figure out the index will, in the worst case, cost us time proportional to the number of vertices, because in the worst case we may have to scan the whole list. And scanning a row in the adjacency matrix would once again cost us time proportional to the number of vertices, because in a row we would have exactly V columns, where V is the number of vertices.

So the overall time cost of this operation is big oh of V. Now most of the time, while performing operations, we should pass indices to avoid scanning the vertex list all the time. If we know an index, we can figure out the name in constant time, because in an array we can access the element at any index in constant time. But if we know a name and want to figure out the index, then it will cost us big oh of V. We will have to scan the vertex list. We will have to perform a linear search on it. Okay, moving on.
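The two steps for finding the nodes adjacent to F, a linear search for the index followed by a scan of one row, can be sketched like this. The function names are invented for the example, and the vertex names beyond the ones mentioned in the lesson are assumed to continue alphabetically up to H.

```python
# Vertex list and adjacency matrix for the sample graph
# (names G and H are assumed to follow the alphabetical pattern).
vertices = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]
A = [[0] * len(vertices) for _ in vertices]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1


def find_index(vertices, name):
    """Linear search of the vertex list: O(V) in the worst case."""
    for i, v in enumerate(vertices):
        if v == name:
            return i
    raise ValueError(f"unknown vertex: {name}")


def adjacent_nodes(vertices, A, name):
    """Scan the row of the named vertex: another O(V) step."""
    row = A[find_index(vertices, name)]
    return [vertices[j] for j, cell in enumerate(row) if cell]


print(adjacent_nodes(vertices, A, "F"))
```

Both steps are proportional to V, so the whole lookup stays at big oh of V regardless of how many edges the graph has.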

Now what would be the time cost of finding whether two nodes are connected or not? Once again, the two nodes can be given to us as indices or as names. If the nodes are passed to us as indices, then we simply need to look at the value in a particular row and a particular column. We simply need to look at A[i][j] for some values of i and j, and this will cost us constant time. You can look at the value in any cell of a two-dimensional array in constant time. So if indices are given, the time complexity of this operation would be big oh of 1, which simply means that we will take constant time. But if names are given, then we also need to do the scanning to figure out the indices, which will cost us big oh of V. The overall time complexity would be big oh of V. The constant time access would not mean anything.
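A small sketch of the two cases follows: the check by indices is a single cell lookup, while the check by names first pays for two linear searches. The function names are hypothetical; `list.index` is Python's built-in linear search.

```python
vertices = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]
A = [[0] * len(vertices) for _ in vertices]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1


def connected_by_index(A, i, j):
    """One cell lookup: O(1)."""
    return A[i][j] == 1


def connected_by_name(vertices, A, a, b):
    """Two linear searches of the vertex list, then one lookup: O(V)."""
    i = vertices.index(a)
    j = vertices.index(b)
    return A[i][j] == 1


print(connected_by_index(A, 0, 1))
print(connected_by_name(vertices, A, "A", "H"))
```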

The scanning of the vertex list all the time to figure out indices can be avoided. We can use some extra memory to create a hash table with names and indices as key-value pairs, and then the time cost of finding the index from a name would also be big oh of 1 on average, that is, constant. A hash table is a data structure that I have not talked about in any of my lessons so far. If you do not know about hash tables, just search online for a basic idea of them.

Okay, so as you can see, with the adjacency matrix representation the time cost of some of the most frequently performed operations is in the order of the number of vertices, and not in the order of the number of edges, which can be as high as the square of the number of vertices.
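The hash table of names and indices mentioned above could be sketched with a Python dict, which is a hash table with average constant-time lookups. The names `index_of` and `connected` are made up for this example.

```python
vertices = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]
A = [[0] * len(vertices) for _ in vertices]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1

# Extra memory: one (name, index) pair per vertex, built once in O(V).
index_of = {name: i for i, name in enumerate(vertices)}


def connected(A, index_of, a, b):
    """Two average O(1) hash lookups plus one O(1) cell lookup."""
    return A[index_of[a]][index_of[b]] == 1


print(index_of["F"])
print(connected(A, index_of, "F", "H"))
```

With the table in place, passing names is no more expensive than passing indices, at the price of storing V extra entries.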

Okay, now if we want to store a weighted graph in the adjacency matrix representation, then A[i][j] in the matrix can be set as the weight of the edge. For non-existent edges we can have a default value, like a really large or maximum possible integer value that is never expected to be an edge weight. I have just filled in infinity here to mean that we can choose the default as infinity, minus infinity, or any other value that would never ever be a valid edge weight. Okay, now for further discussion I'll come back to an unweighted graph.
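The weighted variant described above can be sketched as below. The transcript does not give the weights of the sample graph, so this uses a small hypothetical 4-vertex graph with invented weights, and `math.inf` as the default for non-existent edges.

```python
import math

# Hypothetical weighted, undirected graph: (endpoint, endpoint, weight).
# These vertices and weights are invented for illustration only.
V = 4
weighted_edges = [(0, 1, 5), (0, 2, 7), (1, 3, 2)]

# Default every cell to infinity, a value that is never a valid weight here.
W = [[math.inf] * V for _ in range(V)]
for i, j, w in weighted_edges:
    W[i][j] = w
    W[j][i] = w


def has_edge(W, i, j):
    """An edge exists exactly when the cell holds a real weight."""
    return W[i][j] != math.inf


print(W[0][1], has_edge(W, 0, 3))
```

Any sentinel works as long as it can never collide with a real weight; infinity is convenient because it also behaves sensibly when weights are compared.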

The adjacency matrix looks really good, so should we not use it always? Well, with this design we have improved on time, but we have gone really high on memory usage. Instead of using memory units exactly equal to the number of edges, which is what we were doing with the edge list kind of storage, here we're using exactly V square units of memory. We are using big oh of V square space.

We are not just storing the information that these two nodes are connected, we are also storing the negation of it, that is, that these two nodes are not connected, which probably is redundant information. If a graph is dense, that is, if the number of edges is really close to V square, then this is good. But if the graph is sparse, that is, if the number of edges is a lot less than V square, then we are wasting a lot of memory in storing the zeros. Like for this example graph that I have drawn here: in the edge list we were consuming 10 units of memory, we had ten rows consumed in the edge list, but here we are consuming 64 units.
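The 10 versus 64 comparison for the sample graph can be reproduced with a few lines. Counting one "unit" per edge record and one per matrix cell is the lesson's own simplification; the variable names are chosen for this sketch.

```python
V = 8
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]

edge_list_units = len(edges)        # one record per edge
matrix_units = V * V                # one cell per ordered pair of vertices
nonzero_cells = 2 * len(edges)      # undirected: two cells per edge
zero_cells = matrix_units - nonzero_cells

print(edge_list_units, matrix_units, nonzero_cells, zero_cells)
```

Even in this tiny graph, more than two thirds of the matrix cells only record that an edge does not exist.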

Most graphs with a really large number of vertices would not be very dense; they would not have a number of edges anywhere close to V square. For example, let's say we are modeling a social network like Facebook as a graph, such that a user in the network is a node and there is an undirected edge if two users are friends. Facebook has a billion users, but I'm showing only a few in my example graph here because I'm short of space.

Let's just assume that we have a billion users in our network, so the number of vertices in the graph is 10 to the power 9, which is a billion. Now do you think the number of connections in our social network can ever be close to the square of the number of users? That would mean everyone in the network is a friend of everyone else. A user of our social network will not be a friend to all the other billion users. We can safely assume that a user, on average, would not have more than a thousand friends. With this assumption we would have 10 to the power 12 edges in our graph. Actually, this is an undirected graph, so we should divide by 2 here so that we do not count an edge twice. So if the average number of friends is 1000, then the total number of connections in my graph is 5 * 10 to the power 11. Now this is a lot less than the square of the number of vertices.
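The arithmetic above can be double-checked with a short sketch. The figures are the lesson's own assumptions (a billion users, 1000 friends each on average); the variable names are invented here.

```python
users = 10**9          # assumed number of vertices
avg_friends = 1000     # assumed average number of friends per user

# Every friendship is counted once from each side, so halve the total.
friend_slots = users * avg_friends
edges = friend_slots // 2

matrix_cells = users ** 2

print(friend_slots)            # 10 to the power 12
print(edges)                   # 5 * 10 to the power 11
print(matrix_cells // edges)   # how many cells the matrix has per actual edge
```

Under these assumptions the matrix would hold about two million cells for every edge that actually exists.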

So basically, if we used an adjacency matrix for this kind of graph, we would waste a whole lot of space. Moreover, even if we are not looking at it in relative terms, 10 to the power 18 units of memory is a lot even in an absolute sense. 10 to the power 18 bytes would be about 1000 petabytes. Now this really is a lot of space. This much data would never ever fit on one physical disk. 5 * 10 to the power 11 bytes, on the other hand, is just 0.5 terabytes. A typical personal computer these days would have this much storage.
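The unit conversions can be verified with the sketch below. It assumes, as the lesson does, one byte per matrix cell and one byte per edge record, and it uses decimal units (1 terabyte = 10 to the power 12 bytes, 1 petabyte = 10 to the power 15 bytes).

```python
TERABYTE = 10**12   # decimal terabyte, in bytes
PETABYTE = 10**15   # decimal petabyte, in bytes

# One byte per unit is the lesson's simplification, not a real record size.
matrix_bytes = (10**9) ** 2
edge_list_bytes = 5 * 10**11

matrix_petabytes = matrix_bytes / PETABYTE
edge_list_terabytes = edge_list_bytes / TERABYTE

print(matrix_petabytes, "petabytes for the adjacency matrix")
print(edge_list_terabytes, "terabytes for the edge list")
```

A real edge record would take more than one byte, so the edge list figure is a lower bound, but the gap of several orders of magnitude remains.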

So as you can see, for something like a large social graph the adjacency matrix representation is not very efficient. An adjacency matrix is good when a graph is dense, that is, when the number of edges is close to the square of the number of vertices, or sometimes when the total number of possible connections, that is V square, is so small that the wasted space would not even matter. But most real-world graphs would be sparse, and an adjacency matrix would not be a good fit.

Let's think about another example. Let's think about the World Wide Web as a directed graph. If you think of web pages as nodes in a graph and hyperlinks as directed edges, then a web page would not have links to all other pages, and once again the number of web pages would be in the order of millions. A web page would have links to only a few other pages, so the graph would be sparse.
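A toy version of the web example shows how a directed graph fills the matrix. The four pages and their links are entirely hypothetical; the point is only that each link sets a single cell, so the matrix need not be symmetric.

```python
# Hypothetical pages and hyperlinks, invented for this illustration.
pages = ["home", "about", "blog", "contact"]
links = [(0, 1), (0, 2), (2, 0), (1, 3)]   # (from_page, to_page)

n = len(pages)
A = [[0] * n for _ in range(n)]

# Directed graph: only A[src][dst] is set, not the mirrored cell.
for src, dst in links:
    A[src][dst] = 1

filled = sum(map(sum, A))
print(filled, "of", n * n, "cells are filled")
print(A[0][1], A[1][0])   # home links to about, but not the other way round
```

Because the cell for a link and the cell for the reverse link are independent, listing all edges here means going through the whole matrix rather than just one half.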

Most real-world graphs would be sparse, and an adjacency matrix, even though it gives us a good running time for the most frequently performed operations, would not be a good fit because it's not very efficient in terms of space. So what should we do? Well, there's another representation that gives us similar or maybe even better running time than the adjacency matrix and does not consume so much space. It's called the adjacency list representation, and we will talk about it in our next lesson. This is it for this lesson. Thanks for watching.
