INTRODUCTION
In the world of computer science, efficient data organization is paramount, and sorting algorithms
like Quick Sort stand out as powerful tools that optimize performance. Quick Sort, renowned for
its divide-and-conquer approach, offers an elegant solution to sorting that outperforms many
others in terms of average-case efficiency. But what makes it captivating is its ability to handle large
datasets with minimal overhead, even though it comes with caveats like worst-case performance in certain
scenarios. Coupled with this are graphs, intricate structures that model relationships and connections, which
can be traversed using robust algorithms like Breadth-First Search (BFS) and Depth-First Search (DFS).
Understanding these traversal methods not only reveals the underlying properties of graphs but also
showcases their practical applications in everything from social networks to route optimization.
Delving deeper, the concept of hashing introduces us to a method of efficiently accessing data through unique
keys, streamlining searches and enhancing performance in various applications such as databases and
caches. The magic of hashing lies in its ability to convert variable inputs into fixed-sized values, ensuring
quick data retrieval. As we explore these fundamental concepts of sorting, graph theory, and hashing, we
uncover a web of interconnected ideas that drive innovation and efficiency in technology. Each algorithm
weaves its unique narrative, enhancing our understanding of how data shapes the digital landscape we
navigate every day.
Data Structure 2
QUICK SORT
As its name implies, it is quick, and it is the algorithm generally preferred for sorting because of its
average-case efficiency. Quick Sort is an example of divide and conquer: as we saw with binary
search, we divide a problem into simpler subproblems that can be solved independently,
and then combine their solutions.
Here are some of the best places where Quick Sort fits well:
Big datasets: it sorts large datasets very quickly due to its O(n log n) average-case
time complexity.
In-memory sorting: it is commonly used for in-memory sorting since it uses less
memory than other sorts such as Merge Sort.
Array sorting: Quick Sort works very efficiently for sorting arrays and is therefore
commonly used in library sort routines.
Database systems: it is used in database systems for query evaluation and record sorting.
The back-end of Quick Sort
To understand how Quick Sort works, let's make things easier by imagining an array of
numbers:
1. Pick an element of the array, say P (the element called the pivot in Quick Sort).
2. Rearrange the remaining elements into two sub-blocks (partitioning):
I. those less than or equal to (≤) P, and
II. those greater than (>) P.
3. Repeat the process recursively for the left and right sub-blocks, then return the
sorted result.
The main idea is to find the "right" position for the pivot element P. After each
"pass", the pivot element P should be "in place". Eventually, the elements are
sorted, since each pass puts at least one element (i.e., P) into its final position.
Algorithm: Quick Sort
if length of A ≤ 1:
return A
pivot = A[last index of A]
left = empty array
right = empty array
for i = 0 to second-to-last index of A:
if A[i] ≤ pivot:
append A[i] to left
else:
append A[i] to right
return Quick Sort(left) + [pivot] + Quick Sort(right)
<full code:Team2/code/quick sort.cpp>
Advantages
Fast average performance
Time complexity: O(n log n) (very effective for large data).
Typically faster than Heap Sort and Merge Sort in most practical applications due to
lower constant factors.
Disadvantages
Worst-case time complexity O(n²): occurs when a fixed pivot is the smallest or largest
element (e.g., an already sorted or reverse-sorted array).
Avoidable by: Randomized Quick Sort (randomly selecting the pivot).
Not stable
Does not preserve the relative order of equal elements (in contrast to Merge Sort).
Recursive overhead: unbalanced partitions lead to deep recursion and extra stack usage.
Pivot choice sensitivity: performance depends heavily on how evenly the chosen pivot splits the data.
Running time analysis
I. Worst case (data is already sorted)
When the pivot is the smallest (or largest) element, partitioning a block of size n
yields one empty sub-block, one element (the pivot) in the "correct" place, and one
sub-block of size n − 1, and takes θ(n) time.
Recurrence equation:
T(1) = 1
T(n) = T(n − 1) + cn
Solution: θ(n²), which is worse than Merge Sort!
II. Best case
The pivot is the median at each partition step, i.e., partitioning a block of size n
yields two sub-blocks of approximately equal size, with the pivot element in the
"middle" position, and takes n data comparisons.
The recurrence equation becomes:
T(1) = 1
T(n) = 2T(n/2) + cn
Solution: θ(n log n)
III. Average case
It turns out the average-case running time is also θ(n log n). We will wait until
COMP 271 to discuss the analysis.
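Unrolling the best-case recurrence T(n) = 2T(n/2) + cn makes the θ(n log n) bound visible (a standard expansion, assuming n is a power of 2):

```latex
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn .
\end{aligned}
```

Setting k = log₂ n gives T(n) = n·T(1) + cn·log₂ n = θ(n log n): each of the log n levels of the recursion does a total of cn work.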
Graph
A graph is not just lines and dots; it is a mathematical abstraction of vertices (nodes)
and edges (links) that captures relations. In mathematics alone, it is used to analyze network
properties like paths and connectivity. In biology, it pictures protein interactions; in physics,
particle collisions. Social networks use graphs to link users, and transportation networks use
them to plot routes. The internet's backbone? A graph of linked servers. From epidemiology to
recommendation systems, graphs turn unruly data into mathematical models, enabling discovery
and technology. This is the mathematics that maps out reality.
Introduction to Graphs
Definition: A graph is a non-linear data structure consisting of nodes (vertices) connected by
edges. Graphs are widely used to model real-world systems like social networks, maps, and
computer networks.
Graph Terminology
1. Vertex: an individual data element of a graph is called a vertex, also known as a
node. In the example graph, A, B, C, D, and E are vertices.
2. Edge: an edge is a connecting link between two vertices, also known as an arc. An
edge is represented as (starting vertex, ending vertex); in the example graph, the link
between vertices A and B is represented as (A, B).
Types of Graphs
1. Undirected Graph
A graph where edges have no direction (A-B means A is connected to B, and B is
connected to A), which makes every edge bidirectional.
Example: Facebook friendships (if you're friends with someone, they're also friends
with you).
2. Directed Graph (Digraph)
Edges have direction (A→B means A points to B, but B does not necessarily point back
to A).
Example: Twitter follow system (you can follow someone without them following you
back).
3. Complete Graph
A graph in which every node is adjacent to all other nodes present in the graph is known
as a complete graph.
A complete undirected graph on n vertices contains n(n−1)/2 edges, where n is the
number of vertices present in the graph. The following figure shows a complete graph.
4. Regular Graph
A regular graph is a graph in which every node has the same degree, i.e., every vertex
has the same number of neighbors.
5. Cycle Graph
A cycle graph is a graph that consists of a single cycle: the first and last nodes of
the path are the same. A closed simple path is a cycle.
6. Acyclic Graph: a graph without any cycle is called an acyclic graph.
7. Weighted Graph: a graph is said to be weighted if there is a value assigned to each edge
of the graph, often representing the length or cost between two vertices. A weighted
graph is also called a network.
Unweighted Graph
Definition: A graph where edges have no numerical weights (all connections are treated
equally).
Directed vs. undirected graphs
The choice between directed and undirected graphs is not just a matter of lines or arrows; it is a
matter of constraints. A digraph (directed graph) is your best choice for depicting
dependencies, hierarchies, or one-way relationships (like course prerequisites or Twitter follows),
where the direction of edges dictates strict flow; get it wrong, and your topological sort
fails spectacularly. Conversely, an undirected graph shines in symmetric worlds (Facebook
friendships or road networks), where edges are bidirectional and algorithms like BFS/DFS can go
their merry way unconstrained, without caring about "who followed whom." Directed graphs
need careful handling (in-degree/out-degree considerations!), while undirected graphs make life
simple, but both have unique algorithmic superpowers, ranging from cycle detection to
shortest-path optimization (Dijkstra versus Bellman-Ford?).
Sparse Graph
Definition: a graph where the number of edges |E| is significantly fewer than the
maximum possible number of edges.
For a directed graph: |E| ≪ |V|²
For an undirected graph: |E| ≪ |V|(|V|−1)/2
Dense Graph
Definition: a graph where |E| approaches the maximum possible number of edges,
i.e., |E| is close to |V|² for a directed graph, or to |V|(|V|−1)/2 for an undirected graph.
Traversing Graphs
Depth-First Search (DFS)
A depth-first search (DFS) in an undirected graph G is like wandering in a labyrinth
with a string and a can of red paint, without getting lost.
We start at vertex s, tying the end of our string to the point and painting s "visited".
Next we label s as our current vertex, called u.
Now we travel along an arbitrary edge (u, v).
If edge (u, v) leads us to an already-visited vertex v, we return to u.
If vertex v is unvisited, we unroll our string and move to v, paint v "visited", set v as
our current vertex, and repeat the previous steps.
Eventually, we will get to a point where all incident edges on u lead to visited
vertices. We then backtrack by rolling up our string to a previously visited vertex v.
Then v becomes our current vertex and we repeat the previous steps.
Again, if all incident edges on v lead to visited vertices, we backtrack as we did
before. We continue to backtrack along the path we have traveled, finding and
exploring unexplored edges, and repeating the procedure. When we backtrack to
vertex s and there are no more unexplored edges incident on s, we have finished our
DFS.
<full code:Team2/code/DFS.cpp>
DFS Properties
Proposition 9.12: Let G be an undirected graph on which a DFS traversal starting at a
vertex s has been performed. Then:
1) The traversal visits all vertices in the connected component of s.
2) The discovery edges form a spanning tree of the connected component of s.
Justification of 1):
We use a contradiction argument: suppose there is at least one vertex v not
visited, and let w be the first unvisited vertex on some path from s to v.
Because w is the first unvisited vertex on the path, it has a neighbor u that
has been visited. But when we visited u we must have looked at edge (u, w),
so w must have been visited, a contradiction.
Justification of 2):
We only mark edges when we go to unvisited vertices, so we never form a
cycle of discovery edges; i.e., the discovery edges form a tree.
This is a spanning tree because DFS visits each vertex in the connected component
of s.
Breadth-First Search (BFS)
Like DFS, a breadth-first search (BFS) traverses a connected component of a graph, and
in doing so defines a spanning tree with several useful properties.
The starting vertex s has level 0 and, as in DFS, serves as an "anchor." In
the first round, the string is unrolled the length of one edge, and all of the vertices that are
only one edge away from the anchor are visited and placed into level 1.
In the second round, all the new vertices that can be reached by unrolling the string
two edges are visited and placed in level 2.
This continues until every vertex has been assigned a level.
The level of any vertex v corresponds to the length of the shortest path from s to v.
Implementation
<full code:Team2/code/BFS.cpp>
Properties of BFS
Proposition: Let G be an undirected graph on which a BFS traversal starting at vertex
s has been performed. Then:
The traversal visits all vertices in the connected component of s.
The discovery edges form a spanning tree T, which we call the BFS tree, of
the connected component of s.
For each vertex v at level i, the path of the BFS tree T between s and v has i
edges, and any other path of G between s and v has at least i edges.
If (u, v) is an edge that is not in the BFS tree, then the level numbers of u and
v differ by at most one.
Proposition: Let G be a graph with n vertices and m edges. A BFS traversal of G takes
time O(n + m). Also, there exist O(n + m) time algorithms based on BFS for the
following problems:
Testing whether G is connected.
Computing a spanning tree of G.
Computing the connected components of G.
Computing, for every vertex v of G, the minimum number of edges of any path
between s and v.
There are various practical applications of graphs; here are some:
Web search & PageRank: directed graphs link webpages (edges = hyperlinks).
Social networks: undirected graphs model mutual friendships; directed graphs
track followers (Instagram).
GPS navigation: weighted graphs model roads (edges = routes, weights =
distance/traffic).
Computer networks: nodes = servers/routers; edges = connections.
Biology: protein-protein interaction networks; brain connectomes.
Hashing
In data structures, hashing is a well-known technique for searching for a particular
element among several elements; it minimizes the number of comparisons performed
during the search. A hash table, in turn, is a data structure that maps keys to values
using a hash function, which converts arbitrary keys into array indices.
The back-end of hashing
The basic idea of hashing is simply to make searching faster.
Here's how it works: first, we pass our data to a hash function and get back a hash key
(often called an index). We then construct a hash table and store the data in it. If something
goes wrong during storing, for example the hash function gives the same index for two
different keys, a collision-resolution strategy (such as chaining or linear probing) decides
where the colliding key goes. Let's take a practical example and walk through it.
Algorithm HashTable:
1. Initialize:
- bucketSize = size
- table = array of linked lists (size = bucketSize)
2. hashFunction(key):
- return key % bucketSize
3. insert(key):
- index = hashFunction(key)
- table[index].append(key)
4. display():
- for i = 0 to bucketSize-1:
print("Bucket " + i + ": ")
for each key in table[i]:
print(key + " -> ")
print("NULL")
5. Main:
- ht = HashTable(5)
- ht.insert(19)
- ht.insert(50)
- ht.insert(89)
- ht.insert(40)
- ht.display()
Let's create a hash table using a practical example: insert 19, 50, 89, and 40 into a
table of size 5 using the hash function h(X) = X mod 5. (Note that the pseudocode above
resolves collisions by chaining, while the example below uses linear probing.)
Now let's do the hashing (assigning an index to each value):
Our first value is X = 19: h(19) = 19 mod 5 = 4,
so 19 is stored at index 4.
Our second value is 50: h(50) = 50 mod 5 = 0,
so 50 is stored at index 0.
Our third value is 89: h(89) = 89 mod 5 = 4.
Index 4 is occupied by 19, so linear probing wraps around to index 0
(occupied by 50) and then to index 1, which is free.
So 89 is stored at index 1.
Our fourth value is 40: h(40) = 40 mod 5 = 0.
Index 0 is occupied by 50 and index 1 by 89, so probing moves to index 2.
So 40 is stored at index 2.
<full code:Team2/code/Hash table.cpp>
SUMMARY
Finally, we revisited classic sorting algorithms, with a focus on Quick Sort—a divide-and-
conquer champion renowned for its average-case speed of O(n log n). By selecting a pivot and
partitioning data recursively, Quick Sort organizes large datasets efficiently, making it highly
suitable for applications like memory sorting and database management. Yet, its performance
can decline to O(n²) if the pivot is poorly chosen, emphasizing the importance of strategic
selection or randomization to keep it at peak performance.
Throughout our exploration, we've seen how various models like social networks, navigation
systems, and biological networks are represented through different types of graphs—directed,
undirected, complete, and regular—each serving unique purposes based on the relationships
they depict. For instance, directed graphs are ideal for modeling dependencies or hierarchies,
such as Twitter follows, while undirected graphs excel in symmetric relationships like Facebook
friendships or road networks. Recognizing the nature of a network helps in choosing the right
graph type, whether sparse or dense, to efficiently analyze properties like connectivity and
shortest paths.
Complementing graphs, hashing stands out as a powerful data structure, designed for rapid
element retrieval. By converting keys into array indices using hash functions, hashing provides
quick access to data, significantly speeding up search operations. The process involves storing
data in hash tables, which use collision resolution techniques to handle cases where different keys
produce the same index, thus maintaining efficiency even in crowded data environments.
REFERENCES
"Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Rivest, and Clifford Stein (CLRS)
GeeksforGeeks
Coursera
YouTube tutorials