Roll # 2023F-BSE-190 Section: D Name: TABIN ALAM
Department: Software Engineering Program: BS (SE)
Assignment 3
SE – 203T Data Structures and Algorithms
Announced Date:23-12-2024 Due Date:20-01-2025 Total Marks: 30 Marks Marks Obtained:
Teacher Name: Ms. Dur e Shawar Agha, Ms. Falak Saleem
Sr. No Course Learning Outcomes PLOs Blooms
Taxonomy
PLO_4 C3 (Apply)
CLO3 Construct non-linear data structures and perform its (Design of
operations. Solutions)
Q1) Consider Problem Solving Scenario (case study/research paper) for non-linear type and apply
appropriate data structure. Also you have to suggest or replace already implemented data stricture
to improve complexity for selected scenario.
Note: Attach the case study / research paper with the assignment.
Problem Solving Scenario
Scenario: Social Network Graph
In this scenario, we are working with a social network platform, where we want to analyze connections
between users. The network has millions of users, and each user can be connected to hundreds or
thousands of other users.
Each user has certain attributes such as name, age, location, interests, etc. We are interested in:
1. Friendship Recommendations: Suggest friends to users based on mutual connections.
2. Shortest Path: Determine the shortest path between two users (e.g., how one user can reach
another via the fewest intermediaries).
3. Community Detection: Identify groups of users who are closely connected.
The platform needs to handle real-time updates like adding or removing users and connections
(friendships), and quickly process complex queries about user connections.
Non-linear Data Structures for the Scenario
Given that the problem involves relationships between users, a graph is an ideal data structure to represent
the network. In a graph:
Nodes represent users.
Edges represent connections (friendships) between users.
Graphs are inherently non-linear data structures because there is no hierarchical order (unlike trees) —
each node can have multiple connections, and the relationships between nodes are more flexible.
Types of Graphs:
1. Undirected Graph: This is ideal for representing friendships, as if User A is friends with User B,
then User B is also friends with User A.
2. Directed Graph: Useful in cases where one-way relationships exist, e.g., user A follows user B,
but user B doesn't necessarily follow user A.
Suggested Data Structures for the Graph:
1. Adjacency List: A space-efficient representation of a graph, where each node has a list of adjacent
nodes (friends). This is particularly useful if the graph is sparse, meaning not every user is friends
with every other user.
2. Adjacency Matrix: An alternative representation where a 2D matrix is used to represent edges
between nodes. This is more useful for dense graphs but uses more memory. It’s less efficient for
sparse graphs.
3. Hash Map of Adjacency Lists: A hash map could be used to dynamically store the list of friends
for each user. This allows for fast access and updates.
Suggested Improvements for Performance/Complexity:
1. Optimizing for Friendship Recommendations:
o Current Approach: We can compute recommendations by finding mutual friends between
users.
o Improvement: Use a hash-based set intersection to quickly find mutual friends. Since sets
have average time complexity of O(1) for insertions and lookups, this significantly reduces
the time for finding mutual friends compared to iterating through the lists.
Instead of iterating over each user’s friend list for mutual friends, you can store the friends in a
hash set and perform efficient set intersections for recommendations.
2. Optimizing for Shortest Path (e.g., BFS):
o Current Approach: For finding the shortest path between users, we can use Breadth-First
Search (BFS) for an unweighted graph (like friendships).
o Improvement: To improve the performance of BFS, we can implement memoization to
avoid recalculating paths between the same set of nodes repeatedly. This can significantly
reduce redundant computations in large networks.
3. Handling Dynamic Graph Changes (Real-Time Updates):
o Current Approach: Using basic adjacency lists or matrices might lead to slow updates
(insertion/deletion of users or connections).
o Improvement: Implement dynamic connectivity structures like Union-Find (Disjoint Set
Union, DSU), which allows for efficient union and find operations to track connected
components and handle real-time updates more effectively.
4. Optimizing for Large-Scale Data:
o For large graphs (millions of nodes), using graph databases such as Neo4j or Amazon
Neptune might be more suitable. These databases are specifically optimized for graph
queries and can handle complex queries (e.g., community detection, pathfinding) much
faster than in-memory data structures.
5. Community Detection:
o Current Approach: A simple approach could involve clustering algorithms or applying
graph traversal techniques like DFS/BFS.
o Improvement: Consider using more advanced techniques like Louvain Method or Spectral
Clustering, which are efficient algorithms for detecting communities in large graphs. These
techniques improve over simpler approaches like BFS by finding groups of nodes that are
more densely connected with each other compared to other nodes in the graph.
Conclusion
In this social network scenario, using non-linear data structures like graphs is essential to represent
complex relationships between users. Implementing adjacency lists with hash sets for fast lookups, BFS for
shortest paths, and Union-Find for dynamic connectivity will improve both the time complexity and
scalability of the system. For very large-scale networks, transitioning to a graph database would be a
further improvement.
This approach ensures that we can efficiently solve problems related to friendship recommendations,
pathfinding, and community detection, all of which are crucial for real-time social network analysis.