Data Structures
1
Data Structures
Data structures reflect the organization of data in
software
Organization of data determines:
Computational complexity of algorithms
Maintainability of applications
Versatility for a range of applications
Some user interface characteristics
2
Data Structures (Cont’d)
• Data structures and databases serve different
primary purposes, they often complement each
other in data-centric applications.
• Databases use advanced data structures for their
internal operations, and applications use data
structures for real-time data processing after
retrieving the data from databases.
3
Data Structures (Cont’d)
Data Structure:
•Definition: A data structure is a specific means of organizing and storing
data in a computer such that access to and modifications of the data can be
performed efficiently. Examples include arrays, linked lists, trees, hash
tables, stacks, and queues.
•Use Case: They are primarily used in algorithms for quick data access,
manipulation, and representation.
•Scope: Typically reside in the application's memory (RAM) and are
transient, meaning their existence is often temporary for the duration of the
program execution.
•Size: Suited for smaller amounts of data that require quick, real-time
operations.
•Persistence: Not inherently persistent. Data in a typical data structure
disappears after the program ends unless steps are taken to save or serialize
the data.
•Structure: Highly structured. Designed to optimize specific operations (e.g.,
a binary search tree facilitates fast search operations).
4
Data Structures (Cont’d)
Database:
•Definition: A database is a collection of related data that is
organized and stored in a manner that ensures data integrity, data
availability, and often, persistence across sessions or applications. It
often includes mechanisms to add, modify, delete, and query data.
•Use Case: Designed for storing, retrieving, updating, and managing
data in a consistent and reliable manner.
•Scope: Typically resides on persistent storage like hard disks,
SSDs, or distributed storage systems.
•Size: Can store vast amounts of data, often many terabytes or
even petabytes.
•Persistence: Inherently persistent. The primary purpose is to retain
data over long periods, making it available and consistent for future
retrieval and updates.
•Structure: While they can be structured (like relational databases),
there are also NoSQL databases that can be semi-structured or even
unstructured.
Data Structures (Cont’d)
Relation between Data Structure and Database:
•Foundation: At the core of every database management system (DBMS),
various data structures are implemented to ensure efficient data access,
indexing, caching, and concurrency control.
•Temporary vs. Permanent: Data structures can be seen as temporary in-
memory stores for computation, while databases are more permanent
storage solutions.
•Use Together: In many applications, both databases and data structures
are used. Data might be retrieved from a database and then processed and
manipulated in data structures within an application.
•Performance: To optimize database operations like indexing, searching, or
sorting, intricate data structures are employed within the database system.
For instance, B-trees are commonly used in relational databases for indexing.
6
Structured Organization of Data
Logically related data are grouped together to form a
structure.
A data structure may contain elements that are
themselves be data structures. This results in a
decomposition hierarchy.
7
Structured Organisation of Data (cont’d.)
Flat list of data Hierarchical organisation
items of data
Door width Building
Door height
Window width
Window height
Door Window
Height Width Height Width
8
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
9
Introduction
Basic data types are groups such as integer, string and
real. Abstract data types are more general.
When details are left out, data often have standard
forms. Abstract data types are standard forms of data
structures.
Common abstract data types are:
Linked lists
Graphs
Trees
Stacks
Queues
10
Operations on Abstract Data Types
Standard operations are defined on each abstract data
type. Such operations permit descriptions of algorithms
using standard vocabulary.
For example, a person who knows about queues will
understand when we say:
“Element X is added to the queue and element Y is
removed from the queue.“
Note that each element in an abstract data type may be
assigned a basic element type.
11
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Linked Lists
A linked list organises a collection of data items
(elements) such that elements can easily be added to
and deleted from any position in the list.
Element Next
Element Next
Element Next
NULL
There are also doubly linked lists. They are not in the
scope of this module
13
Linked Lists: Simple Example
Linked Lists: Simple Example
Linked Lists: Simple Example…
Linked Lists: Simple Example…
Linked Lists: Simple Example…
Explanation:
•Each Station has a name (like "Station A") and a reference to the
next_station.
•The TrainRoute class represents our monorail route.
•We can add new stations to the route using add_station.
•We can display the entire route using display_route.
•In the example, we added four stations (A, B, C, and D) to our route and
then displayed the entire route.
Station A -> Station B -> Station C -> Station D -> End of Line
Operations on Linked Lists
Operations such as inserting an element and deleting
an element are easily carried out. Only references to
next elements are updated in these operations.
There is no need to copy or move large blocks of data
to facilitate insertion and deletion of elements.
Lists grow and shrink dynamically.
19
Linked Lists: Insertion
New data
Element Next Element Next
Element Next
Element Next
NULL
20
Linked Lists: Deletion
Element Next
Deleted element
Element Next
Element Next
NULL
21
Python Example
22
Python Example…
23
Python Example…
24
Python Example… (def delete_station)
25
Python Example…
Initial Route:
Station A -> Station B -> Station C -> Station D -> End of Line
After inserting 'Station E' after 'Station B':
Station A -> Station B -> Station E -> Station C -> Station D ->
End of Line
After deleting 'Station E':
Station A -> Station B -> Station C -> Station D -> End of Line
After deleting 'Station A':
Station B -> Station C -> Station D -> End of Line
26
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Graphs
A graph data structure consists of vertices
and elements.
D E G
C I
F H
B J K
A
Example: A network of computers is
represented as a graph.
28
Types of Graphs
Directed graph
Each edge of the graph has a direction (arrow).
Connected graph
A graph in which there exists a path between every
pair of nodes.
Acyclic graph
A graph not containing any cycle.
29
Types of Graphs (cont’d.)
Weighted graph
Each edge has an allotted value (weight).
For example, this weight can represent distance or
duration.
Planar graph
A graph is planar if it can be drawn in 2-D so that
the edges do not cross over each other.
30
Example Graph
Intersections (Nodes):
A: Town Hall
B: Library
C: School
D: Park
E: Shopping Mall
Roads (Edges):
A <-> B: Main St.
A <-> C: First Ave.
B <-> D: Second Ave.
C <-> D: Third St.
C <-> E: Fourth St.
31
Example Graph (cont’d.)
class Graph:
def __init__(self):
self.adjacency_list = {}
def add_node(self, node):
"""Add a node to the graph."""
if node not in self.adjacency_list:
self.adjacency_list[node] = []
def add_edge(self, node1, node2, road_name):
"""Add an edge between two nodes with a road name."""
if node1 in self.adjacency_list and node2 in self.adjacency_list:
self.adjacency_list[node1].append((node2, road_name))
self.adjacency_list[node2].append((node1, road_name))
32
Example Graph (cont’d.)
def display(self):
"""Display the graph."""
for node, edges in self.adjacency_list.items():
connections = ", ".join([f"{edge[0]} ({edge[1]})" for edge in edges])
print(f"{node} -> {connections}")
List Comprehension:
[f"{edge[0]} ({edge[1]})" for edge in edges]
This is a list comprehension, which is a concise way to generate a list in
Python. Here, for every edge in the edges list, it creates a string in the
format node (road_name).
edge[0]: Represents the connected node (or destination node).
edge[1]: Represents the name of the road connecting the nodes.
33
Example Graph (cont’d.)
def display(self):
"""Display the graph."""
for node, edges in self.adjacency_list.items():
connections = ", ".join([f"{edge[0]} ({edge[1]})" for edge in edges])
print(f"{node} -> {connections}")
String Formatting:
The f before the string indicates it's an f-string (formatted string literal)
which is a feature introduced in Python 3.6. It allows embedded
expressions inside string literals. Inside the f-string:
{edge[0]}: This will be replaced by the first element of the edge (i.e.,
destination node).
{edge[1]}: This will be replaced by the second element of the edge (i.e.,
road name).
34
Example Graph (cont’d.)
def display(self):
"""Display the graph."""
for node, edges in self.adjacency_list.items():
connections = ", ".join([f"{edge[0]} ({edge[1]})" for edge in edges])
print(f"{node} -> {connections}")
Joining:
", ".join(...)
The join() method is a string method that takes a list of strings and
concatenates them into a single string. The string on which the join()
method is called will be used as a separator. Here, it's called on the string
", ", so it will concatenate the list of strings generated by the list
comprehension with a comma followed by a space.
35
Example Graph (cont’d.)
# Using the Graph to represent the road network:
road_network = Graph()
locations = ["Town Hall", "Library", "School", "Park", "Shopping Mall"]
for loc in locations:
road_network.add_node(loc)
road_network.add_edge("Town Hall", "Library", "Main St.")
road_network.add_edge("Town Hall", "School", "First Ave.")
road_network.add_edge("Library", "Park", "Second Ave.")
road_network.add_edge("School", "Park", "Third St.")
road_network.add_edge("School", "Shopping Mall", "Fourth St.")
road_network.display()
36
Example Graph (cont’d.)
In this implementation:
The Graph class has an adjacency list to represent nodes and their
connections.
•The add_node method lets us add intersections/locations.
•The add_edge method lets us define roads between these locations, and we
can even store road names.
•The display method simply shows all locations and their direct road
connections.
This is a simple undirected graph, but in real-world scenarios, we might use
weighted edges (to represent distances or travel times), directed edges (to
represent one-way streets), and more sophisticated data structures for efficient
operations.
37
Possible Applications of Graphs
Reasoning with inter-connected objects, for example:
Road networks
Office buildings (access, fire escapes, etc.)
Structures and frameworks
Social networks (for example, those used to deal
with sicknesses such as SARS and AIDS)
38
Review Questions
What are data structures?
What is an advantage of data structures?
Name the five common abstract data types.
39
Answers
What are data structures?
Organizations of data
What is an advantage of data structures?
They provide understandability and maintainability
through common types.
Name the five common abstract data types.
Linked lists, graphs, trees, stacks and queues.
40
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Trees
A tree is a graph that does not contain any cycles.
Efficient algorithms are available for processing
trees.
A tree Not a tree
42
Example: Classification tree
Bridges
N1
N2 N3 N4
Country USA Switzerland UK
N5 N6 N7 N8
Type Cable- Pre-stressed Cable- Arch
stayed stayed
N9 N10 N11 N12
Span Long Short
Medium Long
Name Tacoma
Narrows Lutrive London
43
Example: Classification tree
Example: Classification tree (Class Node)
Example: Classification tree (Class Node)
Root
|
|-- USA
|
|-- Suspension
|
|-- Medium
|
|-- Brooklyn Bridge
"USA" is a child node of the root.
"Suspension" is a child node of "USA".
"Medium" is a child node of "Suspension".
"Brooklyn Bridge" is a child node of "Medium".
Each level of the hierarchy is represented by nodes, and the add_child
method helps in adding nodes at each level as we move down the tree.
Example: Classification tree
In this example:
•Node class represents each node in the tree.
•BridgeTree class represents the bridge tree, with a root node titled "Bridges".
•The add_bridge method of BridgeTree allows you to add a bridge to the tree
by specifying the country, type, span, and name.
•The display method recursively prints the tree in an indented format.
Example: Classification tree
add_child(self,data) method
Example: Classification tree
Bridges
USA
Cable-Stayed
Long
Tacoma Narrows
Switzerland
Pre-Stressed
Medium
Lutrive
Cable-Stayed
Long
UK
Arch
Short
London
Graphs and Trees
Graph Connected graph
Connected acyclic
graph (Tree) Directed graph
50
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Stacks
Stacks are LIFO (Last In First Out) data structures.
Operations on stacks are to add an element to the top
of the stack (push) and extract an element from the top
of the stack (pop).
Add an element
(push) Remove an element
(pop)
52
Stacks in Simple Terms
Stacks in Context of Parking
Stack: Example
Stack: Example..
Another Example of Stacks
Stacks can be used to keep track of changes in order
to be able to carry out backtracking (undo).
The changes are added (push) to the stack. During
the undo operation, the changes are withdrawn from
the top of the stack (pop).
Therefore, the last changes are there to be popped
out (LIFO).
57
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Queues
Queues are FIFO (First In First Out) data structures.
Standard operations are Enqueue (add element to
the tail of the queue) and Dequeue (take out
element from the front of the queue).
Examples: order processing queues, messaging
queues in computer networks, etc.
Add an element
(Enqueue)
Remove an element
(Dequeue)
59
Queues
Queues are a common data structure that follow the First In, First Out (FIFO)
principle.
Simple Analogy: Imagine a line of people waiting at a bank. The first person in
line is the first to be served, and as new people arrive, they join the end of the
line. That's how a queue works.
Other Example:
Imagine a toll booth on a highway where vehicles have to stop and pay a toll
before proceeding. During rush hours or holidays, multiple vehicles line up to
pass through the toll booth. The first vehicle in line will be the first to pay and
proceed, and vehicles that arrive later join the end of the queue. This behavior
follows the First In, First Out (FIFO) principle of a Queue.
In this context:
Enqueue: A vehicle arriving at the toll booth and getting in line.
Dequeue: A vehicle paying the toll and leaving the queue.
Peek (or Front): Checking which vehicle is next in line to pay.
60
Remarks
Abductive tasks
Abductive engineering tasks are well supported by the
abstract data types.
Exponential complexity
The presence of stacks in the loops of a program may
indicate exponential complexity.
61
Review Questions
What is the difference between a stack and a
queue?
Name two uses for stacks.
What feature of trees makes reasoning with
trees easier than with other types of graphs?
62
Answers
What is the difference between a stack and a
queue?
In a stack, the last data entry is used first while
in a queue, the first data entry is used first
Name two uses for stacks.
Undo and backtracking
What feature of trees makes reasoning with trees
easier than with other types of graphs?
A tree graph contains no cycles (acyclic) and
therefore, there is no risk of endless loops.
63
Discussion
Once data is organized into forms such as trees,
stacks and queues, standard algorithms can be
used to process them.
Properly organized data lead to easy-to-
understand user interfaces
Data structure design is an important task.
It should begin at the start of software
application development.
64
Outline
Abstract Data Types
Linked lists
Graphs
Trees
Stacks
Queues
Network Science
Network Science
Networks
A group of elements interconnected by links.
Random Networks
A network which has a random distribution between the
number of elements and the number of links.
Number of Nodes
Characteristic Node
Number of Links
66
Examples of Random Networks
Roads and crossroads
Rooms, doors and corridors in a building
Links and elements of a structure
Water distribution networks
Any situation where the geometric proximity is
important. For example, the length of the links
significantly contributes to costs.
67
Scale-Free Networks
Scale-Free Networks
A network with a logarithmic relationship between the
number of elements and the number of links.
Number of elements Number of elements (log)
Number of Number of
Links links (log)
Why scale-free? With less (or more) granularity, the
form does not change.
68
Examples of Scale-Free Networks
Construction projects (general enterprises and
subcontractors)
Networks of suppliers
Ski resorts – networks of ski runs and ski lifts,
chair lifts and cable cars
Airway networks
The Internet
Social networks
Cellular networks
69
70
Properties of a Scale-Free Network
The presence of “hubs” – a small number of
elements with many links
The small-world phenomenon (nodes can be
accessed through a small number of links.)
Elements with many links are preferred during
the growth of a network
Robust in the event of a random failure
Sensitive to targeted attacks
71
Questions
• What is the point of having standard data structures?
There are libraries of functions for using them and they
have known strengths and weaknesses.
• What is the danger of implementing stacks?
Stacks may lead to systematic backtracking and this
causes exponential complexity.
• When are scale-free networks robust and when are
they vulnerable?
They are robust in a random attack (or breakdown) and
they are vulnerable in a targeted attack.
72
Further Reading
Kruse, R. L., Leung, B. P. and Tondo, C. L. Data
Structures and Program Design in C, Englewood
Cliffs NJ: Prentice Hall, 1991
Liu, C. L., Elements of Discrete Mathematics, New
York: McGraw-Hill, 1997
Engineering Informatics - Fundamentals of
Computer-Aided Engineering, B. Raphael and I.F.C.
Smith, Wiley, 2013
Barabási Albert-László, Linked: How Everything is
Connected to Everything Else, Plume, 2003
73
Extra Remarks
Exponential complexity
The presence of stacks in the loops of a program may
indicate exponential complexity.
The statement is not universally true. Stacks
themselves don't inherently lead to exponential
complexity. However, if we were to look at a scenario
where we're dealing with recursive calls, where each
call further splits into more calls, that could
potentially lead to exponential growth.
74
Extra Remarks…
Exponential complexity
Imagine you're trying to plan a route for a delivery
truck. The truck has to visit multiple houses in a
city. But, due to certain constraints, the truck can't
just take the shortest path from one house to the
next. Instead, at each house, it evaluates multiple
potential paths to reach the next house.
75
Extra Remarks…
Exponential complexity
Let's visualize this scenario:
At House A, the truck can take two paths to get to
House B.
From House B, for each path it took to get there,
the truck can take two paths to get to House C.
76
Extra Remarks…
Exponential complexity
This pattern continues for each subsequent house.
If you think about the number of potential routes
the truck can take:
- To House B: 2 routes
- To House C: 2 x 2 = 4 routes
- To House D: 2 x 2 x 2 = 8 routes
- And so on...
The number of potential routes grows
exponentially. 77