Distributed Systems Design
COMP 6231
Introduction
Lecture 1
Essam Mansour
Who am I?
Dr. Essam Mansour
Assistant Professor
www.emansour.com
@DrEssam_Mansour
Concordia Data Systems (CoDS) Lab
2019 - Now
2013 - 2019
2009 - 2013
2008 - 2009
Ireland
2004 - 2008: Ph.D. in Computer Science (Data Management, Semi-structured data, Active Databases)
Egypt
1997 - 2003: B.Sc. and M.Sc. (Database Systems, Software Engineering, Temporal Query Processing)
• Large-scale Analytics on Strings
• Querying Geo-Distributed Graphs
• Mobile Data Management
• Energy-aware algorithms
• Elastic OLTP Systems
• Data Integration and Discovery
30+ Publications
Expertise
• Big data
• Database systems
• Data discovery and integration
• Distributed and parallel systems
• Graph and web data management
Venues
• VLDBJ
• PVLDB
• SIGMOD
• ICDE
• WWW
• CIKM
• …
Supervision/Mentoring Experience
- Co-supervising postgraduate students:
  • 3 PhD students
    - 2 students in the string analytics project (2015 and 2017)
    - 1 student in the decentralized graph management project (2018)
    - 1 VLDBJ, 2 PVLDB, 1 ICDE, and 1 CIKM research papers
    - 1 SIGMOD, 1 VLDB, and 1 ICDE demo/poster papers
  • 2 Master's students
    - 1 student in the string analytics project (2012)
    - 1 student in the mobile data management project (2009)
    - 1 PVLDB and 1 ICSOFT research papers
- Mentoring different postdocs.
Academic Service
• A Program Committee member of:
• I also regularly review papers for a number of top international journals including:
• ACM Transactions on Database Systems (TODS)
• The International Journal on Very Large Data Bases (VLDB Journal)
• IEEE Transactions on Knowledge and Data Engineering (TKDE)
• IEEE Transactions on Parallel and Distributed Systems (TPDS)
• IEEE Transactions on Cloud Computing (TCC)
• IEEE Transactions on Big Data (TBD)
Teaching Style
• I like interaction in class
• I like to ask questions
• I like to be asked questions
• I like to know (and memorize) your names
• I like to give practical assignments and projects
• I like to learn …
… and now
Guest Lecture, CMPT 523 Distributed Systems, Spring 2016
2009
On the Verge of a Disruptive Century: Breakthroughs
• Ubiquitous Computing
• Astronomy
• Smaller, Faster, Cheaper Sensors
• Gene Sequencing and Biotechnology
A Common Theme is Data
The amount of data is only growing…
1.2 zettabytes (1 ZB = 10^21 B, or 1 billion TB)
We Live in a World of Data…
What Do We Do With Data?
• Store
• Share
• Access
• Process
• Encrypt
• … and more!
We want to do these seamlessly...
Using Diverse Interfaces & Devices
• Mobile Devices
• Computers
• Consumer Electronics
• Personal Monitors and Sensors
• … and even appliances
We also want to access, share, and process our data from all of our devices, anytime, anywhere!
Data Becoming Critical to Our Lives
Domains of Data:
• Health
• Science
• Education
• Work
• Environment
• Finance
• … and more
How to Store and Process Data at Scale?
• A system can be scaled:
• Either vertically (or up)
• Can be achieved by hardware upgrades (e.g., faster CPU, more
memory, and/or larger disk)
• And/Or horizontally (or out)
• Can be achieved by adding more machines
Vertical Scaling
• Caveat: Individual computers can still suffer from limited resources
with respect to the scale of today’s problems
1. Caches and Memory:
L1 Cache: 16-32 KB/Core, 4-5 cycles
L2 Cache: 128-256 KB/Core, 12-15 cycles
L3 Cache: 512 KB-2 MB/Core, 30-50 cycles
Main Memory: 8 GB-128 GB, 300+ cycles
2. Disks: some advancements, but still:
Limited capacity
Limited number of channels
Limited bandwidth
3. Processors:
Moore’s law still holds
Chip Multiprocessors (CMPs) are now available
[Figure: a single processor chip (P, L1, L2) next to a CMP (four processors P, each with an L1 cache, joined by an interconnect to an L2 cache)]
But up until a few years ago, CPU speed grew at a rate of 55% annually, while memory speed grew at a rate of only 7%
[Figure: the processor-memory speed gap]
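To get a feel for how quickly the two growth rates diverge, the following sketch compounds them year over year. It assumes the 55% and 7% figures are constant annual rates and that both components start at the same relative speed; the function name `speed_gap` and the chosen horizons are illustrative, not part of the lecture.

```python
def speed_gap(years, cpu_growth=0.55, mem_growth=0.07):
    """Relative CPU-to-memory speed ratio after `years` of compounding.

    Assumption: both start at ratio 1.0 and grow at constant annual
    rates, which simplifies the real historical trend.
    """
    cpu_speed = (1 + cpu_growth) ** years
    mem_speed = (1 + mem_growth) ** years
    return cpu_speed / mem_speed


if __name__ == "__main__":
    # Print the gap for a few illustrative horizons.
    for years in (1, 5, 10):
        print(f"after {years:2d} years: gap = {speed_gap(years):.1f}x")
```

Because the ratio itself compounds, even a modest number of years produces a gap of more than an order of magnitude under these assumptions.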
Even if 100s or 1000s of cores are placed on a CMP, it is a challenge
to deliver input data to these cores fast enough for processing
[Figure: a CMP (four processors P with L1 caches, an interconnect, and an L2 cache) with memory, loading a data set of 4 TB over 4 IO channels of 100 MB/s each]
10,000 seconds (or about 3 hours) to load the data
Vertical scaling suffers from limited scalability
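The load-time figure for the single machine can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 TB = 10^12 B, 1 MB = 10^6 B) and that all channels are used in parallel at full bandwidth; the function and constant names are hypothetical.

```python
TB = 10**12  # decimal terabyte, an assumption about the slide's units
MB = 10**6   # decimal megabyte


def load_time_seconds(data_bytes, channels, bytes_per_second_per_channel):
    """Time to stream `data_bytes` over `channels` IO channels used in parallel."""
    aggregate_bandwidth = channels * bytes_per_second_per_channel
    return data_bytes / aggregate_bandwidth


if __name__ == "__main__":
    seconds = load_time_seconds(4 * TB, channels=4,
                                bytes_per_second_per_channel=100 * MB)
    print(f"{seconds:.0f} s = {seconds / 3600:.2f} h")
```

Under these assumptions the result is 10,000 seconds, which is a little under 3 hours.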
Horizontal Scaling
[Figure: a data set of 4 TB divided into splits across 100 machines, each with its own processor (P), L1 and L2 caches, and memory holding part of the data]
Only 3 minutes to load the data
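The idea of dividing a data set into splits can be sketched as follows. The per-machine bandwidth is not stated in the slides, so the 400 MB/s used in the usage example (4 channels of 100 MB/s per machine) is purely an assumption, as are the function names; with a different bandwidth the estimate changes, but it stays in the range of minutes rather than hours.

```python
TB = 10**12  # decimal terabyte (assumed units)
MB = 10**6   # decimal megabyte


def split_ranges(total_bytes, machines):
    """Divide [0, total_bytes) into `machines` contiguous, near-equal byte ranges."""
    base, extra = divmod(total_bytes, machines)
    ranges, start = [], 0
    for i in range(machines):
        # The first `extra` splits take one additional byte each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_load_seconds(total_bytes, machines, bytes_per_second_per_machine):
    """Load time when every machine reads its own split concurrently.

    The slowest (largest) split determines the overall time.
    """
    largest = max(end - start for start, end in split_ranges(total_bytes, machines))
    return largest / bytes_per_second_per_machine


if __name__ == "__main__":
    # Hypothetical per-machine bandwidth: 4 channels x 100 MB/s.
    print(parallel_load_seconds(4 * TB, machines=100,
                                bytes_per_second_per_machine=400 * MB), "seconds")
```

Each of the 100 machines ends up with a 40 GB split, so the load time shrinks roughly in proportion to the number of machines, provided the splits can be read independently.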
Requirements
• But, this necessitates:
• A way to express the problem in terms of parallel processes and execute them
on different machines (Programming and Concurrency Models)
• A way to organize processes (Architectures)
• A way for distributed processes to exchange information (Communication
Paradigms)
• A way to locate and share resources (Naming Protocols)
• A way for distributed processes to cooperate, synchronize with one another,
and agree on shared values (Synchronization)
• A way to reduce latency, enhance reliability, and improve performance
(Caching, Replication, and Consistency)
• A way to enhance load scalability, reduce diversity across heterogeneous
systems, and provide a high degree of portability and flexibility
(Virtualization)
• A way to recover from partial failures (Fault Tolerance)
Degree of Parallelism
• Task Parallelism (figure: DATA kept whole)
• Data Parallelism (figure: DATA split into D, A, T, A)
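The two forms of parallelism can be contrasted in a short sketch. It uses a thread pool from Python's standard library as a stand-in for separate machines; the two example tasks (`count_words`, `count_chars`) and the wrapper functions are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    return len(text.split())


def count_chars(text):
    return len(text)


def data_parallel(chunks):
    """Data parallelism: the SAME task runs on DIFFERENT pieces of the data."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(count_words, chunks))


def task_parallel(data):
    """Task parallelism: DIFFERENT tasks run concurrently on the SAME data."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(count_words, data), pool.submit(count_chars, data)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(data_parallel(["a b", "c d e"]))  # one word count per chunk
    print(task_parallel("a b c"))           # [word count, character count]
```

In a real distributed setting the workers would be processes on different machines, and the chunks or tasks would be shipped to them over the network.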
So, What is a Distributed System?
A distributed system is:
• One in which components located at networked computers communicate and coordinate their actions only by passing messages
• A collection of independent computers that appears to its users as a single coherent system
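Coordination purely through messages can be sketched with two components joined by a socket pair. The two threads here only stand in for components on networked computers: they share no variables and interact solely through the bytes they send. The `worker` and `exchange` names and the `ack:` reply format are assumptions made for this example.

```python
import socket
import threading


def worker(conn):
    """Remote component: wait for a request, then reply with an acknowledgement."""
    # Simplification: a short message is assumed to arrive in a single recv().
    request = conn.recv(1024).decode()
    conn.sendall(f"ack:{request}".encode())
    conn.close()


def exchange(message):
    """Local component: send a message and wait for the reply."""
    local_end, remote_end = socket.socketpair()
    thread = threading.Thread(target=worker, args=(remote_end,))
    thread.start()
    local_end.sendall(message.encode())
    reply = local_end.recv(1024).decode()
    thread.join()
    local_end.close()
    return reply


if __name__ == "__main__":
    print(exchange("ping"))
```

Replacing the socket pair with a TCP connection between two hosts would keep the same structure while making the components truly remote.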
Features
• Distributed Systems imply four main features:
1 Geographical Separation
2 No Common Physical Clock
3 No Common Physical Memory
4 Autonomy and Heterogeneity
Parallel vs. Distributed Systems
• Distributed systems contrast with parallel systems, which entail:
1 Strong Coupling
2 A Common Physical Clock
3 A Shared Physical Memory
4 Homogeneity
Callout: "I am not sure. Can you verify that?"
Some Administrivia!
Distributed Systems Design
0. Introduction
1. Networking
2. Remote Procedure Calls
3. Architectures
4. Naming
5. Synchronization
6. Caching
7. Replication
8. Distributed Frameworks
9. Fault Tolerance
Considered: a reasonably critical and comprehensive perspective.
Thoughtful: a fluent, flexible, and efficient perspective.
Masterful: a powerful and illuminating perspective.
Course Objectives
The course aims at providing an in-depth understanding of, and hands-on experience with:
• Principles on which distributed systems are based
• Principles on which distributed systems are optimized
• Distributed system programming models and analytics engines
• How modern distributed systems meet the demands of contemporary distributed applications
Teaching Team
COMP 6231 Teaching Team
• Instructor: Essam Mansour (EM)
• Teaching Assistants
EM Office Hours
• Monday, 2:30 - 4:00PM
• Welcome when my office door is open
• By appointment
• My office: EV 3.251
TAs Office Hours
• To be announced
• By appointment
Textbooks
Paper Reading and Reviews
Teaching Methods
13 Lectures
• Motivate learning
• Provide a framework or roadmap to organize the information of
the course
• Explain subjects and reinforce the critical big ideas
12 Labs
• Get you to reveal what you don’t understand, so that we can help you
• Allow you to practice skills you will need to become an expert
Assignments and Projects
Assignments
• Required problem-solving and reading assignments
Projects
• A large programming project
Projects
Project Proposal
Introduction and Related Work
Proposed Solution
Experimental Evaluation
Paper Demo
Assessment Methods
How do we measure learning?
Type                     #    Weight
Project                  1    35%
Final Exam               1    30%
Problem Set              4    20%
Presentations/reviews    3    15%
Next Lecture
• Networking - Part I
Questions?