PARALLEL & DISTRIBUTED COMPUTING
LECTURE NO: 01
INTRODUCTION
Lecturer: Sardar Un Nisa
Sardarun.nisa@numl.edu.pk
Department of Computer Science
NUML, Rawalpindi
COURSE PRE-REQUISITES
Programming Experience (preferably Python/C++/Java)
Understanding of Computer Organization and Architecture
Understanding of Operating Systems
REQUIREMENTS & GRADING
Roughly:
50% Final Exam
25% Midterm Exam
25% Internal Evaluation:
Quizzes: 5 marks
Assignments: 5 marks
Presentation: 5 marks
Project: 10 marks
BOOKS
Some good books are:
Distributed Systems (3rd Edition)
Principles of Parallel Programming
Designing and Building Parallel Programs
Distributed and Cloud Computing
COURSE PROJECT
At the end of the semester, students need to submit a semester project, for example:
Distributed computing & smart city services
Large scale convolutional neural networks
Distributed computing with delay tolerant network
COURSE OVERVIEW
This course covers the following main concepts:
Concepts of parallel and distributed computing
Analysis and profiling of applications
Shared memory concepts
Distributed memory concepts
Parallel and distributed programming (OpenMP, MPI); a minimal OpenMP sketch follows this list
GPU based computing and programming (CUDA)
Virtualization
Cloud Computing, MapReduce
Grid Computing
Peer-to-Peer Computing
Future trends in computing
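As a first taste of the shared-memory programming covered later, here is a minimal OpenMP sketch (illustrative only; the array, its size, and the reduction example are our own choices, not from the slides). It sums one million elements by distributing loop iterations across threads:

// Minimal OpenMP sketch. Build with e.g.:  g++ -fopenmp omp_sum.cpp -o omp_sum
#include <cstdio>
#include <vector>
#include <omp.h>

int main() {
    const int n = 1000000;
    std::vector<double> a(n, 1.0);
    double sum = 0.0;

    // Distribute loop iterations across threads; the reduction clause gives
    // each thread a private partial sum and combines them at the end.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i];
    }

    std::printf("sum = %f, max threads = %d\n", sum, omp_get_max_threads());
    return 0;
}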
RECOMMENDED MATERIAL
Distributed Systems, Maarten van Steen & Andrew S. Tanenbaum,
3rd Edition (2020), Pearson.
Parallel Programming: Concepts and Practice, Bertil Schmidt, Jorge
Gonzalez-Dominguez, Christian Hundt, Moritz Schlarb, 1st Edition
(2018), Elsevier.
Parallel and High Performance Computing, Robert Robey and Yuliana
Zamora, 1st Edition (2021), Manning.
Distributed and Cloud Computing: From Parallel Processing to the
Internet of Things, Kai Hwang, Jack Dongarra, Geoffrey Fox, 1st
Edition (2012), Elsevier.
Multicore and GPU Programming: An Integrated Approach,
Gerassimos Barlas, 2nd Edition (2015), Elsevier.
Parallel Programming: For Multicore and Cluster Systems, Thomas
Rauber and Gudula Rünger, Springer Science & Business Media, 2013.
SINGLE PROCESSOR
ARCHITECTURE
MEMORY HIERARCHY
5 YEARS OF
TECHNOLOGY ADVANCE
PIPELINING
Sequential processing: tasks do not overlap, so the total time is the sum of all task times.
Pipelined processing: tasks overlap, reducing total time.
The broader point: traditional methods of improving single-core CPU performance
(higher clock speeds, deeper pipelines, more instruction-level parallelism (ILP)) became unsustainable.
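As a rough back-of-the-envelope illustration (a standard textbook model, not from the slides): if execution is divided into k pipeline stages of one cycle each, n tasks take n × k cycles sequentially but only about k + (n - 1) cycles when pipelined. For k = 5 stages and n = 100 tasks, that is 500 cycles versus 104, a speedup of roughly 4.8x, which approaches k as n grows.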
MULTICORE
TREND
APPLICATION
PARTITIONING
HIGH-PERFORMANCE
COMPUTING (HPC)
HPC is the use of parallel processing for
running advanced application programs
efficiently, reliably and quickly.
It applies especially to systems that operate above one teraFLOPS
(10^12 floating-point operations per second); see the worked example
below for a sense of scale.
The term HPC is occasionally used as a
synonym for supercomputing, although
technically a supercomputer is a system that
performs at or near the currently highest
operational rate for computers.
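For scale (illustrative numbers, not from the slides): one teraFLOPS is 10^12 floating-point operations per second. A hypothetical 3 GHz core completing 8 double-precision operations per cycle peaks at about 24 GFLOPS, so on the order of 40 such cores would be needed just to reach one teraFLOPS of peak throughput, and sustained application performance is typically only a fraction of peak.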
GPU-ACCELERATED
COMPUTING
GPU-accelerated computing is the use of a
graphics processing unit (GPU) together with a
CPU to accelerate deep learning, analytics, and
engineering applications.
Pioneered in 2007 by NVIDIA, GPU accelerators
now power energy-efficient data centers in
government labs, universities, enterprises, and
small-and-medium businesses around the world.
They play a huge role in accelerating applications
in platforms ranging from artificial intelligence to
cars, drones, and robots.
WHAT IS A GPU?
It is a processor optimized for 2D/3D graphics, video, visual computing, and display.
It is a highly parallel, highly multithreaded multiprocessor optimized for visual computing.
It provides real-time visual interaction with computed objects via graphics, images, and video.
It serves as both a programmable graphics processor and a scalable parallel computing platform.
Heterogeneous systems combine a GPU with a CPU; a minimal CUDA sketch follows.
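Here is a minimal CUDA C++ vector-addition sketch as a preview of the GPU programming covered later (illustrative only; the kernel name, sizes, and use of unified memory are our own choices, not from the slides):

// Minimal CUDA C++ sketch. Build with e.g.:  nvcc vec_add.cu -o vec_add
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread computes one element of c = a + b.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    // Unified memory keeps the sketch short; explicit cudaMemcpy is also common.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    std::printf("c[0] = %f (expected 3.0)\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}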
SGI ALTIX SUPERCOMPUTER (2300 PROCESSORS)
PARALLEL COMPUTERS
Virtually all stand-alone computers today are parallel from a hardware perspective:
Multiple functional units (L1 cache, L2 cache, branch, prefetch, decode, floating-point, graphics processing (GPU), integer, etc.)
Multiple execution units/cores
Multiple hardware threads
IBM BG/Q Compute Chip with 18 cores (PU) and 16 L2 Cache units (L2)
PARALLEL COMPUTERS
Networks connect multiple stand-alone computers (nodes) to make larger parallel computer clusters. In a parallel computer cluster:
Each compute node is a multi-processor parallel computer in itself.
Multiple compute nodes are networked together with an InfiniBand network.
Special-purpose nodes, also multi-processor, are used for other purposes.
SERIAL COMPUTING
PARALLEL COMPUTING
TYPES OF PARALLEL AND
DISTRIBUTED COMPUTING
Parallel Computing
Shared Memory
Distributed Memory (a minimal MPI message-passing sketch follows this list)
Distributed Computing
Cluster Computing
Grid Computing
Cloud Computing
Distributed Pervasive Systems
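Before turning to each type, here is a minimal MPI sketch of the distributed-memory, message-passing model (illustrative only; the payload value, tag, and two-process setup are our own choices, not from the slides):

// Minimal MPI sketch: rank 0 sends one integer to rank 1.
// Build/run with e.g.:  mpicxx mpi_msg.cpp -o mpi_msg && mpirun -np 2 ./mpi_msg
#include <cstdio>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's ID
    MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

    if (rank == 0 && size > 1) {
        int payload = 42;                  // hypothetical message content
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        std::printf("rank 0 sent %d\n", payload);
    } else if (rank == 1) {
        int payload = 0;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 1 received %d\n", payload);
    }

    MPI_Finalize();
    return 0;
}

Each MPI process has its own private address space, so data moves only through explicit sends and receives; this is the key contrast with the shared-memory OpenMP sketch earlier.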
PARALLEL COMPUTING
DISTRIBUTED (CLUSTER)
COMPUTING
Essentially a group of high-
end systems connected
through a LAN
Homogeneous: same OS,
near-identical hardware
Single managing node
DISTRIBUTED (GRID)
COMPUTING
Lots of nodes from everywhere
Heterogeneous
Dispersed across several organizations
Can easily span a wide-area network
To allow for collaboration, grids generally use virtual organizations.
In essence, a virtual organization is a grouping of users (or their IDs) that enables authorization decisions for resource allocation.
DISTRIBUTED (CLOUD)
COMPUTING
DISTRIBUTED (PERVASIVE)
COMPUTING
An emerging next generation of distributed systems in which nodes are small, mobile, and often embedded in a larger system; the defining characteristic is that the system naturally blends into the user's environment.
Three subtypes:
Ubiquitous computing systems: pervasive and
continuously present, i.e., there is a continuous
interaction between system and user.
Mobile computing systems: pervasive, but
emphasis is on the fact that devices are inherently
mobile.
Sensor (and actuator) networks: pervasive, with
emphasis on the actual (collaborative) sensing and
actuation of the environment.
WHY USE
PARALLEL
COMPUTING?
THE REAL WORLD IS
MASSIVELY PARALLEL
In the natural world, many
complex, interrelated
events are happening at
the same time, yet within
a temporal sequence.
Compared to serial
computing, parallel
computing is much better
suited for
modeling, simulating and
understanding complex,
real world phenomena.
For example, imagine trying to model such phenomena serially.
SAVE TIME AND/OR
MONEY
(MAIN REASONS)
In theory, throwing more
resources at a task will
shorten its time to
completion, with
potential cost savings.
Parallel computers can
be built from cheap,
commodity components.
SOLVE LARGER / MORE COMPLEX
PROBLEMS (MAIN REASONS)
Many problems are so large and/or complex that it is
impractical or impossible to solve them on a single
computer, especially given limited computer
memory.
Example: Web search engines/databases processing
millions of transactions every second
PROVIDE CONCURRENCY
(MAIN REASONS)
A single compute resource can only do one thing at a
time. Multiple compute resources can do many things
simultaneously.
Example: Collaborative Networks provide a global venue
where people from around the world can meet and
conduct work "virtually".
MAKE BETTER USE OF UNDERLYING
PARALLEL HARDWARE
(MAIN REASONS)
Modern computers, even
laptops, are parallel in
architecture with multiple
processors/cores.
Parallel software is
specifically intended for
parallel hardware with
multiple cores, threads,
etc.
In most cases, serial programs run on modern computers "waste" potential computing power.
Intel Xeon processor with 6 cores and 6 L3 cache units
THE FUTURE
(MAIN REASONS)
During the past 20+ years, the
trends indicated by ever faster
networks, distributed systems,
and multi-processor computer
architectures (even at the
desktop level) clearly show that
parallelism is the future of
computing.
In this same time period, there
has been a greater than
500,000x increase in
supercomputer performance, with
no end currently in sight.
The race is already on for
Exascale Computing!
Exaflop = 10^18 calculations per second
PARALLEL COMPUTING VS.
DISTRIBUTED COMPUTING
1. Parallel: many operations are performed simultaneously. Distributed: system components are located at different locations.
2. Parallel: a single computer is required. Distributed: multiple computers are used.
3. Parallel: multiple processors perform multiple operations. Distributed: multiple computers perform multiple operations.
PARALLEL COMPUTING VS. DISTRIBUTED COMPUTING (CONTINUED)
4. Parallel: may have shared or distributed memory. Distributed: has only distributed memory.
5. Parallel: processors communicate with each other through a bus. Distributed: computers communicate with each other through message passing (as in the MPI sketch earlier).
6. Parallel: improves system performance. Distributed: improves system scalability, fault tolerance, and resource-sharing capabilities.
THAT’S ALL FOR TODAY!!