
Advanced Computer Systems Architecture

Introduction
Sudipta Mahapatra, Ph.D.
Department of E & ECE
IIT Kharagpur

Resources: Notes by David Patterson (http://www.cs.berkeley.edu/~patterson) and
PowerPoint slides for the 5th edition of the book Computer Architecture: A
Quantitative Approach by Hennessy and Patterson (H & P).
1
Introduction
The revolution in computer system architecture has been driven
primarily by:
• Improvements in semiconductor technology
– Feature size, clock speed
• Improvements in computer architectures
– Enabled by HLL compilers, UNIX
– Led to RISC architectures – use of ILP and Caches
• Together, these have resulted in
– Lightweight computers
– SaaS, Cloud Computing
• Applications evolution:
– Speech, sound, images, video, “augmented reality”
2
Introduction (Contd.)
The dramatic and unprecedented growth in the area
of computer systems can be attributed to:
• Advances in technology used to build computers
• Innovations in computer design
The main task of a computer architect is to maximize
the performance within a specified cost.

3
Performance Improvement

[Figure: processor performance improvement over time, with the RISC era marked]

4
Current Architectural Trend
• Cannot continue to leverage Instruction-Level
parallelism (ILP)
– The era of rapid single-processor performance improvement
ended around 2003
• New models for performance:
– Data-level parallelism (DLP)
– Thread-level parallelism (TLP)
– Request-level parallelism (RLP)
• These require an explicit restructuring of the
application
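
The kind of restructuring meant here can be illustrated with a small sketch. The function names, the chunking scheme, and the worker count below are illustrative assumptions, not part of the course material. The sequential loop is rewritten so that independent chunks of the data are processed by separate workers and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares_sequential(values):
    """Original form: one loop that walks the whole data set."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_of_squares_chunked(values, workers=4):
    """Restructured form: split the data into independent chunks,
    process each chunk separately, then combine the partial sums.

    Note: CPython threads do not run pure-Python arithmetic in parallel,
    so this shows the structure of the restructuring, not a speedup.
    """
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares_sequential, chunks))
    return sum(partials)


data = list(range(1000))
print(sum_of_squares_sequential(data), sum_of_squares_chunked(data))
```

The point of the sketch is that the programmer, not the hardware, has to expose the independent pieces of work; ILP, by contrast, is extracted without changing the program.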

5
Types of Computers
• Personal Mobile Device (PMD)
– e.g. smart phones, tablet computers (1.8 billion sold in 2010)
– Emphasis on energy efficiency and real-time
• Desktop Computing
– Emphasis on price-performance (0.35 billion)
• Servers
– Emphasis on availability (very costly downtime!), scalability,
throughput (20 million)
• Clusters / Warehouse Scale Computers
– Used for “Software as a Service (SaaS)”, PaaS, IaaS, etc.
– Emphasis on availability ($6M/hour-downtime at Amazon.com!)
and price-performance (power=80% of TCO!)
– Sub-class: Supercomputers, emphasis: floating-point performance
and fast internal networks, and big data analytics
• Embedded Computers (19 billion in 2010)
– Emphasis: price
6
Parallelism
• Classes of parallelism in applications:
– Data-Level Parallelism (DLP)
– Task-Level Parallelism (TLP)

• Classes of architectural parallelism:


– Instruction-Level Parallelism (ILP)
– Vector architectures/Graphic Processor Units (GPUs)
– Thread-Level Parallelism
– Request-Level Parallelism

7
Points to discuss
• Areas of growth
• Task of a computer designer
• Computer Architecture
• Trends in computer organization and usage
• Costs and trends in cost
• Measuring and reporting performance
• Quantitative Principles
• Summary

8
Advanced Computer Systems Architecture
Topics to be covered
• Techniques for quantitative analysis and evaluation of a high-performance computer system.
• Techniques for improving the performance of various components, including the following:
o Instruction set architecture.
o CPU: Techniques for exploiting instruction-level parallelism (ILP), such as
– Pipelining: simple and advanced concepts
– Dynamic scheduling: scoreboarding and Tomasulo’s approach
– Static and dynamic branch prediction
o Techniques for improving the performance of the Main Memory and Cache Memory systems.
o Taking advantage of ILP: Multiple issue processors, SMT Architectures
o Data Level Parallelism: Vector processor architectures, GPUs, SIMD computing.
o Multiprocessor Architectures
o Interconnection Networks and Synchronization
o Introduction to Cluster and Grid computing
Case studies will be undertaken wherever possible to illustrate the underlying concepts.
Textbooks:
• John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach,
Morgan Kaufmann Publishers.
• Kai Hwang, Advanced Computer Architecture, McGraw Hill Publishers.
9
Teaching and Evaluation
The course will be offered through:
o Lectures
o Tutorials
o Student presentations
Evaluation:
• Mid-semester: 30
• End Semester: 50
• Class tests
• Student projects that require simulating certain scenarios (from a
related paper) using a computer architecture simulator (e.g., SimpleScalar); these
will be evaluated through project reports and presentations.
Class Attendance:
100% attendance is mandatory; in case of lower attendance, proper justification
is to be provided. Else, there is a chance of deregistration.

10
Evaluation (2020)
• Assignments/home work/project/
presentation/term paper: 40 to 50%
• Class tests: Minimum of 3 class tests of 1 hr
duration: 60 to 50%
Each class test to consist of
– MCQ (including small problems)
– SAQ (Questions to be answered in brief)

11
12
13
Scaling transistor performance
• Feature sizes have decreased from 10 µm in the 1970s, through
0.18 µm in 2001, to tens of nanometers in 2010.
• To a first approximation transistor performance improves
linearly with decreasing feature size. This also leads to a
decrease in the required operating voltage.
• The wires also get shorter, but their resistance and
capacitance per unit length get worse, so wire delay scales
poorly compared with transistor performance.
• Dynamic power consumption is dominated by the number of
transistors switching and the frequency at which they switch.
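
The dependence of power on switching can be made concrete with the usual first-order model for CMOS dynamic power, P = 1/2 × C × V² × f. The sketch below assumes that model; the capacitance, voltage, and frequency values are made-up illustrative numbers, not measurements.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """First-order CMOS dynamic power in watts: 1/2 * C * V^2 * f."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency


# Hypothetical baseline: 1 nF switched capacitance, 1.0 V, 2 GHz.
baseline = dynamic_power(1e-9, 1.0, 2e9)

# Same chip with voltage and frequency both reduced by 15%.
scaled = dynamic_power(1e-9, 0.85, 2e9 * 0.85)

ratio = scaled / baseline
print(f"baseline = {baseline:.3f} W, scaled = {scaled:.3f} W, ratio = {ratio:.3f}")
```

Because voltage enters squared and frequency linearly, lowering both by 15% cuts dynamic power to roughly 0.85³ ≈ 0.61 of the baseline, which is why lower operating voltage matters so much as feature sizes shrink.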

14
Cost and Trends in Cost

15
Measuring and Reporting Performance

16
Measuring and Reporting Performance
(Contd.)

17
Amdahl’s Law
The performance improvement that can be
obtained by using some faster mode of execution
is limited by the fraction of time the faster mode
can be used.

Enhancement: Any change/modification in the
system design or realization of a component with
a view to improving the overall performance.
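
Amdahl's Law is usually written as Speedup = 1 / ((1 − F) + F / S), where F is the fraction of the original execution time that can use the enhancement and S is the speedup of that fraction. A minimal sketch follows; the function name and the example numbers are illustrative assumptions.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the original execution
    time is sped up by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical case: the enhancement applies to 40% of the time and is 10x faster.
overall = amdahl_speedup(0.4, 10)

# Upper bound: even an infinitely fast enhancement cannot beat 1 / (1 - F).
limit = 1.0 / (1.0 - 0.4)

print(f"overall speedup = {overall:.4f}, upper bound = {limit:.4f}")
```

In this example the overall speedup is 1.5625 even though the enhanced mode is ten times faster, and no enhancement of that 40% can push it past about 1.67.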
18
Quantitative Principle of Computer Design

19
CPU Performance Equation
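
In its usual textbook form, the CPU performance equation is CPU time = Instruction count (IC) × Cycles per instruction (CPI) × Clock cycle time, or equivalently IC × CPI / Clock rate. The sketch below assumes that form; the instruction mix and clock rate are hypothetical values chosen for easy arithmetic.

```python
def average_cpi(mix):
    """Average CPI for a list of (instruction_count, cycles_per_instruction) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / total_instructions


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds: IC * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical mix: 50M ALU ops (1 cycle), 30M loads/stores (2 cycles),
# 20M branches (4 cycles), on a 1 GHz clock.
mix = [(50_000_000, 1), (30_000_000, 2), (20_000_000, 4)]
ic = sum(count for count, _ in mix)
cpi = average_cpi(mix)
seconds = cpu_time(ic, cpi, 1e9)

print(f"IC = {ic}, CPI = {cpi:.2f}, CPU time = {seconds:.3f} s")
```

The three factors are not independent: a change that lowers IC (a richer instruction set) may raise CPI or lengthen the clock cycle, so all three have to be considered together.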

20
Performance Equation (Contd.)

21
Performance Equation (Contd.)

22
CPU Performance

23
How to evaluate a new architecture
Benchmark Programs:

24
Although the transistor count increases every year, the performance
of uniprocessor (single-threaded) CPUs is actually going down.
Henk Poley
Fallacy: MIPS is an accurate measure for comparing performance
among computers.
$\text{MIPS} = \dfrac{\text{IC}}{\text{Execution time} \times 10^{6}} = \dfrac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$
Although MIPS is simple to understand, it has three problems:
i. MIPS is instruction-set dependent.
ii. MIPS varies between programs on the same computer.
iii. MIPS can vary inversely with performance.

A computer system with optional floating-point (FP) hardware takes less time
to execute floating-point programs, but has a lower MIPS rating, because FP
instructions take more clock cycles per instruction than the simple integer
instructions used by software FP routines.
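
The inverse relationship between MIPS and performance can be checked with a small worked example. All numbers below (clock rate, instruction counts, cycles per FP instruction, and the number of integer instructions needed to emulate one FP operation) are hypothetical, chosen only to make the effect visible.

```python
CLOCK_RATE_HZ = 100e6            # assumed 100 MHz clock
INT_INSTRUCTIONS = 4_000_000     # integer instructions, 1 cycle each
FP_OPERATIONS = 1_000_000        # floating-point operations in the program
FP_HW_CYCLES = 10                # cycles per FP instruction with FP hardware
SW_INSTRUCTIONS_PER_FP_OP = 100  # 1-cycle integer instructions per emulated FP op


def mips(instruction_count, execution_time_s):
    """MIPS = IC / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


# With FP hardware: few instructions, but FP ones are slow (high CPI).
hw_ic = INT_INSTRUCTIONS + FP_OPERATIONS
hw_cycles = INT_INSTRUCTIONS + FP_OPERATIONS * FP_HW_CYCLES
hw_time = hw_cycles / CLOCK_RATE_HZ
hw_mips = mips(hw_ic, hw_time)

# Software FP: many fast integer instructions (CPI of 1).
sw_ic = INT_INSTRUCTIONS + FP_OPERATIONS * SW_INSTRUCTIONS_PER_FP_OP
sw_cycles = sw_ic
sw_time = sw_cycles / CLOCK_RATE_HZ
sw_mips = mips(sw_ic, sw_time)

print(f"FP hardware: {hw_time:.2f} s, {hw_mips:.1f} MIPS")
print(f"Software FP: {sw_time:.2f} s, {sw_mips:.1f} MIPS")
```

Under these assumptions the machine with FP hardware finishes in 0.14 s at about 36 MIPS, while the software version takes 1.04 s at about 100 MIPS: the slower configuration has the higher MIPS rating.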

29
Fallacy: MFLOPS is a consistent and useful measure of
performance.
$\text{MFLOPS} = \dfrac{\text{Number of floating-point operations}}{\text{Execution time} \times 10^{6}}$

i. MFLOPS is dependent on both the machine and the program.
ii. The floating-point operations are not consistent across machines.
iii. The MFLOPS rating depends not only on the mix of integer and
floating-point operations, but also on the mix of fast and slow
floating-point operations, e.g., FADD and FMUL.
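
Point iii can be illustrated with a short sketch: two programs with the same number of floating-point operations, run on the same machine, get different MFLOPS ratings purely because of their FADD/FMUL mix. The latencies and the clock rate are hypothetical, and integer instructions are ignored for simplicity.

```python
CLOCK_RATE_HZ = 1e9   # assumed 1 GHz clock
FADD_CYCLES = 2       # hypothetical latency of the fast FP operation
FMUL_CYCLES = 5       # hypothetical latency of the slow FP operation


def mflops(fadd_count, fmul_count):
    """MFLOPS = FP operations / (execution time * 10^6) for a given FADD/FMUL mix."""
    operations = fadd_count + fmul_count
    cycles = fadd_count * FADD_CYCLES + fmul_count * FMUL_CYCLES
    execution_time_s = cycles / CLOCK_RATE_HZ
    return operations / (execution_time_s * 1e6)


# Both programs perform one million FP operations.
mostly_adds = mflops(fadd_count=900_000, fmul_count=100_000)
mostly_muls = mflops(fadd_count=100_000, fmul_count=900_000)

print(f"90% FADD: {mostly_adds:.1f} MFLOPS")
print(f"90% FMUL: {mostly_muls:.1f} MFLOPS")
```

With these assumed latencies the add-heavy program rates at roughly twice the MFLOPS of the multiply-heavy one, so a single MFLOPS figure says little without knowing the operation mix.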

30
Fallacy: Synthetic benchmarks predict the performance of real programs.

Fallacy: Benchmarks remain valid indefinitely.

Fallacy: Peak performance tracks observed performance.
True only if the benchmark consists of small programs that normally
operate close to peak.

31
Fallacy/Pitfall
• The relative performance of two processors with the same
ISA can be judged by the clock rate or by the performance
of a single benchmark suite.
• Comparing hand-coded assembly language and compiler
generated high level language performance.
• The best design for a computer is the one that optimizes
the primary objective without considering implementation.
• Neglecting the cost of software in either evaluating a
system or examining cost-performance tradeoff.
• Falling prey to Amdahl’s law

32
Questions

33
