
Chapter 1

The document discusses the evolution of computer architecture, highlighting key technological advancements such as Moore's Law and the transition from vacuum tubes to integrated circuits. It categorizes different types of computers, including personal computers, supercomputers, and embedded systems, while also addressing the impact of mobile devices and cloud computing. Additionally, it covers performance metrics, CPU architecture, and the importance of instruction count and cycles per instruction in measuring computer efficiency.

Uploaded by

Manh Nguyen Xuan

Morgan Kaufmann Publishers August 12, 2021

Computer Architecture
Chapter 1: Computer Abstractions and Technology

Adapted from Computer Organization and Design: The Hardware/Software Interface, 5th edition

The Computer Revolution

• The third revolution, along with agriculture and industry
• Progress in computer technology
  – Underpinned by Moore’s Law
• Makes novel applications feasible
  – Computers in automobiles
  – Cell phones
  – Human genome project
  – World Wide Web
  – Search engines
• Computers are pervasive

Computer Engineering – CSE – HCMUT Chapter 1: Computer Technology 2

Moore’s Law

[Photo: Gordon Moore, co-founder of Intel Corp.]

• The number of transistors integrated in a chip has doubled every 18-24 months (1975)

Intel Processors & Chips

• World record, in terms of the number of transistors integrated into a chip:
  – Altera FPGA device: 30+ billion
• Intel processor
  – Core i, 8th generation (Coffee Lake)
  – 14 nm technology
  – >1.4B transistors (6th generation, Skylake)
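As a rough illustration of the doubling rule stated under Moore's Law, the sketch below projects a transistor count forward in time. The function name `projected_transistors` and the starting count are hypothetical, chosen only for this example; real chips do not track the rule exactly.

```python
def projected_transistors(initial_count, years, doubling_months=24):
    """Project a transistor count, assuming one doubling every `doubling_months`."""
    doublings = (years * 12) / doubling_months
    return initial_count * 2 ** doublings


# Hypothetical starting point: a chip with 1 million transistors.
for months in (18, 24):
    count = projected_transistors(1_000_000, years=6, doubling_months=months)
    print(f"doubling every {months} months: {count:,.0f} transistors after 6 years")
```

The gap between the 18-month and 24-month projections widens quickly, which is why the doubling period matters so much when extrapolating.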

Chapter 1 — Computer Abstractions and Technology 1



The First “Computer”

[Figure; source: Internet]

The First “Computer” (cont.)

[Figure: the ENIAC computer; source: US Army photo]

The ENIAC Computer

• 30+ tons
• 1,500+ square feet (140 square meters)
• 18,000+ vacuum tubes
• 140+ kW power
• 5,000+ additions per second

A Brief History of Computers

• The first generation
  – Vacuum tubes
  – 1946 – 1955
• The second generation
  – Transistors
  – 1955 – 1965
• The third generation
  – Integrated circuits
  – 1965 – 1980
• The current generation
  – Personal computers
  – 1980 – …
• What’s next?
  – Quantum computers?
  – Memristors?

Classes of Computers

• Personal computers
  – General purpose, variety of software
  – Subject to cost/performance tradeoff
• Server computers
  – Network based
  – High capacity, performance, reliability
  – Range from small servers to building sized

Classes of Computers (cont.)

• Supercomputers
  – High-end scientific and engineering calculations
  – Highest capability but represent a small fraction of the overall computer market
• Embedded computers
  – Hidden as components of systems
  – Stringent power/performance/cost constraints

The PostPC Era

• Personal Mobile Device (PMD)
  – Battery operated
  – Connects to the Internet
  – Hundreds of dollars
  – Smart phones, tablets, electronic glasses, …
• Cloud computing
  – Warehouse Scale Computers (WSC)
  – Software as a Service (SaaS)
  – A portion of the software runs on a PMD and a portion runs in the Cloud
  – Amazon and Google

The PostPC Era (cont.)

[Figure; source: BusinessInsider]

Understanding Performance

• Algorithm
  – Determines number of operations executed
• Programming language, compiler, architecture
  – Determine number of machine instructions executed per operation
• Processor and memory system
  – Determine how fast instructions are executed
• I/O system (including OS)
  – Determines how fast I/O operations are executed

Below Your Program

• Application software
  – Written in high-level language (HLL)
• System software
  – Compiler: translates HLL code to machine code
  – Operating System: service code
    • Handling input/output
    • Managing memory and storage
    • Scheduling tasks & sharing resources
• Hardware
  – Processor, memory, I/O controllers

Levels of Program Code

• High-level language
  – Level of abstraction closer to problem domain
  – Provides for productivity and portability
• Assembly language
  – Textual representation of instructions
• Hardware representation
  – Binary digits (bits)
  – Encoded instructions and data

Components of a Computer (The BIG Picture)

• Same components for all kinds of computer
  – Desktop, server, embedded
• Input/output includes
  – User-interface devices
    • Display, keyboard, mouse
  – Storage devices
    • Hard disk, CD/DVD, flash
  – Network adapters
    • For communicating with other computers

Touchscreen

• PostPC device
• Supersedes keyboard and mouse
• Resistive and capacitive types
  – Most tablets and smart phones use capacitive
  – Capacitive allows multiple touches simultaneously

Through the Looking Glass

• LCD screen: picture elements (pixels)
  – Mirrors content of frame buffer memory

Opening the Box

[Figure labels: capacitive multitouch LCD screen; 3.8 V, 25 watt-hour battery; computer board]

Inside the Processor (CPU)

• Datapath: performs operations on data
• Control: sequences datapath, memory, ...
• Cache memory
  – Small fast SRAM memory for immediate access to data

Inside the Processor

• Apple A5

Abstractions (The BIG Picture)

• Abstraction helps us deal with complexity
  – Hide lower-level detail
• Instruction set architecture (ISA)
  – The hardware/software interface
• Application binary interface
  – The ISA plus system software interface
• Implementation
  – The details underlying the interface

A Safe Place for Data

• Volatile main memory
  – Loses instructions and data when power off
• Non-volatile secondary memory
  – Magnetic disk
  – Flash memory
  – Optical disk (CDROM, DVD)

Networks

• Communication, resource sharing, nonlocal access
• Local area network (LAN): Ethernet
• Wide area network (WAN): the Internet
• Wireless network: WiFi, Bluetooth

Technology Trends

• Electronics technology continues to evolve
  – Increased capacity and performance
  – Reduced cost

[Figure: DRAM capacity]

Year   Technology                    Relative performance/cost
1951   Vacuum tube                   1
1965   Transistor                    35
1975   Integrated circuit (IC)       900
1995   Very large scale IC (VLSI)    2,400,000
2013   Ultra large scale IC          250,000,000,000

Semiconductor Technology

• Silicon: semiconductor
• Add materials to transform properties:
  – Conductors
  – Insulators
  – Switch

Manufacturing ICs

• Yield: proportion of working dies per wafer

Intel Core i7 Wafer

• 300mm wafer, 280 chips, 32nm technology
• Each chip is 20.7 x 10.5 mm

Integrated Circuit Cost

$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$

$$\text{Yield} = \frac{1}{\left(1 + (\text{Defects per area} \times \text{Die area}/2)\right)^2}$$

• Nonlinear relation to area and defect rate
  – Wafer cost and area are fixed
  – Defect rate determined by manufacturing process
  – Die area determined by architecture and circuit design

Defining Performance

• Which airplane has the best performance?

[Figure: four bar charts comparing the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]
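Returning to the integrated circuit cost formulas, the sketch below shows why cost grows nonlinearly with die area. It assumes the yield model with the division by 2 and the square in the denominator, as given in the textbook these slides are adapted from. The function names and all numeric inputs are hypothetical, not data from the slides.

```python
def dies_per_wafer(wafer_area, die_area):
    """Approximate die count; ignores partial dies lost at the wafer edge."""
    return wafer_area / die_area


def die_yield(defects_per_area, die_area):
    """Yield = 1 / (1 + defects_per_area * die_area / 2)^2 (assumed model)."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2


def cost_per_die(wafer_cost, wafer_area, die_area, defects_per_area):
    """Cost per die = wafer cost / (dies per wafer * yield)."""
    good_dies = dies_per_wafer(wafer_area, die_area) * die_yield(defects_per_area, die_area)
    return wafer_cost / good_dies


# Hypothetical inputs: areas in cm^2, defects per cm^2, cost in dollars.
for area in (1.0, 2.0, 4.0):
    cost = cost_per_die(wafer_cost=5000, wafer_area=700, die_area=area, defects_per_area=0.5)
    print(f"die area {area} cm^2: about ${cost:.2f} per die")
```

Doubling the die area more than doubles the cost per die, because fewer dies fit on the wafer and a smaller fraction of them work.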

Response Time and Throughput

• Response time
  – How long it takes to do a task
• Throughput
  – Total work done per unit time
    • e.g., tasks/transactions/… per hour
• How are response time and throughput affected by
  – Replacing the processor with a faster version?
  – Adding more processors?
• We’ll focus on response time for now…

Relative Performance

• Define Performance = 1/Execution Time
• “X is n times faster than Y”

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

• Example: time taken to run a program
  – 10s on A, 15s on B
  – Execution Time_B / Execution Time_A = 15s / 10s = 1.5
  – So A is 1.5 times faster than B
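The relative performance definition can be checked with a few lines of code. This is a minimal sketch; the function name `speedup` is an assumption made for illustration, and the times are the ones from the A/B example.

```python
def speedup(exec_time_x, exec_time_y):
    """Return n such that X is n times faster than Y (performance = 1 / time)."""
    return exec_time_y / exec_time_x


# Program takes 10 s on A and 15 s on B.
n = speedup(exec_time_x=10.0, exec_time_y=15.0)
print(f"A is {n} times faster than B")
```

Note that the ratio is taken with the slower machine's time in the numerator, since performance is the reciprocal of execution time.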

Measuring Execution Time

• Elapsed time
  – Total response time, including all aspects
    • Processing, I/O, OS overhead, idle time
  – Determines system performance
• CPU time
  – Time spent processing a given job
    • Discounts I/O time, other jobs’ shares
  – Comprises user CPU time and system CPU time
  – Different programs are affected differently by CPU and system performance

CPU Clocking

• Operation of digital hardware governed by a constant-rate clock

[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]

• Clock period: duration of a clock cycle
  – e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
  – e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz
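Clock period and clock frequency are reciprocals, so the two examples on the CPU Clocking slide describe the same clock. A small sketch, with function names assumed for illustration:

```python
def clock_period_seconds(frequency_hz):
    """Clock period is the reciprocal of clock frequency."""
    return 1.0 / frequency_hz


def clock_frequency_hz(period_seconds):
    """Clock frequency is the reciprocal of clock period."""
    return 1.0 / period_seconds


period = clock_period_seconds(4.0e9)        # 4.0 GHz clock
print(f"period of a 4.0 GHz clock: {period * 1e12:.0f} ps")

frequency = clock_frequency_hz(250e-12)     # 250 ps period
print(f"frequency for a 250 ps period: {frequency / 1e9:.1f} GHz")
```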

CPU Time

$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

• Performance improved by
  – Reducing number of clock cycles
  – Increasing clock rate
  – Hardware designer must often trade off clock rate against cycle count

CPU Time Example

• Computer A: 2GHz clock, 10s CPU time
• Designing Computer B
  – Aim for 6s CPU time
  – Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?

$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\text{s}}$$

$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\text{s} \times 2\text{GHz} = 20 \times 10^9$$

$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^9}{6\text{s}} = \frac{24 \times 10^9}{6\text{s}} = 4\text{GHz}$$
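The CPU Time Example can be reproduced numerically. The sketch below uses the figures from the slide; the helper names are assumptions made for this illustration.

```python
def clock_cycles(cpu_time_s, clock_rate_hz):
    """Clock cycles = CPU time x clock rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles, target_cpu_time_s):
    """Clock rate needed to finish `cycles` within the target CPU time."""
    return cycles / target_cpu_time_s


cycles_a = clock_cycles(cpu_time_s=10.0, clock_rate_hz=2e9)  # Computer A
cycles_b = 1.2 * cycles_a                                    # B needs 1.2x the cycles
rate_b = required_clock_rate(cycles_b, target_cpu_time_s=6.0)
print(f"Computer B needs a clock of about {rate_b / 1e9:.1f} GHz")
```

Cutting CPU time from 10s to 6s takes a doubling of the clock rate here, because the faster clock also costs 20% more cycles.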

Instruction Count and CPI

$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$

$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

• Instruction Count for a program
  – Determined by program, ISA and compiler
• Average cycles per instruction
  – Determined by CPU hardware
  – If different instructions have different CPI
    • Average CPI affected by instruction mix

CPI Example

• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA, compiler
• Which is faster, and by how much?

$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\text{ps} = I \times 500\text{ps}$$

$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\text{ps} = I \times 600\text{ps}$$

$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\text{ps}}{I \times 500\text{ps}} = 1.2$$

• A is faster, by a factor of 1.2
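The comparison of Computers A and B can be checked with a short script. Because both machines run the same instruction count, any value of `I` gives the same ratio; the value used below is an arbitrary placeholder, and the function name is an assumption.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time = instruction count x CPI x clock cycle time (in picoseconds)."""
    return instruction_count * cpi * cycle_time_ps


I = 1_000_000  # arbitrary instruction count; it cancels out in the ratio

time_a = cpu_time_ps(I, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(I, cpi=1.2, cycle_time_ps=500)
ratio = time_b / time_a
print(f"A is faster by a factor of {ratio:.1f}")
```

B has the lower CPI, yet A still wins because its cycle time is half as long.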

CPI in More Detail

• If different instruction classes take different numbers of cycles

$$\text{Clock Cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)$$

• Weighted average CPI

$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$

  – The ratio Instruction Count_i / Instruction Count is the relative frequency of class i

CPI Example

• Alternative compiled code sequences using instructions in classes A, B, C

Class              A   B   C
CPI for class      1   2   3
IC in sequence 1   2   1   2
IC in sequence 2   4   1   1

• Sequence 1: IC = 5
  – Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  – Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
  – Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  – Avg. CPI = 9/6 = 1.5
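The two code sequences can be evaluated with the clock-cycle sum and the weighted average CPI. This is a minimal sketch; the dictionary layout and function names are choices made for this example.

```python
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(instruction_counts):
    """Clock cycles = sum over classes of (CPI_i x instruction count_i)."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in instruction_counts.items())


def average_cpi(instruction_counts):
    """Weighted average CPI = clock cycles / total instruction count."""
    return total_cycles(instruction_counts) / sum(instruction_counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(f"{name}: IC = {sum(seq.values())}, "
          f"cycles = {total_cycles(seq)}, avg CPI = {average_cpi(seq)}")
```

Sequence 2 executes more instructions but finishes in fewer cycles, which is why instruction count alone is a poor predictor of execution time.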

Exercise

• A program consists of 1000 instructions in which:
  – 30% are load/store instructions, CPI = 2.5
  – 10% are jump instructions, CPI = 1
  – 20% are branch instructions, CPI = 1.5
  – The rest are arithmetic instructions, CPI = 2.0
• The program is executed on a 2 GHz CPU
  a) What is the execution time (CPU time) of the program?
  b) What is the weighted average CPI of the program?
  c) If load/store instructions are improved so that their execution time is reduced by a factor of 2, what is the speed-up of the system?

Performance Summary (The BIG Picture)

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

• Performance depends on
  – Algorithm: affects IC, possibly CPI
  – Programming language: affects IC, CPI
  – Compiler: affects IC, CPI
  – Instruction set architecture: affects IC, CPI, Tc
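One way to check an answer to the exercise is to script the CPU time equation. The sketch below makes two assumptions that the exercise leaves implicit: "the rest" means the remaining 40% of instructions, and halving the load/store execution time is modeled as halving the load/store CPI. The names are hypothetical.

```python
CLOCK_RATE_HZ = 2e9
INSTRUCTION_COUNT = 1000

# class name -> (fraction of instructions, CPI)
MIX = {
    "load/store": (0.30, 2.5),
    "jump": (0.10, 1.0),
    "branch": (0.20, 1.5),
    "arithmetic": (0.40, 2.0),  # assumed: "the rest" = 40%
}


def weighted_cpi(mix):
    """Weighted average CPI = sum of (relative frequency x class CPI)."""
    return sum(fraction * cpi for fraction, cpi in mix.values())


def cpu_time_s(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


cpi_before = weighted_cpi(MIX)
time_before = cpu_time_s(INSTRUCTION_COUNT, cpi_before, CLOCK_RATE_HZ)

improved = dict(MIX)
improved["load/store"] = (0.30, 2.5 / 2)  # assumed model of the improvement
cpi_after = weighted_cpi(improved)
time_after = cpu_time_s(INSTRUCTION_COUNT, cpi_after, CLOCK_RATE_HZ)

print(f"a) CPU time     = {time_before * 1e9:.1f} ns")
print(f"b) weighted CPI = {cpi_before:.3f}")
print(f"c) speed-up     = {time_before / time_after:.3f}")
```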

Power Trends

[Figure: clock frequency (MHz) and power (W) of Intel processors, from the 80286 (1982) to the Core i5 Skylake (2015)]

• In CMOS IC technology

$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$

  – Over the period shown: power ×30, voltage 5V → 1V, frequency ×1000

Reducing Power

• Suppose a new CPU has
  – 85% of capacitive load of old CPU
  – 15% voltage and 15% frequency reduction

$$\frac{P_{\text{new}}}{P_{\text{old}}} = \frac{C_{\text{old}} \times 0.85 \times (V_{\text{old}} \times 0.85)^2 \times F_{\text{old}} \times 0.85}{C_{\text{old}} \times V_{\text{old}}^2 \times F_{\text{old}}} = 0.85^4 = 0.52$$

• The power wall
  – We can’t reduce voltage further
  – We can’t remove more heat
• How else can we improve performance?
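The power reduction example follows directly from the CMOS power equation, since the old CPU's values cancel in the ratio. A minimal sketch, with function names assumed for illustration:

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Power = capacitive load x voltage^2 x frequency."""
    return capacitive_load * voltage ** 2 * frequency


def power_ratio(load_scale, voltage_scale, frequency_scale):
    """P_new / P_old when each quantity is scaled by the given factor."""
    return load_scale * voltage_scale ** 2 * frequency_scale


# New CPU: 85% of the capacitive load, 15% lower voltage, 15% lower frequency.
ratio = power_ratio(load_scale=0.85, voltage_scale=0.85, frequency_scale=0.85)
print(f"new CPU uses about {ratio:.2f} of the old CPU's power")
```

Voltage counts twice because it is squared, which is why lowering the supply voltage was the most effective lever until the power wall was reached.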

Uniprocessor Performance

[Figure: growth in uniprocessor performance over time]

• Constrained by power, instruction-level parallelism, memory latency

Multiprocessors

• Multicore microprocessors
  – More than one processor per chip
• Requires explicitly parallel programming
  – Compare with instruction level parallelism
    • Hardware executes multiple instructions at once
    • Hidden from the programmer
  – Hard to do
    • Programming for performance
    • Load balancing
    • Optimizing communication and synchronization

SPEC CPU Benchmark

• Programs used to measure performance
  – Supposedly typical of actual workload
• Standard Performance Evaluation Corp (SPEC)
  – Develops benchmarks for CPU, I/O, Web, …
• SPEC CPU2006
  – Elapsed time to execute a selection of programs
    • Negligible I/O, so focuses on CPU performance
  – Normalize relative to reference machine
  – Summarize as geometric mean of performance ratios
    • CINT2006 (integer) and CFP2006 (floating-point)

$$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$

CINT2006 for Intel Core i7 920

[Table: CINT2006 results for the Intel Core i7 920]
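The geometric-mean summary used by SPEC CPU can be sketched as follows. The ratios in the example are made-up values, not CINT2006 results, and the function names are assumptions. Each ratio is taken here as reference time divided by measured time, so larger is better.

```python
import math


def spec_ratio(reference_time_s, measured_time_s):
    """Execution time ratio relative to the reference machine."""
    return reference_time_s / measured_time_s


def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical (reference time, measured time) pairs in seconds.
timings = [(1000.0, 500.0), (800.0, 100.0)]
ratios = [spec_ratio(ref, measured) for ref, measured in timings]
print(f"ratios = {ratios}, geometric mean = {geometric_mean(ratios):.2f}")
```

Summing logarithms avoids overflow when many ratios are multiplied, and gives the same result as taking the n-th root of the product.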

SPEC Power Benchmark

• Power consumption of server at different workload levels
  – Performance: ssj_ops/sec
  – Power: Watts (Joules/sec)

$$\text{Overall ssj\_ops per Watt} = \left( \sum_{i=0}^{10} \text{ssj\_ops}_i \right) \Big/ \left( \sum_{i=0}^{10} \text{power}_i \right)$$

SPECpower_ssj2008 for Xeon X5650

[Table: SPECpower_ssj2008 results for the Xeon X5650]
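The overall ssj_ops per Watt metric is a ratio of sums across the workload levels, not an average of per-level ratios. The sketch below uses invented numbers purely for illustration; they are not the Xeon X5650 measurements, and the function name is an assumption.

```python
def overall_ssj_ops_per_watt(ssj_ops, power_watts):
    """Sum of ssj_ops over all load levels divided by the sum of power."""
    if len(ssj_ops) != len(power_watts):
        raise ValueError("need one power reading per load level")
    return sum(ssj_ops) / sum(power_watts)


# Hypothetical readings at 11 load levels, from 100% down to 0% (idle).
ssj_ops = [1000.0 * level for level in range(10, -1, -1)]
power_watts = [100.0 + 15.0 * level for level in range(10, -1, -1)]

print(f"overall ssj_ops per Watt = {overall_ssj_ops_per_watt(ssj_ops, power_watts):.2f}")
```

Because idle power still counts in the denominator while contributing no operations, a server that draws a lot of power at low load scores poorly on this metric.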

Pitfall: Amdahl’s Law

• Improving an aspect of a computer and expecting a proportional improvement in overall performance

$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$

• Example: multiply accounts for 80s/100s
  – How much improvement in multiply performance to get 5× overall?

$$20 = \frac{80}{n} + 20$$

  – Can’t be done!
• Corollary: make the common case fast

Fallacy: Low Power at Idle

• Look back at i7 power benchmark
  – At 100% load: 258W
  – At 50% load: 170W (66%)
  – At 10% load: 121W (47%)
• Google data center
  – Mostly operates at 10% – 50% load
  – At 100% load less than 1% of the time
• Consider designing processors to make power proportional to load
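Returning to the Amdahl's Law pitfall, the sketch below evaluates the improved-time formula for the multiply example and shows the overall speed-up approaching, but never reaching, 5×. The function names are assumptions made for this illustration.

```python
def improved_time(t_affected, t_unaffected, improvement_factor):
    """T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected / improvement_factor + t_unaffected


def overall_speedup(t_affected, t_unaffected, improvement_factor):
    """Original total time divided by the improved total time."""
    original = t_affected + t_unaffected
    return original / improved_time(t_affected, t_unaffected, improvement_factor)


# Multiply accounts for 80 s of a 100 s run.
for factor in (2, 4, 10, 100, 1_000_000):
    s = overall_speedup(t_affected=80.0, t_unaffected=20.0, improvement_factor=factor)
    print(f"multiply {factor}x faster -> overall speed-up {s:.3f}")
```

The 20s that multiply does not touch caps the speed-up at 100/20 = 5, so a 5× overall improvement would require the multiply time to be exactly zero.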

Pitfall: MIPS as a Performance Metric

• MIPS: Millions of Instructions Per Second
  – Doesn’t account for
    • Differences in ISAs between computers
    • Differences in complexity between instructions

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$

• CPI varies between programs on a given CPU

Concluding Remarks

• Cost/performance is improving
  – Due to underlying technology development
• Hierarchical layers of abstraction
  – In both hardware and software
• Instruction set architecture
  – The hardware/software interface
• Execution time: the best performance measure
• Power is a limiting factor
  – Use parallelism to improve performance
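As a footnote to the MIPS pitfall, the sketch below checks that the two forms of the MIPS formula agree and shows the rating changing with CPI on the same CPU. All numbers are hypothetical, and the function names are assumptions.

```python
def mips_from_time(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_clock(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


CLOCK_RATE_HZ = 4e9            # hypothetical 4 GHz CPU
INSTRUCTION_COUNT = 1e9        # hypothetical program size

for cpi in (1.0, 2.0):
    execution_time = INSTRUCTION_COUNT * cpi / CLOCK_RATE_HZ
    print(f"CPI {cpi}: {mips_from_time(INSTRUCTION_COUNT, execution_time):.0f} MIPS "
          f"(from clock rate: {mips_from_clock(CLOCK_RATE_HZ, cpi):.0f})")
```

The same CPU earns a different MIPS rating for each program's CPI, which is one reason execution time is the more reliable measure.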
