Advanced Processor Techniques

This document discusses instruction-level parallelism (ILP) and how it can be achieved through deeper pipelines, multiple instruction issue, and static or dynamic multiple issue processor designs. It provides examples of MIPS with static dual issue where instructions are issued in two-instruction packets. The compiler must schedule instructions to avoid hazards either through reordering or inserting NOP instructions. Both static and dynamic multiple issue designs aim to increase parallelism and the instruction execution rate.

Uploaded by

許藝蓁

Morgan Kaufmann Publishers, 14 December 2016 (ROC year 105)

Chapter 4 (Part IV)

The Processor: Datapath and Control
(Parallelism and ILP)
陳瑞奇 (J.C. Chen)
Department of Computer Science and Information Engineering, Asia University
Adapted from class notes by Prof. M.J. Irwin, PSU and Prof. D. Patterson, UCB

§4.10 Parallelism and Advanced Instruction Level Parallelism

4.10 Instruction-Level Parallelism (ILP)

• ILP: executing multiple instructions in parallel
• To increase ILP
  • Deeper pipeline (superpipelining)
    • Less work per stage → shorter clock cycle
    • Increase the depth of the pipeline to increase the clock rate
  • Multiple issue
    • Fetch more than one instruction at a time
    • Replicate pipeline stages → multiple pipelines
    • Start multiple instructions per clock cycle
    • But dependencies reduce this in practice
Chapter 4 — The Processor — 2


There are 10 birds in a tree. If you shoot one, how many are left?

How can efficiency be improved?
Kill two birds with one stone?

http://imgs.ntdtv.com/pic/2015/6-24/p6481351a175614304.jpg

Multiple issue

Fetch two or more instructions at the same time

Like loading several bullets at once

http://p3.pstatp.com/large/3792/19517668

MIPS with Static Dual Issue

• Two-issue packets
  • One ALU/branch instruction
  • One load/store instruction
  • 64-bit aligned
    • ALU/branch, then load/store
    • Pad an unused instruction with nop

Address   Instruction type   Pipeline stages
n         ALU/branch         IF  ID  EX  MEM WB
n + 4     Load/store         IF  ID  EX  MEM WB
n + 8     ALU/branch             IF  ID  EX  MEM WB
n + 12    Load/store             IF  ID  EX  MEM WB
n + 16    ALU/branch                 IF  ID  EX  MEM WB
n + 20    Load/store                 IF  ID  EX  MEM WB

p. 323 (page 335), Fig. 4.68

Instruction-Level Parallelism (cont.)

• Launching multiple instructions per stage allows the instruction execution rate to exceed the clock rate, i.e., the CPI (cycles per instruction) to be less than 1
  • So instead we use IPC: instructions per clock cycle
  • E.g., a 6 GHz, four-way multiple-issue processor can execute at a peak rate of 24 billion instructions per second, with a best-case CPI of 0.25 or a best-case IPC of 4
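The peak-rate arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `peak_rate` is made up for illustration, and it assumes every issue slot is filled on every cycle, which real code rarely achieves.

```python
def peak_rate(clock_hz, issue_width):
    """Return (peak instructions per second, best-case CPI, best-case IPC)."""
    ipc = float(issue_width)      # at most one instruction per issue slot per cycle
    cpi = 1.0 / ipc               # CPI is the reciprocal of IPC
    return clock_hz * ipc, cpi, ipc


# The slide's example: 6 GHz clock, four-way multiple issue.
ips, cpi, ipc = peak_rate(6e9, 4)
print(f"{ips / 1e9:.0f} billion instructions/s, CPI {cpi}, IPC {ipc}")
```

Dependencies and hazards push the achieved IPC below this bound, which is what the scheduling slides that follow are about.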


Multiple Issue Processor Styles

• Static multiple issue (aka VLIW)
  • Compiler groups instructions to be issued together
  • Packages them into “issue slots”
  • Compiler detects and avoids hazards at compile time
  • E.g., Intel Itanium and Itanium 2 for the IA-64 ISA: EPIC (Explicitly Parallel Instruction Computing)

http://i1111.photobucket.com/albums/h466/kazorptb/Informatica/Intel-itanium-2-microprocessor-chipsss.png

Multiple Issue Processor Styles

• Dynamic multiple issue (aka superscalar)
  • CPU examines the instruction stream and chooses instructions to issue each cycle
  • Compiler can help by reordering instructions
  • CPU resolves hazards at run time, in hardware, using advanced techniques
  • E.g., IBM Power 2, Pentium 4, MIPS R10K

http://cdn.shopclues.net/images/detailed/316/northwoodp413micron_1361196559.jpg


Static Multiple Issue

• Compiler groups instructions into “issue packets”
  • Group of instructions that can be issued on a single cycle
  • Determined by pipeline resources required
• Think of an issue packet as a very long instruction
  • Specifies multiple concurrent operations
  • → Very Long Instruction Word (VLIW)

Scheduling Static Multiple Issue

• Compiler must remove some/all hazards
  • Reorder instructions into issue packets
  • No dependencies within a packet
  • Possibly some dependencies between packets
    • Varies between ISAs; compiler must know!
  • Pad with nop if necessary
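As a rough illustration of packet formation, the sketch below packs an instruction list into two-slot packets (ALU/branch first, load/store second) and pads with nop. Everything here is hypothetical: the `Instr` record and `pack_in_order` are invented for this example, the packer never reorders instructions, it checks only read-after-write dependences inside a packet, and it ignores latencies between packets. A real compiler scheduler does considerably more.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    kind: str                 # "alu" (ALU/branch) or "mem" (load/store)
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read


NOP = "nop"


def pack_in_order(instrs):
    """Pair an ALU/branch with the load/store right after it when that is legal."""
    packets = []
    i = 0
    while i < len(instrs):
        first = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        can_pair = (
            first.kind == "alu"
            and nxt is not None
            and nxt.kind == "mem"
            and first.dest not in nxt.srcs   # no dependence within a packet
        )
        if can_pair:
            packets.append((first.text, nxt.text))
            i += 2
        elif first.kind == "alu":
            packets.append((first.text, NOP))   # pad the load/store slot
            i += 1
        else:
            packets.append((NOP, first.text))   # pad the ALU/branch slot
            i += 1
    return packets


code = [
    Instr("add $t0, $s0, $s1", "alu", "$t0", ("$s0", "$s1")),
    Instr("lw $s2, 0($t0)", "mem", "$s2", ("$t0",)),
    Instr("addi $s3, $s3, 4", "alu", "$s3", ("$s3",)),
    Instr("sw $s4, 0($s5)", "mem", None, ("$s4", "$s5")),
]
packets = pack_in_order(code)
for alu_slot, mem_slot in packets:
    print(f"{alu_slot:<22}{mem_slot}")
```

The first two instructions end up in separate packets because the lw reads the register the add writes; the last two are independent and share a packet.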



MIPS with Static Dual Issue

[Fig. 4.69, p. 324 (page 336): dual-issue datapath figure. Surviving labels: Store, Load, effective addr.]


Hazards in the Dual-Issue MIPS

• More instructions executing in parallel
• EX data hazard
  • Forwarding avoided stalls with single-issue
  • Now can’t use ALU result in load/store in same packet
      add $t0, $s0, $s1
      lw  $s2, 0($t0)
  • Split into two packets, effectively a stall
• Load-use hazard
  • Still one cycle use latency, but now two instructions
• More aggressive scheduling required
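One way to read the two hazards above is as a minimum distance, in issue cycles, between a producer and its consumer. The sketch below encodes that reading; `earliest_use_cycle` is a hypothetical helper, and it assumes the same forwarding paths as the single-issue pipeline.

```python
def earliest_use_cycle(producer_kind, producer_cycle):
    """Earliest issue cycle for an instruction that reads the producer's result."""
    # ALU result: forwarded from EX, so a consumer may issue in the next packet.
    # Load result: not ready until after MEM, so the next packet is too early too.
    gap = {"alu": 1, "load": 2}[producer_kind]
    return producer_cycle + gap


for kind in ("alu", "load"):
    print(kind, "issued in cycle 1 -> consumer in cycle", earliest_use_cycle(kind, 1))
```

Because each packet holds two instructions, the extra cycle after a load costs two issue slots rather than one, which is why the slide calls for more aggressive scheduling.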

Scheduling Example
p. 325 (page 338), Fig. 4.70

• Schedule this for dual-issue MIPS

Loop: lw   $t0, 0($s1)        # $t0 = array element
      addu $t0, $t0, $s2      # add scalar in $s2
      sw   $t0, 0($s1)        # store result
      addi $s1, $s1, -4       # decrement pointer
      bne  $s1, $zero, Loop   # branch if $s1 != 0

      ALU/branch             Load/store         cycle
Loop: nop                    lw $t0, 0($s1)     1
      addi $s1, $s1, -4      nop                2
      addu $t0, $t0, $s2     nop                3
      bne $s1, $zero, Loop   sw $t0, 4($s1)     4

• IPC = 5/4 = 1.25 (cf. peak IPC = 2)
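The IPC figure is simply the number of non-nop slots divided by the number of cycles. A small sketch that recomputes it from the schedule above (the list layout and the function name `ipc` are choices made for this example, not part of the slides):

```python
# One tuple per cycle: (ALU/branch slot, load/store slot).
schedule = [
    ("nop", "lw $t0, 0($s1)"),
    ("addi $s1, $s1, -4", "nop"),
    ("addu $t0, $t0, $s2", "nop"),
    ("bne $s1, $zero, Loop", "sw $t0, 4($s1)"),
]


def ipc(packets):
    """Useful (non-nop) instructions per cycle, with one packet issued per cycle."""
    useful = sum(slot != "nop" for packet in packets for slot in packet)
    return useful / len(packets)


print(ipc(schedule))   # 5 useful instructions over 4 cycles
```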


Loop Unrolling

• Replicate loop body to expose more parallelism
  • Reduces loop-control overhead
• Use different registers per replication
  • Called “register renaming”
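The transformation itself can be shown at source level. The sketch below is a Python analogue of the add-a-scalar loop, written once rolled and once unrolled four times; the function names are invented, the temporaries t0..t3 stand in for the separate registers per replication, and the unrolled version assumes the array length is a multiple of 4. Python gains no speed from this; the point is only the shape of the code.

```python
def add_scalar_rolled(a, s):
    """One element per iteration: one index update and one test per element."""
    out = list(a)
    i = len(out)
    while i != 0:
        i -= 1
        out[i] = out[i] + s
    return out


def add_scalar_unrolled4(a, s):
    """Four elements per iteration: one index update and one test per four elements."""
    if len(a) % 4 != 0:
        raise ValueError("this sketch assumes a length that is a multiple of 4")
    out = list(a)
    i = len(out)
    while i != 0:
        i -= 4
        # A different temporary per copy of the body, so the four copies
        # do not depend on one another.
        t0 = out[i + 3] + s
        t1 = out[i + 2] + s
        t2 = out[i + 1] + s
        t3 = out[i] + s
        out[i + 3], out[i + 2], out[i + 1], out[i] = t0, t1, t2, t3
    return out


data = [1, 2, 3, 4, 5, 6, 7, 8]
print(add_scalar_unrolled4(data, 10) == add_scalar_rolled(data, 10))
```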

Loop Unrolling Example
p. 326 (page 338), Fig. 4.71

      ALU/branch             Load/store         cycle
Loop: addi $s1, $s1, -16     lw $t0, 0($s1)     1
      nop                    lw $t1, 12($s1)    2
      addu $t0, $t0, $s2     lw $t2, 8($s1)     3
      addu $t1, $t1, $s2     lw $t3, 4($s1)     4
      addu $t2, $t2, $s2     sw $t0, 16($s1)    5
      addu $t3, $t3, $s2     sw $t1, 12($s1)    6
      nop                    sw $t2, 8($s1)     7
      bne $s1, $zero, Loop   sw $t3, 4($s1)     8

• IPC = 14/8 = 1.75
  • Closer to 2, but at cost of registers and code size


Dynamic Multiple Issue

• “Superscalar” processors
• CPU (hardware) decides whether to issue 0, 1, 2, … instructions each cycle
  • Avoiding structural and data hazards
• Avoids the need for compiler scheduling
  • Though it may still help
  • Code semantics ensured by the CPU

Dynamic Pipeline Scheduling

• Allow the CPU to execute instructions out of order to avoid stalls
  • But commit result to registers in order
• Example
      lw   $t0, 20($s2)
      addu $t1, $t0, $t2
      sub  $s4, $s4, $t3
      slti $t5, $s4, 20
  • Can start sub while addu is waiting for lw
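The claim about sub can be checked mechanically by asking, for each instruction, whether the most recent earlier writer of each source register has completed. The sketch below does only that. The tuple layout and `can_start` are invented for illustration, and it models read-after-write dependences only; real hardware must also deal with write-after-read and write-after-write hazards, typically through register renaming.

```python
# (text, destination register, source registers)
program = [
    ("lw $t0, 20($s2)", "$t0", ("$s2",)),
    ("addu $t1, $t0, $t2", "$t1", ("$t0", "$t2")),
    ("sub $s4, $s4, $t3", "$s4", ("$s4", "$t3")),
    ("slti $t5, $s4, 20", "$t5", ("$s4",)),
]


def can_start(idx, program, completed):
    """True if every source of program[idx] is ready.

    completed is the set of indices of instructions whose results are available.
    """
    _, _, srcs = program[idx]
    for src in srcs:
        # Most recent earlier instruction that writes this register, if any.
        writers = [j for j in range(idx) if program[j][1] == src]
        if writers and writers[-1] not in completed:
            return False
    return True


# Suppose lw has issued but its data has not arrived yet (e.g., a cache miss).
completed = set()
for idx in range(1, len(program)):
    print(program[idx][0], "->", can_start(idx, program, completed))
```

With nothing completed, addu is blocked on lw and slti is blocked on sub, while sub itself is free to start, matching the slide.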


Speculation

• Predict branch and continue issuing
  • Don’t commit until branch outcome determined
• Load speculation
  • Avoid load and cache miss delay
    • Predict the effective address
    • Predict loaded value
    • Load before completing outstanding stores
    • Bypass stored values to load unit
  • Don’t commit load until speculation cleared

Why Do Dynamic Scheduling?

• Why not just let the compiler schedule code?
• Not all stalls are predictable
  • e.g., cache misses
• Can’t always schedule around branches
  • Branch outcome is dynamically determined
• Different implementations (hardware) of an ISA have different latencies and hazards


§4.14 Concluding Remarks

Concluding Remarks

• ISA influences design of datapath and control
• Datapath and control influence design of ISA
• Pipelining improves instruction throughput using parallelism
  • More instructions completed per second
  • Latency for each instruction not reduced
• Hazards: structural, data, control
• Multiple issue and dynamic scheduling (ILP)
  • Dependencies limit achievable parallelism
  • Complexity leads to the power wall

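Homework 4: Exercises from the second half of Chapter 4 (Due in 2 weeks)

4.9 In this exercise, we examine how data dependences affect execution in the basic 5-stage pipeline described in Section 4.5. Problems in this exercise refer to the following sequence of instructions:
or $s1, $s2, $s3
or $s2, $s1, $s4
or $s1, $s1, $s2
Also, assume the following cycle times for each of the forwarding options:

Without forwarding   With full forwarding   With ALU-ALU forwarding only
250ps                300ps                  290ps

4.9.1 (10%) Indicate all dependences and their type.

4.9.2 (10%) Assume there is no forwarding in this pipelined processor. Indicate all hazards and add nop instructions to eliminate them.

4.9.3 (10%) Assume there is full forwarding. Indicate all hazards and add nop instructions to eliminate them.

4.9.5 (10%) Assume there is only ALU-ALU forwarding (but no forwarding from the MEM to the EX stage). Add nop instructions to this code to eliminate hazards.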


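4.10 In this exercise, we examine how data hazards, control hazards, and instruction set architecture (ISA) design can affect pipelined execution. Problems in this exercise refer to the following fragment of MIPS code:
sw  $s2, 12($s6)
lw  $s2, 8($s6)
beq $s5, $s4, Label   # Assume $s5 != $s4
add $s5, $s1, $s4
slt $s5, $s3, $s4
Assume that individual pipeline stages have the following latencies:

IF      ID      EX      MEM     WB
200ps   120ps   150ps   190ps   100ps

4.10.1 (10%) For this problem, assume that all branches are perfectly predicted (this eliminates all control hazards) and that no delay slots are used. If we only have one memory (for both instructions and data), there is a structural hazard every time we need to fetch an instruction in the same cycle in which another instruction accesses data. To guarantee forward progress, this hazard must always be resolved in favor of the instruction that accesses data. What is the total execution time of this instruction sequence in the 5-stage pipeline that only has one memory? We have seen that data hazards can be eliminated by adding nops to the code. Can you do the same with this structural hazard? Why?

4.13 This exercise is intended to help you understand the relationship between forwarding, hazard detection, and ISA design. Problems in this exercise refer to the following sequence of instructions, and assume that it is executed on a 5-stage pipelined datapath:
add $s5, $s2, $s1
lw  $s3, 4($s5)
lw  $s2, 0($s2)
or  $s3, $s5, $s3
sw  $s3, 0($s5)

4.13.1 (10%) If there is no forwarding or hazard detection, insert nops to ensure correct execution.

4.13.2 (10%) Repeat 4.13.1, but now use nops only when a hazard cannot be avoided by changing or rearranging these instructions. You can assume register $s7 can be used to hold temporary values in your code.

4.13.3 (10%) If the processor has forwarding, but we forgot to implement the hazard detection unit, what happens when this code executes?

4.13.4 (10%) If there is forwarding, for the first five cycles during the execution of this code, specify which signals are asserted in each cycle by the hazard detection and forwarding units in Figure 4.60 (see next page).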


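Figure 4.60

4.16 This exercise examines the accuracy of various branch predictors for the following repeating pattern (e.g., in a loop) of branch outcomes: T, NT, T, T, NT.

4.16.1 (5%) What is the accuracy of always-taken and always-not-taken predictors for this sequence of branch outcomes?

4.16.2 (5%) What is the accuracy of the 2-bit predictor for the first four branches in this pattern, assuming that the predictor starts off in the bottom-left state of Figure 4.63 (predict not taken)?

4.16.3 (10%) What is the accuracy of the 2-bit predictor if this pattern is repeated forever?

Figure 4.63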
