CPU Logic Design
Li Yamin
Department of Computer Science
Faculty of Computer and Information Sciences
Hosei University, Tokyo 184-8584 Japan
http://cis.k.hosei.ac.jp/~yamin/
LYM, Dept. of CS, Hosei University Cache – p.1/27
Performance
[Figure: the processor-memory performance gap, 1980-2002. On a logarithmic performance scale from 1 to 100,000, CPU performance grows far faster than memory performance, so the gap between the two widens every year.]
Memory hierarchy
Fastest and smallest at the top, slowest and biggest at the bottom:
  Registers
  On-chip cache
  Off-chip cache
  Main memory
  Disk
  Tape
Level            Capacity        Access time
Registers        ~ 1KB           0.5 ~ 1ns
On-chip cache    16 ~ 128KB      1 ~ 10ns
Off-chip cache   128KB ~ 4MB     10 ~ 20ns
Main memory      256MB ~ 4GB     40 ~ 80ns
Hard disk        20 ~ 500GB      1 ~ 10ms
Tape             ~ 10TB          10ms ~ 10s
[Figure: a processor with split caches. The PC addresses the instruction cache, whose data output (DO) feeds the instruction register (IR); the data address (DA) addresses the data cache, which exchanges data with the processor through its data-in (DI) and data-out (DO) ports.]
Cache design parameters:
  Size: cache size, block size
  Cache placement: physical address cache, virtual address cache
  Mapping algorithms: direct mapping, set-associative mapping, fully associative mapping
  Memory update mechanism: write back, write through
  Cache write miss: write allocate, no-write allocate
  Replacement algorithms: random replacement, LRU replacement
Q1: Block placement – Where can a block be placed
in the upper level?
Q2: Block identification – How is a block found if it is in
the upper level?
Q3: Block replacement – Which block should be
replaced on a miss?
Q4: Write strategy – What happens on a write?
Q1: Where can a block be placed in the upper level?
If each block has only one place it can appear in the
cache, the cache is said to be direct mapped.
If a block can be placed anywhere in the cache, the
cache is said to be fully associative.
If a block can be placed in a restricted set of places in
the cache, the cache is said to be set associative. A
set is a group of blocks in the cache. A block is first
mapped onto a set, and then the block can be placed
anywhere within that set. If there are n blocks in a set,
the cache placement is called n-way set associative.
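The three policies can be sketched as a toy model (hypothetical helper name; 8-block cache with 4 sets of 2 blocks, matching the example on the next slide):

```python
def candidate_blocks(block, cache_blocks=8, sets=4):
    """Cache blocks where a memory block may be placed, per policy.
    Toy model: 8-block cache, 4 sets of 2 blocks for set-associative."""
    ways = cache_blocks // sets
    s = block % sets                     # set chosen by the block number
    return {
        "fully associative": list(range(cache_blocks)),  # anywhere
        "direct mapped": [block % cache_blocks],         # exactly one place
        "2-way set associative": list(range(s * ways, s * ways + ways)),
    }

# Block 12: direct mapped -> block 4 (12 mod 8); set associative -> set 0.
print(candidate_blocks(12))
```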
[Figure: where block 12 of a 32-block memory can be placed in an 8-block cache. Fully associative: block 12 can go anywhere. Direct mapped: block 12 can go only into block 4 (12 mod 8). 2-way set associative with 4 sets: block 12 can go anywhere in set 0 (12 mod 4).]
Q2: How is a block found if it is in the upper level?
Caches have an address tag on each block frame that
gives the block address.
The tag is checked to see if it matches the block
address from the CPU.
There must be a way to know that a cache block does
not have valid information. The most common
procedure is to add a valid bit to the tag to say
whether or not this entry contains a valid address.
The figure on the next page shows how a CPU address is divided.
The three portions of an address in a set-associative or direct-mapped cache:

  |<----- Block address ----->|
  |    Tag    |     Index     | Block offset |

The first division is between the block address and the block offset. The block frame address can be further divided into the tag field and the index field. The block offset field selects the desired data from the block, the index field selects the set, and the tag field is compared against the stored tag for a hit.
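A sketch of this division in code, assuming the 21-bit tag, 8-bit index, and 5-bit offset widths used in the direct-mapped example that follows (the field widths are the only assumption):

```python
OFFSET_BITS, INDEX_BITS = 5, 8   # 32-byte blocks, 256 sets

def split_address(addr):
    """Split a CPU address into (tag, index, block offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                 # data within block
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # selects the set
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # compared for a hit
    return tag, index, offset
```

Reassembling the fields must give back the original address: `(tag << 13) | (index << 5) | offset`.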
[Figure: a direct-mapped cache with 256 blocks. The address divides into a 21-bit tag, an 8-bit index, and a 5-bit block offset. Each block frame holds a valid bit (<1>), a tag (<21>), and 256 bits of data (one 32-byte block). The index selects a block frame, the stored tag is compared (=?) with the address tag, and the valid bit qualifies the hit; a mux selects the desired 64-bit word, and writes pass through a write buffer to the lower-level memory.]
[Figure: a two-way set-associative cache with 128 sets, each way holding 128 blocks. The address divides into a 22-bit tag, a 7-bit index, and a 5-bit block offset. The index selects a set; the two stored tags are compared (=?) with the address tag in parallel, a mux selects the word within each way, and a final mux forwards the 64-bit data of the hitting way. Writes pass through a write buffer to the lower-level memory.]
Q3: Which block should be replaced on a miss?
When a miss occurs, the cache controller must select
a block to be replaced with the desired data.
A benefit of direct-mapped placement is that hardware decisions are simplified; in fact, the decision is so simple that there is no choice: only one block frame is checked for a hit, and only that block can be replaced.
With fully associative or set associative placement,
there are many blocks to choose from on a miss.
There are two primary strategies employed for selecting which block to replace: random and least-recently used (LRU).
Random Strategy:
To spread allocation uniformly, candidate blocks are
randomly selected.
Some systems generate pseudo-random block numbers to get reproducible behavior, which is particularly useful when debugging hardware.
A virtue of random replacement is that it is simple to
build in hardware.
Least-recently used (LRU) Strategy:
To reduce the chance of throwing out information that will be needed soon, accesses to blocks are recorded.
The block replaced is the one that has been unused
for the longest time.
LRU makes use of a corollary of locality: If recently
used blocks are likely to be used again, then the best
candidate for disposal is the least-recently used block.
When a miss occurs in a cache, the incoming data
has to replace something already in the cache.
The least-recently used (LRU) replacement algorithm tries to find which block was used least recently. We must keep track of the most recent time that each block was used.
Consider a set size of 8 (an 8-way set-associative cache). LRU could be realized by assigning a 3-bit downcounter to each block. When the cache is flushed, all counters are reset. The rules are:
Replace at count 0 (any 0 will do).
Upon a hit (including replacement), set the hit block's counter to 7 and decrement all other counters; a counter cannot go below zero.
On cache reset, reset all counters as well as the valid bits.
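The rules above can be sketched as a behavioral model (a software sketch, not the hardware itself):

```python
class LRUSet:
    """3-bit downcounter LRU for one set of an 8-way set-associative cache.
    Rules: all counters reset on a flush; replace any block whose counter
    is 0; on a hit (including replacement) set that block's counter to 7
    and decrement all the others, never below zero."""

    def __init__(self, ways=8):
        self.count = [0] * ways          # cache reset clears all counters

    def touch(self, way):
        """Record a hit on, or replacement of, the block in `way`."""
        self.count = [max(c - 1, 0) for c in self.count]
        self.count[way] = 7

    def victim(self):
        """Choose a block to replace: any block at count 0 will do."""
        return self.count.index(0)
```

For a 4-way cache the same model applies with 2-bit counters and a maximum count of 3.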
If the set size is 4 (a 4-way set-associative cache), LRU could be realized by assigning a 2-bit downcounter to each block. The counter value 3 means that block was used most recently; 2 means that block was used more recently; ...; and 0 means that block was used least recently.
If the set size is 2 (a 2-way set-associative cache), the block that was not most recently used must be the least recently used. So it is not necessary to use two 1-bit counters for the two blocks; a single 1-bit counter per set is enough. If the counter value is 1, block 1 was used most recently, so block 0 must be the least recently used.
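For the 2-way case the scheme degenerates to a single bit per set, e.g.:

```python
def record_hit(hit_way):
    """The LRU bit simply remembers the most recently used way (0 or 1)."""
    return hit_way

def victim(lru_bit):
    """Replace the way that was NOT most recently used."""
    return 1 - lru_bit
```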
Assigning a 4-bit downcounter to each block would realize LRU for a 16-way set-associative cache, but 16-way set-associative caches are not found in real designs: assigning a 4-bit downcounter to each block would greatly increase the hardware burden.
LRU counter example (2-bit counters, 4-way; the cache has 8 sets, set 0 is shown, and all counters in the other sets remain 0):

                S3 S2 S1 S0
(1) Replace S3:  3  0  0  0
(2) Replace S1:  2  0  3  0
(3) Replace S0:  1  0  2  3
(4) Replace S2:  0  3  1  2
[Figure: a 2-way set-associative cache with 8 sets. The word address splits into a tag, a 3-bit set index, and a 2-bit word select. Each entry holds a valid bit (V), a tag, and four data words d3-d0, and each set holds one LRU bit. In the example, the address has tag 2AE1, set index 4, and word 1; set 4 holds valid tags 2AE1 (way S1) and 15D3 (way S0). Both ways' tags are compared (COMP) with the address tag in parallel; on a hit, a mux selects the addressed word and the output is enabled onto the data bus.]
Q4: What happens on a write?
There are two basic options when writing to the
cache:
1. Write-through (or store through) — The information
is written to both the block in the cache and the
block in the lower-level memory.
2. Write back (or copy back or store in) — The
information is written only to the block in the cache.
The modified cache block is written to main
memory only when it is replaced.
There are two options on a write miss:
1. Write allocate — The block is loaded on a write
miss. This is similar to a read miss.
2. No-write allocate — The block is modified in the
lower level and not loaded into the cache.
Although either write-miss policy could be used with write through or write back, write-back caches generally use write allocate (hoping that subsequent writes to that block will be captured by the cache) and write-through caches often use no-write allocate (since subsequent writes to that block will still have to go to memory).
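A toy model of the two common pairings (a hypothetical single-block "cache", counting only the write traffic reaching the lower level):

```python
class OneBlockCache:
    """Toy one-block cache contrasting the policies above.
    write_back=True pairs with write allocate; write_back=False
    (write through) pairs with no-write allocate."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.block = None      # address of the cached block
        self.dirty = False
        self.mem_writes = 0    # writes reaching lower-level memory

    def write(self, block_addr):
        if self.block == block_addr:           # write hit
            if self.write_back:
                self.dirty = True              # defer the memory update
            else:
                self.mem_writes += 1           # write through to memory
        elif self.write_back:                  # miss, write allocate
            if self.dirty:
                self.mem_writes += 1           # write back the victim
            self.block, self.dirty = block_addr, True
        else:                                  # miss, no-write allocate
            self.mem_writes += 1               # modify lower level only

wb, wt = OneBlockCache(True), OneBlockCache(False)
for addr in [0, 0, 0, 0]:          # repeated writes to the same block
    wb.write(addr); wt.write(addr)
print(wb.mem_writes, wt.mem_writes)   # write back: 0, write through: 4
```

Repeated writes to one block are captured entirely by the write-back cache, while the write-through cache sends every write to memory, illustrating why the pairings above are the common choices.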
A valid bit is added to units smaller than the full block, called sub-blocks. Only a single sub-block need be read on a miss.

Tag    V  V  V  V
100    1  1  1  1
300    1  1  0  0
200    0  1  0  1
204    0  0  0  0
(one valid bit per sub-block)
Tc : cache access time
Tm : main memory access time
h: hit ratio; 1 − h: miss ratio
Average memory access time
T = hTc + (1 − h)(Tc + Tm ) = Tc + (1 − h)Tm
Example:
Tc = 1ns, Tm = 50ns, h = 95%
T = Tc + (1 − h)Tm = 1 + 0.05 × 50 = 1 + 2.5 = 3.5ns
Speedup over memory alone = Tm /T = 50/3.5 ≈ 14.3
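The same calculation in code (nothing beyond the formula above):

```python
def amat(t_cache, t_mem, hit_ratio):
    """Average memory access time: T = Tc + (1 - h) * Tm."""
    return t_cache + (1 - hit_ratio) * t_mem

t = amat(1.0, 50.0, 0.95)   # about 3.5 ns
print(t, 50.0 / t)          # access time and speedup over memory alone
```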
First-level caches in the Pentium Pro and PowerPC 604

Characteristic        Intel Pentium Pro                   PowerPC 604
Cache organization    Split instruction and data caches   Split instruction and data caches
Cache size            8KB each for instructions/data      16KB each for instructions/data
Cache associativity   Four-way set associative            Four-way set associative
Replacement           Approximated LRU                    LRU
Block size            32 bytes                            32 bytes
Write policy          Write-back                          Write-back or write-through