Cache Memory
3 Levels of Cache
Level 1 Cache
o Part of the processor; private to each core
Level 2 Cache
o Part of the processor; private to each core
Level 3 Cache
o Shared by all cores in the processor
Cache-related Terms:
Cache Hit “Hit Latency”
Tag Directory
Cache Miss “Miss Latency”
Page Fault
Page Hit
Locality of Reference
Spatial Locality
o If a memory location is accessed by the processor, then nearby memory locations are likely to be accessed soon.
Temporal Locality
o If a memory location is accessed by the processor, then it is likely to be accessed again soon.
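A small illustrative sketch of the two access patterns (the array and loop are hypothetical, not from the notes):

```python
# Illustrative sketch: access patterns exhibiting the two kinds of locality.

data = list(range(16))  # pretend this list is a block of memory

# Spatial locality: sequential accesses touch neighbouring addresses,
# so one fetched cache block serves several subsequent accesses.
total = 0
for i in range(len(data)):
    total += data[i]    # data[i+1] is likely in the same cache block

# Temporal locality: the same location (here `total`) is accessed on
# every iteration, so keeping it cached (or in a register) pays off.
print(total)  # 120
```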
Direct Mapping
Word : Smallest Addressable Unit of Memory
Byte-Addressable Memory: 1 word = 1 Byte
Direct Mapping: Address Bits
Num of Blocks in MM = P.A. Space Size / Block Size
Num of Lines in Cache = Cache Size / Block Size
Block Size = Cache Line Size
Tag Bits = P.A. bits – (Line no. bits + Offset bits)
Tag Bits = log2(MM Size / Cache Size)
Tag Directory Size = Num of Lines in Cache * Num of Tag Bits
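The formulas above can be checked with a quick calculation. The parameters below (4 GiB physical address space, 16 KiB direct-mapped cache, 64 B blocks, byte-addressable) are hypothetical, chosen only for illustration:

```python
from math import log2

# Hypothetical parameters (not from the notes).
MM_SIZE    = 4 * 2**30   # 4 GiB physical address space, in bytes
CACHE_SIZE = 16 * 2**10  # 16 KiB cache
BLOCK_SIZE = 64          # 64 B blocks

pa_bits     = int(log2(MM_SIZE))        # 32 physical address bits
offset_bits = int(log2(BLOCK_SIZE))     # 6
num_lines   = CACHE_SIZE // BLOCK_SIZE  # 256
line_bits   = int(log2(num_lines))      # 8

tag_bits = pa_bits - (line_bits + offset_bits)     # 18
# Cross-check with the second formula: log2(MM Size / Cache Size)
assert tag_bits == int(log2(MM_SIZE // CACHE_SIZE))

tag_directory_bits = num_lines * tag_bits          # 256 * 18 = 4608 bits
print(pa_bits, tag_bits, tag_directory_bits)       # 32 18 4608
```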
Direct Mapping: Hardware Implementation
Hit Latency = T(MUX) + T(n-bit comparator)
Associative Mapping
Types of Cache Misses
Compulsory Miss (Cold Miss)
o When memory block is referenced for the first time
Conflict Miss
o Occurs when multiple memory blocks map to the same line/set and evict one another, even though the cache is not full
Capacity Miss
o When Working Set > Cache
Coherence Miss
Coverage Miss
System-Related Miss
Block Number Bits = No. of Tag Bits
Hit Latency = T(n-bit comparator) + T(multi-input OR gate), since all comparators operate in parallel
Set Associative Mapping
Num of Lines = Cache Size / Block Size
Num of Sets = Num of Lines / Num of Lines per Set (associativity)
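A quick check of the set-associative formulas, again with hypothetical parameters (32 KiB cache, 64 B blocks, 4 lines per set):

```python
from math import log2

# Hypothetical parameters (not from the notes): 4-way set associative.
CACHE_SIZE = 32 * 2**10  # bytes
BLOCK_SIZE = 64          # bytes
WAYS       = 4           # lines per set

num_lines = CACHE_SIZE // BLOCK_SIZE  # 512
num_sets  = num_lines // WAYS         # 128
set_bits  = int(log2(num_sets))       # 7 address bits select the set
print(num_lines, num_sets, set_bits)  # 512 128 7
```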
Cache Design
Block Placement
o Where to place the Main Memory Block in the Cache
Block Identification
o How to find the Main Memory Block in the Cache
Block Replacement
o During a Cache Miss, how to choose which entry to replace from the Cache
Write Strategy
o How are updates propagated to Main Memory
Block Placement
Direct Mapping: Block No. % #lines
Set Associative Mapping: Block No. % #sets
Associative Mapping: Anywhere
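The three placement rules can be sketched for one hypothetical block number (cache geometry below is illustrative, not from the notes):

```python
# Hypothetical example: where main-memory block number 77 may be placed
# in a cache with 8 lines, organised as 4 sets of 2 lines each.
NUM_LINES = 8
NUM_SETS  = 4

block_no = 77
direct_line = block_no % NUM_LINES  # direct mapping: exactly one line
sa_set      = block_no % NUM_SETS   # set associative: any line in this set
print(direct_line, sa_set)          # 5 1 (fully associative: any of the 8 lines)
```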
Block Identification
Direct Mapping
o Find potential match using Line no. and Offset Bits
o Compare the Tag bits.
Set Associative Mapping
o Find set using Set Index
o Compare the Tag bits in parallel.
Associative Mapping
o Find the potential match by comparing the Tag bits associated with every line, simultaneously.
Block Replacement
Cache is Limited in Size
What should be done?
o When the Cache is full?
o When a potential match can’t be found?
Write Strategy
When?
o Processor needs to modify data word.
Situations
o Write Hit: Data is present in the cache.
Write Through
Cache and Main Memory are updated simultaneously.
Preferred when write operations are infrequent.
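A minimal write-through sketch, assuming a toy dict-based cache and memory (all names and addresses are illustrative, not from the notes):

```python
# Toy model: address -> value mappings standing in for a cache line
# and for main memory.
cache  = {0x40: 7}   # pretend this address is currently cached
memory = {0x40: 7}

def write_through(addr, value):
    """On a write hit, update the cache and main memory together."""
    if addr in cache:          # write hit: data is present in the cache
        cache[addr] = value    # update the cache...
    memory[addr] = value       # ...and main memory in the same operation

write_through(0x40, 99)
print(cache[0x40], memory[0x40])  # 99 99 — both copies stay consistent
```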
Cache Replacement Policies
Used for Associative and Set Associative Mapping
Random Replacement
o Evict any Block from Cache at Random
o Access Information is not needed.
o Rarely implemented in practice
FIFO
o Evicts Blocks from Cache in their order of arrival.
o Cache behaves as a First In First Out queue
LIFO (Last in First Out)
o Behaves like a Stack
Optimal Replacement
o Evicts the block that will not be referred to for the longest time in the future.
o Requires knowledge of future accesses, so it serves as a theoretical benchmark; uses First In First Out as a tie-breaker.
Most Recently Used
o Evicts Most Recently referred Block.
o Works well with cyclic patterns
Least Recently Used
o Exploits Temporal Locality
o Evicts least recently referred block.
LRU Implementation
Uses Age Bits
Large overhead for caches with higher associativity
e.g., with 4 lines, ages start as 0,1,2,3; on an access, the accessed line’s age is set to 3 and every line whose age was higher than the accessed line’s old age is decremented
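The age-bit scheme above can be sketched for a single 4-way set (a simplified software model, not a hardware-accurate implementation; the class and reference string are hypothetical):

```python
# Counter-based ("age bits") LRU for one set of a set-associative cache.
# Age 0 = least recently used (the eviction victim); ways-1 = most recent.
class LRUSet:
    def __init__(self, ways=4):
        self.tags = [None] * ways
        self.ages = list(range(ways))

    def access(self, tag):
        if tag in self.tags:               # hit: line already present
            i = self.tags.index(tag)
            hit = True
        else:                              # miss: evict the line with age 0
            i = self.ages.index(0)
            self.tags[i] = tag
            hit = False
        old = self.ages[i]
        for j, a in enumerate(self.ages):  # decrement ages above the old age
            if a > old:
                self.ages[j] = a - 1
        self.ages[i] = len(self.ages) - 1  # touched line becomes most recent
        return hit

s = LRUSet()
hits = [s.access(t) for t in "ABCDABE"]    # E evicts C, the LRU block
print(hits)
```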
Pseudo-LRU Implementation
Generates approximate measures for replacements