Lecture of Module 3
Memory Organization
Overview
Introduction
Characteristics of Memory System
Memory Hierarchy
Memory Classification
Semiconductor Memories
Memory Cell Organization and Operation
Magnetic Disk
Magnetic Tape
Introduction
One of the major advantages of a computer is its storage capacity: a huge amount of
information can be stored.
But how is this information represented and stored?
We are now going to learn about the various data storage devices.
A memory is just like a human brain.
It is used to store data and instructions.
Computer memory is the storage space in the computer, where the data and instructions are
stored.
Each location of the memory has a unique address, which varies from zero to memory size
minus one.
For example, if the computer has 64k words, then this memory unit has 64 * 1024 = 65536
memory locations. The address of these locations varies from 0 to 65535.
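As a quick illustration of the address arithmetic above, here is a small generic Python sketch (an illustration, not part of the lecture):

```python
# Address range of a word-addressable memory: 0 .. (size - 1).
def address_range(words: int) -> tuple[int, int]:
    """Lowest and highest address of a memory with `words` locations."""
    return 0, words - 1

words = 64 * 1024                 # 64K words
lo, hi = address_range(words)
bits = (words - 1).bit_length()   # address width needed to reach every location
print(lo, hi, bits)               # 0 65535 16
```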
[Figure: Processor-memory connection. The processor's MAR drives a k-bit address bus and its MDR connects to an n-bit data bus, reaching up to 2^k addressable locations with a word length of n bits; control lines (R/W, MFC, etc.) coordinate transfers.]
Characteristics of Memory System
Location
Refers to whether memory is internal or external to the computer
Internal memory is often equated with main memory
Processor requires its own local memory, in the form of registers
Cache is another form of internal memory
External memory (Secondary Memory) consists of peripheral storage devices that are
accessible to the processor via I/O controllers
Capacity
Memory is typically expressed in terms of bytes
Transfer unit
Byte Addressable, Word Addressable (Main Memory)
Block (Secondary Memory)
Access Methods

Sequential access:
Memory is organized into units of data called records.
Access must be made in a specific linear sequence.
Access time is variable.
Example: Magnetic Tape.

Direct access:
Individual blocks or records have a unique address based on physical location.
Access is accomplished by direct access to reach a particular place, then sequential searching.
Access time is variable.
Example: Magnetic Disk.

Random access:
Each addressable location in memory has a unique, physically wired-in addressing mechanism.
The time to access a given location is independent of the sequence of prior accesses and is constant.
Any location can be selected at random and directly addressed and accessed.
Example: Main memory and some cache systems are random access.

Associative access:
A word is retrieved based on a portion of its contents rather than its address.
Each location has its own addressing mechanism, and retrieval time is constant independent of location or prior access patterns.
Cache memories may employ associative access.
Performance
Most important characteristics of memory
Three performance parameters are used:

Access time:
Time between initiation of a memory operation and completion.
For random-access memory, it is the time it takes to perform a read or write operation.
For non-random-access memory, it is the time it takes to position the read-write mechanism at the desired location.

Memory cycle time:
Access time plus any additional time required before a second access can commence.
Time between initiation of two successive memory operations.
Slightly more than the access time.

Transfer rate:
The rate at which data can be transferred into or out of a memory unit.
For random-access memory, it is equal to 1/(cycle time).
Physical Characteristics
Volatile memory
Information decays naturally or is lost when electrical power is switched off
RAM
Nonvolatile memory
Once recorded, information remains without deterioration until deliberately changed
No electrical power is needed to retain information
Magnetic-surface memories, ROM
Semiconductor memory may be either volatile or nonvolatile
Erasable memory
Can be altered or erased by the programmer
Nonerasable memory
Cannot be altered, except by destroying the storage unit
Semiconductor memory of this type is known as read-only memory (ROM)
Memory Hierarchy
Design constraints on a computer's memory can be
summed up by three questions:
How much, how fast, how expensive
There is a trade-off among capacity, access time, and
cost
Faster access time, greater cost per bit
Greater capacity, smaller cost per bit
Greater capacity, slower access time
The way out of the memory dilemma is not to rely on a
single memory component or technology, but to employ
a memory hierarchy
The memory hierarchy separates computer storage into
a hierarchy based on access time, capacity and cost.
Memory Classification
Memory
  Primary Memory
    Semiconductor Memory
      Read/Write Memory (RAM)
        Bipolar RAM (always Static RAM)
        MOS RAM (Static RAM or Dynamic RAM)
      Read Only Memory (ROM)
    Magnetic Memory
      Core Memory
      Bubble Memory
  Secondary Memory
    Direct Access Memory
      Magnetic Drum
      Magnetic Disk (Floppy Disk, Hard Disk)
    Sequential Access Memory
      Magnetic Tape
Semiconductor Memory
Semiconductor memory is a semiconductor device used for digital data storage in the
computer.
It typically refers to Bipolar or MOS memory, where data is stored within a memory cell on
a silicon integrated circuit (IC) memory chip.
There are different types of memory using different semiconductor technologies.
The two main types of semiconductor memories are Random Access Memory (RAM) and
Read Only Memory (ROM).
Most types of semiconductor memory have the property of random access, which means that it
takes the same amount of time to access any memory location.
So, data can be efficiently accessed in any random order.
Semiconductor memory also has much faster access time than other types of data storage.
It is used as the main memory of the computer (primary storage), holding the data the computer is currently working on.
Random Access Memory (RAM)
RAM (Random Access Memory) is the internal memory of the CPU for storing
data, program, and results.
It is a read/write memory which stores data while the machine is working.
As soon as the machine is switched off, the data is erased.
RAM is volatile, that is, data stored in it is lost when we switch off the computer or
if there is a power failure.
Hence, a backup Uninterruptible Power Supply (UPS) is often used with
computers.
RAM is small, both in terms of its physical size and in the amount of data it can
hold.
Access time in RAM is independent of the address, that is, each storage location
inside the memory is as easy to reach as other locations and takes the same amount
of time.
Structure of RAM
According to semiconductor technology, there are two types of RAM.
Bipolar RAM - always Static RAM
MOS RAM may be Static RAM or Dynamic RAM
The word Static indicates that the memory retains its contents as long as power is being
supplied. However, the data is lost when power is removed, owing to its volatile nature.
SRAM chips use a matrix of transistors and no capacitors.
Transistors do not require power to prevent leakage, so SRAM need not be refreshed on a
regular basis.
There is extra space in the matrix, hence SRAM uses more chips than DRAM for the same
amount of storage space, making the manufacturing costs higher.
SRAM is thus used as cache memory and has very fast access.
Static RAM (Bipolar)
Bipolar RAM is always Static RAM.
Q1 on, Q2 off: logic 1 is stored.
Q2 on, Q1 off: logic 0 is stored.
Very high speed of operation.
But high power dissipation.
Low bit density.
Complex manufacturing process.
Used as a CACHE memory cell.
[Figure: bipolar SRAM cell with cross-coupled transistors Q1, Q2 and diodes D1, D2.]
Static RAM (MOS)
This is a random access memory that uses an array of
six transistors and is made up of CMOS technology.
Its structure consists of two cross-coupled inverters
which hold the data.
T1, T4 off and T2, T3 on: C1 is high and C2 is low, storing logic 1.
T1, T4 on and T2, T3 off: C1 is low and C2 is high, storing logic 0.
The state remains stable as long as DC power is applied.
Higher bit density than Bipolar static RAM
Lower power dissipation than Bipolar static RAM
Slower speed of operation than Bipolar static RAM
Dynamic RAM (DRAM)
DRAM consists of one transistor and a capacitor.
The capacitor stores one bit of data in the form of charge.
Presence or absence of charge in a capacitor is interpreted as a
binary 1 or 0.
It requires refreshing of each memory cell to retain the data, as the
capacitor discharges over time; hence it is dynamic in nature.
For storing, the Word line is selected, turning T on; a voltage applied to the
Bit line then charges the capacitor.
For reading, the Word line is selected, the charged capacitor
discharges through the Bit line, and the sense amplifier compares the
capacitor voltage with a threshold voltage.
If the voltage is more than the threshold voltage, it is read as logic 1; otherwise
logic 0.
Requires periodic charge refreshing to maintain data storage.
Lower cost than SRAM
Higher bit density than SRAM
So, higher storage capacity than SRAM
Simple manufacturing process
Consumes less power than SRAM
Slower speed of operation than SRAM
Needs to refresh periodically
Some DRAM chips incorporate refresh and address control circuitry within the
memory chip to make them behave like Static RAM, so the dynamic nature of the chip is
not visible to the user. This type of DRAM is called Pseudo-Static RAM (PSRAM).
Types of DRAM
ADRAM Asynchronous DRAM
EDRAM Enhanced DRAM
CDRAM Cache DRAM
RDRAM Rambus DRAM
SDRAM Synchronous DRAM
DDR SDRAM Double Data Rate SDRAM
Comparison
Read Only Memory (ROM)
ROM stands for Read Only Memory.
It is a memory from which we can only read; we cannot write to it.
This type of memory is non-volatile.
The information is stored permanently in such memories during manufacture.
Contains a permanent pattern of data that cannot be changed or added.
No power source is required to maintain the bit values in memory.
Data or program is permanently in main memory and never needs to be loaded
from a secondary storage device.
Data is actually wired into the chip as part of the fabrication process.
A ROM stores such instructions that are required to start a computer.
This operation is referred to as Bootstrap.
Types of ROM
Mask ROM
PROM
EPROM
EEPROM
Flash ROM
Mask ROM: In this type of ROM, the specification of the ROM (its contents and their
locations) is taken by the manufacturer from the customer in tabular form in a specified format,
and corresponding masks for the paths are then made to produce the desired output. This is costly,
as the vendor charges a special fee from the customer for making a particular ROM
(recommended only if a large quantity of the same ROM is required).
Uses They are used in network operating systems, server operating systems, storing of fonts
for laser printers, sound data in electronic musical instruments.
PROM: It stands for Programmable Read-Only Memory. It is first prepared as blank memory and then programmed
to store the information. The difference between PROM and Mask ROM is that PROM is manufactured as blank memory
and programmed after manufacturing, whereas a Mask ROM is programmed during the manufacturing process.
To program the PROM, a PROM programmer or PROM burner is used. The process of programming the PROM is called
burning the PROM. The data stored in it cannot be modified afterwards, so it is called a one-time programmable device.
Uses They have several different applications, including cell phones, video game consoles, medical devices, and other
electronics.
EPROM: It stands for Erasable Programmable Read-Only Memory. It overcomes the disadvantage of PROM that, once
programmed, the fixed pattern is permanent and cannot be altered: a PROM becomes unusable if its bit pattern
has to be changed. This problem has been overcome by the EPROM; when the EPROM is placed under a special
ultraviolet light for a length of time, the shortwave radiation returns the EPROM to its initial state,
after which it can be programmed again. A PROM programmer or PROM burner is again used for reprogramming.
Uses Before the advent of EEPROMs, some micro-controllers, like some versions of Intel 8048, the Freescale 68HC11
used EPROM to store their program .
EEPROM: It stands for Electrically Erasable Programmable Read-Only Memory. It is
similar to EPROM, except that the EEPROM is returned to its initial state by the
application of an electrical signal in place of ultraviolet light. Thus it provides ease of
erasing, as this can be done even while the memory is positioned in the computer. It erases or
writes one byte of data at a time.
Uses It is used for storing the computer system BIOS.
Flash ROM: It is an enhanced version of EEPROM. The difference between EEPROM and
Flash ROM is that in EEPROM only 1 byte of data can be deleted or written at a particular
time, whereas in flash memory, blocks of data (usually 512 bytes) can be deleted or written at
a particular time. So Flash ROM is much faster than EEPROM.
Uses: Many modern PCs have their BIOS stored on a flash memory chip, called a flash
BIOS, and flash memory is also used in modems.
Advantages of ROM
It is non-volatile, meaning data set by the
manufacturer will function as expected.
Being static, it doesn't need a refresh cycle.
In comparison to RAM, the circuitry is simpler.
Data can be stored permanently.
Disadvantages of ROM:
ROM is a read-only memory unit, so it can't be modified.
If any changes are required, they are not possible.
Comparison
RAM: Random Access Memory. ROM: Read Only Memory.
RAM is volatile memory; ROM is non-volatile memory.
If the system is turned off, the information in RAM is deleted; the information in ROM remains in memory, so the system can retrieve it again when switched on.
RAM requires power to store data; ROM doesn't require power to store data.
RAM is a temporary storage unit for files in use; ROM is used to store the BIOS, which doesn't change.
RAM chips often range from 1 to 256 GB; ROM chips often range from 4 to 8 MB.
RAM is temporary memory; ROM is permanent memory.
RAM has a complex circuit; ROM has a simpler circuit.
Error Detection and Correction
Two types of error may be encountered in semiconductor memory.
Hard Failure
Permanent physical defect so that memory cell or cells affected cannot reliably
store data but become stuck at 0 or 1 or switch erratically between 0 and 1.
Can be caused by:
Harsh environmental effect
Manufacturing defects
Wear and tear etc.
Soft Error
It is a random, non-destructive event that alters the contents of one or more
memory cells without damaging the memory.
Can be caused by:
Power supply problems
Alpha particles resulting from radioactive decay
So, error detection and correction is required.
Magnetic Disk
Magnetic Disk is a type of secondary memory: a flat disc
covered with a magnetic coating to hold information.
A disk is a circular plate constructed of nonmagnetic material, called
the substrate, coated with a magnetizable material.
Traditionally the substrate has been an aluminium or aluminium
alloy material.
Recently glass substrates have been introduced for improvement in
the uniformity of the magnetic film surface to increase disk
reliability.
It is used to store various programs and files.
Information polarized in one direction is represented by 1, and the
other direction is represented by 0.
There are two types of Magnetic Disk memory
Floppy Disk and Hard Disk
Floppy Disk
Hard Sectoring
The number of Sectors on each Track
is physically fixed while
manufacturing.
The beginning of each sector is
identified by a sector hole punched
on the disk.
Sector is fixed, no flexibility.
Soft Sectoring
The number of sectors per track is chosen by
software.
Information about sector is written on
each sector.
Flexible
Hard Disk
Magnetic disks are less expensive than RAM and can store large amounts of data.
Data access rate is slower than main memory.
Data can be modified or can be deleted easily in the magnetic disk memory.
Advantages:
These are economical memory.
Easy and direct access to data is possible.
It can store large amounts of data.
It has better data transfer rate than magnetic tapes.
It is less prone to corruption of data as compared to tapes.
Disadvantages:
These are more expensive than magnetic tape memories.
They need a clean and dust-free environment for storage.
These are not suitable for sequential access.
Disk Performance Parameters
When the disk drive is operating, the disk rotates at constant speed.
To read or write, the head must be positioned at the desired track and at the beginning
of the desired sector on that track.
Track selection involves moving the head in a movable-head system or
electronically selecting one head on a fixed-head system.
Once the track is selected, the disk controller waits until the appropriate sector
rotates to line up with the head.
Seek time
On a movable head system, the time it takes to position the head at the proper
track.
For fixed head system, it is zero.
Average Seek time is N/3, where N = Number of Track or Cylinder.
Rotational delay (latency)
After finding the track, the time it takes for the head to reach the beginning of the required
sector.
Average Latency time is half of the rotation time.
Access time
The sum of the seek time and the rotational delay (latency).
The time it takes to get into position to read or write.
So, Access time = Seek time + Latency time
Transfer time
Once the head is in position, the read or write operation is performed as the sector passes
under the head.
The transfer time is the time required to move the data from the secondary storage device to the processor or memory.
Read/Write Mechanism
The magnetic surface is placed just below the yoke coil.
The magnetic surface is mounted on a rotatory device
and rotates at a uniform speed.
Digital Information can be stored on Magnetic film by
applying current pulses of suitable polarity to the
magnetizing coils.
This causes the magnetization of the film in the area
immediately underneath the head by inducing magnetic
field just below the yoke.
According to the generated flux polarization the bit
pattern is stored over the surface.
The same method is used for reading the information.
For reading, when head reaches at the required place over the magnetic surface
from where an information to be retrieved, it senses the polarity of the
magnetic field present at that particular spot and sends appropriate signal
through the coil.
This particular technique for reading/writing is known as Manchester
Encoding.
Problem:
A disk system has 16 data recording surfaces with 1024 tracks per surface.
There are 16 sectors per track, each containing 1024 bytes. The diameter of
inner cylinder is 6 inches and outer cylinder is 10 inches. Find out
Capacity of disk
Transfer rate if rotational speed is 3000 rpm
Latency time and average latency time
Track density
Maximum bit density (linear and areal)
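A hedged Python sketch of the solution, using the usual textbook formulas: track density is taken over the radial recording band ((10 - 6)/2 = 2 inches) and maximum linear density on the innermost (shortest) track — both standard assumptions, not stated in the problem.

```python
import math

surfaces, tracks, sectors, bytes_per_sector = 16, 1024, 16, 1024
inner_d, outer_d = 6.0, 10.0   # diameters in inches
rpm = 3000

# Capacity = surfaces x tracks x sectors x bytes per sector.
capacity = surfaces * tracks * sectors * bytes_per_sector    # 268435456 B = 256 MB

# One revolution passes one full track under the head.
rotation_time = 60.0 / rpm                                   # 0.02 s per revolution
bytes_per_track = sectors * bytes_per_sector
transfer_rate = bytes_per_track / rotation_time              # about 819200 B/s = 800 KB/s

# Latency: worst case one full rotation; average is half a rotation.
latency, avg_latency = rotation_time, rotation_time / 2      # 20 ms and 10 ms

# Track density over the recording band (difference of radii).
band = (outer_d - inner_d) / 2                               # 2 inches
track_density = tracks / band                                # 512 tracks per inch

# Maximum linear bit density occurs on the innermost track.
linear_density = bytes_per_track * 8 / (math.pi * inner_d)   # bits per inch
areal_density = linear_density * track_density               # bits per square inch

print(capacity, track_density)
```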
Magnetic Tape
Tape systems use the same reading and recording techniques as
disk systems.
The tape is made of a thin magnetizable coating on a long
narrow strip of plastic film.
Coating may consist of particles of pure metal in special
binders or vapor-plated metal films.
Data on the tape are structured as a number of parallel tracks
running lengthwise.
Reading and writing is serial and data are laid out as a
sequence of bits along each track.
Data are read and written in contiguous blocks called physical
records.
Blocks on the tape are separated by gaps
referred to as inter-record gaps.
Magnetic tape is usually recorded on only one
side.
The opposite side is a substrate to give the
tape strength and flexibility.
Data read/write speed is slower because of
sequential access.
The width of the ribbon varies from 4 mm to 1
inch, and tapes have storage capacities from 100 MB to
200 GB.
Advantages:
1. These are inexpensive, that is, low cost memories.
2. It provides backup or archival storage.
3. It can be used for larger files.
4. It can be used for copying from disk files.
5. It is a reusable memory.
Disadvantages:
1. Sequential access is the main disadvantage; it does not allow random or direct access.
2. The data access rate is slower.
3. It requires care in storage; it is vulnerable to humidity and dust and needs a suitable environment.
4. Its stored data cannot be easily updated or modified; it is difficult to make updates to data.
Overview
Introduction
CACHE memory
Memory Management
Virtual Memory
Associative Memory
Memory Interleaving
Introduction
All computer manufacturers want their systems to run as fast as possible.
A number of advanced techniques are currently being investigated to improve system
performance.
To make a computer faster, high-performance memory techniques should be
implemented.
The performance of the memory is generally improved using the following techniques.
Cache Memory
Virtual Memory
Associative Memory
Memory Interleaving
Cache Memory
Processor is much faster than the main memory.
As a result, the processor has to spend much of its time waiting while instructions and
data are being fetched from the main memory.
Major obstacle towards achieving good performance.
Speed of the main memory cannot be increased beyond a certain point.
Cache memory is an architectural arrangement which makes the main memory appear faster
to the processor than it really is.
A special very high speed memory called Cache which is used to increase the speed of
processing by making a current program and data available to the CPU at a rapid rate.
Generally it is employed in a computer system to compensate for the speed difference between
the main memory access time and the high-speed processor logic.
So, the Cache is located between Main memory and the Processor.
Processor <-> Cache <-> Main memory
Cache memory access time is close to the processor logic cycle time.
It is also used to store segments of program currently being needed and temporary data frequently
needed in the present calculation.
It is often found that a particular part of a computer program is executed repeatedly, while other
parts are executed less frequently.
These instructions may be the ones in a loop, a nested loop, or a few procedures/functions calling each
other repeatedly.
This is called locality of reference.
There are two types of Locality of reference.
Temporal locality of reference:
A memory location that is referenced is likely to be accessed again in the near future.
Data/instruction brought into cache expecting it to be used again.
Spatial locality of reference:
Memory locations near the last access are likely to be accessed in the near future.
Instructions/data with addresses close to a recently executed instruction or recently accessed data item are likely to be used
soon.
If the active segment of the program can be placed in a fast memory, then the total execution time
can be significantly reduced.
The fundamental idea of cache organization is to keep the most frequently accessed
instructions and data in the fast cache memory.
Levels of Cache
The basic technique that works very
effectively is to introduce separate
Cache for Instruction and Data.
This is called a Split Cache, which is the
Level 1 cache.
Today many memory systems are more
complicated and additional Cache
called Level 2 Cache, may reside
between instruction and data cache and
main memory.
In fact, there may be one more level of
cache, that is, a Level 3 cache, in more
sophisticated systems.
Level 2 and Level 3 are unified Cache.
[Figure: CPU package containing the microprocessor with split L1-I and L1-D caches and a unified L2 cache; a unified board-level L3 cache (SRAM) sits between the CPU package and main memory (DRAM); keyboard, graphics, and disk controllers attach to the system separately.]
The processor within itself contains a small instruction and data cache,
typically 16KB to 64KB.
There is Level 2 cache, which is not within the processor, but may be included
in the CPU package.
This Cache is generally unified containing both instruction and data, typically
512KB to 1MB.
The Level 3 cache is on the processor board and consists of a few megabytes
of SRAM, which is much faster than the main DRAM memory.
Caches are inclusive, with the full contents of the Level 1 Cache being in the
Level 2 and the full contents of the Level 2 Cache being in the Level 3 Cache.
Operations
When the CPU needs to access memory, the Cache is examined.
If the word is found in the Cache, it reads from the Cache memory.
If the word is not found in the Cache, the main memory is accessed to
read the required word.
A block of words containing the one just accessed is then transferred
from the main memory to Cache memory.
In this manner some data are transferred to the Cache from the main
memory so that future references to memory find the required words in
the fast Cache memory.
Cache Hit
Existence of a Cache is transparent to the processor.
The processor issues Read and Write requests in the same manner.
If the data is in the cache, it is called a Read or Write hit.
Read hit:
The data is obtained from the cache.
Write hit:
Cache has a replica of the contents of the main memory.
Contents of the cache and the main memory may be updated simultaneously. This is the write-
through protocol.
Update the contents of the cache, and mark it as updated by setting a bit known as the dirty bit
or modified bit. The contents of the main memory are updated when this block is replaced.
This is write-back or copy-back protocol.
Cache Miss
If the data is not present in the cache, then a Read miss or Write miss occurs.
If Read miss:
Block of words containing this requested word is transferred from the memory.
After the block is transferred, the desired word is forwarded to the processor.
The desired word may also be forwarded to the processor as soon as it is transferred without
waiting for the entire block to be transferred. This is called load-through or early-restart.
Write-miss:
If the write-through protocol is used, the contents of the main memory are updated directly.
If write-back protocol is used, the block containing the addressed word is first brought into the
cache. The desired word is overwritten with new information.
The performance of the cache is measured in terms of a quantity called the Hit Ratio.
Hit Ratio = Hits / (Hits + Misses), counted over CPU references to memory.
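The hit ratio formula can be stated directly in code (a generic illustration, not from the lecture):

```python
# Hit ratio = hits / (hits + misses), over all CPU references to memory.
def hit_ratio(hits: int, misses: int) -> float:
    return hits / (hits + misses)

# Example: 950 hits out of 1000 references gives a hit ratio of 0.95.
print(hit_ratio(950, 50))   # 0.95
```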
Write Policy
An important aspect of cache organization is concerned
with memory write requests.
When the CPU finds a word in the cache during a read
operation, the main memory is not involved in the transfer.
If the operation is a write, there are two ways the system
can proceed for writing.
Write through
Write back
Write through
Update the main memory with every memory write operation, with the cache
memory being updated in parallel if it contains the word at the specified address.
The main memory always contains the same data as the cache.
The data residing in main memory is valid at all times.
[Figure: on a hit, the CPU writes to both the cache and main memory; on a miss, the CPU writes to main memory directly.]
Write back
The problem with write through is repeated write operations to main memory.
Write back minimizes memory writes.
Updates are made only in the cache.
When an update occurs, an Update/Dirty bit
associated with the slot is set.
When that block is replaced, it is written back to
main memory if and only if its Update/Dirty bit is set.
The problem with write back is that portions of
main memory may hold data that is not valid.
[Figure: on a hit, the CPU writes only to the cache; the block is copied back to main memory when it is replaced.]
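The two policies can be contrasted with a toy model — a hypothetical one-block cache tracking how often main memory is written on repeated write hits (a sketch, not an actual cache implementation):

```python
# Hypothetical one-block cache contrasting write-through and write-back.
class Cache:
    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.data = None        # cached copy of the word
        self.dirty = False      # Update/Dirty bit (used only by write back)
        self.memory_writes = 0  # count of main-memory write operations

    def write(self, value):
        self.data = value
        if self.write_back:
            self.dirty = True          # defer the memory update
        else:
            self.memory_writes += 1    # write through: update memory every time

    def evict(self):
        if self.write_back and self.dirty:
            self.memory_writes += 1    # write the block back only on replacement
            self.dirty = False

wt, wb = Cache(write_back=False), Cache(write_back=True)
for v in range(10):                    # ten repeated write hits to the same word
    wt.write(v)
    wb.write(v)
wb.evict()
print(wt.memory_writes, wb.memory_writes)   # 10 1
```

This shows why write back minimizes memory writes: ten write hits cost ten memory writes under write through but only one (at replacement) under write back.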
Cache Mapping
Processor issues a Read request, a block of words is transferred from the main
memory to the cache, one word at a time.
Subsequent references to the data in this block of words are found in the cache.
At any given time, only some blocks in the main memory are held in the cache.
Which blocks in the main memory are in the cache is determined by a Mapping
Function.
Mapping functions determine how memory blocks are placed in the cache.
Three mapping functions:
Direct mapping
Associative mapping
Set-associative mapping.
Direct Mapping
A simple processor example:
Main memory has 64K words and is addressable by a 16-bit address.
The size of the cache is 2048 (2K) words, addressable by an 11-bit address.
So the Index is the least significant 11 bits and the Tag is the most significant 5 bits.
The number of bits in the Index field is equal to the number of bits required for accessing the cache.
The cache consists of blocks of 16 words each, so the total number of cache blocks is 128.
The Index bits are divided into Block bits and Word bits: the Block field is 7 bits and the Word field is 4 bits, giving a main memory address of Tag (5) | Block (7) | Word (4).
Main memory has 4K blocks of 16 words each.
Each word in the cache consists of the data word with its Tag bits stored alongside the data bits.
When the CPU generates a memory request, the Index field of the address is used to access the cache.
The Tag field of the generated address is compared with the Tag bits of the word read from the cache.
If the two Tags match, there is a Hit.
If there is no match, there is a Miss: the required word is read from main memory and then stored in the cache with the new Tag.
[Figure: 4096 main-memory blocks mapping onto 128 cache blocks, each cache block holding a tag; main memory address split as Tag (5) | Block (7) | Word (4).]
•Block j of the main memory maps to block (j modulo 128) of the cache: block 0 maps to 0, block 129 maps to 1, and so on.
•More than one memory block is mapped onto the same position in the cache.
•This may lead to contention for cache blocks even if the cache is not full.
•Contention is resolved by allowing the new block to replace the old block, leading to a trivial replacement algorithm.
•The memory address is divided into three fields:
- The low-order 4 bits determine one of the 16 words in a block.
- When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in.
- The high-order 5 bits determine which of the possible 32 memory blocks is currently present in that cache block. These are the tag bits.
•Simple to implement, but not very flexible.
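The Tag (5) | Block (7) | Word (4) address split described above can be sketched in Python (an illustration, not part of the lecture):

```python
# Decompose a 16-bit address for the direct-mapped cache example:
# Tag (5 bits) | Block (7 bits) | Word (4 bits).
def split_address(addr: int):
    word  = addr & 0xF            # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F    # next 7 bits: cache block number
    tag   = (addr >> 11) & 0x1F   # high 5 bits: tag
    return tag, block, word

# Main-memory block j maps to cache block j % 128, so blocks 0 and 128
# (word addresses 0 and 128 * 16 = 2048) contend for the same cache block
# and are distinguished only by their tags.
print(split_address(0))      # (0, 0, 0)
print(split_address(2048))   # (1, 0, 0)
```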
Associative Mapping
This is the fastest and most flexible mapping.
It uses associative memory.
A main memory block can be placed into any cache position.
The memory address is divided into two fields.
The lower 4 bits identify the word within a block.
The higher 12 bits are the Tag bits, identifying a memory block when
it is resident in the cache.
So, in an associative cache, each word consists
of the data word with its Tag bits stored alongside the data bits.
This is very expensive.
It is flexible and uses the cache space efficiently.
A replacement algorithm is used to replace an existing block
in the cache when the cache is full.
[Figure: any of the 4096 main-memory blocks may occupy any of the 128 cache blocks; main memory address split as Tag (12) | Word (4).]
Set-Associative Mapping
The problem with direct mapping is that two words with the
same Index but different Tag values cannot reside in the
cache memory at the same time.
Set-associative mapping addresses the problem of possible
thrashing in the direct mapping method.
In set-associative mapping, two or more words of memory with
the same Index (set) address can reside in the cache.
It is a combination of direct and associative mapping.
Blocks of the cache are grouped into sets (e.g., Set 0 through Set 63).
Hence, the contention problem of direct mapping is
eased by having a few choices for block placement.
The number of blocks per set is a design parameter:
- One extreme is to have all the blocks in one set,
requiring no set bits (fully associative mapping).
- The other extreme is to have one block per set,
which is the same as direct mapping.
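For the same 2K-word cache, a two-way organization with 64 sets (an assumption consistent with the Set 0 ... Set 63 grouping above) splits the 16-bit address as Tag (6) | Set (6) | Word (4). A hedged sketch:

```python
# Two-way set-associative split assumed: 128 blocks grouped into 64 sets,
# so a 16-bit address decomposes as Tag (6) | Set (6) | Word (4).
def split_set_assoc(addr: int):
    word = addr & 0xF
    set_index = (addr >> 4) & 0x3F   # 6 set bits select one of 64 sets
    tag = (addr >> 10) & 0x3F        # remaining 6 bits are the tag
    return tag, set_index, word

# Two blocks that collided under direct mapping can now coexist in one set:
print(split_set_assoc(0))      # (0, 0, 0)
print(split_set_assoc(2048))   # (2, 0, 0) -> same set 0, different tag
```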
Replacement Policy
When the cache is full, and a block of words needs to be
transferred from the main memory, some block of words in the
cache must be replaced.
This is determined by a Replacement Algorithm.
Cache performance is greatly affected by properly choosing for replacement a block
that is unlikely to be referenced again.
The most common replacement algorithms are:
Random replacement
First-in-first-out (FIFO)
Least Recently Used (LRU)
Least Frequently Used (LFU)
With a Random replacement policy any block can be replaced
randomly.
In case of FIFO, replace the block in the set which has been in the
cache for longest time. It is easily implemented as a round-robin
or circular buffer technique.
In case of LRU, replace the block in the set which has been in the
Cache for longest period with no references.
In case of LFU, replace the block from the Cache which has
fewest references.
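The LRU policy above can be sketched with an ordered dictionary tracking recency (a minimal illustration with a hypothetical tiny cache, not a real cache controller):

```python
from collections import OrderedDict

# Minimal LRU replacement sketch: the OrderedDict keeps blocks ordered
# from least recently used (front) to most recently used (back).
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> data

    def access(self, block, data=None):
        if block in self.blocks:                  # hit: mark most recently used
            self.blocks.move_to_end(block)
        else:                                     # miss: fetch, evicting LRU if full
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)   # evict least recently used block
            self.blocks[block] = data

cache = LRUCache(capacity=2)
cache.access(1)
cache.access(2)
cache.access(1)   # block 1 becomes most recently used
cache.access(3)   # cache full: block 2 (least recently used) is evicted
print(list(cache.blocks))   # [1, 3]
```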
Memory Management
In a uni-processor system, main memory is divided into two parts.
One part for OS (Resident Monitor)
Other part for the program currently being executed (User part)
In multiprogramming the user part of the memory is subdivided to
accommodate multiple processes.
The subdivision is carried out dynamically.
The memory needs to be allocated efficiently to pack as many
processes into the memory as possible.
This is known as Memory Management.
Generally there are two types of partitioning available for memory:
Fixed Partitioning and Variable Partitioning.
Fixed partitioning is of two types:
Equal size partitioning and Unequal size partitioning.
[Figure: equal size partitioning — OS followed by five 5M partitions;
unequal size partitioning — OS followed by partitions of 2M, 5M, 7M, 6M, 9M]
A process brought into the memory is placed into the
smallest available partition that can hold it.
In most cases a process may not require exactly as much
memory as provided by the partition.
So, there will be wastage of memory space.
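The placement rule and the resulting waste can be sketched as follows; the partition sizes follow the unequal-size example (2M, 5M, 7M, 6M, 9M), and the 4M process is hypothetical.

```python
# Sketch: place a process into the smallest available fixed partition
# that can hold it, and measure the wasted space inside the partition.

partitions = [{"size": s, "used": None} for s in (2, 5, 7, 6, 9)]  # MB

def place(process_size):
    """Pick the smallest free partition large enough for the process."""
    free = [p for p in partitions
            if p["used"] is None and p["size"] >= process_size]
    if not free:
        return None
    best = min(free, key=lambda p: p["size"])
    best["used"] = process_size
    return best

p = place(4)                             # a 4M process fits the 5M partition
print(p["size"], p["size"] - p["used"])  # -> 5 1  (1M wasted inside)
```

The 1M left over inside the 5M partition is exactly the internal fragmentation discussed next.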
A more efficient approach is Variable Partitioning.
When a process is brought into memory, it is allocated exactly as
much memory as it requires and no more.
This method starts out well, but eventually it leads to a situation in
which there are lots of small holes in memory.
As time goes on, the memory becomes more and more fragmented.
Unused space within a partition is called Internal Fragmentation.
Unused space outside the partitions (the small holes between them) is
called External Fragmentation.
So, memory utilization declines.
The technique to overcome this is Compaction.
From time to time the OS shifts the processes in memory and places all the free
memory space together in one block.
This is called Compaction.
This is a time consuming procedure and wastes processor time.
If compaction is adopted, processes may be shifted in main memory.
The instruction addresses and branch addresses within the instructions must
then be shifted as well.
This can be solved by Logical Addressing.
Logical address is expressed as a location relative to the beginning of the program.
Instructions in the program contain only logical addresses.
The physical address is the actual location in the memory.
When the processor executes a process, it automatically converts
from logical to physical address by adding the current starting
location of the process which is called its Base Address to each
logical address.
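The relocation step above is a single addition; in this sketch the base address 4000 and the logical addresses are hypothetical.

```python
# Sketch of logical-to-physical translation by base-address relocation.

BASE_ADDRESS = 4000   # current starting location of the process in memory

def to_physical(logical_address):
    """Physical address = base address + program-relative address."""
    return BASE_ADDRESS + logical_address

print(to_physical(0), to_physical(120))   # -> 4000 4120
```

After compaction moves the process, only BASE_ADDRESS changes; the logical addresses stored in the instructions stay valid.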
Both fixed and variable partitioning are not efficient.
So, Paging is used.
The memory is partitioned into relatively small fixed size chunks,
known as Frames.
The processes are also divided into small fixed size chunks of the
same size as the frames.
Each chunk of a process is called a Page.
The pages are swapped in and swapped out to or from the frames
of the memory.
Usually this enhances the performance of the system.
This leads to the concept of Virtual Memory.
Each page of the process is brought into the memory when it is
needed, that is on demand; this is called Demand Paging.
During processing, if a page is referenced that is
not in main memory, a Page fault occurs.
Then the OS brings the required page to main memory.
Paging is invisible to the programmer.
There is another way in which addressable memory can be
subdivided, known as Segmentation.
It is visible to the programmer and provides a convenience for
organizing the programs and data.
It allows the programmer to view memory as consisting of
multiple address spaces or segments.
Virtual Memory
Program and data are stored in auxiliary memory.
Portions of the program or data are brought into the main memory
as they are needed by the CPU.
Virtual memory is a technique that allows execution of processes that
may not be completely in the main memory.
The virtual memory concept is used to give the programmer the illusion
that they have a very large memory, even though the computer actually
has a relatively small main memory.
Each address referenced by the CPU goes through an address
mapping from the virtual address to a physical address in main memory.
An address used by the programmer is called a Virtual address or
Logical address.
The set of virtual addresses is called the Address space.
An address in main memory is called a location or Physical address.
The set of physical addresses is called the Memory space.
The physical memory is divided into equal-size chunks called Frames or
Blocks.
Let a computer's address space be 8K and its memory space be 4K.
Split the address space into 8 pages, each of size 1K.
The virtual address will then be 13 bits.
These 13 bits specify the page and the line within the page.
So, for 8 pages, 3 bits are required, and the remaining 10 LSBs give the line number within a page.
The memory space is divided into frames of the same size as the pages (1K), giving 4 frames or blocks.
The physical address is 12 bits.
For 4 frames, 2 bits are required, and the remaining 10 LSBs give the line number within a frame.
So, the line address is the same in the address space and the memory space.
Only a translation from page to frame is required, that is, from 3 bits to 2 bits.
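The 13-bit to 12-bit translation above can be sketched directly; the page-table contents here are a hypothetical mapping of 4 resident pages onto the 4 frames.

```python
# Sketch of the example: 8K address space (8 pages x 1K),
# 4K memory space (4 frames x 1K).

PAGE_SIZE = 1024                       # 1K lines; 10 LSBs pass through
page_table = {0: 3, 1: 1, 4: 0, 6: 2}  # page number -> frame number

def translate(virtual_address):
    page = virtual_address >> 10       # upper 3 bits: page number (0..7)
    line = virtual_address & 0x3FF     # lower 10 bits: line in the page
    if page not in page_table:
        raise KeyError("page fault")   # page not resident in main memory
    return (page_table[page] << 10) | line  # 2-bit frame + same 10 bits

print(translate(1 * PAGE_SIZE + 5))    # page 1, line 5 -> frame 1, line 5
```

Only the 3-bit page number is replaced by a 2-bit frame number; the 10-bit line address is copied unchanged, as stated above.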
Page Replacement Algorithms
If a page is required for processing but is not in main memory, and
the main memory is full, page replacement is required.
Various page replacement algorithms are:
First-in-first-out (FIFO)
Least Recently Used (LRU)
Optimal Algorithm
In case of LRU, replace the page that has not been used for longest
period of time.
In case of optimal algorithm, replace the page that will not be used for
the longest period of time.
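A small simulation can compare fault counts under FIFO and LRU; the reference string and the 3-frame memory are hypothetical.

```python
# Sketch: count page faults for FIFO and LRU on one reference string.

def count_faults(refs, frames, policy):
    memory, faults = [], 0
    for page in refs:
        if page in memory:
            if policy == "lru":        # refresh recency on a hit
                memory.remove(page)
                memory.append(page)
            continue
        faults += 1
        if len(memory) == frames:
            memory.pop(0)   # FIFO: oldest arrival; LRU: least recent use
        memory.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(count_faults(refs, 3, "fifo"), count_faults(refs, 3, "lru"))
```

The list keeps pages in arrival order for FIFO and in recency order for LRU, so popping index 0 evicts the right victim in both cases.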
Associative Memory
The time required to find an item stored in memory can be
reduced considerably if stored data can be identified for access by
the content of the data itself rather than by an address.
Associative Memory (AM) is also called Content Addressable
Memory (CAM): it is addressed by content, not by address.
When a word is written in an associative memory, no address is
needed.
The memory is also capable of finding an unused location to store
that word.
When a word is to be read from an associative memory, the
content of the word, or part of the word, is specified.
The memory locates all words which match the specified
content and marks them for reading.
Then all marked words can be read at a time.
So, associative memory is also known as Content addressable
memory, Parallel search memory, or Multiaccess memory.
Hardware Organization
[Figure: Argument Register (A) and Key Register (K) feeding an
Associative Memory array (M) of m words, n bits per word, with
Input, Read, and Write lines]
Each word in the memory is compared in
parallel with the contents of the Argument
register.
The Key register or Masking register
provides a mask for choosing a particular
field or key in the argument word.
The entire argument is compared with each
memory word if the Key register contains
all 1s.
Otherwise, only those bits in the argument
that have 1s in the corresponding positions
of the Key register are compared.
Thus, the masking register provides a mask
for identifying the whole or a piece of the
information, which specifies how the
reference to the memory is made.
The words that match the bits of the
Argument register, taking into
consideration the 1s in the Key register, set
the corresponding bits in the Match register.
After the matching process, those bits in
the Match register that have been set
indicate that their corresponding words
have been matched.
Reading is accomplished by accessing
the memory for those words whose
corresponding bits in the Match
register have been set.
Match Example:
A = 101111100
K = 111000000
Word 1 = 100111100
Word 2 = 101000001
Which word will match?
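One way to check the example is to code the masked comparison; the helper matches() is illustrative, not part of the hardware.

```python
# Sketch of the masked match: a word matches when it agrees with the
# Argument register A in every bit position where K holds a 1.

def matches(word, argument, key):
    """All bits selected by the key mask must be equal."""
    return all(a == w for a, w, k in zip(argument, word, key) if k == "1")

A = "101111100"
K = "111000000"                     # only the 3 high-order bits compared
print(matches("100111100", A, K))   # Word 1: third bit differs -> False
print(matches("101000001", A, K))   # Word 2: high 3 bits equal  -> True
```

So Word 2 matches: its masked bits 101 equal the masked bits of A, while Word 1 differs in the third position.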
Internal Organization of each Cell
Aj will be compared with Fij for matching when Kj = 1
Match Logic
Comparison logic for two words is required to find a match.
Neglecting the Key or Mask bits, word i is equal to the argument A if
Aj = Fij for j = 1, 2, 3, ..., n
Two bits are equal if both are 1 or both are 0.
The equality of two bits can be expressed logically by the Boolean function
xj = Aj Fij + Aj' Fij'   (xj = 1 if the bits are equal, else 0),
where ' denotes complement.
For word i to be equal to the argument A, all xj variables must equal 1.
This is the condition for setting the corresponding match bit Mi to 1.
So, the Boolean function will be Mi = x1 x2 x3 . . . xn. This is an AND
operation.
Now the key bit Kj is included in the comparison logic.
When Kj = 1, Aj and Fij need to be compared.
When Kj = 0, Aj and Fij need not be compared.
xj + Kj' = xj, if Kj = 1
xj + Kj' = 1,  if Kj = 0
Thus, the xj term can be rewritten as (xj + Kj').
Now the match logic Mi for word i will be
Mi = (x1 + K1') (x2 + K2') . . . (xn + Kn')
If Kj = 1, the term reduces to xj and will be either 0 or 1.
Only when all terms are 1 does a match occur, and Mi becomes 1.
If Kj = 0, the term is always 1; the degenerate case where every term
is 1 because all bits of the Key word are 0 does not arise, since the
Key word cannot be all 0s.
So, now Mi = Π (Aj Fij + Aj' Fij' + Kj'), the product taken over
j = 1, 2, ..., n.
This is the expression for one word, that is, word i.
m such functions are needed to compare m words,
for i = 1, 2, 3, . . ., m.
Diagram of Match Logic
Memory Interleaving
There may be simultaneous access to memory from two or
more sources due to parallel processing.
The memory can be partitioned into a number of modules
connected to a common memory address and data bus.
Each module has its own address buffer register (ABR) and
data buffer register (DBR).
Because each module has its own address and data registers, the
average rate of transfer of data is increased.
Consecutive words are placed in a module.
The high-order k bits of a memory address determine the module.
The low-order m bits of a memory address determine the word
within a module.
[Figure: MM address split into a k-bit Module field (high-order) and an
m-bit Address-in-module field; Modules 0 to n-1, each with its own
ABR and DBR]
When a block of words is transferred from main memory to cache,
only one module is busy at a time.
Only one module is kept busy by the CPU at a time.
Alternatively, arrange addressing so that successive words in the
address space are placed in consecutive modules.
[Figure: MM address split into an m-bit Address-in-module field
(high-order) and a k-bit Module field (low-order); Modules 0 to 2^k - 1,
each with its own ABR and DBR]
When requests for memory access involve consecutive addresses, the
accesses will be to consecutive modules.
While transferring a block of data, several memory modules can be
kept busy at the same time.
Since parallel access to these modules is possible, the average rate of
fetching words from the Main Memory can be increased.
This gives higher utilization of the memory system.
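The low-order interleaved address split can be sketched as follows; 4 modules (k = 2) and m = 10 address-in-module bits are hypothetical parameters.

```python
# Sketch of low-order interleaving: the low-order k bits select the
# module, so consecutive addresses fall in consecutive modules.

K_BITS, M_BITS = 2, 10
NUM_MODULES = 1 << K_BITS   # 4 modules

def decode(address):
    """Split an address into (module, address within the module)."""
    module = address & (NUM_MODULES - 1)          # low-order k bits
    word = (address >> K_BITS) & ((1 << M_BITS) - 1)  # remaining m bits
    return module, word

# Four consecutive addresses land in four different modules, so a
# block transfer can keep all of them busy at once:
print([decode(a)[0] for a in range(100, 104)])   # -> [0, 1, 2, 3]
```

With high-order interleaving the module field would come from the top bits instead, and the same four addresses would all hit one module.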