
THE MEMORY SYSTEM

(Chapter 5: Carl Hamacher, Zaky)


Basic concepts

❑ Maximum size of the memory depends on the addressing scheme:
◆ A 16-bit computer generates 16-bit addresses and can address up to 2^16 memory locations.
◆ The number of locations represents the size of the address space of a computer.
❑ Most modern computers are byte-addressable.
❑ Memory is designed to store and retrieve data in
word-length quantities.
◆ Word length of a computer is commonly defined
as the number of bits stored and retrieved in one
memory operation.
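To make the arithmetic concrete, here is a small sketch in C (mine, not from the slides) that computes the size of the address space from the number of address bits:

    #include <stdio.h>

    int main(void) {
        int k = 16;                          /* number of address bits    */
        unsigned long locations = 1UL << k;  /* size of the address space */
        printf("%d-bit addresses -> %lu locations\n", k, locations); /* 65536 */
        return 0;
    }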
Basic concepts (contd..)

[Figure: Connection of the memory to the processor. The processor's MAR drives a k-bit address bus and its MDR connects to an n-bit data bus; the memory has up to 2^k addressable locations and a word length of n bits. Control lines include R/W̄, MFC, etc.]

Recall that data transfers between the processor and the memory involve two registers, MAR and MDR.
If the address bus is k bits wide, then MAR is k bits long.
If the word length is n bits, then MDR is n bits long.
The control lines include R/W̄ and MFC.
For a Read operation R/W̄ = 1, and for a Write operation R/W̄ = 0.
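The read/write handshake can be sketched in C as follows; this is a simplified model assuming a 16-bit word length, and the names read_word and write_word are mine, not from the lecture:

    #include <stdint.h>

    #define K 16                    /* address bus width (k bits)      */
    #define LOCATIONS (1UL << K)    /* up to 2^k addressable locations */

    static uint16_t memory[LOCATIONS];  /* word length n = 16 bits here */

    /* R/W' = 1 (Read): the word addressed by MAR is placed in MDR;
     * the memory would assert MFC to signal that the data is ready. */
    uint16_t read_word(uint16_t mar) {
        uint16_t mdr = memory[mar];
        return mdr;
    }

    /* R/W' = 0 (Write): the word in MDR is stored at the address in MAR. */
    void write_word(uint16_t mar, uint16_t mdr) {
        memory[mar] = mdr;
    }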

Basic concepts (contd..)
❑ Measures for the speed of a memory:
◆ Elapsed time between the initiation of an operation and the completion of
an operation is the memory access time (e.g. the time between the Read
& MFC signals).
◆ Minimum time between the initiation of two successive memory
operations is memory cycle time (e.g. the time between two successive
Read operations)
◆ Memory Cycle time is slightly longer than memory access time
❑ In general, the faster a memory system, the costlier it is and the
smaller it is.
❑ An important design issue is to provide a computer system with as
large and fast a memory as possible, within a given cost target.
❑ Several techniques to increase the effective size and speed of the
memory:
◆ Cache memory (to increase the effective speed).
◆ Virtual memory (to increase the effective size).
Semiconductor RAM memories

❑ A Random Access Memory (RAM) is a memory unit in which any location can be accessed in a fixed amount of time, independent of the location's address.
❑ Internal organization of memory chips:
◆ Each memory cell can hold one bit of information.
◆ Memory cells are organized in the form of an array.
◆ One row is one memory word.
◆ All cells of a row are connected to a common line, known as the “word
line”.
◆ Word line is connected to the address decoder.
◆ Sense/write circuits are connected to the data input/output lines of the
memory chip.

Semiconductor RAM memories (contd..)
[Figure: Internal organization of a memory chip of size 128 bits, organized as 16 words of 8 bits each (16 × 8). Address lines A0–A3 drive an address decoder that selects one of the word lines W0–W15; each row of memory cells (flip-flops) stores one word. Sense/Write circuits connect the cells to the data input/output lines b7 … b1, b0, under control of the R/W̄ and CS (chip select) lines.]
Semiconductor RAM memories (contd..)
❑ Static RAMs (SRAMs):
◆ Consist of circuits that are capable of retaining their state as long as the
power is applied.
◆ Volatile memories, because their contents are lost when power is interrupted.
◆ Access times of static RAMs are in the range of a few nanoseconds.
◆ Capacity is low (each cell consists of 6 transistors)
◆ However, the cost is usually high.
❑ Dynamic RAMs (DRAMs):
◆ Do not retain their state indefinitely.
◆ Contents must be refreshed periodically to retain their state (or else they are erased).
◆ Contents may be refreshed while accessing them for reading.
◆ Access time is longer than SRAM
◆ Capacity is higher than SRAM (each cell consists of 1 transistor)
◆ Cost is lower than SRAM (as it has fewer transistors)

Semiconductor RAM memories (contd..)

❑ Data are transferred between the processor and the memory in units of a single word or a block.
❑ Speed and efficiency of data transfer have a large impact on the performance of a computer.
❑ Two metrics of performance: Latency and Bandwidth.
❑ Latency:
◆ Time taken to transfer a single word of data to or from
memory.
◆ Definition is clear if the memory operation involves transfer of a single
word of data.
◆ In the case of a block transfer, latency is the time it takes to transfer the first word of data.
◆ The time required to transfer the first word in a block is substantially longer than the time required to transfer the consecutive words in the block.
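As a rough model of this (an assumption on my part, not a formula from the slide), a block transfer pays the full first-word latency once, plus a shorter per-word time for each remaining word:

    #include <stdio.h>

    /* Rough model: first-word latency once, then a shorter per-word time
     * for each of the remaining words of the block. */
    double block_transfer_time(double t_first, double t_word, int words) {
        return t_first + (words - 1) * t_word;
    }

    int main(void) {
        /* hypothetical timings: 30 ns for the first word, 5 ns thereafter */
        printf("%.0f ns\n", block_transfer_time(30, 5, 16));  /* 105 ns */
        return 0;
    }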

Semiconductor RAM memories (contd..)

Bandwidth
❑ It is also useful to know how much time is needed to transfer a block of data.
❑ Since blocks can be variable in size, it is useful to define a performance measure in terms of the number of bits or bytes transferred in one second.
❑ This performance measure is referred to as memory bandwidth.
❑ Bandwidth of a memory unit depends on:
◆ Speed of access to the chip.
◆ How many bits can be accessed in parallel.
❑ Bandwidth of data transfer between processor and memory unit also
depends on:
◆ Transfer capability of the links that connect the processor and memory,
or the speed of the bus.
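A minimal numeric sketch of the first two factors; the figures below are hypothetical, not from the lecture:

    #include <stdio.h>

    /* Bandwidth = bits accessed in parallel / time of one access. */
    int main(void) {
        double bits_in_parallel = 64;
        double access_time = 10e-9;                    /* 10 ns, hypothetical */
        double bw = bits_in_parallel / access_time;    /* bits per second     */
        printf("bandwidth = %.1f Gbit/s\n", bw / 1e9); /* 6.4 Gbit/s          */
        return 0;
    }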

Memory systems

❑ Various factors such as cost, speed, power consumption, and size of the chip determine how a RAM is chosen for a given application.
❑ Static RAMs:
◆ Chosen when speed is the primary concern.
◆ Circuit implementing the basic cell is highly complex, so cost and size
are affected.
◆ Used mostly in cache memories.
❑ Dynamic RAMs:
◆ Predominantly used for implementing computer main memories.
◆ High densities available in these chips.
◆ Economically viable for implementing large memories.

Read-Only Memories (ROMs)
❑ SRAM and SDRAM chips are volatile:
◆ Lose the contents when the power is turned off.
❑ Many applications need memory devices to retain contents after the
power is turned off.
◆ For example, when the computer is turned on, the operating system must be loaded from the disk into the memory.
◆ Store instructions which would load the OS from the disk.
◆ Need to store these instructions so that they will not be lost after the
power is turned off.
◆ We need to store the instructions into a non-volatile memory.
❑ Non-volatile memory is read in the same manner as volatile memory.
◆ A separate writing process is needed to place information in this memory.
◆ Since normal operation involves only reading of data, this type of memory is called Read-Only Memory (ROM).

Read-Only Memories (contd..)

❑ Read-Only Memory:
◆ Data are written into a ROM when it is manufactured.
❑ Programmable Read-Only Memory (PROM):
◆ Allow the data to be loaded by a user.
◆ Process of inserting the data is irreversible.
◆ Storing information specific to a user in a ROM is expensive.
◆ Providing programming capability to a user may be better.
❑ Erasable Programmable Read-Only Memory (EPROM):
◆ Allows stored data to be erased and new data to be loaded.
◆ Flexibility, useful during the development phase of digital systems.
◆ Erasable, reprogrammable ROM.
◆ Erasure requires exposing the ROM to UV light.

Read-Only Memories (contd..)

❑ Electrically Erasable Programmable Read-Only Memory (EEPROM):


◆ To erase the contents of EPROMs, they have to be exposed to ultraviolet light and physically removed from the circuit.
◆ In EEPROMs, the contents can be stored and erased electrically, by applying a particular voltage.

Flash memory

❑ Flash memory takes a similar approach to EEPROM.
❑ It is possible to read the contents of a single cell, but writing is done for an entire block of cells.
❑ Flash devices have greater density.
◆ Higher capacity and low storage cost per bit.
❑ Power consumption of flash memory is very low, making it attractive
for use in equipment that is battery-driven.
❑ Single flash chips are not sufficiently large, so larger memory modules
are implemented using flash cards and flash drives.

Acronyms
RAM --Random Access Memory: time taken to access any arbitrary location in memory is constant (cf. disks)
ROM --Read Only Memory : ROMs are RAMs which can only be written to once; thereafter
they can only be read. Older uses included storage of bootstrap info

PROM --Programmable ROM : A ROM which can be bench programmed

EPROM --Erasable PROM: A PROM which can be erased for rewriting

EEPROM --Electrically EPROM: A PROM which can be erased electrically.

SRAM --Static RAM: RAM chip which retains its contents without refreshing as long as power is applied, but loses them upon power off

DRAM --Dynamic RAM: RAM chip whose contents need to be refreshed periodically.

SDRAM --Synchronous DRAM: RAM whose memory operations are synchronized with a clock
signal.

Memory hierarchy

• A big challenge in the design of a computer system is to provide a sufficiently large memory, with a reasonable speed, at an affordable cost.
• Static RAM:
– Very fast, but expensive, because a basic SRAM cell has a complex
circuit making it impossible to pack a large number of cells onto a
single chip.
• Dynamic RAM:
– Simpler basic cell circuit, hence are much less expensive, but
significantly slower than SRAMs.
• Magnetic disks:
– Storage provided by DRAMs is higher than SRAMs, but is still less
than what is necessary.
– Secondary storage such as magnetic disks provide a large amount
of storage, but is much slower than DRAMs.
Memory hierarchy (contd..)

• All these types of memory units are employed effectively in a computer.
– SRAM -- Smaller units where speed is critical, namely, cache memories.
– DRAM -- Large, yet affordable memory, namely, main memory.

– Magnetic disks -- Huge amount of cost-effective storage.

• Computer memory can be viewed as a hierarchy.

Memory hierarchy (contd..)
[Figure: The memory hierarchy, from processor registers at the top, through the L1 and L2 caches and the main memory, down to magnetic-disk secondary memory. Moving down the hierarchy, size increases while speed and cost per bit decrease.]
•Fastest access is to the data held in processor registers. Registers are at the top of the memory hierarchy.
•A relatively small amount of memory can be implemented on the processor chip. This is the processor cache.
•Two levels of cache: Level 1 (L1) cache is on the processor chip; Level 2 (L2) cache is between the main memory and the processor.
•The next level is the main memory, implemented as SIMMs. Much larger, but much slower than the cache memory.
•The next level is magnetic disks: a huge amount of inexpensive storage.
•Since the speed of memory access is critical, the idea is to bring the instructions and data that will be used in the near future as close to the processor as possible.
Cache memories
• Processor is much faster than the main memory.
– As a result, the processor has to spend much of its
time waiting while instructions and data are being
fetched from the main memory.
– Major obstacle towards achieving good performance.
• Speed of the main memory cannot be increased beyond a
certain point.
• Cache memory is an architectural arrangement which
makes the main memory appear faster to the processor
than it really is.
• Cache memory is based on the property of computer
programs known as “locality of reference”.
Locality of reference
• Analysis of programs indicates that many instructions in localized
areas of a program are executed repeatedly during some period of
time, while the others are accessed relatively less frequently.
– These instructions may be the ones in a loop, nested loop or few
procedures calling each other repeatedly.
– This is called “locality of reference”.
• Temporal locality of reference:
– Recently executed instruction is likely to be executed again very
soon.
• Spatial locality of reference:
– Instructions with addresses close to a recently executed
instruction are likely to be executed soon.

Suppose this is the content of the cache memory:

    Time Accessed             Content in Cache   Actual Location in Main Memory
    (when it was brought                         (location where the data is
    to Cache) [hh:mm:ss]                         actually stored)
    02:00:00                  Instruction X      1000
    02:00:01                  Instruction Y      2004
    02:00:02                  Instruction Z      3008
    02:05:00                  Instruction A      1008
    02:05:01                  Instruction B      1012
    02:00:05                  Instruction C      1016

Temporal locality of reference: the same instructions are executed again on every iteration of the loop.

    for (i = 1; i <= 10; i++)
    {
        Instruction X
        Instruction Y
        Instruction Z
        Instruction C
    }

Spatial locality of reference: the loop executes instructions stored at adjacent memory locations (1000, 1008, 1012, 1016).

    for (i = 1; i <= 10; i++)
    {
        Instruction X
        Instruction A
        Instruction B
        Instruction C
    }


Locality of reference (contd..)

• Cache memory is based on the concept of locality of reference.
– If active segments of a program are placed in a fast cache memory, then the
execution time can be reduced.

• Temporal locality of reference:


– Whenever an instruction or data is needed for the first time, it should be brought
into a cache. It will hopefully be used again repeatedly.

• Spatial locality of reference:


– Instead of fetching just one item from the main memory to the cache at a time,
several items that have addresses adjacent to the item being fetched may be useful.
– The term “block” refers to a set of contiguous address locations of some size.

Hit and Miss

• Focus on any two adjacent levels, called upper (closer to the CPU) and lower (farther from the CPU), in the memory hierarchy, because each block copy is always between two adjacent levels.
• Terminology:
– block: minimum unit of data moved between levels (usually one word)
– hit: the requested data is found in the upper level (e.g., data found in the cache)
– miss: the requested data is not in the upper level (e.g., data not in the cache, so go to the lower level, main memory or hard disk, to find it)
– hit rate: fraction of memory accesses that are hits (i.e., found at the upper level, say the cache)
– miss rate: fraction of memory accesses that are not hits (see the sketch below)
• miss rate = 1 – hit rate
– hit time: time to determine whether the access is a hit + time to get and deliver the data from the upper level (e.g., cache) to the CPU
– miss penalty: time to determine that the access is a miss + time to replace the block at the upper level (say, cache) with the corresponding block from the lower level (e.g., RAM) + time to deliver the block to the CPU
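A minimal bookkeeping sketch of these definitions; the counter values below are made up for illustration, not from the slides:

    #include <stdio.h>

    /* Hit/miss bookkeeping for one level of the hierarchy. */
    struct cache_stats {
        long hits;
        long misses;
    };

    double hit_rate(const struct cache_stats *s) {
        return (double)s->hits / (double)(s->hits + s->misses);
    }

    int main(void) {
        struct cache_stats l1 = { .hits = 950, .misses = 50 };  /* made-up counts */
        double h = hit_rate(&l1);
        printf("hit rate = %.2f, miss rate = %.2f\n", h, 1.0 - h); /* 0.95, 0.05 */
        return 0;
    }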

Cache memories

[Figure: Processor – Cache – Main memory.]

•When the processor issues a Read request, a block of words is transferred from the main memory to the cache, one word at a time.
•Subsequent references to the data in this block of words are found in the cache.
•At any given time, only some blocks in the main memory are held in the cache.
•Which blocks in the main memory are in the cache is determined by a “mapping function”.
•When the cache is full, and a block of words needs to be transferred from the main memory,
some block of words in the cache must be replaced. This is determined by a “replacement
algorithm”.

Cache memories (contd..)
•Existence of a cache is transparent to the processor. The processor issues Read and
Write requests in the same manner.

•If the data is in the cache it is called a Read or Write hit.

•Read hit:
- The data is obtained from the cache.

•Write hit: Whenever the processor wants to write a word, it checks to see whether the address it wants to write the data to is present in the cache or not. If the address is present in the cache, it is called a Write Hit. Two protocols are used:
- Contents of the cache and the main memory may be updated simultaneously. This
is called the write-through protocol.
- Update the contents of the cache, and mark it as updated by setting a bit known as
the dirty bit or modified bit. The contents of the main memory are updated when this
block is replaced (i.e. removed from cache and replaced with another block). This is
write-back or copy-back protocol.
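The two policies can be sketched in C as follows; the structure and array below are hypothetical, for illustration only:

    #include <stdbool.h>
    #include <stdint.h>

    /* A minimal sketch of the two write-hit policies described above. */
    struct cache_block {
        bool     dirty;      /* the dirty/modified bit used by write-back */
        uint32_t data[16];   /* 16 words per block, as in the examples    */
    };

    static uint32_t main_memory[1UL << 16];  /* toy main memory */

    /* Write-through: update the cache and the main memory together. */
    void write_through(struct cache_block *b, int word, uint32_t addr, uint32_t v) {
        b->data[word] = v;
        main_memory[addr] = v;
    }

    /* Write-back: update only the cache and set the dirty bit; main memory
     * is updated later, when this block is replaced. */
    void write_back(struct cache_block *b, int word, uint32_t v) {
        b->data[word] = v;
        b->dirty = true;
    }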

Cache memories (contd..)
•If the data is not present in the cache, then a Read miss or Write miss occurs.

•Read miss:
- Block of words containing this requested word is transferred from the memory.
- After the block is transferred, the desired word is forwarded to the processor.
- The desired word may also be forwarded to the processor as soon as it is
transferred without waiting for the entire block to be transferred. This is called
load-through or early-restart.

•Write-miss:
- If Write-through protocol is used, then the contents of the main memory are
updated directly.
- If the write-back protocol is used, the block containing the addressed word is first brought into the cache. Then the desired word in the cache is overwritten with the new information.

Mapping functions
• Mapping functions determine how memory
blocks are placed in the cache.
• Three mapping functions:
– Direct mapping
– Associative mapping
– Set-associative mapping.

Direct Mapped Cache
•Cache block position = (memory block number) mod (number of cache blocks)

•Placement of a block in the cache is determined


from the memory address. The memory address can
be divided into three fields. The low-order 4 bits
select one of 16 words in a block. When a new block
enters the cache, the 7-bit cache block field
determines the cache position in which this block
must be stored. The high-order 5 bits of the
memory address of the block are stored in 5 tag bits
associated with its location in the cache. They
identify which of the 32 blocks that are mapped into
this cache position are currently resident in the
cache.

•The high-order 5 bits of the address are compared with the tag bits associated with that cache location. If they match, then the desired word is in that block of the cache. If there is no match, then the block containing the required word must first be read from the main memory and loaded into the cache.
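A small sketch of how the fields described above are extracted from a 16-bit address (5 tag bits, 7 cache-block bits, 4 word bits); the example address is arbitrary:

    #include <stdint.h>
    #include <stdio.h>

    /* Split a 16-bit address into the direct-mapped fields:
     * 5 tag bits | 7 cache-block bits | 4 word bits. */
    int main(void) {
        uint16_t addr = 0xABCD;               /* arbitrary example address */
        unsigned word  =  addr        & 0xF;  /* low-order 4 bits          */
        unsigned block = (addr >> 4)  & 0x7F; /* next 7 bits: cache block  */
        unsigned tag   = (addr >> 11) & 0x1F; /* high-order 5 bits: tag    */
        printf("tag=%u block=%u word=%u\n", tag, block, word);
        return 0;
    }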
Associative mapping
[Figure: Associative-mapped cache. Any of the 4096 main-memory blocks (Block 0 … Block 4095) can be placed in any of the 128 cache block positions (Block 0 … Block 127), each with its own tag.]
•A main memory block can be placed into any cache position.
•The memory address is divided into two fields (if the main memory is addressable by a 16-bit address, the total number of memory locations = 2^16 = 65536):
- The low-order 4 bits identify the word within a block, so the number of memory locations per block = 2^4 = 16.
- The high-order 12 bits, the tag bits, identify a memory block when it is resident in the cache, so the number of blocks = 2^12 = 4096.
•Main memory address format: Tag (12 bits) | Word (4 bits).
•Flexible, and uses cache space efficiently.
•Replacement algorithms can be used to replace an existing block in the cache when the cache is full.
•Cost is higher than for a direct-mapped cache because of the need to search the tags of all cache positions to determine whether a given block is in the cache.
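A sketch of an associative lookup under the 12-bit-tag split above; the data structure is hypothetical:

    #include <stdbool.h>
    #include <stdint.h>

    #define CACHE_BLOCKS 128   /* Block 0 .. Block 127, as in the figure */

    struct assoc_block { bool valid; uint16_t tag; };
    static struct assoc_block cache[CACHE_BLOCKS];

    /* Associative lookup: the 12-bit tag of the address must be compared
     * with the tag of every cache block. Returns the block position on a
     * hit, or -1 on a miss. */
    int assoc_lookup(uint16_t addr) {
        uint16_t tag = addr >> 4;              /* high-order 12 bits  */
        for (int i = 0; i < CACHE_BLOCKS; i++) /* search all the tags */
            if (cache[i].valid && cache[i].tag == tag)
                return i;
        return -1;
    }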

Set-Associative mapping
Main
memory
Block 0 Set-associative mapping combination of direct and
Cache
associative mapping.
tag Block 1
Block 0 Blocks of cache are grouped into sets.
Set 0
tag Mapping function allows a block of the main
Block 1 memory to reside in any block of a specific set.
tag
Block 2
Assume that, No of Blocks of memory =4096.
Set 1 Divide the cache into 64 sets, with two blocks per set. (if cache size
tag Block 63
Block 3 128)
Block 64 Memory block 0, 64, 128 etc. map to set 0, and they
can occupy either of the two positions.
Block 65
16 bit Memory address is divided into three fields:
tag
- Low order 4 bits identify the word within a block.
Set 63
Block 126 - Middle 6 bit field determines the set number.
tag - High order 6 bit fields are compared to the tag
Block 127
fields of the two blocks in a set.
Block 127
Tag size = No of Blocks of memory / No of sets of cache =
Block 128 4096/64 = 64= 26

Tag Set Word Block 129


Number of blocks per set is a design parameter.
6 6 4
- One extreme is to have all the blocks in one
Main memory address set, requiring no set bits (fully associative
mapping).
Block 4095 - Other extreme is to have one block per set, is
the same as direct mapping.
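A sketch of a 2-way set-associative lookup using the field split above; the data structure is hypothetical:

    #include <stdbool.h>
    #include <stdint.h>

    #define SETS 64
    #define WAYS 2   /* two blocks per set, as in the example above */

    struct sa_block { bool valid; uint8_t tag; };
    static struct sa_block cache[SETS][WAYS];

    /* 2-way set-associative lookup for the split above:
     * 6 tag bits | 6 set bits | 4 word bits. Only the tags of the two
     * blocks in the selected set are compared. */
    int sa_lookup(uint16_t addr) {
        unsigned set = (addr >> 4)  & 0x3F;  /* middle 6 bits: set number */
        unsigned tag = (addr >> 10) & 0x3F;  /* high-order 6 bits: tag    */
        for (int w = 0; w < WAYS; w++)
            if (cache[set][w].valid && cache[set][w].tag == tag)
                return w;                    /* hit: way within the set   */
        return -1;                           /* miss                      */
    }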
Cache memories (contd..)
❑ A bit called as “valid bit” is provided for each block.
❑ If the block contains valid data, then the bit is set to 1, else it is 0.
❑ Valid bits are set to 0, when the power is just turned on.
❑ When a block is loaded into the cache for the first time, the valid bit is set to 1.
❑ Data transfers between the main memory and the disk (secondary memory) occur directly, bypassing the cache.
❑ When the data on a disk changes, the main memory block is also updated.
❑ However, if the data is also resident in the cache, then the valid bit is set to 0.

What happens if a transfer is made from the main memory to the disk, and the cache uses the write-back protocol?
❑ In this case, the data in the memory might not reflect the changes that may have been made
in the cached copy.
❑ The copies of the data in the cache, and the main memory are different. This is called the
cache coherence problem.
❑ One option is to force a write-back before the disk is updated from the main memory.

Replacement algorithms
When the cache is full and a block of words needs to be transferred from the main memory, some block of words in the cache must be replaced. This is determined by a “replacement algorithm”.

•In a direct-mapped cache, the position that each memory block occupies in the cache is fixed. As a result, the replacement strategy is trivial.

• Associative and set-associative mapping provide some flexibility in


deciding which memory block occupies which cache block.

When a new block is to be transferred to the cache, and all the positions it
may occupy are full, which block in the cache should be replaced?

Replacement algorithms
Locality of reference suggests that it may be okay to replace the block that
has gone the longest time without being referenced.

This block is called Least Recently Used (LRU) block, and the replacement
strategy is called LRU replacement algorithm.

LRU algorithm has been used extensively.


It provides poor performance in some cases.
Performance of the LRU algorithm may be improved by introducing a small
amount of randomness into which block might be replaced.

Other replacement algorithms include removing the “oldest” block (first-in, first-out).


However, they disregard the locality of reference principle and do not
perform as well.
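A minimal LRU sketch; the structure and the 4-way set size are my assumptions, not from the lecture. Each block records when it was last used, and the block with the oldest timestamp is evicted:

    #include <stdint.h>

    #define WAYS 4   /* assume a 4-way set, for illustration */

    struct lru_block { uint32_t tag; uint32_t last_used; };

    /* Pick the least recently used block of a full set for replacement:
     * the block with the oldest timestamp has gone the longest time
     * without being referenced. */
    int lru_victim(const struct lru_block set[WAYS]) {
        int victim = 0;
        for (int w = 1; w < WAYS; w++)
            if (set[w].last_used < set[victim].last_used)
                victim = w;
        return victim;
    }

    /* On every hit, the referenced block's timestamp is refreshed. */
    void touch(struct lru_block *b, uint32_t now) { b->last_used = now; }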

MEASURING CACHE PERFORMANCE

• Ideally, the entire memory hierarchy would appear to the processor as a single memory unit that has the access time of a cache on the processor chip and the size of a magnetic disk.
• How close we get to this ideal depends largely on the hit rate at different
levels of the hierarchy.
• High hit rates, well over 0.9, are essential for high-performance computers.
• Performance is adversely affected by the actions that must be taken after a
miss.
MEASURING CACHE PERFORMANCE

• The extra time needed to bring the desired information into the cache is
called the miss penalty.
• This penalty is ultimately reflected in the time that the processor is stalled
because the required instructions or data are not available for execution.
• In general, the miss penalty is the time needed to bring a block of data
from a slower unit in the memory hierarchy to a faster unit.
• The miss penalty is reduced if efficient mechanisms for transferring data
between the various units of the hierarchy are implemented.
MEASURING CACHE PERFORMANCE

• Consider now the impact of the cache on the overall performance of the
computer.
• Let h be the hit rate, M the miss penalty (that is, the time to access information in the main memory), and C the time to access information in the cache. The average access time (t_ave) experienced by the processor is:

t_ave = h × C + (1 − h) × M
If the computer has no cache, then, using a fast processor and a typical DRAM main memory, suppose it takes 10 clock cycles for each memory read access. Suppose 17 cycles are needed to load a block into the cache.
MEASURING CACHE PERFORMANCE

Assume that 30 percent of the instructions in a typical program perform a read or a write operation, which means that there are 130 memory accesses for every 100 instructions executed. Assume that the hit rates in the cache are 0.95 for instructions and 0.9 for data. Let us further assume that the miss penalty is the same for both read and write accesses. Then, a rough estimate of the improvement in performance that results from using the cache can be obtained as follows:

Time without cache / Time with cache
= (130 × 10) / (100 × (0.95 × 1 + 0.05 × 17) + 30 × (0.9 × 1 + 0.1 × 17))
= 1300 / 258 ≈ 5.04

This result suggests that the computer with the cache performs five times better.
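A quick check of this arithmetic, using only the numbers stated above:

    #include <stdio.h>

    int main(void) {
        /* Parameters from the example above. */
        double mem_access = 10.0, miss_penalty = 17.0, cache_access = 1.0;
        double h_instr = 0.95, h_data = 0.9;

        /* Per 100 instructions: 100 instruction fetches + 30 data accesses. */
        double without_cache = 130 * mem_access;                        /* 1300 */
        double with_cache =
            100 * (h_instr * cache_access + (1 - h_instr) * miss_penalty) +
             30 * (h_data  * cache_access + (1 - h_data)  * miss_penalty);

        printf("speedup = %.2f\n", without_cache / with_cache);  /* ~5.04 */
        return 0;
    }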
MEASURING CACHE PERFORMANCE

It is also interesting to consider how effective this cache is compared to an ideal cache that has a hit rate of 100 percent (in which case, all memory references take one cycle). Our rough estimate of relative performance for these caches is:

Time for real cache / Time for ideal cache = 258 / 130 ≈ 2

This means that the actual cache provides an environment in which the
processor effectively works with a large DRAM-based main memory that
appears to be only two times slower than the circuits in the cache.
IMPROVING CACHE PERFORMANCE

• From the speed point of view, the optimal place for a cache is on the
processor chip.
• Unfortunately, space on the processor chip is needed for many other
functions; this limits the size of the cache that can be accommodated.
• All high-performance processor chips include some form of a cache.
• Some manufacturers have chosen to implement two separate caches, one for
instructions and another for data, as in the 68040, Pentium III, and Pentium 4
processors.
• Others have implemented a single cache for both instructions and data, as in
the ARM710T processor.
IMPROVING CACHE PERFORMANCE

• A combined cache for instructions and data is likely to have a somewhat better hit rate because it offers greater flexibility in mapping new information into the cache.
• However, if separate caches are used, it is possible to access both caches at
the same time, which leads to increased parallelism and, hence, better
performance.
• The disadvantage of separate caches is that the increased parallelism comes
at the expense of more complex circuitry.
IMPROVING CACHE PERFORMANCE

• In high-performance processors two levels of caches are normally used.


• The L1 cache(s) is on the processor chip.
• The L2 cache, which is much larger, may be implemented externally using
SRAM chips.
• If both L1 and L2 caches are used, the L1 cache should be designed to allow
very fast access by the processor because its access time will have a large
effect on the clock rate of the processor.
• A cache cannot be accessed at the same speed as a register file because the cache is much bigger and, hence, more complex. A practical way to speed up access to the cache is to access more than one word simultaneously and then let the processor use them one at a time. This technique is used in many commercial processors.
IMPROVING CACHE PERFORMANCE

• The L2 cache can be slower, but it should be much larger to ensure a high hit
rate.
• Its speed is less critical because it only affects the miss penalty of the L1
cache.
• A workstation computer may include an L1 cache with the capacity of tens of
kilobytes and an L2 cache of several megabytes. Including an L2 cache further
reduces the impact of the main memory speed on the performance of a
computer.
IMPROVING CACHE PERFORMANCE

The average access time experienced by the processor in a system with two levels of caches is:

t_ave = h1 × C1 + (1 − h1) × h2 × C2 + (1 − h1) × (1 − h2) × M

where h1 and h2 are the hit rates of the L1 and L2 caches, C1 and C2 are their access times, and M is the time to access information in the main memory.
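A direct transcription of this formula; the numbers passed in main are illustrative only, not from the lecture:

    #include <stdio.h>

    /* Two-level average access time, following the formula above. */
    double t_ave2(double h1, double C1, double h2, double C2, double M) {
        return h1 * C1 + (1 - h1) * h2 * C2 + (1 - h1) * (1 - h2) * M;
    }

    int main(void) {
        /* illustrative numbers only (in cycles) */
        printf("t_ave = %.2f cycles\n", t_ave2(0.95, 1, 0.9, 10, 100)); /* 1.90 */
        return 0;
    }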
VIRTUAL MEMORY

• In most modern computer systems, the physical main memory is not as large
as the address space spanned by an address issued by the processor.
• For example, a processor that issues 32-bit addresses has an addressable
space of 4G bytes.
• The size of the main memory in a typical computer ranges from a few
hundred megabytes to 1G bytes.
• When a program does not completely fit into the main memory, the parts of
it not currently being executed are stored on secondary storage devices,
such as magnetic disks.
• Of course, all parts of a program that are eventually executed are first
brought into the main memory.
VIRTUAL MEMORY

• When a new segment of a program is to be moved into a full memory, it must replace another segment already in the memory.
• In modern computers, the operating system moves programs and data
automatically between the main memory and secondary storage. Thus, the
application programmer does not need to be aware of limitations imposed
by the available main memory.
• Techniques that automatically move program and data blocks into the
physical main memory when they are required for execution are called
virtual-memory techniques.
VIRTUAL MEMORY

• Virtual memory is used to increase the apparent size of the physical memory.
• Data are addressed in a virtual address space that can be as large as the
addressing capability of the processor.
• But at any given time, only the active portion of this space is mapped onto
locations in the physical memory. The remaining virtual addresses are
mapped onto the bulk storage devices used, which are usually magnetic disks.
VIRTUAL MEMORY

• Programs, and hence the processor, reference an instruction and data space
that is independent of the available physical main memory space.
• The binary addresses that the processor issues for either instructions or
data are called virtual or logical addresses. These addresses are translated
into physical addresses by a combination of hardware and software
components.
• If a virtual address refers to a part of the program or data space that is
currently in the physical memory, then the contents of the appropriate location
in the main memory are accessed immediately.
• On the other hand, if the referenced address is not in the main memory, its
contents must be brought into a suitable location in the memory before they
can be used.
VIRTUAL MEMORY

Figure 5.26 shows a typical organization that implements virtual memory. A special hardware unit, called the Memory Management Unit (MMU), translates virtual addresses into physical addresses. When the desired data (or instructions) are in the main memory, these data are fetched as described in our presentation of the cache mechanism. If the data are not in the main memory, the MMU causes the operating system to bring the data into the memory from the disk.
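As a rough sketch of what this translation amounts to (assuming 4 KB pages and a simple one-level page table; these details are not covered in the slides):

    #include <stdint.h>

    #define PAGE_BITS 12                        /* assume 4 KB pages        */
    #define NUM_PAGES (1UL << (32 - PAGE_BITS)) /* 32-bit virtual addresses */

    /* One-level page table: virtual page number -> physical frame number.
     * A real MMU consults such a table in hardware; frame 0 stands in for
     * "not present" in this toy version. */
    static uint32_t page_table[NUM_PAGES];

    int translate(uint32_t vaddr, uint32_t *paddr) {
        uint32_t vpn    = vaddr >> PAGE_BITS;               /* virtual page no. */
        uint32_t offset = vaddr & ((1UL << PAGE_BITS) - 1); /* byte in page     */
        uint32_t frame  = page_table[vpn];
        if (frame == 0)
            return -1;  /* page fault: OS must bring the page in from disk */
        *paddr = (frame << PAGE_BITS) | offset;
        return 0;
    }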
