Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR College: ABA, BLS
Memory Hierarchy
The memory hierarchy design in a computer system organizes its
different storage devices by speed, capacity, and cost. Most
computers are built with extra storage so that they can run
programs larger than the main memory alone would allow. The
memory hierarchy is commonly drawn as a hierarchical pyramid.
The design of the memory hierarchy is divided into two types:
primary (Internal) memory and secondary (External) memory.
Primary Memory
The primary memory is also known as internal memory, and it is
directly accessible by the processor. This memory includes main
memory, cache memory, and the CPU registers.
Secondary Memory
The secondary memory is also known as external memory, and it
is accessible by the processor through an input/output module.
This memory includes optical disks, magnetic disks, and
magnetic tape.
Contact: 7008443534, 9090042626
Characteristics of Memory Hierarchy
The memory hierarchy characteristics mainly include the
following.
Performance
Early computer systems were designed without a memory
hierarchy, and the speed gap between main memory and the
CPU registers grew because of the huge disparity in access
time, lowering the performance of the system. An enhancement
was therefore mandatory, and the memory hierarchy model was
designed to increase the system's performance.
Ability
The ability, or capacity, of a level in the memory hierarchy is
the total amount of data it can store. Whenever we move from
top to bottom in the memory hierarchy, the capacity increases.
Access Time
The access time in the memory hierarchy is the interval between
a request to read or write and the moment the data becomes
available. Whenever we move from top to bottom in the memory
hierarchy, the access time increases.
Cost per bit
When we move from bottom to top in the memory hierarchy, the
cost per bit increases, which means internal memory is
expensive compared with external memory.
Memory Hierarchy Properties
Information stored in a memory hierarchy satisfies the
following properties:
1) Inclusion
2) Coherence
3) Locality of reference
Inclusion Property
It is stated as: M1 ⊂ M2 ⊂ ... ⊂ Mn.
The set inclusion implies that all information is originally
stored in the outermost level Mn; if an information word is
found in a level Mi, then copies of the same word can also be
found in all the outer levels Mi+1, Mi+2, ..., Mn.
1. Access by word (4 bytes) from a cache block of 32 bytes,
such as a word from block A.
2. Access by block (32 bytes) from a memory page of 32
blocks, or 1 KB, such as block B from page B.
3. Access by page (1 KB) from a file consisting of many pages,
like pages A and B in segment F.
4. Segment transfer with a different number of pages.
Coherence Property
The coherence property requires that copies of the same
information item at successive memory levels be consistent.
If a word is modified in the cache, copies of that word must
be updated at the outer levels.
There are two strategies for maintaining coherence in a
memory hierarchy:
1. Write-through, which demands an immediate update: the
Mi+1 level of memory is updated as soon as a word is
modified in Mi.
2. Write-back, which delays the update at Mi+1 until the block
containing the modified word is replaced or removed from Mi.
Locality of reference
• Temporal Locality - Recently referenced items, such as
instructions or data, are likely to be referenced again
shortly; examples are iterative loops, process stacks,
temporary variables, and subroutines.
• Spatial Locality - This refers to the tendency of a
process to access items whose addresses are near one
another; for example, operations on tables or arrays
involve accesses to a clustered area of the address space.
• Sequential Locality - In typical programs the execution
of instructions follows a sequential order unless branch
instructions create out-of-order execution. The ratio of
in-order execution to out-of-order execution is roughly
5:1 in ordinary programs.
Cache Memory
Cache memory is a special, very high-speed memory. It is
used to speed up the CPU and keep pace with it. Cache memory
is costlier than main memory or disk memory but more
economical than CPU registers.
Cache memory is an extremely fast memory type that acts as
a buffer between RAM and the CPU. It holds frequently
requested data and instructions so that they are immediately
available to the CPU when needed.
Cache memory is used to reduce the average time to access
data from main memory. The cache is a smaller and faster
memory which stores copies of the data from frequently used
main memory locations. A CPU typically contains several
independent caches, which store instructions and data
separately.
Levels of memory:
• Level 1 or Registers –
Registers hold the data that the CPU is immediately working
on. Commonly used registers are the accumulator, the program
counter, and address registers.
• Level 2 or Cache memory –
It is the fastest memory, with the shortest access time, where
data is temporarily stored for faster access.
• Level 3 or Main Memory –
It is the memory on which the computer currently works. It is
small in size compared with secondary memory, and once power
is off the data no longer stays in this memory.
• Level 4 or Secondary Memory –
It is external memory which is not as fast as main memory,
but data stays permanently in this memory.
Cache Performance:
When the processor needs to read or write a location in main
memory, it first checks for a corresponding entry in the cache.
• If the processor finds that the memory location is in the
cache, a cache hit has occurred and the data is read from the cache.
• If the processor does not find the memory location in the
cache, a cache miss has occurred. For a cache miss, the cache
allocates a new entry and copies in data from main memory,
then the request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in
terms of a quantity called Hit ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
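The formula above can be written directly as a small function (a sketch; the function name is our own):

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses) = number of hits / total accesses."""
    total = hits + misses
    return hits / total if total else 0.0  # guard against zero accesses

# e.g. 45 hits and 5 misses over 50 accesses give a hit ratio of 0.9
```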
We can improve cache performance by using a larger cache block
size and higher associativity, and by reducing the miss rate,
the miss penalty, and the time to hit in the cache.
Minimizing Cache Misses
Most CPUs have first-level instruction and data caches on chip,
and many have second-level caches that are bigger but
somewhat slower. Memory accesses are much faster if the data
is already loaded into the first-level cache. When your program
accesses data that isn't in one of the caches, it gets a cache
miss. This causes a block of consecutively addressed words,
including the data that you just accessed, to be loaded into the
cache. Since cache misses are costly, you should try to
minimize them, using these tips:
• Keep frequently accessed data together. Store and access
frequently used data in flat, sequential data structures
and avoid pointer indirection. This way, the most
frequently accessed data remains in the first-level cache
as much as possible.
• Access data sequentially. Each cache miss brings in a
block of consecutively addressed words of needed data. If
you are accessing data sequentially then each cache miss
will bring in n words (where n is system dependent); if
you are accessing only every nth word, then you will be
constantly reading in unneeded data, degrading
performance.
• Avoid simultaneously traversing several large buffers of
data, such as an array of vertex coordinates and an array
of colors, within a loop, since there can be cache conflicts
between the buffers. Instead, pack the contents into one
buffer whenever possible. If you are using vertex arrays,
try to use interleaved arrays. (For more information on
vertex arrays see ``Rendering Geometry Efficiently''.)
However, if packing your data forces a big increase in the
size of the data, it may not be the right optimization for
your program.
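The last tip can be sketched in a few lines; the vertex and color arrays here are hypothetical stand-ins for the buffers discussed above:

```python
# Hypothetical per-vertex data held in two separate buffers.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

# Interleaving packs the items that are used together into one buffer,
# so a loop over the points touches one contiguous region instead of two.
interleaved = [item for pair in zip(vertices, colors) for item in pair]
# interleaved = [(0.0, 0.0), (255, 0, 0), (1.0, 0.0), (0, 255, 0), ...]
```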
Virtual Memory
The physical memory in a computer system is limited. If we try
to run a program which does not fit completely into main
memory, then the parts currently being executed are stored in
main memory and the remaining portion is stored on a
secondary device such as a hard disk.
In this case, all parts of the program that are needed for
execution must first be brought into main memory. When a new
segment of a program is brought in and memory is full, it must
replace another segment already in memory. The techniques
used to move program and data blocks into main memory when
they are required for execution are called virtual memory
techniques.
Virtual Memory Organization
• Programmers perceive a larger memory, which is allocated
on the disk; this memory is referred to as virtual
memory. A programmer enjoys a huge virtual space in which
to develop his or her program or software.
• A program or process references a data space which is
independent of the available physical memory space. The
address issued by the processor is called a virtual address.
• The virtual address can be translated into the physical
address by a combination of hardware and software
components.
• If a virtual address refers to a part of the program that
is in physical memory, it can be accessed immediately; if
the address is not in main memory, its content is brought
into main memory before it can be used.
• To improve performance, hardware is added to the system;
this hardware unit is known as the Memory Management
Unit (MMU).
• In a paging system, the page table helps us find the
physical address from the virtual address, since the virtual
address space is used to develop a process. The work of
the MMU is to translate virtual addresses to physical
addresses.
Paging and Virtual Memory Address Translation
The mechanism for reading data from memory involves
translating a virtual address into a physical address. The
concept of paging is used for this translation. The steps
for address translation are given below:
1. The virtual address space of the program is divided into
fixed-length units called pages.
2. Each page contains a block of words that occupy
consecutive locations in main memory. Each page is
mapped into a fixed-size region of main memory called a
page frame.
3. The address generated by the processor to fetch a word
from memory can be divided into two parts:
the high-order bits are interpreted as the virtual page
number, and these bits are used to fetch the corresponding
page frame number from the page table.
4. The low-order bits of the address give the offset, which
specifies the particular byte within the page.
5. By adding the virtual page number to the contents of the
page table base register, the address of the corresponding
entry in the page table is obtained.
6. Each page-table entry includes control bits that describe
the validity of the page, the status of the page, and whether
the page has been modified.
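The steps above can be sketched as a small translation routine; the 1 KB page size and the page-table contents are hypothetical:

```python
PAGE_BITS = 10                     # hypothetical 1 KB pages -> 10 offset bits
page_table = {0: 5, 1: 2, 2: 7}    # virtual page number -> page frame number

def translate(virtual_address):
    vpn = virtual_address >> PAGE_BITS                 # high-order bits: page number
    offset = virtual_address & ((1 << PAGE_BITS) - 1)  # low-order bits: offset
    frame = page_table[vpn]                            # page-table lookup
    return (frame << PAGE_BITS) | offset               # physical address

# Virtual address 0x0823 lies in page 2 with offset 0x23; page 2 is mapped
# to frame 7, so the physical address is (7 << 10) | 0x23.
```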
Advantages of virtual memory
1. A process larger than main memory can be executed
because of demand paging.
2. With virtual memory, many processes can be maintained
in main memory at once.
3. It allows a greater degree of multiprogramming by using
less space for each particular process.
Cache Mapping and Management Techniques:
There are three different types of mapping used for cache
memory: direct mapping, associative mapping, and
set-associative mapping. These are explained below.
1. Direct Mapping –
The simplest technique, known as direct mapping, maps each
block of main memory into only one possible cache line.
In direct mapping, each memory block is assigned to a specific
line in the cache. If a line is already occupied by a memory
block when a new block needs to be loaded, the old block is
discarded. The address is split into an index field and a tag
field: the cache stores the tag field, while the index field
selects the cache line.
Direct mapping's performance is directly proportional to the
hit ratio.
i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache
For purposes of cache access, each main memory address can
be viewed as consisting of three fields. The least significant w
bits identify a unique word or byte within a block of main
memory. In most contemporary machines, the address is at
the byte level. The remaining s bits specify one of the
2^s blocks of main memory. The cache logic interprets these s
bits as a tag of s - r bits (most significant portion) and a line
field of r bits. This latter field identifies one of the m = 2^r
lines of the cache.
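The field layout described above can be sketched as bit manipulation; the sizes chosen below (r = 8, w = 2) are hypothetical examples:

```python
def direct_map_fields(address, r, w):
    """Split an (s + w)-bit address into (tag, line, word) for a
    direct-mapped cache of m = 2**r lines with blocks of 2**w units."""
    word = address & ((1 << w) - 1)   # least significant w bits: word in block
    block = address >> w              # remaining s bits: block number j
    line = block & ((1 << r) - 1)     # line field of r bits: i = j modulo m
    tag = block >> r                  # most significant s - r bits: tag
    return tag, line, word

# 16-bit address, 4-word blocks (w = 2), 256-line cache (r = 8):
tag, line, word = direct_map_fields(0xB6C6, r=8, w=2)
```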
2. Associative Mapping –
In this type of mapping, associative memory is used to store
both the content and the address of each memory word. Any
block can go into any line of the cache. The word-id bits are
used to identify which word in the block is needed, and the
tag consists of all the remaining bits. This enables the
placement of any block at any place in the cache memory. It is
considered to be the fastest and the most flexible mapping
form.
3. Set-associative Mapping –
This form of mapping is an enhanced form of direct mapping
in which the drawbacks of direct mapping are removed. Set-
associative mapping addresses the problem of possible
thrashing in the direct mapping method. Instead of having
exactly one line that a block can map to in the cache, a few
lines are grouped together to create a set, and a block in
memory can map to any one of the lines of a specific set.
Set-associative mapping thus allows two or more blocks of
main memory that share the same index to be present in the
cache at the same time. It combines the best of the direct
and associative cache mapping techniques.
In this case, the cache consists of a number of sets, each of
which consists of a number of lines. The relationships are
m = v * k
i = j mod v
where
i = cache set number
j = main memory block number
v = number of sets
m = number of lines in the cache
k = number of lines in each set
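The relations above can be checked with a tiny sketch; the sizes v = 4 and k = 2 are hypothetical:

```python
def set_number(j, v):
    """Cache set for main-memory block j: i = j mod v."""
    return j % v

v, k = 4, 2    # 4 sets of 2 lines each
m = v * k      # m = v * k = 8 lines in total
# Block 13 maps to set 13 mod 4 = 1 and may occupy either of that set's k lines.
```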
Application of Cache Memory –
1. Usually, the cache memory can store a reasonable number of
blocks at any given time, but this number is small compared
to the total number of blocks in the main memory.
2. The correspondence between the main memory blocks and
those in the cache is specified by a mapping function.
Types of Cache –
Primary Cache –
A primary cache is always located on the processor chip. This
cache is small and its access time is comparable to that of
processor registers.
Secondary Cache –
Secondary cache is placed between the primary cache and the
rest of the memory. It is referred to as the level 2 (L2) cache.
Often, the Level 2 cache is also housed on the processor chip.
Locality of reference –
Since the size of cache memory is small compared to main
memory, locality of reference is used to decide which part of
main memory should be given priority and loaded into the
cache.
Types of Locality of reference
Spatial Locality of reference
This says that there is a good chance that an element in
close proximity to the reference point will be accessed
next, so the neighborhood of a referenced address is
brought in along with it.
Temporal Locality of reference
A recently used word is likely to be used again, so the
Least Recently Used algorithm is applied for replacement.
Also, when a miss occurs on a word, not only that word but
the complete block containing it is loaded, because by
spatial locality the neighboring words are likely to be
referenced next.
Memory Replacement policies
In an operating system that uses paging for memory management, a page
replacement algorithm is needed to decide which page should be replaced
when a new page comes in.
Page Fault – A page fault happens when a running program accesses a
memory page that is mapped into the virtual address space, but not loaded
in physical memory.
Since actual physical memory is much smaller than virtual memory, page
faults happen. In case of a page fault, the Operating System might have to
replace one of the existing pages with the newly needed page. Different page
replacement algorithms suggest different ways to decide which page to
replace. The target for all algorithms is to reduce the number of page faults.
Page Replacement Algorithms :
• First In First Out (FIFO) –
This is the simplest page replacement algorithm. In this algorithm,
the operating system keeps track of all pages in memory in a
queue, with the oldest page at the front of the queue. When a page
needs to be replaced, the page at the front of the queue is selected for
removal.
• Example-1: Consider the page reference string 1, 3, 0, 3, 5, 6, 3 with
3 page frames. Find the number of page faults.
Initially all slots are empty, so when 1, 3, 0 come they are allocated to
the empty slots —> 3 Page Faults.
When 3 comes, it is already in memory, so —> 0 Page Faults.
Then 5 comes; it is not available in memory, so it replaces the oldest
page, i.e. 1 —> 1 Page Fault.
6 comes; it is also not available in memory, so it replaces the oldest
page, i.e. 3 —> 1 Page Fault.
Finally, when 3 comes, it is not available, so it replaces 0 —> 1 Page
Fault. Total: 6 page faults.
Belady’s anomaly – Belady’s anomaly proves that it is possible to have
more page faults when increasing the number of page frames while using the
First in First Out (FIFO) page replacement algorithm. For example, if we
consider reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4 and 3 slots, we get 9
total page faults, but if we increase slots to 4, we get 10 page faults.
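Both the worked example and Belady's anomaly can be reproduced with a short FIFO simulation (a sketch; the function name is our own):

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults for FIFO replacement with the given number of frames."""
    memory, order, faults = set(), deque(), 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:          # memory full: evict the oldest page
                memory.discard(order.popleft())
            memory.add(page)
            order.append(page)
    return faults

# Reference string 1, 3, 0, 3, 5, 6, 3 with 3 frames gives 6 faults, and the
# string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4 gives 9 faults with 3 frames but
# 10 faults with 4 frames, reproducing Belady's anomaly.
```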
• Optimal Page replacement –
In this algorithm, the page that will not be used for the
longest duration of time in the future is replaced.
Example-2: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4,
2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
Initially all slots are empty, so when 7, 0, 1, 2 come they are
allocated to the empty slots —> 4 Page Faults.
0 is already there, so —> 0 Page Faults.
When 3 comes, it takes the place of 7, because 7 is not used for the
longest duration of time in the future —> 1 Page Fault.
0 is already there, so —> 0 Page Faults.
4 takes the place of 1 —> 1 Page Fault.
Now for the further page reference string —> 0 Page fault because
they are already available in the memory.
Optimal page replacement is perfect, but not possible in practice as
the operating system cannot know future requests. The use of
Optimal Page replacement is to set up a benchmark so that other
replacement algorithms can be analyzed against it.
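The optimal policy is easy to simulate offline, which is exactly how it is used as a benchmark (a sketch; the function name is our own):

```python
def optimal_faults(refs, frames):
    """Count faults for optimal (Belady) replacement: on a fault with full
    memory, evict the page whose next use lies farthest in the future."""
    memory, faults = set(), 0
    for i, page in enumerate(refs):
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            future = refs[i + 1:]
            # index of next use; pages never used again sort last
            def next_use(p):
                return future.index(p) if p in future else len(future)
            memory.discard(max(memory, key=next_use))
        memory.add(page)
    return faults

# Example-2: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 frames -> 6 faults.
```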
• Least Recently Used –
In this algorithm, the page that was least recently used is replaced.
Example-3: Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4,
2, 3, 0, 3, 2 with 4 page frames. Find the number of page faults.
Initially all slots are empty, so when 7, 0, 1, 2 come they are
allocated to the empty slots —> 4 Page Faults.
0 is already there, so —> 0 Page Faults.
When 3 comes, it takes the place of 7, because 7 is the least
recently used —> 1 Page Fault.
0 is already in memory, so —> 0 Page Faults.
4 takes the place of 1 —> 1 Page Fault.
Now for the further page reference string —> 0 Page fault because
they are already available in the memory.
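The LRU count above can be checked with a short simulation; an OrderedDict serves as the recency list (a sketch; the function name is our own):

```python
from collections import OrderedDict

def lru_faults(refs, frames):
    """Count page faults for Least Recently Used replacement."""
    recency, faults = OrderedDict(), 0    # keys ordered from LRU to MRU
    for page in refs:
        if page in recency:
            recency.move_to_end(page)     # hit: mark as most recently used
        else:
            faults += 1
            if len(recency) == frames:
                recency.popitem(last=False)   # evict the least recently used
            recency[page] = True
    return faults

# Example-3: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with 4 frames -> 6 faults.
```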