Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR            College: ABA, BLS
Memory Hierarchy
The memory hierarchy design in a computer system mainly
includes different storage devices. Most of the computers were
inbuilt with extra storage to run more powerfully beyond the
main memory capacity. The following memory hierarchy
diagram is a hierarchical pyramid for computer memory. The
designing of the memory hierarchy is divided into two types such
as primary (Internal) memory and secondary (External)
memory.
Primary Memory
The primary memory is also known as internal memory, and this
is accessible by the processor straightly. This memory includes
main, cache, as well as CPU registers.
Secondary Memory
The secondary memory is also known as external memory, and
this is accessible by the processor through an input/output
module. This memory includes an optical disk, magnetic disk,
and magnetic tape.
                                                          1|Page
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR            College: ABA, BLS
Characteristics of Memory Hierarchy
The memory hierarchy characteristics mainly include the
following.
Performance
Previously, the designing of a computer system was done
without memory hierarchy, and the speed gap among the main
memory as well as the CPU registers enhances because of the
huge disparity in access time, which will cause the lower
performance of the system. So, the enhancement was
mandatory. The enhancement of this was designed in the
memory hierarchy model due to the system’s performance
increase.
Ability
The ability of the memory hierarchy is the total amount of data
the memory can store. Because whenever we shift from top to
bottom inside the memory hierarchy, then the capacity will
increase.
Access Time
The access time in the memory hierarchy is the interval of the
time among the data availability as well as request to read or
write. Because whenever we shift from top to bottom inside the
memory hierarchy, then the access time will increase
Cost per bit
When we shift from bottom to top inside the memory hierarchy,
then the cost for each bit will increase which means an internal
Memory is expensive compared with external memory.
                                                          2|Page
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR               College: ABA, BLS
Memory Hierarchy Properties
Information stored in a memory hierarchy satisfies all the
following properties:-
   1) Inclusion
   2) Coherence
   3) Locality of reference
Inclusion Property
it states as: (M1CM2C...)
The set inclusive implies that all the information's are originally
stored in the outermost level Mi and the information word
found in Mn then copies of the same word can also be found in
all the upper levels Mi+1, Mi+2.
   1. Access by word(4 bytes) from a cache block of 32 bytes,
      such as block A
   2. Access by block (32 bytes) from a memory page of 32
      blocks or 1 KB such as block B from page B.
   3. Access by page 1 KB from a file consisting of many pages
      like A & B in segment F
   4. Segment Transfer with different No. of pages
Coherence Property
The coherence property requires that copies of the same
information item at successive memory level should be
consistent. If a word is modified in the cache copies of that
word must be updated immediately at the higher levels.
There are two strategies for maintaining coherence in a
memory hierarchy. The first method is,
   1. Write through which demand immediate update through
      broadcasting Mi+1 level of memory if a word is modified
      in Mi.
                                                             3|Page
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR               College: ABA, BLS
   2. Write back – the second method is written which delays
      the update in Mi.
Locality of references
   •   Temporal Locality - Recently referenced items like
       instructions or data are likely to be referred again shortly
       such as iterative loops process stacks, temporary
       variables or subroutines.
   •   Spatial Locality - This refers to the tendency for a
       process to access items whose addresses are nearer to
       another for example operations on tables or arrays involve
       access of a certain clustered area in the address space.
   •   Sequential Locality - In typical programs the execution
       of instructions follows a sequential order unless branch
       instructions create out of order execution. The ratio of in
       order execution to out of order execution is roughly 5:1 in
       ordinary programs.
Cache Memory
Cache Memory is a special very high-speed memory. It is
used to speed up and synchronizing with high-speed CPU.
Cache memory is costlier than main memory or disk memory
but economical than CPU registers.
Cache memory is an extremely fast memory type that acts as
a buffer between RAM and the CPU. It holds frequently
requested data and instructions so that they are immediately
available to the CPU when needed.
Cache memory is used to reduce the average time to access
data from the Main memory. The cache is a smaller and faster
memory which stores copies of the data from frequently used
main memory locations. There are various different
independent caches in a CPU, which store instructions and
data.
                                                             4|Page
Contact: 7008443534, 9090042626
    Subject: Computer System Architecture
    Created By: Asst. Prof. SK ABDUL ISRAR            College: ABA, BLS
    Levels of memory:
•   Level 1 or Register –
    It is a type of memory in which data is stored and accepted
    that are immediately stored in CPU. Most commonly used
    register is accumulator, Program counter, address register etc.
•   Level 2 or Cache memory –
    It is the fastest memory which has faster access time where
    data is temporarily stored for faster access.
•   Level 3 or Main Memory –
    It is memory on which computer works currently. It is small in
    size and once power is off data no longer stays in this
    memory.
•   Level 4 or Secondary Memory –
    It is external memory which is not as fast as main memory
    but data stays permanently in this memory.
    Cache Performance:
    When the processor needs to read or write a location in main
    memory, it first checks for a corresponding entry in the cache.
•   If the processor finds that the memory location is in the
    cache, a cache hit has occurred and data is read from cache
•   If the processor does not find the memory location in the
    cache, a cache miss has occurred. For a cache miss, the cache
    allocates a new entry and copies in data from main memory,
    then the request is fulfilled from the contents of the cache.
    The performance of cache memory is frequently measured in
    terms of a quantity called Hit ratio.
    Hit ratio = hit / (hit + miss) = no. of hits/total accesses
                                                              5|Page
    Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR              College: ABA, BLS
We can improve Cache performance using higher cache block
size, higher associativity, reduce miss rate, reduce miss
penalty, and reduce the time to hit in the cache.
Minimizing Cache Misses.
Most CPU's have first-level instruction and data caches on chip
and many have second-level cache(s) that are bigger but
somewhat slower. Memory accesses are much faster if the data
is already loaded into the first-level cache. When your program
accesses data that isn't in one of the caches, it gets a cache
miss. This causes a block of consecutively addressed words,
including the data that you just accessed, to be loaded into the
cache. Since cache misses are costly, you should try to
minimize them, using these tips:
   •   Keep frequently accessed data together. Store and access
       frequently used data in flat, sequential data structures
       and avoid pointer indirection. This way, the most
       frequently accessed data remains in the first-level cache
       as much as possible.
   •   Access data sequentially. Each cache miss brings in a
       block of consecutively addressed words of needed data. If
       you are accessing data sequentially then each cache miss
       will bring in n words (where n is system dependent); if
       you are accessing only every nth word, then you will be
       constantly reading in unneeded data, degrading
       performance.
   •   Avoid simultaneously traversing several large buffers of
       data, such as an array of vertex coordinates and an array
       of colors within a loop since there can be cache conflicts
       between the buffers. Instead, pack the contents into one
       buffer whenever possible. If you are using vertex arrays,
       try to use interleaved arrays. (For more information on
       vertex arrays see ``Rendering Geometry Efficiently''.)
       However, if packing your data forces a big increase in the
       size of the data, it may not be the right optimization for
       your program.
                                                            6|Page
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR               College: ABA, BLS
Virtual Memory
The physical memory which we are using in our computer
system is in megabytes not in gigabytes. If we try to run a
program which is not completely fit into main memory, then, in
this case, we try the parts of currently being executed are
stored in main memory and remaining portion is stored in a
secondary device such as hard disk.
If this case occurs, so it’s necessary that all parts of a program
which are needed for execution are first brought into the main
memory. When a new segment of a program is brought and
memory is full, it must replace another segment already in the
memory. So the techniques which are used to move program
and data block into the main memory when they required for
execution are called virtual memory techniques.
Virtual Memory Organization
   •   Programmers discern a larger memory which is allocated
       on the disk this memory is referred to as virtual
       memory. A programmer enjoys a huge virtual space to
       develop his or her program or software.
   •   Program or a processor references a data space which is
       independent of available physical memory space. The
       address issued by a processor is called a virtual address.
                                                             7|Page
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR                College: ABA, BLS
   •   The virtual address can be translated into the physical
       address by a combination of hardware and software
       components.
   •   If a virtual address is a part of the program in the physical
       memory then it can be accessed immediately and if the
       address is not in the main memory then its content will be
       brought into the main memory before it can be used.
   •   To improve the performance hardware are added to the
       system, these hardware units are known as Memory
       Management Unit (MMU).
   •   In the paging system, page table helps us to find the
       physical address form virtual address since the virtual
       address space is used to develop a process. So the work
       of the MMU is to translate the virtual address to physical
       address.
Paging and Virtual Memory Address Translation
The mechanism for reading a data from memory involves
translation of virtual address to physical address. In this
basically, the concept of paging is used for doing the
translation. Following steps are there for address translations
which are given below:
   1. Virtual address space of the program is divided into fixed
      length units called, pages.
   2. Each page will contain a block of words that will occupy
      the locations in the main memory. After this, each page
      will be mapped into the fixed location in the main memory
      called page frame.
   3. The address generated by the processors to fetch the
      word memory can be divided into the two parts:
       In this case, high order bits are interpreted as virtual page
       number and these bits are used to fetch corresponding
       page frame number from the page table.
                                                              8|Page
Contact: 7008443534, 9090042626
  Subject: Computer System Architecture
  Created By: Asst. Prof. SK ABDUL ISRAR              College: ABA, BLS
     4. Low order bits of the address gives the offset and it
        specifies as the particular byte within in a page.
     5. By adding this virtual page number to the contents of the
        content page table, the address of the corresponding
        entry in the page table is obtained.
     6. Each entry in a page includes a control bit that describes a
        validity of a page, the status of the page, and whether the
        page has been modified.
  Advantages of virtual memory
     1. A process which is larger than the main memory, it will be
        executed because of the demand paging.
     2. By the concept of a virtual memory, many processes can
        be maintained in the main memory.
     3. It will allow greater programming level by using only less
        space for a particular process.
  Cache Mapping and Management Techniques:
  There are three different types of mapping used for the
  purpose of cache memory which are as follows: Direct
  mapping, Associative mapping, and Set-Associative mapping.
  These are explained below.
1. Direct Mapping –
   The simplest technique, known as direct mapping, maps each
   block of main memory into only one possible cache line. or
   In Direct mapping, assign each memory block to a specific line
   in the cache. If a line is previously taken up by a memory
   block when a new block needs to be loaded, the old block is
   trashed. An address space is split into two parts index field
   and a tag field. The cache is used to store the tag field
   whereas the rest is stored in the main memory.
  Direct mapping`s performance is directly proportional to the
  Hit ratio.
                                                              9|Page
  Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR               College: ABA, BLS
      i = j modulo m
      where
      i=cache line number
      j= main memory block number
     m=number of lines in the cache
For purposes of cache access, each main memory address can
be viewed as consisting of three fields. The least significant w
bits identify a unique word or byte within a block of main
memory. In most contemporary machines, the address is at
the byte level. The remaining s bits specify one of the
2s blocks of main memory. The cache logic interprets these s
bits as a tag of s-r bits (most significant portion) and a line
field of r bits. This latter field identifies one of the m=2r lines
of the cache.
                                                           10 | P a g e
Contact: 7008443534, 9090042626
     Subject: Computer System Architecture
     Created By: Asst. Prof. SK ABDUL ISRAR            College: ABA, BLS
2.   Associative Mapping –
     In this type of mapping, the associative memory is used to
     store content and addresses of the memory word. Any block
     can go into any line of the cache. This means that the word id
     bits are used to identify which word in the block is needed, but
     the tag becomes all of the remaining bits. This enables the
     placement of any word at any place in the cache memory. It is
     considered to be the fastest and the most flexible mapping
     form.
3. Set-associative Mapping –
   This form of mapping is an enhanced form of direct mapping
   where the drawbacks of direct mapping are removed. Set
   associative addresses the problem of possible thrashing in the
   direct mapping method. It does this by saying that instead of
   having exactly one line that a block can map to in the cache,
   we will group a few lines together creating a set. Then a block
   in memory can map to any one of the lines of a specific
   set..Set-associative mapping allows that each word that is
   present in the cache can have two or more words in the main
   memory for the same index address. Set associative cache
   mapping combines the best of direct and associative cache
   mapping techniques.
   In this case, the cache consists of a number of sets, each of
   which consists of a number of lines. The relationships are
                                                             11 | P a g e
     Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR          College: ABA, BLS
      m=v*k
      i= j mod v
      where
      i=cache set number
      j=main memory block number
      v=number of sets
      m=number of lines in the cache number of sets
     k=number of lines in each set
                                                      12 | P a g e
Contact: 7008443534, 9090042626
  Subject: Computer System Architecture
  Created By: Asst. Prof. SK ABDUL ISRAR             College: ABA, BLS
  Application of Cache Memory –
1. Usually, the cache memory can store a reasonable number of
   blocks at any given time, but this number is small compared
   to the total number of blocks in the main memory.
2. The correspondence between the main memory blocks and
   those in the cache is specified by a mapping function.
  Types of Cache –
  Primary Cache –
  A primary cache is always located on the processor chip. This
  cache is small and its access time is comparable to that of
  processor registers.
  Secondary Cache –
  Secondary cache is placed between the primary cache and the
  rest of the memory. It is referred to as the level 2 (L2) cache.
  Often, the Level 2 cache is also housed on the processor chip.
  Locality of reference –
  Since size of cache memory is less as compared to main
  memory. So to check which part of main memory should be
  given priority and loaded in cache is decided based on locality
  of reference.
  Types of Locality of reference
       Spatial Locality of reference
       This says that there is a chance that element will be
       present in the close proximity to the reference point and
       next time if again searched then more close proximity to
       the point of reference.
       Temporal Locality of reference
       In this Least recently used algorithm will be used.
       Whenever there is page fault occurs within a word will not
       only load word in main memory but complete page fault
       will be loaded because spatial locality of reference rule
       says that if you are referring any word next word will be
       referred in its register that’s why we load complete page
       table so the complete block will be loaded.
                                                           13 | P a g e
  Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR                        College: ABA, BLS
Memory Replacement policies
In an operating system that uses paging for memory management, a page
replacement algorithm is needed to decide which page needs to be replaced
when new page comes in.
Page Fault – A page fault happens when a running program accesses a
memory page that is mapped into the virtual address space, but not loaded
in physical memory.
Since actual physical memory is much smaller than virtual memory, page
faults happen. In case of page fault, Operating System might have to replace
one of the existing pages with the newly needed page. Different page
replacement algorithms suggest different ways to decide which page to
replace. The target for all algorithms is to reduce the number of page faults.
Page Replacement Algorithms :
     • First In First Out (FIFO) –
        This is the simplest page replacement algorithm. In this algorithm,
        the operating system keeps track of all pages in the memory in a
        queue, the oldest page is in the front of the queue. When a page
        needs to be replaced page in the front of the queue is selected for
        removal.
  • Example-1Consider page reference string 1, 3, 0, 3, 5, 6 with 3 page
     frames. Find number of page faults.
                                                                    14 | P a g e
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR                           College: ABA, BLS
       Initially all slots are empty, so when 1, 3, 0 came they are allocated to
       the empty slots —> 3 Page Faults.
       when 3 comes, it is already in memory so —> 0 Page Faults.
       Then 5 comes, it is not available in memory so it replaces the oldest
       page slot i.e 1. —>1 Page Fault.
       6 comes, it is also not available in memory so it replaces the oldest
       page slot i.e 3 —>1 Page Fault.
       Finally when 3 come it is not avilable so it replaces 0 1 page fault
Belady’s anomaly – Belady’s anomaly proves that it is possible to have
more page faults when increasing the number of page frames while using the
First in First Out (FIFO) page replacement algorithm. For example, if we
consider reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4 and 3 slots, we get 9
total page faults, but if we increase slots to 4, we get 10 page faults.
   •   Optimal Page replacement –
       In this algorithm, pages are replaced which would not be used for the
       longest duration of time in the future.
          Example-2:Consider the page references 7, 0, 1, 2, 0, 3, 0, 4, 2, 3,
          0, 3, 2, with 4 page frame. Find number of page fault.
          Initially all slots are empty, so when 7 0 1 2 are allocated to the
          empty slots —> 4 Page faults
          0 is already there so —> 0 Page fault.
          when 3 came it will take the place of 7 because it is not used for the
          longest duration of time in the future.—>1 Page fault.
          0 is already there so —> 0 Page fault..
          4 will takes place of 1 —> 1 Page Fault.
                                                                        15 | P a g e
Contact: 7008443534, 9090042626
Subject: Computer System Architecture
Created By: Asst. Prof. SK ABDUL ISRAR                         College: ABA, BLS
          Now for the further page reference string —> 0 Page fault because
          they are already available in the memory.
          Optimal page replacement is perfect, but not possible in practice as
          the operating system cannot know future requests. The use of
          Optimal Page replacement is to set up a benchmark so that other
          replacement algorithms can be analyzed against it.
   •   Least Recently Used –
       In this algorithm page will be replaced which is least recently used.
           Example-3Consider the page reference string 7, 0, 1, 2, 0, 3, 0, 4,
           2, 3, 0, 3, 2 with 4 page frames. Find number of page faults.
          Initially all slots are empty, so when 7 0 1 2 are allocated to the
          empty slots —> 4 Page faults
          0 is already their so —> 0 Page fault.
          when 3 came it will take the place of 7 because it is least recently
          used —>1 Page fault
          0 is already in memory so —> 0 Page fault.
          4 will takes place of 1 —> 1 Page Fault
          Now for the further page reference string —> 0 Page fault because
          they are already available in the memory.
                                                                      16 | P a g e
Contact: 7008443534, 9090042626