UNIT 4
MEMORY ORGANIZATION
LAKSHMI R SURESH
ASSISTANT PROFESSOR
SOIT RGPV BHOPAL
What is memory and its hierarchy?
Computer memory is the storage space in which the computer keeps the data and
instructions needed for processing. It can be divided into five major levels,
based on use as well as speed.
A processor can move from one level to another on the basis of its
requirements.
The five levels in a system’s memory hierarchy are registers, cache memory, main
memory, magnetic disk, and magnetic tape.
What is Memory Hierarchy?
The memory unit is an essential component of a digital
computer. It is required for storing programs and data. Memory
is typically classified into two categories:
Primary or internal memory
It consists of CPU registers, Cache Memory, Main Memory, and these are
directly accessible by the processor.
Secondary or external memory
It consists of Magnetic Disk, Optical Disk, and Magnetic Tape, which are
accessible by the processor via an I/O module.
Memory Hierarchy-
• Memory hierarchy is the hierarchy of memory and storage devices found in a computer system.
• It ranges from the slowest but high capacity auxiliary memory to the fastest but low capacity cache
memory.
Level-0:
• At level-0, registers are present which are contained inside the CPU.
• Since they are present inside the CPU, they have the least access time.
• They are the most expensive and therefore the smallest in size (in KB).
• Registers are implemented using Flip-Flops.
Level-1:
• At level-1, Cache Memory is present.
• It stores the segments of program that are frequently accessed by the processor.
• It is expensive and therefore smaller in size (in MB).
• Cache memory is implemented using static RAM.
Level-2:
• At level-2, main memory is present.
• It can communicate directly with the CPU and with auxiliary memory devices through an I/O
processor.
• It is less expensive than cache memory and therefore larger in size (in few GB).
• Main memory is implemented using dynamic RAM.
Level-3:
• At level-3, secondary storage devices like Magnetic Disk are present.
• They are used as back up storage.
• They are cheaper than main memory and therefore much larger in size (in few TB).
Level-4:
• At level-4, tertiary storage devices like magnetic tape are present.
• They are used to store files on removable media.
• They are the cheapest and the largest in size (1-20 TB).
Goals of Memory Hierarchy-
The goals of memory hierarchy are-
• To obtain the highest possible average access speed
• To minimize the total cost of the entire memory system
Characteristics of Memory Hierarchy
The memory characteristics mainly include:
• Access Time: The interval between a read or write request and the
availability of the data.
• Capacity: The amount of information that can be stored. It increases
as we move from top to bottom in the hierarchy.
• Performance: Historically, designing a computer system without a
memory hierarchy resulted in a significant speed gap between main
memory and CPU registers, leading to lower system performance.
The memory hierarchy model was introduced to address this issue
and improve system performance.
• Cost Per Bit: Moving from bottom to top in the system's hierarchy,
the cost per bit increases, meaning that internal memory is costlier
than external memory.
MAIN MEMORY
Main memory is the central storage unit in a computer
system.
It is a relatively large and fast memory used to store
programs and data during the computer operation.
The principal technology used for the main memory is
based on semiconductor integrated circuits.
Integrated circuit RAM chips are available in two
possible operating modes, static and dynamic.
Static RAM – Consists of internal flip flops that store the binary
information.
Dynamic RAM – Stores the binary information in the form of
electric charges that are applied to capacitors.
Most of the main memory in a general purpose computer is made up of RAM
integrated circuit chips, but a portion of the memory may be constructed
with ROM chips.
Read Only Memory
ROM is used to store programs that are permanently resident in the computer and
tables of constants that do not change in value once the production of the
computer is completed.
The ROM portion of main memory is needed for storing an initial program
called a Bootstrap loader.
Bootstrap loader: its function is to start the computer software operating when
power is turned on.
The bootstrap program loads a portion of the operating system from disk to main
memory, and control is then transferred to the operating system.
RAM integrated circuit chips
The RAM integrated circuit chips are further classified into two possible operating
modes, static and dynamic.
A static RAM consists primarily of flip-flops that store the binary information.
The stored information is volatile, i.e. it remains valid only as long as power is applied to
the system.
The static RAM is easy to use and takes less time to perform read and write operations
than dynamic RAM.
The dynamic RAM stores the binary information in the form of electric charges that are applied
to capacitors.
The capacitors are provided inside the chip by MOS transistors.
The dynamic RAM consumes less power and provides large storage capacity in a single memory
chip.
RAM chips are available in a variety of sizes and are used as per the system requirement.
The following block diagram demonstrates the chip interconnection in a 128 * 8 RAM chip.
• A 128 * 8 RAM chip has a memory capacity of 128 words of eight bits (one byte) per word. This
requires a 7-bit address and an 8-bit bidirectional data bus.
• The 8-bit bidirectional data bus allows the transfer of data either from memory to CPU during
a read operation or from CPU to memory during a write operation.
• The read and write inputs specify the memory operation, and the two chip select (CS) control
inputs are for enabling the chip only when the microprocessor selects it.
• The bidirectional data bus is constructed using three-state buffers.
• The output generated by three-state buffers can be placed in one of the three possible states
which include a signal equivalent to logic 1, a signal equal to logic 0, or a high-impedance state.
From the functional table, we can conclude that the unit is in operation only
when CS1 = 1 and CS2 = 0.
The bar on top of the second select variable indicates that this input is
enabled when it is equal to 0.
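The chip-select behaviour described above can be sketched as a small Python model. This is only an illustration, not the chip's actual logic: the class name `Ram128x8`, the use of `None` to stand for the high-impedance state, and the choice that a read takes precedence when both RD and WR are asserted are all assumptions made for the example.

```python
class Ram128x8:
    """Toy model of a 128 x 8 RAM chip with two chip-select inputs."""

    HIGH_Z = None  # stands in for the high-impedance state of the data bus

    def __init__(self):
        self.cells = [0] * 128  # 128 words of 8 bits each

    def access(self, cs1, cs2_bar, rd, wr, address, data_in=0):
        # The chip responds only when CS1 = 1 and the active-low CS2 = 0.
        if not (cs1 == 1 and cs2_bar == 0):
            return self.HIGH_Z
        address &= 0x7F  # 7 address bits select one of 128 words
        if rd:  # assumption: read wins if both RD and WR are asserted
            return self.cells[address]
        if wr:
            self.cells[address] = data_in & 0xFF  # keep only 8 data bits
        return self.HIGH_Z  # the chip does not drive the bus on a write or when idle


ram = Ram128x8()
ram.access(cs1=1, cs2_bar=0, rd=0, wr=1, address=5, data_in=0xAB)
print(ram.access(cs1=1, cs2_bar=0, rd=1, wr=0, address=5))  # 171
print(ram.access(cs1=0, cs2_bar=0, rd=1, wr=0, address=5))  # None (high impedance)
```

Any other combination of CS1 and CS2 leaves the chip deselected, so the stored contents are neither read nor changed.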
ROM integrated circuit
The primary component of the main memory is RAM integrated circuit chips,
but a portion of memory may be constructed with ROM chips.
A ROM memory is used for keeping programs and data that are permanently
resident in the computer.
Apart from the permanent storage of data, the ROM portion of main memory is
needed for storing an initial program called a bootstrap loader.
The primary function of the bootstrap loader program is to start the computer
software operating when power is turned on.
ROM chips are also available in a variety of sizes and are also used as per the
system requirement.
The following block diagram demonstrates the chip interconnection in a 512 * 8
ROM chip.
• A ROM chip has a similar organization as a RAM chip. However, a ROM can
only perform read operation; the data bus can only operate in an output
mode.
• The 9-bit address lines in the ROM chip specify any one of the 512 bytes
stored in it.
• The value for chip select 1 and chip select 2 must be 1 and 0 for the unit to
operate. Otherwise, the data bus is said to be in a high-impedance state.
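The address widths quoted for the two chips (7 bits for 128 words, 9 bits for 512 words) follow from the number of words on the chip. A minimal sketch of that arithmetic, using a helper name `address_lines` chosen only for this example:

```python
def address_lines(words):
    """Number of address lines needed to select one of `words` locations."""
    # (words - 1) is the largest address; its bit length is the line count.
    return (words - 1).bit_length()


for name, words in [("128 x 8 RAM", 128), ("512 x 8 ROM", 512)]:
    print(name, "->", address_lines(words), "address lines")
# 128 x 8 RAM -> 7 address lines
# 512 x 8 ROM -> 9 address lines
```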
Auxiliary memory
Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access
storage in a computer system.
It is where programs and data are kept for long-term storage or when not in
immediate use. The most common auxiliary memory devices used in computer
systems are magnetic disks and tapes.
Magnetic Disks
A magnetic disk is a circular plate made of metal or plastic and
coated with magnetized material. Often both sides of the
disk are used, and multiple disks can be stacked on one spindle
with read/write heads available on each surface.
All disks rotate together at high speed and are not stopped or
started for access purposes.
Bits are stored on the magnetized surface in spots along concentric circles
known as tracks.
The tracks are commonly divided into sections known as sectors.
In this system, the smallest amount of data that can be transferred is a sector.
Magnetic Tape
A magnetic tape transport consists of the electrical, mechanical, and electronic
components that provide the parts and control mechanism for a magnetic tape
unit.
The tape is a strip of plastic coated with a magnetic recording medium.
Bits are recorded as magnetic spots on the tape along several tracks. Usually,
seven or nine bits are recorded simultaneously to form a character together with a
parity bit. Read/write heads are mounted one in each track so that
information can be recorded and read as a sequence of characters.
Magnetic tape units can be stopped, started to move forward or in
reverse, or rewound.
However, they cannot be started or stopped fast enough between individual
characters. For this reason, data is recorded in blocks called records. Gaps
of unrecorded tape are inserted between records, where the tape can be stopped.
The tape starts moving while in a gap and attains its constant speed by
the time it reaches the next record.
Each record on tape has an identification bit pattern at the beginning and end. By
reading the bit pattern at the beginning, the tape control identifies the record
number.
Associative Memory
An associative memory can be considered as a memory unit whose stored
data can be identified for access by the content of the data itself rather than
by an address or memory location.
Associative memory is often referred to as Content Addressable Memory
(CAM).
When a write operation is performed on associative memory, no address or
memory location is given to the word.
The memory itself is capable of finding an empty unused location to store
the word.
On the other hand, when the word is to be read from an associative
memory, the content of the word, or part of the word, is specified.
The words which match the specified content are located by the memory
and are marked for reading.
An associative memory consists of a memory array and logic for 'm' words with
'n' bits per word.
The functional registers, the argument register A and the key register K, each
have n bits, one for each bit of a word. The match register M consists of m
bits, one for each memory word.
The words which are kept in the memory are compared in parallel with the
content of the argument register.
The key register (K) provides a mask for choosing a particular field or key in
the argument word.
If the key register contains a binary value of all 1's, then the entire
argument is compared with each memory word.
Otherwise, only those bits in the argument that have 1's in their corresponding
position of the key register are compared.
Thus, the key provides a mask for identifying a piece of information which
specifies how the reference to memory is made.
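The masked comparison can be illustrated with a short Python sketch. It computes the match register M from the stored words, the argument register A and the key register K. The function name `match_register` and the 9-bit sample words are hypothetical, and the loop only imitates in software what the hardware does for all words in parallel.

```python
def match_register(words, argument, key):
    """Return the match bits M, one per stored word.

    A word matches when it agrees with the argument in every bit
    position where the key register holds a 1.
    """
    return [1 if ((word ^ argument) & key) == 0 else 0 for word in words]


memory = [0b100111100, 0b101000001, 0b101111100]  # hypothetical 9-bit words
A = 0b101111100  # argument register
K = 0b111000000  # key register: compare only the three leftmost bits

print(match_register(memory, A, K))            # [0, 1, 1]
print(match_register(memory, A, 0b111111111))  # [0, 0, 1]
```

With the three-bit key, the second and third words match because they share the field 101 with the argument. With a key of all 1's, only the word that is identical to the argument matches.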
UNIT 4
MODES OF TRANSFER
Interrupt-Initiated I/O
• An alternative to the CPU constantly monitoring the flag is to let the
interface inform the computer when it is ready to transfer data.
• This mode of transfer uses the interrupt facility. While the CPU is
running a program, it does not check the flag.
• However, when the flag is set, the computer is momentarily
interrupted from proceeding with the current program and is
informed of the fact that the flag has been set.
• The CPU deviates from what it is doing to take care of the input or
output transfer.
• After the transfer is completed, the computer returns to the
previous program to continue what it was doing before the interrupt.
• The CPU responds to the interrupt signal by storing the return
address from the program counter into a memory stack and then
control branches to a service routine that processes the required I/O
transfer.
• The way that the processor chooses the branch address of the
service routine varies from one unit to another.
• In principle, there are two methods for accomplishing this.
• One is called vectored interrupt and the other, non-vectored
interrupt. In a non-vectored interrupt, the branch address is
assigned to a fixed location in memory.
• In a vectored interrupt, the source that interrupts supplies the
branch information to the computer. This information is called the
interrupt vector.
• In some computers the interrupt vector is the first address of the
I/O service routine.
• In other computers the interrupt vector is an address that points
to a location in memory where the beginning address of the I/O
service routine is stored.
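The two interpretations of the interrupt vector mentioned above can be contrasted in a few lines of Python. This is a sketch under stated assumptions: the function name `service_routine_address`, the dictionary standing in for memory, and every address used are made up for illustration.

```python
def service_routine_address(vector, memory=None):
    """Resolve the branch address from an interrupt vector.

    memory=None : the vector is itself the first address of the routine.
    memory=dict : the vector points to a location holding that address.
    """
    if memory is None:
        return vector
    return memory[vector]


# Hypothetical addresses, chosen only for illustration.
print(hex(service_routine_address(0x0200)))                    # 0x200
print(hex(service_routine_address(0x0040, {0x0040: 0x0750})))  # 0x750
```

In the second call the CPU performs one extra memory read, but the service routine can be relocated by changing only the table entry.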
Priority Interrupt
• Data transfer between the CPU and an I/O device is initiated by the CPU.
However, the CPU cannot start the transfer unless the device is ready to
communicate with the CPU.
• The readiness of the device can be determined from an interrupt signal. The
CPU responds to the interrupt request by storing the return address from PC
into a memory stack and then the program branches to a service routine that
processes the required transfer.
• Some processors also push the current PSW (program status word) onto the
stack and load a new PSW for the service routine.
• In a typical application a number of I/O devices are attached to the
computer, with each device being able to originate an interrupt request. The
first task of the interrupt system is to identify the source of the interrupt.
• There is also the possibility that several sources will request service
simultaneously. In this case the system must also decide which device to
service first.
• A priority interrupt is a system that establishes a priority over the various sources to
determine which condition is to be serviced first when two or more requests arrive
simultaneously.
• The system may also determine which conditions are permitted to interrupt the
computer while another interrupt is being serviced.
• Higher-priority interrupt levels are assigned to requests which, if delayed or
interrupted, could have serious consequences.
• Devices with high speed transfers such as magnetic disks are given high priority, and
slow devices such as keyboards receive low priority.
• When two devices interrupt the computer at the same time, the computer services the
device with the higher priority first.
• Establishing the priority of simultaneous interrupts can be done by software or
hardware.
• A polling procedure is used to identify the highest-priority source by software means.
• In this method there is one common branch address for all interrupts.
• The program that takes care of interrupts begins at the branch address and polls the
interrupt sources in sequence.
• The highest-priority source is tested first, and if its interrupt
signal is on, control branches to a service routine for this source.
• Otherwise, the next-lower-priority source is tested, and so on.
Thus the initial service routine for all interrupts consists of a
program that tests the interrupt sources in sequence and
branches to one of many possible service routines.
• The particular service routine reached belongs to the highest-
priority device among all devices that interrupted the computer.
• The disadvantage of the software method is that if there are
many interrupts, the time required to poll them can exceed the
time available to service the I/O device. In this situation a
hardware priority-interrupt unit can be used to speed up the
operation.
• A hardware priority-interrupt unit functions as an overall
manager in an interrupt system environment.
• It accepts interrupt requests from many sources, determines
which of the incoming requests has the highest priority, and
issues an interrupt request to the computer based on this
determination.
• To speed up the operation, each interrupt source has its own
interrupt vector to access its own service routine directly. Thus no
polling is required because all the decisions are established by
the hardware priority-interrupt unit.
• The hardware priority function can be established by either a
serial or a parallel connection of interrupt lines. The serial
connection is also known as the daisy-chaining method.
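As a rough illustration of the software polling procedure described earlier in this section, the common routine can be sketched in Python as a loop over the sources in priority order. The function name `poll`, the device names, and the dictionary of flags are assumptions made for this example.

```python
def poll(interrupt_flags, priority_order):
    """Return the highest-priority source whose interrupt flag is set.

    priority_order lists the sources from highest to lowest priority,
    so the first pending source found is the one to service.
    """
    for source in priority_order:
        if interrupt_flags.get(source, False):
            return source
    return None  # no source is requesting service


order = ["disk", "printer", "reader", "keyboard"]  # hypothetical devices
flags = {"printer": True, "keyboard": True}
print(poll(flags, order))  # printer
```

The loop examines one source per iteration, which is why polling time grows with the number of sources, the disadvantage noted for the software method.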
Daisy-Chaining Priority
• The daisy-chaining method of establishing priority consists of a serial connection of all
devices that request an interrupt.
• The device with the highest priority is placed in the first position, followed by lower-
priority devices up to the device with the lowest priority, which is placed last in the chain.
• The interrupt request line is common to all devices and forms a wired logic connection.
• If any device has its interrupt signal in the low-level state, the interrupt line goes to the
low-level state and enables the interrupt input in the CPU.
• When no interrupts are pending, the interrupt line stays in the high-level state and no
interrupts are recognized by the CPU.
• This is equivalent to a negative-logic OR operation.
• The CPU responds to an interrupt request by enabling the interrupt acknowledge line.
This signal is received by device 1 at its PI (priority in) input.
• The acknowledge signal passes on to the next device through the PO (priority out)
output only if device 1 is not requesting an interrupt.
• If device 1 has a pending interrupt, it blocks the acknowledge signal from the next
device by placing a 0 in the PO output.
• It then proceeds to insert its own interrupt vector address (VAD) into the data bus for the
CPU to use during the interrupt cycle.
• A device with a 0 in its PI input generates a 0 in its PO output to inform the next-
lower-priority device that the acknowledge signal has been blocked.
• A device that is requesting an interrupt and has a 1 in its PI input will intercept the
acknowledge signal by placing a 0 in its PO output.
• If the device does not have pending interrupts, it transmits the acknowledge signal to
the next device by placing a 1 in its PO output.
• Thus the device with PI = 1 and PO = 0 is the one with the highest priority that is
requesting an interrupt, and this device places its VAD on the data bus.
• The daisy chain arrangement gives the highest priority to the device that receives
the interrupt acknowledge signal from the CPU.
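The PI/PO rules above can be traced with a small Python sketch. It is a behavioural illustration only: the function name `daisy_chain` and the list-of-bits representation of pending requests are assumptions, and device 0 is taken to be the device closest to the CPU.

```python
def daisy_chain(requests):
    """Propagate the acknowledge signal through a daisy chain.

    requests[i] is 1 if device i has a pending interrupt; device 0 is
    closest to the CPU. Returns the index of the acknowledged device
    (or None) and the list of (PI, PO) pairs seen by each device.
    """
    pi = 1  # the CPU enables the interrupt acknowledge line
    selected = None
    signals = []
    for i, req in enumerate(requests):
        # PO = 1 only if the acknowledge arrives and the device is not requesting.
        po = 1 if (pi == 1 and req == 0) else 0
        if pi == 1 and po == 0:
            selected = i  # this device places its VAD on the data bus
        signals.append((pi, po))
        pi = po  # PO of this device feeds PI of the next one
    return selected, signals


print(daisy_chain([0, 1, 1]))  # (1, [(1, 1), (1, 0), (0, 0)])
```

In the example, devices 1 and 2 both request service, but only device 1 sees PI = 1 and PO = 0, so it is the one acknowledged.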
Parallel Priority Interrupt
• The parallel priority interrupt method uses an interrupt register whose
individual bits are set by external conditions and cleared by program
instructions.
• The magnetic disk, being a high-speed device, is given
the highest priority. The printer has the next priority,
followed by a character reader and a keyboard.
• The mask register has the same number of bits as the
interrupt register. By means of program instructions, it is
possible to set or reset any bit in the mask register.
• Each interrupt bit and its corresponding mask bit are
applied to an AND gate to produce the four inputs to a
priority encoder.
• In this way an interrupt is recognized only if its
corresponding mask bit is set to 1 by the program. The
priority encoder generates two bits of the vector address,
which is transferred to the CPU.
• Another output from the encoder sets an interrupt status
flip-flop IST when an interrupt that is not masked occurs.
The interrupt enable flip-flop IEN can be set or cleared by
the program to provide an overall control over the interrupt
system.
• The output of IST ANDed with IEN provides a common
interrupt signal for the CPU.
• The interrupt acknowledge INTACK signal from the CPU
enables the bus buffers in the output register and a vector
address VAD is placed into the data bus.
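The mask-and-encode logic described above can be sketched in Python for the four sources. This is an illustration under assumptions: the function name `parallel_priority` is hypothetical, bit lists are ordered with the highest-priority source first, and the encoder is assumed to output the position of that source as a 2-bit code (0 for the highest priority).

```python
def parallel_priority(interrupt_bits, mask_bits, ien):
    """Model the mask register, priority encoder, IST and IEN.

    interrupt_bits, mask_bits: four bits each, highest priority first
    (disk, printer, reader, keyboard). Returns the 2-bit encoder code
    (or None if nothing is pending) and the interrupt signal to the CPU.
    """
    # AND each interrupt bit with its mask bit to form the encoder inputs.
    encoder_inputs = [i & m for i, m in zip(interrupt_bits, mask_bits)]
    ist = 1 if any(encoder_inputs) else 0  # set when an unmasked interrupt occurs
    code = encoder_inputs.index(1) if ist else None  # first 1 = highest priority
    return code, ist & ien


print(parallel_priority([0, 1, 0, 1], [1, 1, 1, 1], ien=1))  # (1, 1)
print(parallel_priority([0, 1, 0, 1], [1, 0, 1, 1], ien=1))  # (3, 1)
print(parallel_priority([0, 1, 0, 1], [1, 1, 1, 1], ien=0))  # (1, 0)
```

Masking the printer in the second call lets the keyboard request through, and clearing IEN in the third call blocks the interrupt signal to the CPU even though IST is set.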
UNIT 5
RISC and CISC Architecture
Reduced Instruction Set Computer or RISC Architecture
The fundamental goal of RISC is to make hardware simpler by
employing an instruction set that consists of only a few basic
steps for loading, evaluating, and storing operations.
For example, a load command only loads data and a store command only stores data.
Characteristics of RISC:
1. It has simpler instructions and thus simple instruction decoding.
2. More general-purpose registers.
3. Each instruction typically takes one clock cycle to execute.
4. Each instruction fits within a single word.
5. Pipeline can be easily achieved.
6. Few data types.
7. Simpler addressing modes.
Complex Instruction Set Computer or CISC Architecture
The fundamental goal of CISC is that a single instruction
handles all loading, evaluating, and storing operations. For
example, a single multiplication command loads the operands,
evaluates the product, and stores the result, which is why the
instruction set is complex.
Characteristics of CISC:
1. Instructions are complex, and thus it has complex instruction
decoding.
2. The instructions may take more than one clock cycle in order to
get executed.
3. An instruction may be larger than one word.
4. Fewer general-purpose registers, since operations can be
performed directly in memory.
5. More data types.
6. Complex addressing modes.
Pipelining in Computer Architecture
• A program consists of a number of instructions.
• These instructions may be executed in the following two ways-
1. Non-Pipelined Execution-
In non-pipelined architecture,
• All the instructions of a program are executed sequentially one after the other.
• A new instruction executes only after the previous instruction has executed
completely.
• This style of executing the instructions is highly inefficient.
Example-
Consider a program consisting of three instructions.
In a non-pipelined architecture, these instructions
execute one after the other as-
If time taken for executing one instruction = t, then-
Time taken for executing ‘n’ instructions = n x t
2. Pipelined Execution-
In pipelined architecture,
• Multiple instructions are executed in parallel.
• This style of executing the instructions is highly efficient.
Pipelined Architecture-
In pipelined architecture,
• The hardware of the CPU is split up into several functional units.
• Each functional unit performs a dedicated task.
• The number of functional units may vary from processor to processor.
• These functional units are called the stages of the pipeline.
• Control unit manages all the stages using control signals.
• There is a register associated with each stage that holds the data.
• There is a global clock that synchronizes the working of all the stages.
• At the beginning of each clock cycle, each stage takes the input from its
register.
• Each stage then processes the data and feeds its output to the register of the
next stage.
Four-Stage Pipeline-
In four stage pipelined architecture, the execution of each
instruction is completed in following 4 stages-
1. Instruction fetch (IF)
2. Instruction decode (ID)
3. Instruction Execute (IE)
4. Write back (WB)
To implement four stage pipeline,
• The hardware of the CPU is divided into four functional units.
• Each functional unit performs a dedicated task.
Stage-01:
• First functional unit performs instruction fetch.
• It fetches the instruction to be executed.
Stage-02:
• Second functional unit performs instruction decode.
• It decodes the instruction to be executed.
Stage-03:
• Third functional unit performs instruction execution.
• It executes the instruction.
Stage-04:
• Fourth functional unit performs write back.
• It writes back the result so obtained after executing the instruction.
Execution-
In pipelined architecture,
• Instructions of the program execute in parallel.
• When one instruction goes from nth stage to (n+1)th stage, another instruction goes from (n-1)th stage
to nth stage.
Phase-Time Diagram-
• Phase-time diagram shows the execution of instructions in the pipelined architecture.
• The following diagram shows the execution of three instructions in four stage pipeline architecture.
• Time taken to execute three instructions in four stage pipelined architecture = 6 clock cycles.
NOTE-
In non-pipelined architecture,
Time taken to execute three instructions would be
= 3 x Time taken to execute one instruction
= 3 x 4 clock cycles
= 12 clock cycles
Clearly, pipelined execution of instructions is far more
efficient than non-pipelined execution.
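The cycle counts above can be reproduced with a short Python sketch. It assumes an ideal pipeline in which every stage takes exactly one clock cycle and there are no stalls, so n instructions (n at least 1) on k stages finish in k + n - 1 cycles. The function names are chosen only for this example.

```python
STAGES = ["IF", "ID", "IE", "WB"]


def non_pipelined_cycles(n, k=len(STAGES)):
    """Each instruction uses all k stages before the next one starts."""
    return n * k


def pipelined_cycles(n, k=len(STAGES)):
    """The first instruction takes k cycles; each later one adds one cycle."""
    return k + n - 1


def phase_time_diagram(n):
    """Build a text phase-time diagram: one row per instruction."""
    total = pipelined_cycles(n)
    rows = []
    for i in range(n):
        cells = ["--"] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(f"I{i + 1}: " + " ".join(cells))
    return "\n".join(rows)


print(phase_time_diagram(3))
# I1: IF ID IE WB -- --
# I2: -- IF ID IE WB --
# I3: -- -- IF ID IE WB
print(pipelined_cycles(3), non_pipelined_cycles(3))  # 6 12
```

Each column of the printed diagram is one clock cycle, so the six columns correspond to the 6 clock cycles quoted for three instructions.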