Processor Organization
Topics to be covered
• Instruction Formats
• Instruction Sets
• Addressing Modes
• Assembly Language
• Processor Organization
• Register Organization
• Instruction Cycle – Pipelining
• RISC vs CISC
• Instruction Level Parallelism
Basic Computer Model
ENIAC
• On the ENIAC, all programming was done at the digital logic
level
• Programming the computer involved moving plugs and wires
• Configuring the ENIAC to solve a “simple” problem required
many days of labor by skilled technicians
• A different hardware configuration was needed to solve every
unique problem type
The von Neumann Model
• The invention of stored program computers has been
ascribed to a mathematician, John von Neumann
• Stored-program computers have become known as von
Neumann Architecture systems
Stored-program computers have the following characteristics:
• Three hardware systems:
• A central processing unit (CPU)
• A main memory system
• An I/O system
• Provides the capacity to carry out sequential instruction
processing
The von Neumann Model
Computers employ a fetch-decode-execute cycle to run
programs
Computer Organization
• The organization of the computer is defined
by its internal registers, the timing and
control structure and the set of instructions
that it uses.
• The internal organization of a digital system
is defined by the sequence of
microoperations it performs on data stored in
its registers.
Stored Program Organization
[Figure: stored program organization. A memory of 4096 x 16 holds instructions (program) and operands (data). Instruction format: bits 15-12 opcode, bits 11-0 address. Operand format: bits 15-0 binary operand. Processor register: accumulator (AC).]
Instruction Codes
Program:
– A sequence of (machine) instructions
(Machine) Instruction:
– A group of bits that tells the computer to perform a specific operation
(a sequence of micro-operations)
• The instructions of a program, along with any needed data are
stored in memory
• The CPU reads each instruction from memory
• It is placed in an Instruction Register (IR)
• Control circuitry in control unit then translates the instruction
into the sequence of micro operations necessary to implement
it.
Instruction Code
• Instruction Code
• An instruction code is a group of bits that instruct the computer to
perform a specific operation.
• Example: ADD1547
• A unique binary code is assigned to every opcode
• Operation Code (Opcode)
• The operation code of an instruction is a group of bits that define
such operations as add, subtract, multiply, shift, and complement.
• The number of bits required for the operation code of an
instruction depends on the total number of operations available in
the computer.
• The operation code must consist of at least n bits for a given 2^n (or
fewer) distinct operations.
Stored Program Organization
• The simplest way to organize a computer is to have one
processor register(AC) and an instruction code format with
two parts.
• The first part specifies the operation (opcode) to be
performed and the second specifies an address (operand).
• The memory address tells the control where to find an
operand in memory.
• This operand is read from memory and used as the data to be
operated on together with the data stored in the processor
register.
Stored Program Organization
• Instructions are stored in one section of memory
and data in another.
• For a memory unit with 4096 words, we need 12 bits
to specify an address since 2^12 = 4096.
• If we store each instruction code in one 16-bit
memory word, we have available four bits for
operation code (opcode) to specify one out of 16
possible operations, and 12 bits to specify the
address of an operand.
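The field-width arithmetic above can be checked with a short sketch. The helper name `bits_needed` and the constant names are illustrative assumptions, not part of the slides.

```python
def bits_needed(count: int) -> int:
    """Smallest n such that 2**n >= count (hypothetical helper, at least 1 bit)."""
    if count < 1:
        raise ValueError("count must be positive")
    return max(1, (count - 1).bit_length())


MEMORY_WORDS = 4096   # from the slide: 4096-word memory
WORD_BITS = 16        # from the slide: one instruction per 16-bit word

address_bits = bits_needed(MEMORY_WORDS)   # bits needed to address every word
opcode_bits = WORD_BITS - address_bits     # bits left over for the opcode
max_operations = 2 ** opcode_bits          # distinct operations the opcode can name

print(address_bits, opcode_bits, max_operations)
```

With these values the sketch reports 12 address bits, 4 opcode bits, and 16 possible operations, matching the slide.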
Stored Program Organization
• The control reads a 16-bit instruction from the
program portion of memory.
• It uses the 12-bit address part of the instruction to
read a 16-bit operand from the data portion of
memory.
• It then executes the operation specified by the
operation code.
Instruction format of basic
computer
Instruction Format
Bit 15: I (mode bit) | Bits 14-12: Opcode | Bits 11-0: Address
Example: 0 001 0100 0101 0111 (I = 0, opcode = 001, address bits = 457 in hexadecimal)
Add Instruction – ADD 457
Direct & Indirect Addressing of Memory
• If the second part of an instruction format
specifies the address of an operand, the
instruction is said to have a direct address.
• In an indirect address instruction, the bits in the second
part of the instruction designate the address
of a memory word in which the address of
the operand is found.
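Direct & Indirect Addressing of Memory
[Figure: two memory diagrams. Direct: address 22 holds "0 ADD 457"; the operand at address 457 is added to AC. Indirect: address 35 holds "1 ADD 300"; address 300 holds 1350; the operand at address 1350 is added to AC.]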
Direct & Indirect Addressing of Memory
• One bit of the instruction code can be used to
distinguish between a direct and an indirect
address.
• The instruction format consists of a 3-bit operation code, a 12-bit address,
and an indirect address mode bit designated by I.
• The mode bit is 0 for a direct address and 1 for
an indirect address.
Direct & Indirect Addressing of Memory
• A direct address instruction is placed at address 22 in
memory.
• The I bit is 0, so the instruction is recognized as a
direct address instruction.
• The opcode specifies an ADD instruction, and the
address part is the binary equivalent of 457.
• The control finds the operand in memory at address
457 and adds it to the content of AC.
Direct & Indirect Addressing of Memory
• The instruction in address 35 has a mode bit I = 1,
recognized as an indirect address instruction.
• The address part is the binary equivalent of 300.
• The control goes to address 300 to find the address of
the operand.
• The address of the operand in this case is 1350.
• The operand found in address 1350 is then added to the
content of AC.
Direct & Indirect Addressing of Memory
• The indirect address instruction needs two
references to memory to fetch an operand.
• The first reference is needed to read the address of
the operand.
• Second reference is for the operand itself.
• The memory word that holds the address of the
operand in an indirect address instruction is used as
a pointer to an array of data.
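The difference between the two addressing modes can be sketched as follows. The addresses 457, 300, and 1350 come from the slides; the function name `fetch_operand` and the operand values 25 and 40 are made up for illustration.

```python
def fetch_operand(memory, address, indirect):
    """Return (operand, number_of_memory_references) for one operand fetch."""
    references = 0
    if indirect:
        # First reference: the address part points at a word holding the operand's address.
        address = memory[address]
        references += 1
    # Final reference: read the operand itself.
    operand = memory[address]
    references += 1
    return operand, references


# Hypothetical memory contents: only the addresses match the slides.
memory = {457: 25, 300: 1350, 1350: 40}

print(fetch_operand(memory, 457, indirect=False))  # direct: one reference
print(fetch_operand(memory, 300, indirect=True))   # indirect: two references
```

The direct fetch touches memory once and the indirect fetch twice, which is the point made in the last slide of this group.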
Computer Registers
Symbol  Bits  Register name         Function
PC      12    Program Counter       Holds address of instruction
AR      12    Address Register      Holds address for memory
IR      16    Instruction Register  Holds instruction code
TR      16    Temporary Register    Holds temporary data
DR      16    Data Register         Holds memory operand
AC      16    Accumulator           Processor register
OUTR    8     Output Register       Holds output character
INPR    8     Input Register        Holds input character
Memory: 4096 words, 16 bits per word
Computer Instructions
Unit – 2: Basic Computer Organization, Darshan Institute of Engineering & Technology
Instruction Set Completeness
• Instruction set is said to be complete if it includes
sufficient number of instructions in each of the following
categories:
1. Arithmetic, logical and shift instructions
2. Instructions for moving information to and from memory and
processor registers
3. Program control instructions together with instructions that
check status conditions
4. Input and output instructions
Types of Computer Instructions
1. Memory Reference Instruction
Format: bit 15 = I, bits 14-12 = opcode (000 to 110), bits 11-0 = address
Hex code (I = 0)  Hex code (I = 1)  Symbol  Description
0xxx              8xxx              AND     AND the content of memory to AC
1xxx              9xxx              ADD     Add the content of memory to AC
2xxx              Axxx              LDA     Load memory word to AC
3xxx              Bxxx              STA     Store content of AC in memory
4xxx              Cxxx              BUN     Branch unconditionally
5xxx              Dxxx              BSA     Branch and save return address
6xxx              Exxx              ISZ     Increment and skip if zero
Types of Computer Instructions
2. Register Reference Instruction
Format: bits 15-12 = 0111, bits 11-0 = register operation (one bit set per operation)
Examples: 0111 1000 0000 0000 = 7800 (CLA); 0111 0000 0001 0000 = 7010 (SPA)
Hex code  Symbol  Description
7800      CLA     Clear AC
7400      CLE     Clear E
7200      CMA     Complement AC
7100      CME     Complement E
7080      CIR     Circulate right AC and E
7040      CIL     Circulate left AC and E
7020      INC     Increment AC
7010      SPA     Skip next instruction if AC is positive
7008      SNA     Skip next instruction if AC is negative
7004      SZA     Skip next instruction if AC is zero
7002      SZE     Skip next instruction if E is zero
7001      HLT     Halt computer
Types of Computer Instructions
3. Input – Output Instruction
Format: bits 15-12 = 1111, bits 11-0 = I/O operation (one bit set per operation)
Example: 1111 1000 0000 0000 = F800 (INP)
Hex code  Symbol  Description
F800      INP     Input character to AC
F400      OUT     Output character from AC
F200      SKI     Skip on input flag
F100      SKO     Skip on output flag
F080      ION     Interrupt on
F040      IOF     Interrupt off
Exercise 1
A computer uses a memory unit with 256K words
of 32 bits each. The instruction has four parts: an
indirect bit, an operation code and a register
code part to specify one of 64 registers and an
address part.
a. How many bits are there in the operation code,
the register code part and the address part?
b. Draw the instruction word format and indicate
the number of bits in each part.
c. How many bits are there in data and address
inputs of the memory?
Exercise 2
• What is difference between a direct and an
indirect address instruction?
• How many references to memory are needed
for each type of instruction to bring an
operand into a processor register?
Common bus system of basic computer
High-level programming languages
• These are special languages developed to reflect the procedures used in the
solution of a problem rather than be concerned with the computer
hardware behavior, e.g. Fortran, C++, Java, etc.
• The program is written as a sequence of statements in a form that people
prefer to think in when solving a problem.
• However, each statement must be translated into a sequence of binary
instructions before the program can be executed in a computer.
• The program that translates a high-level language program to binary is
called a compiler.
INTEGER A, B, C
DATA A /83/, B /-23/
C = A + B
END
Assembly Language
• The user employs symbols (letters, numerals, or special characters) for the
operation part, the address part, and other parts of the instruction code.
• Each symbolic instruction can be translated into one binary-coded instruction
by a special program called an assembler. A program written with such symbols
is referred to as an assembly language program.
Location  Instruction  Comment
000       LDA 004      Load first operand into AC
001       ADD 005      Add second operand to AC
002       STA 006      Store sum in location 006
003       HLT          Halt computer
004       0053         First operand
005       FFE9         Second operand (negative)
006       0000         Store sum here
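To see what this program leaves in memory, here is a minimal sketch of an interpreter for just the four instructions it uses. The function name `run` and the tuple-based program representation are assumptions made for illustration; the carry flip-flop E and every other instruction of the basic computer are deliberately ignored.

```python
MASK16 = 0xFFFF  # the basic computer uses 16-bit words


def run(program, memory):
    """Execute LDA/ADD/STA/HLT instructions starting at location 0; return the final AC."""
    ac = 0
    pc = 0
    while True:
        mnemonic, address = program[pc]
        pc += 1
        if mnemonic == "LDA":
            ac = memory[address]
        elif mnemonic == "ADD":
            ac = (ac + memory[address]) & MASK16  # end carry is dropped in this sketch
        elif mnemonic == "STA":
            memory[address] = ac
        elif mnemonic == "HLT":
            return ac
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")


# Locations 000-003 hold the instructions, 004-006 hold the data (values in hex).
program = {0: ("LDA", 0x004), 1: ("ADD", 0x005), 2: ("STA", 0x006), 3: ("HLT", None)}
memory = {0x004: 0x0053, 0x005: 0xFFE9, 0x006: 0x0000}

final_ac = run(program, memory)
print(f"{final_ac:04X} {memory[0x006]:04X}")
```

Since 0053 is 83 and FFE9 is the 16-bit two's complement of 23, the stored sum is 003C, which is 60.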
Assembly language program to subtract two numbers
Label  Instruction  Comment
       ORG 100      ORIGIN LOCATION IS 100
       LDA SUB      LOAD SUBTRAHEND TO AC
       CMA          COMPLEMENT AC
       INC          INCREMENT AC
       ADD MIN      ADD MINUEND TO AC
       STA DIF      STORE DIFFERENCE
       HLT          HALT COMPUTER
MIN,   DEC 83       MINUEND
SUB,   DEC -23      SUBTRAHEND
DIF,   HEX 0        DIFFERENCE STORED HERE
       END          END PROGRAM
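The program subtracts by forming the two's complement of the subtrahend (CMA followed by INC) and then adding the minuend. A minimal sketch of the same steps on 16-bit words follows; the helper names `to_word`, `to_signed`, and `subtract` are illustrative.

```python
MASK16 = 0xFFFF


def to_word(value: int) -> int:
    """Encode a signed integer as a 16-bit two's complement word."""
    return value & MASK16


def to_signed(word: int) -> int:
    """Decode a 16-bit two's complement word back to a signed integer."""
    return word - 0x10000 if word & 0x8000 else word


def subtract(minuend: int, subtrahend: int) -> int:
    ac = to_word(subtrahend)                  # LDA SUB
    ac = ~ac & MASK16                         # CMA: one's complement
    ac = (ac + 1) & MASK16                    # INC: now the two's complement
    ac = (ac + to_word(minuend)) & MASK16     # ADD MIN
    return ac                                 # STA DIF would store this word


dif = subtract(83, -23)
print(f"{dif:04X}", to_signed(dif))
```

For MIN = 83 and SUB = -23 the stored difference is 006A, that is, 83 - (-23) = 106.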
REGISTERS
• In Basic Computer, there is only one general purpose register,
the Accumulator (AC)
• In modern CPUs, there are many general purpose registers
• It is advantageous to have many registers
– Transfers between registers within the processor are relatively fast
– Going “off the processor” to access memory is much slower
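GENERAL REGISTER CPU ORGANIZATION
[Figure: bus organization for seven CPU registers. Registers R1 to R7 (clocked) and an external input feed two multiplexers, selected by SELA for the A bus and SELB for the B bus. Both buses feed the ALU, whose operation is selected by OPR. The ALU output goes to the external output and back to the registers; a 3x8 decoder driven by SELD activates one of the 7 load lines.]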
OPERATION OF CONTROL UNIT
The control unit
Directs the information flow through ALU by
- Selecting various Components in the system
- Selecting the Function of ALU
Example: R1 ← R2 + R3
[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus
Control Word (14 bits): SELA (3 bits), SELB (3 bits), SELD (3 bits), OPR (5 bits)
Encoding of register selection fields
Binary Code  SELA   SELB   SELD
000          Input  Input  None
001          R1     R1     R1
010          R2     R2     R2
011          R3     R3     R3
100          R4     R4     R4
101          R5     R5     R5
110          R6     R6     R6
111          R7     R7     R7
ALU CONTROL
Encoding of ALU operations OPR
Select Operation Symbol
00000 Transfer A TSFA
00001 Increment A INCA
00010 ADD A + B ADD
00101 Subtract A - B SUB
00110 Decrement A DECA
01000 AND A and B AND
01010 OR A and B OR
01100 XOR A and B XOR
01110 Complement A COMA
10000 Shift right A SHRA
11000 Shift left A SHLA
Examples of ALU Microoperations
                   Symbolic Designation
Microoperation     SELA   SELB  SELD  OPR   Control Word
R1 ← R2 - R3       R2     R3    R1    SUB   010 011 001 00101
R4 ← R4 OR R5      R4     R5    R4    OR    100 101 100 01010
R6 ← R6 + 1        R6     -     R6    INCA  110 000 110 00001
R7 ← R1            R1     -     R7    TSFA  001 000 111 00000
Output ← R2        R2     -     None  TSFA  010 000 000 00000
Output ← Input     Input  -     None  TSFA  000 000 000 00000
R4 ← shl R4        R4     -     R4    SHLA  100 000 100 11000
R5 ← 0             R5     R5    R5    XOR   101 101 101 01100
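The control words in the table can be assembled mechanically from the two encoding tables. The sketch below is one way to do it; the function names and the use of "-" for an unused field (encoded as 000, as in the table) are illustrative choices.

```python
# Register selection codes from the encoding table; "-" marks an unused field.
REGISTER_CODES = {
    "Input": 0b000, "None": 0b000, "-": 0b000,
    "R1": 0b001, "R2": 0b010, "R3": 0b011, "R4": 0b100,
    "R5": 0b101, "R6": 0b110, "R7": 0b111,
}

# ALU operation codes from the OPR encoding table.
OPR_CODES = {
    "TSFA": 0b00000, "INCA": 0b00001, "ADD": 0b00010, "SUB": 0b00101,
    "DECA": 0b00110, "AND": 0b01000, "OR": 0b01010, "XOR": 0b01100,
    "COMA": 0b01110, "SHRA": 0b10000, "SHLA": 0b11000,
}


def control_word(sela: str, selb: str, seld: str, opr: str) -> int:
    """Pack SELA (3 bits), SELB (3), SELD (3) and OPR (5) into a 14-bit word."""
    return (
        (REGISTER_CODES[sela] << 11)
        | (REGISTER_CODES[selb] << 8)
        | (REGISTER_CODES[seld] << 5)
        | OPR_CODES[opr]
    )


def format_word(word: int) -> str:
    """Show the 14-bit word grouped by field, as in the table."""
    bits = f"{word:014b}"
    return f"{bits[0:3]} {bits[3:6]} {bits[6:9]} {bits[9:14]}"


print(format_word(control_word("R2", "R3", "R1", "SUB")))  # R1 <- R2 - R3
```

The printed word for R1 ← R2 - R3 is 010 011 001 00101, the first row of the table.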
Exercise 1
PIPELINING
Pipelining is a technique of decomposing a sequential process into
suboperations, with each subprocess being executed in a special
dedicated segment that operates concurrently
with all other segments.
A pipeline can be visualized as a collection of
processing segments through which binary
information flows.
The name “pipeline” implies a flow of information analogous
to an industrial assembly line.
What Is Pipelining
[Figure: sequential laundry timeline from 6 PM to midnight. Four loads (A, B, C, D) are done one after another in task order; each load passes through three steps of 30, 40, and 20 minutes.]
Sequential laundry takes 6 hours for 4 loads
If they learned pipelining, how long would laundry take?
What Is Pipelining?
Start work ASAP
[Figure: pipelined laundry timeline starting at 6 PM. Loads A, B, C, and D overlap; the timeline segments are 30, 40, 40, 40, 40, and 20 minutes.]
• Pipelined laundry takes 3.5 hours for 4 loads
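The two laundry totals can be reproduced with a small timing sketch. It assumes three steps of 30, 40, and 20 minutes per load (the segment lengths on the timeline) and that a finished load may wait between steps until the next step is free; the function names are illustrative.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each load enters a stage as soon as both the load and the stage are free."""
    stage_free_at = [0] * len(stage_minutes)  # time at which each stage becomes idle
    finish = 0
    for _ in range(loads):
        t = 0  # time at which this load is ready for its next stage
        for s, duration in enumerate(stage_minutes):
            start = max(t, stage_free_at[s])
            t = start + duration
            stage_free_at[s] = t
        finish = t
    return finish


STAGES = [30, 40, 20]  # minutes per step, read off the timeline
print(sequential_minutes(STAGES, 4) / 60, pipelined_minutes(STAGES, 4) / 60)
```

The sketch gives 6.0 hours sequentially and 3.5 hours pipelined. The pipelined total is set by the slowest step: 30 + 4 x 40 + 20 = 210 minutes.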
Instruction Pipeline
• Pipelining in the instruction stream
• An instruction pipeline reads consecutive instructions from memory
while previous instructions are being
executed in other segments.
• This causes the instruction fetch and execute phases to
overlap and perform simultaneous
operations.
Instruction Pipeline
The design of an instruction pipeline will be most efficient if
the instruction cycle is divided into segments of equal
duration:
==> 4-Stage Pipeline
[1] FI: Fetch an instruction from memory
[2] DA: Decode the instruction and calculate
the effective address of the operand
[3] FO: Fetch the operand
[4] EX: Execute the operation
INSTRUCTION PIPELINE
Execution of Three Instructions in a 4-Stage Pipeline
Conventional (12 steps)
Step   1   2   3   4   5   6   7   8   9   10  11  12
i      FI  DA  FO  EX
i+1                    FI  DA  FO  EX
i+2                                    FI  DA  FO  EX
Pipelined (6 steps)
Step   1   2   3   4   5   6
i      FI  DA  FO  EX
i+1        FI  DA  FO  EX
i+2            FI  DA  FO  EX
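The step counts in this diagram follow a simple pattern when every segment takes one step and nothing stalls: n instructions need n x k steps conventionally and k + n - 1 steps in a k-segment pipeline. A minimal sketch under those assumptions, with illustrative function names:

```python
STAGES = ["FI", "DA", "FO", "EX"]


def conventional_steps(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_steps(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """The first instruction fills the pipe; each later one completes one step after it."""
    return n_stages + n_instructions - 1


def space_time_rows(n_instructions: int, stages=STAGES):
    """One row per instruction; row[t] names the stage it occupies at step t + 1."""
    total = pipelined_steps(n_instructions, len(stages))
    rows = []
    for i in range(n_instructions):
        row = [""] * total
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s at step i + s + 1
        rows.append(row)
    return rows


for row in space_time_rows(3):
    print(" ".join(f"{cell:2}" for cell in row))
print(conventional_steps(3), pipelined_steps(3))
```

For three instructions this gives 12 steps conventionally and 6 steps pipelined, as in the diagram. Branches and unequal segment times are not modeled here.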
RISC (Reduced Instruction Set Computer)
• A computer with a large number of instructions
is classified as a complex instruction set
computer (CISC).
• In the early 1980s, a number of computer
designers recommended that computers use
fewer instructions with simple constructs so they
can be executed much faster within the CPU
without having to use memory as often.
• This type of computer is classified as a reduced
instruction set computer (RISC).
Characteristics of CISC
1. A large number of instructions – typically
100 to 250 instructions.
2. Some instructions that perform specialized
tasks and are used infrequently.
3. A large variety of addressing modes –
typically from 5 to 20 different modes.
4. Variable-length instruction formats
5. Instructions that manipulate operands in
memory.
Characteristics of RISC
1. Relatively few instructions.
2. Relatively few addressing modes.
3. Memory access limited to load and store
instructions.
4. All operations done within the registers of the
CPU.
5. Fixed-length, easily decoded instruction format.
6. Single-cycle instruction execution by using
pipelining.
7. Hardwired rather than microprogrammed control.
Exercise 3
• What are the two instructions needed in the
basic computer in order to set the E flip flop to
1?
Exercise 4
• Consider the instruction format of the basic computer
and the list of instructions. For each of the following
16-bit instructions, give the equivalent four-digit
hexadecimal code and explain in your own words what
the instruction is going to perform:
a. 0001 0000 0010 0100 = 1024H = Direct ADD
Ans: Add the operand at memory location 024H to AC
b. 1011 0001 0010 0100 = B124H = Indirect STA
Ans: Store the AC value at the address held in memory location 124H
c. 0111 0000 0010 0000 = 7020H = Register reference INC
Ans: Increment AC
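Answers of this kind can be checked with a small decoder built from the instruction tables of the basic computer. The function name `decode` and the output strings are illustrative choices; combinations of register-reference or I/O bits that are not in the tables are reported as "unknown".

```python
MEMORY_REFERENCE = ["AND", "ADD", "LDA", "STA", "BUN", "BSA", "ISZ"]  # opcodes 000-110

REGISTER_REFERENCE = {  # opcode 111 with I = 0 (hex 7xxx)
    0x800: "CLA", 0x400: "CLE", 0x200: "CMA", 0x100: "CME",
    0x080: "CIR", 0x040: "CIL", 0x020: "INC", 0x010: "SPA",
    0x008: "SNA", 0x004: "SZA", 0x002: "SZE", 0x001: "HLT",
}

INPUT_OUTPUT = {  # opcode 111 with I = 1 (hex Fxxx)
    0x800: "INP", 0x400: "OUT", 0x200: "SKI",
    0x100: "SKO", 0x080: "ION", 0x040: "IOF",
}


def decode(word: int) -> str:
    """Describe a 16-bit basic-computer instruction word."""
    mode_bit = (word >> 15) & 1      # bit 15: I
    opcode = (word >> 12) & 0b111    # bits 14-12
    low12 = word & 0xFFF             # bits 11-0
    if opcode != 0b111:
        mode = "indirect" if mode_bit else "direct"
        return f"{MEMORY_REFERENCE[opcode]} {mode} {low12:03X}"
    table = INPUT_OUTPUT if mode_bit else REGISTER_REFERENCE
    return table.get(low12, "unknown")


for word in (0x1024, 0xB124, 0x7020):
    print(f"{word:04X}: {decode(word)}")
```

The three exercise words decode to a direct ADD at 024, an indirect STA through 124, and INC.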
Memory Organization
Contents
• Memory Hierarchy
• Main Memory
• Auxiliary Memory
• Associative Memory
• Cache Memory
• Virtual Memory
• Memory Management Hardware
Introduction
Computer memory refers to the electronic holding place for
instructions and data that the processor can reach
quickly.
Memory can be classified into two broad categories:
(1) Primary Memory (holds the programs and data currently in use).
(2) Secondary Memory (stores programs, data, and results for the long term).
Motherboard
ROM, RAM
Memory Representation
• The basic unit of memory is the bit. Memory
sizes are expressed in bits and bytes.
• 1 Bit = 1 Binary Digit
• 8 Bits = 1 Byte
• 2^10 Bytes = 1024 Bytes = 1 KB (Kilo Byte)
• 2^20 Bytes = 1024 KB = 1 MB (Mega Byte)
• 2^30 Bytes = 1024 MB = 1 GB (Giga Byte)
• 2^40 Bytes = 1024 GB = 1 TB (Tera Byte)
Memory Hierarchy
• The memory is characterized on the basis of two key factors—
capacity and access time.
• Three fundamental types of memory:
Internal processor Memory
• This memory is placed in CPU and it includes cache
memory and special registers, both of which can be directly
accessed by processor.
Primary Memory
• RAM and ROM fall in the category of primary memory, also
known as main memory.
Secondary Memory
• Also known as auxiliary memory, secondary memory
provides backup storage for instructions and data.
• The most commonly used secondary memory devices are magnetic disks
(such as hard disks) and magnetic tapes.
Memory Hierarchy
Storage Evaluation Criteria
• Storage Capacity
It refers to size of memory.
• Cost
Estimated by the cost per bit of storage.
• Access Time
The time between a request for a read/write
operation and the completion of that request.
• Physical Characteristics
Four types, namely electronic, magnetic, mechanical, and
optical
• Permanence of Storage
Volatile or Non-volatile.
• Access Mode
Sequential
Random
Direct
READ ONLY MEMORY (ROM)
• ROM stands for Read Only Memory.
We can only read from it; we cannot write to it.
It is non-volatile. The information is stored permanently in such memories
during manufacture.
• A ROM stores the instructions that are required to start a computer. This
operation is referred to as bootstrap.
ROM chips are found in computers and in other electronic items such as washing machines
and microwave ovens.
• The BIOS (Basic Input Output System) is responsible for the startup of the
computer and is traditionally stored in read-only memory.
• The ROM contents may differ from one computer to another, which is one source of
compatibility issues when moving to a different platform.
• Some ROMs are OTP (one-time programmable): once programmed, they
cannot be reprogrammed.
Random Access Memory (RAM)
• RAM (Random Access Memory) is the primary memory used by
the CPU for storing data, programs, and program results.
It is read/write memory that holds data only while the
machine is powered. As soon as the machine is switched
off, the data is erased.
• RAM is volatile, i.e. data stored in it is lost when we
switch off the computer or if there is a power failure.
• RAM is small compared with secondary storage, both in terms of its physical size and in the
amount of data it can hold.
• Data in RAM can be accessed randomly, but RAM is
expensive per bit compared with secondary storage.
Secondary Memory
• Magnetic tape is a type of storage medium that uses
a long, narrow strip of plastic film coated with a
magnetic material to store digital data. It has been
used historically for data storage and backup
purposes, especially in large-scale computing
environments.
Magnetic Disk
• Magnetic disks, such as hard disk drives (HDDs), are
a type of non-volatile storage device that uses magnetic
storage to store and retrieve digital information.
• Hard disk drives have been a fundamental component of
computer systems for many years, providing a reliable and
high-capacity storage solution.
Optical disk
• Optical disks are a type of storage medium that
use laser light to read and write data.
• They are commonly used for distributing
software, music, movies, and other types of
digital content. Optical disks are characterized
by their flat, circular shape and are read by
optical disk drives. There are different types of
optical disks, each with its own characteristics
• CD/DVD
• BD
Associative Memory
• Definition: Associative memory, also known as
content-addressable memory (CAM) or associative
storage, is a type of memory where data is accessed
based on content rather than a specific address.
• Characteristics:
• Allows for the retrieval of data by matching content rather
than using a specific address.
• Parallel search capability, enabling multiple comparisons
simultaneously.
• Often used in applications like search engines, network
routing tables, and pattern recognition.
• Provides fast access to information based on content
matching.
Hardware Organization
Each word in the CAM is compared in parallel with the
content of A (Argument Register)
- If CAM Word(i) = A, then M(i) = 1, where M is the match register
- The matching words are then read by accessing the CAM sequentially for each CAM Word(i) with M(i) = 1
- K (Key Register) provides a mask for choosing a particular field or key
in the argument in A (only those bits in the argument that have 1’s in
their corresponding position of K are compared)
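The masked comparison can be sketched in a few lines. This is only a software model: real CAM hardware compares all words at once, whereas the loop below visits them one by one. The function names and the 9-bit example values are made up for illustration.

```python
def cam_match(words, argument, key):
    """Return the match bits M: M[i] is 1 when word i agrees with the
    argument in every bit position where the key holds a 1."""
    return [1 if (word ^ argument) & key == 0 else 0 for word in words]


def read_matches(words, match_bits):
    """Read out, in order, the words whose match bit is set."""
    return [word for word, m in zip(words, match_bits) if m]


# Illustrative 9-bit values: the key selects only the three leftmost bits.
A = 0b101_111100
K = 0b111_000000
WORDS = [0b100_111100, 0b101_000001]

M = cam_match(WORDS, A, K)
print(M, [f"{w:09b}" for w in read_matches(WORDS, M)])
```

Only the second word matches, because its three leftmost bits equal those of A even though the remaining bits differ.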
Cache Memory
• Definition: Cache memory is a small-sized type of volatile
computer memory that provides high-speed data access to
a processor and stores frequently used computer programs,
applications, and data.
• Characteristics:
• Located between the main memory (RAM) and the central
processing unit (CPU).
• Designed to store frequently accessed or recently used data for
quick retrieval.
• Helps improve overall system performance by reducing the time
taken to access data.
• Levels of cache include L1, L2, and sometimes L3, with L1 being
the closest to the CPU and fastest but also the smallest
PERFORMANCE OF CACHE
• Memory Access:
– All the memory accesses are directed first to Cache
– If the word is in Cache; Access cache to provide it to CPU
– If the word is not in Cache; Bring a block (or a line)
including that word to replace a block now in Cache.
How can we know if the word that is required is
there ?
If a new block is to replace one of the old blocks,
which one should we choose ?
Performance of Cache Memory
System
• Hit ratio.
• When the CPU refers to memory and finds
the word in cache, it is said to produce a hit.
• If the word is not found in cache, it is in main
memory and it counts as a miss.
• The number of hits divided by the total number of CPU
references to memory is the hit ratio.
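A hit ratio can be measured by replaying a sequence of memory references against a model of the cache. The sketch below assumes a very small cache that holds whole addresses (one word per block) and replaces the oldest entry first (FIFO); both choices, and the reference sequence, are illustrative assumptions rather than anything prescribed by the slides.

```python
from collections import deque


def hit_ratio(references, cache_blocks):
    """Replay the references against a FIFO cache and return hits / total references."""
    if not references:
        raise ValueError("need at least one memory reference")
    cache = deque()
    hits = 0
    for address in references:
        if address in cache:
            hits += 1                      # word found in cache: a hit
        else:
            if len(cache) == cache_blocks:
                cache.popleft()            # miss with a full cache: evict the oldest entry
            cache.append(address)          # bring the missing word into the cache
    return hits / len(references)


REFERENCES = [1, 2, 3, 1, 2, 4, 1, 5]      # made-up address trace
print(hit_ratio(REFERENCES, cache_blocks=3))
```

For this trace only the second references to addresses 1 and 2 hit, so the ratio is 2 / 8 = 0.25. A different replacement policy could give a different ratio for the same trace.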
Virtual Memory
• Definition: Virtual memory is a memory management
capability of an operating system (OS) that uses hardware
and software to allow a computer to compensate for
physical memory shortages by temporarily transferring data
from random access memory (RAM) to disk storage.
• Characteristics:
• Extends the available memory beyond the physical RAM by
using disk space.
• Helps run larger applications or multiple applications
simultaneously.
• Involves the use of a page file or swap space on disk for storing
data temporarily.
• Allows the operating system to manage memory more
efficiently, though it may incur some performance overhead.
Address Mapping
Organization of memory Mapping Table in a paged system
Summary
• Auxiliary Memory (Secondary Memory): Provides
long-term storage for data and programs.
• Associative Memory (Content-Addressable Memory):
Allows for content-based retrieval, often used in
applications requiring quick pattern matching or search
capabilities.
• Cache Memory: Sits between the CPU and main
memory, storing frequently used data to enhance
overall system speed.
• Virtual Memory: Extends the available RAM by using
disk space, enabling more efficient memory
management and the ability to run larger applications.
SSD
• A Solid-State Drive (SSD) is a type of non-
volatile storage device that uses NAND-based
flash memory to store and retrieve data.
Unlike traditional hard disk drives (HDDs),
SSDs have no moving parts and use electronic
circuits for data storage.
INPUT-OUTPUT ORGANIZATION
• Peripheral Devices
• Input-Output Interface
• Asynchronous Data Transfer
• Modes of Transfer
• Priority Interrupt
• Direct Memory Access
• Input-Output Processor
• Serial Communication
Peripheral Devices
Input Devices
• Keyboard
• Optical input devices
- Card Reader
- Paper Tape Reader
- Bar code reader
- Digitizer
- Optical Mark Reader
• Magnetic Input Devices
- Magnetic Stripe Reader
• Screen Input Devices
- Touch Screen
- Light Pen
- Mouse
• Analog Input Devices
Output Devices
• Card Puncher, Paper Tape Puncher
• CRT
• Printer (Impact, Ink Jet, Laser, Dot Matrix)
• Plotter
• Analog
• Voice
I/O Instructions
• The CPU communicates with I/O devices using specific instructions.
• These instructions are often part of the computer architecture's instruction set
and are used to transfer data between memory and I/O devices.
I/O Ports and Addresses
• Each I/O device is assigned a specific port or
address through which the CPU can send or
receive data.
• These addresses may be mapped into the
memory address space (memory-mapped I/O) or kept in a
separate I/O address space (isolated I/O).
ISOLATED I/O
MEMORY MAPPED I/O
Input/Output Interfaces
INPUT/OUTPUT INTERFACE
• Provides a method for transferring information between internal storage
(such as memory and CPU registers) and external I/O devices
• Resolves the differences between the computer and peripheral devices
– Peripherals - Electromechanical Devices
– CPU or Memory - Electronic Device
– Data Transfer Rate
• Peripherals - Usually slower
• CPU or Memory - Usually faster than peripherals
– Some kinds of Synchronization mechanism may be needed
– Unit of Information
• Peripherals – Byte, Block, …
• CPU or Memory – Word
– Data representations may differ
Device Drivers
• Device drivers are software components that
act as intermediaries between the operating
system and the I/O devices.
• They provide an abstraction layer, translating
high-level I/O requests from the operating
system into low-level commands that the
hardware understands
Control Registers
• I/O devices often have control registers that
store configuration information and control the
device's behavior.
• The CPU writes specific values to these registers
to configure the device or initiate operations.
• A simple controller will have at least 3 addresses
(ports) on the bus, each corresponding to a
register in the controller:
• a data register (either readable or writable, depending on whether it is
an input or output device)
• a control register (writable, for controlling device operation)
• a status register (readable, for determining device status -- in particular,
whether it is ready to receive or provide data)
Interrupts:
• I/O devices can use interrupts to signal the
CPU when they need attention.
• When an I/O operation is complete or
requires action, the device generates an
interrupt, causing the CPU to temporarily
suspend its current tasks and handle the I/O
Polling:
• Polling is a method where the CPU regularly
checks the status of I/O devices to see if they
need attention.
• This involves repeatedly reading a status
register associated with each device.
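Polling can be sketched with a simulated device. Everything here is hypothetical: the class `SimulatedDevice`, its READY bit, and the idea that the device needs a fixed number of status reads before each character becomes available are stand-ins for real hardware registers.

```python
class SimulatedDevice:
    """Hypothetical input device exposing a status register and a data register."""

    READY = 0x01  # assumed status bit: set when the data register holds a character

    def __init__(self, data, busy_polls):
        self._data = list(data)
        self._busy_polls = busy_polls
        self._countdown = busy_polls

    def read_status(self) -> int:
        if not self._data:
            return 0
        if self._countdown > 0:
            self._countdown -= 1   # device is still busy preparing the next character
            return 0
        return self.READY

    def read_data(self) -> int:
        self._countdown = self._busy_polls
        return self._data.pop(0)


def poll_read(device, count, max_polls=1000):
    """Busy-wait on the status register and read `count` characters."""
    received = []
    polls = 0
    while len(received) < count:
        polls += 1
        if polls > max_polls:
            raise TimeoutError("device never became ready")
        if device.read_status() & SimulatedDevice.READY:
            received.append(device.read_data())
    return received, polls


device = SimulatedDevice([0x48, 0x49], busy_polls=2)
print(poll_read(device, 2))
```

The CPU spends most of its status reads waiting, which is the overhead that interrupts are meant to avoid.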
Direct Memory Access (DMA)
• DMA is a technique that allows I/O devices to
transfer data directly to and from memory
without involving the CPU. This reduces CPU
involvement in data transfer, improving
overall system performance.
I/O Bus
• The I/O bus is a communication pathway that
connects the CPU to various I/O devices. It
provides a way for the CPU to send and
receive data to and from peripherals
I/O Instructions in High-Level
Programming
• High-level programming languages provide libraries
and functions to interact with I/O devices.
• Programmers can use these abstractions without
dealing with low-level details, thanks to the
underlying operating system and device drivers.
• When a program needs to interact with an I/O device,
the CPU executes specific I/O instructions or system
calls, triggering the necessary actions to read from or
write to the device. The exact mechanisms may vary
based on the computer architecture, operating
system, and programming language being used.
Fundamentals of Advanced
Computer Architecture
By:
Mitul Patel
Types of Processing
Sequential Processing Parallel Processing
Sequential and Parallel Processing
Parallel Processing
• Execution of Concurrent Events in the computing
process to achieve faster Computational Speed
• The purpose of parallel processing is to speed up the
computer processing capability and increase its
throughput, that is, the amount of processing that
can be accomplished during a given interval of time.
• The amount of hardware increases with parallel
processing, and with it, the cost of the system
increases.
• However, technological developments have reduced
hardware costs to the point where parallel processing
techniques are economically feasible.
PARALLEL COMPUTERS
Architectural Classification
* Flynn's classification
- Based on the multiplicity of Instruction Streams and Data
Streams
- Instruction Stream
Sequence of Instructions read from memory
- Data Stream
Operations performed on the data in the processor
                                   Number of Data Streams
                                   Single    Multiple
Number of Instruction   Single     SISD      SIMD
Streams                 Multiple   MISD      MIMD
SISD COMPUTER SYSTEMS
[Figure: SISD organization with a control unit, a processor unit, and memory, connected by one instruction stream and one data stream.]
Characteristics
- Standard von Neumann machine
- Instructions and data are stored in memory
- One operation at a time
Example: One ADD instruction(Single instruction stream) is executed
on some data(Single data stream) in one processor at a time.
Limitations: Von Neumann bottleneck
• Only one instruction is executed at a time.
• Memory bandwidth (bits/sec) is limited.
SIMD COMPUTER SYSTEMS
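[Figure: SIMD organization. Memory connects over a data bus to a single control unit, which sends one instruction stream to the processor units (P ... P). The processor units exchange data streams through an alignment network with the memory modules (M ... M).]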
EXAMPLE: Single ADD instruction(Single Instruction Stream) is
executed at different processors with Different operands(Multiple
Data Stream)
Characteristics:
- Only one copy of the program exists
- A single controller executes one instruction at a time
MISD COMPUTER SYSTEMS
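[Figure: MISD organization. Several rows, each with a memory (M), a control unit (CU), and a processor (P), carry separate instruction streams while a single data stream from memory passes through the processors.]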
Characteristics
- There is no computer at present that can be
classified as MISD
MIMD COMPUTER SYSTEMS
[Figure: MIMD organization. Processor and memory pairs (P M, P M, ..., P M) are connected through an interconnection network to shared memory.]
EXAMPLE:
Multiple processors process multiple types of data.
Processor 1 performs addition on operands A and B
Processor 2 performs subtraction on operands C and D
Processor 3 performs a shift on E
and so on.
Characteristics
- Multiple processing units
- Execution of multiple instructions on multiple data
Array Processor
• An array processor is a type of parallel
computing architecture specifically designed
to perform operations on arrays or matrices
of data.
• It is optimized for tasks that involve large-
scale numerical computations, such as those
found in scientific simulations, signal
processing, and certain types of data analysis
Clusters, and NUMA Computers
• Clusters and NUMA (Non-Uniform Memory
Access) computers are both types of parallel
computing architectures designed to improve
performance and scalability in large-scale
computational tasks
Multiprocessor Systems
• Multiprocessor systems, also known as parallel
or multiprocessing systems, involve multiple
processors or central processing units (CPUs)
working together to execute tasks concurrently.
The structure and interconnection networks play
a crucial role in determining how these
processors communicate and collaborate.
• Shared Memory Multiprocessor
• Distributed Memory Multiprocessor:
Multi-core Computers
• Multi-core computers are systems that contain
multiple processor cores on a single chip. Each
core is a separate central processing unit (CPU),
capable of executing its own set of instructions
independently.
• Multi-core processors have become
standard in modern computing systems,
offering improved performance and efficiency
compared to single-core counterparts.
Pentium 4
• The Pentium 4 processor was a product line
developed by Intel and was part of the larger
family of x86 processors. Introduced in the
year 2000, the Pentium 4 represented a
significant shift in microprocessor
architecture compared to its predecessor, the
Pentium III.
Exercise
• Case Study of Pentium 4
• Key Features
• Instruction Set of Pentium 4
• Challenges of Pentium 4