Unit 1
Introduction (6 hours)
Contents
1.1 Organization and architecture
1.2 Structure and function
1.3 The evolution of computer architecture (RISC, CISC, BERKELEY RISC I,
overlapped register window)
1.4 Performance assessment
1.4.1 Clock speed and instructions per second
1.4.2 Instruction execution rate: CPI, MIPS Rate, MFLOPS rate, arithmetic mean,
harmonic mean, speed metric, geometric mean, rate metric, Amdahl’s law, speed up
1.5 Computer function
1.5.1 Instruction fetch and execute
1.5.2 Instruction cycle state diagram
1.6 Interconnection structure: bus interconnection, multilevel bus hierarchy, PCI
Overview
FIGURE: A Typical Computer Advertisement
Explanation of Ad
Short Form / Term | Full Form / Meaning
Intel i7 | Intel Core i7 – A high-performance CPU family from Intel.
Quad Core | CPU with 4 processing cores for parallel processing.
3.9GHz | 3.9 Gigahertz – Processor speed; 1 GHz = 1 billion cycles/sec.
DDR3 SDRAM | Double Data Rate 3 Synchronous Dynamic RAM – Older type of system memory.
1600MHz | Memory speed – 1600 million cycles per second.
32GB | 32 Gigabytes – Amount of RAM (system memory).
L1 cache | Level 1 Cache – Fastest memory located inside CPU, very small (128KB here).
L2 cache | Level 2 Cache – Slightly slower but larger than L1 (2MB here).
1TB SATA | 1 Terabyte Serial ATA – Hard disk with 1000 GB storage using SATA interface.
7200 RPM | 7200 Revolutions Per Minute – Speed of the hard disk spinning.
USB | Universal Serial Bus – Standard port for connecting peripherals.
PCI / PCIe | Peripheral Component Interconnect / Express – Expansion slots for cards.
x16 / x1 | PCI Express lanes – x16 is for high bandwidth (graphics), x1 is for smaller devices.
HDMI | High-Definition Multimedia Interface – For digital video/audio output.
LCD | Liquid Crystal Display – Type of monitor screen technology.
Explanation of Ad
Short Form / Term | Full Form / Meaning
24” | Monitor screen size – 24 inches diagonally.
16:10 | Aspect ratio – Ratio of width to height of the screen.
1920×1200 WUXGA | Widescreen Ultra eXtended Graphics Array – High resolution display.
300 cd/m² | Brightness – 300 candelas per square meter.
Active matrix | Type of LCD tech with better image quality and response time.
1000:1 contrast | Contrast ratio – Difference between brightest white and darkest black.
8ms | 8 milliseconds – Response time of the monitor.
24-bit color | 16.7 million colors – Display color depth.
VGA / DVI | Video Graphics Array / Digital Visual Interface – Monitor connectors.
CD/DVD ± RW | Can read/write both CD and DVD disks in + or - formats.
1GB PCIe video card | Dedicated graphics card with 1 GB memory, connected via PCI Express.
PCIe sound card | Separate sound card connected through PCI Express.
Ethernet (10/100/1000) | Wired network interface with 10 Mbps, 100 Mbps, or 1 Gbps speeds.
Computer Architecture
Definition:
Computer architecture refers to the attributes of a system visible to the programmer and those that affect
the logical execution of programs.
Includes:
Instruction Set Architecture (ISA)
Instruction formats and opcodes
Registers and memory addressing
Effects of instruction execution
Input/Output mechanisms
Data types and representation (e.g., number of bits for integers, characters)
Purpose:
Defines what a computer is supposed to do. It's a high-level design concern focused on functionality and
program behavior.
Example:
Deciding whether the system should support a "multiply" instruction.
Stability:
Architectures often remain consistent across many generations of computers to ensure software
compatibility (e.g., IBM System/370).
Computer Organization
Definition:
Computer organization refers to the physical and operational structure of a computer—how the architectural
specifications are implemented.
Includes:
Control signals and data paths
Hardware components (ALU, memory units, buses)
Interfaces with peripherals
Memory technology
Purpose:
Defines how a computer performs operations—the internal working and construction of the system.
Example:
Choosing whether to implement the "multiply" instruction using a dedicated multiply unit or repeated addition
logic.
Flexibility:
Organizational changes often occur to improve performance or reduce cost without altering the architecture. For
example, various IBM System/370 models differ in organization but share the same architecture.
Computer Organization and Architecture
1. Computer Organization
Focuses on the operational structure of the computer.
Deals with how components work and interact at the hardware
level.
Topics include:
Control Unit
ALU (Arithmetic Logic Unit)
Registers
Memory Hierarchy
I/O Mechanisms
Micro-operations
Computer Organization and Architecture
2. Computer Architecture
Focuses on the design principles and programmer’s perspective.
Deals with what a computer system does, not how.
Topics include:
Instruction Set Architecture (ISA)
Addressing Modes
Data Types
Memory Formats
Performance Metrics (CPI, MIPS, etc.)

Aspect | Architecture | Organization
What is it? | Design & Specification | Physical Implementation
Concerned With | Instruction sets, addressing modes | Data paths, control signals, memory
Who is interested? | Computer Architects, System Designers | Hardware Engineers, Microprocessor Designers
Computer Organization vs Computer
Architecture
Computer Architecture | Computer Organization
It is the description of what the computer does. | It is the description of how the computer does things.
It refers to those attributes of a system that have a direct impact on the logical execution of a program. | It refers to the operational units and their interconnections that realize the architectural specifications.
A programmer can view architecture in terms of instructions, addressing modes and registers. | Organization expresses the realization of architecture.
While designing a computer system, architecture is considered first. | An organization is done on the basis of architecture.
Computer Architecture deals with high-level design issues. | Computer Organization deals with low-level design issues.
Architecture involves Logic (Instruction sets, Addressing modes, Data types, Cache optimization). | Organization involves Physical Components (Circuit design, Adders, Signals, Peripherals).
For example, it is an architectural design issue what types of instructions are to be included and whether to use direct or indirect addressing for accessing memory. | For example, it is an organizational issue whether to implement a special-purpose unit or use a pre-existing unit; for instance, to implement the multiply instruction, a special multiply unit can be used or an add unit can be used repeatedly.
Two Basic Computer Architectures
1. Von Neumann 2. Harvard
Von Neumann Architecture
Based on Stored program concept
The concept holds that:
Data and instructions should be stored
together in same memory area of computer
Execution occurs in sequential fashion (unless
explicitly modified) from one instruction to
next
Same signal pathways and memory for data
and instructions (the CPU does one thing at a
time: it either reads/writes data or reads an instruction)
E.g.: desktop personal computer
Harvard Architecture
Physically separate memory and
pathways for instructions and data
CPU can read both instructions and data
from memory at the same time
Has double the memory bandwidth
E.g.: Digital signal processor (DSP) based
computer system
Structure and Function
A computer is a complex machine made of many small
electronic components.
To understand or design it better, we look at it in a
hierarchical way, meaning we break it down into layers or
levels. At each level, we focus on two things:
1. Structure: How the parts are connected.
2. Function: What each part does.
Function
Refers to operations each component performs.
Computers perform only four basic functions:
Data Processing: Manipulation of data through operations (e.g.,
addition, comparison).
Data Storage: Holding data temporarily (RAM, cache) or
permanently (SSD, HDD).
Data Movement: Input/output operations like keyboard entry or
display output.
Control: Coordination of all operations via the control unit.
Structure
Describes the relationship among different components: CPU, memory,
I/O, buses.
Example: A CPU is composed of the control unit, ALU, and registers.
Figure: The Computer
Simple Single-processor Computer
The hierarchical view of the internal structure of a traditional single-
processor computer is given in the figure. There are four main
structural components:
1. Central processing unit (CPU): Controls the operation of the
computer and performs its data processing functions; often simply
referred to as processor.
2. Main memory: Stores data.
3. I/O: Moves data between the computer and its external environment.
4. System interconnection: Some mechanism that provides for
communication among CPU, main memory, and I/O. A common
example of system interconnection is by means of a system bus,
consisting of a number of conducting wires to which all the other
components attach.
Simple Single-processor
Computer
The major structural components of a
simple single-processor computer are as
follows:
Control unit: Controls the operation of
the CPU and hence the computer.
Arithmetic and logic unit (ALU):
Performs the computer’s data processing
functions.
Registers: Provides storage internal to
the CPU.
CPU interconnection: Some
mechanism that provides for
communication among the control unit,
ALU, and registers.
Figure: The Computer: Top-Level Structure
Multicore Computer Structure
Contemporary computers generally have multiple processors. When these
processors all reside on a single chip, the term multicore computer is used,
and each processing unit (consisting of a control unit, ALU, registers, and
perhaps cache) is called a core.
Central processing unit (CPU): That portion of a computer that fetches and executes
instructions. It consists of an ALU, a control unit, and registers. In a system with a single
processing unit, it is often simply referred to as a processor.
Core: An individual processing unit on a processor chip. A core may be equivalent in
functionality to a CPU on a single-CPU system. Other specialized processing units, such as
one optimized for vector and matrix operations, are also referred to as cores.
Processor: A physical piece of silicon containing one or more cores. The processor is the
computer component that interprets and executes instructions. If a processor contains
multiple cores, it is referred to as a multicore processor.
Multicore Computer
Structure
In general terms, the functional
elements of a core are:
Instruction logic: This includes the
tasks involved in fetching instructions,
and decoding each instruction to
determine the instruction operation and
the memory locations of any operands.
Arithmetic and logic unit (ALU):
Performs the operation specified by an
instruction.
Load/store logic: Manages the transfer
of data to and from main memory via
cache.
The Evolution of Computer Architecture
The evolution of computer architecture reflects the progression from
complex to simpler and more efficient instruction designs.
This evolution is centered around the ideas of CISC (Complex
Instruction Set Computing) and RISC (Reduced Instruction Set
Computing), with key innovations such as the Berkeley RISC I and
Overlapped Register Windows playing vital roles.
CISC (Complex Instruction Set
Computing)
Era: 1960s–1970s
Philosophy: Provide rich, complex instructions to reduce the number of instructions per
program.
Features:
Large instruction set (hundreds of instructions)
Variable instruction lengths
Instructions that combine multiple low-level operations (e.g., memory access and arithmetic)
Microcoded control unit
Examples: Intel x86, Digital Equipment Corporation VAX computer and the IBM 370
computer.
Problems with CISC:
Complex hardware to decode and execute instructions
Slower clock speeds due to complexity
Harder to pipeline due to variable instruction lengths
CISC Characteristics
The major characteristics of CISC architecture are:
1. A large number of instructions, typically from 100 to 250
2. Some instructions that perform specialized tasks and are used infrequently
3. A large variety of addressing modes, typically from 5 to 20 different modes
4. Variable-length instruction formats
5. Instructions that manipulate operands in memory
RISC (Reduced Instruction Set
Computing)
Era: 1980s onwards
Philosophy: RISC architecture is designed to reduce execution time
by simplifying the instruction set and using efficient hardware
techniques like pipelining.
Features:
Small, optimized instruction set
Fixed instruction length (commonly 32-bit)
Load/store architecture (only load/store can access memory)
Simple addressing modes
Emphasis on register usage
Efficient pipelining and compiler optimization
RISC Characteristics
The concept of RISC architecture involves an attempt to reduce execution time by
simplifying the instruction set of the computer. The major characteristics of a
RISC processor are:
1. Relatively few instructions
2. Relatively few addressing modes
3. Memory access limited to load and store instructions
4. All operations done within the registers of the CPU
5. Fixed-length, easily decoded instruction format
6. Single-cycle instruction execution
7. Hardwired rather than microprogrammed control
Overlapped Register Windows
Introduced in: Berkeley RISC and SPARC architectures
Problem Addressed: Function calls cause overhead due to saving and
restoring registers
Solution:
Divide registers into windows
On function call, shift to a new register window
Each window overlaps with the previous one, sharing parameters
Reduces memory traffic for saving/restoring registers
Benefits:
Faster function calls and returns
Less memory usage during nested calls
Overlapped Register Windows
Purpose and Motivation
Procedure calls and returns are frequent in high-level languages and involve
saving/restoring registers and passing parameters/results.
Traditional methods like memory stacks are time-consuming due to memory access
delays.
Overlapped register windows aim to reduce overhead and improve efficiency in
procedure calls.
Concept of Overlapped Register Windows
Each procedure call activates a new register window using a pointer.
Windows overlap with adjacent procedures to share registers for parameters/results
without copying.
Only one window is active at a time.
Overlapped Register Windows
Advantages
No need to save/restore register values during procedure calls.
Parameters passed automatically via overlapping registers.
Faster execution due to reduced memory access.
General Formulas
Organization of register windows will have the following relationships:
The number of registers available for each window is calculated as follows:
Window size = L + 2C + G
The total number of registers needed in the processor is:
Total registers in file = (L + C) × W + G
Where:
G = number of global registers
L = number of local registers in each window
C = number of registers common to two windows
W = number of windows
Overlapped Register
Windows
In the example of the figure we have G = 10,
L = 10, C = 6, and W = 4.
The window size is 10 + 12 + 10 = 32
registers, and the register file consists of
(10 + 6) × 4 + 10 = 74 registers.
Figure: Overlapped register windows.
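As a quick check of these formulas in code, here is a minimal Python sketch using the figure's parameters (G = 10, L = 10, C = 6, W = 4):

def window_size(L, C, G):
    return L + 2 * C + G          # locals + both overlap halves + globals

def total_registers(L, C, G, W):
    return (L + C) * W + G        # each window adds L locals and C overlap registers

print(window_size(10, 6, 10))          # 32
print(total_registers(10, 6, 10, 4))   # 74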
Overlapped Register
Windows
For an example, suppose that procedure A calls procedure B. Registers
R26 through R31 are common to both procedures, and therefore
procedure A stores the parameters for procedure B in these registers.
Procedure B uses local registers R32 through R41 for local variable
storage.
If procedure B calls procedure C, it will pass the parameters through
registers R42 through R47. When procedure B is ready to return at the end
of its computation, the program stores results of the computation in
registers R26 through R31 and transfers back to the register window of
procedure A.
Note that registers R10 through R15 are common to procedures A and D
because the four windows have a circular organization with A being
adjacent to D.
Overlapped Register
Windows
As mentioned previously, the 10 global registers R0 through R9 are
available to all procedures. Each procedure has available a total of 32
registers while it is active.
This includes 10 global registers, 10 local registers, six low overlapping
registers, and six high overlapping registers.
Other fixed size register window schemes are possible, and each may differ
in the size of the register window and the size of the total register file.
Berkeley RISC I
General Overview
Developed at University of California, Berkeley.
Among the first RISC architectures to demonstrate the benefits of the RISC
concept.
Implemented as a 32-bit integrated circuit CPU.
Architecture Features
32-bit:
Address bus
Data (supports 8-, 16-, or 32-bit data)
Instruction format
Instruction set: Only 31 instructions (simple, fast operations).
Addressing modes:
Register addressing
Immediate operand
Relative to PC (for branch instructions)
Berkeley RISC I
Register File and Windows
138 registers total:
10 global registers
8 register windows with 32 registers each
Each window includes local and overlapping registers (like the overlapped register
window model).
Only one 32-register window active at a time.
A 5-bit field is enough to select any register (2⁵ = 32).
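Plugging the RISC I parameters into the earlier window formulas reproduces both numbers above, assuming the per-window split is L = 10 local and C = 6 overlapping registers, as in the previous example:

print((10 + 6) * 8 + 10)   # total registers: (L + C) × W + G = 138
print(10 + 2 * 6 + 10)     # window size: L + 2C + G = 32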
Berkeley RISC I
Instruction Formats
All instructions are 32 bits wide.
Three types of instruction formats:
Register-to-register
Memory access
Branch and jump (19-bit relative address)
Opcode:
7 bits for the operation.
1 bit to indicate status flag update after ALU operations.
Operand fields:
Rd: Destination register (5 bits)
Rs: First source register
S2: Second source register or a 13-bit immediate constant (based on bit 13)
Memory access:
Rs contains base address
S2 is the offset
Berkeley RISC I
Special Features
Register R0 with all 0's (used to specify zero in any field).
COND field: Used in jump instructions to specify 1 of 16 branch conditions.
All instructions use a three-operand format.
Instruction Set
31 total instructions, categorized into:
1. Data manipulation (arithmetic, logic, shift)
2. Register transfer
3. Control flow
Second operand (S2) can be register or immediate (denoted by # in assembly).
Instruction Set of
Berkeley RISC I
Figure: Berkeley RISC I instruction formats.
Berkeley RISC I
Consider, for example, the ADD instruction and how it can be used to
perform a variety of operations.
ADD R22, R21, R23 R23 ← R22 + R21
ADD R22, #150, R23 R23 ← R22 + 150
ADD R0, R21, R22 R22 ← R21 (Move)
ADD R0, #150, R22 R22 ← 150 (Load Immediate)
ADD R22, #1, R22 R22 ← R22 + 1 (Increment)
Berkeley RISC I
The following are examples of load long instructions with different
addressing modes.
LDL (R22)#150, R5 R5 ← M[R22 + 150]
LDL (R22)#0, R5 R5 ← M[R22]
LDL (R0)#500, R5 R5 ← M[500]
Performance Assessment
Performance Assessment refers to the systematic evaluation of a
computer system’s efficiency, speed, and capability to execute
programs and tasks.
It involves analyzing various quantitative metrics that indicate how
well a computer system performs under specific conditions or workloads.
Clock Speed and Instructions Per
Second
Clock Speed:
Measured in Hertz (Hz), typically GHz today.
Indicates how many clock cycles occur per second.
Each instruction may take multiple clock cycles to execute.
Instructions Per Second (IPS):
Indicates how many instructions the CPU can execute per second.
Depends on both clock speed and CPI (Cycles Per Instruction).
Not always a reliable performance metric because different ISAs have
instructions of varying complexity.
Instruction Execution Rate
a. CPI (Cycles Per Instruction):
Average number of clock cycles each instruction takes.
Formula:
CPI = Total Clock Cycles / Total Instructions
Lower CPI generally implies better performance.
b. MIPS (Million Instructions Per Second):
Formula:
MIPS = Clock Speed (in MHz) / CPI
Does not account for instruction complexity, so it's not always a good comparison
metric across architectures.
Instruction Execution Rate
c. MFLOPS (Million Floating Point Operations Per Second):
Measures performance in scientific applications where floating point operations
dominate.
Formula:
MFLOPS = Number of FP operations / (Execution Time × 10⁶)
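To make these formulas concrete, here is a minimal Python sketch; the instruction count, cycle count, clock speed, and floating-point workload below are hypothetical values invented for illustration:

instructions = 40e6    # hypothetical: 40 million instructions executed
cycles = 80e6          # hypothetical: 80 million clock cycles consumed
clock_mhz = 400        # hypothetical: 400 MHz clock

cpi = cycles / instructions           # 2.0 cycles per instruction
mips = clock_mhz / cpi                # 200.0 MIPS

fp_ops = 10e6          # hypothetical: 10 million floating-point operations
exec_time = 0.2        # hypothetical: 0.2 s execution time
mflops = fp_ops / (exec_time * 1e6)   # 50.0 MFLOPS

print(cpi, mips, mflops)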
Benchmarks
Benchmarks are standardized tests or programs used to measure
and evaluate the performance of computer systems, components (like
CPUs, memory, or storage), or software.
They provide a quantitative basis for comparing the speed,
efficiency, and capabilities of different systems under consistent and
repeatable workloads.
Purpose of Benchmarks
To evaluate performance of hardware/software components.
To compare systems (e.g., two CPUs or GPUs).
To identify bottlenecks in processing, memory, or I/O.
To assist in optimization, design, and purchasing decisions
Types of Benchmarks
Type | Description | Example
Synthetic Benchmarks | Designed specifically to test certain features or workloads | LINPACK (floating-point), Dhrystone (integer)
Application Benchmarks | Real-world applications used for testing | Microsoft Word load time, Photoshop rendering
Kernel Benchmarks | Test specific portions of programs (e.g., loops, I/O) | Matrix multiplication kernel
Component Benchmarks | Focus on specific hardware (CPU, GPU, Disk, etc.) | PassMark, 3DMark, CrystalDiskMark
Performance Averages and Metrics
When comparing systems using multiple benchmarks, we need to use
mathematical averages. Choosing the right average is critical depending on
what you're measuring.
1. Arithmetic Mean (AM)
Used to average performance across several benchmarks.
Formula:
AM = (x₁ + x₂ + ⋯ + xₙ) / n
Example:
Execution times of a program on 3 systems: 2s, 3s, 5s.
AM = (2 + 3 + 5) / 3 ≈ 3.33 s
Not suitable for averaging rates like "speed" or "performance per unit
time".
Performance Averages and Metrics
2. Harmonic Mean (HM)
More appropriate for averaging rates (like CPI or execution time).
Formula:
HM = n / (1/x₁ + 1/x₂ + ⋯ + 1/xₙ)
Example:
MIPS values of a CPU on 3 tasks: 20, 30, and 40.
HM = 3 / (1/20 + 1/30 + 1/40) ≈ 27.7 MIPS
Why HM? It gives a more conservative average for rates, especially when one of
the rates is significantly lower.
Performance Averages and Metrics
3. Geometric Mean (GM)
Best for comparing relative performance across multiple benchmarks.
Formula:
GM = (x₁ × x₂ × ⋯ × xₙ)^(1/n)
Example:
Performance ratios of CPU A vs CPU B on 3 benchmarks: 2×, 4×, 0.5×.
GM = (2 × 4 × 0.5)^(1/3) = 4^(1/3) ≈ 1.59
So, CPU A is ~1.59 times faster than CPU B overall.
Why GM? It neutralizes the effect of outliers and is commonly used in SPEC
benchmarks.
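A minimal Python sketch computing all three averages on the example values from these slides:

from math import prod

times = [2, 3, 5]                             # execution times (s), AM example
am = sum(times) / len(times)                  # 3.33 s

rates = [20, 30, 40]                          # MIPS values, HM example
hm = len(rates) / sum(1 / r for r in rates)   # 27.69 MIPS

ratios = [2, 4, 0.5]                          # performance ratios, GM example
gm = prod(ratios) ** (1 / len(ratios))        # 1.587×

print(am, hm, gm)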
Performance Metric Types
These metrics categorize how performance is measured. Each is used in different
scenarios.
1. Speed Metric
Measures how fast a single task completes.
Focus: Time per task (lower is better).
Examples:
Execution Time: Program A takes 2 seconds.
CPI (Cycles Per Instruction): Lower CPI = better performance.
Speedup = Old Time / New Time
Use case: Measuring latency – e.g., "How fast can this CPU render a
frame?"
Example:
System A takes 5 seconds to sort a list; System B takes 2.5 seconds.
Speedup of B over A = 5 / 2.5 = 2×
Performance Metric Types
2. Rate Metric
Measures how many tasks are completed per unit time.
Focus: Throughput (higher is better).
Examples:
MIPS (Million Instructions Per Second)
MFLOPS (Million Floating Point Ops/Sec)
Requests/sec in a web server
Use case: Measuring throughput – e.g., "How many images can be processed per
second?"
Example:
CPU A executes 50 million instructions in 1 second → MIPS = 50.
CPU B executes 100 million in 1 second → MIPS = 100.
CPU B has higher throughput.
Amdahl’s Law
Amdahl’s Law gives the theoretical maximum speedup you can
achieve by improving or parallelizing a portion of a system, while the rest
remains unchanged (or serial).
Formula:
Speedup(N) = 1 / ((1 − f) + f/N)
Where:
f: Fraction of the program that can be parallelized.
(1 − f): Fraction that is inherently serial (cannot be parallelized).
N: Number of processors (or speedup factor applied to the parallel part).
Amdahl’s Law
Figure: Illustration of Amdahl’s Law
Amdahl’s Law
Even with infinite processors, the serial part limits the overall
speedup.
Max Speedup = 1 / (1 − f) as N → ∞
Example 1: Simple Calculation
Suppose:
80% of your program can be parallelized (f = 0.8)
20% is serial (1 − f = 0.2)
You use 4 processors ( N = 4)
Then:
Speedup(4) = 1 / ((1 − 0.8) + 0.8/4) = 1 / 0.4 = 2.5
So, the program runs 2.5 times faster with 4 processors.
Amdahl’s Law
Example 2: Infinite Processors
Using the same f = 0.8, if we assume infinite processors:
Speedup(∞) = 1 / (1 − 0.8) = 5
No matter how many processors you use, speedup will never exceed 5x.
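Both worked examples follow from a one-line Python function:

def amdahl_speedup(f, n):
    return 1 / ((1 - f) + f / n)   # serial part + sped-up parallel part

print(amdahl_speedup(0.8, 4))      # 2.5 (Example 1: four processors)
print(1 / (1 - 0.8))               # 5.0 (Example 2: limit as N → ∞)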
Speedup
Speedup measures how much faster a system performs after an
improvement or optimization (hardware, software, or algorithm).
Formula:
Speedup = Execution Time (Before) / Execution Time (After)
Example:
A program originally takes 10 seconds.
After optimizing the code or upgrading hardware, it now takes 4 seconds.
Speedup = 10 / 4 = 2.5
So, your optimization gives you a 2.5× speedup.
Speedup
Speedup from Multiple Enhancements (Extended Amdahl’s
Law)
If multiple parts are improved separately, Amdahl’s Law can be applied
repeatedly or in parts, such as:
Speedup_total = 1 / (P₁/S₁ + P₂/S₂ + ⋯ + Pₙ/Sₙ)
Where Pᵢ are fractions of time and Sᵢ are speedup factors of each part.
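A minimal sketch, assuming the fractions Pᵢ together cover the whole execution time (an unimproved portion gets Sᵢ = 1); the fractions and speedup factors below are hypothetical:

def total_speedup(parts):
    return 1 / sum(p / s for p, s in parts)   # parts: list of (P_i, S_i) pairs

# Hypothetical: 40% of time sped up 2×, 40% sped up 4×, 20% unimproved
print(total_speedup([(0.4, 2), (0.4, 4), (0.2, 1)]))   # 2.0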
Computer Components
Virtually all contemporary computer designs are based on
concepts developed by John von Neumann at the Institute
for Advanced Studies, Princeton. Such a design is referred to as
the von Neumann architecture and is based on three key
concepts:
1. Data and instructions are stored in a single read–write memory.
2. The contents of this memory are addressable by location, without
regard to the type of data contained there.
3. Execution occurs in a sequential fashion (unless explicitly modified)
from one instruction to the next.
Computer Components
1.Processor (CPU)
Performs computations and controls other parts of the computer.
Contains control unit, ALU (arithmetic logic unit), and registers.
2.Main Memory
Temporarily stores data and instructions during processing.
Volatile (data is lost when power is off).
3.I/O Modules
Allow the CPU to communicate with external devices like keyboards, mice,
printers, and disks.
4.System Interconnection
Mechanism for components to communicate.
Usually implemented as a bus or interconnect fabric.
Computer
Components
PC: Contains the address of the next
instruction pair to be fetched from memory
IR: Contains the 8-bit opcode instruction
being executed.
MAR: Specifies the address in memory for
next read or write operation
MBR: contains data to be written into
memory or receives the data from memory
I/O AR: Specifies particular I/O devices
I/O BR: Used for the exchange of data
between an I/O module and the CPU
Figure: Computer Components: Top-Level View
Computer Function
Instruction Fetch and Execute
Interrupt Handling
I/O function
Computer Function
The basic function performed by a computer is execution of a program,
which consists of a set of instructions stored in memory. The processor
does the actual work by executing instructions specified in the program.
In its simplest form, instruction processing consists of two steps:
Fetch: reads the instruction from memory
Execute: executes the fetched instruction
Figure: Basic Instruction Cycle
Computer Function
Fetch Cycle:
At the beginning of each instruction cycle, the processor fetches an
instruction from Memory pointed by a register.
In a typical microprocessor, the register is called Program Counter (PC)
which holds the address of the instruction to be fetched next.
The PC is incremented each time an instruction is fetched (unless told
otherwise)
The fetched instruction is loaded into the Instruction Register.
Computer Function
Execute Cycle
The instruction present in the IR register contains bits that specify the
action to be taken by the processor.
The processor interprets the instruction and performs required actions
such as:
Processor-Memory: Data transfer between processor and memory module.
Processor-I/O: Data may be transferred to or from peripheral devices.
Data processing: The processor may perform some arithmetic or logical
operations on data.
Control: An instruction may specify the sequence of execution be altered.
Example: Jump, Call, etc.
Computer Function
The diagram illustrates the Basic Instruction Cycle, consisting of two primary cycles: the Fetch
Cycle and the Execute Cycle. Here’s a stepwise explanation:
1. Start:
The CPU is initialized, and the Program Counter (PC) is set to the address of the first instruction to be
executed.
2. Fetch Cycle:
The CPU reads the next instruction from memory using the address in the Program Counter.
The fetched instruction is then stored in the Instruction Register (IR).
The Program Counter is incremented to point to the address of the next instruction to be fetched.
3. Execute Cycle:
The instruction in the Instruction Register is decoded and executed.
Depending on the type of instruction, the CPU performs data transfer, arithmetic/logic operations, or
control operations.
If the instruction is a branch or jump, the Program Counter may be modified to point to a different address.
4. Check for Halt Condition:
After executing an instruction, the CPU checks if the halt (HALT) instruction has been reached.
If not, it loops back to the Fetch Cycle to fetch the next instruction.
If the halt instruction is encountered, the cycle terminates, and the CPU stops executing further
instructions.
Instruction Fetch and Execute
Example of Program Execution
Consider an example of a hypothetical machine with a single data register, the
“Accumulator” (AC). Both instructions and data are 16 bits long. The first 4
bits of the instruction represent the opcode, which specifies the operation to
be performed. There can be as many as 2⁴ = 16 different opcodes, and up
to 2¹² = 4096 (4K) words of memory can be directly addressed.
Here:
Registers:
PC: Program Counter
AC: Accumulator
IR: Instruction Register
Partial list of opcodes:
0001: Load AC from memory
0010: Store AC to memory
0101: Add to AC from memory
Instruction Fetch and Execute
Figure: Characteristics of a Hypothetical Machine
Instruction Fetch and Execute
For example, consider a computer in which each instruction occupies one
16-bit word of memory.
Assume that the program counter is set to location 300. The processor will
next fetch the instruction at location 300. On succeeding instruction cycles,
it will fetch instructions from locations 301, 302, 303, and so on.
This sequence may be altered, as explained presently.
Instruction Fetch
and Execute
Figure: Example of Program
Execution (contents of memory
and registers in hexadecimal)
Instruction Fetch and Execute
Three instructions, which can be described as three fetch and three
execute cycles, are required:
1. The PC contains 300, the address of the first instruction. This instruction (the value
1940 in hexadecimal) is loaded into the instruction register IR and the PC is
incremented. Note that this process involves the use of a memory address register
(MAR) and a memory buffer register (MBR). For simplicity, these intermediate
registers are ignored.
2. The first 4 bits (first hexadecimal digit) in the IR indicate that the AC is to be
loaded. The remaining 12 bits (three hexadecimal digits) specify the address (940)
from which data are to be loaded.
3. The next instruction (5941) is fetched from location 301 and the PC is incremented.
4. The old contents of the AC and the contents of location 941 are added and the result
is stored in the AC.
5. The next instruction (2941) is fetched from location 302 and the PC is incremented.
6. The contents of the AC are stored in location 941.
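The whole example can be reproduced with a short Python sketch of this hypothetical machine (the operand values 0003 at location 940 and 0002 at 941 follow the figure; the MAR and MBR are again ignored):

memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}    # program and data, addresses in hex

pc, ac = 0x300, 0
for _ in range(3):                         # three fetch-execute cycles
    ir = memory[pc]                        # fetch instruction into IR
    pc += 1                                # increment PC
    opcode, addr = ir >> 12, ir & 0x0FFF   # 4-bit opcode, 12-bit address
    if opcode == 0x1:                      # 0001: load AC from memory
        ac = memory[addr]
    elif opcode == 0x2:                    # 0010: store AC to memory
        memory[addr] = ac
    elif opcode == 0x5:                    # 0101: add to AC from memory
        ac = (ac + memory[addr]) & 0xFFFF

print(hex(memory[0x941]))                  # 0x5: 0003 + 0002 stored at 941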
Instruction Fetch and Execute
In this example, three instruction cycles, each consisting of a fetch cycle and an execute
cycle, are needed to add the contents of location 940 to the contents of 941. With a more
complex set of instructions, fewer cycles would be needed. Some older processors, for
example, included instructions that contain more than one memory address. Thus the
execution cycle for a particular instruction on such processors could involve more than
one reference to memory. Also, instead of memory references, an instruction may specify
an I/O operation.
For example, the PDP-11 processor includes an instruction, expressed symbolically as
ADD B,A, that stores the sum of the contents of memory locations B and A into memory
location A. A single instruction cycle with the following steps occurs:
Fetch the ADD instruction.
Read the contents of memory location A into the processor.
Read the contents of memory location B into the processor. In order that the contents of A
are not lost, the processor must have at least two registers for storing memory values,
rather than a single accumulator.
Add the two values.
Write the result from the processor to memory location A.
Instruction Cycle State Diagram
Figure: Instruction Cycle State Diagram
Instruction Cycle State Diagram
Instruction address calculation (iac): Determine the address of the next instruction
to be executed. Usually, this involves adding a fixed number to the address of the previous
instruction. For example, if each instruction is 16 bits long and memory is organized into
16-bit words, then add 1 to the previous address. If, instead, memory is organized as
individually addressable 8-bit bytes, then add 2 to the previous address.
Instruction fetch (if): Read instruction from its memory location into the processor.
Instruction operation decoding (iod): Analyze instruction to determine type of
operation to be performed and operand(s) to be used.
Operand address calculation (oac): If the operation involves reference to an operand
in memory or available via I/O, then determine the address of the operand.
Operand fetch (of): Fetch the operand from memory or read it in from I/O.
Data operation (do): Perform the operation indicated in the instruction.
Operand store (os): Write the result into memory or out to I/O.
Instruction Cycle State Diagram
The diagram illustrates the Instruction Cycle State Diagram, representing the
various stages through which a CPU processes an instruction. Here's a stepwise
explanation:
1. Instruction Fetch:
The CPU fetches the instruction from memory using the Program Counter (PC) address.
The instruction is then placed in the Instruction Register (IR).
2. Instruction Address Calculation:
The address of the next instruction is calculated and updated in the Program Counter (PC).
This step ensures that the next instruction can be fetched while the current instruction is being
decoded and executed.
3. Instruction Operation Decoding:
The fetched instruction is decoded to determine the operation to be performed.
This includes identifying the opcode and the operands involved.
4. Operand Address Calculation:
The addresses of the operands are calculated, especially if they are located in memory.
This step is crucial when dealing with indirect addressing modes.
Instruction Cycle State Diagram
5. Operand Fetch:
The CPU retrieves the necessary operands from memory or registers as specified by the
decoded instruction. If multiple operands are required, this step may be repeated.
6. Data Operation:
The actual operation is performed using the fetched operands.
This may involve arithmetic, logical, data transfer, or control operations.
7. Operand Store:
The result of the operation is stored back in memory or registers.
If multiple results are produced, they are stored sequentially.
8. Return to Fetch Cycle:
After storing the result, the cycle returns to the Instruction Fetch state to process the
next instruction.
Interrupts
Virtually all computers provide a mechanism by which other modules
(I/O, memory) may interrupt the normal processing of the processor.
Table below lists the most common classes of interrupts.
Interrupts
Mechanism by which other modules (e.g. I/O) may interrupt normal
sequence of processing
Classes of Interrupt: Program, Timer, I/O, Hardware failure
A program can generate an interrupt as a result of instruction execution, e.g., arithmetic overflow or division by zero
A timer within the processor can interrupt, e.g., to allow the operating system to perform functions such as multitasking
An I/O controller can interrupt to signal completion of an I/O operation
A hardware failure can generate an interrupt, e.g., a memory parity error
Interrupts
The user program performs a series of WRITE calls interleaved with processing.
Code segments 1, 2, and 3 refer to sequences of instructions that do not involve
I/O.
The WRITE calls are to an I/O program that is a system utility and that will
perform the actual I/O operation. The I/O program consists of three sections:
A sequence of instructions, labeled 4 in the figure, to prepare for the actual I/O
operation. This may include copying the data to be output into a special buffer and
preparing the parameters for a device command.
The actual I/O command. Without the use of interrupts, once this command is
issued, the program must wait for the I/O device to perform the requested function
(or periodically poll the device).The program might wait by simply repeatedly
performing a test operation to determine if the I/O operation is done.
A sequence of instructions, labeled 5 in the figure, to complete the operation. This
may include setting a flag indicating the success or failure of the operation.
Interrupt Handler
An interrupt signal is detected
The normal sequence of execution is
suspended
The interrupt-generating device is serviced
by the processor, which branches off to a
program called the interrupt handler
The original code execution sequence is
resumed from the point of suspension
Figure: Transfer of Control via Interrupts
Interrupts
Figure: Program Flow of Control without and with Interrupts
Interrupts and the Instruction Cycle
With interrupts, the processor can be engaged in executing other instructions
while an I/O operation is in progress.
Upon a WRITE system call, control transfers to the I/O program which executes
the I/O command and returns control to the user program, allowing the I/O
operation to proceed concurrently with user program execution.
When an external device is ready, its I/O module sends an interrupt request to
the processor, which invokes the corresponding interrupt handler by suspending
the current program, services the device, and then resumes normal execution.
From the user program’s perspective, an interrupt is a transparent pause in
execution handled entirely by the processor and OS, with execution resuming
automatically at the same point after interrupt processing.
Interrupts and the Instruction Cycle
To accommodate interrupts, an interrupt cycle is added to the instruction cycle as
shown in figure.
Figure: Instruction Cycle with Interrupts
Interrupts and the Instruction Cycle
In the interrupt cycle, the processor checks to see if any interrupts have
occurred, indicated by the presence of an interrupt signal.
If no interrupts are pending, the processor proceeds to the fetch cycle and
fetches the next instruction of the current program.
If an interrupt is pending then the CPU:
Suspends execution of the current program being executed
Saves context
Sets the program counter to the starting address of an interrupt handler routine.
Process the interrupt
Restore context and continue interrupted program.
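A minimal runnable sketch of this fetch-execute-interrupt loop; the toy instruction set, the handler address 100, and the single pending flag are illustrative assumptions, not from the slides:

memory = {0: "INC", 1: "INC", 2: "HALT", 100: "RETI"}   # 100 = assumed handler
pc, acc, running = 0, 0, True
pending_interrupt = True      # pretend a device has already raised a request
saved_pc = None

while running:
    opcode = memory[pc]                         # fetch cycle
    pc += 1
    if opcode == "INC":                         # execute cycle
        acc += 1
    elif opcode == "HALT":
        running = False
    elif opcode == "RETI":                      # handler done: restore context
        pc, pending_interrupt = saved_pc, False
    if pending_interrupt and opcode != "RETI":  # interrupt cycle
        saved_pc = pc                           # save context
        pc = 100                                # branch to interrupt handler

print(acc)   # 2: both INC instructions still ran despite the interruption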
Interrupts and the Instruction Cycle
1.Start: Initialization point where the processor prepares to begin fetching
instructions.
2.Fetch cycle: The processor reads the next instruction from memory
based on the program counter.
3.Execute cycle: The processor executes the fetched instruction while
interrupts are temporarily disabled to avoid disruption.
4.Interrupt cycle: Interrupts are enabled; the processor checks for any
pending interrupt requests and, if found, saves context and transfers
control to the interrupt handler.
5.HALT: Execution is stopped when a HALT instruction is encountered,
suspending all processing until reset or external intervention.
Program Timing:
Short I/O Wait
Figure: Program Timing: Short I/O Wait
Program Timing:
Long I/O Wait
Figure: Program Timing: Long I/O Wait
Instruction Cycle State Diagram, with
Interrupts
Figure: Instruction Cycle State Diagram, with Interrupts
Instruction Cycle State Diagram, with
Interrupts
This state diagram illustrates the instruction cycle of a typical
processor. The instruction cycle is the sequence of steps that the CPU
performs to fetch, decode, and execute an instruction.
Let’s walk through the diagram step-by-step:
1. Instruction Address Calculation
The CPU begins by calculating the address of the next instruction, typically held in the Program
Counter (PC).
This address is used to locate the next instruction to be executed.
2. Instruction Fetch
The instruction located at the address calculated is fetched from memory and loaded into the
Instruction Register (IR).
Instruction Cycle State Diagram, with
Interrupts
3. Instruction Operation Decoding
The fetched instruction is then decoded to understand what operation it specifies (e.g., addition,
subtraction, move).
This step identifies:
The operation to perform
The source and destination operands
Addressing modes
4. Operand Address Calculation
If the instruction requires data (operands), the address of the operand(s) is calculated.
This applies when operands are in memory (not in registers).
5. Operand Fetch
The operand(s) are fetched from their memory locations or registers.
This step may involve multiple operands, depending on the instruction.
Instruction Cycle State Diagram, with
Interrupts
6. Data Operation
The actual computation or data manipulation is performed.
Example operations: add, subtract, AND, OR, shift, etc.
7. Operand Address Calculation (Result Storage)
If the result needs to be stored, the destination address is calculated.
This is often a separate calculation if indirect or indexed addressing is used.
8. Operand Store
The results of the computation are stored at the destination address, either in memory or in a
register.
This step may involve multiple results, especially in vector or string operations.
9. Interrupt Check
The system checks for any pending interrupts (e.g., I/O completion, timer expiration).
If an interrupt is detected, control is transferred to the interrupt handler.
Instruction Cycle State Diagram, with
Interrupts
10. Interrupt (if any)
If an interrupt is present, the system services it.
Once completed, control typically returns to the instruction cycle.
11. Return Paths
If there are no interrupts, the cycle returns to Instruction Address Calculation for the next
instruction.
For string or vector data, the processor may return to Data Operation to process the next
element in the set.
If instruction completes, it fetches the next instruction.
Multiple Interrupts
Two approaches can be taken for dealing with multiple interrupts.
1. Disable interrupt
Ignore further interrupts while processing one interrupt
Interrupts remain pending and are checked after first interrupt has been processed
Interrupts handled in sequence as they occur
2. Define priorities
Low-priority interrupts can be interrupted by higher-priority interrupts
When the higher-priority interrupt has been processed, the processor returns to the
previous interrupt
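A minimal sketch of the priority approach; the device names anticipate the printer/communications/disk example on the following slides, and the numeric priority levels are assumptions:

PRIORITY = {"printer": 2, "disk": 3, "comms": 4}   # higher number = higher priority

stack = []                    # saved contexts of interrupted routines
current, current_level = "user program", 0

def interrupt(device):
    global current, current_level
    if PRIORITY[device] > current_level:         # preempt lower-priority work
        stack.append((current, current_level))   # save context
        current, current_level = device + " ISR", PRIORITY[device]
    # else: the request stays pending until the current ISR finishes

interrupt("printer")    # user program → printer ISR
interrupt("comms")      # printer ISR → comms ISR (higher priority)
interrupt("disk")       # held: disk (3) < comms (4)
print(current, stack)   # comms ISR [('user program', 0), ('printer ISR', 2)]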
Transfer of Control with Multiple
Interrupts
Transfer of Control with Multiple
Interrupts
Transfer of Control with Multiple
Interrupts
Figure: Example Time Sequence of Multiple Interrupts
Time Sequence of Multiple Interrupts
Handling
1. t = 0: User program starts execution.
2. t = 10 – Printer Interrupt: A printer interrupt occurs. The user program state is saved on the stack, and execution switches to the Printer Interrupt Service Routine (ISR).
3. t = 15 – Communications Interrupt: A communications interrupt occurs with higher priority than the printer. The Printer ISR state is pushed onto the stack, and execution switches to the Communications ISR.
4. t = 20 – Disk Interrupt: A disk interrupt occurs, but it has lower priority than the Communications ISR, so it is held until the Communications ISR completes.
5. t = 25 – Completion of Communications ISR: The Communications ISR finishes. The processor restores the state of the Printer ISR, but before execution can resume, the pending disk interrupt is recognized.
6. t = 25 – Disk Interrupt Handling: The Disk ISR is executed, as it has higher priority than the Printer ISR.
7. t = 35 – Completion of Disk ISR: The Disk ISR finishes, and control returns to the Printer ISR.
8. t = 40 – Completion of Printer ISR: The Printer ISR finishes, and control returns to the user program.
Computer Function: I/O Function
The I/O function in a computer system allows data exchange between the
processor and external devices through I/O modules (e.g., disk
controllers). The processor can initiate read or write operations with I/O
modules similarly to memory operations, by specifying the device address.
There are two primary modes of data exchange:
1. Processor-Controlled I/O: The processor directly reads from or writes to the
I/O module, identifying specific devices for data transfer. This is similar to
memory-referencing instructions.
2. Direct Memory Access (DMA): The processor grants control to an I/O module
to perform data transfers directly between memory and I/O devices without
processor intervention. This allows data exchange to occur concurrently with other
CPU tasks, optimizing processing efficiency.
DMA significantly reduces the processor’s involvement in data transfers,
enhancing overall system performance.
Interconnection Structures
A computer consists of a set of components or modules of three basic types
(Processor, memory, I/O) that communicate with each other. There must
be a path for connecting these modules.
The collection of paths connecting the various modules is called the
interconnection structure. The design of this structure will depend on the
exchanges that must be made among modules.
Different types of connections are required for different types of modules
CPU Module
Memory Module
I/O Module
Interconnection Structures
A computer system comprises three core components: processor, memory, and
input/output (I/O) modules. These components must communicate with one
another, and the infrastructure that enables this communication is known as
the interconnection structure.
It acts as the network of pathways through which data and control
signals are exchanged between components.
Interconnection Structures
Figure: Computer Modules
Interconnection Structures
Components and Their Roles
1. Memory Module
Stores data in N words of equal length, each with a unique address.
Supports read and write operations.
Uses address and control signals to specify the operation and its target.
2. I/O Module
Manages interaction with external devices.
Like memory, it supports read and write operations.
Each external device is accessed via a port with a unique address.
I/O modules provide external data paths for device communication and may issue
interrupts to the processor.
3. Processor
Executes instructions, reads/writes data, and controls system operations using
control signals.
Receives interrupts to handle external or internal events.
Interconnection Structures
The interconnection structure must support the following types
of transfers:
1. Memory to Processor: Fetching instructions or data.
2. Processor to Memory: Writing processed data back to memory.
3. I/O to Processor: Processor reads data from an external device.
4. Processor to I/O: Processor sends data to an external device.
5. I/O to/from Memory: Data exchange directly between memory and I/O devices
using Direct Memory Access (DMA), bypassing the processor.
Bus Interconnection
Figure: Bus Interconnection Scheme
Bus Interconnection
1. Definition and Role
A bus is a shared communication pathway connecting multiple devices
in a computer system.
It allows data transmission between components like the CPU,
memory, and I/O devices.
Only one device can transmit at a time to avoid data collisions.
Traditionally dominant in system design, but now more common in
embedded systems (e.g., microcontrollers) than in high-
performance computers, which use point-to-point
interconnections.
Bus Interconnection
2. Structure of a Bus
A bus consists of multiple lines, each capable of transmitting binary data (1s
and 0s).
Buses can transfer data serially (one bit at a time) or in parallel
(multiple bits simultaneously).
An 8-bit bus has 8 lines and can send 8 bits in one operation.
3. Types of System Buses
A System Bus connects major components (CPU, memory, I/O devices).
It typically includes 50–100+ separate lines.
Lines are grouped into three functional categories:
1) Data Lines (Data Bus)
2) Address Lines (Address Bus)
3) Control Lines (Control Bus)
Bus Interconnection
4. Functional Groups
Data Bus
Carries actual data between components.
Width (e.g., 32, 64, 128 bits) determines how much data can be transferred at once.
Wider buses improve performance (e.g., transferring a 64-bit instruction in one vs.
two cycles).
Address Bus
Specifies the source or destination address of data on the data bus.
Width determines maximum memory capacity (e.g., 32-bit address bus supports
2³² = 4 GB memory).
Also used to address I/O ports.
Control Bus
Manages access and use of data and address lines.
Transmits command and timing signals to coordinate operations.
Bus Interconnection
5. Common Control Signals
Signal Function
Memory Read Reads data from a specified memory location.
Memory Write Writes data to a specified memory location.
I/O Read Reads data from an I/O port.
I/O Write Writes data to an I/O port.
Transfer ACK Confirms successful data transfer.
Bus Request Requests control of the bus.
Bus Grant Grants bus access to the requester.
Interrupt Request Signals an interrupt has occurred.
Interrupt ACK Acknowledges the interrupt request.
Clock Synchronizes all bus operations.
Reset Initializes the system.
Bus Interconnection
6. Bus Operation Procedure
To send data:
1. A module requests control of the bus.
2. Once granted, it sends data over the data lines.
To request data:
1. A module requests bus control.
2. Sends a read request over control/address lines.
3. Waits for the target module to place the requested data on the data bus.
Bus Interconnection
A bus is a communication pathway connecting two or more devices. Its key
characteristic is that it is a shared transmission medium: signals transmitted
by any one device are available for reception by all other devices attached to
the bus. If two devices transmit during the same time period, their signals
will overlap and become garbled.
A bus typically consists of multiple communication lines, each capable of
transmitting signals representing binary 1 and binary 0. Computer systems
contain a number of different buses that provide pathways between components
at various levels of the computer system hierarchy.
System bus: a bus that connects major computer components (processor, memory,
I/O). The most common computer interconnection structures are based on the
use of one or more system buses.
The interconnection structure must support the following types of transfers:
Memory to processor: The processor reads an instruction or a unit of data from memory.
Processor to memory: The processor writes a unit of data to memory.
I/O to processor: The processor reads data from an I/O device via an I/O module.
Processor to I/O: The processor sends data to the I/O device.
I/O to or from memory: An I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access.
PCI (Peripheral Component
Interconnect)
1. General Overview
PCI is a high-bandwidth, processor-independent bus used for connecting I/O subsystems
like:
Graphic display adapters
Network interface controllers
Disk controllers
Functions as both a mezzanine and peripheral bus.
Delivers better system performance than earlier bus standards.
2. Speed and Data Capacity
Supports up to 64 data lines at 66 MHz.
Maximum raw data transfer rate:
528 MB/s or 4.224 Gbps.
Speed is not the only benefit; PCI is also cost-effective and requires fewer chips for
implementation.
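The quoted figures follow directly from the bus width and clock rate; a quick arithmetic check:

bits_per_second = 64 * 66e6       # 64 lines × 66 MHz = 4.224 Gbps
print(bits_per_second / 1e9)      # 4.224 (Gbps)
print(bits_per_second / 8 / 1e6)  # 528.0 (MB/s)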
PCI (Peripheral Component
Interconnect)
3. Development and Compatibility
Developed by Intel in 1990 for Pentium systems.
Intel released the patents to the public domain to encourage
adoption.
Formation of the PCI Special Interest Group (PCI SIG) to:
Develop the standard further
Maintain compatibility
Widely used in PCs, workstations, and servers.
Open specification allows products from different vendors to be
interoperable.
PCI (Peripheral Component
Interconnect)
Figure:
Example PCI
Configurations
- Typical
desktop system
PCI (Peripheral Component
Interconnect)
Figure:
Example PCI
Configurations
- Typical server
system
PCI (Peripheral Component
Interconnect)
4. Architectural Flexibility
Supports single and multiple-processor systems.
Uses synchronous timing and centralized arbitration to manage access and coordination.
5. System Integration
In a single-processor system:
A combined DRAM controller and PCI bridge connects the processor to the PCI bus.
The bridge acts as a data buffer, decoupling PCI speed from processor I/O speed.
In a multiprocessor system:
Multiple PCI buses can be linked via bridges.
The system bus connects processors, main memory, and PCI bridges only.
Bridges allow high-speed data transfer while keeping PCI independent of processor speed.
PCI (Peripheral Component
Interconnect)
6. Bus Structure and Signal Lines
PCI supports 32- or 64-bit configurations.
Contains 49 mandatory signal lines, grouped into:
1. System Pins
Handle clock and reset functions.
2. Address and Data Pins
32 lines used for time-multiplexed addresses and data.
Additional lines validate and interpret these signals.
3. Interface Control Pins
Manage timing and coordination between devices (initiators and targets).
4. Arbitration Pins
Each PCI master has its own pair of arbitration lines.
Connects directly to the PCI arbiter, unlike shared signal lines.
5. Error Reporting Pins
Report parity errors and other faults.
PCI (Peripheral Component
Interconnect)
Bus Structure
PCI Commands
Data Transfers
PCI Arbitration
PCI Bus Arbitration is the process by which the PCI bus decides which device
gets control of the bus when multiple devices request it simultaneously.
Assignment#1
Q1) Explain Brief History of Computer Generations.
Q2) Explain the Evolution of the Intel x86 and ARM Architecture.
Q3) Explain PCI Arbitration.
Q4) Compare between PCI and PCIe.
Thank You