1. Computer System Hierarchy
A computer system is organized into multiple layers of memory and processing
units:
1. CPU (Central Processing Unit)
○ Executes instructions.
○ Contains:
■ ALU (Arithmetic Logic Unit): Performs calculations.
■ Control Unit (CU): Manages instruction execution.
■ Registers: Fast storage (e.g., Program Counter, Accumulator).
2. Caches (L1, L2, L3)
○ Small, fast memory to reduce CPU-memory latency.
○ L1 Cache (fastest, inside CPU).
○ L2 Cache (slower, may be inside or outside CPU).
○ L3 Cache (shared among cores).
3. Main Memory (RAM - Random Access Memory)
○ Volatile storage for running programs.
○ Slower than cache but faster than secondary storage.
4. Secondary Memory (HDD/SSD, Optical Drives)
○ Non-volatile storage (persistent).
○ Examples: Hard Disk Drive (HDD), Solid-State Drive (SSD).
5. I/O (Input/Output) Devices
○ Peripherals like keyboard, mouse, monitor, network cards.
○ Managed by device controllers and drivers.
2. Instruction Set Architecture (ISA)
The ISA defines the interface between hardware and software, including:
● Supported instructions (e.g., ADD, LOAD, STORE).
● Data types (integers, floats).
● Memory addressing modes.
● Registers and their usage.
Types of ISA
1. RISC (Reduced Instruction Set Computer)
○ Fewer, simpler instructions.
○ Fixed instruction length.
○ Examples: ARM, MIPS, RISC-V.
○ Advantages:
■ Faster execution (pipelining).
■ Lower power consumption (used in mobile devices).
2. CISC (Complex Instruction Set Computer)
○ Many complex instructions.
○ Variable instruction length.
○ Examples: x86 (Intel, AMD).
○ Advantages:
■ Fewer instructions per program (higher code density).
■ Hardware handles complex operations.
Feature | RISC | CISC
Instruction Set | Small & Simple | Large & Complex
Execution Speed | Faster (pipelined) | Slower (multi-cycle ops)
Power Efficiency | Better (mobile devices) | Worse (desktops/servers)
Code Size | Larger (more instructions) | Smaller (complex ops)
3. Memory Hierarchy & Performance
3.1 Memory Speed & Size Trade-off
Memory Type | Speed | Size | Cost per Bit
Registers | Fastest | Smallest | Highest
Cache (L1, L2, L3) | Very Fast | Small | High
Main Memory (RAM) | Fast | Medium | Moderate
Secondary Storage (SSD/HDD) | Slow | Large | Low
3.2 Locality Principles
1. Temporal Locality: Recently accessed data is likely to be reused.
2. Spatial Locality: Nearby memory locations are likely to be accessed.
Caching works best when programs exhibit good locality.
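To make the two locality principles concrete, the sketch below (illustrative sizes, standard library only) sums a 2-D array first row by row and then column by column. Row-major order touches neighboring elements and so benefits from spatial locality; in CPython the timing gap is smaller than in C because nested lists are not fully contiguous, but the access-pattern idea is the same.

```python
import time

N = 1024
matrix = [[0] * N for _ in range(N)]

def row_major_sum(m):
    # Visits elements in the order they are laid out within each row:
    # consecutive accesses hit nearby addresses (spatial locality).
    total = 0
    for i in range(N):
        for j in range(N):
            total += m[i][j]
    return total

def column_major_sum(m):
    # Jumps to a different row on every access, so consecutive accesses
    # land far apart in memory and reuse cache lines poorly.
    total = 0
    for j in range(N):
        for i in range(N):
            total += m[i][j]
    return total

for fn in (row_major_sum, column_major_sum):
    start = time.perf_counter()
    fn(matrix)
    print(fn.__name__, f"{time.perf_counter() - start:.3f} s")
```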
4. I/O System & Buses
4.1 I/O Communication Methods
1. Programmed I/O
○ CPU polls devices (inefficient).
2. Interrupt-Driven I/O
○ Device interrupts CPU when ready.
3. DMA (Direct Memory Access)
○ I/O device writes directly to memory without CPU.
4.2 System Buses
1. Data Bus: Carries data between CPU and memory.
2. Address Bus: Specifies memory location.
3. Control Bus: Manages operations (read/write signals).
5. Pipelining & Parallelism
5.1 Instruction Pipelining
● Breaks instruction execution into stages (Fetch, Decode, Execute, Writeback).
● Improves throughput (more instructions per cycle).
● Hazards:
○ Structural: Resource conflicts.
○ Data: Dependency between instructions.
○ Control: Branch instructions disrupt flow.
5.2 Parallel Processing
1. Multicore CPUs: Multiple cores execute tasks simultaneously.
2. SIMD (Single Instruction, Multiple Data):
○ Same operation on multiple data points (used in GPUs).
3. MIMD (Multiple Instruction, Multiple Data):
○ Different cores execute different instructions (e.g., multi-threading).
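The SIMD item above can be illustrated with NumPy, which applies one operation across a whole array in a single call. Whether actual SIMD instructions are emitted underneath depends on the NumPy build, so treat this as a sketch of the programming model rather than of the hardware:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.arange(1_000_000, dtype=np.float32)

# Scalar style (SISD idea): one element handled per loop iteration.
scalar_result = [x + y for x, y in zip(a[:5], b[:5])]

# Data-parallel style (SIMD idea): the same operation is applied
# to every element of both arrays in one expression.
vector_result = a + b

print(scalar_result)
print(vector_result[:5])
```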
📘 Introduction to Parallelism
🔹 Goals of Parallelism
Parallelism makes computing faster by performing multiple operations simultaneously.
✅ Key Goals:
● Increase performance (speedup)
● Enhance throughput (tasks per unit time)
● Reduce execution time for large-scale problems
● Improve resource utilization
● Enable concurrent execution of independent tasks
● Solve complex problems that are infeasible on a single processor
🔹 Instruction-Level Parallelism (ILP)
ILP exploits the ability of a CPU to execute multiple instructions simultaneously by
identifying independent instructions.
✅ ILP Techniques:
● Pipelining
● Out-of-order execution
● Branch prediction
● Register renaming
● Speculative execution
Goal: Maximize CPU pipeline usage and reduce idle stages.
🔹 Pipelining
Pipelining breaks instruction execution into multiple stages (fetch, decode, execute,
etc.), allowing overlapping execution.
✅ Key Points:
● Like an assembly line: each stage processes one instruction at a time.
● Increases throughput, not individual instruction speed.
● Best suited when instructions are independent.
🚧 Challenges:
● Hazards (Data, Control, Structural)
● Stalling and bubbles in pipeline
🔁 Example:
Cycle    1    2    3    4    5
Instr 1  IF   ID   EX   MEM  WB
Instr 2       IF   ID   EX   MEM
Instr 3            IF   ID   EX
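A quick back-of-the-envelope check of the throughput claim: with k pipeline stages and n independent instructions, a pipelined machine needs roughly n + k − 1 cycles instead of n × k. The numbers below are illustrative and assume no hazards or stalls.

```python
def cycles_unpipelined(n_instructions: int, stages: int) -> int:
    # Each instruction occupies the whole datapath for `stages` cycles.
    return n_instructions * stages

def cycles_pipelined(n_instructions: int, stages: int) -> int:
    # The first instruction takes `stages` cycles to complete; after that
    # one instruction finishes per cycle (assuming no hazards or stalls).
    return stages + (n_instructions - 1)

n, k = 100, 5
print("Unpipelined:", cycles_unpipelined(n, k), "cycles")
print("Pipelined:  ", cycles_pipelined(n, k), "cycles")
print("Speedup:    ", cycles_unpipelined(n, k) / cycles_pipelined(n, k))
```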
🔹 Superscalar Architecture
A superscalar processor can issue and execute more than one instruction per clock
cycle.
✅ Features:
● Multiple instruction pipelines
● Requires complex scheduling, dependency checks
● Hardware must support parallel decoding, execution, and writeback
⚙️ Example:
Intel Core i7, AMD Ryzen CPUs are superscalar.
🔹 Processor-Level Parallelism (PLP)
Involves using multiple processors or cores to execute different threads or programs
concurrently.
✅ Types:
● Symmetric Multiprocessing (SMP) – all processors share memory and OS (e.g.,
multi-core CPUs)
● Massively Parallel Processing (MPP) – each processor has its own memory,
works independently
🧠 Examples:
● Servers, clusters, modern smartphones, cloud computing
🔹 Multiprocessor System Overview
A multiprocessor system is a system with two or more processors that share tasks
to improve performance and reliability.
✅ Classification:
1. Shared Memory Systems (Tightly Coupled):
○ All CPUs share global memory.
○ Easier to program.
○ Needs cache coherence (e.g., MESI protocol).
2. Distributed Memory Systems (Loosely Coupled):
○ Each CPU has private memory.
○ Communicate via messages (e.g., MPI, Hadoop).
📌 Characteristics:
● High performance
● Scalability
● Fault tolerance
● Expensive hardware and complexity
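A toy contrast of the two programming models classified above, using only Python's standard library: threads updating one shared counter (shared memory, tightly coupled) versus processes that exchange results only through a queue (message passing, loosely coupled). It sketches the models, not real SMP/MPP hardware.

```python
import threading
import multiprocessing as mp

# --- Shared-memory style: all workers update one variable, guarded by a lock.
counter = 0
lock = threading.Lock()

def shared_worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:
            counter += 1

# --- Message-passing style: workers never touch shared state; each one
# computes locally and sends its result over a queue, like MPI ranks.
def message_worker(iterations, queue):
    queue.put(iterations)

if __name__ == "__main__":
    threads = [threading.Thread(target=shared_worker, args=(10_000,)) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("Shared-memory total:", counter)

    q = mp.Queue()
    procs = [mp.Process(target=message_worker, args=(10_000, q)) for _ in range(4)]
    for p in procs: p.start()
    total = sum(q.get() for _ in procs)
    for p in procs: p.join()
    print("Message-passing total:", total)
```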
🧠 Summary Table
Type | Definition | Example
ILP | Parallel execution of instructions in a pipeline | Instruction pipelining
Pipelining | Overlapping stages of instruction execution | 5-stage RISC pipeline
Superscalar | Multiple instructions per cycle | Intel i7, ARM Cortex-A76
PLP | Multiple processors running tasks | Dual-core, quad-core CPUs
Multiprocessor System | System with >1 CPU for better performance | Cloud servers, SMP systems
🔶 Cache Memory Organization
🔹 Concepts:
● Cache Memory: Small, fast memory between CPU and main memory.
● Locality Principle:
○ Temporal Locality: Recently used data is likely to be used again.
○ Spatial Locality: Nearby data will likely be accessed soon.
🔹 Mapping Techniques:
Technique | Description
Direct Mapping | One memory block → one specific cache line (fast, simple, high conflict)
Associative Mapping | Any memory block → any cache line (flexible, but slower and expensive)
Set-Associative Mapping | Compromise: block maps to a set, then associatively to the lines within it
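To make the direct-mapping row concrete, the sketch below splits a byte address into tag, index, and offset fields; the 64-line, 64-byte-block geometry is an assumption chosen purely for illustration.

```python
BLOCK_SIZE = 64      # bytes per cache line (assumed)
NUM_LINES = 64       # lines in a direct-mapped cache (assumed)

OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # log2(64) = 6
INDEX_BITS = NUM_LINES.bit_length() - 1     # log2(64) = 6

def split_address(addr: int):
    offset = addr & (BLOCK_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

# Two addresses with the same index bits map to the same line and evict
# each other (a conflict miss), even if the rest of the cache is empty.
for addr in (0x1234, 0x1234 + NUM_LINES * BLOCK_SIZE):
    tag, index, offset = split_address(addr)
    print(f"addr={addr:#08x} tag={tag:#x} index={index} offset={offset}")
```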
🔶 Techniques to Reduce Cache Misses
1. Block Size Optimization
2. Higher Associativity
3. Multilevel Caches (L1, L2, L3)
4. Victim Caches: Small buffer to hold recently evicted blocks
5. Prefetching: Load data before it’s needed
6. Write Policies:
○ Write-Through: Updates main memory immediately
○ Write-Back: Updates only in cache, written to memory later
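A minimal sketch of the difference between the two write policies, using an assumed single-block "cache" that only counts main-memory writes: write-through touches memory on every store, while write-back marks the block dirty and touches memory once, on eviction.

```python
class OneBlockCache:
    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.dirty = False
        self.memory_writes = 0   # how many times main memory is updated

    def store(self, value):
        self.value = value               # data always lands in the cache block
        if self.write_back:
            self.dirty = True            # defer the memory update
        else:
            self.memory_writes += 1      # write-through: update memory now

    def evict(self):
        if self.write_back and self.dirty:
            self.memory_writes += 1      # one write covers all earlier stores
            self.dirty = False

for policy, wb in (("write-through", False), ("write-back", True)):
    cache = OneBlockCache(write_back=wb)
    for v in range(10):                  # ten stores to the same block
        cache.store(v)
    cache.evict()
    print(policy, "-> memory writes:", cache.memory_writes)
```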
🔶 Virtual Memory Organization
🔹 Concepts:
● Allows execution of processes not fully in memory
● Virtual Address → Physical Address via Page Table
🔹 Mapping Techniques:
Mapping | Explanation
Paging | Divides memory into fixed-size pages (no external fragmentation)
Segmentation | Divides memory by logical divisions (code/data/stack)
Paged Segmentation | Combines paging & segmentation
🔶 Memory Management Techniques
1. Page Tables: Store mapping of virtual to physical addresses
○ Inverted Page Table, Multilevel Page Table
2. TLB (Translation Lookaside Buffer): Fast lookup cache for page table entries
3. Protection & Sharing: Done via segment/page permissions
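A minimal sketch of virtual-to-physical translation, with a dictionary standing in for the page table and a smaller one for the TLB; the 4 KB page size and the frame numbers are invented for illustration.

```python
PAGE_SIZE = 4096                     # assumed 4 KB pages

page_table = {0: 5, 1: 9, 2: 1}      # virtual page number -> physical frame
tlb = {}                             # small cache of recent translations

def translate(vaddr: int) -> int:
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                   # TLB hit: no page-table walk needed
        frame = tlb[vpn]
    elif vpn in page_table:          # TLB miss: walk the page table
        frame = page_table[vpn]
        tlb[vpn] = frame             # cache the translation for next time
    else:
        raise RuntimeError("page fault: page not in physical memory")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1234)))        # page 1, offset 0x234 -> frame 9
print(hex(translate(0x1238)))        # second access hits the TLB
```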
🔶 Memory Replacement Policies
Used when a page must be swapped out from memory:
Policy | Strategy
FIFO | Oldest page is replaced
LRU (Least Recently Used) | Page not used for the longest time is replaced
Optimal | Replace the page that will not be used for the longest time (ideal but not practical)
Clock Algorithm | Approximates LRU using a circular queue
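To see these policies act on the same workload, the short simulation below counts page faults for FIFO and LRU with three frames; the reference string is a made-up example (the classic sequence used to demonstrate Belady's anomaly).

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    memory, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.discard(queue.popleft())   # evict the oldest page
            memory.add(page)
            queue.append(page)
    return faults

def lru_faults(refs, frames):
    memory, faults = OrderedDict(), 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)              # mark as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)        # evict least recently used
            memory[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print("FIFO faults:", fifo_faults(refs, 3))
print("LRU faults: ", lru_faults(refs, 3))
```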
✅ Recommended YouTube Videos:
● 📺 Gate Smashers – Cache Memory & Virtual Memory
I/O Organization & Memory Hierarchy: Theory
1. I/O Organization
1.1 Programmed I/O
● The CPU directly controls I/O operations by polling device status registers.
● Disadvantages:
○ CPU remains busy in a loop, wasting cycles.
○ Inefficient for slow devices.
1.2 Interrupt-Driven I/O
● Devices interrupt the CPU when they are ready for data transfer.
● Advantages:
○ CPU can perform other tasks while waiting.
○ More efficient than polling.
● Disadvantages:
○ Overhead of handling frequent interrupts.
1.3 Direct Memory Access (DMA)
● DMA Controller transfers data between I/O devices and memory without CPU
intervention.
● Steps:
1. CPU initializes DMA (source, destination, size).
2. DMA takes control of the bus.
3. Data is transferred directly.
4. DMA interrupts CPU upon completion.
● Used for: High-speed transfers (e.g., disk I/O, network packets).
1.4 Synchronous vs. Asynchronous Data Transfer
Feature | Synchronous | Asynchronous
Clock | Uses a common clock | No common clock
Speed | Fixed rate | Variable rate
Handshaking | Not required | Required (STROBE, ACK)
Example | CPU-RAM communication | Keyboard input
2. Memory Hierarchy
2.1 Principle of Locality
● Temporal Locality: Recently accessed data is likely to be accessed again.
● Spatial Locality: Nearby memory locations are likely to be accessed soon.
● Exploited by: Caching, prefetching.
2.2 Memory Hierarchy Levels
Level | Type | Speed | Cost | Size
Registers | Inside CPU | Fastest (~1 ns) | Highest | Few KB
Cache (L1, L2, L3) | SRAM | 1-10 ns | High | KB-MB
Main Memory | Volatile (DRAM) | 50-100 ns | Moderate | GB
Secondary (SSD/HDD) | Non-volatile | ~100,000 ns | Low | TB
Tertiary (Tape/Cloud) | Archival | Very slow | Very low | PB
2.3 Cache Memory
● Purpose: Reduce average memory access time (AMAT).
● Mapping Techniques:
○ Direct Mapped: Each block maps to one cache line.
○ Fully Associative: Block can go anywhere in cache.
○ Set-Associative: Block maps to a set of lines (compromise between
direct & fully associative).
● Replacement Policies:
○ LRU (Least Recently Used)
○ FIFO (First-In-First-Out)
○ Random Replacement
● Write Policies:
○ Write-Through: Data written to cache and main memory.
○ Write-Back: Data written only to cache; main memory updated on
eviction.
2.4 Virtual Memory
● Uses disk space to extend RAM.
● Paging: Divides memory into fixed-size blocks (pages).
● TLB (Translation Lookaside Buffer): Caches page table entries for faster
address translation.
● Thrashing: Excessive page faults due to insufficient RAM.
3. Key Formulas
1. Average Memory Access Time (AMAT):
AMAT = Hit Time + Miss Rate × Miss Penalty
2. Miss Rate:
Miss Rate = Number of Misses / Total Accesses
3. Effective Access Time (Virtual Memory):
EAT = (1 − p) × Memory Access Time + p × Page Fault Time
○ p = Page fault rate.
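These formulas can be sanity-checked with a quick calculation; the hit time, miss rate, penalty, and page-fault figures below are invented numbers used only to show the arithmetic.

```python
# AMAT = Hit Time + Miss Rate × Miss Penalty
hit_time = 1.0        # ns, assumed L1 hit time
miss_rate = 0.05      # 5% of accesses miss (assumed)
miss_penalty = 100.0  # ns to fetch the block from main memory (assumed)
amat = hit_time + miss_rate * miss_penalty
print(f"AMAT = {amat} ns")          # 1 + 0.05 * 100 = 6 ns

# EAT = (1 - p) × Memory Access Time + p × Page Fault Time
p = 1e-6                            # page fault rate (assumed)
memory_access = 100.0               # ns (assumed)
page_fault_time = 8_000_000.0       # ns, i.e. 8 ms to service a fault (assumed)
eat = (1 - p) * memory_access + p * page_fault_time
print(f"EAT  = {eat:.2f} ns")
```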
4. Summary Table
Concept | Key Points
Programmed I/O | CPU polls devices (inefficient).
Interrupt-Driven I/O | Devices interrupt CPU (better efficiency).
DMA | Direct transfers without CPU (fastest).
Synchronous Transfer | Uses clock (e.g., CPU-RAM).
Asynchronous Transfer | Uses handshaking (e.g., keyboard).
Temporal Locality | Reuse of recent data.
Spatial Locality | Access nearby data.
Cache Mapping | Direct, Associative, Set-Associative.
Write Policies | Write-Through vs. Write-Back.
Virtual Memory | Extends RAM using disk.
1. General System Architecture
1. Which component executes arithmetic and logic operations?
a) Cache
b) ALU (Arithmetic Logic Unit)
c) Control Unit
d) I/O Controller
Answer: b) ALU
2. RISC architectures are characterized by:
a) Complex instructions
b) Fewer instructions and fixed-length formats
c) Variable-length instructions
d) Microprogrammed control
Answer: b) Fewer instructions and fixed-length formats
3. CISC vs. RISC: Which typically uses more registers?
a) CISC
b) RISC
c) Both use the same number
d) Depends on the OS
Answer: b) RISC
4. The Control Unit (CU) is responsible for:
a) Performing arithmetic operations
b) Decoding instructions and generating control signals
c) Storing data temporarily
d) Managing I/O devices
Answer: b) Decoding instructions
5. Secondary memory (e.g., HDD) is:
a) Faster than cache
b) Non-volatile
c) Part of the CPU
d) Used only for ALU operations
Answer: b) Non-volatile
2. I/O Organization & Memory Hierarchy
6. Programmed I/O is inefficient because:
a) It uses DMA
b) The CPU polls devices in a loop
c) It relies on interrupts
d) It bypasses the CPU
Answer: b) CPU polls devices
7. DMA improves performance by:
a) Increasing cache size
b) Allowing direct transfers between I/O and memory
c) Using more interrupts
d) Slowing down the CPU
Answer: b) Direct transfers
8. Synchronous data transfer requires:
a) Handshaking signals
b) A common clock
c) No clock
d) Only interrupts
Answer: b) Common clock
9. Temporal locality refers to:
a) Accessing nearby memory locations
b) Reusing recently accessed data
c) Random memory access
d) Disk I/O patterns
Answer: b) Reusing recent data
10.The fastest memory in the hierarchy is:
a) HDD
b) DRAM
c) L1 Cache
d) SSD
Answer: c) L1 Cache
3. Cache Memory & Virtual Memory
11.A direct-mapped cache places a block at:
a) Any empty location
b) A unique location based on its address
c) Only the first location
d) Randomly
Answer: b) Unique location
12.Fully associative cache allows:
a) A block to be placed anywhere
b) Fixed placement
c) Only sequential access
d) No replacement
Answer: a) Anywhere
13.LRU (Least Recently Used) is a:
a) Cache replacement policy
b) Interrupt handler
c) DMA technique
d) CPU scheduling algorithm
Answer: a) Replacement policy
14.Write-through cache ensures:
a) Data is written only to cache
b) Data is written to cache and memory simultaneously
c) Data is written only on eviction
d) Data is never written to memory
Answer: b) Cache and memory
15.Virtual memory uses:
a) Cache to extend RAM
b) Disk to extend RAM
c) Registers to extend RAM
d) DMA to extend RAM
Answer: b) Disk to extend RAM
4. Parallelism
16.Pipelining improves performance by:
a) Executing multiple instructions simultaneously in stages
b) Increasing clock speed
c) Reducing cache size
d) Eliminating interrupts
Answer: a) Simultaneous stages
17.Superscalar architectures can:
a) Execute multiple instructions per cycle
b) Only execute one instruction per cycle
c) Replace cache memory
d) Slow down the CPU
Answer: a) Multiple instructions/cycle
18.Multiprocessor systems share:
a) Only I/O devices
b) Memory and I/O devices
c) Only registers
d) Nothing
Answer: b) Memory and I/O
19.Instruction-level parallelism (ILP) is achieved via:
a) Pipelining and superscalar execution
b) Increasing disk size
c) Reducing cache misses
d) Using more interrupts
Answer: a) Pipelining/superscalar
20.A SIMD (Single Instruction Multiple Data) architecture is used in:
a) GPUs
b) Hard disks
c) Keyboards
d) Printers
Answer: a) GPUs
Advanced Questions (21–40)
21.Which cache mapping technique has no conflict misses?
a) Direct-mapped
b) Fully associative
c) Set-associative
d) None
Answer: b) Fully associative
22.The TLB (Translation Lookaside Buffer) speeds up:
a) Virtual-to-physical address translation
b) DMA transfers
c) Interrupt handling
d) Disk I/O
Answer: a) Address translation
23.Thrashing occurs when:
a) Cache hits increase
b) Excessive page faults degrade performance
c) DMA is overused
d) The CPU is idle
Answer: b) Excessive page faults
24.Non-uniform memory access (NUMA) means:
a) All memory accesses take the same time
b) Memory access times vary by location
c) Cache is unused
d) DMA is disabled
Answer: b) Variable access times
25.A write-back cache updates main memory:
a) Immediately
b) Only when the block is evicted
c) Never
d) Randomly
Answer: b) On eviction
26.The goal of parallelism is to:
a) Increase throughput
b) Reduce memory size
c) Slow down execution
d) Eliminate caches
Answer: a) Increase throughput
27.In a 4-way set-associative cache, each set has:
a) 1 block
b) 4 blocks
c) 8 blocks
d) Unlimited blocks
Answer: b) 4 blocks
28.Which is not a memory hierarchy level?
a) Registers
b) L1 Cache
c) SSD
d) Power supply
Answer: d) Power supply
29.The AMAT formula includes:
a) Hit time + Miss rate × Miss penalty
b) Hit rate × Miss time
c) Cache size × Access time
d) Miss rate × Hit penalty
Answer: a) Hit time + Miss rate × Miss penalty
30.SIMD is useful for:
a) Image processing
b) Keyboard input
c) Printer output
d) Interrupt handling
Answer: a) Image processing
31.A multicore processor is an example of:
a) Thread-level parallelism
b) Instruction-level parallelism
c) Data-level parallelism
d) Disk-level parallelism
Answer: a) Thread-level parallelism
32.Which is not a replacement policy?
a) LRU
b) FIFO
c) DMA
d) Random
Answer: c) DMA
33.The tag field in a cache address is used to:
a) Identify the memory block
b) Replace the block
c) Disable the cache
d) Speed up DMA
Answer: a) Identify the block
34.An interrupt vector table contains:
a) Addresses of interrupt handlers
b) Cache blocks
c) DMA requests
d) Virtual memory pages
Answer: a) Interrupt handler addresses
35.Which is not a goal of pipelining?
a) Increase instruction throughput
b) Reduce CPI (Cycles Per Instruction)
c) Eliminate memory hierarchy
d) Overlap instruction execution
Answer: c) Eliminate memory hierarchy
36.A hazard in pipelining occurs due to:
a) Resource conflicts
b) Perfect branch prediction
c) Zero cache misses
d) No interrupts
Answer: a) Resource conflicts
37.VLIW (Very Long Instruction Word) architectures rely on:
a) The compiler to schedule parallel instructions
b) Dynamic hardware scheduling
c) Reducing cache size
d) Increasing interrupts
Answer: a) Compiler scheduling
38.In a multiprocessor system, cache coherence ensures:
a) All CPUs see the same memory value
b) Each CPU has its own memory
c) DMA is disabled
d) No interrupts occur
Answer: a) Consistent memory views
39.Which is not a parallel processing technique?
a) Pipelining
b) Superscalar execution
c) Multithreading
d) Polling
Answer: d) Polling
40.The principle of locality justifies the use of:
a) Caches
b) Interrupts
c) DMA
d) Power supplies
Answer: a) Caches
5. Advanced Cache & Memory Management
41.Which cache mapping technique requires a replacement policy?
a) Direct-mapped
b) Fully associative
c) Set-associative
d) Both b and c
Answer: d) Both b and c
Explanation: Direct-mapped has no choice (1 block per set), while associative
mappings need replacement policies.
42.A cache with 64 blocks and 4 blocks per set is:
a) Direct-mapped
b) 4-way set-associative
c) Fully associative
d) 16-way set-associative
Answer: b) 4-way set-associative
Explanation: Number of sets = 64/4 = 16.
43.The "dirty bit" in a cache block indicates:
a) The block is corrupted
b) The block has been modified but not written to memory
c) The block is unused
d) The block is locked
Answer: b) Modified but not written to memory (used in write-back caches).
44.Which reduces conflict misses?
a) Larger cache size
b) Higher associativity
c) Smaller block size
d) Disabling cache
Answer: b) Higher associativity
Explanation: More blocks per set reduce collisions.
45.Virtual memory is implemented using:
a) Paging or segmentation
b) Only paging
c) Only segmentation
d) DMA
Answer: a) Paging or segmentation
6. Parallelism & Multiprocessing
46.In a pipelined processor, a "bubble" occurs due to:
a) A cache hit
b) A branch misprediction
c) A DMA transfer
d) An interrupt
Answer: b) Branch misprediction
Explanation: Pipeline stalls until the correct instruction is fetched.
47.SIMD (Single Instruction Multiple Data) is used in:
a) Vector processors
b) GPUs
c) Both a and b
d) Hard disks
Answer: c) Both a and b
48.Multithreading improves CPU utilization by:
a) Overlapping I/O waits with execution
b) Increasing clock speed
c) Reducing cache size
d) Disabling interrupts
Answer: a) Overlapping I/O waits
49.In a multicore system, shared L3 cache is an example of:
a) Uniform Memory Access (UMA)
b) Non-Uniform Memory Access (NUMA)
c) Message-passing architecture
d) SIMD architecture
Answer: a) UMA
Explanation: All cores access L3 cache with equal latency.
50.Which is not a parallel processing architecture?
a) MIMD (Multiple Instruction Multiple Data)
b) SISD (Single Instruction Single Data)
c) RAID (Redundant Array of Disks)
d) MISD (Multiple Instruction Single Data)
Answer: c) RAID
7. I/O Systems & Interrupts
51.An ISR (Interrupt Service Routine) is:
a) A program that handles interrupts
b) A cache replacement policy
c) A DMA controller
d) A type of memory
Answer: a) Interrupt handler
52.Which I/O method is most suitable for high-speed devices?
a) Programmed I/O
b) Interrupt-driven I/O
c) DMA
d) Polling
Answer: c) DMA
53.In vectored interrupts:
a) The device supplies the interrupt handler address
b) The CPU polls devices
c) No ISR is used
d) DMA is required
Answer: a) Device supplies the address
54.A "maskable" interrupt can be:
a) Ignored or delayed by the CPU
b) Only handled by DMA
c) Generated only by the clock
d) Never serviced
Answer: a) Ignored/delayed
55.Which is not a disk scheduling algorithm?
a) FCFS (First-Come First-Served)
b) SCAN (Elevator)
c) LRU (Least Recently Used)
d) SSTF (Shortest Seek Time First)
Answer: c) LRU
8. Advanced Memory Concepts
56.The "working set" of a process is:
a) The set of pages actively used in a time interval
b) The total memory allocated
c) The cache size
d) The disk buffer
Answer: a) Actively used pages
57.Belady’s anomaly occurs in:
a) FIFO page replacement
b) LRU page replacement
c) Optimal page replacement
d) Random replacement
Answer: a) FIFO
Explanation: Increasing frames can increase page faults.
58.A "TLB miss" results in:
a) A page table walk
b) A cache eviction
c) A DMA transfer
d) An interrupt
Answer: a) Page table walk
59.Inverted page tables are used to:
a) Reduce memory overhead for large address spaces
b) Increase cache size
c) Speed up DMA
d) Disable interrupts
Answer: a) Reduce memory overhead
60.The "compulsory miss" rate can be reduced by:
a) Larger block size
b) Prefetching
c) Higher associativity
d) Smaller cache
Answer: b) Prefetching
9. Processor Architecture
61.The "data hazard" in pipelining occurs when:
a) An instruction depends on the result of a previous instruction
b) The cache is full
c) A branch is mispredicted
d) An interrupt occurs
Answer: a) Instruction dependency
62.Out-of-order execution is used to:
a) Hide pipeline stalls
b) Reduce cache size
c) Disable interrupts
d) Slow down the CPU
Answer: a) Hide stalls
63.A "superscalar" processor can:
a) Issue multiple instructions per cycle
b) Only execute scalar operations
c) Replace cache with disk
d) Eliminate pipelines
Answer: a) Multiple instructions/cycle
64.The "branch target buffer (BTB)" helps in:
a) Predicting branch addresses
b) Reducing cache misses
c) Managing DMA
d) Handling interrupts
Answer: a) Branch prediction
65.Which is not a type of pipeline hazard?
a) Structural
b) Data
c) Control
d) Virtual
Answer: d) Virtual
10. System Performance & Optimization
66.The "CPI" (Cycles Per Instruction) is improved by:
a) Pipelining
b) Increasing cache misses
c) Slowing the clock
d) Disabling interrupts
Answer: a) Pipelining
67."False sharing" in multicore systems occurs when:
a) Cores modify different variables in the same cache line
b) The cache is disabled
c) DMA is overused
d) No interrupts are used
Answer: a) Different variables in the same cache line
68.The "Amdahl’s Law" states that speedup is limited by:
a) The sequential portion of a program
b) The cache size
c) The number of interrupts
d) The disk speed
Answer: a) Sequential portion
69."Speculative execution" involves:
a) Executing instructions before knowing if they are needed
b) Disabling pipelines
c) Reducing cache size
d) Ignoring branches
Answer: a) Early execution
70.Which is not a memory coherence protocol?
a) MESI (Modified, Exclusive, Shared, Invalid)
b) MOESI (Modified, Owned, Exclusive, Shared, Invalid)
c) LRU (Least Recently Used)
d) Directory-based
Answer: c) LRU
11. Mixed Advanced Topics
71.The "NUCA" (Non-Uniform Cache Access) architecture is used in:
a) Large multicore caches
b) Hard disks
c) Interrupt controllers
d) Power supplies
Answer: a) Multicore caches
72."Way prediction" in caches aims to:
a) Reduce access time by guessing the correct cache way
b) Increase miss rate
c) Disable associativity
d) Slow down hits
Answer: a) Guess cache way
73.The "store buffer" in a CPU is used to:
a) Hold store instructions until retirement
b) Replace the cache
c) Manage interrupts
d) Disable pipelines
Answer: a) Hold stores
74."Memory-mapped I/O" means:
a) I/O devices are accessed like memory locations
b) I/O uses separate instructions
c) Cache is unused
d) DMA is disabled
Answer: a) I/O as memory
75.The "interrupt latency" is the time between:
a) Interrupt occurrence and servicing
b) Two DMA transfers
c) Cache misses
d) Pipeline stalls
Answer: a) Interrupt and servicing
12. Final Questions (76–80)
76."Cache coloring" is used to:
a) Reduce conflict misses by partitioning cache
b) Increase power consumption
c) Disable TLB
d) Slow down memory
Answer: a) Reduce conflicts
77.The "ROB" (ReOrder Buffer) helps in:
a) Out-of-order execution and retirement
b) Increasing cache misses
c) Disabling interrupts
d) Reducing disk speed
Answer: a) Out-of-order execution
78."EPIC" (Explicitly Parallel Instruction Computing) is used in:
a) Itanium processors
b) GPUs
c) Hard disks
d) Keyboards
Answer: a) Itanium
79.The "memory barrier" instruction ensures:
a) Ordering of memory operations
b) Cache eviction
c) DMA completion
d) Interrupt masking
Answer: a) Memory ordering
80."Transactional Memory" simplifies parallel programming by:
a) Grouping instructions into atomic transactions
b) Disabling caches
c) Increasing interrupts
d) Slowing pipelines
Answer: a) Atomic transactions
81.Which cache write policy guarantees memory consistency but has higher
latency?
a) Write-through
b) Write-back
c) Write-around
d) No-write allocate
Answer: a) Write-through
Explanation: Write-through updates both cache and main memory
immediately, ensuring consistency at the cost of slower writes.
82.A "compulsory miss" in a cache occurs because:
a) The block is accessed for the first time
b) The cache is too small
c) The replacement policy failed
d) The TLB is full
Answer: a) First-time access
Explanation: Also called a "cold miss," it happens when data is loaded into
cache for the first time.
83.Which technique reduces conflict misses in a direct-mapped cache?
a) Increasing block size
b) Using a victim cache
c) Disabling prefetching
d) Reducing associativity
Answer: b) Victim cache
Explanation: A small fully associative cache that holds recently evicted blocks
to mitigate conflicts.
84.In virtual memory, a "page fault" occurs when:
a) A page is not in physical memory (RAM)
b) The TLB is full
c) The cache is flushed
d) DMA is initiated
Answer: a) Page not in RAM
Explanation: Requires loading the page from disk into RAM.
85.The "working set model" is used to:
a) Determine how much memory a process needs to avoid thrashing
b) Calculate cache hit rates
c) Schedule DMA transfers
d) Design CPU pipelines
Answer: a) Prevent thrashing
Explanation: Tracks the set of pages a process actively uses to optimize
memory allocation.
14. Parallel Architectures
86.In a SIMD architecture, a single instruction operates on:
a) Multiple data elements simultaneously
b) A single data element
c) Only scalar values
d) Cache lines
Answer: a) Multiple data elements
Explanation: Used in vector processors (e.g., GPU shader cores).
87.Which is a characteristic of MIMD (Multiple Instruction Multiple Data)
systems?
a) All cores execute the same instruction
b) Cores execute different instructions on different data
c) Only one core is active at a time
d) No shared memory
Answer: b) Different instructions/data
Explanation: Examples: Multicore CPUs, distributed systems.
88."Cache coherence" in multicore systems ensures:
a) All cores see the same value for a memory location
b) Caches are disabled
c) Each core has a private memory
d) DMA is used for all accesses
Answer: a) Consistent memory views
Explanation: Protocols like MESI enforce coherence.
89.The "fork-join" parallelism model is commonly used in:
a) Multithreaded programming (e.g., OpenMP)
b) GPU computing
c) Interrupt handling
d) Disk scheduling
Answer: a) Multithreading
Explanation: Tasks are forked into parallel threads and joined afterward.
90.Which is not a parallel programming challenge?
a) Race conditions
b) Deadlocks
c) False sharing
d) Sequential execution
Answer: d) Sequential execution
15. Advanced Pipelining & Hazards
91.A "control hazard" in pipelining occurs due to:
a) Branches or jumps
b) Data dependencies
c) Cache misses
d) DMA transfers
Answer: a) Branches
Explanation: The pipeline may fetch wrong instructions until the branch is
resolved.
92."Tomasulo’s algorithm" handles:
a) Out-of-order execution with register renaming
b) Cache replacement
c) Virtual memory paging
d) Interrupt prioritization
Answer: a) Out-of-order execution
Explanation: Dynamically schedules instructions to avoid stalls.
93.The "reorder buffer" (ROB) is used to:
a) Commit instructions in program order
b) Replace cache blocks
c) Manage interrupts
d) Schedule DMA
Answer: a) In-order commitment
Explanation: Ensures speculative execution results are finalized correctly.
94."Speculative execution" can lead to:
a) Performance gains
b) Security vulnerabilities (e.g., Spectre)
c) Both a and b
d) Neither
Answer: c) Both
Explanation: Improves performance but may leak data via side channels.
95.Which is not a branch prediction technique?
a) Static prediction (always-taken)
b) Dynamic prediction (branch history table)
c) Random replacement
d) Tournament predictors
Answer: c) Random replacement
16. I/O & System Integration
96."Memory-mapped I/O" differs from "port-mapped I/O" in that:
a) I/O devices appear as memory addresses
b) Special instructions (e.g., IN/OUT) are used
c) DMA is required
d) Only interrupts are used
Answer: a) I/O as memory addresses
Explanation: Simplifies programming by using load/store instructions.
97.A "vectored interrupt" system:
a) Directly provides the ISR address
b) Requires polling
c) Uses only DMA
d) Ignores interrupts
Answer: a) Provides ISR address
Explanation: Faster than non-vectored interrupts (no lookup needed).
98."Cycle stealing" in DMA refers to:
a) Using CPU cycles for transfers when the bus is idle
b) Halting the CPU indefinitely
c) Disabling caches
d) Skipping interrupts
Answer: a) Opportunistic bus access
Explanation: DMA "steals" cycles without fully stalling the CPU.
99.Which is not a disk scheduling algorithm?
a) SCAN (elevator)
b) C-SCAN (circular SCAN)
c) FIFO
d) LRU
Answer: d) LRU
Explanation: LRU is a cache/page replacement policy.
100. The "RAID 5" configuration provides:
a) Striping with distributed parity
b) Mirroring without parity
c) No redundancy
d) Double striping
Answer: a) Striping + distributed parity
Explanation: Balances performance and fault tolerance.