1. Computer System Hierarchy
A computer system is organized into multiple layers of memory and processing
units:
1. CPU (Central Processing Unit)
○ Executes instructions.
○ Contains:
■ ALU (Arithmetic Logic Unit): Performs calculations.
■ Control Unit (CU): Manages instruction execution.
■ Registers: Fast storage (e.g., Program Counter, Accumulator).
2. Caches (L1, L2, L3)
○ Small, fast memory to reduce CPU-memory latency.
○ L1 Cache (fastest, inside CPU).
○ L2 Cache (slower, may be inside or outside CPU).
○ L3 Cache (shared among cores).
3. Main Memory (RAM - Random Access Memory)
○ Volatile storage for running programs.
○ Slower than cache but faster than secondary storage.
4. Secondary Memory (HDD/SSD, Optical Drives)
○ Non-volatile storage (persistent).
○ Examples: Hard Disk Drive (HDD), Solid-State Drive (SSD).
5. I/O (Input/Output) Devices
○ Peripherals like keyboard, mouse, monitor, network cards.
○ Managed by device controllers and drivers.
2. Instruction Set Architecture (ISA)
The ISA defines the interface between hardware and software, including:
● Supported instructions (e.g., ADD, LOAD, STORE).
● Data types (integers, floats).
● Memory addressing modes.
● Registers and their usage.
Types of ISA
1. RISC (Reduced Instruction Set Computer)
○ Fewer, simpler instructions.
○ Fixed instruction length.
○ Examples: ARM, MIPS, RISC-V.
○ Advantages:
■ Faster execution (pipelining).
■ Lower power consumption (used in mobile devices).
2. CISC (Complex Instruction Set Computer)
○ Many complex instructions.
○ Variable instruction length.
○ Examples: x86 (Intel, AMD).
○ Advantages:
■ Fewer instructions per program (higher code density).
■ Hardware handles complex operations.
Feature | RISC | CISC
Instruction Set | Small & Simple | Large & Complex
Execution Speed | Faster (pipelined) | Slower (multi-cycle ops)
Power Efficiency | Better (mobile devices) | Worse (desktops/servers)
Code Size | Larger (more instructions) | Smaller (complex ops)
3. Memory Hierarchy & Performance
3.1 Memory Speed & Size Trade-off
Memory Type | Speed | Size | Cost per Bit
Registers | Fastest | Smallest | Highest
Cache (L1, L2, L3) | Very Fast | Small | High
Main Memory (RAM) | Fast | Medium | Moderate
Secondary Storage (SSD/HDD) | Slow | Large | Low
3.2 Locality Principles
1. Temporal Locality: Recently accessed data is likely to be reused.
2. Spatial Locality: Nearby memory locations are likely to be accessed.
Caching works best when programs exhibit good locality.
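To make the two locality principles concrete, the sketch below (illustrative sizes, standard library only) sums a 2-D array first row by row and then column by column. Row-major order touches neighboring elements and so benefits from spatial locality; in CPython the timing gap is smaller than in C because nested lists are not fully contiguous, but the access-pattern idea is the same.

```python
import time

N = 1024
matrix = [[0] * N for _ in range(N)]

def row_major_sum(m):
    # Visits elements in the order they are laid out within each row:
    # consecutive accesses hit nearby addresses (spatial locality).
    total = 0
    for i in range(N):
        for j in range(N):
            total += m[i][j]
    return total

def column_major_sum(m):
    # Jumps to a different row on every access, so consecutive accesses
    # land far apart in memory and reuse cache lines poorly.
    total = 0
    for j in range(N):
        for i in range(N):
            total += m[i][j]
    return total

for fn in (row_major_sum, column_major_sum):
    start = time.perf_counter()
    fn(matrix)
    print(fn.__name__, f"{time.perf_counter() - start:.3f} s")
```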
4. I/O System & Buses
4.1 I/O Communication Methods
1. Programmed I/O
○ CPU polls devices (inefficient).
2. Interrupt-Driven I/O
○ Device interrupts CPU when ready.
3. DMA (Direct Memory Access)
○ I/O device writes directly to memory without CPU.
4.2 System Buses
1. Data Bus: Carries data between CPU and memory.
2. Address Bus: Specifies memory location.
3. Control Bus: Manages operations (read/write signals).
5. Pipelining & Parallelism
5.1 Instruction Pipelining
● Breaks instruction execution into stages (Fetch, Decode, Execute, Writeback).
● Improves throughput (more instructions per cycle).
● Hazards:
○ Structural: Resource conflicts.
○ Data: Dependency between instructions.
○ Control: Branch instructions disrupt flow.
5.2 Parallel Processing
1. Multicore CPUs: Multiple cores execute tasks simultaneously.
2. SIMD (Single Instruction, Multiple Data):
○ Same operation on multiple data points (used in GPUs).
3. MIMD (Multiple Instruction, Multiple Data):
○ Different cores execute different instructions (e.g., multi-threading).
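The SIMD item above can be illustrated with NumPy, which applies one operation across a whole array in a single call. Whether actual SIMD instructions are emitted underneath depends on the NumPy build, so treat this as a sketch of the programming model rather than of the hardware:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.arange(1_000_000, dtype=np.float32)

# Scalar style (SISD idea): one element handled per loop iteration.
scalar_result = [x + y for x, y in zip(a[:5], b[:5])]

# Data-parallel style (SIMD idea): the same operation is applied
# to every element of both arrays in one expression.
vector_result = a + b

print(scalar_result)
print(vector_result[:5])
```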
📘 Introduction to Parallelism
🔹 Goals of Parallelism
Parallelism makes computing faster by performing multiple operations simultaneously.
✅ Key Goals:
● Increase performance (speedup)
● Enhance throughput (tasks per unit time)
● Reduce execution time for large-scale problems
● Improve resource utilization
● Enable concurrent execution of independent tasks
● Solve complex problems that are infeasible on a single processor
🔹 Instruction-Level Parallelism (ILP)
ILP exploits the ability of a CPU to execute multiple instructions simultaneously by
identifying independent instructions.
✅ ILP Techniques:
● Pipelining
● Out-of-order execution
● Branch prediction
● Register renaming
● Speculative execution
Goal: Maximize CPU pipeline usage and reduce idle stages.
🔹 Pipelining
Pipelining breaks instruction execution into multiple stages (fetch, decode, execute,
etc.), allowing overlapping execution.
✅ Key Points:
● Like an assembly line: each stage processes one instruction at a time.
● Increases throughput, not individual instruction speed.
● Best suited when instructions are independent.
🚧 Challenges:
● Hazards (Data, Control, Structural)
● Stalling and bubbles in pipeline
🔁 Example:
Cycle    1    2    3    4    5
Instr 1  IF   ID   EX   MEM  WB
Instr 2       IF   ID   EX   MEM
Instr 3            IF   ID   EX
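A quick back-of-the-envelope check of the throughput claim: with k pipeline stages and n independent instructions, a pipelined machine needs roughly n + k − 1 cycles instead of n × k. The numbers below are illustrative and assume no hazards or stalls.

```python
def cycles_unpipelined(n_instructions: int, stages: int) -> int:
    # Each instruction occupies the whole datapath for `stages` cycles.
    return n_instructions * stages

def cycles_pipelined(n_instructions: int, stages: int) -> int:
    # The first instruction takes `stages` cycles to complete; after that
    # one instruction finishes per cycle (assuming no hazards or stalls).
    return stages + (n_instructions - 1)

n, k = 100, 5
print("Unpipelined:", cycles_unpipelined(n, k), "cycles")
print("Pipelined:  ", cycles_pipelined(n, k), "cycles")
print("Speedup:    ", cycles_unpipelined(n, k) / cycles_pipelined(n, k))
```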
🔹 Superscalar Architecture
A superscalar processor can issue and execute more than one instruction per clock
cycle.
✅ Features:
● Multiple instruction pipelines
● Requires complex scheduling, dependency checks
● Hardware must support parallel decoding, execution, and writeback
⚙️ Example:
Intel Core i7, AMD Ryzen CPUs are superscalar.
🔹 Processor-Level Parallelism (PLP)
Involves using multiple processors or cores to execute different threads or programs
concurrently.
✅ Types:
● Symmetric Multiprocessing (SMP) – all processors share memory and OS (e.g.,
multi-core CPUs)
● Massively Parallel Processing (MPP) – each processor has its own memory,
works independently
🧠 Examples:
● Servers, clusters, modern smartphones, cloud computing
🔹 Multiprocessor System Overview
A multiprocessor system is a system with two or more processors that share tasks
to improve performance and reliability.
✅ Classification:
1. Shared Memory Systems (Tightly Coupled):
○ All CPUs share global memory.
○ Easier to program.
○ Needs cache coherence (e.g., MESI protocol).
2. Distributed Memory Systems (Loosely Coupled):
○ Each CPU has private memory.
○ Communicate via messages (e.g., MPI, Hadoop).
📌 Characteristics:
● High performance
● Scalability
● Fault tolerance
● Expensive hardware and complexity
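A toy contrast of the two programming models classified above, using only Python's standard library: threads updating one shared counter (shared memory, tightly coupled) versus processes that exchange results only through a queue (message passing, loosely coupled). It sketches the models, not real SMP/MPP hardware.

```python
import threading
import multiprocessing as mp

# --- Shared-memory style: all workers update one variable, guarded by a lock.
counter = 0
lock = threading.Lock()

def shared_worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:
            counter += 1

# --- Message-passing style: workers never touch shared state; each one
# computes locally and sends its result over a queue, like MPI ranks.
def message_worker(iterations, queue):
    queue.put(iterations)

if __name__ == "__main__":
    threads = [threading.Thread(target=shared_worker, args=(10_000,)) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("Shared-memory total:", counter)

    q = mp.Queue()
    procs = [mp.Process(target=message_worker, args=(10_000, q)) for _ in range(4)]
    for p in procs: p.start()
    total = sum(q.get() for _ in procs)
    for p in procs: p.join()
    print("Message-passing total:", total)
```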
🧠 Summary Table
Type | Definition | Example
ILP | Parallel execution of instructions in a pipeline | Instruction pipelining
Pipelining | Overlapping stages of instruction execution | 5-stage RISC pipeline
Superscalar | Multiple instructions per cycle | Intel i7, ARM Cortex-A76
PLP | Multiple processors running tasks | Dual-core, quad-core CPUs
Multiprocessor System | System with >1 CPU for better performance | Cloud servers, SMP systems
🔶 Cache Memory Organization
🔹 Concepts:
● Cache Memory: Small, fast memory between CPU and main memory.
● Locality Principle:
○ Temporal Locality: Recently used data is likely to be used again.
○ Spatial Locality: Nearby data will likely be accessed soon.
🔹 Mapping Techniques:
Technique | Description
Direct Mapping | One memory block → one specific cache line (fast, simple, high conflict)
Associative Mapping | Any memory block → any cache line (flexible, but slower and expensive)
Set-Associative Mapping | Compromise: block maps to a set, then associatively to the lines within it
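To make the direct-mapping row concrete, the sketch below splits a byte address into tag, index, and offset fields; the 64-line, 64-byte-block geometry is an assumption chosen purely for illustration.

```python
BLOCK_SIZE = 64      # bytes per cache line (assumed)
NUM_LINES = 64       # lines in a direct-mapped cache (assumed)

OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # log2(64) = 6
INDEX_BITS = NUM_LINES.bit_length() - 1     # log2(64) = 6

def split_address(addr: int):
    offset = addr & (BLOCK_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

# Two addresses with the same index bits map to the same line and evict
# each other (a conflict miss), even if the rest of the cache is empty.
for addr in (0x1234, 0x1234 + NUM_LINES * BLOCK_SIZE):
    tag, index, offset = split_address(addr)
    print(f"addr={addr:#08x} tag={tag:#x} index={index} offset={offset}")
```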
🔶 Techniques to Reduce Cache Misses
1. Block Size Optimization
2. Higher Associativity
3. Multilevel Caches (L1, L2, L3)
4. Victim Caches: Small buffer to hold recently evicted blocks
5. Prefetching: Load data before it’s needed
6. Write Policies:
○ Write-Through: Updates main memory immediately
○ Write-Back: Updates only in cache, written to memory later
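A minimal sketch of the difference between the two write policies, using an assumed single-block "cache" that only counts main-memory writes: write-through touches memory on every store, while write-back marks the block dirty and touches memory once, on eviction.

```python
class OneBlockCache:
    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.dirty = False
        self.memory_writes = 0   # how many times main memory is updated

    def store(self, value):
        self.value = value               # data always lands in the cache block
        if self.write_back:
            self.dirty = True            # defer the memory update
        else:
            self.memory_writes += 1      # write-through: update memory now

    def evict(self):
        if self.write_back and self.dirty:
            self.memory_writes += 1      # one write covers all earlier stores
            self.dirty = False

for policy, wb in (("write-through", False), ("write-back", True)):
    cache = OneBlockCache(write_back=wb)
    for v in range(10):                  # ten stores to the same block
        cache.store(v)
    cache.evict()
    print(policy, "-> memory writes:", cache.memory_writes)
```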
🔶 Virtual Memory Organization
🔹 Concepts:
● Allows execution of processes not fully in memory
● Virtual Address → Physical Address via Page Table
🔹 Mapping Techniques:
Mapping | Explanation
Paging | Divides memory into fixed-size pages (no external fragmentation)
Segmentation | Divides memory by logical divisions (code/data/stack)
Paged Segmentation | Combines paging & segmentation
🔶 Memory Management Techniques
1. Page Tables: Store mapping of virtual to physical addresses
○ Inverted Page Table, Multilevel Page Table
2. TLB (Translation Lookaside Buffer): Fast lookup cache for page table entries
3. Protection & Sharing: Done via segment/page permissions
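A minimal sketch of virtual-to-physical translation, with a dictionary standing in for the page table and a smaller one for the TLB; the 4 KB page size and the frame numbers are invented for illustration.

```python
PAGE_SIZE = 4096                     # assumed 4 KB pages

page_table = {0: 5, 1: 9, 2: 1}      # virtual page number -> physical frame
tlb = {}                             # small cache of recent translations

def translate(vaddr: int) -> int:
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                   # TLB hit: no page-table walk needed
        frame = tlb[vpn]
    elif vpn in page_table:          # TLB miss: walk the page table
        frame = page_table[vpn]
        tlb[vpn] = frame             # cache the translation for next time
    else:
        raise RuntimeError("page fault: page not in physical memory")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1234)))        # page 1, offset 0x234 -> frame 9
print(hex(translate(0x1238)))        # second access hits the TLB
```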
🔶 Memory Replacement Policies
Used when a page must be swapped out from memory:
Policy | Strategy
FIFO | Oldest page is replaced
LRU (Least Recently Used) | Page not used for the longest time is replaced
Optimal | Replace the page that will not be used for the longest time (ideal but not practical)
Clock Algorithm | Approximates LRU using a circular queue
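To see these policies act on the same workload, the short simulation below counts page faults for FIFO and LRU with three frames; the reference string is a made-up example (the classic sequence used to demonstrate Belady's anomaly).

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    memory, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.discard(queue.popleft())   # evict the oldest page
            memory.add(page)
            queue.append(page)
    return faults

def lru_faults(refs, frames):
    memory, faults = OrderedDict(), 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)              # mark as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)        # evict least recently used
            memory[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print("FIFO faults:", fifo_faults(refs, 3))
print("LRU faults: ", lru_faults(refs, 3))
```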
✅ Recommended YouTube Videos:
● 📺 Gate Smashers – Cache Memory & Virtual Memory
I/O Organization & Memory Hierarchy: Theory
1. I/O Organization
1.1 Programmed I/O
● The CPU directly controls I/O operations by polling device status registers.
● Disadvantages:
○ CPU remains busy in a loop, wasting cycles.
○ Inefficient for slow devices.
1.2 Interrupt-Driven I/O
● Devices interrupt the CPU when they are ready for data transfer.
● Advantages:
○ CPU can perform other tasks while waiting.
○ More efficient than polling.
● Disadvantages:
○ Overhead of handling frequent interrupts.
1.3 Direct Memory Access (DMA)
● DMA Controller transfers data between I/O devices and memory without CPU
intervention.
● Steps:
1. CPU initializes DMA (source, destination, size).
2. DMA takes control of the bus.
3. Data is transferred directly.
4. DMA interrupts CPU upon completion.
● Used for: High-speed transfers (e.g., disk I/O, network packets).
1.4 Synchronous vs. Asynchronous Data Transfer
Feature | Synchronous | Asynchronous
Clock | Uses a common clock | No common clock
Speed | Fixed rate | Variable rate
Handshaking | Not required | Required (STROBE, ACK)
Example | CPU-RAM communication | Keyboard input
2. Memory Hierarchy
2.1 Principle of Locality
● Temporal Locality: Recently accessed data is likely to be accessed again.
● Spatial Locality: Nearby memory locations are likely to be accessed soon.
● Exploited by: Caching, prefetching.
2.2 Memory Hierarchy Levels
Level | Type | Speed | Cost | Size
Registers | Inside CPU | Fastest (~1 ns) | Highest | Few KB
Cache (L1, L2, L3) | SRAM | 1-10 ns | High | KB-MB
Main Memory | Volatile (DRAM) | 50-100 ns | Moderate | GB
Secondary (SSD/HDD) | Non-volatile | ~100,000 ns | Low | TB
Tertiary (Tape/Cloud) | Archival | Very slow | Very low | PB
2.3 Cache Memory
● Purpose: Reduce average memory access time (AMAT).
● Mapping Techniques:
○ Direct Mapped: Each block maps to one cache line.
○ Fully Associative: Block can go anywhere in cache.
○ Set-Associative: Block maps to a set of lines (compromise between
direct & fully associative).
● Replacement Policies:
○ LRU (Least Recently Used)
○ FIFO (First-In-First-Out)
○ Random Replacement
● Write Policies:
○ Write-Through: Data written to cache and main memory.
○ Write-Back: Data written only to cache; main memory updated on
eviction.
2.4 Virtual Memory
● Uses disk space to extend RAM.
● Paging: Divides memory into fixed-size blocks (pages).
● TLB (Translation Lookaside Buffer): Caches page table entries for faster
address translation.
● Thrashing: Excessive page faults due to insufficient RAM.
3. Key Formulas
1. Average Memory Access Time (AMAT):
AMAT = Hit Time + Miss Rate × Miss Penalty
2. Miss Rate:
Miss Rate = Number of Misses / Total Accesses
3. Effective Access Time (Virtual Memory):
EAT = (1 − p) × Memory Access Time + p × Page Fault Time
○ p = Page fault rate.
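These formulas can be sanity-checked with a quick calculation; the hit time, miss rate, penalty, and page-fault figures below are invented numbers used only to show the arithmetic.

```python
# AMAT = Hit Time + Miss Rate × Miss Penalty
hit_time = 1.0        # ns, assumed L1 hit time
miss_rate = 0.05      # 5% of accesses miss (assumed)
miss_penalty = 100.0  # ns to fetch the block from main memory (assumed)
amat = hit_time + miss_rate * miss_penalty
print(f"AMAT = {amat} ns")          # 1 + 0.05 * 100 = 6 ns

# EAT = (1 - p) × Memory Access Time + p × Page Fault Time
p = 1e-6                            # page fault rate (assumed)
memory_access = 100.0               # ns (assumed)
page_fault_time = 8_000_000.0       # ns, i.e. 8 ms to service a fault (assumed)
eat = (1 - p) * memory_access + p * page_fault_time
print(f"EAT  = {eat:.2f} ns")
```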
4. Summary Table
Concept | Key Points
Programmed I/O | CPU polls devices (inefficient).
Interrupt-Driven I/O | Devices interrupt CPU (better efficiency).
DMA | Direct transfers without CPU (fastest).
Synchronous Transfer | Uses clock (e.g., CPU-RAM).
Asynchronous Transfer | Uses handshaking (e.g., keyboard).
Temporal Locality | Reuse of recent data.
Spatial Locality | Access nearby data.
Cache Mapping | Direct, Associative, Set-Associative.
Write Policies | Write-Through vs. Write-Back.
Virtual Memory | Extends RAM using disk.
1. General System Architecture
1. Which component executes arithmetic and logic operations?
a) Cache
b) ALU (Arithmetic Logic Unit)
c) Control Unit
d) I/O Controller
Answer: b) ALU
2. RISC architectures are characterized by:
a) Complex instructions
b) Fewer instructions and fixed-length formats
c) Variable-length instructions
d) Microprogrammed control
Answer: b) Fewer instructions and fixed-length formats
3. CISC vs. RISC: Which typically uses more registers?
a) CISC
b) RISC
c) Both use the same number
d) Depends on the OS
Answer: b) RISC
4. The Control Unit (CU) is responsible for:
a) Performing arithmetic operations
b) Decoding instructions and generating control signals
c) Storing data temporarily
d) Managing I/O devices
Answer: b) Decoding instructions
5. Secondary memory (e.g., HDD) is:
a) Faster than cache
b) Non-volatile
c) Part of the CPU
d) Used only for ALU operations
Answer: b) Non-volatile
2. I/O Organization & Memory Hierarchy
6. Programmed I/O is inefficient because:
a) It uses DMA
b) The CPU polls devices in a loop
c) It relies on interrupts
d) It bypasses the CPU
Answer: b) CPU polls devices
7. DMA improves performance by:
a) Increasing cache size
b) Allowing direct transfers between I/O and memory
c) Using more interrupts
d) Slowing down the CPU
Answer: b) Direct transfers
8. Synchronous data transfer requires:
a) Handshaking signals
b) A common clock
c) No clock
d) Only interrupts
Answer: b) Common clock
9. Temporal locality refers to:
a) Accessing nearby memory locations
b) Reusing recently accessed data
c) Random memory access
d) Disk I/O patterns
Answer: b) Reusing recent data
10.The fastest memory in the hierarchy is:
a) HDD
b) DRAM
c) L1 Cache
d) SSD
Answer: c) L1 Cache
3. Cache Memory & Virtual Memory
11.A direct-mapped cache places a block at:
a) Any empty location
b) A unique location based on its address
c) Only the first location
d) Randomly
Answer: b) Unique location
12.Fully associative cache allows:
a) A block to be placed anywhere
b) Fixed placement
c) Only sequential access
d) No replacement
Answer: a) Anywhere
13.LRU (Least Recently Used) is a:
a) Cache replacement policy
b) Interrupt handler
c) DMA technique
d) CPU scheduling algorithm
Answer: a) Replacement policy
14.Write-through cache ensures:
a) Data is written only to cache
b) Data is written to cache and memory simultaneously
c) Data is written only on eviction
d) Data is never written to memory
Answer: b) Cache and memory
15.Virtual memory uses:
a) Cache to extend RAM
b) Disk to extend RAM
c) Registers to extend RAM
d) DMA to extend RAM
Answer: b) Disk to extend RAM
4. Parallelism
16.Pipelining improves performance by:
a) Executing multiple instructions simultaneously in stages
b) Increasing clock speed
c) Reducing cache size
d) Eliminating interrupts
Answer: a) Simultaneous stages
17.Superscalar architectures can:
a) Execute multiple instructions per cycle
b) Only execute one instruction per cycle
c) Replace cache memory
d) Slow down the CPU
Answer: a) Multiple instructions/cycle
18.Multiprocessor systems share:
a) Only I/O devices
b) Memory and I/O devices
c) Only registers
d) Nothing
Answer: b) Memory and I/O
19.Instruction-level parallelism (ILP) is achieved via:
a) Pipelining and superscalar execution
b) Increasing disk size
c) Reducing cache misses
d) Using more interrupts
Answer: a) Pipelining/superscalar
20.A SIMD (Single Instruction Multiple Data) architecture is used in:
a) GPUs
b) Hard disks
c) Keyboards
d) Printers
Answer: a) GPUs
Advanced Questions (21–40)
21.Which cache mapping technique has no conflict misses?
a) Direct-mapped
b) Fully associative
c) Set-associative
d) None
Answer: b) Fully associative
22.The TLB (Translation Lookaside Buffer) speeds up:
a) Virtual-to-physical address translation
b) DMA transfers
c) Interrupt handling
d) Disk I/O
Answer: a) Address translation
23.Thrashing occurs when:
a) Cache hits increase
b) Excessive page faults degrade performance
c) DMA is overused
d) The CPU is idle
Answer: b) Excessive page faults
24.Non-uniform memory access (NUMA) means:
a) All memory accesses take the same time
b) Memory access times vary by location
c) Cache is unused
d) DMA is disabled
Answer: b) Variable access times
25.A write-back cache updates main memory:
a) Immediately
b) Only when the block is evicted
c) Never
d) Randomly
Answer: b) On eviction
26.The goal of parallelism is to:
a) Increase throughput
b) Reduce memory size
c) Slow down execution
d) Eliminate caches
Answer: a) Increase throughput
27.In a 4-way set-associative cache, each set has:
a) 1 block
b) 4 blocks
c) 8 blocks
d) Unlimited blocks
Answer: b) 4 blocks
28.Which is not a memory hierarchy level?
a) Registers
b) L1 Cache
c) SSD
d) Power supply
Answer: d) Power supply
29.The AMAT formula includes:
a) Hit time + Miss rate × Miss penalty
b) Hit rate × Miss time
c) Cache size × Access time
d) Miss rate × Hit penalty
Answer: a) Hit time + Miss rate × Miss penalty
30.SIMD is useful for:
a) Image processing
b) Keyboard input
c) Printer output
d) Interrupt handling
Answer: a) Image processing
31.A multicore processor is an example of:
a) Thread-level parallelism
b) Instruction-level parallelism
c) Data-level parallelism
d) Disk-level parallelism
Answer: a) Thread-level parallelism
32.Which is not a replacement policy?
a) LRU
b) FIFO
c) DMA
d) Random
Answer: c) DMA
33.The tag field in a cache address is used to:
a) Identify the memory block
b) Replace the block
c) Disable the cache
d) Speed up DMA
Answer: a) Identify the block
34.An interrupt vector table contains:
a) Addresses of interrupt handlers
b) Cache blocks
c) DMA requests
d) Virtual memory pages
Answer: a) Interrupt handler addresses
35.Which is not a goal of pipelining?
a) Increase instruction throughput
b) Reduce CPI (Cycles Per Instruction)
c) Eliminate memory hierarchy
d) Overlap instruction execution
Answer: c) Eliminate memory hierarchy
36.A hazard in pipelining occurs due to:
a) Resource conflicts
b) Perfect branch prediction
c) Zero cache misses
d) No interrupts
Answer: a) Resource conflicts
37.VLIW (Very Long Instruction Word) architectures rely on:
a) The compiler to schedule parallel instructions
b) Dynamic hardware scheduling
c) Reducing cache size
d) Increasing interrupts
Answer: a) Compiler scheduling
38.In a multiprocessor system, cache coherence ensures:
a) All CPUs see the same memory value
b) Each CPU has its own memory
c) DMA is disabled
d) No interrupts occur
Answer: a) Consistent memory views
39.Which is not a parallel processing technique?
a) Pipelining
b) Superscalar execution
c) Multithreading
d) Polling
Answer: d) Polling
40.The principle of locality justifies the use of:
a) Caches
b) Interrupts
c) DMA
d) Power supplies
Answer: a) Caches
5. Advanced Cache & Memory Management
41.Which cache mapping technique requires a replacement policy?
a) Direct-mapped
b) Fully associative
c) Set-associative
d) Both b and c
Answer: d) Both b and c
Explanation: Direct-mapped has no choice (1 block per set), while associative
mappings need replacement policies.
42.A cache with 64 blocks and 4 blocks per set is:
a) Direct-mapped
b) 4-way set-associative
c) Fully associative
d) 16-way set-associative
Answer: b) 4-way set-associative
Explanation: Number of sets = 64/4 = 16.
43.The "dirty bit" in a cache block indicates:
a) The block is corrupted
b) The block has been modified but not written to memory
c) The block is unused
d) The block is locked
Answer: b) Modified but not written to memory (used in write-back caches).
44.Which reduces conflict misses?
a) Larger cache size
b) Higher associativity
c) Smaller block size
d) Disabling cache
Answer: b) Higher associativity
Explanation: More blocks per set reduce collisions.
45.Virtual memory is implemented using:
a) Paging or segmentation
b) Only paging
c) Only segmentation
d) DMA
Answer: a) Paging or segmentation
6. Parallelism & Multiprocessing
46.In a pipelined processor, a "bubble" occurs due to:
a) A cache hit
b) A branch misprediction
c) A DMA transfer
d) An interrupt
Answer: b) Branch misprediction
Explanation: Pipeline stalls until the correct instruction is fetched.
47.SIMD (Single Instruction Multiple Data) is used in:
a) Vector processors
b) GPUs
c) Both a and b
d) Hard disks
Answer: c) Both a and b
48.Multithreading improves CPU utilization by:
a) Overlapping I/O waits with execution
b) Increasing clock speed
c) Reducing cache size
d) Disabling interrupts
Answer: a) Overlapping I/O waits
49.In a multicore system, shared L3 cache is an example of:
a) Uniform Memory Access (UMA)
b) Non-Uniform Memory Access (NUMA)
c) Message-passing architecture
d) SIMD architecture
Answer: a) UMA
Explanation: All cores access L3 cache with equal latency.
50.Which is not a parallel processing architecture?
a) MIMD (Multiple Instruction Multiple Data)
b) SISD (Single Instruction Single Data)
c) RAID (Redundant Array of Disks)
d) MISD (Multiple Instruction Single Data)
Answer: c) RAID
7. I/O Systems & Interrupts
51.An ISR (Interrupt Service Routine) is:
a) A program that handles interrupts
b) A cache replacement policy
c) A DMA controller
d) A type of memory
Answer: a) Interrupt handler
52.Which I/O method is most suitable for high-speed devices?
a) Programmed I/O
b) Interrupt-driven I/O
c) DMA
d) Polling
Answer: c) DMA
53.In vectored interrupts:
a) The device supplies the interrupt handler address
b) The CPU polls devices
c) No ISR is used
d) DMA is required
Answer: a) Device supplies the address
54.A "maskable" interrupt can be:
a) Ignored or delayed by the CPU
b) Only handled by DMA
c) Generated only by the clock
d) Never serviced
Answer: a) Ignored/delayed
55.Which is not a disk scheduling algorithm?
a) FCFS (First-Come First-Served)
b) SCAN (Elevator)
c) LRU (Least Recently Used)
d) SSTF (Shortest Seek Time First)
Answer: c) LRU
8. Advanced Memory Concepts
56.The "working set" of a process is:
a) The set of pages actively used in a time interval
b) The total memory allocated
c) The cache size
d) The disk buffer
Answer: a) Actively used pages
57.Belady’s anomaly occurs in:
a) FIFO page replacement
b) LRU page replacement
c) Optimal page replacement
d) Random replacement
Answer: a) FIFO
Explanation: Increasing frames can increase page faults.
58.A "TLB miss" results in:
a) A page table walk
b) A cache eviction
c) A DMA transfer
d) An interrupt
Answer: a) Page table walk
59.Inverted page tables are used to:
a) Reduce memory overhead for large address spaces
b) Increase cache size
c) Speed up DMA
d) Disable interrupts
Answer: a) Reduce memory overhead
60.The "compulsory miss" rate can be reduced by:
a) Larger block size
b) Prefetching
c) Higher associativity
d) Smaller cache
Answer: b) Prefetching
9. Processor Architecture
61.The "data hazard" in pipelining occurs when:
a) An instruction depends on the result of a previous instruction
b) The cache is full
c) A branch is mispredicted
d) An interrupt occurs
Answer: a) Instruction dependency
62.Out-of-order execution is used to:
a) Hide pipeline stalls
b) Reduce cache size
c) Disable interrupts
d) Slow down the CPU
Answer: a) Hide stalls
63.A "superscalar" processor can:
a) Issue multiple instructions per cycle
b) Only execute scalar operations
c) Replace cache with disk
d) Eliminate pipelines
Answer: a) Multiple instructions/cycle
64.The "branch target buffer (BTB)" helps in:
a) Predicting branch addresses
b) Reducing cache misses
c) Managing DMA
d) Handling interrupts
Answer: a) Branch prediction
65.Which is not a type of pipeline hazard?
a) Structural
b) Data
c) Control
d) Virtual
Answer: d) Virtual
10. System Performance & Optimization
66.The "CPI" (Cycles Per Instruction) is improved by:
a) Pipelining
b) Increasing cache misses
c) Slowing the clock
d) Disabling interrupts
Answer: a) Pipelining
67."False sharing" in multicore systems occurs when:
a) Cores modify different variables in the same cache line
b) The cache is disabled
c) DMA is overused
d) No interrupts are used
Answer: a) Different variables in the same cache line
68.The "Amdahl’s Law" states that speedup is limited by:
a) The sequential portion of a program
b) The cache size
c) The number of interrupts
d) The disk speed
Answer: a) Sequential portion
69."Speculative execution" involves:
a) Executing instructions before knowing if they are needed
b) Disabling pipelines
c) Reducing cache size
d) Ignoring branches
Answer: a) Early execution
70.Which is not a memory coherence protocol?
a) MESI (Modified, Exclusive, Shared, Invalid)
b) MOESI (Modified, Owned, Exclusive, Shared, Invalid)
c) LRU (Least Recently Used)
d) Directory-based
Answer: c) LRU
11. Mixed Advanced Topics
71.The "NUCA" (Non-Uniform Cache Access) architecture is used in:
a) Large multicore caches
b) Hard disks
c) Interrupt controllers
d) Power supplies
Answer: a) Multicore caches
72."Way prediction" in caches aims to:
a) Reduce access time by guessing the correct cache way
b) Increase miss rate
c) Disable associativity
d) Slow down hits
Answer: a) Guess cache way
73.The "store buffer" in a CPU is used to:
a) Hold store instructions until retirement
b) Replace the cache
c) Manage interrupts
d) Disable pipelines
Answer: a) Hold stores
74."Memory-mapped I/O" means:
a) I/O devices are accessed like memory locations
b) I/O uses separate instructions
c) Cache is unused
d) DMA is disabled
Answer: a) I/O as memory
75.The "interrupt latency" is the time between:
a) Interrupt occurrence and servicing
b) Two DMA transfers
c) Cache misses
d) Pipeline stalls
Answer: a) Interrupt and servicing
12. Final Questions (76–80)
76."Cache coloring" is used to:
a) Reduce conflict misses by partitioning cache
b) Increase power consumption
c) Disable TLB
d) Slow down memory
Answer: a) Reduce conflicts
77.The "ROB" (ReOrder Buffer) helps in:
a) Out-of-order execution and retirement
b) Increasing cache misses
c) Disabling interrupts
d) Reducing disk speed
Answer: a) Out-of-order execution
78."EPIC" (Explicitly Parallel Instruction Computing) is used in:
a) Itanium processors
b) GPUs
c) Hard disks
d) Keyboards
Answer: a) Itanium
79.The "memory barrier" instruction ensures:
a) Ordering of memory operations
b) Cache eviction
c) DMA completion
d) Interrupt masking
Answer: a) Memory ordering
80."Transactional Memory" simplifies parallel programming by:
a) Grouping instructions into atomic transactions
b) Disabling caches
c) Increasing interrupts
d) Slowing pipelines
Answer: a) Atomic transactions
81.Which cache write policy guarantees memory consistency but has higher
latency?
a) Write-through
b) Write-back
c) Write-around
d) No-write allocate
Answer: a) Write-through
Explanation: Write-through updates both cache and main memory
immediately, ensuring consistency at the cost of slower writes.
82.A "compulsory miss" in a cache occurs because:
a) The block is accessed for the first time
b) The cache is too small
c) The replacement policy failed
d) The TLB is full
Answer: a) First-time access
Explanation: Also called a "cold miss," it happens when data is loaded into
cache for the first time.
83.Which technique reduces conflict misses in a direct-mapped cache?
a) Increasing block size
b) Using a victim cache
c) Disabling prefetching
d) Reducing associativity
Answer: b) Victim cache
Explanation: A small fully associative cache that holds recently evicted blocks
to mitigate conflicts.
84.In virtual memory, a "page fault" occurs when:
a) A page is not in physical memory (RAM)
b) The TLB is full
c) The cache is flushed
d) DMA is initiated
Answer: a) Page not in RAM
Explanation: Requires loading the page from disk into RAM.
85.The "working set model" is used to:
a) Determine how much memory a process needs to avoid thrashing
b) Calculate cache hit rates
c) Schedule DMA transfers
d) Design CPU pipelines
Answer: a) Prevent thrashing
Explanation: Tracks the set of pages a process actively uses to optimize
memory allocation.
14. Parallel Architectures
86.In a SIMD architecture, a single instruction operates on:
a) Multiple data elements simultaneously
b) A single data element
c) Only scalar values
d) Cache lines
Answer: a) Multiple data elements
Explanation: Used in vector processors (e.g., GPU shader cores).
87.Which is a characteristic of MIMD (Multiple Instruction Multiple Data)
systems?
a) All cores execute the same instruction
b) Cores execute different instructions on different data
c) Only one core is active at a time
d) No shared memory
Answer: b) Different instructions/data
Explanation: Examples: Multicore CPUs, distributed systems.
88."Cache coherence" in multicore systems ensures:
a) All cores see the same value for a memory location
b) Caches are disabled
c) Each core has a private memory
d) DMA is used for all accesses
Answer: a) Consistent memory views
Explanation: Protocols like MESI enforce coherence.
89.The "fork-join" parallelism model is commonly used in:
a) Multithreaded programming (e.g., OpenMP)
b) GPU computing
c) Interrupt handling
d) Disk scheduling
Answer: a) Multithreading
Explanation: Tasks are forked into parallel threads and joined afterward.
90.Which is not a parallel programming challenge?
a) Race conditions
b) Deadlocks
c) False sharing
d) Sequential execution
Answer: d) Sequential execution
15. Advanced Pipelining & Hazards
91.A "control hazard" in pipelining occurs due to:
a) Branches or jumps
b) Data dependencies
c) Cache misses
d) DMA transfers
Answer: a) Branches
Explanation: The pipeline may fetch wrong instructions until the branch is
resolved.
92."Tomasulo’s algorithm" handles:
a) Out-of-order execution with register renaming
b) Cache replacement
c) Virtual memory paging
d) Interrupt prioritization
Answer: a) Out-of-order execution
Explanation: Dynamically schedules instructions to avoid stalls.
93.The "reorder buffer" (ROB) is used to:
a) Commit instructions in program order
b) Replace cache blocks
c) Manage interrupts
d) Schedule DMA
Answer: a) In-order commitment
Explanation: Ensures speculative execution results are finalized correctly.
94."Speculative execution" can lead to:
a) Performance gains
b) Security vulnerabilities (e.g., Spectre)
c) Both a and b
d) Neither
Answer: c) Both
Explanation: Improves performance but may leak data via side channels.
95.Which is not a branch prediction technique?
a) Static prediction (always-taken)
b) Dynamic prediction (branch history table)
c) Random replacement
d) Tournament predictors
Answer: c) Random replacement
16. I/O & System Integration
96."Memory-mapped I/O" differs from "port-mapped I/O" in that:
a) I/O devices appear as memory addresses
b) Special instructions (e.g., IN/OUT) are used
c) DMA is required
d) Only interrupts are used
Answer: a) I/O as memory addresses
Explanation: Simplifies programming by using load/store instructions.
97.A "vectored interrupt" system:
a) Directly provides the ISR address
b) Requires polling
c) Uses only DMA
d) Ignores interrupts
Answer: a) Provides ISR address
Explanation: Faster than non-vectored interrupts (no lookup needed).
98."Cycle stealing" in DMA refers to:
a) Using CPU cycles for transfers when the bus is idle
b) Halting the CPU indefinitely
c) Disabling caches
d) Skipping interrupts
Answer: a) Opportunistic bus access
Explanation: DMA "steals" cycles without fully stalling the CPU.
99.Which is not a disk scheduling algorithm?
a) SCAN (elevator)
b) C-SCAN (circular SCAN)
c) FIFO
d) LRU
Answer: d) LRU
Explanation: LRU is a cache/page replacement policy.
100. The "RAID 5" configuration provides:
a) Striping with distributed parity
b) Mirroring without parity
c) No redundancy
d) Double striping
Answer: a) Striping + distributed parity
Explanation: Balances performance and fault tolerance.