Database System Concepts
Chapter: Storage
Prof. Dr. Rafiqul Islam
26 July, 2025
Prof. Dr. Rafiqul Islam Database System Concepts 26 July, 2025 1 / 26
Overview of Physical Storage Media
Storage media differ by:
Access speed
Cost per byte
Reliability
Used at different levels of computer architecture
Crucial for database systems to balance performance and durability
Types of Physical Storage Media
Cache: Small, fast, expensive; hardware-managed
Main Memory (RAM): Fast but volatile; used during query execution
Flash Memory: Nonvolatile, portable; NAND preferred for storage devices
Magnetic Disk: Primary medium for persistent data
Optical Disks: CD/DVD/Blu-ray; used for distribution and backups
Tape Storage: Cheap and high-capacity; ideal for archives
Storage Media Characteristics
Medium         Speed      Cost/Byte  Volatile?
Cache          Very High  Very High  Yes
RAM            High       High       Yes
Flash (NAND)   Medium     Medium     No
Magnetic Disk  Low        Low        No
Optical Disk   Very Low   Low        No
Tape           Very Low   Very Low   No
Storage Hierarchy
Primary Storage: Fastest access, volatile
Cache
Main Memory
Secondary Storage: Slower, nonvolatile
Magnetic Disks
Solid-State Drives
Tertiary Storage: Slowest, nonvolatile, offline
Magnetic Tape
Optical Disk Jukeboxes
[Figure: Storage Hierarchy]
Volatility of Storage Media
Volatile: Loses data when power is off
Cache
Main Memory
Nonvolatile: Retains data without power
Flash Memory
Magnetic Disks
Optical Disks
Magnetic Tapes
Performance Measures of Disks
Capacity — Total data the disk can hold
Access Time — Time from request to start of data transfer
Data-Transfer Rate — Speed of reading/writing data
Reliability — Resilience against crashes or failures
Seek Time and Rotational Latency
Seek Time: Time to move the disk arm to the correct track
Typical range: 2–30 ms; typical average: 4–10 ms
Average seek time is about half of the maximum seek time
Rotational Latency: Time for the desired sector to rotate under the head
Rotation speed: 5400–15000 RPM
Average latency is half of one full rotation time
Access Time = Seek Time + Rotational Latency
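As a rough illustration of the formula above, the sketch below derives the average rotational latency from the rotation speed and adds an average seek time. The function names and the sample inputs (6 ms seek, 7200 RPM) are assumptions picked from within the ranges on this slide, not measurements of any particular disk.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full rotation."""
    full_rotation_ms = 60_000.0 / rpm  # 60,000 ms per minute / rotations per minute
    return full_rotation_ms / 2.0


def access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Access time = seek time + rotational latency (transfer time not included)."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


# Assumed example: 6 ms average seek on a 7200 RPM disk.
print(round(access_time_ms(6.0, 7200), 2))  # 10.17
```

At 15000 RPM the same calculation gives a 2 ms average latency, which is why faster spindles mainly help random access.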
Data-Transfer Rate
Once access time ends, data transfer begins
Depends on:
Disk platter speed
Density of data on tracks
Controller and bus bandwidth
Rates vary widely by device type and interface (e.g., SATA disks vs. NVMe SSDs)
Access Patterns in Disk I/O
Disk I/O requests specify block numbers (a block is a logical unit made up of a fixed number of contiguous sectors)
Sequential Access:
Successive requests are for blocks on the same or adjacent tracks
Requires minimal disk-arm movement
High data-transfer efficiency
Random Access:
Blocks spread across tracks
Frequent seeks reduce throughput
Typical rate: 100–200 accesses/second
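To make the gap between the two patterns concrete, here is a minimal sketch that estimates the time to read a batch of blocks. It assumes a simplified model: a sequential read pays the access time once, while a random read pays it for every block. The function name and the sample figures (4 KB blocks, 100 MB/s transfer rate, 10 ms access time) are illustrative assumptions.

```python
def read_time_ms(num_blocks, block_kb, transfer_mb_per_s, access_ms, sequential):
    """Estimated time to read num_blocks blocks under a simplified disk model."""
    # Pure transfer time for all blocks, in milliseconds.
    transfer_ms = num_blocks * block_kb / 1024.0 / transfer_mb_per_s * 1000.0
    # Sequential: position the arm once. Random: position it for every block.
    positioning_ms = access_ms if sequential else access_ms * num_blocks
    return positioning_ms + transfer_ms


seq = read_time_ms(1000, 4, 100, 10, sequential=True)
rnd = read_time_ms(1000, 4, 100, 10, sequential=False)
print(f"sequential: {seq:.1f} ms, random: {rnd:.1f} ms")
print(f"random accesses per second: {1000 / (rnd / 1000):.0f}")
```

Under these assumed numbers the random case completes roughly 100 accesses per second, at the low end of the range quoted on the slide.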
Disk Block Access Optimization Techniques
Buffering: Recently accessed blocks held in memory
Read-ahead: Prefetching consecutive blocks to reduce latency
Scheduling: Order block transfers to minimize disk-arm movement
Elevator Algorithm:
Arm sweeps in one direction, serving requests in track order as it reaches them
Reverses when no more requests remain in that direction
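The elevator algorithm described above can be sketched in a few lines. This version assumes a fixed set of pending requests identified by track number and does not model requests arriving during a sweep. The function names and the sample request queue are hypothetical.

```python
def elevator_order(requests, head, direction="up"):
    """Order in which pending track requests are served by one sweep and its reversal."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    lower = sorted(t for t in requests if t < head)
    upper = sorted(t for t in requests if t >= head)
    if direction == "up":
        # Serve everything ahead of the arm, then reverse and sweep back.
        return upper + lower[::-1]
    return lower[::-1] + upper


def total_movement(order, head):
    """Total number of tracks the arm crosses when serving requests in this order."""
    moved = 0
    for track in order:
        moved += abs(track - head)
        head = track
    return moved


pending = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical request queue
swept = elevator_order(pending, head=53)
print(swept)                                     # [65, 67, 98, 122, 124, 183, 37, 14]
print(total_movement(pending, 53), total_movement(swept, 53))  # 640 299
```

For this queue, serving requests in arrival order moves the arm across 640 tracks, while the elevator order needs 299.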
Impact on Performance
Sequential Access: High transfer rate, low latency
Random Access: Increased seek time, lower throughput
Optimizations enhance throughput:
Buffer hits reduce disk I/O
Read-ahead hides seek time and rotational latency for sequential reads
Scheduling maximizes head efficiency
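The effect of buffer hits can be illustrated with a small in-memory buffer that counts how many requests actually reach the disk. The slides do not name a replacement policy. Least-recently-used (LRU) is assumed here purely for illustration, and the class name and the fake "disk read" are hypothetical.

```python
from collections import OrderedDict


class LRUBuffer:
    """Tiny block buffer with LRU replacement (an assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> contents, oldest first
        self.disk_reads = 0

    def get(self, block_no):
        if block_no in self.blocks:
            # Buffer hit: no disk I/O, just mark the block as recently used.
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Buffer miss: simulate a disk read, evicting the oldest block if full.
        self.disk_reads += 1
        data = f"block-{block_no}"    # stand-in for real block contents
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data


buf = LRUBuffer(capacity=2)
for b in [1, 2, 1, 2, 3, 1]:
    buf.get(b)
print(buf.disk_reads)  # 4 disk reads for 6 requests
```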
RAID: Redundant Arrays of Independent Disks
RAID uses multiple disks to improve:
Reliability via redundancy
Performance via parallel access
The “I” originally stood for “Inexpensive”; it now stands for “Independent”
Justified by growth in Web, multimedia, and DB data
RAID enables fault tolerance and fast throughput
RAID Reliability via Redundancy
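More disks → higher chance that some disk in the array fails
Mirroring stores a duplicate copy of every disk
Mean Time to Data Loss (MTDL) with mirroring:
Depends on the disk mean time to failure (MTTF) and the mean time to repair (MTTR)
Assuming independent failures: MTDL = MTTF² / (2 × MTTR)
Example: Up to 57,000 years under ideal assumptions
Power failures can leave the two copies inconsistent unless they are written one after the other
Real-world MTDL: 55–110 years with mirroring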
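The 57,000-year figure can be reproduced with a short calculation, assuming independent disk failures and the standard estimate MTDL = MTTF² / (2 × MTTR) for a mirrored pair. The inputs below (a 100,000-hour disk MTTF and a 10-hour repair time) are assumed values that happen to yield that figure. The function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def array_mttf_hours(disk_mttf_hours, num_disks):
    """Mean time until some disk in an array of independent disks fails."""
    return disk_mttf_hours / num_disks


def mirrored_mtdl_hours(disk_mttf_hours, mttr_hours):
    """Mean time to data loss for a mirrored pair, assuming independent failures."""
    return disk_mttf_hours ** 2 / (2 * mttr_hours)


# Assumed inputs: 100,000-hour MTTF per disk, 10 hours to replace and rebuild.
mtdl_years = mirrored_mtdl_hours(100_000, 10) / HOURS_PER_YEAR
print(round(mtdl_years))                 # 57078
print(array_mttf_hours(100_000, 100))    # 1000.0 hours for a 100-disk array
```

Real failures are not independent (shared power, aging, manufacturing batches), which is one reason the practical figure on the slide is so much lower.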
RAID Performance via Parallelism
Mirrored Reads: Either copy can serve a read, so read throughput roughly doubles
Data Striping:
Bit-Level Striping: Bits of bytes spread across disks
Block-Level Striping: Logical blocks assigned using modulo indexing
Example: Block i stored on disk (i mod n) + 1
For large reads, the effective transfer rate increases by up to a factor of n, the number of disks
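The block-to-disk mapping in the example above is easy to check in code. This sketch numbers logical blocks from 0 and disks from 1, matching the formula on the slide. The function names are hypothetical.

```python
def disk_for_block(i, n):
    """Disk (numbered 1..n) holding logical block i under block-level striping."""
    return (i % n) + 1


def stripe_layout(num_blocks, n):
    """Map each disk to the list of logical blocks it stores."""
    layout = {disk: [] for disk in range(1, n + 1)}
    for i in range(num_blocks):
        layout[disk_for_block(i, n)].append(i)
    return layout


print(stripe_layout(8, 4))
# {1: [0, 4], 2: [1, 5], 3: [2, 6], 4: [3, 7]}
```

Blocks 0 to 3 land on four different disks, so a read of four consecutive blocks can proceed on all disks in parallel.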
[Figure: RAID Levels]
RAID Levels Overview
RAID = Redundant Arrays of Independent Disks
Combines performance (striping) + reliability (redundancy)
Trade-off: cost vs fault tolerance vs throughput
Uses mirroring, parity, and error-correcting codes
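How parity provides redundancy can be shown with a minimal sketch: the parity block is the bytewise XOR of the data blocks, so any single lost block equals the XOR of all the surviving blocks. The helper name and the sample byte values are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three disks, plus one parity block.
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0f"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding its block from the rest.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])  # True
```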
RAID Level Summary
Level   Technique                            Redundancy      Performance
RAID 0  Block striping                       None            High (n…
RAID 1  Mirroring + striping                 Mirror copies   Moderate
RAID 2  ECC bits                             Extra disks     High (b…
RAID 3  Bit striping + parity                1 disk          High tr…
RAID 4  Block striping + central parity      1 disk          High for…
RAID 5  Block striping + distributed parity  1 disk spread   Balanced (…
RAID 6  Like RAID 5 + error-correction       2 blocks (P+Q)  Slightly r…
RAID Design Considerations
RAID 0 — Fastest, but unsafe
RAID 1 — Most reliable, high cost
RAID 5 — Balanced cost/performance
RAID 6 — Enterprise-grade fault tolerance
RAID 10 = Striping over mirrored pairs
Factors in RAID Level Selection
Cost: Extra disk overhead (mirroring vs parity)
I/O Performance: Number of operations supported
Failure Handling: Performance when a disk fails
Rebuild Time: Duration of recovery and its impact on MTDL
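The cost factor can be made concrete by comparing how much capacity each level leaves for data. The sketch below assumes n identical disks and two-way mirroring. The function names are hypothetical, and minimum disk counts per level are not checked.

```python
def usable_disks(level, n):
    """Disks' worth of capacity available for data out of n identical disks."""
    if level == "RAID 0":
        return n                # striping only, no redundancy
    if level in ("RAID 1", "RAID 10"):
        if n % 2:
            raise ValueError("two-way mirroring needs an even number of disks")
        return n // 2           # every disk has one mirror copy
    if level == "RAID 5":
        return n - 1            # one disk's worth of distributed parity
    if level == "RAID 6":
        return n - 2            # two disks' worth of redundancy (P+Q)
    raise ValueError(f"unsupported level: {level}")


def overhead_fraction(level, n):
    """Fraction of raw capacity spent on redundancy."""
    return 1 - usable_disks(level, n) / n


for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6"):
    print(level, usable_disks(level, 8), overhead_fraction(level, 8))
```

With 8 disks, mirroring spends half the raw capacity on redundancy, while single parity spends one eighth.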
RAID Level Trade-offs
RAID 0: High performance; no redundancy
RAID 1: Mirroring; best write performance among the redundant levels, easy rebuild
RAID 5: Block striping + parity; good read performance, parity-update overhead on writes
RAID 6: Dual parity; high fault tolerance, slower writes
Note: RAID 2, 3, and 4 are largely obsolete in practice
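The write overhead of parity-based levels comes from the read-modify-write cycle for a single-block update: the old data and old parity must be read before the new data and new parity can be written, for four I/Os in total. A minimal sketch, with hypothetical function names and sample bytes:

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data, old_parity, new_data):
    """Return (new_parity, io_count) for updating one data block under RAID 5."""
    # new parity = old parity XOR old data XOR new data
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    io_count = 4  # read old data, read old parity, write new data, write new parity
    return new_parity, io_count


d0, d1 = b"\x01\x02", b"\x10\x20"
parity = xor_bytes(d0, d1)
new_d0 = b"\xff\x00"
new_parity, ios = small_write(d0, parity, new_d0)
print(new_parity == xor_bytes(new_d0, d1), ios)  # True 4
```

The other data blocks in the stripe never need to be read, which is why the update costs four I/Os regardless of how many disks the stripe spans.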
Application Scenarios
RAID 0: Suitable for performance-critical, non-critical data (e.g., cache, temp files)
RAID 1: Log files, high I/O demand, moderate storage
RAID 5: Web/database servers—frequent reads, rare writes
RAID 6: Mission-critical systems (e.g., archival, enterprise storage)
RAID Configuration Decisions
Number of disks: affects throughput and cost
Parity span: more data bits protected per parity bit → lower space overhead, higher risk of data loss
Rebuild strategy: impacts downtime and MTTR
Workload characteristics: read/write balance and data criticality
Fixed-Length Records
Record size is fixed — each record occupies the same number of bytes
Example: type instructor
ID varchar(5) → 5 bytes
name varchar(20) → 20 bytes
dept_name varchar(20) → 20 bytes
salary numeric(8,2) → 8 bytes
Total record size: 53 bytes (each varchar stored at its maximum length, 1 byte per character)
Records placed consecutively: byte offset of record n = 53 × n (n = 0, 1, 2, ...)
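The 53-byte layout and the offset rule can be tried out with Python's struct module. This is a sketch under stated assumptions: strings are ASCII and null-padded to their maximum length, and an 8-byte double stands in for numeric(8,2), which a real system would store in its own numeric format. The function names and sample rows are illustrative.

```python
import struct

# "<" disables alignment padding; 5s/20s/20s are fixed-width byte strings
# and "d" is an 8-byte double standing in for numeric(8,2).
RECORD_FORMAT = "<5s20s20sd"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 5 + 20 + 20 + 8 = 53


def pack_instructor(id_, name, dept_name, salary):
    """Encode one instructor record as exactly RECORD_SIZE bytes."""
    return struct.pack(RECORD_FORMAT, id_.encode("ascii"), name.encode("ascii"),
                       dept_name.encode("ascii"), float(salary))


def record_offset(n):
    """Byte offset of record n (numbered from 0)."""
    return RECORD_SIZE * n


def read_instructor(file_bytes, n):
    """Decode record n directly from its computed offset."""
    start = record_offset(n)
    raw = struct.unpack(RECORD_FORMAT, file_bytes[start:start + RECORD_SIZE])
    id_, name, dept_name = (f.rstrip(b"\x00").decode("ascii") for f in raw[:3])
    return id_, name, dept_name, raw[3]


file_bytes = b"".join([
    pack_instructor("10101", "Srinivasan", "Comp. Sci.", 65000),
    pack_instructor("12121", "Wu", "Finance", 90000),
])
print(RECORD_SIZE, read_instructor(file_bytes, 1))
```

Because every record has the same size, record n is found by arithmetic alone, with no need to scan the records before it.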
Drawbacks of Fixed-Length Record Layout
Block Misalignment:
Block size is usually not a multiple of the record size
Some records then cross block boundaries
Reading or writing such a record requires two block accesses
Deletion Complexity:
Hard to reclaim space
Requires record reshuffling or marking as deleted
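Both drawbacks can be seen in a short sketch. The first function checks whether a 53-byte record crosses a block boundary, assuming 4096-byte blocks (the block size is an assumption, not from the slides). The class shows one way of marking deleted slots and reusing them on the next insert. All names here are hypothetical.

```python
def spans_two_blocks(n, record_size=53, block_size=4096):
    """True if record n (numbered from 0) crosses a block boundary."""
    start = n * record_size
    end = start + record_size - 1
    return start // block_size != end // block_size


class SlotFile:
    """Fixed-length record file that marks deleted slots and reuses them."""

    def __init__(self):
        self.slots = []   # one entry per record slot; None marks a deleted slot
        self.free = []    # indexes of deleted slots available for reuse

    def insert(self, record):
        if self.free:
            slot = self.free.pop()       # reuse a hole instead of growing the file
            self.slots[slot] = record
            return slot
        self.slots.append(record)
        return len(self.slots) - 1

    def delete(self, slot):
        self.slots[slot] = None          # mark as deleted; no records are moved
        self.free.append(slot)


print(spans_two_blocks(76), spans_two_blocks(77))  # False True
print(4096 // 53, 4096 % 53)                       # 77 records fit, 15 bytes left over
```

Storing only 77 whole records per block and leaving the last 15 bytes unused avoids the boundary-crossing problem at a small cost in space.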