Chapter 16
Disk Storage,
Basic File Structures, and
Hashing
Part2
1
2
File Organization
The database is stored as a collection of files.
Each file is a sequence of records.
A record is a sequence of fields.
Records are stored on disk blocks.
3
File, Fixed Length Records, and
Variable Length Records
A file can have fixed-length records or variable-
length records.
1- Fixed length records:
The starting position and size of each field are
known and fixed.
Each record is of fixed length. Pad with spaces.
4
File, Fixed Length Records, and
Variable Length Records
2- Variable length records
different records in the file have different sizes.
saves space.
the starting position and size of each field is
5
Blocking Factor
Blocking Factor (bfr) - the number of records that can
fit into a single block.
Bfr ( no of records in the block) = ⌊B/R⌋
B : Block size in bytes
R: Record size in bytes
Example:
Record size R = 225 bytes,
Block Size B = 2,000 bytes,
File size= 100000 record,
How many records fit into the block?
bfr (no of records in the block) = 2000/255 = 7.8 = 7Record
6
Example.cont
How many blocks are required to store the entire
file?
The number of blocks b needed to store a file of r
records:
B (no of blocks required to store a file) = r/bfr blocks
= 100000/7= 14285.7 = 14286= blocks
7
Placing File Records on Disk
Unspanned Records
Spanned Records
8
Types of records organization
Spanned & Unspanned Records
Unspanned records:
A record is found in one and only one block.
records do not span(not divided) across block
boundaries.
Used with fixed-length records having B R
Spanned records:
Records are been allowed to span across block
boundaries.
Used with variable-length records having R B
A pointer at the end of the first block points to
the block containing the remainder of the
record, not the next consecutive block on disk.
9
Types of records organization
Spanned & Unspanned Records
In variable-length records, either spanned or unspanned
organization can be used.
10
Allocating File Blocks on Disk
The physical disk blocks that are allocated to hold the
records of a file can be contiguous, linked, or indexed.
In contiguous allocation, the file blocks are allocated
to consecutive disk blocks.
In linked allocation, each file block contains a pointer
to the next file block.
In indexed allocation, one or more index blocks
contain pointers to the actual file blocks.
A file header or file descriptor contains information
about a file (e.g., the disk address, record format
descriptions, etc.)
11
Operation on Files
Operation on files are :
Retrieval operations: do not change any data in the file.
Update operations : change the file by insertion or
deletion or modification of field values
In either case, it needs to find a record(s) based on a selection
certain criteria such as :
Select Name where Salary > 3000
12
Operation on Files.cont
Typical file operations ( access methods) include:
OPEN: Prepares the file for reading or writing , sets the file pointer to
the beginning of the file.
FIND: Searches for the first file record that satisfies a certain
condition, and makes it the current file record.
READ: Reads the current file record into a program variable.
FINDNEXT: Searches for the next file record (from the current
record) that satisfies a certain condition, and makes it the current file
record.
INSERT: Inserts a new record into the file & makes it the current file
record.
13
Operation on Files.cont
DELETE: Removes the current file record from the file
MODIFY: Changes the values of some fields of the current file
record.
CLOSE: Terminates access to the file (release the buffers)
14
Organization of Records in Files
The general techniques for organizing records of
a file on disk :
1. Heap File Organization.
2. Sorted/Ordered File Organization.
3. Hashing File Organization.
15
Files of unordered records (Heap File)
Records are placed in the file in the order in which
they are inserted. Such an organization is called a
heap or pile file.
It is a simplest and most basic types of organization
Insertion : inserting a new record at the end of
file.
(very efficient)
Searching
requires a linear search (expensive)
Deleting
requires a linear search, then delete.
Leaves unused space in the disk block.
16
Heap File Organization
Either spanned or unspanned organization with
fixed length or variable length records can be used
for an unordered file (Heap file).
For a file of unordered fixed-length records using
unspanned blocks and contiguous allocation, it is
straightforward to access any record by its position
in the file.
17
Hashing Techniques
18