KEMBAR78
31 File Structures | PDF | Data Buffer | Computer Data Storage
0% found this document useful (0 votes)
84 views20 pages

31 File Structures

Database systems store records in disk blocks to minimize data transfers between disk and memory. Records are logically stored in files and physically stored in disk blocks. Buffer managers in memory cache frequently accessed blocks to reduce disk I/O. Variable length records require additional metadata like separators or lengths, and can be organized using spanned or unspanned blocking with pointers. Slotted page structures organize multiple records efficiently in blocks through headers.

Uploaded by

Obaid ur Rehman
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
84 views20 pages

31 File Structures

Database systems store records in disk blocks to minimize data transfers between disk and memory. Records are logically stored in files and physically stored in disk blocks. Buffer managers in memory cache frequently accessed blocks to reduce disk I/O. Variable length records require additional metadata like separators or lengths, and can be organized using spanned or unspanned blocking with pointers. Slotted page structures organize multiple records efficiently in blocks through headers.

Uploaded by

Obaid ur Rehman
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Database Systems

Storage and File Structure


Database Storage
n Database is stored on disk as a collection of
files.
n Example: MySQL stores each relation in a
separate file
n Each file is logically a sequence of records.
n A record is a sequence of fields or values.

n Records are physically stored in disk blocks.


n Blocks are units of both storage and data
transfer.
n Blocking factor: number of records per block
Storage Access
n Database systems seek to minimize the
number of block transfers between the disk
and memory.
n Example: Keep as many blocks as possible

in main memory.
n Buffer portion of main memory available to
store copies of disk blocks.
n Buffer Manager: refers to the subsystem
responsible for managing (allocating) buffer
space
Buffer/Block Access
1. If the block is already in the buffer, buffer
manager returns the address of the block in
main memory
2. If the block is not in the buffer, the buffer
manager
1. Allocates space in the buffer for the block

1. Replacing (throwing out) some other block, if


required, to make space for the new block.
2. Replaced block written back to disk only if it was
modified.
2. Reads the block from the disk to the buffer,
and returns the address of the block in main
memory to requester.
Buffer-Replacement Policies
n LRU replace the block least recently used.
n Can be a bad strategy for certain access patterns
involving repeated scans of large data (nested look
join MRU strategy
n Pinned block memory block that is not allowed
to be written back to disk.
n Forced output: Write back the block to disk and
free the occupied space even though buffer space
is not needed.
n Toss-immediate strategy frees the space
occupied by a block as soon as the final tuple of
that block has been processed.
Which methods are used for disk
block allocation for storing a file?
Disk Block Allocation
n The physical disk blocks that are allocated to hold
the records of a file can be contiguous, linked, or
indexed.

File: Accounts File: Accounts File: Accounts


Start: 4, Length: 3 Start: 1, End: 10 Blocks: 1,4,10
File Organization
n Header (descriptor) of a database file
includes many information items to
describe the corresponding relation,
such as:
n Field names and their data types

n Addresses of the file blocks on disk


Unordered (Heap) Files
n New records are inserted at the end of the file.
n Record insertion is quite efficient.

n To search for a record, a linear search through


the file records is necessary.
n Quite expensive: requires reading and

searching half the file blocks on the average.


n Reading the records in order of a particular field
requires sorting the file records.
Fixed-Length Records
n Assumes record size is fixed
n Each file has records of one particular
type only
n Different files are used for different
relations
n Which method among spanned or un-
spanned is easiest to implement?

n Un-spanned blocking
Fixed-Length Records (Example)
n Account = {acc_num char(10), branch char(22), balance
double (8)}
n n as the size of each record in bytes (e.g. 40 for Account)
n Storage approach:
n Store record i starting from byte n * (i 1)
n Record access is simple but records may cross blocks
n Deletion of record i:
alternatives:
n move records i + 1, . . ., n
to i, . . . , n 1
n move record n to i
n do not move records, but
link all free records on a
free list.
Free Lists
n Store the address of the
first empty record in the
file header.
n Use this first record to
store the address of the
second empty record,
and so on
n Can think of these More space efficient
stored addresses as representation
pointers since they reuse actual space of the
point to the location of free records to store
a record. pointers (no pointers in
the in-use records)
Sequential File Organization
n File records are kept sorted by the values of an
ordering field (improved search, costly writes)
n Use the attribute used more often in queries (aka
search-key)

Why keep
pointers?
Sequential File Operations
n Delete use pointer chains
n Insert locate the position
where the record is to be
inserted
n if there is free space insert
there
n if no free space, insert the
record in an overflow
block
Need to reorganize the
n In either case, pointer
file from time to time to
chain must be updated
restore sequential order
Variable-Length Records (1)
n Variable-length records arise in database systems
in several ways:
n Record types that allow variable lengths for one or
more fields (very common).
n Storage of multiple record types in a file.

n Usually spanned blocking is used with such files.

n Files of variable-length records require additional


information to be stored in each record, such as
separator characters or data-value length.
Variable-Length Records (2)
n Approach 1: Field separators

n Approach 2: Data-value length


n Approach 3: Offset and data-value length, all fields
start at predefined location, but extra indirection
required for variable length fields

A-102 10 400 G10 Markaz


acc_num balance
branch
Un-spanned vs. Spanned Organization
n File records can be un-spanned (no record can
span two blocks) or spanned (a record can be
stored in more than one block).
How to manage multiple records
in one block for efficient storage
and retrieval of records?
Slotted Page Structure

n Slotted page header contains:


n number of record entries
n end of free space in the block
n location and size of each record
n Records are moved within a page to keep them
contiguous with no empty space between them.
n Header entries must be updated.
n External pointers should point to header entry instead of
pointing directly to the record.
Conclusion
n Several types of data storage exist with varying
reliability levels.
n We can reduce the likelihood of physical failure by
retaining multiple copies of data.
n Sequence of records can logically be organized in
a file and mapped onto disk blocks.

n Reading Resources
n Fundamentals of Database Systems (Chapter 13)
n Next
n Indexing Techniques

You might also like