DISTRIBUTED MULTIMEDIA SYSTEMS
Lecture 8: Multimedia OS- file system
8.1Introduction
I welcome to lecture eight on multimedia operating system file system.The file system is
said to be the most visible part of an operating system. Most programs write or read files.
Their program code, as well as user data, is stored in files. The organization of the file
system is an important factor for the usability and convenience of the operating system.
A file sequence is a sequence of information held as a unit for storage and use in a
computer system.
8.2 Lecture objectives
At the end of this lecture you will:
i) Be able to identify different file systems available for Multimedia OS.
ii) Learn how the files are managed in the Operating System.
iii) Understand the algorithms used for disk scheduling.
8.3 Lecture Outline
File Systems
Files are stored in secondary storage, so they can be used by different applications. The
life-span of files is usually longer than the execution of a program. In traditional file
systems, the information types stored in files are sources, objects, libraries and
executables of programs, numeric data, text payroll records, etc. In multimedia systems,
the stored information also covers digitized video and audio with their related real-time
“read” and “write” demands. Therefore, additional requirements in the design and
implementation of file systems must be considered. The file system provides access and
control functions for the storage and retrieval of files. From the user’s viewpoint, it is
important how the file system allows file organization and structure. The internals, which
are more important in our context, i.e., the organization of the file system, deal with the
representation of information in files, their structure and organization in secondary
storage.
Traditional File Systems
The two main goals of traditional files systems are: (1) to provide a comfortable
76
DISTRIBUTED MULTIMEDIA SYSTEMS
interface for file access to the user and (2) to make efficient use of storage media.
File Structure
We commonly distinguish between two methods of file organization. In sequential
storage, each file is organized as a simple sequence of bytes or records. Files are
stored consecutively on the secondary storage media as shown in the following figure.
They are separated from each other by a well defined “end of file” bit pattern, character
or character sequence. A file descriptor is usually placed at the beginning of the file and
is, in some systems, repeated at the end of the file. Sequential storage is the only possible
way to organize the storage on tape, but it can also be used on disks. The main advantage
is its efficiency for sequential access, as well as for direct access. Disk access time for
reading and writing is minimized. In non-sequential storage, the data items are stored in a
non-contiguous order. There exist mainly two approaches:
One way is to use linked blocks, where physical blocks containing consecutive logical
locations are linked using pointers. The file descriptor must contain the number of blocks
77
DISTRIBUTED MULTIMEDIA SYSTEMS
occupied by the file, the pointer to the first block and it may also have the pointer to the
last block. A serious disadvantage of this method is the cost of the implementation for
random access because all prior data must be read. In MS-DOS, a similar method is
applied. A File Allocation Table (FAT) is associated with each disk. One entry in the
table represents one disk block. The directory entry of each file holds the block number
of the first block. The number in the slot of an entry refers to the next block of a file. The
slot of the last block of a file contains an end-of -file mark.
Another approach is to store block information in mapping tables. Each file is
associated with a table where, apart from the block numbers information like owner, file
size, creation time, last access time, etc., is stored. Those tables usually have a fixed size,
which means that number of block references is bounded. Files with more blocks are
referenced indirectly by additional tables assigned to the files. In UNIX, a small table
(on disk) called i-node is associated with each file (See the following figure). The
indexed sequential approach is an example for multi-level mapping; here, logical and
physical organizations are not clearly separated.
Directory Structure
Files are usually organized in directories. Most of today’s operating systems provide tree-
structured directories where the user can organize the files according to his/her personal
needs. In multimedia systems, it is important to organize the files in a way that allows
easy, fast, and contiguous data access.
Disk Management
Disk access is a slow and costly transaction. In traditional systems, a common
technique to reduce disk access is block caches. Using block cache, blocks are kept in
memory because it is expected that future read or write operations access these data
again. Thus, performance is enhanced due to shorter access time.
Disk Scheduling
Sequential storage devices (e.g., tapes) do not have a scheduling problem, for
78
DISTRIBUTED MULTIMEDIA SYSTEMS
random access storage devices, every file operation may require movements of the
read/write head. This operation, known as “to seek,” is very time consuming, i.e., seek
time in order of 250ms for CDs is still state-of-art. The actual time to read or write a disk
block is determined by:
The seek time (the time required for the movement of the read/write head).
The latency time or rotational delay (the time during which the transfer cannot
proceed until the right block or sector rotates under the read/write head).
The actual data transfer time needed for the data to copy from disk into main
memory. Usually the seek time is the largest factor of the actual transfer time. Most
systems try to keep the cost of seeking low by applying special algorithms to the
scheduling of disk read/write operations. The access of the storage device is a problem
greatly influenced by the file allocation method. Most systems apply one of the
following scheduling algorithms:
First-Come-First-Served (FCFS)
With this algorithm, the disk driver accepts requests one-at-a-time and
serves them in incoming order. This is easy to program and an intrinsically fair
algorithm. However, it is not optimal with respect to head movement because it
does not consider the location of the other queued requests. These results in a
high average seek time.
Shortest-Seek-Time First (SSTF)
At every point in time, when a data transfer is requested, SSTF selects among all
requests the one with the minimum seek time from the current head position. Therefore,
the head is moved to the closest track in the request queue. This algorithm was developed
to minimize seek time and it is in this sense optimal. SSTF is a modification of Shortest
Job First (SJF), and like SJF, it may cause starvation of some requests. Request targets in
the middle of the disk will get immediate service at the expense of requests in the
innermost and outermost disk areas.
79
DISTRIBUTED MULTIMEDIA SYSTEMS
SCAN
Like SSTF, SCAN orders requests to minimize seek time. In contrast to SSTF, it takes
the direction of the current disk movement into account. It first serves all requests in one
direction until it does not have any requests in this direction anymore. The head
movement is then reversed and service is continued. SCAN provides a very good seek
time because the edge tracks get better service times. Note that middle tracks still get a
better service then edge tracks. When the head movement is reversed, it first serves
tracks that have recently been serviced, where the heaviest density of requests, assuming
a uniform distribution, is at the other end of the disk.
C-SCAN
C-SCAN also moves the head in one direction, but it offers fairer service with more
uniform waiting times. It does not alter the direction, as in SCAN. Instead, it scans in
cycles, always increasing or decreasing, with one idle head, movement from one edge to
the other between two consecutive scans. The performance of C-SCAN is somewhat less
than SCAN.
Multimedia File systems
Compared to the increased performance of processors and networks, storage devices
have become only marginally faster. The effect of this increasing speed mismatch is the
search for new storage structures, and storage and retrieval mechanisms with respect to
the file system. Continuous media data are different from discrete data in:
Real Time Characteristics
As mentioned previously, the retrieval, computation and presentation of continuous
media is time-dependent. The data must be presented (read) before a well-defined
deadline with small jitter only.
File Size
Compared to text and graphics, video and audio have very large storage space
requirements. Since the file system has to store information ranging from small,
80
DISTRIBUTED MULTIMEDIA SYSTEMS
unstructured units like text files to large, highly structured data units like video
and associated audio, it must organize the data on disk in a way that efficiently
uses the limited storage.
Multiple Data Streams
A multimedia system must support different media at one time. It does not only
have to ensure that all of them get a sufficient share of the resources, it also must
consider tight relations between streams arriving from different sources.
There are different ways to support continuous media in file systems. Basically there are
two approaches. With the first approach, the organization of files on disk remains as is.
The necessary real-time support is provided through special disk scheduling algorithms
and sufficient buffer to avoid jitter. In the second approach, the organization of audio
and video files on disk is optimized for their use in multimedia systems. Scheduling of
multiple data streams still remains an issue of research.
Disk Scheduling Algorithms in Multimedia File System
The main goals of traditional disk scheduling algorithms are to reduce the cost of seek
operations, to achieve a high throughput and to provide fair disk access for every process.
The additional real-time requirements introduced by multimedia systems make traditional
disk scheduling algorithms, such as described previously, inconvenient for multimedia
systems. Systems without any optimized disk layout for the storage of continuous media
depend far more on reliable and efficient disk scheduling algorithms than others. In the
case of contiguous storage, scheduling is only needed to serve requests from multiple
streams concurrently. A round-robin scheduler is employed that is able to serve hard real-
time tasks. Here, additional optimization is provided through the close physical
placement of streams that are likely to be accessed together.
Earliest Deadline First:
EDF scheduling strategy as described for CPU scheduling, is also used for the file system
issue as well. Here the block of the stream with the nearest deadline would be read first.
81
DISTRIBUTED MULTIMEDIA SYSTEMS
The employment of EDF, in the strict sense results in poor throughput and excessive
seek time. Further, as EDF is most often applied as a preemptive scheduling scheme, the
costs for preemption of a task and scheduling of another task are considerably high. The
overhead caused by this is in the same order of magnitude as at least one disk seek.
Hence, EDF must be adapted or combined with file system strategies.
SCAN-Earliest Deadline First
The SCAN-EDF strategy is a combination of the SCAN and EDF mechanisms.
The seek optimization of SCAN and the real-time guarantees of EDF are combined in the
following way: like in EDF, the request with the earliest deadline is always served first;
among requests with the same deadline, the specific one that is first according to the scan
direction is served first; among the remaining requests, this principle is repeated until no
request with this deadline is left.
Group Sweeping Scheduling:
With Group Sweeping Scheduling (GSS), requests are served in cycles, in round robin
manner. To reduce disk arm movements, the set of n streams is divided into groups.
Groups are served in fixed order. Individual streams within a group are served according
to SCAN; therefore, it is not fixed at which time or order individual streams within a
group are served. In one cycle, a specific stream may be the first to be served; in another
cycle, it may be the last in the same group. A smoothing buffer which is sized according
to the cycle time and data rate of the stream assures continuity.
Mixed Strategy:
The mixed strategy was introduced based on the shortest seek (also called greedy
strategy) and the balanced strategy. As shown in following figure, every time data are
retrieved from disk they are transferred into buffer memory allocated for the respective
data stream. From there, the application process removes them one at a time. The goal
of the scheduling algorithm is:
To maximize transfer efficiency by minimizing seek time and latency.
82
DISTRIBUTED MULTIMEDIA SYSTEMS
To serve process requirements with a limited buffer space.
With shortest seek, the first goal is served, i.e., the process of which data block is closest
is served first. The balanced strategy chooses the process which has the least amount of
buffered data for service because this process is likely to run out of data. The crucial part
of this algorithm is the decision of which of the two strategies must be applied (shortest
seek or balanced strategy). For the employment of shortest, seek two criteria must be
fulfilled: the number of buffers for all processes should be balanced (i.e., all processes
should nearly have the same number of buffered data) and the overall required
bandwidth should be sufficient for the number of active processes, so that none of them
will try to immediately read data out of an empty buffer. The urgency is introduced as an
attempt to measure both. The urgency is the sum of the reciprocals of the current
“fullness” (amount of buffered data). This number measures both the relative balance of
all read processes and the number of read processes. If the urgency is large, the balance
strategy will be used; it is small, it is safe to apply the shortest seek algorithm.
8.4 End of lecture activities (self –tests)