OS Unit 3 Part 2
Files
Concepts:
A file is a named collection of related information that is recorded on secondary
storage such as magnetic disks, magnetic tapes and optical disks.
In general, a file is a sequence of bits, bytes, lines or records whose meaning is
defined by the file's creator and user.
Attributes of a File
Following are some of the attributes of a file:
Name: It is the only information which is in human-readable form.
Identifier: A unique tag (number) that identifies the file within the file system.
Type: It is needed for systems that support different types of files.
Location: Pointer to file location on device.
Size: The current size of the file.
Protection: This controls and assigns the power of reading, writing, executing.
Time, date, and user identification: This is the data for protection, security,
and usage monitoring.
File Operations
The operating system must be able to perform the basic file operations given below.
Creating a file: First, space in the file system must be found for the file. Second,
an entry for the new file must be made in the directory.
Writing a file: To write a file, we make a system call specifying both the name
of the file and the information to be written to the file. Given the name of the
file, the system searches the directory to find the file's location. The system must
keep a write pointer to the location in the file where the next write is to take
place. The write pointer must be updated whenever a write occurs.
Reading a file: To read from a file, we use a system call that specifies the name
of the file and where (in memory) the next block of the file should be put. Again,
the directory is searched for the associated entry, and the system needs to keep a
read pointer to the location in the file where the next read is to take place. Once
the read has taken place, the read pointer is updated.
Repositioning within a file: The directory is searched for the appropriate entry,
and the current-file-position pointer is repositioned to a given value.
Repositioning within a file need not involve any actual I/O. This file operation
is also known as a file seek.
Deleting a file: To delete a file, we search the directory for the named file.
Having found the associated directory entry, we release all file space, so that it
can be reused by other files, and erase the directory entry.
Protection: Access-control information determines who can do reading, writing,
executing, and so on.
Truncating a file: The user may want to erase the contents of a file but keep its
attributes. Rather than forcing the user to delete the file and then recreate it, this
function allows all attributes to remain unchanged—except for file length—but
lets the file be reset to length zero and its file space released.
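The operations above can be tried out with Python's os module, which wraps the
underlying system calls. This is a minimal sketch, not a definitive implementation:
the file name notes.txt is hypothetical, and note that real systems usually pass a
file descriptor obtained from open rather than the file name on every call.

```python
import os
import tempfile


def demo_file_operations():
    """Walk through create, write, seek, read, truncate, and delete."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "notes.txt")  # hypothetical file name

    # Create: the OS finds space and adds a directory entry.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

    # Write: the file position advances past the bytes written.
    os.write(fd, b"hello world")

    # Reposition (seek): moves the position without transferring data.
    os.lseek(fd, 6, os.SEEK_SET)

    # Read: starts at the current position and advances it.
    tail = os.read(fd, 5)

    # Truncate: the length becomes zero, the file itself remains.
    os.ftruncate(fd, 0)
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)

    # Delete: release the space and erase the directory entry.
    os.remove(path)
    still_exists = os.path.exists(path)
    os.rmdir(directory)
    return tail, size_after_truncate, still_exists


result = demo_file_operations()
```

Here the read after the seek returns the five bytes starting at offset 6, and the
file size reported after truncation is zero.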
In brief
File Types
File System Structure
A file structure must follow a format that the operating system can understand.
A file has a certain defined structure according to its type.
A text file is a sequence of characters organized into lines.
A source file is a sequence of procedures and functions.
An object file is a sequence of bytes organized into blocks that are
understandable by the machine.
When an operating system defines different file structures, it must also contain
the code to support them. Unix and MS-DOS support a minimal number of file
structures.
Files can be structured in several ways. Three common structures are described
briefly below, one by one.
File Structure 1
Here, as shown in figure 1, the file is an unstructured sequence of bytes.
The OS does not care what is in the file; all it sees are bytes.
File Structure 2
In the second structure, shown in figure 2, a file is a sequence of fixed-length
records, each with some internal structure.
Central to the idea of a file as a sequence of records is that a read operation
returns one record and a write operation overwrites or appends one record.
File Structure 3
In the last structure, shown in figure 3, a file consists of a tree of records,
not necessarily all the same length, each containing a key field in a fixed
position in the record. The tree is sorted on the key field to allow rapid
searching for a specific key.
6.2 File Access method
File access mechanism refers to the manner in which the records of a file may be
accessed. There are several ways to access files −
Sequential access
Direct/Random access
Indexed sequential access
1. Sequential Access
A sequential access is that in which the records are accessed in some sequence,
i.e., the information in the file is processed in order, one record after the other.
This access method is the most primitive one.
The idea of sequential access is based on the tape, which is a sequential
access device.
The sequential access method works best when most of the records in a file are to
be processed, for example in transaction files.
Example: Compilers usually access files in this fashion.
In Brief:
Data is accessed one record right after another, in order.
A read command causes the pointer to move ahead by one record.
A write command allocates space for the record and moves the pointer to the new
end of file.
Such a method is reasonable for tape.
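The read and write behaviour summarised above can be sketched as a small tape-like
class. This is an illustrative model only: the class name SequentialFile and the
reset (rewind) operation are assumptions added for the example.

```python
class SequentialFile:
    """Tape-like file: records are read in order or appended at the end."""

    def __init__(self):
        self.records = []
        self.position = 0  # index of the next record to read

    def read_next(self):
        """Return the next record and move the pointer ahead by one."""
        if self.position >= len(self.records):
            return None  # end of file reached
        record = self.records[self.position]
        self.position += 1
        return record

    def write(self, record):
        """Append a record and move the pointer to the new end of file."""
        self.records.append(record)
        self.position = len(self.records)

    def reset(self):
        """Rewind to the first record, as with a tape (assumed operation)."""
        self.position = 0


f = SequentialFile()
for r in ["t1", "t2", "t3"]:
    f.write(r)
f.reset()
first = f.read_next()
second = f.read_next()
```

There is no way to ask for the third record directly; it is reached only by
reading the first two.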
Advantages of sequential access
It is simple to program and easy to design.
A sequential file makes the best use of storage space.
Disadvantages of sequential access
Accessing a sequential file is a time-consuming process.
It has high data redundancy.
Random searching is not possible.
2. Direct Access
Sometimes it is not necessary to process every record in a file.
It may also be unnecessary to process the records in the order in which they are
stored. In all such cases, direct access is used.
The disk is a direct access device, which gives us the ability to access any file
block at random.
The file is viewed as a collection of physical blocks and the records in those blocks.
Example: Databases are often of this type since they allow query processing that
involves immediate access to large amounts of information. All reservation systems fall
into this category.
In brief:
This method is useful for disks.
The file is viewed as a numbered sequence of blocks or records.
There are no restrictions on which blocks are read or written; it can be done in any
order.
User now says "read n" rather than "read next".
"n" is a number relative to the beginning of file, not relative to an absolute
physical disk location.
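A "read n" request can be sketched with fixed-length records: record n starts at
byte offset n times the record size, relative to the beginning of the file. The
record layout below (a 4-byte id plus a 16-byte name) and the helper names are
assumptions made for illustration, and an in-memory buffer stands in for the disk.

```python
import io
import struct

# Hypothetical fixed-length record: unsigned 32-bit id + 16-byte name.
RECORD = struct.Struct("<I16s")


def write_record(f, n, rec_id, name):
    """Write record number n (relative to the start of the file)."""
    f.seek(n * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(f, n):
    """Read record number n directly, without reading records 0..n-1."""
    f.seek(n * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\0").decode("ascii")


f = io.BytesIO()  # stands in for a file on disk
for n, name in enumerate(["alice", "bob", "carol"]):
    write_record(f, n, 100 + n, name)

third = read_record(f, 2)  # "read 2" before "read 0": any order is allowed
first = read_record(f, 0)
```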
Advantages:
Direct access file helps in online transaction processing system (OLTP) like
online railway reservation system.
In a direct access file, sorting of the records is not required.
It accesses the desired records immediately.
It updates several files quickly.
It has better control over record allocation.
Disadvantages:
Direct access file does not provide backup facility.
It is expensive.
It uses storage space less efficiently than a sequential file.
It requires more storage space.
It is expensive because it requires special software.
It is less efficient in the use of storage space as compared to other file
organizations.
Swapping:
Swapping is a mechanism in which a process can be swapped temporarily out of
main memory (or move) to secondary storage (disk) and make that memory
available to other processes.
At some later time, the system swaps back the process from the secondary
storage to main memory.
Although swapping usually affects performance, it helps in running multiple
large processes in parallel. For this reason, swapping is also known as a
technique for memory compaction.
Swap space is space on the hard disk that substitutes for physical memory.
It is used as virtual memory and contains process memory images.
Whenever the computer runs short of physical memory, it uses its virtual memory
and stores information from memory on disk.
This means that given the starting block address and the length of the file (in
terms of blocks required), we can determine the blocks occupied by the file.
The directory entry for a file with contiguous allocation contains
1. Address of starting block
2. Length of the allocated portion.
The file ‘mail’ in the following figure starts at block 19 with length = 6
blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23, and 24.
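The arithmetic for contiguous allocation can be written out directly. This is a
small sketch; the function names are chosen for illustration only.

```python
def contiguous_blocks(start, length):
    """List the disk blocks of a file, given its directory entry."""
    return list(range(start, start + length))


def physical_block(start, length, logical):
    """Map a logical block number (0-based) to its disk block."""
    if not 0 <= logical < length:
        raise IndexError("logical block lies outside the file")
    return start + logical


# Directory entry for the file 'mail': starting block 19, length 6.
mail_blocks = contiguous_blocks(19, 6)
fourth_block = physical_block(19, 6, 3)
```

Because the mapping is a single addition, both sequential and direct access are
cheap under this scheme.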
Each file carries a list of links to disk blocks.
Directory contains link / pointer to first block of a file.
No external fragmentation
Effective for sequential access files.
Inefficient for direct access files.
Advantages:
1. File size does not have to be specified.
2. No external fragmentation.
Disadvantages:
1. It supports sequential access efficiently but is not suited to direct access
2. Each block contains a pointer, wasting space
3. Blocks are scattered across the disk, so a large number of disk seeks may be necessary
4. Reliability: what if a pointer is lost or damaged?
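Why linked allocation suits sequential access but not direct access can be seen
by counting block reads. In the sketch below the block numbers, the END marker,
and the dictionary standing in for the per-block pointers are all made up for
illustration.

```python
END = -1  # hypothetical end-of-file marker stored in the last block


def nth_block(next_pointer, first_block, n):
    """Follow the chain to logical block n; return (disk block, blocks read)."""
    block = first_block
    reads = 1  # the first block itself must be read
    for _ in range(n):
        block = next_pointer[block]  # pointer stored inside the current block
        if block == END:
            raise IndexError("logical block lies outside the file")
        reads += 1
    return block, reads


# A file whose blocks are chained 9 -> 16 -> 1 -> 10 -> 25 (made-up numbers).
next_pointer = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}
block, reads = nth_block(next_pointer, 9, 3)
```

Reaching logical block 3 takes four block reads, because each pointer lives in
the block before it. A damaged pointer anywhere in the chain also cuts off the
rest of the file.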
3. Indexed Allocation
In this scheme, a special block known as the Index block contains the pointers
to all the blocks occupied by a file. Each file has its own index block.
The ith entry in the index block contains the disk address of the ith file block.
The directory entry contains the address of the index block as shown in the
image:
Each file has its own index block which stores the addresses of disk space
occupied by the file.
Directory contains the addresses of index blocks of files.
Advantages:
This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.
It overcomes the problem of external fragmentation.
Disadvantages:
The pointer overhead for indexed allocation is greater than for linked allocation.
For very small files, say files that span only 2-3 blocks, indexed allocation
still reserves one entire block (the index block) for the pointers, which is
inefficient in terms of storage utilization. In linked allocation, by contrast,
we lose the space of only one pointer per block.
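The lookup and the overhead comparison can be sketched as follows. The block
size of 512 bytes, the pointer size of 4 bytes, the file name, and all block
numbers are assumptions chosen only to make the arithmetic concrete.

```python
BLOCK_SIZE = 512   # bytes per disk block (assumed)
POINTER_SIZE = 4   # bytes per block pointer (assumed)

# Hypothetical disk state: block 19 is the index block of the file "jeep".
directory = {"jeep": 19}
disk = {19: [9, 16, 1, 10, 25]}  # entry i holds the address of file block i


def physical_block(name, i):
    """Direct access: one index-block lookup, no chain to follow."""
    index_block = disk[directory[name]]
    return index_block[i]


def indexed_overhead_bytes(file_blocks):
    """One whole index block, however small the file is."""
    return BLOCK_SIZE


def linked_overhead_bytes(file_blocks):
    """One pointer inside every block of the file."""
    return file_blocks * POINTER_SIZE


block = physical_block("jeep", 3)
small_file = (indexed_overhead_bytes(3), linked_overhead_bytes(3))
```

Under these assumed sizes, a 3-block file costs 512 bytes of pointer space with
indexed allocation and 12 bytes with linked allocation.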
1. Single-level directory –
The single-level directory is the simplest directory structure.
All files are contained in the same directory, which makes it easy to support and
understand.
A single-level directory has a significant limitation, however, when the number
of files increases or when the system has more than one user.
Since all the files are in the same directory, they must have unique names. If
two users call their dataset test, then the unique-name rule is violated.
Advantages:
Since it is a single directory, its implementation is very easy.
If the directory holds only a few files, searching is faster.
The operations like file creation, searching, deletion, updating are very easy in
such a directory structure.
Disadvantages:
There is a chance of name collisions, because two files cannot have the same
name.
Searching becomes time-consuming if the directory is large.
Files of the same type cannot be grouped together.
2. Two-level directory –
A single-level directory often leads to confusion of file names among
different users. The solution to this problem is to create a separate directory
for each user.
In the two-level directory structure, each user has their own user file directory
(UFD).
The UFDs have similar structures, but each lists only the files of a single user.
The system's master file directory (MFD) is searched whenever a user logs in.
The MFD is indexed by username or account number, and each entry points to
the UFD for that user.
Advantages:
We can give full path like /User-name/directory-name/.
Different users can have same directory as well as file name.
Searching for files becomes easier thanks to path names and user grouping.
Disadvantages:
A user is not allowed to share files with other users.
It is still not very scalable: two files of the same type cannot be grouped
together for the same user.
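The MFD and UFD lookup described in this section can be modelled with nested
dictionaries. This is only a sketch; the user names and file contents are made
up.

```python
# Master file directory: user name -> that user's file directory (UFD).
mfd = {
    "alice": {"test": "alice's dataset"},
    "bob": {"test": "bob's dataset"},
}


def lookup(mfd, user, filename):
    """Find the user's UFD through the MFD, then search only that UFD."""
    ufd = mfd[user]
    return ufd[filename]


alice_test = lookup(mfd, "alice", "test")
bob_test = lookup(mfd, "bob", "test")
```

Both users own a file called test without any collision, because each name is
searched only within its owner's UFD. The same isolation is what makes sharing
files between users awkward in this scheme.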
3. Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural
generalization is to extend the directory structure to a tree of arbitrary height.
This generalization allows the user to create their own subdirectories and to
organize their files accordingly.
A tree structure is the most common directory structure. The tree has a root
directory, and every file in the system has a unique path.
Advantages:
Very general, since a full path name can be given.
Very scalable; the probability of name collision is low.
Searching becomes very easy; we can use both absolute and relative paths.
Disadvantages:
Not every file fits into the hierarchical model; files may need to be saved in
multiple directories.
We cannot share files.
It is inefficient, because accessing a file may require going through multiple directories.
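Path resolution in a tree-structured directory can be sketched with nested
dictionaries, where each dictionary is a directory. The directory and file
names below are hypothetical, and the sketch does not handle special entries
such as "." or "..".

```python
# Hypothetical directory tree: dictionaries are directories, strings are files.
root = {
    "home": {"alice": {"notes.txt": "file contents"}},
    "etc": {},
}


def resolve(root, path, cwd="/"):
    """Resolve an absolute or relative path; return (node, lookups made)."""
    if not path.startswith("/"):
        # Relative path: interpret it from the current working directory.
        path = cwd.rstrip("/") + "/" + path
    parts = [p for p in path.split("/") if p]
    node = root
    for name in parts:
        node = node[name]  # one directory search per path component
    return node, len(parts)


absolute = resolve(root, "/home/alice/notes.txt")
relative = resolve(root, "notes.txt", cwd="/home/alice")
```

Both calls reach the same file, and each needs one directory search per path
component, which is the cost referred to in the last disadvantage above.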
Device Management
Device management in an operating system is the management of
I/O devices such as a keyboard, magnetic tape, disk, printer, microphone,
USB ports, scanner, etc., as well as supporting units like control
channels.
The operating system handles communication with each device via its driver.
The OS components give a uniform interface for accessing devices with various
physical features. There are various functions of device management in the
operating system. Some of them are as follows:
It keeps track of data, status, location, uses, etc. The file system is a term used
to define a group of facilities.
It enforces the pre-determined policies and decides which process receives
the device, when, and for how long.
It improves the performance of specific devices.
It monitors the status of every device, including printers, storage drives and
other devices.
Device Controllers
In computer systems, I/O devices do not usually communicate with the
operating system directly. The operating system manages their tasks with the
help of an intermediate electronic device called a device controller.
The device controller knows how to communicate with the operating system
as well as how to communicate with I/O devices. So the device controller is an
interface between the computer system (operating system) and I/O devices.
The device controller communicates with the system using the system bus. The
diagram below shows how the device controller, I/O devices, and the system bus
are connected.
In the above diagram, some I/O devices have DMA (Direct Memory Access) via
device controllers and some do not. Devices that have a DMA path to memory are
much faster than devices that have a non-DMA path. A device with a non-DMA
path must go through the processor to access memory: its request is scheduled
by the scheduler, loaded into RAM, and only then gets the CPU to execute the
instructions that access memory, so it is slower than a device that has DMA.
A device controller can generally control more than one I/O device, but it most
commonly controls only a single device. A device controller is implemented on a
chip that is attached to the system bus, and a connection cable runs from the
controller to each device it controls. The operating system communicates with
device controllers and the device controllers communicate with devices, so
the operating system communicates with I/O devices indirectly.
Device Driver
Device Drivers are a set of programs that act as an intermediary between the
operating system of the computer and the hardware components.
In computing, a device driver is a special kind of software program that
controls a specific hardware device and enables it to communicate with the
computer's operating system. A device driver communicates with the hardware
through the computer subsystem or computer bus to which the hardware is connected.
Device Drivers are essential for a computer system to work properly because,
without a device driver, the particular hardware fails to work accordingly, which
means it fails in doing the function/action it was created to do. Most use the
term Driver, but some may say Hardware Driver, which also refers to
the Device Driver.
Types of Device Driver
1. Kernel-mode Device Driver
Kernel-mode device drivers cover generic hardware that loads with the
operating system as part of the OS, such as the BIOS, motherboard,
processor, and other hardware that is part of the kernel software. These
are the minimum device drivers that each operating system requires.
2. User-mode Device Driver
Besides the devices that the kernel brings up to run the system, the user
also attaches devices while using the system. Those devices need device
drivers to function, and those drivers fall under user-mode device drivers.
For example, any plug-and-play device that the user needs comes under this
category.
How it Works:
1. I/O Device Ready: When an I/O device, like a keyboard or a disk drive, has data
ready for transfer, it generates an interrupt signal to the CPU.
2. Interrupt Handler: The CPU, upon receiving the interrupt, temporarily suspends
its current operation and branches to an interrupt service routine (ISR).
3. Data Transfer: The ISR handles the data transfer between the I/O device and
memory, either directly or using DMA (Direct Memory Access).
4. Return to Main Program: After the data transfer is complete, the ISR returns
control to the interrupted program, and the CPU resumes its previous operation.
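The four steps above can be simulated in a few lines. This is a simplified
model, not how real hardware is programmed: the function name run, the step
counter standing in for the main program, and the dictionary of interrupt
arrival times are all assumptions for illustration.

```python
from collections import deque


def run(main_steps, interrupts):
    """Simulate a CPU that services pending interrupts between main steps.

    interrupts maps a step number to the data a device has ready then.
    """
    memory = []        # destination of the transferred data
    log = []           # order in which work was executed
    pending = deque()  # interrupt requests not yet serviced
    for step in range(main_steps):
        if step in interrupts:
            pending.append(interrupts[step])  # 1. device raises an interrupt
        while pending:
            data = pending.popleft()          # 2. main program is suspended
            memory.append(data)               # 3. ISR transfers the data
            log.append(f"isr:{data}")
        log.append(f"main:{step}")            # 4. main program resumes
    return memory, log


# A keyboard delivers the character "k" just before main step 1.
memory, log = run(3, {1: "k"})
```

The log shows the ISR running between main steps 0 and 1, after which the main
program continues where it left off.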
Disadvantages of Interrupt-Driven I/O:
In the first case it is simple, because the two have different address spaces and
instruction sets, but it requires more buses.
Isolated I/O
In Isolated I/O, the CPU uses the same buses (wires) to talk to both memory and
I/O devices, but it has separate control signals to tell whether it’s dealing with
memory or an I/O device.
I/O devices have special addresses called ports.
When the CPU wants to communicate with an I/O device:
o It puts the port address on the address bus.
o It uses special control lines like I/O Read or I/O Write.
o Then data is sent or received using the data bus.
As memory and I/O have separate address spaces, it’s called Isolated I/O. Also,
the CPU uses different instructions for memory and I/O (like IN and OUT for I/O).
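The separation of the two address spaces can be modelled as two independent
tables selected by the kind of instruction used. This is a toy model: the class
name, the method names, and the address and data values are assumptions, with
the out and in_ methods standing in for the OUT and IN instructions.

```python
class IsolatedIOSystem:
    """Toy model: one shared bus, two address spaces chosen by control line."""

    def __init__(self):
        self.memory = {}  # memory address space
        self.ports = {}   # I/O port address space

    def store(self, address, value):
        """Memory write instruction (Memory Write control line)."""
        self.memory[address] = value

    def load(self, address):
        """Memory read instruction (Memory Read control line)."""
        return self.memory.get(address, 0)

    def out(self, port, value):
        """Stands in for OUT (I/O Write control line)."""
        self.ports[port] = value

    def in_(self, port):
        """Stands in for IN (I/O Read control line)."""
        return self.ports.get(port, 0)


bus = IsolatedIOSystem()
bus.store(0x60, 0xAA)  # memory location 0x60
bus.out(0x60, 0x1C)    # port 0x60: same number, different address space
```

The same numeric address refers to two different locations, so adding a device
at port 0x60 takes nothing away from the memory address space.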
Advantages of Isolated I/O
Large I/O Address Space: Isolated I/O allows for a larger I/O address space
because I/O devices have their own separate address space, independent of
the system memory.
Greater Flexibility: It offers greater flexibility, as I/O devices can be added or
removed without affecting the memory address space.
Improved Reliability: Since I/O devices do not share the same address space
as memory, failures in I/O devices are less likely to affect the memory or
other devices, improving system reliability.
Disadvantages of Isolated I/O
Slower I/O Operations: I/O operations may be slower because isolated I/O
requires special instructions, which add extra processing steps.
More Complex Programming: Programming becomes more complex due to
the need for dedicated I/O instructions, such as IN and OUT, which are
separate from standard memory instructions.
Applications of Memory-Mapped I/O
Graphics Processing: Memory-mapped I/O is widely used in graphics cards
to provide fast access to frame buffers and control registers. Graphics data
is mapped directly to memory, allowing the CPU to interact with the
graphics hardware as if it were accessing normal memory. This enables
efficient rendering and display operations.
Differences between memory mapped I/O and isolated I/O