File System Notes UNIT V
The file system provides the mechanism for on-line storage of and access to both data and programs of the
operating system and all the users of the computer system. The file system consists of two distinct parts: a collection of
files, each storing related data, and a directory structure, which organizes and provides information about all the files
in the system.
FILE CONCEPT
A file is a collection of related information that is recorded on secondary storage. From a user's perspective, a
file is the smallest allotment of logical secondary storage, and data cannot be written to secondary storage unless they
are within a file.
Four terms are in common use when discussing files: Field, Record, File and Database
A field is the basic element of data. An individual field contains a single value, such as an employee’s last
name, a date, or the value of a sensor reading. It is characterized by its length and data type.
A record is a collection of related fields that can be treated as a unit by some application program. For example,
an employee record would contain such fields as name, social security number, job classification, date of hire,
and so on.
A file is a collection of similar records. The file is treated as a single entity by users and applications and may
be referenced by name.
A database is a collection of related data. A database may contain all of the information related to an
organization or project, such as a business or a scientific study. The database itself consists of one or more types
of files.
File Attributes:
A file has the following attributes:
Name: The symbolic file name is the only information kept in human readable form.
Identifier: This unique tag, usually a number, identifies the file within the file system; it is the non-human-readable
name for the file.
Type: This information is needed for those systems that support different file types.
Location: This information is a pointer to a device and to the location of the file on that device.
Size: The current size of the file (in bytes, words, or blocks), and possibly the maximum allowed size, are
included in this attribute.
Protection: Access-control information determines who can do reading, writing, executing, and so on.
Time, date, and user identification: This information may be kept for creation, modification, and last use.
These data can be useful for protection, security, and usage monitoring.
File Operations:
The operating system can provide system calls to create, write, read, reposition, delete, and truncate files. The file
operations are described as follows:
Creating a file: Two steps are necessary to create a file. First, space in the file system must be found for the file.
Second, an entry for the new file must be made in the directory. The directory entry records the name of the file
and the location in the file system, and possibly other information.
Writing a file: To write a file, we make a system call specifying both the name of the file and the information to be
written to the file. Given the name of the file, the system searches the directory to find the location of the file. The
system must keep a write pointer to the location in the file where the next write is to take place. The write pointer must
be updated whenever a write occurs.
Reading a file: To read from a file, we use a system call that specifies the name of the file and where (in main memory)
the next block of the file should be put. Again, the directory is searched for the associated directory entry, and the
system needs to keep a read pointer to the location in the file where the next read is to take place. Once the read
has taken place, the read pointer is updated.
Repositioning within a file: The directory is searched for the appropriate entry, and the current-file-position is set to a
given value. Repositioning within a file does not need to involve any actual I/O. This file operation is also known as a
file seek.
Deleting a file: To delete a file, we search the directory for the named file. Having found the associated directory entry,
we release all file space, so that it can be reused by other files, and erase the directory entry.
Truncating a file: The user may want to erase the contents of a file but keep its attributes. Rather than forcing the
user to delete the file and then recreate it, this function allows all attributes to remain unchanged (except for file
length) but lets the file be reset to length zero and its file space released.
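The operations above can be tried out with a short sketch. It assumes a POSIX-style interface as exposed by Python's os module, where calls after the initial open take a file descriptor rather than the file name; the scratch file name is hypothetical and chosen only for illustration.

```python
import os
import tempfile

# Hypothetical scratch file; the name is chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Create: the OS finds space and adds a directory entry for the new file.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

# Write: the kernel advances the current file position after each write.
os.write(fd, b"hello, file system")

# Reposition (seek): only the current-file-position changes.
os.lseek(fd, 7, os.SEEK_SET)

# Read: continues from the current position and advances it.
chunk = os.read(fd, 4)

# Truncate: the length becomes zero while the other attributes are kept.
os.ftruncate(fd, 0)
size_after_truncate = os.fstat(fd).st_size
os.close(fd)

# Delete: release the file space and erase the directory entry.
os.unlink(path)
exists_after_delete = os.path.exists(path)
```

Here chunk holds the four bytes starting at offset 7, which shows that the read continued from the repositioned pointer rather than from the start of the file.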
File Types:
Files are classified into different types. The file name is split into two parts, a name and an extension. The system
uses the extension to indicate the type of the file and the type of operations that can be done on that file.
ACCESS METHODS
When a file is used, this information must be accessed and read into computer memory. The information in the
file can be accessed in several ways. There are two major access methods as follows:
Sequential Access: Information in the file is processed in order, one record after the other. A read operation reads the
next portion of the file and automatically advances a file pointer, which tracks the I/O location. Similarly, a write appends
to the end of the file and advances to the end of the newly written material (the new end of file). Sequential access is based
on a tape model of a file, and works as well on sequential-access devices as it does on random-access ones.
Direct Access: A file is made up of fixed length logical records that allow programs to read and write records rapidly in no
particular order. The direct-access method is based on a disk model of a file, since disks allow random access to any file
block. For direct access, the file is viewed as a numbered sequence of blocks or records. A direct-access file allows arbitrary
blocks to be read or written. There are no restrictions on the order of reading or writing for a direct-access file. For the
direct-access method, the file operations must be modified to include the block number as a parameter. Thus, we have
read n, where n is the block number, rather than read next, and write n rather than write next.
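The contrast between read next and read n can be sketched as follows. This is only an illustration: the 8-byte record size, the in-memory file, and the helper names read_next, read_n, and write_n are assumptions, not part of any particular operating system interface.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

# An in-memory "file" holding four fixed-length records.
records = [b"rec-0000", b"rec-0001", b"rec-0002", b"rec-0003"]
f = io.BytesIO(b"".join(records))

def read_next(f):
    """Sequential access: read the record at the file pointer and advance it."""
    return f.read(RECORD_SIZE)

def read_n(f, n):
    """Direct access: jump straight to record n, then read it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)

def write_n(f, n, data):
    """Direct access: overwrite record n in place."""
    assert len(data) == RECORD_SIZE
    f.seek(n * RECORD_SIZE)
    f.write(data)

first = read_next(f)    # record 0
second = read_next(f)   # record 1, because the pointer advanced
last = read_n(f, 3)     # record 3, without reading record 2
write_n(f, 1, b"rec-XXXX")
updated = read_n(f, 1)
```

Direct access works here because every record has the same length, so the byte offset of record n is simply n times the record size.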
DIRECTORY STRUCTURE
A directory is an object that contains the names of file system objects. The file system allows users to organize files
and other file system objects through the use of directories.
The structure created by placement of names in directories can take a number of forms: single-level tree,
two-level tree, multi-level tree, or acyclic graph.
1. Single-Level Directory: The simplest directory structure is the single-level directory. All files are contained in the
same directory, which is easy to support and understand. A single-level directory has significant limitations, however, when
the number of files increases or when the system has more than one user. Since all files are in the same directory,
they must have unique names.
2. Two-Level Directory: In the two-level directory structure, each user has its own user file directory (UFD). Each
UFD has a similar structure, but lists only the files of a single user. When a user job starts or a user logs in, the
system's master file directory (MFD) is searched. The MFD is indexed by user name or account number, and
each entry points to the UFD for that user.
When a user refers to a particular file, only his own UFD is searched. Different users may have files with the
same name, as long as all the file names within each UFD are unique. To create a file for a user, the operating system
searches only that user's UFD to ascertain whether another file of that name exists. To delete a file, the operating system
confines its search to the local UFD; thus, it cannot accidentally delete another user's file that has the same name.
3. Tree-Structured Directories: A tree structure is a more powerful and flexible approach that organizes files and directories
hierarchically. There is a master directory, which has under it a number of user directories. Each of these user
directories may have subdirectories and files as entries. This is true at any level: That is, at any level, a directory may
consist of entries for subdirectories and/or entries for files.
4. Acyclic-Graph Directories:
An acyclic graph allows directories to have shared subdirectories and files. The same file or subdirectory may be
in two different directories. An acyclic graph is a natural generalization of the tree-structured directory scheme.
A shared file (or directory) is not the same as two copies of the file. With two copies, each programmer can view
the copy rather than the original, but if one programmer changes the file, the changes will not appear in the other's copy.
Shared files and subdirectories can be implemented in several ways. A common way is to create a new
directory entry called a link. A link is a pointer to another file or subdirectory.
When we add links to an existing tree-structured directory, the tree structure is destroyed, resulting in a
simple graph structure.
FREE SPACE MANAGEMENT
A file system is responsible for allocating free blocks to files; therefore, it has to keep track of all the free blocks present
on the disk. There are two main approaches to managing the free blocks on the disk.
1. Bit Vector
In this approach, the free-space list is implemented as a bit map (bit vector). It contains one bit per disk block, so each bit
represents one block.
If the block is free, the bit is 1; otherwise it is 0. Initially all the blocks are free, so every bit in the bit vector
is 1.
As space allocation proceeds, the file system allocates blocks to files and sets the respective bits to 0.
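A minimal sketch of the bit-vector approach follows, using the same convention as the text (1 means free, 0 means allocated). The class name BitVectorFreeList, the first-fit search, and the 8-block disk are assumptions made for illustration; a real file system would pack the bits into machine words.

```python
class BitVectorFreeList:
    """Free-space bit vector: 1 = free block, 0 = allocated block."""

    def __init__(self, num_blocks):
        self.bits = [1] * num_blocks  # initially every block is free

    def allocate(self):
        """Return the number of the first free block and mark it allocated."""
        for block, bit in enumerate(self.bits):
            if bit == 1:
                self.bits[block] = 0
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        """Mark a block as free again."""
        self.bits[block] = 1

fs = BitVectorFreeList(8)
a = fs.allocate()   # first free block
b = fs.allocate()   # next free block
fs.free(a)          # release the first block
c = fs.allocate()   # the released block is found again
```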
2. Linked List
This is another approach to free-space management. It links together all the free blocks and keeps a pointer to the
first free block in a special location on the disk, caching it in memory.
Therefore, all the free blocks on the disk are linked together by pointers. Whenever a block is allocated, the free block
before it is linked to the free block after it.
DISK MANAGEMENT
The operating system is responsible for several aspects of disk management.
Disk Formatting
A new magnetic disk is a blank slate. It is just platters of a magnetic recording material. Before a disk can store
data, it must be divided into sectors that the disk controller can read and write. This process is called low-level formatting
(or physical formatting).
Low-level formatting fills the disk with a special data structure for each sector. The data structure for a sector
consists of a header, a data area, and a trailer. The header and trailer contain information used by the disk controller,
such as a sector number and an error-correcting code (ECC).
To use a disk to hold files, the operating system still needs to record its own data structures on the disk. It does
so in two steps. The first step is to partition the disk into one or more groups of cylinders. The operating system can
treat each partition as though it were a separate disk. For instance, one partition can hold a copy of the operating
system's executable code, while another holds user files. After partitioning, the second step is logical formatting (or
creation of a file system). In this step, the operating system stores the initial file-system data structures onto the disk.
Boot Block
When a computer is powered up or rebooted, it needs to have an initial program to run. This initial program is
called the bootstrap program. It initializes all aspects of the system (i.e., from CPU registers to device controllers and the
contents of main memory) and then starts the operating system.
To do its job, the bootstrap program finds the operating system kernel on disk, loads that kernel into memory, and
jumps to an initial address to begin the operating-system execution.
For most computers, the bootstrap is stored in read-only memory (ROM). This location is convenient, because
ROM needs no initialization and is at a fixed location that the processor can start executing when powered up or reset.
And since ROM is read only, it cannot be infected by a computer virus. The problem is that changing this bootstrap code
requires changing the ROM hardware chips.
For this reason, most systems store a tiny bootstrap loader program in the boot ROM, whose only job is to bring
in a full bootstrap program from disk. The full bootstrap program can be changed easily: A new version is simply
written onto the disk. The full bootstrap program is stored in a partition called the boot blocks, at a fixed location on the
disk. A disk that has a boot partition is called a boot disk or system disk.
Bad Blocks
Since disks have moving parts and small tolerances, they are prone to failure. Sometimes the failure is complete,
and the disk needs to be replaced, and its contents restored from backup media to the new disk.
More frequently, one or more sectors become defective. Most disks even come from the factory with bad
blocks. Depending on the disk and controller in use, these blocks are handled in a variety of ways.
The controller maintains a list of bad blocks on the disk. The list is initialized during the low-level format at the
factory, and is updated over the life of the disk. The controller can be told to replace each bad sector logically with one
of the spare sectors. This scheme is known as sector sparing or forwarding.
DISKS STRUCTURE
Magnetic disks provide the bulk of secondary storage for modern computer systems. Each disk platter has a
flat circular shape, like a CD. Common platter diameters range from 1.8 to 5.25 inches. The two surfaces of a platter are
covered with a magnetic material. We store information by recording it magnetically on the platters.
A read-write head "flies" just above each surface of every platter. The heads are attached to a disk arm, which
moves all the heads as a unit. The surface of a platter is logically divided into circular tracks, which are subdivided
into sectors. The set of tracks that are at one arm position forms a cylinder. There may be thousands of concentric
cylinders in a disk drive, and each track may contain hundreds of sectors. The storage capacity of common disk drives
is measured in gigabytes.
(Figure: Structure of a magnetic disk (hard disk drive))
When the disk is in use, a drive motor spins it at high speed. Most drives rotate 60 to 200 times per second. Disk
speed has two parts. The transfer rate is the rate at which data flow between the drive and the computer. The
positioning time (or random-access time) consists of seek time and rotational latency. The seek time is the time to move
the disk arm to the desired cylinder. The rotational latency is the time for the desired sector to rotate to the disk head.
Typical disks can transfer several megabytes of data per second, and they have seek times and rotational latencies of
several milliseconds.
Capacity of a magnetic disk: C = S x T x P x N
where S = number of surfaces = 2 x number of disks (platters), T = number of tracks per surface, P = number of
sectors per track, and N = size of each sector.
Transfer time: The transfer time to or from the disk depends on the rotation speed of the disk in the following fashion:
T = b / (r x N)
where T = transfer time, b = number of bytes to be transferred, N = number of bytes on a track, and
r = rotation speed in revolutions per second. Note that T and N have different meanings in the two formulas.
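As a worked example of the two formulas, the sketch below plugs in numbers for a made-up drive. All of the figures (4 platters, 1000 tracks per surface, 100 sectors per track, 512-byte sectors, 120 revolutions per second, a 4096-byte transfer) are assumptions chosen for easy arithmetic, not data for a real disk.

```python
def disk_capacity(surfaces, tracks_per_surface, sectors_per_track, sector_size):
    """C = S x T x P x N, in bytes."""
    return surfaces * tracks_per_surface * sectors_per_track * sector_size

def transfer_time(num_bytes, rev_per_sec, bytes_per_track):
    """T = b / (r x N), in seconds."""
    return num_bytes / (rev_per_sec * bytes_per_track)

# Assumed example drive: 4 platters give 2 x 4 = 8 surfaces.
surfaces = 2 * 4
capacity = disk_capacity(surfaces, 1000, 100, 512)   # bytes

# One track holds 100 sectors of 512 bytes each.
bytes_per_track = 100 * 512
t = transfer_time(4096, 120, bytes_per_track)        # seconds
```

With these assumed values the capacity is 409,600,000 bytes and the transfer takes 1/1500 of a second, which is about 0.67 milliseconds.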
Modern disk drives are addressed as large one-dimensional arrays of logical blocks, where the logical block
is the smallest unit of transfer. The one-dimensional array of logical blocks is mapped onto the sectors of the disk
sequentially. Sector 0 is the first sector of the first track on the outermost cylinder. The mapping proceeds in order
through that track, then through the rest of the tracks in that cylinder, and then through the rest of the cylinders from
outermost to innermost.
By using this mapping, we can convert a logical block number into an old-style disk address that consists of a
cylinder number, a track number within that cylinder, and a sector number within that track. In practice, it is difficult to
perform this translation, for two reasons. First, most disks have some defective sectors, but the mapping hides this by
substituting spare sectors from elsewhere on the disk. Second, the number of sectors per track is not a constant on some
drives.
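Under idealized assumptions the translation is simple arithmetic. The sketch below assumes a constant number of sectors per track and no spare-sector remapping, which are exactly the two conditions that real drives violate; the function name and the geometry (4 tracks per cylinder, 10 sectors per track) are hypothetical.

```python
def block_to_chs(block, tracks_per_cylinder, sectors_per_track):
    """Map a logical block number to (cylinder, track, sector), all 0-based.

    Assumes a fixed number of sectors per track and no bad-sector remapping.
    """
    blocks_per_cylinder = tracks_per_cylinder * sectors_per_track
    cylinder = block // blocks_per_cylinder
    track = (block // sectors_per_track) % tracks_per_cylinder
    sector = block % sectors_per_track
    return cylinder, track, sector

# Hypothetical geometry: 4 tracks per cylinder, 10 sectors per track.
first = block_to_chs(0, 4, 10)    # first sector of the first track
later = block_to_chs(57, 4, 10)   # 57 = 1*40 + 1*10 + 7
```

Block 0 maps to the first sector of the first track of cylinder 0, and consecutive blocks fill a track, then the remaining tracks of the cylinder, then the next cylinder, matching the order described above.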
On media that use constant linear velocity (CLV), the density of bits per track is uniform, and the drive changes its
rotation speed as the head moves between tracks to keep the data rate constant. Alternatively, the disk rotation
speed can stay constant and the density of bits decreases from inner tracks to outer tracks to keep the data rate constant. This
method is used in hard disks and is known as constant angular velocity (CAV).
DISK SCHEDULING
The seek time is the time for the disk arm to move the heads to the cylinder containing the desired sector. The
rotational latency is the time waiting for the disk to rotate the desired sector to the disk head. The disk bandwidth is the
total number of bytes transferred divided by the total time between the first request for service and the completion of the
last transfer.
We can improve both the access time and the bandwidth by scheduling the servicing of disk I/O requests in a
good order. Several algorithms exist to schedule the servicing of disk I/O requests as follows:
1. FCFS Scheduling
The simplest form of scheduling is first-come, first-served (FCFS), also called first-in-first-out (FIFO), scheduling, which
processes requests from the queue in arrival order. We illustrate this with a request queue of cylinders (range 0-199):
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 53.
(Figure: FCFS disk scheduling)
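The total head movement for this example can be computed with a few lines. This is a minimal sketch that counts cylinders traversed and ignores rotational latency and transfer time; the function name fcfs is an assumption.

```python
def fcfs(requests, head):
    """Serve requests in arrival order; return total cylinders traversed."""
    total = 0
    for cylinder in requests:
        total += abs(cylinder - head)  # distance the arm moves for this request
        head = cylinder                # the arm now rests at the serviced cylinder
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
fcfs_total = fcfs(queue, 53)
```

For this queue the head moves 53, 98, 183, 37, 122, 14, 124, 65, 67, for a total of 640 cylinders. The large swings (for example from 183 to 37) are what the other scheduling algorithms try to avoid.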
2. SSTF Scheduling
SSTF stands for shortest-seek-time-first. The SSTF algorithm selects the request with the minimum
seek time from the current head position. Since seek time increases with the number of cylinders traversed by the head,
SSTF chooses the pending request closest to the current head position. We illustrate this with a request queue of
cylinders (range 0-199): 98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 53.
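A short sketch of SSTF on the same queue follows. It measures seek cost as the number of cylinders traversed, and the function name sstf is an assumption; when two requests are equally close it takes the one that arrived first, which is one possible tie-breaking rule (no ties occur in this example).

```python
def sstf(requests, head):
    """Repeatedly serve the pending request closest to the current head position."""
    pending = list(requests)
    order = []
    total = 0
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
        order.append(nearest)
    return order, total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
sstf_order, sstf_total = sstf(queue, 53)
```

Starting at cylinder 53, the service order is 65, 67, 37, 14, 98, 122, 124, 183, for a total head movement of 236 cylinders.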
I/O SYSTEM
The role of the operating system in computer I/O is to manage and control I/O operations and I/O devices. A
device communicates with a computer system by sending signals over a cable or even through the air. The device
communicates with the machine via a connection point (a port). If one or more devices use a common set of wires for
communication, the connection is called a bus.
When device A has a cable that plugs into device B, and device B has a cable that plugs into device C, and device
C plugs into a port on the computer, this arrangement is called a daisy chain. A daisy chain usually operates as a bus.
Buses are used widely in computer architecture. External I/O devices can be grouped into three broad categories:
1. Human readable: Suitable for communicating with the computer user. Examples include printers and terminals,
the latter consisting of video display, keyboard, and perhaps other devices such as a mouse.
2. Machine readable: Suitable for communicating with electronic equipment. Examples are disk drives, USB keys,
sensors, controllers, and actuators.
3. Communication: Suitable for communicating with remote devices. Examples are digital line drivers and modems.
I/O system calls encapsulate device behaviors in generic classes. Each general kind is accessed through a
standardized set of functions, called an interface. The differences are encapsulated in kernel modules called device drivers
that internally are custom tailored to each device, but that export one of the standard interfaces. Devices vary in many
dimensions:
o Character-stream or block
o Sequential or random-access
o Sharable or dedicated
o Speed of operation
o Read-write, read only, or write only
Interrupt: An interrupt is a hardware mechanism by which a device notifies the CPU that it requires attention.
An interrupt can take place at any time. When the CPU receives an interrupt signal on the interrupt-request line,
it stops the current process and responds to the interrupt by passing control to an interrupt handler, which services the
device.
Polling: Polling is not a hardware mechanism; it is a protocol in which the CPU steadily checks whether the device needs
attention. Rather than waiting for a device to signal that it needs service, the CPU keeps asking each I/O device
whether it requires processing. The CPU continuously checks every device attached to it to detect
whether any device needs attention. Each device has a command-ready bit
that indicates the status of that device, i.e., whether or not it has a command to be executed by the CPU. If the
command-ready bit is 1, the device has a command to be executed; if the bit is 0, it has no command.
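The polling protocol can be mimicked in software as a rough sketch. The Device class, the device names, and the poll_once function are hypothetical; real polling reads a status register on the device controller rather than a Python attribute.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.command_ready = 0  # 1 = a command is waiting to be executed

def poll_once(devices):
    """One polling pass: check every device's command-ready bit in turn."""
    serviced = []
    for dev in devices:
        if dev.command_ready == 1:
            serviced.append(dev.name)  # stand-in for executing the command
            dev.command_ready = 0      # clear the bit once the device is serviced
    return serviced

devices = [Device("keyboard"), Device("disk"), Device("printer")]
devices[1].command_ready = 1       # only the disk has work pending
first_pass = poll_once(devices)
second_pass = poll_once(devices)   # nothing is pending any more
```

Note that the second pass still inspects all three devices even though none needs service; those wasted checks are the main cost of polling.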
The differences between interrupt and polling are as follows:
1. Interrupt: The device notifies the CPU that it requires attention.
   Polling: The CPU steadily checks whether the device needs attention.
2. Interrupt: It is a hardware mechanism, not a protocol.
   Polling: It is a protocol, not a hardware mechanism.
3. Interrupt: The device is serviced by the interrupt handler.
   Polling: The device is serviced by the CPU.
4. Interrupt: It can take place at any time.
   Polling: The CPU polls the device at regular intervals.
5. Interrupt: The interrupt-request line indicates that the device requires servicing.
   Polling: The command-ready bit indicates that the device requires servicing.
6. Interrupt: The processor is disturbed only when a device interrupts it.
   Polling: The processor wastes many cycles by repeatedly checking the command-ready bit of each device.
A DMA controller is a hardware device that allows I/O devices to access memory directly, with less participation from the
processor. The DMA controller needs the usual circuits of an interface to communicate with the CPU and input/output
devices.
Fig-1 shows the block diagram of the DMA controller. The unit communicates with the CPU through the data bus
and control lines. The CPU selects a register within the DMA through the address bus by enabling the DS (DMA select)
and RS (register select) inputs. RD (read) and WR (write) are bidirectional. When the BG (bus grant) input is 0, the
CPU can communicate with the DMA registers. When the BG (bus grant) input is 1, the CPU has relinquished the buses
and the DMA can communicate directly with the memory.
DMA Controller Register:
The DMA controller has three registers as follows.
Address register – It contains the address to specify the desired location in memory.
Word count register – It contains the number of words to be transferred.
Control register – It specifies the transfer mode.
Note –
All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can both read and write into
the DMA registers under program control via the data bus.
(Figure: Block diagram)
Explanation:
The CPU initializes the DMA by sending the following information through the data bus:
The starting address of the memory block where the data is available (to read) or where data are to be stored
(to write).
The word count, which is the number of words in the memory block to be read or written.
A control to define the mode of transfer, such as read or write.
A control to begin the DMA transfer.
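The initialization steps above can be pictured with a toy software model. This is only a sketch: the class DMAController, its method names, and the use of Python lists for memory and the device buffer are assumptions, and a real controller performs the transfer in hardware after gaining control of the buses.

```python
class DMAController:
    def __init__(self, memory):
        self.memory = memory    # shared main memory, modelled as a list of words
        self.address = 0        # address register
        self.word_count = 0     # word count register
        self.control = None     # control register: "read" or "write"

    def program(self, address, word_count, control):
        """The CPU loads the three registers before the transfer starts."""
        self.address = address
        self.word_count = word_count
        self.control = control

    def start(self, device_buffer):
        """Move words until the word count reaches zero, without the CPU."""
        while self.word_count > 0:
            if self.control == "write":   # device -> memory
                self.memory[self.address] = device_buffer.pop(0)
            else:                         # "read": memory -> device
                device_buffer.append(self.memory[self.address])
            self.address += 1             # next memory location
            self.word_count -= 1          # one word fewer to transfer

memory = [0] * 8
dma = DMAController(memory)
dma.program(address=2, word_count=3, control="write")
dma.start([10, 20, 30])       # store three device words at addresses 2..4

out = []
dma.program(address=2, word_count=2, control="read")
dma.start(out)                # fetch two words back from memory
```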
Protection
A mechanism that controls the access of programs, processes, or users to the resources defined by a computer system is
referred to as protection. Protection is a tool for multiprogramming operating systems, allowing
multiple users to safely share a common logical namespace, such as a directory or files.
1. Protection policies define how processes access the computer system's resources, such as the CPU, memory, software,
and even the operating system. They are the responsibility of both the operating system designer and the application
programmer, and they may be modified at any time.
2. Protection is a technique for protecting data and processes from harmful or intentional infiltration. It relies on
protection policies that are established by the system itself, set by management, or imposed individually by programmers to
ensure that their programs are protected to the greatest extent possible.
3. It also provides a multiprogramming OS with the security that its users expect when sharing common space
such as files or directories.
Security
Security refers to providing a protection system to computer system resources such as CPU, memory, disk, software
programs and most importantly data/information stored in the computer system. If a computer program is run by an
unauthorized user, then he/she may cause severe damage to computer or data stored in it. So a computer system must be
protected against unauthorized access, malicious access to system memory, viruses, worms, etc.
There are several goals of system security. Some of them are as follows:
1. Integrity
Unauthorized users must not be allowed to access the system's objects, and users with insufficient rights should not modify
the system's critical files and resources.
2. Secrecy
The system's objects must only be available to a small number of authorized users. The system files should not be
accessible to everyone.
3. Availability
All system resources must be accessible to all authorized users, i.e., no single user/process should be able to consume all
system resources. If such a situation arises, denial of service may occur. In this case, malware may monopolize system
resources and prevent legitimate processes from accessing them.
System Authentication
One-time passwords, encrypted passwords, and cryptography are used to build a strong authentication
mechanism.
1. One-time Password
A one-time password is unique at every login by the user. It is a combination of two passwords that allow the user access. The
system generates a random number, and the user supplies a matching one. An algorithm generates a random number for the
system and the user, and the output is matched using a common function.
2. Encrypted Passwords
This is also a very effective technique for authenticating access. Passwords are encrypted before being passed over the
network for checking, so that they cannot be read if they are intercepted in transit.
3. Cryptography
It's another way to ensure that unauthorized users can't access data transferred over a network. It aids in the secure
transmission of data. It introduces the concept of a key to protect the data. The key is crucial in this situation. When a user
sends data, he encodes it using a computer that has the key, and the receiver must decode the data with the same key. As a
result, even if the data is stolen in the middle of the process, there's a good possibility the unauthorized user won't be able to
access it.
Program threats
The operating system's processes and kernel carry out the specified task as directed. Program threats
occur when a user program causes these processes to perform malicious operations. A common
example of a program threat is a program that, once installed on a computer, stores and
transfers user credentials to a hacker. There are various program threats. Some of them are as
follows:
1. Virus
A virus can replicate itself on the system. Viruses are extremely dangerous and can
modify or delete user files as well as crash computers. A virus is a small piece of code that is
embedded in a program on the system. As the user interacts with the program, the virus
becomes embedded in other files and programs, potentially rendering the system inoperable.
2. Trojan Horse
This type of application captures user login credentials. It stores them to transfer them to a
malicious user who can then log in to the computer and access system resources.
3. Logic Bomb
A logic bomb is a situation in which software only misbehaves when particular criteria are met;
otherwise, it functions normally.
4. Trap Door
A trap door is when a program that is supposed to work as expected has a security weakness in
its code that allows it to do illegal actions without the user's knowledge.
System Threats
System threats are described as the misuse of system services and network connections to cause user
problems. These threats may be used to trigger the program threats over an entire network, known as
program attacks. System threats make an environment in which OS resources and user files may be
misused. There are various system threats. Some of them are as follows:
1. Port Scanning
It is a method by which the cracker determines the system's vulnerabilities for an attack. It is a
fully automated process that includes connecting to a specific port via TCP/IP. To protect the
attacker's identity, port scanning attacks are launched through zombie systems: previously
independent systems that still serve their owners while also being used for such malicious
purposes.
2. Worm
The worm is a process that can choke a system's performance by exhausting all system resources. A
Worm process makes several clones, each consuming system resources and preventing all other
processes from getting essential resources. Worm processes can even bring a network to a halt.
3. Denial of Service
Denial-of-service attacks usually prevent users from legitimately using the system. For example,
if a denial-of-service attack is executed against the browser's content settings, a user may be unable
to access the internet.
RAID (Redundant Array of Independent Disks)
RAID (Redundant Array of Independent Disks) is a technique that uses a combination of multiple disks, instead of a single
disk, to store data for increased performance, data redundancy, or protection of data in the
case of a drive failure. The term was defined by David Patterson, Garth A. Gibson, and Randy Katz at the University
of California, Berkeley in 1987. The following sections discuss RAID, its levels, and their advantages
and disadvantages.
What is RAID?
RAID (Redundant Array of Independent Disks) is like having backup copies of your important files stored in
different places on several hard drives or solid-state drives (SSDs). If one drive stops working, your data is still safe
because you have other copies stored on the other drives. It’s like having a safety net to protect your files from being
lost if one of your drives breaks down.
RAID (Redundant Array of Independent Disks) in a Database Management System (DBMS) is a technology that
combines multiple physical disk drives into a single logical unit for data storage. The main purpose of RAID is to
improve data reliability, availability, and performance. There are different levels of RAID, each offering a balance of
these benefits.
How Does RAID Work?
It splits your data across multiple drives, so if one drive fails, your data is still safe on the others. RAID helps keep
your information secure.
What is a RAID Controller?
A RAID controller is like a boss for your hard drives in a big storage system. It works between your computer’s
operating system and the actual hard drives, organizing them into groups to make them easier to manage. This helps
speed up how fast your computer can read and write data, and it also adds a layer of protection in case one of your
hard drives breaks down. So, it’s like having a smart helper that makes your hard drives work better and keeps your
important data safer.
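(Figure: RAID controller)
1. RAID-0 (Striping)
In RAID-0, the blocks are striped (distributed in turn) across the disks, with no duplication.
(Figure: RAID-0)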
Evaluation
Reliability: 0
There is no duplication of data. Hence, a block once lost cannot be recovered.
Capacity: N*B
The entire space is being used to store data. Since there is no duplication, N disks each having B blocks are
fully utilized.
Advantages
It is easy to implement.
It utilizes the storage capacity in a better way.
Disadvantages
A single drive loss can result in the complete failure of the system.
It’s not a good choice for a critical system.
2. RAID-1 (Mirroring)
More than one copy of each block is stored in a separate disk. Thus, every block has two (or more) copies,
lying on different disks.
(Figure: RAID-1)
Evaluation
Assume a RAID system with mirroring level 2.
Reliability: 1 to N/2
1 disk failure can be handled for certain because blocks of that disk would have duplicates on some other
disk. If we are lucky enough and disks 0 and 2 fail, then again this can be handled as the blocks of these
disks have duplicates on disks 1 and 3. So, in the best case, N/2 disk failures can be handled.
Capacity: N*B/2
Only half the space is being used to store data. The other half is just a mirror of the already stored data.
Advantages
It provides complete redundancy.
It can increase data security and speed.
Disadvantages
It is highly expensive.
Storage capacity is less.
RAID-3
Here Disk 3 contains the parity bits for Disk 0, Disk 1, and Disk 2. If data loss occurs, we can reconstruct the data
with the help of Disk 3.
Advantages
Data can be transferred in bulk.
Data can be accessed in parallel.
Disadvantages
It requires an additional drive for parity.
In the case of small-size files, it performs slowly.
RAID-4
In the figure, we can observe one column (disk) dedicated to parity.
Parity is calculated using a simple XOR function. If the data bits are 0,0,0,1 the parity bit is XOR(0,0,0,1) =
1. If the data bits are 0,1,1,0 the parity bit is XOR(0,1,1,0) = 0. A simple approach is that an even
number of ones results in parity 0, and an odd number of ones results in parity 1.
(Figure: RAID-4)
Assume that in the above figure, C3 is lost due to some disk failure. Then, we can recompute the data bit
stored in C3 by looking at the values of all the other columns and the parity bit. This allows us to recover lost
data.
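The parity calculation and the recovery step can be checked with a short sketch. The helper names parity and recover are assumptions, and which data bit is treated as lost is chosen arbitrarily for the example.

```python
from functools import reduce
from operator import xor

def parity(bits):
    """Parity bit of a stripe: XOR of all data bits."""
    return reduce(xor, bits, 0)

def recover(surviving_bits, parity_bit):
    """Rebuild one lost data bit by XOR-ing the survivors with the parity bit."""
    return reduce(xor, surviving_bits, parity_bit)

p1 = parity([0, 0, 0, 1])   # odd number of ones
p2 = parity([0, 1, 1, 0])   # even number of ones

# Suppose the last data bit of the first stripe is lost.
rebuilt_1 = recover([0, 0, 0], p1)
# Suppose the second data bit of the second stripe is lost.
rebuilt_2 = recover([0, 1, 0], p2)
```

Recovery works because XOR-ing a value with itself cancels out: XOR-ing the parity with the surviving bits leaves exactly the missing bit. The same idea fails when two disks are lost, since one parity bit cannot determine two unknowns.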
Evaluation
Reliability: 1
RAID-4 allows recovery of at most 1 disk failure (because of the way parity works). If more than one disk
fails, there is no way to recover the data.
Capacity: (N-1)*B
One disk in the system is reserved for storing the parity. Hence, (N-1) disks are made available for data
storage, each disk having B blocks.
Advantages
It helps in reconstructing the data if at most one disk is lost.
Disadvantages
It cannot help in reconstructing the data when more than one disk is lost.
Advantages of RAID
Data redundancy: By keeping numerous copies of the data on many disks, RAID can shield data from
disk failures.
Performance enhancement: RAID can enhance performance by distributing data over several drives,
enabling the simultaneous execution of several read/write operations.
Scalability: RAID is scalable, therefore by adding more disks to the array, the storage capacity may be
expanded.
Versatility: RAID is applicable to a wide range of devices, such as workstations, servers, and personal PCs.
Disadvantages of RAID
Cost: RAID implementation can be costly, particularly for arrays with large capacities.
Complexity: The setup and management of RAID might be challenging.
Decreased performance: The parity calculations necessary for some RAID configurations, including
RAID 5 and RAID 6, may result in a decrease in speed.
Single point of failure: Although RAID offers data redundancy, it is not a comprehensive backup solution. The
array’s whole contents could be lost if the RAID controller malfunctions.