Chapter Five
File system
05/16/2024     School of Computing, DDUIoT   1
                       Contents
o   File concept
o   File naming
o   File type
o   File access
o   File attribute
o   File operation
o   File structure
o   Directory
o   Directory structure
o   Directory operation
o   File system implementation
o   Implementing directory
05/16/2024              School of Computing, DDUIoT   2
                  File concept
o   All computer applications need to store and retrieve
    information.
o   While a process is running, it can store a limited
    amount of information within its own address
    space.
o   But the following requirements yields for long-term
    information storage:
              It must be possible to store a very large amount
               of information.
              The information must survive the termination of
               the process using it.
              Multiple processes must be able to access the
               information concurrently.
o       Magnetic disks have been used for years for this
        long-term storage.School of Computing, DDUIoT
05/16/2024                                             3
               File concept(cont……)
o Disk support two operations: Read block and Write block.
o With these two operations one could, in principle, solve the
  long-term storage problem.
o Just a few of the questions that quickly arise are:
     How do you find information?
     How do you keep one user from reading another user's data?
     How do you know which blocks are free?.
o So to solve this problem the operating system use a new
  abstraction called the file which is logical units of
  information created by processes
o As a whole, that part of the operating system dealing with
  files is known as the file system
  05/16/2024              School of Computing, DDUIoT         4
                  File naming
o Files are an abstraction mechanism and provide a way
  to store information on the disk and read it back later.
o The most important characteristic of any abstraction
  mechanism is the way the objects being managed is
  name.
o When a process creates a file, it gives the file a name.
  When the process terminates, the file continues to exist
  and can be accessed by other processes using its name.
o The exact rules for file naming vary somewhat from
  system to system, but all current operating systems
  allow strings of one to eight letters as legal file names.
05/16/2024            School of Computing, DDUIoT        5
             File naming (con’t… )
o Frequently digits and special characters are also
  permitted.
o Some file systems distinguish between upper and lower
  case letters, whereas others do not. UNIX falls in the
  first category; MS-DOS falls in the second.
o Many operating systems support two-part file names,
  with the two parts separated by a period, as in prog.c.
o In some systems (e.g., UNIX), file extensions are just
  conventions and are not enforced by the operating
  system.
o In contrast, Windows is aware of the extensions and
  assigns meaning to them.
05/16/2024           School of Computing, DDUIoT      6
                       File type
o Many operating systems support several types of files.
o UNIX and Windows, for example, have regular files and
  directories.
o UNIX also has character and block special files.
o Regular files: are the ones that contain user information.
  They are either ASCII files or binary files.
o Directories: are system files for maintaining the structure of
  the file system.
o Character special files: are related to input/output and used
  to model serial I/0 devices, such as terminals, printers, and
  networks.
o Block special files: are used to model disks. In this chapter
  we will be primarily interested in regular files.
05/16/2024             School of Computing, DDUIoT           7
                    File access
o File can be accessed either sequentially or randomly.
o Sequential access: a process could read all the bytes or
  records in a file in order, starting at the beginning, but
  could not skip around and read them out of order.
o Random access: Files bytes or records can be read in
  any order. Used when the storage medium was
  magnetic disk. Essential for many applications.
o This method is used in UNIX and Windows.
05/16/2024            School of Computing, DDUIoT        8
                  File attribute
o Every file has a name and its data.
o In addition, all operating systems associate other
  information with each file, for example, the date and
  time the file was last modified and the file's size.
o We will call these extra items the file's attributes, Some
  people call them metadata.
o The list of attributes varies considerably from system to
  system.
05/16/2024            School of Computing, DDUIoT        9
             File attribute(con’t…)
05/16/2024         School of Computing, DDUIoT   10
                File operation
o Different systems provide different operations to allow
  storage and retrieval.
o Create: The file is created with no data. The purpose of
  the call is to announce that the file is coming and to
  set some of the attributes.
o Delete: When the file is no longer needed, it has to be
  deleted to free up disk space.
o Open: Before using a file, a process must open it. The
  purpose of the open call is to allow the system to fetch
  the attributes and list of disk addresses into main
  memory for rapid access on latter caller.
o Close: When all the access are finished, the attributes
  and disk addresses are no longer needed, so the file
  should be closed to free up internal table space.
05/16/2024           School of Computing, DDUIoT      11
             File operation(con’t…)
o Read: Data are read from file. Usually, the bytes come from
  the current position. The caller must specify how much data
  are needed and must also provide a buffer to put them in.
o Write: Data are written to the file again, usually at the
  current position. If the current position is the end of the
  file, the file’s size increases. If the current position is in the
  middle of the file, existing data are lost forever.
o Append: This call is a restricted in the form of write. It can
  only add data to the end of the file.
o Seek: For random access files, a method is needed to specify
  from where to take the data. One common approach is a
  system call, seek, that reposition the file pointer to a
  specific place in the file. After this call has completed, data
  can be read from, or written to, that position.
05/16/2024               School of Computing, DDUIoT            12
             File operation(con’t…)
o Get Attributes: Processes often need to read file
  attributes to do their work.
o Set Attributes: Some of the attributes are user settable
  and can be changed after the file has been created.
  This system call makes that possible.
       o The protection mode information is an obvious
         example.
       o Most of the flags also fall in this category.
o Rename: It frequently happens that a user needs to
  change the name of an existing file. This system call
  makes that possible
05/16/2024           School of Computing, DDUIoT       13
                           File structure
o Files can be structured in any of several ways.
o Three common possibilities are:
1. Byte sequence: All it sees are bytes. Any meaning must be
   imposed by user-level programs.
            Both UNIX and Windows use this approach.
2. Record sequence: Central to the idea of a file being a
   sequence of records is the idea that the read operation
   returns one record and the write operation overwrites or
   appends one record.
       •       Used in database system.
3. Tree: File consists of tree not necessarily all the same
   length, each containing a key field in a fixed position in
   the record. The tree is sorted on the key field, to allow
   rapid searching for a particular key.
05/16/2024                        School of Computing, DDUIoT   14
              File structure (con’t….)
(a) Byte sequence. (b) Record sequence. (c) Tree.
 05/16/2024          School of Computing, DDUIoT   15
                Directory
o To keep track of files, file systems normally
  have directories or folders, which in many
  systems are themselves files and containing
  information about all files.
05/16/2024        School of Computing, DDUIoT   16
            Directory structure
o Defines the    organization   or   logical   structure   of
  directory.
  Single-Level Directory Systems:
o The simplest form of directory system is having one
    directory containing all the files.
o They require unique name and Sometimes it is called
    the root directory.
o The advantages of this scheme are its simplicity and
    the ability to locate files quickly-there is only one
    place to look, after all.
o It is often used on simple embedded devices such as
    telephones, digital cameras, and some portable music
    players.
05/16/2024               School of Computing, DDUIoT    17
             Directory structure(con’t…)
    Hierarchical Directory Systems:
o       With this approach, there can be as many directories as are
        needed to group the files in natural ways.
o       When the file system is organized as a directory tree, some
        way is needed for specifying file names.
o       Two different methods are commonly used.
o       In the first method, each file is given an absolute path name consisting
        of the path from the root directory to the file.
             Example : /usr/ast/mailbox
o       The other kind of name is the relative path name. This is used
        in conjunction with the concept of the working directory (also
        called the current directory). A user can designate one directory
        as the current working directory
o       For example, if the current working directory is /usr/ast, then
        the file whose absolute path is /usr/ast/mailbox can be
        referenced simply as mailbox.
05/16/2024                        School of Computing, DDUIoT                 18
             Directory operation
o They allowed system calls for managing directories
  exhibit more variation from system to system than system
  calls for files.
o Create: A directory is created. It is empty except for dot
  and dot dot, which are put there automatically by the
  system .
o Delete: A directory is deleted. Only an empty directory
  can be deleted.
o Opendir: Directories can be read. Before a directory can
  be read, it must be opened, analogous to opening and
  reading a file.
o Closedir: When a directory has been read, it should be
  closed to free up internal table space.
05/16/2024            School of Computing, DDUIoT       19
         Directory operation(con’t…)
o Readdir: This call returns the next entry in an open directory, no
  matter which of the possible directory structures is being used.
o Rename: In many respects, directories are just like files and can
  be renamed the same way files can be.
o Link: Linking is a technique that allows a file to appear in more
  than one directory. This system call specifies an existing file and
  a path name, and creates a link from the existing file to the
  name specified by the path
       Track of the number of directory entries containing the file), is
        sometimes called a hard link.
o Unlink: A directory entry is removed. If the file being unlinked is
  only present in one directory (the normal case), it is removed
  from the file system. If it is present in multiple directories, only
  the path name specified is removed. The others remain.
o In UNIX, the system call for deleting files (discussed earlier) is, in
  fact, unlink.
05/16/2024                  School of Computing, DDUIoT              20
         File system implementation
o Users are concerned with how files are named,
  what operations are allowed on them, what the
  directory tree looks like, and similar interface
  issues.
o Implementers are interested in how files and
  directories are stored, how disk space is
  managed, and how to make everything work
  efficiently and reliably
05/16/2024        School of Computing, DDUIoT   21
               File system layout
o       Most disks can be divided up into one or more
        partitions, with independent file systems on each
        partition.
o       Sector 0 of the disk is called the MBR (Master Boot
        Record) and is used to boot the computer.
o       The end of the MBR contains the partition table and
        gives the starting and ending addresses of each
        partition.
o       One of the partitions in the table is marked as active
o       When the computer is booted, the BIOS reads in and
        executes the MBR.
o       The first thing the MBR program does is locate the
        active partition, read in its first block, called the
        boot block, and execute
05/16/2024
                                         it.
                          School of Computing, DDUIoT       22
             File system layout(con’t…)
o       The program in the boot block loads the operating
        system contained in that partition.
o       Every partition starts with a boot block, even if it
        does not contain a bootable operating system.
o       Often the file system will contain some of the items
o       Superblock : contains all the key parameters about
        the file system and is read into memory when the
        computer is booted or the file system is first touched.
o       Typical information in the superblock includes: a
        magic number, to identify the file system type, the
        number of blocks, in the file system, and other key
        administrative information.
05/16/2024               School of Computing, DDUIoT       23
             File system layout(con’t…)
o Free space management: includes information about free
  blocks in the file system, for example in the form of a
  bitmap or a list of pointers. This might be followed by
  the
o I-nodes: an array of data structures, one per file, telling
  all about the file.
o Root directory: which contains the top of the file system
  tree.
o Finally, the remainder of the disk contains all the other
  directories and files.
05/16/2024            School of Computing, DDUIoT        24
             File system layout(con’t…)
    A possible file system layout.
05/16/2024            School of Computing, DDUIoT   25
             Implementing files
o Probably the most important issue in implementing
  file storage is keeping track of which disk blocks go
  with which file.
o Various methods are used in different operating
  systems.
          o Contiguous allocation
          o Linked list allocation
          o i-node
05/16/2024           School of Computing, DDUIoT     26
             Contiguous allocation
o The simplest allocation scheme is to store each file as a
  contiguous run of disk blocks.
o It is simple to implement because keeping track of where a file's
  blocks are is reduced to remembering two numbers: the disk
  address of the first block and the number of blocks in the file
o The read performance is excellent because the entire file can be
  read from the disk in a single operation.
o Only one seek is needed (to the first block). After that, no more
  seeks or rotational delays are needed, so data come in at the full
  bandwidth of the disk.
o Thus contiguous allocation is simple to implement and has high
  performance.
o Contiguous allocation also has a fairly significant drawback: over
  the course of time, the disk becomes fragmented.
05/16/2024               School of Computing, DDUIoT             27
             Linked list allocation
o Keep each one as a linked list of disk blocks.
o The first word of each block is used as a pointer
  to the next one and the rest of the block is for
  data.
o Unlike contiguous allocation, every disk block can
  be used in this method. No space is lost to disk
  fragmentation (except for internal fragmentation in
  the last block).
o It is sufficient for the directory entry to merely
  store the disk address of the first block. The rest
  can be found starting there.
o Although      reading   a    file  sequentially   is
  straightforward, random access is extremely slow.
05/16/2024          School of Computing, DDUIoT     28
             Linked list allocation(con’t…)
o The amount of data storage in a block is no
  longer a power of two because the pointer
  takes up a few bytes.
o Both disadvantages of the linked list
  allocation can be eliminated by taking the
  pointer word from each disk block and
  putting it in a table in memory.
05/16/2024             School of Computing, DDUIoT   29
                        i-node
o Method for keeping track of which blocks belong to
  which file is to associate with each file of a data
  structure called an i-node (index-node)
o Lists the attributes and disk addresses of the file's
  blocks.
o The big advantage of this scheme over linked files
  using an in-memory table is that the i-node need
  only be in memory when the corresponding file is
  open.
05/16/2024           School of Computing, DDUIoT      30
             Implementing directory
o The main function of the directory system is to map
  the ASCII name of the file onto the information
  needed to locate the data.
o A directory consists of a list of fixed-size entries, one
  per file, containing a (fixed-length) file name, a
  structure of the file attributes, and one or more disk
  addresses (up to some maximum) telling where the
  disk blocks are.
05/16/2024            School of Computing, DDUIoT        31
               Implementing directory(con’t…)
o      Nearly all modem operating systems support longer,
       variable-length file names.
       How can these be implemented?
o      The simplest approach is to set a limit on file name
       length, typically 255 characters
o      One alternative is to give up the idea that all directory
       entries are the same size.
o      With this method, each directory entry contains a fixed
       portion, typically starting with the length of the entry,
       and then followed by data with a fixed format and
       other.
o      A disadvantage of this method is that when a file is
       removed, a variable-sized gap is introduced into the
       directory into which the next file to be entered may
  05/16/2024
       not fit.             School of Computing, DDUIoT      32
        Implementing directory(con’t…)
o Another way to handle variable-length names is to make
  the directory entries themselves all fixed length and
  keep the file names together in a heap at the end of the
  directory.
o This method has the advantage that when an entry is
  removed, the next file entered will always fit there.
o In all of the designs so far, directories are searched
  linearly from beginning to end when a file name has to
  be looked up.
o One way to speed up the search is to use a hash table
  in each directory.
o The table entry corresponding to the hash code is
  inspected.
05/16/2024            School of Computing, DDUIoT       33
        Implementing directory(con’t…)
o If it is unused, a pointer is placed there to the file
  entry.
o If that slot is already in use, a linked list is
  constructed, headed at the table entry and threading
  through all entries with the same hash value.
o A different way to speed up searching large
  directories is to cache the results of searches.
05/16/2024           School of Computing, DDUIoT      34
05/16/2024   School of Computing, DDUIoT   35