Understanding File Systems

Uploaded by

罗芷晴
FILE SYSTEMS

Topic 3

Objectives
• List the basic functions common to all file systems

• Describe and explain file systems from the end-users’ perspective
and from the implementation (i.e. file system designers’) perspective.

• Describe the file systems used by Windows (FAT16, FAT32, FAT64,
and NTFS) and Linux (ufs and ext).


Understanding the File System Functions


• All information stored on a computer’s hard disk is
managed, stored, and retrieved through a file system
• The file system allocates locations on a disk for storage, and it
keeps a record of where specific information is kept.
• Some file systems implement recovery procedures when a disk
area is damaged or when the OS goes down.
• The overall purpose of a file system is to create a
structure for filing data.
• A file is a set of data that are grouped in some logical
manner, assigned a name, and stored on the disk
CC216 Lecture 3 - File Systems 4

Understanding File System Functions


File systems used by operating systems perform the
following general tasks:
1. Partition and format disks to store and retrieve
information
2. Enable files to be organized through directories and
folders.
3. Establish file-naming conventions.
4. Provide utilities to maintain and manage the file system
and storage media.
5. Provide for file and data integrity.
6. Enable error recovery or prevention.
7. Secure the information in files.

Understanding File System Functions


• Directory or folder – an organizational structure that contains
files and may additionally contain subdirectories (or subfolders)
• A directory is a hierarchical system that allows for organizing
data on different levels; a folder simply organizes data into one
level.
• Directories may store the following information
• Date and time the directory or file was created
• Date and time the directory or file was last modified
• Date and time when the directory or file was last accessed
• Directory or file size
• Directory or file attributes, such as security information, or if
the directory or file was backed up
• If the information in a directory or file is compressed or
encrypted

A file system analogy



Designing a Directory Structure


A chaotic file structure:
• Makes it difficult to run or remove programs
• Makes it difficult to determine the most current versions
• Makes users spend unproductive time looking for specific
files
Some users keep most of their files in the computer’s primary
level or root directory (root folder)
• Some programs use an automated setup that suggests folders
for new programs
• Example – creating new subfolders under the Program Files
folder
• To avoid chaos, design the file and folder structure from the
start (especially on servers)

Designing a Directory Structure


Consider following some general practices:
• Root folder should not be cluttered with files or too many
directories/folders
• Each software application should have its own folder/subfolder
• Similar information should be grouped (for example, accounting
systems or office productivity software)
• Operating system files should be kept separate and protected
• Directories and folders should have names that clearly reflect their
purposes (a folder named “Shared” would contain documents that
can be shared by many users)

Designing a Directory Structure

Sample folder structure for a Windows-based system



Designing a Directory Structure

Fedora root folders
Tanenbaum & Bos, Modern Operating Systems, 4th ed., (c) 2013
Prentice-Hall, Inc. All rights reserved.
File Systems (1)
Essential requirements for long-term information storage:
1. It must be possible to store a very large amount of
information.
2. Information must survive termination of the process using
it.
3. Multiple processes must be able to access information
concurrently.
File Systems (2)

Think of a disk as a linear sequence of fixed-size blocks that
supports two operations:
1. Read block k.
2. Write block k.
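
The two-operation view of a disk can be sketched in a few lines of Python. This is only an illustration: the class name `BlockDevice`, the method names, and the 16-byte block size are assumptions made for the example, not part of the slides.

```python
class BlockDevice:
    """A disk modelled as a linear sequence of fixed-size blocks."""

    def __init__(self, num_blocks: int, block_size: int = 1024) -> None:
        self.block_size = block_size
        # Every block starts out filled with zero bytes.
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read_block(self, k: int) -> bytes:
        """Operation 1: read block k."""
        return self.blocks[k]

    def write_block(self, k: int, data: bytes) -> None:
        """Operation 2: write block k (short data is zero-padded)."""
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        self.blocks[k] = data.ljust(self.block_size, b"\x00")


disk = BlockDevice(num_blocks=8, block_size=16)
disk.write_block(3, b"hello")
```

Everything a file system does (naming, protection, free-space tracking) has to be built on top of an interface this small.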
File Systems (3)
Questions that quickly arise:
1. How do you find information?
2. How do you keep one user from reading another
user’s data?
3. How do you know which blocks are free?

Files
• In computing, abstractions are important:
• Processor → processes
• Physical memory → (virtual) address space
• Disk storage & management → the file
• Files: logical units of information created by processes.
• A disk will usually contain thousands/millions of them
• Each one independent of the others
• Files model the disk
• Shield user from details of how/where information is stored and
how disks work.
• Files managed by OS.
• How they are structured, named, accessed, used,
protected, implemented, and managed → major topics
• Part of OS dealing with files → file system

File system
• Can be understood from:
1. End-users’ perspective
2. Implementation (i.e. file system designers’) perspective.
• Earlier slides have explained a lot from the end user’s
perspective.
• Now, we cover both of them systematically.

File systems as seen from end-users’ perspective…


File Naming

1. Naming conventions
2. Two-part name

Fig: Some typical file extensions.


File Structure

Fig: Three kinds of files.


(a) Byte sequence. (b) Record sequence. (c) Tree.
File Types
Two main types:
1. ASCII
2. Binary

For binary files:
• Every OS must recognize at least one type: its own executable file.
• For other binary files, the OS leaves them to the corresponding
applications to handle.

Fig: Examples of binary files: (a) An executable file. (b) An archive.

ASCII (American Standard Code for Information Interchange) is the most
common character encoding format for text data in computers and on the
internet. In standard ASCII-encoded data, there are unique values for
128 alphabetic, numeric, or special additional characters and control
codes.
File Attributes

• Every file has a name and data.
• All OSs associate other info with each file → attributes/metadata.
• These vary from system to system.

Fig. Some possible file attributes.


File Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename
File System Calls
The open() system call provides access to a file in a file system.
It allocates resources for the file and returns a handle that the
process uses to refer to the file. A file can be opened by multiple
processes at the same time or be restricted to one process.
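
As a rough illustration of the handle idea, Python's `os.open` wraps the open() system call and returns a small integer file descriptor. The sketch below opens the same scratch file twice from one process (a simplification of the multi-process case); the file name `demo.txt` is made up for the example.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"file systems\n")
os.close(fd)

# Two independent opens of the same file give two different handles,
# each with its own read position.
fd_a = os.open(path, os.O_RDONLY)
fd_b = os.open(path, os.O_RDONLY)
first = os.read(fd_a, 4)    # reads from offset 0 via handle A
second = os.read(fd_b, 4)   # also reads from offset 0 via handle B
os.close(fd_a)
os.close(fd_b)
```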
Example Program Using File System Calls

Figure 4-5. A simple program to copy a file.
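
The figure itself did not survive extraction. As a stand-in, here is a minimal Python sketch of a copy loop built on the open, read, write, and close calls; it is not the textbook's program, and the 4096-byte buffer size and the file names are assumptions chosen for the example.

```python
import os
import tempfile

BUF_SIZE = 4096  # bytes requested per read; an arbitrary choice


def copy_file(src: str, dst: str) -> int:
    """Copy src to dst using open/read/write/close; return bytes copied."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            total = 0
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:            # an empty read means end of file
                    return total
                written = 0
                while written < len(chunk):   # write may be partial
                    written += os.write(out_fd, chunk[written:])
                total += written
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)


work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "src.bin")
dst = os.path.join(work_dir, "dst.bin")
with open(src, "wb") as f:
    f.write(b"x" * 10_000)
copied = copy_file(src, dst)
```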


Directories
• Directories/folders → Keep track of files
• In many systems, they themselves are files.
• Need to understand:
• Their organization
• Their properties
• Operations that can be performed on them
Single-Level Directory Systems

• Simplest
• Common in early PCs (only 1 user).
• World’s 1st supercomputer CDC 6600
had only 1 directory.
• Advantages:
• Keep software design simple.
• Locate files quickly for small
number of files.
• Good for simple embedded
devices, e.g. telephones, cameras

Fig. A single-level directory system containing four files.


Hierarchical Directory Systems

• To cater to modern users with thousands/millions of files.
• Impossible with single-level directory systems.
• Need a way to group related files together.
• Hierarchy → natural way for grouping.
• Arbitrary number of subdirectories.
Fig. A hierarchical directory system.
Fig. A UNIX directory tree.
Directory Operations
1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink

• Allowed system calls for managing directories.


• Varies from system to system.
• Operations listed above are a set of samples taken from UNIX.
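
Most of the operations listed above have close counterparts in Python's `os` module, which can serve as a quick way to try them. The sketch assumes a POSIX-like system whose temporary directory supports hard links; the directory and file names are invented for the example.

```python
import os
import tempfile

root = tempfile.mkdtemp()
docs = os.path.join(root, "docs")
os.mkdir(docs)                                     # Create

old = os.path.join(docs, "a.txt")
new = os.path.join(docs, "b.txt")
alias = os.path.join(docs, "c.txt")
with open(old, "w") as f:
    f.write("hello")

os.rename(old, new)                                # Rename
os.link(new, alias)                                # Link: second name, same file

with os.scandir(docs) as entries:                  # Opendir / Readdir / Closedir
    names = sorted(entry.name for entry in entries)
link_count = os.stat(new).st_nlink                 # names pointing at the file

os.unlink(alias)                                   # Unlink
os.unlink(new)
os.rmdir(docs)                                     # Delete (directory must be empty)
```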

File systems as seen from implementer's point of view…

File System Implementation: Intro


• Time to turn from user’s view of file system to
implementer’s view.
• Users are concerned with:
• How files are named
• What operations are allowed on them.
• What directory tree looks like
• Other similar interface issues.
• Implementers are interested in:
• How files and directories are stored and managed.
• How disk space is managed.
• How to make everything work efficiently and reliably.

Disk Storage Basics


• Hard disks arrive from manufacturer with low-level
formatting
• A low-level format is a software process that marks the location of
disk tracks and sectors
• Tracks are like several circles around a disk and each track is
divided into sections of equal size called sectors

A hard disk (top cover removed)

Disk Storage Basics

Figure 2: Disk tracks and sectors on a platter



File System Layout


• File systems stored on disks.
• Most disks can be divided up into one or more partitions,
with independent file systems on each partition.
• Sector 0 of the disk → MBR (Master Boot Record)
• used to boot computer.
• End of MBR → partition table.
• Gives starting & ending addresses of each partition.
• One of the partitions marked as active.
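
To make the partition table concrete, the sketch below parses a synthetic 512-byte sector laid out like a classic PC MBR: four 16-byte entries starting at byte 446, a status byte of 0x80 for the active partition, and the signature bytes 0x55 0xAA at the end. These layout details are not stated on the slides and are brought in as assumptions, as are the function name and the sample partition values.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # assumed: partition table position in a classic MBR
ENTRY_SIZE = 16           # four entries of 16 bytes each
ACTIVE_FLAG = 0x80        # status byte of the active (bootable) partition
ENTRY_FORMAT = "<B3xB3xII"  # status, (CHS skipped), type, (CHS skipped), first LBA, sector count


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, first, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype == 0:            # type 0 marks an unused slot
            continue
        partitions.append({
            "index": i,
            "active": status == ACTIVE_FLAG,
            "first_sector": first,
            "last_sector": first + count - 1,
        })
    return partitions


# Build a fake sector with two partitions; the first one is active.
sector = bytearray(SECTOR_SIZE)
sector[446:462] = struct.pack(ENTRY_FORMAT, 0x80, 0x83, 2048, 204800)
sector[462:478] = struct.pack(ENTRY_FORMAT, 0x00, 0x07, 206848, 409600)
sector[510:512] = b"\x55\xaa"
parts = parse_mbr(bytes(sector))
```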
File System Layout

Fig. A possible file system layout.



File System Layout (cont’d)


• When a computer is booted:
• BIOS reads in and executes Master Boot Record (MBR).
• The first thing the MBR program does is locate the active partition,
read in its first block, which is called the boot block, and execute it.
• The program in the boot block loads the operating system contained
in that partition.
• For uniformity, every partition starts with a boot block,
even if it does not contain a bootable OS. Besides, it might
contain one in the future.
• Other than starting with a boot block, the layout of a disk
partition varies a lot from file system to file system.

Items in a possible layout of a disk partition


• Superblock
• contains all key parameters about file system
• read into RAM when computer booted or file system first touched.
• typical information includes a magic number to identify file-system
type, number of blocks in file system and other key admin info.
• Information about free blocks in the file system
• e.g. in the form of a bitmap or a list of pointers.
• I-nodes
• an array of data structures, one per file, telling all about the file.
• Root directory
• contains the top of the file-system tree.
• All the other directories and files

Implementing Files
• Most important issue: keep track of which disk blocks go
with which file.
• Various methods are used in different OSs.
• We examine a few of them in the coming slides:
• Contiguous Allocation
• Linked-List Allocation
• Linked-List Allocation Using a Table in Memory
• I-nodes

Implementing Files
Contiguous Layout
• Simplest allocation scheme
• Stores each file as a contiguous run of disk blocks.
• 1-KB blocks 50-KB file allocated 50 consecutive blocks.
• 2-KB blocks 50-KB file allocated 25 consecutive blocks.
• Example shown in Fig next slide.
• 40 disk blocks starting with block 0 on the left.
• Initially, empty disk. Then file A written. Then B.
• Each file begins at start of new block.
• If a file size is really 3.5 blocks, space wasted at end of last block.
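
The block counts above, and the space wasted at the end of the last block, follow from a rounded-up division. A small sketch, with function names chosen for the example and 1 KB taken as 1024 bytes:

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks needed to hold file_size bytes."""
    return -(-file_size // block_size)      # integer ceiling division


def wasted_bytes(file_size: int, block_size: int) -> int:
    """Unused bytes at the end of the file's last block."""
    return blocks_needed(file_size, block_size) * block_size - file_size


KB = 1024
fifty_kb_in_1k_blocks = blocks_needed(50 * KB, 1 * KB)
fifty_kb_in_2k_blocks = blocks_needed(50 * KB, 2 * KB)
# A file that is "really 3.5 blocks" long still occupies 4 blocks.
three_and_a_half = blocks_needed(3 * KB + 512, 1 * KB)
waste = wasted_bytes(3 * KB + 512, 1 * KB)
```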
Implementing Files
Contiguous Layout

Fig. (a) Contiguous allocation of disk space for seven files. (b) The
state of the disk after files D and F have been removed.

Implementing Files
Contiguous Layout (cont’d)
• Advantages
1. Simple to implement: keeping track of where a file’s blocks are reduces to
remembering two numbers: the disk address of the first block and the number of
blocks occupied by the file.
2. Excellent read performance
• Only 1 seek needed (to first block).
• No more seeks or rotational delays after that → data come in at full
bandwidth of the disk.
• Drawback
• Over the course of time, the disk becomes fragmented: it ultimately
consists of files and holes (see Fig).
• Compacting the disk → prohibitively expensive (moving millions of blocks).
• Reusing free space in the holes → doable but impractical (whenever a new
file is created, its final size must be known in order to choose a hole of
the correct size to place it in).
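
The fragmentation problem can be reproduced with a toy model. The sketch below is an assumption-laden illustration: a 20-block disk, a first-fit placement policy, and made-up file sizes. After two deletions the disk has 10 free blocks in total, yet no file larger than 7 blocks can be created.

```python
class ContiguousDisk:
    """Toy disk that stores each file as one contiguous run of blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.owner = [None] * num_blocks   # owner[i] is a file name or None
        self.files = {}                    # name -> (first block, length)

    def holes(self):
        """Return (start, length) for every run of free blocks."""
        result, start = [], None
        for i, who in enumerate(self.owner):
            if who is None and start is None:
                start = i
            elif who is not None and start is not None:
                result.append((start, i - start))
                start = None
        if start is not None:
            result.append((start, len(self.owner) - start))
        return result

    def create(self, name: str, length: int) -> int:
        """Place the file in the first hole that is large enough."""
        for start, size in self.holes():
            if size >= length:
                for i in range(start, start + length):
                    self.owner[i] = name
                self.files[name] = (start, length)
                return start
        raise OSError(f"no hole of {length} contiguous blocks")

    def delete(self, name: str) -> None:
        start, length = self.files.pop(name)
        for i in range(start, start + length):
            self.owner[i] = None


disk = ContiguousDisk(20)
for name, length in [("A", 4), ("B", 3), ("C", 6), ("D", 5)]:
    disk.create(name, length)
disk.delete("B")
disk.delete("D")
free_total = sum(length for _start, length in disk.holes())
```

Note that `create` needs the final file length up front, which is exactly the impracticality the slide points out.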
Implementing Files
Linked List Allocation

Fig. Storing a file as a linked list of disk blocks.


Implementing Files
Linked List Allocation
• Advantages
• Unlike contiguous allocation, every disk block can be used.
• No space lost to disk fragmentation (except for internal fragmentation)
• Enough to store disk address of first block.
• Reading a file is straightforward
• Disadvantages
• Random access is extremely slow.
• To get to block n, the OS has to start at the beginning and read the n-1 blocks
prior to it, one at a time. Doing so many reads is painfully slow.
• Amount of data storage in a block is no longer a power of 2 (since the pointer
takes up a few bytes).
• Not fatal, but many programs read/write in blocks whose size is a power of 2.
• With a few bytes occupied by the pointer, reads of the full block size require
acquiring and concatenating info from 2 disk blocks → extra overhead
due to copying.
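
Both disadvantages show up in a small model where each block stores a pointer to the next block alongside its data. The names, the 4-byte pointer width, and the block numbers (borrowed from the file allocation table example elsewhere in this deck) are assumptions for illustration only.

```python
END = -1                      # marks the last block of a file
BLOCK_SIZE = 1024
POINTER_SIZE = 4              # assumed pointer width in bytes
PAYLOAD_SIZE = BLOCK_SIZE - POINTER_SIZE   # 1020 bytes: not a power of 2


class LinkedDisk:
    def __init__(self) -> None:
        self.blocks = {}      # block number -> (next block number, payload)
        self.reads = 0        # counts simulated disk reads

    def read(self, block_no: int):
        self.reads += 1
        return self.blocks[block_no]


def read_nth_block(disk: LinkedDisk, first_block: int, n: int) -> bytes:
    """Return the payload of the n-th block of a file (n counts from 1)."""
    current = first_block
    for _ in range(n - 1):                 # must read the n-1 blocks before it
        current, _payload = disk.read(current)
    return disk.read(current)[1]


disk = LinkedDisk()
chain = [4, 7, 2, 10, 12]
for position, block_no in enumerate(chain):
    next_block = chain[position + 1] if position + 1 < len(chain) else END
    disk.blocks[block_no] = (next_block, f"A{position + 1}".encode())

payload = read_nth_block(disk, first_block=4, n=5)
```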
Implementing Files
Linked List – Table in Memory
• Eliminate both disadvantages of linked list allocation.
• Take pointer word from each disk block and put it in a
table in memory  File Allocation Table (FAT).
• See Fig next slide.
• Advantages
• All the blocks available for data.
• Random access much easier.
• Although the chain must still be followed, it is entirely in memory →
can be followed without making any disk references.
• Sufficient for directory entry to keep single integer (starting block
number) and can locate all blocks, no matter the file size.
Implementing Files
Linked List – Table in Memory

• Two files.
• File A uses blocks 4,7,2,10,12
• File B uses blocks 6,3,11,14
• Both chains terminated with
special marker (“-1”).

Fig. Linked list allocation using a file allocation table in main memory.
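
The two chains from the figure can be written down as a small table and followed without touching the disk. A minimal sketch, assuming a 16-entry table and the "-1" end marker mentioned on the slide:

```python
END = -1                     # special marker that terminates a chain
fat = [None] * 16            # fat[b] = number of the block that follows b

# File A uses blocks 4, 7, 2, 10, 12.
fat[4], fat[7], fat[2], fat[10], fat[12] = 7, 2, 10, 12, END
# File B uses blocks 6, 3, 11, 14.
fat[6], fat[3], fat[11], fat[14] = 3, 11, 14, END


def file_blocks(table, start):
    """Follow the chain in memory, starting from the block stored in the directory entry."""
    blocks = []
    current = start
    while current != END:
        blocks.append(current)
        current = table[current]
    return blocks


blocks_a = file_blocks(fat, 4)
blocks_b = file_blocks(fat, 6)
```

The directory entry only has to store the starting block number (4 for A, 6 for B).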
Implementing Files
Linked List – Table in Memory (cont’d)
• Disadvantage
• Entire table must be in memory all the time.
• E.g. how much RAM does the FAT take, given a
200 GB disk and 1-KB block size?
• The table needs 200 million entries (one for each of 200
million disk blocks)
• (200x10^9) / (1x10^3) = 200x10^6
• Each entry should be 4 bytes.
• log(200x10^6) / log(2) = 27.58 bits, and 27.58 / 8 = 3.45 bytes, rounded up to 4 bytes
• Table will take up 800 MB of RAM.
• (200x10^6) entries x 4 bytes = 800x10^6 bytes = 800 MB
• Not practical and does not scale well to large disks.
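
The same arithmetic can be checked in code. The sketch uses decimal units as on the slide (1 GB = 10^9 bytes, 1 KB = 10^3 bytes); the function name and the returned tuple are choices made for the example.

```python
def fat_memory(disk_bytes: int, block_bytes: int):
    """Return (entries, bits per entry, bytes per entry, table size in bytes)."""
    entries = disk_bytes // block_bytes          # one entry per disk block
    bits = (entries - 1).bit_length()            # bits needed to number every block
    entry_bytes = -(-bits // 8)                  # round up to whole bytes
    return entries, bits, entry_bytes, entries * entry_bytes


entries, bits, entry_bytes, table_bytes = fat_memory(200 * 10**9, 1 * 10**3)
```

Running it with a larger disk and the same block size shows the table growing in direct proportion to the disk.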
Implementing Files
I-nodes
• Associate with each file a data structure called i-node
(index-node).
• Lists attributes and disk addresses of the file’s blocks.
• See Fig next slide.
• Given the i-node of a file, can find all blocks of the file.
• Advantage over linked list allocation with in-memory table
is that i-node need only be in memory when
corresponding file is open.
• If each i-node occupies n bytes and max k files may be
open at once, total memory needed = kn bytes only.
• Only this much space need be reserved in advance.
• Far smaller than FAT which grows proportionally with disk size.
• Memory required by i-node is independent of disk size.
Implementing Files
I-nodes

Figure 4-13. An
example i-node.
Implementing Files
I-nodes (cont’d)
• Problem with i-nodes
• If each one has room for fixed number of disk addresses, what
happens when a file grows beyond this limit?
• Solution
• Reserve last disk address not for a data block but instead for
address of a block containing more disk block addresses.
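
The indirect-block idea can be sketched as follows. The limit of eight addresses per i-node (seven direct plus one indirect), the dictionary standing in for the disk, and the block numbers are all hypothetical values picked for the example.

```python
import itertools

ADDRS_PER_INODE = 8           # hypothetical: 7 direct addresses + 1 indirect


class Inode:
    def __init__(self) -> None:
        self.direct = []      # addresses of the file's first data blocks
        self.indirect = None  # address of a block holding more addresses


def add_block(inode, addr, disk, alloc):
    """Record one more data block address for the file."""
    if len(inode.direct) < ADDRS_PER_INODE - 1:
        inode.direct.append(addr)
        return
    if inode.indirect is None:            # file outgrew the i-node itself
        inode.indirect = alloc()
        disk[inode.indirect] = []         # this block stores addresses, not data
    disk[inode.indirect].append(addr)


def all_blocks(inode, disk):
    """List every data block of the file, direct ones first."""
    extra = disk[inode.indirect] if inode.indirect is not None else []
    return inode.direct + extra


disk = {}
counter = itertools.count(100)            # pretend allocator for indirect blocks
inode = Inode()
for data_block in range(10):
    add_block(inode, data_block, disk, lambda: next(counter))
blocks = all_blocks(inode, disk)
```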
Implementing Directories
• Before file is read, must be opened.
• When file opened, OS uses path name supplied by user
to locate directory entry.
• Directory entry provides info needed to find disk blocks.
• This info may be:
• The disk address of entire file (with contiguous allocation)
• The number of the first block (both linked list schemes)
• The number of the i-node.
• Main function of directory system is to map ASCII
name of file onto info needed to locate the data.
Implementing Directories
Example: reading a file
• Assume you want to open “/foo/bar”, read and close it.
• For simplicity, assume its size is 4KB (i.e. 1 block).
• File system (FS) first needs to find the inode for the file “bar”
• to obtain some basic information about the file (permissions
info, file size, etc.).
• To do so, FS must be able to find the inode, but all it
has right now is the full pathname.
• FS must traverse pathname and locate desired inode.
• All traversals begin at the root of the file system, in the
root directory which is simply called “/”.
Implementing Directories
Example: reading a file (cont’d)
• Thus, first thing FS will read from disk is the inode of
the root directory.
• But where is this inode? To find an inode, we must
know its i-number. Usually, we find the i-number of a file
or directory in its parent directory; the root has no
parent (by definition).
• Thus, the root inode number must be “well known”; the
FS must know what it is when the file system is
mounted. In most UNIX file systems, the root inode
number is 2.
• Thus, to begin the process, the FS reads in the block
that contains inode number 2 (the first inode block).
Implementing Directories
Example: reading a file (cont’d)
• Once the inode is read in, FS can look inside of it to find
pointers to data blocks, which contain the contents of the
root directory.
• FS will read through these looking for an entry for foo
• Once found, FS will also have found the inode number of
foo (say it is 44) which it will need next.
• The next step is to recursively traverse the pathname until
the desired inode is found.
• FS reads the block containing the inode of foo and then
its directory data, finally finding the inode number of bar.
• Then contents of bar can be read from following pointers
to data blocks inside the inode of bar.
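
The traversal of “/foo/bar” can be traced with a toy inode table. The root i-number 2 and the i-number 44 for foo come from the example; the i-number 45 for bar, the data block numbers, and the dictionary layout are invented for this sketch.

```python
ROOT_INODE = 2

# i-number -> inode (only the fields needed here)
inodes = {
    2: {"type": "dir", "blocks": [10]},
    44: {"type": "dir", "blocks": [11]},
    45: {"type": "file", "blocks": [12]},    # 45 is a made-up i-number for bar
}
# block number -> contents (directory blocks map names to i-numbers)
data_blocks = {
    10: {"foo": 44},
    11: {"bar": 45},
    12: b"contents of bar",
}


def lookup(path: str):
    """Resolve an absolute path; return (i-number, list of reads performed)."""
    inum = ROOT_INODE
    trace = [("inode", inum)]                 # the root inode is read first
    for name in [part for part in path.split("/") if part]:
        inode = inodes[inum]
        if inode["type"] != "dir":
            raise NotADirectoryError(path)
        found = None
        for block_no in inode["blocks"]:      # scan the directory's data
            trace.append(("data", block_no))
            if name in data_blocks[block_no]:
                found = data_blocks[block_no][name]
                break
        if found is None:
            raise FileNotFoundError(path)
        inum = found
        trace.append(("inode", inum))         # read the next inode
    return inum, trace


inum, trace = lookup("/foo/bar")
contents = b"".join(data_blocks[b] for b in inodes[inum]["blocks"])
```

The trace makes the cost visible: opening a file two levels deep already takes five reads before any of its data is touched.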
Journaling File Systems

Steps to remove a file in UNIX:


1. Remove file from its directory.
2. Release i-node to the pool of free i-nodes.
3. Return all disk blocks to pool of free disk blocks.
• What happens if system crashes after 1st step?
• i-node and file blocks won’t be accessible from any file but will
also not be available for reassignment
• just in limbo somewhere → decreasing available resources.
• Keep a log of what file system is going to do before it
does it.
• If system crashes before it can do its planned work,
upon rebooting, system can look in the log to see what
was going on at the time of the crash and finish the job.
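
A minimal sketch of the logging idea, assuming an in-memory model in which the journal survives the crash while the three removal steps may be interrupted. The class `TinyFS`, the file name, and the block numbers are hypothetical. Each step is written so that repeating it is harmless, which is what allows recovery to simply redo the logged operation.

```python
class TinyFS:
    def __init__(self) -> None:
        self.directory = {"report.txt": 7}    # name -> i-number
        self.inode_blocks = {7: [20, 21]}     # i-number -> data blocks
        self.free_inodes = set()
        self.free_blocks = set()
        self.journal = []                     # assumed to survive a crash

    def remove(self, name: str, crash_after=None) -> None:
        inum = self.directory[name]
        entry = {"name": name, "inum": inum,
                 "blocks": list(self.inode_blocks[inum])}
        self.journal.append(entry)            # log the plan before acting
        self._apply(entry, crash_after)
        self.journal.pop()                    # work finished: erase the entry

    def _apply(self, entry, crash_after=None) -> None:
        steps = [
            lambda: self.directory.pop(entry["name"], None),     # step 1
            lambda: self.free_inodes.add(entry["inum"]),          # step 2
            lambda: self.free_blocks.update(entry["blocks"]),     # step 3
        ]
        for number, step in enumerate(steps, start=1):
            step()
            if crash_after == number:
                raise RuntimeError("simulated crash")

    def recover(self) -> None:
        """After a reboot, redo every operation still in the journal."""
        while self.journal:
            self._apply(self.journal[0])
            self.journal.pop(0)


fs = TinyFS()
try:
    fs.remove("report.txt", crash_after=1)    # crash right after step 1
except RuntimeError:
    pass
in_limbo = 7 not in fs.free_inodes            # i-node neither used nor free
fs.recover()
```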
Virtual File Systems

Figure 4-18. Position of the virtual file system.


Keeping Track of Free Blocks

Fig. (a) Storing the free list on a linked list. (b) A bitmap.
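
The bitmap variant is easy to sketch: one bit per disk block, 0 for free and 1 for in use. The class name, the bit ordering within each byte, and the first-free allocation policy are assumptions for this example.

```python
class BlockBitmap:
    """Tracks free blocks with one bit per block (0 = free, 1 = in use)."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_free(self, k: int) -> bool:
        return ((self.bits[k // 8] >> (k % 8)) & 1) == 0

    def allocate(self) -> int:
        """Mark the lowest-numbered free block as used and return it."""
        for k in range(self.num_blocks):
            if self.is_free(k):
                self.bits[k // 8] |= 1 << (k % 8)
                return k
        raise OSError("disk full")

    def free(self, k: int) -> None:
        self.bits[k // 8] &= ~(1 << (k % 8)) & 0xFF


bitmap = BlockBitmap(10)
first = bitmap.allocate()
second = bitmap.allocate()
bitmap.free(first)
reused = bitmap.allocate()
```

Ten blocks fit in two bytes here, which hints at why a bitmap is compact compared with a list of block numbers.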
The MS-DOS File System (2)

Fig. Maximum partition size for different block sizes. The empty boxes
represent forbidden combinations.

Windows File Systems


• Windows XP, Vista, 7, Server 2003, and Server 2008
support three file systems:
• Extended FAT16
• FAT32
• NTFS
• These OSs also support file systems for DVD/CD-ROM
drives and USB devices (flash drives)

FAT16 and Extended FAT16


• Extended FAT16 evolved from FAT16 used in earlier
versions of MS-DOS and Windows (3.x/95/98/Me)
• In extended FAT16:
• Maximum size of a volume is 4GB
• Maximum size of a file is 2GB
• Has been around for a while and can be read by non-Windows
operating systems like UNIX/Linux
• Considered a stable file system
• Long filenames (LFNs) can be used
• Can contain up to 255 characters
• Not case sensitive

FAT32
• Support for FAT32 started with Windows 95 Release 2
• Designed to accommodate larger capacity disks
• FAT32:
• Root folder does not have to be at the beginning of a volume
• Can use disk space more efficiently than FAT16 (because it uses
smaller cluster sizes)
• Largest volume that can be formatted is 32 GB
• Maximum file size is 4 GB
• Offers fast response on small 1 or 2 GB partitions

FAT64
• FAT64 is also known as exFAT
• Proprietary file system introduced by Microsoft for mobile
personal storage
• Good choice for USB flash devices that may store large
files (such as pictures, videos, etc…)
• Available in Service Pack 1 for Windows Vista, Windows
7, and Windows Server 2008, Mac OS X Snow Leopard
• Support is available for Linux from a third party

NTFS
• NTFS – dominant Windows file system for all Windows
operating systems starting with Windows 2000
• Uses a Master File Table (MFT) instead of FAT tables
• The MFT and related files take up about 1 MB of disk
space
• When a file is created, a record for that file is added to the
MFT
• Contains additional attributes such as security settings, ownership,
and permissions

NTFS
• The MFT record reflects the sequence of disk blocks that
a file uses
• It is possible to have multiple filenames that refer to the
same file
• A technique known as hard linking
• This feature is also available in UNIX/Linux file systems
• Windows Vista, Server 2008, and 7 use NTFS version 6
• Windows XP and Server 2003 use NTFS version 5
• Windows NT 4.0 used NTFS 4

NTFS
• Basic features of NTFS:
• Long filenames
• Built-in security features
• Better file compression than FAT
• Ability to use larger disks and files than FAT
• File activity tracking for better recovery and stability than FAT
• Portable Operating System Interface for Unix (POSIX) support
• Volume striping and volume extensions
• Less disk fragmentation than FAT

NTFS
• NTFS is equipped with security features that meet the US
government’s C2 security specifications
• Refers to high-level, “top-secret” standards for data protection,
system auditing, and system access
• Examples:
• System files can be protected so only the server administrator has
access
• A folder of databases can be protected with read access, but no
access to change data
• Public folder can give users in a designated group access to read
and update files, but not to delete files

NTFS
• Some files can be compressed by more than 40%, saving
disk storage for other storage needs
• NTFS has the ability to keep a log or journal of file system
activity (called journaling)
• Makes it possible for files to be restored in the event of a power failure
• NTFS supports volume striping
• A striped volume uses more than one physical hard disk to create a
bigger volume.
• Faster than simple volume because reads and writes happen across
multiple disks at the same time.
• Increased risk of catastrophic failure leading to data loss
• NTFS has hot fix capabilities
• If a bad disk area is detected, automatically copies the information to
another disk area that is not damaged

NTFS

Fig: FAT16, FAT32, FAT64, and NTFS compared



Summary
• When seen from the outside, a file system is a collection
of files and directories, plus operations on them.
• Files can be read and written, directories can be created
and destroyed, and files can be moved from directory to
directory.
• Most modern file systems support a hierarchical directory
system in which directories may have subdirectories and
these may have subsubdirectories ad infinitum.
• When seen from the inside, a file system looks quite
different. The file system designers have to be concerned
with how storage is allocated, and how the system keeps
track of which block goes with which file.

Chapter Summary
• Possibilities include contiguous files, linked lists, file-
allocation tables, and i-nodes.
• Different systems have different directory structures.
• Disk space can be managed using free lists or bitmaps.
• The main file systems used in Windows since Windows
2000 are extended FAT16, FAT32, and NTFS.
• NTFS is the native file system for Windows 2000 and after
with the advantage of better security, larger disk and file
sizes, better management tools, and greater stability than
FAT16 and FAT32.
• UNIX and Linux support many different file
systems but typically employ ufs or ext.
• ufs and ext use index nodes (inodes) to organize
information about files.
