file system LINUX

The LINUX file system
File System

The file system organizes data into directories and files. It provides file I/O services to
applications and enforces file protection when multiple users have access to the same files.

We store information on several different storage media, such as magnetic disks, magnetic tapes
and optical disks. The operating system provides a uniform logical view of information storage.

Hard Disk Layout: The following diagram shows the physical hard disk layout. The first sector
contains the MBR (Master Boot Record), which holds the boot code and the partition table.
The partition table has room for at most four partitions, one of which may be an extended
partition that contains further partitions.
                   Example hard disk layout

    MBR:   Boot code | Partition table

    Partition 1  |  Partition 2  |  Partition 3  |  Partition 4 (Extended)

    Labels in the figure: boot sector; partitions within an extended partition.

The DOS file system is known as the FAT file system. A FAT volume is divided
into several regions, as shown below. The file allocation table, which gives the FAT file system
format its name, has one entry for each cluster on a volume. Because the file allocation table is
critical, the volume keeps a duplicate copy of it. Entries in the file allocation table define file
allocation chains for files and directories, where each link in the chain is the index of the next
cluster of a file’s data. A file’s directory entry stores the starting cluster of the file. The last entry
of the file’s allocation chain is the reserved value 0xFFFF for FAT16 and 0xFFF for FAT12.
The FAT entries for unused clusters have a value of 0.
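
The chain lookup described above can be sketched in C. This is a minimal illustration rather than DOS or Windows code: the function name fat16_chain, the in-memory array fat, and the rule that data clusters are numbered from 2 are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT16_EOC 0xFFFFu   /* end-of-chain value given in the text */

/* Follow a FAT16 allocation chain that begins at cluster `start`.
 * Visited cluster numbers are stored in `out` (at most `max` of them).
 * Returns the number of clusters stored.
 * Assumption: entries 0 and 1 are reserved, so valid clusters are >= 2. */
size_t fat16_chain(const uint16_t *fat, size_t fat_len,
                   uint16_t start, uint16_t *out, size_t max)
{
    assert(fat != NULL && out != NULL);
    size_t n = 0;
    uint16_t c = start;

    while (n < max && c >= 2 && (size_t)c < fat_len) {
        out[n++] = c;               /* this cluster belongs to the file */
        if (fat[c] == FAT16_EOC)    /* last link of the chain */
            break;
        c = fat[c];                 /* the entry is the index of the next cluster */
    }
    return n;
}
```

The starting cluster would come from the file's directory entry. A free entry (value 0) or an out-of-range index ends the walk early, which keeps a damaged table from causing an out-of-bounds read.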

     Boot    | File Allocation | File Allocation | Root      | Other directories
     Sector  | Table 1         | Table 2         | Directory | and all files

FAT12’s 12-bit cluster identifier limits a partition to a maximum of 2**12 (4096)
clusters. Windows uses cluster sizes from 512 bytes to 8 KB, which limits a FAT12 volume to 32 MB.
FAT16, with a 16-bit cluster identifier, can address 2**16 (65536) clusters. Its cluster sizes range
from 512 bytes to 64 KB, which limits a FAT16 volume to 4 GB.
FAT32 uses 32-bit cluster identifiers. Windows limits FAT32 volumes to a maximum of 32 GB.
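
The FAT12 and FAT16 limits follow from multiplying the number of addressable clusters by the largest cluster size. The helper below is a small sketch of that arithmetic; its name is hypothetical, and it ignores the few cluster values that real FAT implementations reserve.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on volume size: 2**cluster_bits clusters of cluster_size bytes. */
uint64_t fat_max_volume_bytes(unsigned cluster_bits, uint64_t cluster_size)
{
    assert(cluster_bits < 64);
    return ((uint64_t)1 << cluster_bits) * cluster_size;
}
```

With the figures from the text, 12 bits and 8 KB clusters give 4096 * 8192 bytes = 32 MB, and 16 bits and 64 KB clusters give 65536 * 65536 bytes = 4 GB.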

Virtual File System

LINUX supports a large number of file systems. This range is made possible by a unified
interface inside the LINUX kernel, the Virtual File System Switch (VFS). The VFS supplies
applications with the system calls for file management, maintains internal structures and passes
tasks on to the appropriate actual file system.

          The layers in the file system

              Process 1        Process 2        ...        Process n

                                                                 User mode
          ------------------------------------------------------------------
                                                                 System mode

                               Virtual File System

              ext2        msdos        minix        ...        proc       (File system)

                               Buffer cache

                               Device drivers


Basic Principles

The main demand made of a file system is the purposeful structuring of data, balanced against two
further factors: speed of access and a facility for random access. In UNIX, data are stored in a
hierarchical file system containing files of different types. These comprise not only regular files
and directories but also device files, FIFOs (named pipes), symbolic links and sockets.

Each file is described by an inode. The management information it contains includes access times,
access rights and the allocation of data to blocks on the physical media. The inode holds a few
block numbers directly to ensure efficient access to small files. Larger files are reached via
indirect blocks, which in turn contain block numbers. Each inode has a unique number.
Directories give the file system its hierarchical structure. They are also implemented
as files, but the kernel assumes them to contain pairs consisting of a filename and its inode
number.

     Structure of a UNIX inode

     Access rights
     Owner
     Size
     Times
     ...
     Direct references to data blocks  -->  Data
     Indirect block                    -->  block of references  -->  Data
     Two-step indirect reference       -->  two levels of reference blocks  -->  Data
     Three-step indirect reference     -->  three levels of reference blocks  -->  Data
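
To make the direct and indirect references concrete, the sketch below computes how many levels of indirection are needed to reach a given logical block of a file. The numbers are assumptions borrowed from Ext2 (12 direct references, 4-byte block numbers); other UNIX file systems use different values, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { N_DIRECT = 12 };   /* assumption: 12 direct references, as in Ext2 */

/* Returns 0 if logical block `lblk` is reached through a direct reference,
 * 1, 2 or 3 for one-, two- or three-step indirection,
 * and -1 if the block lies beyond what the inode can address. */
int inode_indirection_level(uint64_t lblk, uint32_t block_size)
{
    assert(block_size >= 4);
    uint64_t ppb = block_size / 4;   /* 4-byte block numbers per block */

    if (lblk < N_DIRECT)
        return 0;
    lblk -= N_DIRECT;
    if (lblk < ppb)
        return 1;
    lblk -= ppb;
    if (lblk < ppb * ppb)
        return 2;
    lblk -= ppb * ppb;
    if (lblk < ppb * ppb * ppb)
        return 3;
    return -1;
}
```

With 1024-byte blocks, the first 12 blocks (12 KB) are reached directly, which is why small files are fast to access; the next 256 blocks need one extra disk read for the indirect block.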

Each file system starts with a boot block. This block is reserved for the code required to boot the
operating system. All the information which is essential for managing the file system is held in
the superblock. This is followed by a number of inode blocks containing the inode structures for
the file system. The remaining blocks of the device provide the space for the data. These data
blocks thus contain ordinary files along with the directory entries and the indirect blocks. In
Linux, the details of each on-disk layout are handled by the respective file system implementation,
enabling the VFS to work with device-independent structures.

The representation of file system in the kernel

Mounting: Before a file can be accessed, the file system containing the file must be mounted.
This is done with the mount system command.

Superblock operations: The superblock structure provides functions for accessing the file
system.

Inode: The inode structure holds the information on the file. The remainder of the structure
contains management information and the file system dependent union.

File operations: The file_operations structure is the general interface for work on files, and
contains the functions to open, close, read and write files.

            Schematic structure of a UNIX file system

  Block:   0            1            2 ...
           Boot block | Superblock | Inode blocks | Data blocks

The proc file system

The proc file system provides information on the current status of the Linux kernel and of running
processes. It resembles the process file system of System V Release 4. Each process currently
running in the system is assigned a directory /proc/pid, where pid is its process ID. This directory
contains files holding information on certain characteristics of the process.
Some frequently used files in the /proc file system are:

/proc/modules          List of loaded modules
/proc/meminfo          Memory usage statistics
/proc/pci              List of PCI devices detected and their configuration status
/proc/ioports          I/O port information
/proc/iomem            Information on I/O memory regions
/proc/interrupts       Information related to interrupts
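
Because the entries in /proc look like ordinary text files, they can be read with the usual file routines. The helper below is a minimal sketch that returns the first line of any text file; its name is hypothetical, and nothing in it is specific to /proc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of `path` into `buf`, without the trailing newline.
 * Returns 0 on success and -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size > 0);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char *ok = fgets(buf, (int)size, fp);
    fclose(fp);
    if (ok == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
    return 0;
}
```

Calling read_first_line("/proc/meminfo", buf, sizeof buf) on a typical Linux system should yield the line that reports total memory, although the exact content depends on the kernel version.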

The Ext2 file system

Initially the LINUX file system was the MINIX file system, which restricts partitions to a maximum
of 64 MB and filenames to 14 characters. The Ext file system was designed in 1992. It allowed
partition sizes of up to 2 GB and filenames of up to 255 characters, but it was slow and suffered
from extensive fragmentation. The Ext2 file system, an enhancement of the Ext file system, is
considered to be the Linux file system.
           Structure of the Ext2 file system

  Boot block | Block group 0 | Block group 1 | ... | Block group n

  Layout of each block group:

  Super block | Group descriptors | Block bitmap | Inode bitmap | Inode table | Data blocks
  1 Blk       | 1 Blk             | 1 Blk        | 1 Blk        | n Blks      | n Blks

The superblock in the ext2 filesystem

Number of inodes, number of blocks
Number of reserved blocks, number of free blocks
Number of free inodes, first data block
Block size
Blocks per group
Inodes per group
Time of mounting
Ext2 signature
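
The inodes-per-group field is what allows an inode to be located from its number alone. The sketch below shows the calculation, assuming (as in Ext2) that inode numbers start at 1; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct inode_loc {
    uint32_t group;   /* block group that holds the inode */
    uint32_t index;   /* position within that group's inode table */
};

/* Map an inode number to its block group and inode-table index. */
struct inode_loc ext2_locate_inode(uint32_t ino, uint32_t inodes_per_group)
{
    assert(ino >= 1 && inodes_per_group > 0);
    struct inode_loc loc;
    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    return loc;
}
```

The group number selects a block group descriptor, and that descriptor gives the starting block of the inode table in which the index is then looked up.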

The block group descriptors in the Ext2 filesystem

Block bitmap, inode bitmap
Inode table, number of free blocks, number of free inodes
Number of directories

The inode in the Ext2 filesystem

Type/permissions, user (UID), file size
Access time, time of creation
Time of modification, time of deletion
Group (GID), link counter, number of blocks
File attributes
Direct blocks, one/two/three-stage indirect blocks
File ACL, directory ACL

Ext2 File Types

File_type                           Description
  0                                 Unknown
  1                                 Regular file
  2                                 Directory
  3                                 Character Device
  4                                 Block device
  5                                 Named pipe
  6                                 Socket
  7                                 Symbolic link
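
The table above maps directly onto an enumeration. The following sketch uses constant names chosen for illustration and a hypothetical lookup function that falls back to "Unknown" for values outside the table.

```c
#include <assert.h>
#include <stddef.h>

/* File type codes from the table above (names are illustrative). */
enum ext2_file_type {
    EXT2_FT_UNKNOWN  = 0,
    EXT2_FT_REG_FILE = 1,
    EXT2_FT_DIR      = 2,
    EXT2_FT_CHRDEV   = 3,
    EXT2_FT_BLKDEV   = 4,
    EXT2_FT_FIFO     = 5,
    EXT2_FT_SOCK     = 6,
    EXT2_FT_SYMLINK  = 7
};

static const char *const ext2_ft_names[] = {
    "Unknown", "Regular file", "Directory", "Character device",
    "Block device", "Named pipe", "Socket", "Symbolic link"
};

/* Return a description for a file_type value; unknown values map to "Unknown". */
const char *ext2_file_type_name(unsigned file_type)
{
    size_t n = sizeof ext2_ft_names / sizeof ext2_ft_names[0];
    assert(n == (size_t)EXT2_FT_SYMLINK + 1);   /* table covers codes 0..7 */
    return file_type < n ? ext2_ft_names[file_type]
                         : ext2_ft_names[EXT2_FT_UNKNOWN];
}
```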

Creating the Filesystem

Creating a filesystem means setting up its structures. Ext2 filesystems are created by the mke2fs utility. Its default options are:
    Block size: 1024 bytes
    Fragment size: block size
    Number of allocated inodes: one for each group of 4096 bytes
    Percentage of reserved blocks: 5%
mke2fs then performs the following steps:
1. Initializes the superblock and group descriptors
2. For each block group, reserves all the disk blocks needed to store the superblock, group descriptors, inode table
and bitmaps
3. Initializes the inode bitmap and the data block bitmap of each block group
4. Initializes the inode table of each block group
5. Creates the / root directory
6. Creates the lost+found directory, which is used by e2fsck to link lost and defective blocks.

Ext3 Filesystem

Ext3 evolved from Ext2 and is therefore compatible with it. Its most significant features are:
    • Block fragmentation
    • Access Control Lists (ACL)
    • Handling of transparently compressed and encrypted files
    • Logical deletion
    • Journaling

The Ext3 Journaling Filesystem

The goal of a journaling filesystem is to avoid running time-consuming consistency checks on the whole filesystem
by looking instead in a special disk area, named the journal, that contains the most recent disk write operations.
Ext3 journaling performs every high-level change to the filesystem in two steps.
First, a copy of the blocks to be written is stored in the journal; then, when the I/O data transfer to the journal is
completed (in short, the data is committed to the journal), the blocks are written to the filesystem.
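
The two-step scheme can be illustrated with a toy in-memory model. Everything here is hypothetical (the structure layout, the single-entry journal, the function names); the real Ext3 journal groups many blocks into transactions on disk. The point is only the ordering: commit to the journal first, write to the filesystem second.

```c
#include <assert.h>
#include <string.h>

enum { BLK = 16, NBLK = 4 };   /* toy block size and block count */

struct journal_entry {
    int  blkno;        /* which filesystem block the copy belongs to */
    char data[BLK];    /* copy of the block to be written */
    int  committed;    /* 1 once the copy is safely in the journal */
};

struct toy_fs {
    char blocks[NBLK][BLK];        /* the "filesystem" */
    struct journal_entry journal;  /* a one-entry "journal" */
};

/* Step 1: store a copy of the block in the journal and mark it committed. */
void journal_log(struct toy_fs *fs, int blkno, const char *data)
{
    assert(fs != NULL && data != NULL && blkno >= 0 && blkno < NBLK);
    fs->journal.blkno = blkno;
    memcpy(fs->journal.data, data, BLK);
    fs->journal.committed = 1;
}

/* Step 2: write the committed block to the filesystem, then retire the entry.
 * Running this after a crash replays a committed but unwritten change. */
void journal_checkpoint(struct toy_fs *fs)
{
    assert(fs != NULL);
    if (!fs->journal.committed)
        return;
    memcpy(fs->blocks[fs->journal.blkno], fs->journal.data, BLK);
    fs->journal.committed = 0;
}
```

If the system fails between the two steps, the committed copy is still in the journal, so recovery only has to replay the journal instead of checking the whole filesystem.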

Linux system calls for I/O

The Linux system calls for I/O (open, read, write, etc.) are direct entry points into the kernel.

open System call
A file is opened by

#include <fcntl.h>
int open(char *pathname, int oflag [, int mode]);

The oflag argument is formed by OR-ing together the following constants:
        O_RDONLY               Open for reading only
        O_WRONLY               Open for writing only
        O_RDWR                 Open for reading and writing
        O_NDELAY               Do not block on open or read or write
        O_APPEND               Append to end of file on each write
        O_CREAT                Create the file if it doesn't exist
        O_TRUNC                If the file exists, truncate its length to zero
        O_EXCL                 Error if O_CREAT and the file already exists

The mode argument is optional; it is required only if O_CREAT is specified.

open returns a file descriptor if successful; otherwise -1 is returned.
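
As an illustration of combining flags, the sketch below creates a file only if it does not already exist, using O_CREAT together with O_EXCL. The function name and the permission value 0644 are assumptions for the example, and the headers are the ones current Linux systems expect.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `path` only if it does not exist yet.
 * Returns 0 on success and -1 on error (for example, the file already exists). */
int create_exclusive(const char *path)
{
    assert(path != NULL);
    /* mode is required here because O_CREAT is specified */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1)
        return -1;
    return close(fd);
}
```

A second call with the same path fails, which makes this combination a common way to implement simple lock files.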

creat System call
A new file is created by

int creat(char *pathname, int mode);

If successful, creat returns a file descriptor and the file is opened for writing; otherwise -1 is returned.

The mode specifies the low order 12 bits of the file access mode word.

close System call
An open file is closed by

int close(int filedes);

When a process terminates, all its open files are automatically closed by the kernel.

read System call
Data is read from an open file using

int read(int filedes, char *buff, unsigned int nbytes);

If the read is successful, the number of bytes read is returned. This can be less than the nbytes
that was requested. If the end of file is encountered, zero is returned. If an error is encountered, -1 is
returned.
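
Because read may return fewer bytes than requested, callers that need a full buffer usually loop. The following is a minimal sketch of such a loop; the name read_full is hypothetical, and retrying on EINTR is a common convention rather than something the text prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read up to `nbytes` bytes, continuing after short reads.
 * Returns the number of bytes read (less than nbytes only at end of file),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t nbytes)
{
    assert(buf != NULL);
    size_t done = 0;

    while (done < nbytes) {
        ssize_t n = read(fd, (char *)buf + done, nbytes - done);
        if (n == 0)                 /* end of file */
            break;
        if (n == -1) {
            if (errno == EINTR)     /* interrupted by a signal: try again */
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```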

write System call
Data is written to an open file using

int write(int filedes, char *buff, unsigned int nbytes);

The actual number of bytes written is returned by the system call. This is usually equal to the
nbytes argument. If an error occurs, -1 is returned.
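
Since the return value is only usually equal to nbytes, robust code checks it and writes the remainder if necessary. A minimal sketch, with the hypothetical name write_all:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all `nbytes` bytes of `buf`, continuing after partial writes.
 * Returns 0 on success and -1 on error. */
int write_all(int fd, const void *buf, size_t nbytes)
{
    assert(buf != NULL);
    const char *p = buf;

    while (nbytes > 0) {
        ssize_t n = write(fd, p, nbytes);
        if (n == -1) {
            if (errno == EINTR)     /* interrupted by a signal: try again */
                continue;
            return -1;
        }
        p += n;                     /* skip what was written */
        nbytes -= (size_t)n;
    }
    return 0;
}
```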

lseek System call
Every open file has a "current byte position" associated with it. This is measured as the number
of bytes from the start of the file. The read and write system calls update the file's position by the
number of bytes read or written. Before a read or write, an open file can be positioned using

long lseek(int fildes, long offset, int whence);

The offset and whence arguments are interpreted as follows:
       * If the whence is 0, the file's position is set to offset bytes from the beginning
        of the file.
       * If the whence is 1, the file's position is set to its current position plus the
        offset. The offset can be positive or negative.
       * If whence is 2, the file's position is set to the size of the file plus the offset.
        The offset can be positive or negative.
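
The three whence values correspond to the constants SEEK_SET (0), SEEK_CUR (1) and SEEK_END (2). The sketch below uses all three to find the size of an open file without disturbing its current position; the function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the open file `fd`, leaving its position unchanged.
 * Returns -1 on error (for example, if fd is a pipe and cannot seek). */
off_t fd_size(int fd)
{
    off_t cur = lseek(fd, 0, SEEK_CUR);      /* whence 1: current position */
    if (cur == (off_t)-1)
        return (off_t)-1;
    off_t end = lseek(fd, 0, SEEK_END);      /* whence 2: size of file + 0 */
    if (end == (off_t)-1)
        return (off_t)-1;
    if (lseek(fd, cur, SEEK_SET) == (off_t)-1)   /* whence 0: restore */
        return (off_t)-1;
    return end;
}

/* Convenience wrapper: size of the file named `path`, or -1 on error. */
off_t path_size(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return (off_t)-1;
    off_t size = fd_size(fd);
    close(fd);
    return size;
}
```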

fcntl System call
The fcntl system call is used to change the properties of a file that is already open.

#include <fcntl.h>
int fcntl(int filedes, int cmd, int arg);

The cmd argument must be one of the following:
      F_DUPFD:     Duplicate the file descriptor.
      F_SETFD:     Set the close-on-exec flag for the file to the low-order bit of arg.
      F_GETFD:     Return the close-on-exec flag for the file as the value of
                   the system call.
      F_SETFL:     Set the status flags for this file to the value of arg.
      F_GETFL:     Return the status flags for this file as the value of the system call.
      F_GETOWN:    Return the value of the process ID or the process group ID.
      F_SETOWN:    Set the process ID or process group ID.
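
Two of these commands are shown in the sketch below: setting the close-on-exec flag with F_GETFD and F_SETFD, and duplicating a descriptor with F_DUPFD. The helper names are hypothetical. Reading the flags before setting them is a common precaution so that any other descriptor flags are preserved.

```c
#include <assert.h>
#include <fcntl.h>

/* Turn on the close-on-exec flag for `fd`. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Duplicate `fd` onto the lowest free descriptor that is >= min_fd.
 * Returns the new descriptor, or -1 on error. */
int dup_at_least(int fd, int min_fd)
{
    assert(min_fd >= 0);
    return fcntl(fd, F_DUPFD, min_fd);
}
```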

ioctl System call
The ioctl system call is also used to change the behaviour of an open file.

#include <sys/ioctl.h>
int ioctl(int filedes, unsigned long request, char *arg);

This system call performs a variety of control functions on terminals, devices and sockets.
The main difference between fcntl and ioctl is that the former is intended for any open file, while the
latter is intended for device-specific operations.
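
One example of a request is FIONREAD, which asks how many bytes are waiting to be read. The sketch below assumes a Linux system, where this request is available through sys/ioctl.h and works on pipes, sockets and terminals; the helper names are hypothetical.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Number of bytes that can be read from `fd` right now, or -1 on error. */
int pending_bytes(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}

/* Read only what is already waiting on `fd`, at most `size` bytes.
 * Returns the number of bytes read, 0 if nothing is pending, or -1 on error. */
ssize_t read_pending(int fd, char *buf, size_t size)
{
    assert(buf != NULL);
    int n = pending_bytes(fd);
    if (n <= 0)
        return n;
    if ((size_t)n > size)
        n = (int)size;
    return read(fd, buf, (size_t)n);
}
```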


Shared by: Chandra Sekhar