
IRIX Advanced Site and Server Administration Guide


Chapter 8
File System Administration

The file system is the structure by which files and directories are organized in the IRIX system. The file system is also the logical layout that is placed on the hard disk to allow the CPU easier access to data. It is extremely important to maintain file systems properly, in addition to backing up the data they contain. Failure to do so might result in loss of valuable system and user information.

This chapter contains:

Even if you are familiar with the basic concepts of the UNIX file system, you should read through the overview in the next section. The IRIX Extent File System is slightly different internally from other UNIX file systems.


IRIX File System Overview

The basic IRIX file system contains an enhancement to the standard UNIX file system called extents, and thus is called the IRIX Extent File System (EFS). Extents, and the various file systems available with IRIX, are described in the next sections.

Basic File System Parameters

The following are some basic parameters of the Extent File System, and a list of what you can and can't do with a file system:

Kinds of File Systems

Several types of file systems are available with the IRIX system:

Extent File System (EFS)

This section describes the IRIX EFS:

Floppy and CD File Systems

IRIX allows you to mount and use file systems on floppy disks and on CD-ROM drives. You can use these file systems on your own system, or you can export them via NFS for use on other systems (if you have NFS installed). See the NFS Administration Guide for more information on exporting file systems.

The operating instructions for these kinds of file systems are very similar and are covered in detail in the mediad(1M) reference page.

IRIX supports the following CD and floppy disk file system formats:

Floppy Disk File Systems

File systems on floppy disks are controlled by the mediad(1M) daemon. mediad monitors a given floppy drive, waiting for a disk to be inserted. If your system is running the objectserver, floppy disks are mounted on /floppy if the disk is in FAT (MS-DOS) or HFS (Macintosh) format. If you have more than one floppy drive, additional disks in additional drives are automatically mounted on /floppy2, /floppy3, and so on. If your system is not running the objectserver, you must provide a location for the mount point. See the mediad(1M) reference page for complete information. When you are through using the floppy file system, issue the eject(1) command, and mediad will attempt to unmount the file system. If the unmount is successful, it ejects the floppy immediately. Note that only one instance of mediad is allowed per system. Two invocations of mediad with the same floppy parameter generate an error.

To specify a particular floppy drive, use the appropriate device special file in /dev/rdsk. High density diskettes can be accessed by using floppy devices with the hi suffix on the device special file name.

If you are not running the objectserver and you wish to start mediad for a high density floppy drive with SCSI identifier 7 and the mount point /floppy, use this command:

mediad -ip /dev/rdsk/fds0d7.3.5hi /floppy

You must give instructions for the floppy device to be monitored and a location to mount the file system. You must also have created the mount point and ensured that the directory permissions are set appropriately (777 is usually required for read-write file systems).
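
The mount point preparation described above can be sketched as a small shell function. The name make_mount_point and its default mode are illustrative assumptions, not part of IRIX or mediad; the modes themselves (777 for read-write, 755 for read-only) are the ones this chapter suggests.

```shell
# Hypothetical helper: create a mount point directory and set its mode.
# Usage: make_mount_point DIRECTORY [MODE]
make_mount_point() {
    dir=$1
    mode=${2:-755}                      # assumed default; pass 777 for read-write media
    # Create the directory only if it does not already exist.
    [ -d "$dir" ] || mkdir -p "$dir" || return 1
    chmod "$mode" "$dir"
}
```

For example, make_mount_point /floppy 777 would prepare the mount point used in the mediad command above.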

CD-ROM File Systems

mediad(1M) also monitors CD-ROM drives, waiting for a disk to be inserted. When a disk is inserted, the file system it contains is mounted if the file system is in EFS, HFS, ISO 9660, or High Sierra format. If your system is running the objectserver, the CD-ROM drives are monitored automatically and when a CD containing a valid file system is inserted, it is automatically mounted on /CDROM (for the first CD-ROM drive), and /CDROM2, /CDROM3, and so on for additional drives.

If you are not running the objectserver when you invoke mediad, you must give instructions for the SCSI device to be monitored and a location to mount the CD-ROM file system. You must also have created the mount point and ensured that the directory permissions are set appropriately (755 is usually adequate for read-only file systems). If you are not running the objectserver and you wish to start mediad for a CD-ROM drive with SCSI identifier 4 and the mount point /cdrom with the mount option ro (for read-only), issue this command:

mediad -o ro -ip /dev/scsi/sc0d4l0 /cdrom

Note that CD-ROM file systems are always read-only. When you are finished using the file system, issue the eject command, and mediad will attempt to unmount the file system. If the unmount is successful, it ejects the CD. When mediad is running, however, any user can unmount and eject a CD with the eject command. Only one instance of mediad is allowed per system.


Maintaining File Systems

To administer file systems, you need to do the following:

The first three tasks are described in this chapter. Information about backing up file systems is found in Chapter 6, "Backing Up and Restoring Files."

Shell Scripts for File System Administration

Many routine administration jobs can be performed by shell scripts. Here are a few ideas:

All of these scripts can be run automatically by cron(1M) and the output sent to you using electronic mail. Typically, these scripts use some combination of find(1), du(1M), Mail(1), and shell commands.

The process accounting system performs many similar functions. If the process accounting system does not meet your needs, examine the scripts in /usr/lib/acct, such as ckpacct and remove, for ideas about how to build your own administration scripts.

Checking Free Space and Free Inodes

You can quickly check the amount of free space and free inodes with the df(1) command. For example, the command:

df 

produces the following output:

File              Type blocks  use     avail %use  Mounted on 
/dev/root         efs  31464   25238   6226  80%   / 
/dev/usr          efs  491832  452902  38930 92%   /usr 
ralph.cbs:/ralph  nfs  463360  409088  54272 88%   /usr/ralph 

To determine the number of free inodes, use this command:

df -i 

You see a listing similar to the one above, except that it also lists the number of inodes in use, the number of inodes that are free (available), and the percentage of inodes in use.

When a file system is more than about 90-95% full, system performance may degrade, depending on the size of the disk. Therefore, you should monitor the amount of available space and take steps to keep an adequate amount available. The effect of a given percentage of use is not the same on every disk: a larger disk that is 97% full is not burdened as heavily as a smaller disk that is also 97% full.
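
Monitoring of this kind can be scripted. The following is a minimal sketch, assuming df prints the column layout shown in the sample output above (%use in the sixth field, mount point in the seventh) and that your awk accepts the -v option. The function name over_threshold is hypothetical.

```shell
# Hypothetical helper: read df-style output on standard input and print
# the mount point and %use of every file system above a given limit.
# Usage: df | over_threshold PERCENT
over_threshold() {
    limit=$1
    awk -v limit="$limit" 'NR > 1 {        # skip the header line
        pct = $6
        sub(/%/, "", pct)                  # "92%" becomes "92"
        if (pct + 0 > limit) print $7, $6
    }'
}
```

Run as df | over_threshold 90, this would list only the file systems that are more than 90% full. A script like this could be run from cron and its output mailed to the administrator.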

If it is not possible to significantly reduce the amount of disk space used, and more space is needed for a particular file system, you can change the size of the file system. If you cannot add an additional disk drive and you need to adjust the size of your file systems, you must back them up, remove them, and then remake them using mkfs(1M). If you can add another hard disk to the system, you can grow existing file systems onto the new disk using a logical volume and growfs(1M). For more information on logical volumes, see "Logical Volumes and Disk Striping".

Why Free Space Decreases

The amount of free space on a file system decreases over time for the following reasons:

Monitoring Key Files and Directories

Almost any system that is used daily has several key files and directories that grow through normal use. Some examples are shown in Table 8-1.

Table 8-1 : Files and Directories That Tend to Grow

File              Use
/etc/wtmp         history of system logins
/var/adm/sulog    history of su commands
/var/cron/log     history of actions of cron
/tmp              directory for temporary files (root file system)
/var/tmp          directory for temporary files
/usr/tmp.O        directory for temporary files

The frequency with which you should check growing files depends on how active your system is and how critical the disk space problem is. A good technique for keeping them down to a reasonable size uses a combination of the tail(1) and mv(1) commands:

tail -50 /var/adm/sulog > /var/tmp/sulog 
mv /var/tmp/sulog /var/adm/sulog 

This sequence puts the last 50 lines of /var/adm/sulog into a temporary file, then moves the temporary file to /var/adm/sulog. This reduces the file to the 50 most recent entries. It is often useful to have these commands performed automatically every week using cron(1M). For more information on using cron to automate your regular tasks, see "Automating Tasks with at(1), batch(1), and cron(1M)".
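
The same technique can be wrapped in a reusable function. This is a sketch only: the name trim_log, the TMPDIR fallback, and the default of 50 lines are assumptions. Unlike the mv sequence above, it copies the trimmed contents back with cat, so the log file keeps its original owner and permissions.

```shell
# Hypothetical helper: keep only the last N lines of a log file.
# Usage: trim_log FILE [LINES]
trim_log() {
    log=$1
    keep=${2:-50}
    tmp=${TMPDIR:-/var/tmp}/trimlog.$$      # scratch file named after the shell's PID
    # Save the tail of the log, then overwrite the log in place.
    tail -n "$keep" "$log" > "$tmp" && cat "$tmp" > "$log"
    rm -f "$tmp"
}
```

For example, trim_log /var/adm/sulog 50 has the same effect on the file contents as the two commands shown above.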

Cleaning Out Temporary Directories

The directory /tmp and all of its subdirectories are automatically cleaned out every time the system is rebooted. You can control whether or not this happens with the chkconfig option nocleantmp. By default, nocleantmp is off, and thus /tmp is cleaned.

The directory /var/tmp is not automatically cleaned out when the system is rebooted. This is a fairly standard practice on IRIX systems. If you wish, you can configure IRIX to automatically clean out /var/tmp whenever the system is rebooted. Changing this standard policy is a fairly extreme measure, and many people expect that files left in /var/tmp are not removed when the system is rebooted. The same rules apply to /usr/tmp.O.

If you must change the policy, this is how to do it:

    Notify everyone who uses the system that you are changing the standard policy regarding /var/tmp, and that all files left in /var/tmp will be removed when the system is rebooted. Send electronic mail and post a message in the /etc/motd file.

    Give the users at least one week's notice, and longer if possible.

    Edit the file /etc/init.d/RMTMPFILES.

    Find a block of commands in the file that looks something like this:

    # make /var/tmp exist
    if [ ! -d /var/tmp ]
    then
            rm -f /var/tmp # remove the directory
            mkdir /var/tmp
    fi

    Remove the fi statement, then add the following lines:

    else 
            # clean out /var/tmp 
            rm -f /var/tmp/* 
    fi 

    The complete block of commands should look something like this:

    # make /var/tmp exist
    if [ ! -d /var/tmp ]
    then
            rm -f /var/tmp # remove the directory
            mkdir /var/tmp
    else
            # clean out /var/tmp
            rm -f /var/tmp/*
    fi

    Save and exit the file.

Do not make this change without warning users well in advance. You can also automate this task by using the find(1) command to find files in the temporary directories that have not been accessed for more than 7 days and remove them. Use the following commands:

find /var/tmp -atime +7 -exec rm {} \; 
find /tmp -atime +7 -exec rm {} \; 

See "cron(1M) Command" for information on using the cron command to automate the process.
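
A sketch of how the cleanup could be packaged for use from cron follows. The function name clean_old is hypothetical. Note two choices in the sketch: the plus sign in -atime +N selects files last accessed more than N days ago (without it, find matches only files accessed exactly N days ago), and -type f restricts removal to regular files so that directories are left alone.

```shell
# Hypothetical helper: remove regular files under a directory that have
# not been accessed for more than a given number of days.
# Usage: clean_old DIRECTORY [DAYS]
clean_old() {
    dir=$1
    days=${2:-7}
    find "$dir" -type f -atime +"$days" -exec rm -f {} \;
}
```

For example, clean_old /var/tmp 7 removes files in /var/tmp that have gone unread for more than a week.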

Tracking Disk Use

Part of the job of cleaning up file systems is locating and removing files that have not been used recently. The find(1) command can locate files that have not been accessed recently.

The find program searches for files, starting at a directory named on the command line. It looks for files that match whatever criteria you wish, for example all regular files, all files that end in ".trash", or any file older than a particular date. When it finds a file that matches the criteria, it performs whatever task you specify, such as removing the file, printing the name of the file, changing the file's permissions, and so forth.

For example:

find /usr -type f -mtime +60 -print > /usr/tmp/deadfiles & 

In the above example:

/usr

specifies the pathname where find is to start.

-type f

tells find to look only for regular files and to ignore special files, directories, and pipes.

-mtime +60

says you are interested only in files that have not been modified in 60 days.

-print

means that when a file is found that matches the -type and -mtime expressions, you want the pathname to be printed.

> /usr/tmp/deadfiles &

directs the output to the temporary file /usr/tmp/deadfiles and runs the command in the background. Redirecting the results of the search to a file is a good idea if you expect a large amount of output.

Identifying Large Space Users

Four commands are useful for tracking down accounts that use large amounts of space: du(1), find(1), quot(1M), and diskusg(1M).

du displays disk use, in blocks, for files and directories. For example:

du /usr 

This displays the block count for all directories in the usr file system.

The du command displays disk use in 512-byte blocks. To display disk use in 1024-byte blocks, use the -k option. For example:

du -k /usr/people/ralph 

The -s option produces a summary of the disk use in a particular directory. For example:

du -s /usr/people/alice 

For a complete description of du and its options, see the du(1M) reference page.
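
The -s and -k options can be combined with sort to rank directories by size. The sketch below assumes that each entry under the given directory is worth summarizing (for example, one home directory per user); the function name biggest_dirs is hypothetical.

```shell
# Hypothetical helper: list the largest entries under a directory,
# biggest first, with sizes in 1024-byte blocks.
# Usage: biggest_dirs DIRECTORY [COUNT]
biggest_dirs() {
    dir=$1
    count=${2:-10}
    # du -s gives one summary line per entry; sort -nr orders by size.
    du -sk "$dir"/* | sort -nr | head -n "$count"
}
```

For example, biggest_dirs /usr/people 5 would show the five largest home directories.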

Use find to locate specific files that exceed a given size limit. For example:

find /usr -size +10000 -print 

This example produces a display of the pathnames of all files (and directories) in the /usr file system that are larger than ten thousand 512-byte blocks (about 5 MB).

quot reports the amount of disk usage per user on the file system. You can use the output of this command to inform your users of their disk space usage.

diskusg(1M) is part of the process accounting subsystem that serves the same purpose as quot. diskusg, though, is typically used as part of general system accounting. This utility generates disk usage information on a per-user basis. diskusg prints one line for each user identified in the /etc/passwd file. Each line contains the user's UID number and login name, and the total number of 512-byte blocks of disk space currently being used by the account. The command:

/usr/lib/acct/diskusg /dev/usr 

produces output in the following format:

UID login_name number_of_blocks 

The output of diskusg is normally the input to acctdisk (see the acct(1M) reference page), which generates total disk accounting records that can be merged with other accounting records. For more information on the accounting subsystem, consult the acct(4) reference page or Chapter 2, "Operating the System," in this Guide.
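
Because each output line has the block count in its third field, the heaviest users can be picked out with standard tools. The following is a sketch under that assumption; the function name top_users is hypothetical.

```shell
# Hypothetical helper: read "UID login_name number_of_blocks" lines on
# standard input and print the users with the most blocks first.
# Usage: /usr/lib/acct/diskusg /dev/usr | top_users [COUNT]
top_users() {
    # Sort numerically on the third field, in descending order.
    sort -k 3,3nr | head -n "${1:-10}"
}
```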

Imposing Disk Quotas

If your system is constantly short of disk space and you cannot increase the amount of available space, you may be forced to implement disk quotas. IRIX provides the quotas subsystem to automate this process. A limit can be set on the amount of space a user can occupy, and there may be a limit on the number of files (inodes) the user can own. This subsystem is described completely in the quotas(4) reference page. You can use this system to implement specific disk usage quotas for each user on your system. You may also choose to implement "hard" or "soft" quotas. (Hard quotas are enforced by the system; soft quotas merely remind the user to trim disk usage.)

With soft limits, whenever a user logs in with usage greater than the soft limit, the user is warned (via /bin/login(1)). When the user exceeds the soft limit, a timer is enabled. Any time usage drops below the soft limit, the timer is disabled. If the timer is enabled longer than a time period set by the administrator, the particular limit that has been exceeded is treated as if the hard limit has been reached, and no more resources are allocated to the user. The only way to reset this condition is to reduce usage below the quota. Only root may set the time limits, and this is done on a per-file-system basis.
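
The interaction of the soft limit, the hard limit, and the timer can be modeled in a few lines. This is an illustrative model of the rules described above, not the quotas subsystem's actual implementation; the function name quota_state, its argument order, and the use of days as the time unit are all assumptions. A limit of 0 is taken to mean that no limit is set.

```shell
# Hypothetical model of the quota rules: prints ok, warn, or deny.
# Usage: quota_state USAGE SOFT HARD DAYS_OVER_SOFT GRACE_DAYS
quota_state() {
    usage=$1; soft=$2; hard=$3; over_days=$4; grace_days=$5
    if [ "$hard" -gt 0 ] && [ "$usage" -ge "$hard" ]; then
        echo deny                       # hard limit reached
    elif [ "$soft" -gt 0 ] && [ "$usage" -gt "$soft" ]; then
        if [ "$over_days" -gt "$grace_days" ]; then
            echo deny                   # timer expired: treated as a hard limit
        else
            echo warn                   # over the soft limit, timer still running
        fi
    else
        echo ok                         # under all limits, timer disabled
    fi
}
```

For example, a user at 150 units with a soft limit of 100, no hard limit, and a 7-day grace period is warned on day 3 and denied further allocation on day 8.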

Several options are available with the quotas subsystem. You can impose limits on some users and not others, some file systems and not others, and on total disk usage per user, total number of files, or size of files. The system is completely configurable. You can also keep track of disk usage through the process accounting system provided under IRIX.

The importance of managing disk quotas carefully cannot be over-emphasized. It is strongly recommended that if disk quotas are imposed, they should be soft quotas, and every attempt should be made to otherwise rectify the situation before removing someone's files. Before using the quotas(4) subsystem to enforce disk usage, carefully read the material on disk quotas in Chapter 3, "User Services," in this Guide.

The following steps impose soft disk quotas:

    Log in as root.

    To enable the quotas subsystem, give the commands:

    chkconfig quotas on 
    chkconfig quotacheck on  

    Next, a file named quotas should be created in the root directory of each file system that is to have a disk quota. This file should be zero length and should be writable only by root. The command

    touch quotas 

    issued as root in the root directory of the file system creates an appropriate file.

    Once the quotas files are present in each file system's root directory, you should then establish the quota amounts for individual users. The edquota(1M) command can be used to set the limits desired upon each user. For example, to set a limit of 100MB and 100 inodes on the user ID sedgwick, give the following command:

    edquota sedgwick 

    The screen clears, and you are placed in the vi(1) editor to edit the user's disk quota. You see:

    fs /  kbytes(soft=0, hard=0)  inodes(soft=0, hard=0)

    The file system appears first, in this case the root file system (/). The numeric values for disk space are in kilobytes, not megabytes, so to specify 100 megabytes, you must multiply the number by 1024. The number of inodes should be entered directly. Edit the line to appear as follows:

    fs / kbytes(soft=102400, hard=0)  inodes(soft=100, hard=0)

    Save the file and quit the editor when you have entered the correct values. If you leave the value at 0, no limit is imposed. Since we are setting only soft limits in this example, the hard values have not been set.

    Where a number of users are to be given the same quotas (a common occurrence) the -p option to edquota will allow this to be accomplished easily. Unless explicitly given a quota, users have no limits set on the amount of disk they can use or the number of files they can create.

    Once the quotas are set, issue the quotaon(1M) command. For quotas to be accurate, this command should be issued on a local file system immediately after the file system has been mounted. The quotaon command enables quotas for a particular file system, or with the -a option, enables quotas for all file systems indicated in /etc/fstab as using quotas. See the fstab(4) reference page for complete details on the /etc/fstab file.

    Quotas will be automatically enabled at boot time in the future. The script /etc/init.d/quotas handles enabling of quotas and uses the chkconfig(1M) command to check the quotas configuration flag to decide whether or not to enable quotas.

    If you need to turn quotas off, use the quotaoff(1M) command.

    Periodically, the records retained in the quota file should be checked for consistency with the actual number of blocks and files allocated to the user. Use the quotacheck(1M) command to verify compliance. It is not necessary to unmount the file system or disable the quota system to run this command, though on active file systems, slightly inaccurate results may be seen. This command is run automatically at boot time by the /etc/init.d/quotas script if the quotacheck flag has been turned on with chkconfig(1M). quotacheck(1M) can take a considerable amount of time to execute, so it is convenient to have it done at boot time.
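
Because edquota expects disk space in kilobytes, it can help to compute the value before starting the editor. A minimal sketch of the arithmetic follows; the function name mb_to_kbytes is hypothetical, and expr is used so that the sketch does not depend on shell arithmetic expansion.

```shell
# Hypothetical helper: convert megabytes to the kilobyte value
# that is typed into the edquota entry.
# Usage: mb_to_kbytes MEGABYTES
mb_to_kbytes() {
    expr "$1" \* 1024
}
```

For the 100 MB limit used in the steps above, mb_to_kbytes 100 prints 102400, the soft value entered for the user sedgwick.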

Making New File Systems

Use the mkfs(1M) command to make file systems on formatted disks. For information on how to format and partition disks, see "Formatting Disks Using fx" and "Repartitioning a Hard Disk".

Here are a few definitions to help you:

Partition

A section of disk, normally associated with a file system. A file system is made on partition boundaries, so that no file system overlaps another. One way to picture this is to think of a disk as a cake and a partition as a slice of the cake. Here is a list of the standard partitions on your system:

0: root partition
1: swap partition
6: usr partition
7: entire usable disk partition
8: volume header
10: the entire usable disk + volume header (7 + 8)

Raw Device

The raw device accesses data on a character by character basis.

Block Device

The block device accesses data in blocks which come from a system buffer cache.

With this general terminology, we can explain more thoroughly how to create a file system. The following steps show how to make a new file system on an option disk that is already formatted and contains a single partition:

    Log in as root.

    Back up all data on the system. You can use either file system backup utilities, such as the backup tool in the System Manager, or file-oriented utilities, such as tar(1).

    It is always prudent to back up your data whenever you manipulate your hardware and file systems. If you perform this operation on your system disk, it will destroy all data resident in your /usr partition. For this reason, the example assumes you are making the file system on a previously unused option disk.

    Before you proceed, verify your backups to be sure they are good.

    Shut down the system and turn it off. Install the new hard disk and configure it. See the reference information provided with the disk for instructions about switch settings and cabling.

    After the disk is installed, bring the system up to single-user mode.

    Verify that the disk is formatted and contains the correct partitions with the prtvtoc(1M) command. If the disk is not already formatted, or you need to adjust existing partitions, see "Repartitioning a Hard Disk" in this guide.

    Use the mkfs command to make a new file system on the new disk. For example:

    mkfs /dev/rdsk/dks0d2s7 

    This example constructs a file system on the second disk (d2) attached to the primary SCSI controller (0), and uses the entire usable portion of the disk (s7). You can use either the block interface (dsk) or the character interface (rdsk) to the disk. The character interface is faster.

    In the above example, mkfs uses default values for the file system parameters. These parameters are:

    To determine these parameters, mkfs examines the device's volume header and uses that information to calculate the minimum number of inodes and the various alignment boundaries. If you want to use parameters other than the default, you can specify these on the mkfs command line. See the mkfs(1M) reference page for information about using command line parameters and proto files.

    For information about specific disk driver interfaces, see the appropriate reference page, including dks(7M) for SCSI, or ipi(7M).

    Create the mount point for the device. This is a directory on the system where the new file system will be mounted. For example:

    mkdir /rsrch 

    The mount point can be anywhere on the system. Mount the file system with the mount(1M) command:

    mount /dev/dsk/dks0d2s7 /rsrch 

    Most systems are configured so that file systems are automatically mounted when the system is booted up. Automatic file system mounting is done in the file /etc/rc2, which is executed when the system comes up to multiuser mode.

    If you do not want to mount file systems when the system is booted, skip this next step.

    Add an entry in the file /etc/fstab for each new file system. For example:

    /dev/dsk/dks0d2s7 /rsrch efs rw,raw=/dev/rdsk/dks0d2s7 0 0 

    If you already mounted the file system, as described in the previous step, you can use the mount command to determine the appropriate /etc/fstab entry. For example:

    mount -p 

    displays all currently mounted file systems, including the new file system. Copy the line that describes the new file system, in this case /rsrch, to /etc/fstab.

    See the fstab(4) reference page for more information about fstab entries.

    You are finished making file systems. Reboot your system and bring it up to multiuser mode and make sure the new file systems are mounted.

    If you want to make more than one file system, use mkfs to create a new file system on each disk partition.
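
The command sequence in the steps above can be collected into a dry-run sketch. Because mkfs destroys any data on the partition, this hypothetical function (new_fs_commands is not an IRIX utility) only prints the commands for review; it does not run them.

```shell
# Hypothetical dry-run helper: print the commands that make and mount
# a new file system, without executing them.
# Usage: new_fs_commands PARTITION MOUNT_POINT
new_fs_commands() {
    part=$1                             # partition name, for example dks0d2s7
    mnt=$2                              # mount point, for example /rsrch
    echo "mkfs /dev/rdsk/$part"         # raw (character) interface for mkfs
    echo "mkdir $mnt"
    echo "mount /dev/dsk/$part $mnt"    # block interface for mount
}
```

Running new_fs_commands dks0d2s7 /rsrch prints the three commands used in the example above, which can be checked before they are typed as root.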

Changing File System Size

There are three ways to change the size of file systems:

    Add a new disk and mount it as a directory on an existing file system.

    Change the size of the existing file systems by removing space from one partition and adding it to another. To do this, you must back up your existing data, run fx(1M) to repartition the disk, then remake both file systems with mkfs.

    Add another hard disk and grow an existing file system onto that disk with growfs(1M).

In the first scenario, you simply add a new disk with a separate file system and create a new mount point for it within your file system. This is generally considered the safest and best way to add space. For example, if your /usr file system is short of space, add a new disk and mount the new file system on a directory called /usr/work. Once again see "Formatting Disks Using fx" for full information on formatting and partitioning hard disks and making file systems on those disks. Then use the instructions in "Mounting and Unmounting File Systems" to mount your file system.

The second method, running fx and mkfs, has serious drawbacks. It is a great deal of work, and has certain risks. For example, to increase the size of a file system, you must remove space from other file systems. You must be sure that when you are finished changing the size of your file systems, your old data still fits on all the new, smaller file systems. Also, resizing your file systems may at best be a stop-gap measure until you can acquire additional disk space. This procedure is documented in "Formatting Disks Using fx".

In particular, you should not change the size of the root partition by remaking your file systems or by using a logical volume. There are better solutions to space problems on the root file system. Alternate solutions to this sort of problem are presented in "Insufficient Space on the root File System".

Growing an existing file system onto an additional disk is another way to increase the available space in that file system. The growfs command preserves the existing data on the hard disk and adds space from the new disk on a logical volume. This process is simpler than completely remaking your file systems. The one drawback to growing a file system across disks is that if one disk fails, you cannot recover data from the other disk, even if the other disk still works. If your /usr file system is a logical volume and one of its disks fails, you will be unable to boot the system into multiuser mode. For this reason, it is preferable, if possible, to mount an additional disk and file system as a directory on /usr or the root file system.

The following steps show how to grow a fictional /work file system onto a logical volume.

    Log in as root and print out the following reference pages for use later in the process:


    Back up all data on the system. This is a prudent step to take whenever you are manipulating your hardware and file systems. Verify your backups to make sure they are good before you proceed.

    Shut down the system. Install the new hard disk and configure it. (If the disk was not purchased from Silicon Graphics, it may need to be formatted with fx(1M) for use with IRIX.)

    Bring up the system to multiuser mode.

    Establish a logical volume that spans the /work file system on the existing disk and the partition on the new disk by editing the file /etc/lvtab. Place an entry in the file for the logical volume. The entry should look something like this:

    lv0:First Logical Volume:devs=/dev/dsk/dks0d2s7, \ 
    /dev/dsk/dks0d3s7 

    An lvtab entry is made up of several fields. Each field is separated by a colon. In the above example:

    This example shows a logical volume composed of two disk partitions, but it could be made up of several partitions. The only limit is the maximum size of a file system, 8 GB.

    Change the entry for /work in the file /etc/fstab to read:

    /dev/dsk/lv0 /work efs rw,raw=/dev/rdsk/lv0 0 0 

    Shut down the system to single-user mode and make sure the /work file system is unmounted.

    Run the mklv(1M) command to create the logical volume:

    mklv -f lv0 

    In this example, the argument lv0 is the volume device name, which is the first field in the lvtab entry for this logical volume.

    After you set up the logical volume, you can grow the file system into the logical volume. Enter the growfs command:

    growfs /dev/rdsk/lv0 

    If the file system needs to be cleaned, growfs prints an error message and stops.

    Even if growfs did not print any errors, it is advisable to run fsck on the expanded file system:

    fsck /dev/rdsk/lv0 

    You are finished growing the file system. Reboot the system and verify that all data is intact.

For more information about setting up logical volumes, including information about striping, see "Logical Volumes and Disk Striping" in this guide. Other useful information is contained in the lvinit(1M), lvck(1M), mklv(1M), and lv(7M) reference pages. For information about the format of the lvtab file see the lvtab(4) reference page.
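
The lvtab entry used in the steps above can be assembled from its colon-separated fields. The following sketch assumes the three-field form shown in the example (volume device name, descriptive label, and a comma-separated devs= list) and writes the entry on a single line; the function name lvtab_entry is hypothetical. Check the lvtab(4) reference page before relying on the exact format.

```shell
# Hypothetical helper: build an lvtab entry from a volume device name,
# a label, and one or more block device pathnames.
# Usage: lvtab_entry NAME LABEL DEVICE...
lvtab_entry() {
    name=$1; label=$2
    shift 2
    devs=
    for dev in "$@"; do
        # Join the device pathnames with ", " as in the example entry.
        if [ -z "$devs" ]; then devs=$dev; else devs="$devs, $dev"; fi
    done
    printf '%s:%s:devs=%s\n' "$name" "$label" "$devs"
}
```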

Naming a File System

When you create a file system with mkfs, the program assigns a name to the file system. However, the name that mkfs assigns is not always as memorable or easy to type as one might like. Therefore, you probably want to create a more mnemonic name for the file system. An IRIX file system is generally named after the highest-level directory in its hierarchy.

Use the mknod(1M) command to create a special device node in /dev for the file system, if you have not already done so.

Find the device numbers for the specific file system. For example:

ls -l /dev/dsk/ips0d1s6 
brw------- 2 root sys 22, 32 Aug 23 17:08 ips0d1s6 

In this example, the /usr file system is on an ESDI disk (ips) attached to controller 0, it is on the first disk (d1), and it is partition 6 (s6). See the appropriate reference page for your disk, for example, dks(7M).
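
The naming pattern described here (driver name, controller number, d plus disk number, s plus partition number) can be taken apart mechanically. This sketch handles only names of that exact form; the function name parse_dev is hypothetical, and names that do not fit the pattern produce no output.

```shell
# Hypothetical helper: split a disk device name such as ips0d1s6
# into its driver, controller, disk, and partition parts.
# Usage: parse_dev NAME
parse_dev() {
    printf '%s\n' "$1" |
        sed -n 's/^\([a-z]*\)\([0-9]*\)d\([0-9]*\)s\([0-9]*\)$/driver=\1 controller=\2 disk=\3 partition=\4/p'
}
```

For example, parse_dev ips0d1s6 prints driver=ips controller=0 disk=1 partition=6.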

Mounting and Unmounting File Systems

To use a file system, it must be mounted. The root and /usr file systems are always mounted as part of the boot procedure. This is done in the script /etc/rc2.

You can mount file systems several ways:

Note that if you mount a file system over an existing subdirectory, the original subdirectory and its subtrees will be hidden until the file system is unmounted.

Mounting a File System Manually

To mount a file system manually, use this command:

mount /dev/dsk/dks0d1s6 /usr 

If you have a shorthand name for the file system:

mount /dev/usr /usr

Mounting a File System on Boot Up

To mount a file system every time the computer is rebooted, place a line similar to this in the file /etc/fstab:

/dev/usr /usr efs rw,raw=/dev/rusr 0 0 

This line indicates a file system /dev/usr should be mounted on /usr. It is an IRIX EFS, and it should be mounted read/write so that anyone who has permission can write files in the file system. The fstab entry also specifies the name of the raw device, in this case /dev/rusr.

The last two numbers refer to the frequency in days that the file system should be dumped with the dump program and the fsck parallel pass number, respectively. For more information on dump, see "Backing Up File Systems" and for more information on fsck parallel passes, see "Further fsck Options".

The following details the component fields of each entry in the /etc/fstab file. Here is another sample entry:

/dev/dsk/dks0d2s7 /test efs rw,raw=/dev/rdsk/dks0d2s7 0 0

The fields in your new line are defined as follows:

/dev/dsk/dks0d2s7


The block device where the file system is located.

/test

The name of the directory where the file system will be mounted.

efs

The type of file system. In this case, we are using an Extent file system. See the reference page on efs for complete information.

rw,raw=/dev/rdsk/dks0d2s7

These are some of many options available when mounting a file system. In this instance, we are asking that the file system be mounted read-write, so that root and other users can write to it, and we are naming the raw device for the file system with raw=. The raw= option should be the last option in the options list. See the fstab reference page for all the options available.

0 0

These two numbers represent the frequency of the dump cycle and the fsck pass priority. They must be added after the last option in the options list (raw=). The fstab reference page contains additional information.

The machine will mount our new file system on /test every time the machine boots into multi-user mode.
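As an illustration of the field layout just described, the following Python sketch splits one /etc/fstab line into its six fields and breaks the comma-separated option list into a dictionary. The function name parse_fstab_entry and the dictionary keys are hypothetical; the sketch assumes whitespace-separated fields and does not handle comments or blank lines.

```python
def parse_fstab_entry(line):
    """Split one fstab line into named fields (illustrative sketch only)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}")
    device, mount_point, fs_type, options, dump_freq, fsck_pass = fields

    # Options are comma-separated; some carry a value after '=' (for example raw=).
    parsed_options = {}
    for option in options.split(","):
        key, sep, value = option.partition("=")
        parsed_options[key] = value if sep else True

    return {
        "device": device,
        "mount_point": mount_point,
        "type": fs_type,
        "options": parsed_options,
        "dump_frequency": int(dump_freq),
        "fsck_pass": int(fsck_pass),
    }


entry = parse_fstab_entry("/dev/dsk/dks0d2s7 /test efs rw,raw=/dev/rdsk/dks0d2s7 0 0")
print(entry["mount_point"], entry["options"])
```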

Mounting a File System Automatically

If you have the optional NFS software, you can automatically mount any remote file system whenever it is accessed (for example, by changing directories to the file system with cd). The remote file system must be exported with the exportfs(1M) command.

For complete information about setting up automounting, including all the available options, see the automount(1M) and exportfs(1M) reference pages. These utilities are discussed more completely in the NFS and NIS Administration Guide.

Unmounting a File System

To unmount a file system, use the umount(1M) command:

umount /dev/usr 

You should always unmount a file system before running fsck on it.

Checking File Systems with fsck

This section provides a quick overview of the steps for using fsck(1M). "Repairing Problems with fsck" provides more detailed information about using fsck.

File systems are usually checked for data integrity whenever the system is booted. You should also check the integrity of a file system before you make a complete backup of it; otherwise, you run the risk of backing up an inconsistent file system.

The fsck command checks file system consistency. To check a single file system, for example, prior to performing an image backup with the System Manager, follow these steps:

    Log in as root.

    If the file system you want to check is in use by the running system (for example, /usr), shut down the system to single-user mode.

    To check a file system, the file system must be unmounted. However, you cannot unmount a file system if it is ``busy.'' A file system is busy if any files are open or active in that file system, or if a user's current working directory is a subdirectory of that file system.

    For example, many daemons, such as /usr/lib/lpsched, /usr/etc/ypbind, and /usr/etc/syslogd, execute from the /usr file system. The simplest way to make sure the file system is not busy is to bring the system down to single-user mode.

    If you do not bring the system to single-user mode, unmount the file system with the umount(1M) command. For example, to unmount the /usr file system:

    umount /dev/usr 

    Run fsck:

    fsck /dev/rusr 

    As fsck runs, it proceeds through a series of steps, or phases. An error-free check produces output like this:

    fsck: Checking /dev/usr 
    ** Phase 1 - Check Blocks and Sizes
    ** Phase 2 - Check Pathnames
    ** Phase 3 - Check Connectivity
    ** Phase 4 - Check Reference Counts
    ** Phase 5 - Check Free List
    7280 files 491832 blocks 38930 free

    If there are no errors, you are finished checking the file system.

    Mount the file system using mount(1M). For example:

    mount /dev/usr /usr 

    If errors are detected in the file system, fsck displays an error message. If you encounter file system inconsistencies, proceed to "Repairing Problems with fsck".
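The summary line at the end of an error-free check is easy to process in a script. The following Python sketch parses a line of the form shown in the sample output and computes the percentage of free blocks. It assumes that the second number counts blocks in use and the third counts free blocks, so that their sum is the total; the function names are hypothetical.

```python
import re

# Matches the final summary line of the sample output, e.g.
# "7280 files 491832 blocks 38930 free"
SUMMARY = re.compile(r"^\s*(\d+) files (\d+) blocks (\d+) free\s*$")


def parse_summary(line):
    """Return the three counts from an fsck summary line."""
    match = SUMMARY.match(line)
    if match is None:
        raise ValueError(f"not an fsck summary line: {line!r}")
    files, blocks, free = (int(group) for group in match.groups())
    return {"files": files, "blocks": blocks, "free": free}


def percent_free(summary):
    """Free blocks as a percentage of used plus free blocks (assumed total)."""
    total = summary["blocks"] + summary["free"]
    return 100.0 * summary["free"] / total


summary = parse_summary("    7280 files 491832 blocks 38930 free")
print(summary, round(percent_free(summary), 1))
```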

If you cannot shut down the system and cannot unmount the file system, but you need to perform the check immediately, you can run fsck in ``no-write'' mode. The fsck program checks the file system, but makes no changes and does not repair inconsistencies.

For example, the following command invokes fsck in no-write mode:

fsck -n /dev/usr 

If any inconsistencies are found, they are not repaired. You must run fsck again without the -n flag to repair any problems. The benefit of this procedure is that you should be able to gauge the severity of the problems with your file system.

Further fsck Options

You may find it convenient to check multiple file systems at once. This is also known as parallel checking. The -m flag indicates parallel checking. Use the -m flag only when working from the /etc/fstab file.

The -q option runs fsck in quiet mode. Since the program does not prompt for information, this is essentially the same as using -y, though with less verbose output.

For a complete list of options, see the fsck(1M) reference page.

dfsck

The dfsck utility runs simultaneous (dual) checks on two file systems. This functionality is superseded by the -m option to fsck, but the program is included with IRIX for backward compatibility.

It is strongly recommended that you use the -m (multiple) option to fsck instead of dfsck. The -m option with fsck does a much better job of checking file systems, and support for dfsck may be eliminated in a future release of IRIX.

Repairing Problems with fsck

This section describes the messages that are produced by each phase of fsck, what they mean, and what you should do about each one. The following abbreviations are used in fsck error messages:

BLK

block number

DUP

duplicate block number

DIR

directory name

MTIME

time file was last modified

UNREF

unreferenced

The following sections use these single-letter abbreviations:

B

block number

F

file (or directory) name

I

inode number

M

file mode

O

user ID of a file's owner

S

file size

T

time file was last modified

X

link count, or number of BAD, DUP, or MISSING blocks, or number of files (depending on context)

Y

corrected link count number, or number of blocks in file system (depending on context)

Z

number of free blocks

In actual fsck output, these abbreviations are replaced by the appropriate numbers.

Initialization Phase

The command line syntax is checked. Before the file system check can be performed, fsck sets up some tables and opens some files. The fsck program terminates if there are initialization errors.

General Errors Phase

Two error messages may appear in any phase. Although fsck prompts for you to continue checking the file system, it is generally best to regard these errors as fatal. Stop the program and investigate what may have caused the problem.

CAN NOT READ: BLK B (CONTINUE?)


The request to read a specified block number B in the file system failed. This error indicates a serious problem, probably a hardware failure. Press n to stop fsck. Shut down the system to the System Maintenance Menu and run hardware diagnostics on the disk drive and controller.

CAN NOT WRITE: BLK B (CONTINUE?)


The request for writing a specified block number B in the file system failed. The disk may be write-protected or there may be a hardware problem. Press n to stop fsck. Check to make sure the disk is not set to ``read only.'' (Some, though not all, disks have this feature.) If the disk is not write protected, shut down the system to the System Maintenance Menu and run hardware diagnostics on the disk drive and controller.

Phase 1 Check Blocks and Sizes

This phase checks the inode list. It reports error conditions resulting from:

Phase 1 Types of Error Messages

Phase 1 has three types of error messages:

Phase 1 Meaning of Yes/No Responses

Table 8-2 explains the significance of responses to phase 1 prompts:

Table 8-2 : Meaning of fsck Phase 1 Responses

Prompt Response Meaning
CONTINUE? n Terminate the program.
CONTINUE? y Continue with the program. This error condition means that a complete check of the file system is not possible. A second run of fsck should be made to recheck this file system.
CLEAR? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other measures to fix the problem.
CLEAR? y Deallocate inode I by zeroing its contents. This may invoke the UNALLOCATED error condition in Phase 2 for each directory entry pointing to this inode.



Phase 1 Error Messages

UNKNOWN FILE TYPE I=I (CLEAR?)


The mode word of the inode I suggests that the inode is not a pipe, special character inode, regular inode, directory inode, symbolic link, or socket.

LINK COUNT TABLE OVERFLOW (CONTINUE?)


There is no more room in the internal fsck table that holds allocated inodes with a link count of zero.

B BAD I=I

Inode I contains block number B with a number lower than the number of the first data block in the file system or greater than the number of the last block in the file system. This error condition may invoke the EXCESSIVE BAD BLKS error condition in Phase 1 if inode I has too many block numbers outside the file system range. This error condition invokes the BAD/DUP error condition in Phase 2 and Phase 4.

EXCESSIVE BAD BLOCKS I=I (CONTINUE?)


There is more than a tolerable number (usually 50) of blocks with a number lower than the number of the first data block in the file system or greater than the number of the last block in the file system associated with inode I.

B DUP I=I

Inode I contains block number B, which is already claimed by another inode. This error condition may invoke the EXCESSIVE DUP BLKS error condition in Phase 1 if inode I has too many block numbers claimed by other inodes. This error condition invokes Phase 1B and the BAD/DUP error condition in Phase 2 and Phase 4.

EXCESSIVE DUP BLKS I=I (CONTINUE?)


There is more than a tolerable number (usually 50) of blocks claimed by other inodes.

DUP TABLE OVERFLOW (CONTINUE?)


There is no more room in the internal fsck table that holds duplicate block numbers.

PARTIALLY ALLOCATED INODE I=I (CLEAR?)


Inode I is neither allocated nor unallocated.

RIDICULOUS NUMBER OF EXTENTS (%d) (max allowed %d)


The number of extents is larger than the maximum the system can set and is therefore ridiculous.

ILLEGAL NUMBER OF INDIRECT EXTENTS (%d)


The number of extents or pointers to extents (indirect extents) exceeds the number of slots in the inode for describing extents.

BAD MAGIC IN EXTENT


The pointer to an extent contains a ``magic number.'' If this number is invalid, the pointer to the extent is probably corrupt.

EXTENT OUT OF ORDER


An extent's idea of where it is in the file is inconsistent with the extent pointer in relation to other extent pointers.

ZERO LENGTH EXTENT


An extent is zero length.

ZERO SIZE DIRECTORY


It is erroneous for a directory inode to claim a size of zero. The corresponding inode is cleared.

DIRECTORY SIZE ERROR


A directory's size must be an integer number of blocks. The size is recomputed based on its extents.

DIRECTORY EXTENTS CORRUPTED


If the computation of size (above) fails, fsck will print this message and ask to clear the inode.

NUMBER OF EXTENTS TOO LARGE


The number of extents or pointers to extents (indirect extents) exceeds the number of slots in the inode for describing extents.

POSSIBLE DIRECTORY SIZE ERROR


The number of blocks in the directory computed from extent pointer lengths is inconsistent with the number computed from the inode size field.

POSSIBLE FILE SIZE ERROR


The number of blocks in the file computed from extent pointer lengths is inconsistent with the number computed from the inode size field. fsck gives the option of clearing the inode in this case.

Phase 1B Rescan for More DUPS

When a duplicate block is found in the file system, the file system is rescanned to find the inode that previously claimed that block. When the duplicate block is found, the following information message is printed:

B DUP I=I

Inode I contains block number B, which is already claimed by another inode. This error condition invokes the BAD/DUP error condition in Phase 2. Inodes with overlapping blocks may be determined by examining this error condition and the DUP error condition in Phase 1.
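The BAD and DUP conditions, and the rescan that finds the earlier claimant of a duplicate block, can be illustrated with a toy model. The Python sketch below is not how fsck is implemented; it assumes a simplified file system in which each inode is just a list of block numbers, and the function name scan_blocks is hypothetical.

```python
def scan_blocks(inodes, first_data_block, last_block):
    """Classify block claims in a toy file system.

    inodes maps an inode number to the list of block numbers it claims.
    Returns (bad, dup, first_claims), each a list of (block, inode) pairs.
    """
    owner = {}  # block number -> first inode that claimed it
    bad = []    # claims outside the valid block range
    dup = []    # claims on a block that another claim already holds

    for ino in sorted(inodes):
        for blk in inodes[ino]:
            if blk < first_data_block or blk > last_block:
                bad.append((blk, ino))
            elif blk in owner:
                dup.append((blk, ino))
            else:
                owner[blk] = ino

    # Rescan step: for each duplicate block, report the inode that claimed it first.
    first_claims = sorted({(blk, owner[blk]) for blk, _ in dup})
    return bad, dup, first_claims


toy_inodes = {2: [10, 11], 3: [11, 12], 4: [5, 500]}
print(scan_blocks(toy_inodes, first_data_block=8, last_block=100))
```

In this toy run, inode 4 claims two blocks outside the valid range, and block 11 is claimed by both inode 2 and inode 3.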

Phase 2 Check Path Names

This phase traverses the pathname tree, starting at the root directory. fsck examines each inode that is being used by a file in a directory of the file system being checked.

Referenced files are marked in order to detect unreferenced files later on. The program also accumulates a count of all links, which is compared in Phase 4 with the link count recorded in each inode. Directory entries that point to bad inodes found in Phase 1 and Phase 1B are also detected in this phase.
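The idea of walking the pathname tree, counting references, and comparing the totals with the recorded link counts can be sketched as follows. This Python example is a toy model and not the fsck implementation: directories are plain dictionaries, the root inode number 2 is an assumption, and the function names are hypothetical.

```python
def count_references(directories, root=2):
    """Walk the directory tree from root and count directory entries per inode.

    directories maps a directory's inode number to {entry name: inode number}.
    """
    counts = {}
    visited = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        for name, ino in directories[current].items():
            counts[ino] = counts.get(ino, 0) + 1
            # Descend into subdirectories, but do not follow "." and "..".
            if ino in directories and name not in (".", ".."):
                stack.append(ino)
    return counts


def check_links(directories, link_counts, root=2):
    """Return (unreferenced inodes, {inode: (recorded, counted)} mismatches)."""
    counts = count_references(directories, root)
    unreferenced = sorted(i for i in link_counts if counts.get(i, 0) == 0)
    mismatched = {
        i: (link_counts[i], counts[i])
        for i in link_counts
        if i in counts and counts[i] != link_counts[i]
    }
    return unreferenced, mismatched


toy_dirs = {
    2: {".": 2, "..": 2, "a": 3, "f": 5},
    3: {".": 3, "..": 2, "b": 6},
}
recorded = {2: 3, 3: 2, 5: 1, 6: 2, 7: 1}
print(check_links(toy_dirs, recorded))
```

Here inode 7 is allocated but never referenced, and inode 6 records a link count of 2 although only one directory entry points to it.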

Phase 2 reports error conditions resulting from the following:

Initial Checks

fsck examines the root directory inode first, since this directory is where the search for all pathnames must start.

If the root directory inode is corrupted, or if its type is not directory, fsck prints error messages. Generally, if a severe problem exists with the root directory it is impossible to salvage the file system, although fsck allows attempts to continue under some circumstances.

Possible error messages caused by problems with the root directory inode include:

ROOT INODE UNALLOCATED. TERMINATING


The root inode points to incorrect information. There is no way to fix this problem, so the program stops.

If this problem occurs on the root file system, you must reinstall IRIX. If it occurs on another file system, you must recreate the file system using mkfs and recover files and data from backups.

ROOT INODE NOT A DIRECTORY. FIX?


The root directory inode does not seem to describe a directory. If you enter <n>, fsck terminates. If you enter <y>, fsck treats the contents of the inode as a directory even though the inode mode indicates otherwise. If the directory is actually intact, and only the inode mode is incorrectly set, this may recover the directory.

DUPS/BAD IN ROOT INODE. CONTINUE?


Something is wrong with the block addressing information of the root directory. If you enter <n>, fsck terminates. If you enter <y>, fsck attempts to continue with the check. If some of the root directory is still readable, pieces of the file system may be salvaged.

Phase 2 Types of Error Messages

Phase 2 has only one type of error message: messages with a REMOVE? prompt.

Phase 2 Meaning of Yes/No Responses

Table 8-3 describes the significance of responses to Phase 2 prompts:

Table 8-3 : Meaning of Phase 2 fsck Responses

Prompt Response Meaning
REMOVE? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
REMOVE? y Remove a bad directory entry.



Phase 2 Error Messages

I OUT OF RANGE I=I NAME=F (REMOVE?)


A directory entry F has an inode number I that is greater than the end of the inode list.

UNALLOCATED I=I OWNER=O MODE=M SIZE=S MTIME=T NAME=F (REMOVE?)


A directory entry F has an inode I without allocate mode bits. The owner O, mode M, size S, modify time T, and file name F are printed. If the file system is not mounted and the -n option is not specified, and if the inode that the entry points to is size 0, the entry is removed automatically.

DUP/BAD I=I OWNER=O MODE=M SIZE=S MTIME=T DIR=F (REMOVE?)


Phase 1 or Phase 1B found duplicate blocks or bad blocks associated with directory entry F, directory inode I. The owner O, mode M, size S, modify time T, and directory name F are printed.

DUP/BAD I=I OWNER=O MODE=M SIZE=S MTIME=T FILE=F (REMOVE?)


Phase 1 or Phase 1B found duplicate blocks or bad blocks associated with file entry F, inode I. The owner O, mode M, size S, modify time T, and file name F are printed.

Phase 3 Check Connectivity

Phase 3 of fsck locates any unreferenced directories detected in Phase 2 and attempts to reconnect them. It reports error conditions resulting from:

Phase 3 Types of Error Messages

Phase 3 has two types of error messages:

Phase 3 Meaning of Yes/No Responses

Table 8-4 explains the significance of responses to Phase 3 prompts:

Table 8-4 : Meaning of fsck Phase 3 Responses

Prompt Response Meaning
RECONNECT? n Ignore the error condition. This invokes the UNREF error condition in Phase 4. A "no" response is appropriate only if the user intends to take other action to fix the problem.
RECONNECT? y Reconnect directory inode I to the file system in the directory for lost files (lost+found). This may invoke a lost+found error condition if there are problems connecting directory inode I to lost+found. If the link is successful, this invokes the CONNECTED information message.



Phase 3 Error Messages

UNREF DIR I=I OWNER=O MODE=M SIZE=S MTIME=T (RECONNECT?)


The directory inode I was not connected to a directory entry when the file system was traversed. The owner O, mode M, size S, and modify time T of directory inode I are printed. The fsck program forces the reconnection of a nonempty directory.

SORRY. NO lost+found DIRECTORY


No lost+found directory is in the root directory of the file system; fsck ignores the request to link a directory in lost+found. The unreferenced file is removed.

There is nothing you can do at this point, but you should remake the lost+found directory as soon as possible.

SORRY. NO SPACE IN lost+found DIRECTORY


There is no space to add another entry to the lost+found directory in the root directory of the file system; fsck ignores the request to link a directory in lost+found. The unreferenced file is removed.

There is nothing you can do at this point, but you should clean out the lost+found directory as soon as possible.

DIR I=I1 CONNECTED. PARENT WAS I=I2


This is an advisory message indicating that a directory inode I1 was successfully connected to the lost+found directory. The parent inode I2 of the directory inode I1 is replaced by the inode number of the lost+found directory.

Phase 4 Check Reference Counts

This phase checks the link count information seen in Phases 2 and 3 and locates any unreferenced regular files. It reports error conditions resulting from:

Phase 4 Types of Error Messages

Phase 4 has five types of error messages:

Phase 4 Meaning of Yes/No Responses

Table 8-5 describes the significance of responses to Phase 4 prompts:

Table 8-5 : Meaning of fsck Phase 4 Responses

Prompt Response Meaning
RECONNECT? n Ignore this error condition. This invokes a CLEAR error condition later in Phase 4.
RECONNECT? y Reconnect inode I to file system in the directory for lost files (lost+found). This can cause a lost+found error condition in this phase if there are problems connecting inode I to lost+found.
CLEAR? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
CLEAR? y Deallocate the inode by zeroing its contents.
ADJUST? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
ADJUST? y Replace link count of file inode I with the link count computed in Phase 2.
FIX? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
FIX? y Fix the problem.



Phase 4 Error Messages

UNREF FILE I=I OWNER=O MODE=M SIZE=S MTIME=T (RECONNECT?)


Inode I was not connected to a directory entry when the file system was traversed. The owner O, mode M, size S, and modify time T of inode I are printed. If the -n option is omitted and the file system is not mounted, empty files are cleared automatically. Nonempty files are not cleared.

SORRY. NO lost+found DIRECTORY


There is no lost+found directory in the root directory of the file system; fsck ignores the request to link a file in lost+found.

There is nothing you can do at this point, but you should create the lost+found directory as soon as possible.

SORRY. NO SPACE IN lost+found DIRECTORY


There is no space to add another entry to the lost+found directory in the root directory of the file system; fsck ignores the request to link a file in lost+found.

There is nothing you can do at this point, but you should clean out the lost+found directory as soon as possible.

(CLEAR)

The inode mentioned in the immediately previous UNREF error condition cannot be reconnected, so it is cleared.

LINK COUNT FILE I=I OWNER=O MODE=M SIZE=S MTIME=T COUNT=X SHOULD BE Y (ADJUST?)


The link count for inode I, which is a file, is X but should be Y. The owner O, mode M, size S, and modify time T are printed.

LINK COUNT DIR I=I OWNER=O MODE=M SIZE=S MTIME=T COUNT=X SHOULD BE Y (ADJUST?)


The link count for inode I, which is a directory, is X but should be Y. The owner O, mode M, size S, and modify time T of directory inode I are printed.

LINK COUNT F I=I OWNER=O MODE=M SIZE=S MTIME=T COUNT=X SHOULD BE Y (ADJUST?)


The link count for F inode I is X but should be Y. The file name F, owner O, mode M, size S, and modify time T are printed.

UNREF FILE I=I OWNER=O MODE=M SIZE=S MTIME=T (CLEAR?)


Inode I, which is a file, was not connected to a directory entry when the file system was traversed. The owner O, mode M, size S, and modify time T of inode I are printed. If the -n option is omitted and the file system is not mounted, empty files are cleared automatically. Nonempty files are not cleared.

UNREF DIR I=I OWNER=O MODE=M SIZE=S MTIME=T (CLEAR?)


Inode I, which is a directory, was not connected to a directory entry when the file system was traversed. The owner O, mode M, size S, and modify time T of inode I are printed. If the -n option is omitted and the file system is not mounted, empty directories are cleared automatically. Nonempty directories are not cleared.

BAD/DUP FILE I=I OWNER=O MODE=M SIZE=S MTIME=T (CLEAR?)


Phase 1 or Phase 1B found duplicate blocks or bad blocks associated with file inode I. The owner O, mode M, size S, and modify time T of inode I are printed.

BAD/DUP DIR I=I OWNER=O MODE=M SIZE=S MTIME=T (CLEAR?)


Phase 1 or Phase 1B found duplicate blocks or bad blocks associated with directory inode I. The owner O, mode M, size S, and modify time T of inode I are printed.

FREE INODE COUNT WRONG IN SUPERBLK (FIX?)


The actual count of the free inodes does not match the count in the super-block of the file system.

Phase 5 Check Free List

Phase 5 checks the free-block list. It reports error conditions resulting from:

Phase 5 Types of Error Messages

Phase 5 has four types of error messages:

Phase 5 Meaning of Yes/No Responses

Table 8-6 describes the significance of responses to Phase 5 prompts:

Table 8-6 : Meanings of Phase 5 fsck Responses

Prompt Response Meaning
CONTINUE? n Terminate the program.
CONTINUE? y Ignore rest of the free-block list and continue execution of fsck. This error condition always invokes BAD BLKS IN FREE LIST error condition later in Phase 5.
FIX? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
FIX? y Replace count in super-block by actual count.
SALVAGE? n Ignore the error condition. A "no" response is appropriate only if the user intends to take other action to fix the problem.
SALVAGE? y Replace actual free-block bitmap with a new free-block bitmap.



Phase 5 Error Messages

FREE BLK COUNT WRONG IN SUPERBLOCK (FIX?)


The actual count of free blocks does not match the count in the super-block of the file system.

BAD FREE LIST (SALVAGE?)


This message is always preceded by one or more of the Phase 5 information messages.

Phase 6 Salvage Free List

This phase reconstructs the free-block bitmap. There are no error messages that can be generated in this phase and no responses are required.

Cleanup Phase

Once a file system has been checked, a few cleanup functions are performed. The cleanup phase displays advisory messages about the file system and status of the file system.

Cleanup Phase Messages

X files Y blocks Z free


This is an advisory message indicating that the file system checked contained X files using Y blocks leaving Z blocks free in the file system.

SUPERBLOCK MARKED DIRTY


A field in the super-block is queried by system utilities to decide if fsck must be run before mounting a file system. If this field is not ``clean,'' fsck reports this and asks whether it should be marked clean.

PRIMARY SUPERBLOCK WAS INVALID


If the primary super-block is too corrupt to use, and fsck can locate a secondary super-block, it asks to replace the primary super-block with the backup.

SECONDARY SUPERBLOCK MISSING


If there is no secondary super-block, and fsck finds space for one (after the last cylinder group), it asks to create a secondary super-block.

CHECKSUM WRONG IN SUPERBLOCK


An incorrect checksum makes a file system unmountable.

***** FILE SYSTEM WAS MODIFIED *****


This is an advisory message indicating that the current file system was modified by fsck.

***** REMOUNTING ROOT... *****


This is an advisory message indicating that fsck made changes to a mounted root file system. The automatic remount ensures that in-core data structures and the file system are consistent.


How the File System Works

This section describes how the IRIX file system is constructed and how it works.

In the IRIX system, a file is a one-dimensional array of bytes with no other structure implied. Files are attached to a hierarchy of directories.

A directory is merely another type of file that the user is permitted to use, but not allowed to write; the operating system itself retains the responsibility for writing directories. The combination of directories and files makes up a file system. The starting point of any IRIX file system is a directory that serves as the root. In the IRIX operating system there is always one file system that is itself referred to by that name, root. Traditionally, the root directory of the root file system is represented by a single slash (/).

A directory such as usr is referred to in various ways. You sometimes see the terms ``leaf'' and ``mount point'' used to describe a directory that is used to form the connection between the root file system and another mountable file system. Regardless of the terms used, such a directory is the root of the file system that descends from it. The name of that file system is, coincidentally, the name of the directory. In our example, the file system is usr.

The IRIX EFS file system may contain the following types of files:

The letter in parentheses following each item is the character used by ls -l to identify the file type.

Tables in Memory

When a file system is identified to the IRIX system through a mount(1M) command, an entry is made in the mount table, and the super-block is read into an internal buffer maintained by the kernel. Disk inodes and free storage bitmaps are read in from the disk as needed.

The System I-Node Table

The IRIX system maintains a structure known as the system inode table. Whenever a file is opened, its inode is copied from the secondary storage disk into the system inode table. If two or more processes have the same file open, they share the same inode table entry. The entry includes:

The System File Table

The system maintains another table called the system file table. Because files may be shared among related processes, a table is needed to keep track of which files are accessible by which process. For each file descriptor, an entry in the system file table contains:

The Open File Table

The last table that is used to provide access to files is the open file table. It is located in the user area portion of memory. There is a user area for each process and, consequently, an open file table for each process. An entry in the open file table contains a pointer to the appropriate system file table entry.

System Steps in Accessing a File

The next few paragraphs describe steps the operating system takes to open, create, read, and write a file.

Open

If you give the pathname /a/b to the open(2) system call, the following is performed. (Your program probably uses the fopen(3) subroutine from the standard I/O library, but that in turn invokes the open system call.)

    The operating system sees that the pathname starts with a slash, so the root inode is obtained from the inode table.

    Using the root inode, the system does a linear scan of the root directory file looking for an entry ``a''. When ``a'' is found, the operating system picks up the i-number associated with ``a''.

    The i-number gives the offset into the inode list at which the inode for ``a'' is located. At that location, the system determines that ``a'' is a directory.

    Directory ``a'' is searched linearly until an entry ``b'' is found.

    When ``b'' is found, its i-number is picked up and used as an index into the i-list to find the inode for ``b''.

    The inode for ``b'' is determined to be a file and is copied to the system inode table (assuming it's not already there), and the reference count is incremented.

    The system file table entry is allocated, the pointer to the system inode table is set, the offset for the I/O pointer is set to zero to indicate the beginning of the file, and the reference count is initialized.

    The user area file descriptor table entry is allocated with a pointer set to the entry in the system file table.

    The number of the file descriptor slot is returned to the program.

The linear scan algorithm for locating the inode of a file illustrates why it is advisable to keep directories small.
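The steps above can be modeled with a few small data structures. The following Python sketch is a toy illustration, not kernel code: the inode list contents, the root i-number 2, and the names ToyKernel, resolve, and scan_directory are all assumptions made for the example. It shows the linear directory scan, the shared system inode table entry, and the separate system file table entries created by two independent opens.

```python
ROOT_INO = 2  # assumed root i-number for this toy model

# Toy inode list: i-number -> inode. Directories hold (name, i-number) entries.
INODE_LIST = {
    2: {"type": "dir", "entries": [(".", 2), ("..", 2), ("a", 3)]},
    3: {"type": "dir", "entries": [(".", 3), ("..", 2), ("b", 4)]},
    4: {"type": "file", "data": b"hello"},
}


def scan_directory(directory, name):
    """Linear scan of a directory for name; returns the matching i-number."""
    for entry_name, ino in directory["entries"]:
        if entry_name == name:
            return ino
    raise FileNotFoundError(name)


def resolve(path):
    """Walk an absolute pathname one component at a time, starting at the root."""
    ino = ROOT_INO
    for component in path.split("/"):
        if not component:
            continue  # skip empty strings produced by leading or doubled slashes
        inode = INODE_LIST[ino]
        if inode["type"] != "dir":
            raise NotADirectoryError(component)
        ino = scan_directory(inode, component)
    return ino


class ToyKernel:
    def __init__(self):
        self.inode_table = {}  # i-number -> {"inode": ..., "refs": n}
        self.file_table = []   # entries: {"ino": ..., "offset": ..., "refs": ...}

    def open(self, open_file_table, path):
        """Return the descriptor slot used in the caller's open file table."""
        ino = resolve(path)
        entry = self.inode_table.setdefault(ino, {"inode": INODE_LIST[ino], "refs": 0})
        entry["refs"] += 1
        self.file_table.append({"ino": ino, "offset": 0, "refs": 1})
        open_file_table.append(len(self.file_table) - 1)
        return len(open_file_table) - 1


kernel = ToyKernel()
process_a, process_b = [], []  # one open file table per process
fd = kernel.open(process_a, "/a/b")
kernel.open(process_b, "/a/b")
print(fd, kernel.inode_table[4]["refs"], len(kernel.file_table))
```

After both opens, the inode table holds a single entry for the file with a reference count of 2, while the system file table holds two entries, each with its own offset.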

Create

Creating a file (the creat(2) system call) has these additional steps at the beginning:

    If the file does not already exist, a free inode is located by searching the inode areas of the file system.

    The mode of the file is established (possibly and-ed with the complement of a umask entry; see umask(2)) and entered in the inode.

    Using the i-number, the system goes through a directory search similar to that used in the open system call. The difference is that in the case of creat, the system writes the last portion of the pathname into the directory that is the next to last portion of the pathname. The i-number is stored with it.
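The mode calculation in the second step is a bitwise operation: the requested mode is and-ed with the complement of the umask. A minimal Python sketch, with the hypothetical function name effective_mode:

```python
def effective_mode(requested_mode, umask):
    """Permission bits stored in the inode: requested mode AND NOT umask."""
    return requested_mode & ~umask & 0o7777


# A request for rw-rw-rw- under a umask of 022 yields rw-r--r--.
print(oct(effective_mode(0o666, 0o022)))
```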

Reading and Writing

Both the read(2) and write(2) system calls follow these steps:

    Using the file descriptor supplied with the call as an index, the user's open file table is read, and the pointer to the system file table is obtained.

    The user buffer address and the number of bytes to read or write are supplied as arguments to the call. The correct offset into the file is read from the system file table entry.

    (Reading) The inode is found by following the pointer from the system file-table entry to the system inode table. The operating system copies the data from storage to the user's buffer.

    (Writing) The same pointer chain is followed, but the system writes into the data blocks. If new blocks are needed, they are allocated from the file system's list of free blocks. The EFS file system always attempts to grow the last extent if the following blocks are free, or to preallocate a large number of contiguous free blocks when allocating through a new extent. Disk blocks that aren't needed are freed when the file is closed.

    Before the system call returns to the user, the number of bytes read or written is added to the offset in the system file table.

    The number of bytes read or written is returned to the user.
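The way the offset advances by the number of bytes transferred can be observed from a user program. The following Python sketch uses the os module's thin wrappers around open, read, and lseek on a temporary file; the function name read_in_chunks is hypothetical, and the example assumes a UNIX-like system.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size):
    """Return a list of (offset before the read, bytes read) pairs."""
    fd = os.open(path, os.O_RDONLY)
    trace = []
    try:
        while True:
            before = os.lseek(fd, 0, os.SEEK_CUR)  # current offset, unchanged
            data = os.read(fd, chunk_size)         # returns the bytes read
            if not data:
                break                              # end of file
            trace.append((before, len(data)))
    finally:
        os.close(fd)
    return trace


tmp_fd, tmp_path = tempfile.mkstemp()
os.write(tmp_fd, b"0123456789")
os.close(tmp_fd)
print(read_in_chunks(tmp_path, 4))
os.unlink(tmp_path)
```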

Files Used by More Than One Process

If related processes are sharing a file descriptor (as happens after a fork(2)), they also share the same entry in the system file table. Unrelated processes that access the same file have separate entries in the system file table because they may be reading from or writing to different places in the file. In both cases, the entry in the inode table is shared; the correct offset at which the read or write should take place is tracked by the offset entry in the system file table.
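The difference between a shared and a separate system file table entry can be demonstrated within a single process. The Python sketch below uses os.dup rather than fork to keep the example simple; a duplicated descriptor shares its offset with the original in the same way an inherited one does. A second, independent open of the same path gets its own offset. The function name shared_vs_separate is hypothetical, and a UNIX-like system is assumed.

```python
import os
import tempfile


def shared_vs_separate(path):
    """Read 5 bytes through one descriptor, then report the other two offsets."""
    fd1 = os.open(path, os.O_RDONLY)
    fd2 = os.dup(fd1)                 # shares the file table entry with fd1
    fd3 = os.open(path, os.O_RDONLY)  # independent open: separate entry
    try:
        os.read(fd1, 5)
        return (os.lseek(fd2, 0, os.SEEK_CUR), os.lseek(fd3, 0, os.SEEK_CUR))
    finally:
        for fd in (fd1, fd2, fd3):
            os.close(fd)


tmp_fd, tmp_path = tempfile.mkstemp()
os.write(tmp_fd, b"0123456789")
os.close(tmp_fd)
print(shared_vs_separate(tmp_path))
os.unlink(tmp_path)
```

The first value reflects the read made through the original descriptor; the second remains at zero because the independent open has its own offset.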

Pathname Conversion

The directory search and pathname conversion takes place only once as long as the file remains open. For subsequent access of the file, the system supplies a file descriptor that is an index into the open file table in your user process area. The open file table points to the system file table entry where the pointer to the system inode table is picked up. Given the inode, the system can find the data blocks that make up the file.

A running process can successfully read and write a file after it has opened it, even if someone has since renamed or unlinked it. This can explain why more blocks appear to be in use than are accounted for by the files visible in a directory hierarchy. In this case, the blocks are freed when the process closes the file with the close(2) system call or terminates with the exit(2) system call.
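The behavior described above is easy to reproduce. The sketch below, with a hypothetical function name, creates a scratch file, unlinks it while it is still open, and then reads it back through the open descriptor.

```python
import os
import tempfile

def read_after_unlink():
    """Unlink an open file, then read it back through the still-open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                    # remove the only directory entry
        gone = not os.path.exists(path)    # the name can no longer be looked up
        links = os.fstat(fd).st_nlink      # link count; typically 0 on a local file system
        os.lseek(fd, 0, os.SEEK_SET)       # rewind to the start of the file
        data = os.read(fd, 100)            # the data blocks are still allocated
    finally:
        os.close(fd)                       # last reference dropped; blocks can be freed
    return gone, links, data
```

Until the descriptor is closed, the file occupies disk blocks that no directory listing will reveal, which is why the space reported as used can exceed the total size of the visible files.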

Synchronization

The above description of pathname conversion, while complex, is rather neat and orderly. The situation is complicated, however, by the fact that the IRIX system is a multi-tasking system. To give some tasks prompt attention, the system may make the decision that other tasks are less urgent. In addition, the system keeps a buffer cache and a cache of free blocks and inodes in memory together with the super-block to provide more responsive service to users. The stability that comes from having every byte of data in a file immediately written to the storage disk is traded for the gain of being able to provide more service to more users.

In normal processing, disk buffers are flushed periodically to the disk devices. This is a system process that is not related directly to any reads or writes of user processes. The process is called "synchronization." It includes writing out the super-blocks in addition to the disk buffers. The sync command can be used to cause the writing of super-blocks and updated inodes and the flushing of buffers. It is worth noting, however, that the return from the command simply means that the writing was scheduled, not necessarily completed.

Processes that must ensure that data is written to the disk immediately can open the file with the O_SYNC flag, which causes the data and inode information to be written to the disk at the time of each write system call. Alternatively, the fsync(2) system call flushes to the disk all data associated with the file descriptor passed as its argument.
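Both approaches can be sketched as follows, assuming a UNIX-like system where Python's os module exposes O_SYNC and fsync. The function name and the file permissions are illustrative choices.

```python
import os
import tempfile

def write_durably(path, payload):
    """Write payload twice: once under O_SYNC, once followed by an explicit fsync."""
    # First approach: O_SYNC makes each write wait until the data has been
    # written to the disk before the call returns.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)

    # Second approach: an ordinary buffered write, then fsync to flush
    # everything associated with this one descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, payload)
        os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)           # total bytes now in the file
```

O_SYNC suits files where every write must be durable, at the cost of slower writes; fsync lets a process batch several writes and pay the cost once, at a point it chooses.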

Search Time

Several things affect the amount of time the system needs to spend in looking for and reading in a file:

As described above, when the IRIX system is locating a file to be opened, it searches linearly through all the directories in the pathname. Search time is reduced by keeping the number of entries in a directory small.


File System Corruption

Most often, a file system is corrupted because the address and count information fail to make it out to the storage medium. This is caused by:

Problems can occur from any combination of these factors.

Hardware Failure

There is no fool-proof way to predict hardware failure. The best way to avoid hardware failures is to conscientiously follow recommended diagnostic and maintenance procedures. Use the fx utility to flag bad blocks on a hard disk and remap them to good blocks.

Human Error

Human error is probably the greatest single cause of file system corruption. To avoid problems, follow these rules closely:

    ALWAYS shut down the system properly. Do not simply turn off power to the system. Use a standard system shutdown tool, such as shutdown(1M).

    NEVER remove a file system physically (pull out a hard disk) without first unmounting the file system.

    NEVER physically write-protect a mounted file system, unless it is mounted read-only.

The best way to insure against data loss is to make regular, careful backups. See Chapter 6, "Backing Up and Restoring Files," for complete information on system backups.


Insufficient Space on the root File System

The root file system is typically very static. It contains only basic programs and utilities. A prime reason for running out of space on the root file system is application programs creating many, sometimes very large, files in /tmp.
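When space runs short, a useful first step is to find which files are consuming it. The following is a rough sketch of one way to list the largest files under a directory such as /tmp; the function name and the default count are assumptions, and standard tools such as du(1) and find(1) serve the same purpose.

```python
import os

def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)        # lstat: do not follow symbolic links
            except OSError:
                continue                   # file vanished or is unreadable; skip it
            sizes.append((st.st_size, full))
    sizes.sort(reverse=True)               # largest size first
    return sizes[:count]
```

For example, largest_files("/tmp", 5) reports the five largest files beneath /tmp. Files that have been unlinked but are still held open by a process do not appear in such a listing, even though they continue to occupy space.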

If you need to increase space, consider these alternatives:

Note: If you use either of the last two options listed above, you should exercise great care when your system is in single-user mode, since the mounted file systems will not be present and your mounts and links to /tmp will therefore not be active.



Copyright © 1997, Silicon Graphics, Inc. All Rights Reserved.