unix filter by gauravsi89


									Filter Definition

     A filter is a small and (usually) specialized program in Unix-like operating
     systems that transforms plain text (i.e., human readable) data in some
     meaningful way and that can be used together with other filters and pipes to
     form a series of operations that produces highly specific results.

     As is generally the case with command line (i.e., all-text mode) programs in
     Unix-like operating systems, filters read data from standard input and write to
     standard output. Standard input is the source of data for a program, and by
     default it is text typed in at the keyboard. However, it can be redirected to
     come from a file or from the output of another program. Standard output is the
     destination of output from a program, and by default it is the display screen. This
     means that if the output of a command is not redirected to a file or another
     device (such as a printer) or piped to another filter for further processing, it will
     be sent to the monitor where it will be displayed.

     Numerous filters are included on Unix-like systems, a few of which are awk,
     cat, comm, csplit, cut, diff, expand, fold, grep, head, join, less, more, paste,
     sed, sort, spell, tail, tr, unexpand, uniq and wc.

     It is a simple matter to construct a pipeline of commands with a highly specific
     function by stringing multiple filters together with pipes. As a trivial example,
     the following would display the last three files in the directory /sbin (which
     contains basic programs used for system maintenance or administrative tasks)
     whose names contain the string (i.e., sequence of characters) mk:

             ls /sbin | grep mk | sort -r | head -3

     The ls command lists the contents of /sbin and pipes its output (using the pipe
     operator, which is designated by the vertical bar character) to the filter grep,
     which searches for all files and directories that contain the letter sequence mk
     in their names. grep then pipes its output to the sort filter, which, with its -r
     option, sorts it in reverse alphabetical order. sort, in turn, pipes its output to the
     head filter. The default behavior of head is to read the first ten lines of text or
     output from another command, but the -3 option here tells it to read only the first
     three. head thus writes the first three results from sort (i.e., the last three
     filenames or directory names from /sbin that contain the string mk) to the
     display screen.
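
     The same pipeline can be tried on fixed input, so that the result does not
     depend on what happens to be installed in /sbin. The following is only a
     sketch: the list of names is made-up sample data, head -n 3 is used as the
     POSIX spelling of head -3, and LC_ALL=C is set so that sort orders the names
     the same way on every system.

```shell
# Sample names stand in for the output of "ls /sbin" (assumed data).
names='fsck
mkfs
mkswap
mke2fs
mknod
reboot'

# grep keeps the names containing "mk", sort -r puts them in reverse
# alphabetical order, and head -n 3 keeps the first three lines it receives.
result=$(printf '%s\n' "$names" | grep mk | LC_ALL=C sort -r | head -n 3)
printf '%s\n' "$result"
```

     Each stage only sees the output of the stage before it, which is why the
     order of the filters matters.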

     cat is one of the most frequently used commands on Unix-like systems. It is
     best known for its ability to display the contents of files rather than for its
          ability to transform them, and thus it might not initially appear to fall into the
          category of a filter. However, it has the two additional (and not unrelated)
          functions of creating files and concatenating (i.e., combining) copies of them,
          which clearly makes it a filter.

          In the next example, cat combines copies of file1, file2 and file3, and this is
          piped to wc, a filter which by default writes the number of lines, words and
          bytes to the display monitor:

                  cat file1 file2 file3 | wc

          Although most filters are highly specialized programs with a very limited range
          of functions, there are a few exceptions. Most notable among them is awk, a
          pattern matching program that has evolved into a powerful and full-fledged
          programming language.

          An important tenet of the Unix philosophy has been to try to develop every
          program (at least every command line program) so that it is a filter rather than
          just a stand-alone program.

Unix Filters

      grep
      sort

Filter (Unix)
From Wikipedia, the free encyclopedia


In UNIX and UNIX-like operating systems, a filter is a program that gets most of its data
from standard input (the main input stream) and writes its main results to standard
output (the main output stream). UNIX filters are often used as elements of pipelines. The
pipe operator ("|") on a command line signifies that the main output of the command to
the left is passed as main input to the command on the right.

The classic filter would be grep, which at its simplest prints to its output any lines
containing a character string. Here's an example:

 cut -d : -f 1 /etc/passwd | grep foo

This finds all registered users that have "foo" as part of their username by using the cut
command to take the first field (username) of each line of the UNIX system password file
and passing them all as input to grep, which searches its input for lines containing the
character string "foo" and prints them on its output.

Here is a perl equivalent to the above, which prints the whole line from the passwd file:

 perl -ne 'print if m/^[^:]*foo/' /etc/passwd

Or, to print only the username, without the rest of the line:

 perl -ane '$_ = shift @F; print "$_\n" if /foo/' -F: /etc/passwd
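
The cut and grep pipeline can also be tried without touching the real password file. This
is a minimal sketch that assumes three made-up passwd-style records; it shows why cutting
the first field before searching matters.

```shell
# Hypothetical passwd-style records: fields are separated by colons.
sample='root:x:0:0:root:/root:/bin/sh
foobar:x:1001:1001:Foo Bar:/home/foobar:/bin/sh
alice:x:1002:1002:Likes foo:/home/alice:/bin/sh'

# cut keeps only field 1 (the username); grep then matches "foo"
# against the usernames alone, so alice's comment field is ignored.
users=$(printf '%s\n' "$sample" | cut -d : -f 1 | grep foo)
printf '%s\n' "$users"
```

Running grep foo directly on the records would also report the line for alice, because
"foo" appears in a later field of that line.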

Common UNIX filter programs are: cat, cut, grep, head, sort, uniq and tail. Programs like
awk and sed can be used to build quite complex filters because they are fully
programmable.

List of UNIX filter programs
      awk
      cat
      comm
      cut
      expand
      compress
      fold
      grep
      head
      nl
      perl
      pr
      sed
      sh
      sort
      split
      strings
      tail
      tac
      tee
      tr
      uniq
      wc

Pipes and Filters
The purpose of this lesson is to introduce you to the way that you can construct powerful
Unix command lines by combining Unix commands.

Unix commands alone are powerful, but when you combine them together, you can
accomplish complex tasks with ease. The way you combine Unix commands is through
using pipes and filters.

Using a Pipe
The symbol | is the Unix pipe symbol that is used on the command line. What it means is
that the standard output of the command to the left of the pipe gets sent as standard input
of the command to the right of the pipe. Note that this functions a lot like the > symbol
used to redirect the standard output of a command to a file. However, the pipe is different
because it is used to pass the output of a command to another command, not a file.

Here is an example:

           $ cat apple.txt
           core
           worm seed
           jewel
           $ cat apple.txt | wc
                 3       4      21

In this example, at the first shell prompt, I show the contents of the file apple.txt to you.
At the next shell prompt, I use the cat command to display the contents of the apple.txt
file, but I send the output not to the screen but through a pipe to the wc (word count)
command. The wc command then does its job and counts the lines, words, and characters
of what it got as input.

You can combine many commands with pipes on a single command line. Here's an
example where I count the characters, words, and lines of the apple.txt file, then mail the
results to nobody@december.com with the subject line "The count."

       $ cat apple.txt | wc | mail -s "The count" nobody@december.com

Using a Filter
A filter is a Unix command that does some manipulation of the text of a file. Two of the
most powerful and popular Unix filters are the sed and awk commands. Both of these
commands are extremely powerful and complex.


Here is a simple way to use the sed command to manipulate the contents of the apple.txt
file:

       $ cat apple.txt
       core
       worm seed
       jewel
       $ cat apple.txt | sed -e "s/e/WWW/"
       corWWW
       worm sWWWed
       jWWWwel
       $ cat apple.txt | sed -e "s/e/J/g"
       corJ
       worm sJJd
       jJwJl

In this example, at the first shell prompt, I showed you the contents of the apple.txt file.
At the second shell prompt, I used the cat command to display the contents of the
apple.txt file, and send that display through a pipe to the sed command. The sed
command I created changed the first occurrence of the letter "e" on each line to "WWW."
The sed took as input the information it got through the pipe. The sed command
displayed its output to the screen.

I then used the output of the cat command on the apple.txt file and sent it by a pipe to the
sed command to replace all the occurrences of e on each line with J. Note that every
occurrence of e, even where there was more than one on a line, changed to J. This is
because of the "g" at the end of the sed substitution expression. This "g" stands for global.
It is important to note that, in this example, the contents of the apple.txt file itself were
not changed in the file. Only the display of its contents changed.
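
To keep the edited text, the output of sed has to be redirected somewhere. The following
sketch assumes a three-line apple.txt whose contents are consistent with the wc counts
shown earlier (3 lines, 4 words, 21 characters); the file names and the scratch directory
are only for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
printf 'core\nworm seed\njewel\n' > "$dir/apple.txt"   # assumed contents

# sed writes the edited text to standard output only; the redirection
# saves that output in a second file and leaves apple.txt alone.
sed -e 's/e/J/g' "$dir/apple.txt" > "$dir/apple_J.txt"

original=$(cat "$dir/apple.txt")
edited=$(cat "$dir/apple_J.txt")
printf 'original:\n%s\nedited:\n%s\n' "$original" "$edited"

rm -rf "$dir"
```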

The Unix command awk is another powerful filter. You can use awk to
manipulate the contents of a file. Here is an example:
       $ cat basket.txt
       Layer1 = cloth
       Layer2 = strawberries
       Layer3 = fish
       Layer4 = chocolate
       Layer5 = punch cards
       $ cat basket.txt | awk -F= '{print $1}'
       Layer1
       Layer2
       Layer3
       Layer4
       Layer5
       $ cat basket.txt | awk -F= '{print "HAS: " $2}'
       HAS:  cloth
       HAS:  strawberries
       HAS:  fish
       HAS:  chocolate
       HAS:  punch cards

In this example, I first showed you the contents of the basket.txt file. Then I
displayed the contents and sent the output through a pipe to the awk command. I
set up the awk command to display the first word on each line that comes before
the = sign.

Then I did something a bit different. I used the cat command to display basket.txt and
then sent that output through a pipe to awk, but this time prepending the characters HAS:
to the start of every output line, followed by the second field on each line in basket.txt,
considering = as the separator between fields on a line in basket.txt.


The Unix grep command helps you search for strings in a file. Here is how I can find the
lines that contain the string "jewel" and display those lines to the standard output:

       $ cat apple.txt
       core
       worm seed
       jewel
       $ grep jewel apple.txt
       jewel

Exercise: Try out some pipes and filters
Create a simple text file that contains words and symbols. Try using combinations of the
pipe symbol, cat, grep, awk, and sed commands to manipulate its contents.

File System Structure

The Unix file system is a hierarchical structure that allows users to store information by
name. At the top of the hierarchy is the root directory, which always has the name /. A
typical Unix file system might look like:

The location of a file in the file system is called its path. Since the root directory is at the
top of the hierarchy, all paths start from /. Using the sample file system illustrated above,
let us determine the path for quiz1.txt:

       All paths start from the root directory, so the path begins with /
       We need to go to the home subdirectory, and the path becomes /home
       In the home directory, we go into the ian subdirectory. The path is now /home/ian
       In the ian directory, we go into the cpsc124 subdirectory. The path is now
        /home/ian/cpsc124
       The file quiz1.txt is in the cpsc124 directory. Appending the filename to the path,
        the path becomes /home/ian/cpsc124/quiz1.txt

Unix assigns a special "shorthand" notation for specifying commonly used paths. These
special paths and their shorthand names are listed below:

       /: A single slash / specifies the root directory. As we have seen, all paths begin
        with / because root is at the top of the file system structure.
       ~: A tilde ~ specifies the current user's home directory. Every Unix user has a
        home directory, where his/her personal files are stored. When a user logs into
        Unix, he/she is automatically moved to this directory. This notation is not always
        available in the Bourne Shell.
       ~user: A tilde ~ followed by a user-id specifies the home directory of the given
        user. For example, at UBC, ~a1a1 is the home directory of the user who has the
        user-id a1a1. This shorthand notation is also not always available in the Bourne
        Shell.
       .: This single dot . specifies the current working directory, that is, the directory
        that the user is currently in.
       ..: A pair of dots .. refers to the parent of the current working directory.
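
The dot and dot-dot shorthands can be tried in a scratch copy of the example hierarchy.
This is only a sketch: the directory tree is recreated under a temporary directory (an
assumption made so that nothing real is touched), and pwd is used to print where each
shorthand leads.

```shell
# Build a scratch copy of the example hierarchy (hypothetical layout).
top=$(mktemp -d)
mkdir -p "$top/home/ian/cpsc124"
: > "$top/home/ian/cpsc124/quiz1.txt"

cd "$top/home/ian/cpsc124"
here=$(cd . && pwd)        # "." is the current working directory
parent=$(cd .. && pwd)     # ".." is its parent, the ian directory
printf 'here:   %s\nparent: %s\n' "$here" "$parent"

cd / && rm -rf "$top"
```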

Now that we know how Unix organizes files, let us attempt to define what we mean by
the term file.

Files and Directories

There are 3 kinds of files: ordinary files (files), directory files (directories), and special
files.

      Files: Files are containers for data. This data can be anything, including the text of
       a report, an image (picture) of a house, an executable program like a word
       processor, or any arbitrary data. Files are created by users or programs in order to
       save data for future use. For example, a user might save the report she wrote using
       a text editor into a text file, or she might save an image from a drawing program
       into an image file.
      Directories: Each directory contains a number of files. A directory can contain
       other directories, be contained in another directory, or both. A directory that
       contains a file is called that file's parent directory. Similarly, if directory A
       contains directory B, then directory A is directory B's parent directory. A
       directory that is contained in another directory is called a subdirectory.
      Special files: A special file is much like an ordinary file, and shares the same
       basic interface. However, special files are not stored in the file system because
       they represent input/output devices. The I/O device provides the data directly.
       You will not be responsible for knowing how special files work or even how to use
       them. Simply be aware that they are present.

Every file has an associated data structure that contains important information about that
file. This data structure is known as an i-node. An i-node has information about the
length of the file, the time the file was most recently modified or accessed, the time the i-
node itself was most recently modified, the owner of the file, the group identifications
and access permissions, the number of links to the file (we will discuss these shortly), and
pointers to the location on the disk where the data contained in the file is stored. A
filename is used to refer to an i-node. A directory is a file that contains a list of names,
and for each name there is a pointer to the i-node for the file or directory. The following
picture offers a graphical interpretation of files, their i-nodes, and the blocks that they
occupy on disk:

Filenames

The name given to a file or directory can contain almost any character that can
be typed on the keyboard and be of almost any length. It is common to use only letters
and digits for the first character of a filename, and use only letters, digits, periods [.],
hyphens [-] and underscores [_] for the remainder of the filename. However, the
following characters should never be used in filenames:

      quotes ["], ['] or [`]
      spaces
      question marks [?]
      asterisks [*]
      slashes [/] or [\]
      greater than [>] or less than [<] signs

These characters all have special meaning to the Unix shell, and the slash [/] separates
the components of a path. Also, Unix is case sensitive. For example, the files hello.cpp,
Hello.cpp and HELLO.cpp are three different filenames representing three different files
in Unix.

Section 14: The UNIX File System
Most UNIX machines store their files on magnetic disk drives. A disk drive is a device
that can store information by making electrical imprints on a magnetic surface. One or
more heads skim close to the spinning magnetic plate, and can detect, or change, the
magnetic state of a given spot on the disk. The drives use disk controllers to position the
head at the correct place at the correct time to read from, or write to, the magnetic surface
of the plate. It is often possible to partition a single disk drive into more than one logical
storage area. This section describes how the UNIX operating system deals with a raw
storage device like a disk drive, and how it manages to make organized use of the space.

How the UNIX file system works
Every item in a UNIX file system can be defined as belonging to one of four possible
types:
Ordinary files
       Ordinary files can contain text, data, or program information. An ordinary file
       cannot contain another file, or directory. An ordinary file can be thought of as a
       one-dimensional array of bytes.
Directories
       In a previous section, we described directories as containers that can hold files,
       and other directories. A directory is actually implemented as a file that has one
       line for each item contained within the directory. Each line in a directory file
       contains only the name of the item, and a numerical reference to the location of
       the item. The reference is called an i-number, and is an index to a table known as
       the i-list. The i-list is a complete list of all the storage space available to the file
       system.
Special files
       Special files represent input/output (i/o) devices, like a tty (terminal), a disk drive,
       or a printer. Because UNIX treats such devices as files, a degree of compatibility
       can be achieved between device i/o, and ordinary file i/o, allowing for the more
       efficient use of software. Special files can be either character special files, that
       deal with streams of characters, or block special files, that operate on larger
       blocks of data. Typical block sizes are 512 bytes, 1024 bytes, and 2048 bytes.
Links
       A link is a pointer to another file. Remember that a directory is nothing more than
       a list of the names and i-numbers of files. A directory entry can be a hard link, in
       which the i-number points directly to another file. A hard link to a file is
       indistinguishable from the file itself. When a hard link is made, then the i-
       numbers of two different directory file entries point to the same inode. For that
       reason, hard links cannot span across file systems. A soft link (or symbolic link)
       provides an indirect pointer to a file. A soft link is implemented as a directory file
       entry containing a pathname. Soft links are distinguishable from files, and can
       span across file systems. Not all versions of UNIX support soft links.
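
The difference between the two kinds of link can be observed with ls -i, which prints the
i-number in front of each name. This is a minimal sketch in a scratch directory; the file
names are made up, and -d is added so that ls reports on the soft link itself rather than
on its target.

```shell
dir=$(mktemp -d)
cd "$dir"
echo 'some data' > original.txt

ln original.txt hard.txt        # hard link: a second name for the same i-node
ln -s original.txt soft.txt     # soft link: a separate entry holding a pathname

# Collect the i-number of each directory entry.
ino_orig=$(ls -di original.txt | awk '{print $1}')
ino_hard=$(ls -di hard.txt | awk '{print $1}')
ino_soft=$(ls -di soft.txt | awk '{print $1}')

# The second column of ls -l is the number of links to the i-node.
link_count=$(ls -l original.txt | awk '{print $2}')

printf 'original=%s hard=%s soft=%s links=%s\n' \
    "$ino_orig" "$ino_hard" "$ino_soft" "$link_count"

cd / && rm -rf "$dir"
```

The hard link reports the same i-number as the original file, while the soft link has an
i-number of its own.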

The I-List

When we speak of a UNIX file system, we are actually referring to an area of physical
storage represented by a single i-list. A UNIX machine may be connected to several file
systems, each with its own i-list. One of those i-lists points to a special storage area,
known as the root file system. The root file system contains the files for the operating
system itself, and must be available at all times. Other file systems are removable.
Removable file systems can be attached, or mounted, to the root file system. Typically, an
empty directory is created on the root file system as a mount point, and a removable file
system is attached there. When you issue a cd command to access the files and directories
of a mounted removable file system, your file operations will be controlled through the i-
list of the removable file system.

The purpose of the i-list is to provide the operating system with a map into the memory
of some physical storage device. The map is continually being revised, as the files are
created and removed, and as they shrink and grow in size. Thus, the mechanism of
mapping must be very flexible to accommodate drastic changes in the number and size of
files. The i-list is stored in a known location, on the same memory storage device that it
maps.

Each entry in an i-list is called an i-node. An i-node is a complex structure that provides
the necessary flexibility to track the changing file system. The i-nodes contain the
information necessary to get information from the storage device, which typically
communicates in fixed-size disk blocks. In the classic layout, an i-node contains 10 direct
pointers, which point to disk blocks on the storage device. In addition, each i-node also contains one
indirect pointer, one double indirect pointer, and one triple indirect pointer. The indirect
pointer points to a block of direct pointers. The double indirect pointer points to a block
of indirect pointers, and the triple indirect pointer points to a block of double indirect
pointers. By structuring the pointers in a geometric fashion, a single i-node can represent
a very large file.
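
How large "very large" is depends on the block size and on the size of a pointer, neither
of which is given above. As a rough worked example, the sketch below assumes 1024-byte
blocks and 4-byte pointers (both values are assumptions) and adds up the blocks reachable
through the 10 direct pointers and the three levels of indirection.

```shell
# Assumed parameters (not from the text): 1024-byte blocks, 4-byte pointers.
block_size=1024
ptr_size=4
ptrs_per_block=$((block_size / ptr_size))          # pointers held by one block

direct=10                                           # blocks reached directly
single=$ptrs_per_block                              # via the indirect pointer
double=$((ptrs_per_block * ptrs_per_block))         # via the double indirect pointer
triple=$((ptrs_per_block * ptrs_per_block * ptrs_per_block))

max_blocks=$((direct + single + double + triple))
max_bytes=$((max_blocks * block_size))
echo "addressable blocks: $max_blocks"
echo "maximum file size:  $max_bytes bytes"
```

Under these assumptions almost all of the capacity comes from the triple indirect pointer,
and the total works out to a little over 16 gigabytes.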

It now makes a little more sense to view a UNIX directory as a list of i-numbers, each i-
number referencing a specific i-node on a specific i-list. The operating system traces its
way through a file path by following the i-nodes until it reaches the direct pointers that
contain the actual location of the file on the storage device.

The file system table

Each file system that is mounted on a UNIX machine is accessed through its own block
special file. The information on each of the block special files is kept in a system
database called the file system table, and is usually located in /etc/fstab. It includes
information about the name of the device, the directory name under which it will be
mounted, and the read and write privileges for the device. It is possible to mount a file
system as "read-only," to prevent users from changing anything.

File system quotas

Although not originally part of the UNIX filesystem, quotas quickly became a widely-
used tool. Quotas allow the system administrator to place limits on the amount of space
the users can allocate. Quotas usually place restrictions on the amount of space, and the
number of files, that a user can take. The limit can be a soft limit, where only a warning is
generated, or a hard limit, where no further operations that create files will be allowed.

The command

       quota

will let you know if you're over your soft limit. Adding the -v option will provide
statistics about your disk usage.

File system related commands

Here are some commands related to file system usage, and other topics discussed in this
section:

bdf
        On HP-UX systems, reports file system usage statistics
df
        On HP-UX systems, reports on free disk blocks, and i-nodes
du
        Summarizes disk usage in a specified directory hierarchy
ln
        Creates a hard link (default), or a soft link (with -s option)
mount, umount
        Attaches, or detaches, a file system (super user only)
mkfs
        Constructs a new file system (super user only)
fsck
        Evaluates the integrity of a file system (super user only)

A brief tour of the UNIX filesystem

The actual locations and names of certain system configuration files will differ under
different implementations of UNIX. Here are some examples of important files and
directories under version 9 of the HP-UX operating system:

/hp-ux
         The kernel program
/dev
         Where special files are kept
/bin
         Executable system utilities, like sh, cp, rm
/etc
         System configuration files and databases
/lib
         Operating system and programming libraries
/tmp
         System scratch files (all users can write here)
/lost+found
         Where the file system checker puts detached files
/usr/bin
         Additional user commands
/usr/include
         Standard system header files
/usr/lib
         More programming and system call libraries
/usr/local
         Typically a place where local utilities go
/usr/man
         The manual pages are kept here

Other places to look for useful stuff

If you get an account on an unfamiliar UNIX system, take a tour of the directories listed
above, and familiarize yourself with their contents. Another way to find out what is
available is to look at the contents of your PATH environment variable:
       echo $PATH
You can use the ls command to list the contents of each directory in your path, and the
man command to get help on unfamiliar utilities. A good systems administrator will
ensure that manual pages are provided for the utilities installed on the system.
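
The tour of the PATH directories can be automated a little. The following is a sketch
only: count_path_dirs is a made-up helper name, and it simply splits a PATH-style string
on colons and reports how many entries ls finds in each directory that exists.

```shell
# Split a PATH-style string on colons and count the entries in each directory.
count_path_dirs() {
    old_ifs=$IFS
    IFS=:
    for d in $1; do
        [ -d "$d" ] || continue                 # skip entries that do not exist
        n=$(ls "$d" | wc -l | tr -d ' ')        # number of names ls reports
        printf '%6d  %s\n' "$n" "$d"
    done
    IFS=$old_ifs
}

count_path_dirs "$PATH"
```

From there, ls on a single directory and man on an unfamiliar name are the natural next
steps.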

The Unix Filesystem
Under Unix, everything in the filesystem can be thought of as a file. Thus directories are
really nothing more than files containing the names of other files and so on. In addition,
the filesystem is used to represent physical devices such as tty lines or even disk and tape
drives.

Each file on the system has what is called an inode that contains information on the file.
To see the fields of the inode, look at the manual page of the stat system call. This shows
the following fields:

     struct stat {
        dev_t   st_dev;     /* device inode resides on */
        ino_t   st_ino;     /* this inode's number */
        u_short st_mode;    /* protection */
        short   st_nlink;   /* number of hard links to the file */
        short   st_uid;     /* user-id of owner */
        short   st_gid;     /* group-id of owner */
        dev_t   st_rdev;    /* the device type, for inode that is device */
        off_t   st_size;    /* total size of file */
        time_t  st_atime;   /* file last access time */
        int     st_spare1;
        time_t  st_mtime;   /* file last modify time */
        int     st_spare2;
        time_t  st_ctime;   /* file last status change time */
        int     st_spare3;
        long    st_blksize; /* optimal blocksize for file system i/o ops */
        long    st_blocks;  /* actual number of blocks allocated */
        long    st_spare4;
        u_long  st_gennum;  /* file generation number */
     };

The key fields in the structure are st_mode (the permission bits), st_uid the UID, st_gid
the GID, and st_*time (assorted time fields).

The ls -l command is used to look at many of those fields.

umbc4:gopher> ls -l dead.letter
-rw-rw-rw- 1 gopher         307 Sep 14 12:33 dead.letter

umbc4:gopher> ls -F www
acsinfo/    index.html irc.gif                logo.gif        umbcfaq/

File times.
Unix records three file times in the inode; these are referred to as ctime, mtime, and
atime. The ctime field refers to the time the inode was last changed, mtime refers to the
last modification time of the file, and atime refers to the time the file was last accessed.

The ctime field of the inode is updated whenever the file is written to, protections are
changed, or the ownership is changed. Usually, ctime is a better indication of file
modification than the mtime field. The mtime and atime fields can easily be changed via
a system call in C (or a perl script). The ctime field is a little harder to change, although
not impossible.
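
The three times can be compared from the shell as well. The sketch below uses touch as a
stand-in for the C system call mentioned above (touch is not named in the text), sets the
access and modification times of a scratch file back to the year 2000, and then lists the
file three ways: ls -l shows mtime, ls -lu shows atime, and ls -lc shows ctime.

```shell
dir=$(mktemp -d)
f="$dir/report.txt"
echo 'draft' > "$f"

# touch -t sets atime and mtime to an arbitrary moment (here 1 Jan 2000, 12:00).
touch -t 200001011200 "$f"

# LC_ALL=C keeps the date format of ls predictable.
mtime_line=$(LC_ALL=C ls -l "$f")
atime_line=$(LC_ALL=C ls -lu "$f")
ctime_line=$(LC_ALL=C ls -lc "$f")
printf 'mtime: %s\natime: %s\nctime: %s\n' "$mtime_line" "$atime_line" "$ctime_line"

rm -rf "$dir"
```

The mtime and atime listings show the year 2000, but the ctime listing still shows the
moment touch was run, which is why ctime is the harder field to disguise.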

File times are important because they are used in many ways by system administrators. For
example, when performing backups, an incremental dump will check the mtime of the
inode to see if a file has been modified and should be written to tape. Also, system
administrators often check the mtime of certain key system files when looking for signs
of tampering (while sometimes useful, a hacker with sufficient skill will reset the mtime
back). Finally, when managing disk space, some sites have a policy where files not
accessed in a certain time are marked for archival; it is not uncommon to have certain
users deliberately set the atime or mtime to defeat this policy.

File Permissions
File permissions are used to control access to files on the system. Clearly in a multi-user
system some method must be devised that allows users to select files for sharing with
other users while at the same time selecting other files to keep private. Under Unix, the
inode maintains a set of 12 mode bits. Three of the mode bits correspond to special
permissions, while the other nine are general user permissions.

The nine general file permissions are divided into three groups of three. The three groups
correspond to owner, group, and other. Within each group there are three distinct
permissions, read, write, and execute. The nine general file permissions are listed via the
ls -l command. The following table summarizes the file permissions:
Read (r)
        Read access means you can open the file with the open system call and can read
        the contents of the file with the read system call.
Write (w)
        Write access means you can overwrite the file or modify its contents. It gives
        you access to the system calls write and truncate.
Execute (x)
        Execute access means you can specify the path of this file and run it as a program.
        When a file name is specified to the shell the shell examines the file for execute
        access and calls the exec system call. The first two bytes of the file are checked
        for the system magic number, signifying the file is an executable. If the magic
        number is not contained in the first two bytes the file is assumed to be a shell
        script.

The file permissions described above apply to plain files, devices, sockets, and FIFOs.
These permissions do not apply to directories and symbolic links. Symbolic links have no
permission control on the link; all access is resolved by examining the permissions on the
target of the link.

Some anomalies can develop. For example, it is possible to set permissions so that a
program can be run but the file cannot be read. Also, it is possible to set permissions so
that anyone on the system, except members of your group, can read the file.

Directory permissions
While directories are considered to be part of the file system, the mode bits
take on different meanings in the context of directory files.
Read (r)
       Read access permits the opendir and readdir system calls to be performed on the
       directory file. Read access on a directory will allow you to see the file names in
       the directory file.
Write (w)
       Write access allows you to add or remove file names from the directory file. With
       write access I can add files to a particular directory or use the rm command to
       delete a file.
Execute (x)
       Execute access allows access to the stat system call to display the values in the
       inode (as the ls command does). You must have execute access to the
       directory to make it your current directory or to open files inside that directory.

SUID, SGID, and Sticky bit file permissions
Unix allows programs to be endowed with privileges that belong to another user (such as
root). Unix uses three of the twelve mode bits to support special permissions. These
permissions are named SetUID (SUID), SetGID (SGID), and sticky bit permissions. Files
that have the SUID bit set will run with the effective UID of the owner of the file. Files
that have the SGID bit set will run with the effective group ID of the group owner of the
file. Files with the sticky bit have special properties. Regular files with the sticky bit set
are supposed to remain in the swap file after they have finished execution. This was to
provide better performance to the system and not force commonly accessed programs to
be loaded from the file system each time. On directory files, the sticky bit is interpreted in
such a way that only the owner of a file in that directory can delete it. This is generally
used with the /tmp directory so that users cannot delete other users' files even though all
users have write access to the directory.

The SUID and SGID permissions are indicated with the ls -l command. An s in the
execute field for owner or group indicates SUID or SGID respectively. The sticky bit is
indicated in the ls -l command by a t in the execute bit for others.
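
The s and t markers can be seen by setting the bits on scratch files. This is a minimal
sketch: the file and directory names are made up, and cut -c1-10 keeps only the type and
permission characters from the ls -l output.

```shell
dir=$(mktemp -d)
: > "$dir/prog"
mkdir "$dir/shared"

chmod 4755 "$dir/prog"      # SUID plus rwxr-xr-x
suid_mode=$(ls -l "$dir/prog" | cut -c1-10)

chmod 2755 "$dir/prog"      # SGID plus rwxr-xr-x
sgid_mode=$(ls -l "$dir/prog" | cut -c1-10)

chmod 1777 "$dir/shared"    # sticky bit plus rwxrwxrwx, as is usual for /tmp
sticky_mode=$(ls -ld "$dir/shared" | cut -c1-10)

printf '%s\n%s\n%s\n' "$suid_mode" "$sgid_mode" "$sticky_mode"

rm -rf "$dir"
```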

Setting file protections

The umask command sets the default file protection for your process. The default file
protection on the system is 0666; the permission bits set in the umask value are removed
from the default to give a new user-specified default. Thus a umask of 0077 denies access
to all users except yourself. A umask of 022 removes write permission for group and others.
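
The effect of a umask can be checked both as arithmetic and on real files. The sketch
below is illustrative only: it first clears the umask bits from 0666 with shell
arithmetic, then creates two scratch files under different umask values. The subshells
are there so that the umask of the calling shell is left untouched.

```shell
dir=$(mktemp -d)

# Arithmetic view: permission bits present in the umask are cleared.
calc_022=$(printf '%04o' $((0666 & ~0022)))
calc_077=$(printf '%04o' $((0666 & ~0077)))
printf '0666 with umask 022 -> %s\n' "$calc_022"
printf '0666 with umask 077 -> %s\n' "$calc_077"

# The same effect on newly created files.
mode_022=$( (umask 022; : > "$dir/a"; ls -l "$dir/a") | cut -c1-10 )
mode_077=$( (umask 077; : > "$dir/b"; ls -l "$dir/b") | cut -c1-10 )
printf '%s\n%s\n' "$mode_022" "$mode_077"

rm -rf "$dir"
```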


The chmod [-Rf] mode filelist command (change mode) sets the file permission for a set
of files. The command supports two options, -R and -f. The -R option is used to
recursively move down through the file system and select files matching the filelist. The
-f option is used to keep errors from being reported back (when used in shell scripts). The
values you can specify are either an absolute mode mask or a set of letters. When using the
absolute mode mask the value is specified as a three-digit octal number, and each octal
digit corresponds to a user category (owner, group, or other). Since an octal digit
requires three bits, each bit of the digit corresponds to a file permission (read, write, or
execute). Thus the octal digit 7, which has a binary value of 111, implies
read, write, and execute access.

      4000 - Setuid on execution
      2000 - setgid on execution
      1000 - set sticky bit
      0400 - read by owner
      0200 - write by owner
      0100 - execute by owner
      0040 - read by group
      0020 - write by group
      0010 - execute by group
      0004 - read by others
      0002 - write by others
      0001 - execute by others

Common file settings

      0755 - Owner has RWX, Group has RX, Others have RX.
      0700 - Owner has RWX, Group has none, Others have none.
      4755 - Owner has RWX, Group has RX, Others have RX, SUID bit is set.
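
An absolute mode is simply the individual bit values combined. The sketch below, assuming a POSIX shell whose arithmetic treats a leading 0 as octal, builds the mode for "owner RWX, group RX, others RX" and then adds the SUID bit.

```shell
# Owner rwx (0400+0200+0100), group r-x (0040+0010), others r-x (0004+0001).
mode=$(printf '%04o' $(( 0400 | 0200 | 0100 | 0040 | 0010 | 0004 | 0001 )))

# Adding the setuid bit (4000) to the same permissions.
suid_mode=$(printf '%04o' $(( 04000 | 0755 )))

echo "$mode"
echo "$suid_mode"
```

The resulting strings can be passed straight to chmod as the mode argument.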

Problems with SUID

SUID programs have a place on the system; in fact it would be difficult, if not
impossible, to run Unix without SUID programs. For example, the file /bin/passwd is a
SUID program owned by root. This makes sense: you want to restrict access to changing
passwords in some way, and SUID is a good way to do that. That said, SUID programs
are the most frequent way that security is compromised. Problems with SUID programs
fall into the following two categories.
         The root user leaves an unattended root window and someone uses that window to
         create a backdoor into the system.
Poor Software Design.
         Often programs are written SUID that can be written without being SUID root.
         Also, many programmers don't fully understand the consequences of the code
         they write and leave unintended back doors into the system.
To identify SUID programs on your system use the find command. With the correct
options, find will list all files that are SUID or SGID on the system. The two -perm
tests are grouped in parentheses so that -type f and -print apply to both of them.

find / -type f \( -perm -004000 -o -perm -002000 \) -print
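
A scaled-down version of this search can be tried safely on a scratch directory instead of /. This is a sketch, assuming a POSIX shell and find; the three file names are hypothetical.

```shell
demo=$(mktemp -d)
touch "$demo/plain" "$demo/suid" "$demo/sgid"
chmod 0755 "$demo/plain"
chmod 4755 "$demo/suid"
chmod 2755 "$demo/sgid"

# The parentheses group the two -perm tests so that -type f and -print
# apply to files matching either of them.
found=$(find "$demo" -type f \( -perm -4000 -o -perm -2000 \) -print | sort)

echo "$found"
rm -rf "$demo"
```

The file named plain should not appear in the listing, since neither special bit is set on it.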

Changing ownership of a file.
The chown command is used to change the ownership of files on the system. The format
of the command is

chown [-Rf] owner[.group] filelist

(POSIX systems write the owner and group pair as owner:group.)

Under BSD based systems you must be root to use the chown command. Under SYSV,
the chown command can be run by any user. This has interesting implications when
running disk quotas. You will occasionally see users give files to other users so the file
will count against the disk quota of the new owner. There is also a chgrp command for
changing group ownership only. The chown command can set both the owner and group
of a file and is the command generally used.

Creating Links
Unix provides both hard and symbolic links to files. Hard links must reside on the same
filesystem. A hard link is essentially a direct reference to the file by another name. Both
names share the same inode, and the reference count stored in the inode increases by one
for each link. The ln command creates hard links. The format of the command is ln
existing-file new-file.

A symbolic link is a file that indirectly points to another file. While the effect is the same
as a hard link, a symlink does not increase the reference count of a file. A symbolic link
can also point to files on other file systems and to files that don't even exist! The ln -s
command creates a symbolic link. The format of the command is ln -s existing-file
new-file. Symbolic links can greatly aid a system administrator when file systems must be
redone. Through the use of symbolic links you can create your own view of the file
system to keep up consistency for users.

Jack Suess // jack@umbc.edu
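
The difference between the two kinds of link shows up when the original name is removed. The following is a minimal sketch, assuming a POSIX shell; the file names original, hard and soft are made up for illustration.

```shell
demo=$(mktemp -d)
echo "hello" > "$demo/original"

ln    "$demo/original" "$demo/hard"   # hard link: second name, same inode
ln -s "$demo/original" "$demo/soft"   # symbolic link: points at the name

# The second field of ls -l is the reference (link) count.
links=$(ls -l "$demo/original" | awk '{print $2}')

rm "$demo/original"

# The data is still reachable through the hard link ...
hard_content=$(cat "$demo/hard")

# ... but the symbolic link now points at a name that no longer exists.
if [ -e "$demo/soft" ]; then soft_state=ok; else soft_state=dangling; fi

echo "$links $hard_content $soft_state"
rm -rf "$demo"
```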

UNIX file system
A file system is a logical method for organising and storing large amounts of information
in a way which makes it easy to manage. The file is the smallest unit in which information
is stored. The UNIX file system has several important features.

      Different types of file
      Structure of the file system
      Your home directory
      Your current directory
      Pathnames
      Access permissions

Different types of file
To you, the user, it appears as though there is only one type of file in UNIX - the file
which is used to hold your information. In fact, the UNIX filesystem contains several
types of file.

      Ordinary files
      Directories
      Special files
      Pipes

Ordinary files
This type of file is used to store your information, such as some text you have written or
an image you have drawn. This is the type of file that you usually work with.

Files which you create belong to you - you are said to "own" them - and you can set
access permissions to control which other users can have access to them. Any file is
always contained within a directory.


Directories
A directory is a file that holds other files and other directories. You can create directories
in your home directory to hold files and other sub-directories.

Having your own directory structure gives you a definable place to work from and allows
you to structure your information in a way that makes best sense to you.

Directories which you create belong to you - you are said to "own" them - and you can
set access permissions to control which other users can have access to the information
they contain.

Special files
This type of file is used to represent a real physical device such as a printer, tape drive or

It may seem unusual to think of a physical device as a file, but it allows you to send the
output of a command to a device in the same way that you send it to a file. For example:

        cat scream.au > /dev/audio

This sends the contents of the sound file scream.au to the file /dev/audio which
represents the audio device attached to the system. Guess what sound this makes?

The directory /dev contains the special files which are used to represent devices on a
UNIX system.
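
Not every system has an audio device, but /dev/null is a special file that can be expected on any UNIX system. This short sketch, assuming a POSIX shell, shows that output is redirected to it exactly as to an ordinary file and that ls marks it as a character device.

```shell
# Redirect output to a device file in the same way as to an ordinary file.
echo "this text is discarded" > /dev/null

# The first character of the long listing gives the file type:
# c for a character device, b for a block device.
dev_type=$(ls -l /dev/null | cut -c1)
echo "$dev_type"
```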


Pipes
UNIX allows you to link commands together using a pipe.

The pipe acts as a temporary file which only exists to hold data from one command until
it is read by another.

       Connecting commands together
       Unix allows you to link two or more commands together using a pipe. The pipe
       takes the standard output from one command and uses it as the standard input to
       another command.

           command1 | command2 | command3

       The | (vertical bar) character is used to represent the pipeline connecting the
       commands.

       With practice you can use pipes to create complex commands by combining
       several simpler commands together.
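
As an illustration of combining simple commands, the sketch below finds the most frequent word in a short list. It assumes a POSIX shell; the word list is made-up sample data.

```shell
# sort groups identical lines, uniq -c counts them, sort -rn puts the
# largest count first, head keeps one line, awk drops the count.
top=$(printf 'pear\napple\npear\nfig\n' |
      sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "$top"
```

Each command does one small job and none of them needs to know about the others; the pipe is the only connection between them.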

Structure of the file system
The UNIX file system is organised as a hierarchy of directories starting from a single
directory called root which is represented by a / (slash). Imagine it as being similar to the
root system of a plant or as an inverted tree structure.

Immediately below the root directory are several system directories that contain
information required by the operating system. The file holding the UNIX kernel is also
located there.

      UNIX system directories
      Home directory
      Pathnames

UNIX system directories
The standard system directories are shown below. Each one contains specific types of
file. The details may vary between different UNIX systems, but these directories should
be common to all.

 |         |       |       |          |     |       |         |
 /bin    /dev    /etc    /home      /lib  /tmp    /usr   kernel file

Home Directory
Any UNIX system can have many users on it at any one time. As a user you are given a
home directory in which you are placed whenever you log on to the system.

Users' home directories are usually grouped together under a system directory such as
/home. A large UNIX system may have several hundred users, with their home
directories grouped in subdirectories according to some schema such as their
organisational department.

Pathnames
Every file and directory in the file system can be identified by a complete list of the
names of the directories that are on the route from the root directory to that file or
directory.

Each directory name on the route is separated by a / (forward slash). For example:

   /usr/local/bin/ue

This gives the full pathname starting at the root directory and going down through the
directories usr, local and bin to the file ue - the program for the MicroEMACS editor.

You can picture the full pathname as looking like this:

                              / (root)
                              |
                      ---------------- ... ----
                      |                |
                     tmp              usr
                               ---------------- ... ----
                               |         |             |
                             games     local         spool
                                         |
                                        bin
                                         |
                                         ue
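
The parts of a full pathname can be pulled apart with standard utilities. This sketch assumes a POSIX shell and uses the pathname /usr/local/bin/ue implied by the description above (usr, local, bin, then the file ue); the file does not need to exist for the commands to work.

```shell
path=/usr/local/bin/ue

dir=$(dirname "$path")     # everything up to the last /
file=$(basename "$path")   # the final component

# Count the names on the route from the root by splitting on "/".
depth=$(printf '%s\n' "$path" | tr '/' '\n' | grep -c .)

echo "$dir $file $depth"
```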

Controlling access to your files and directories
Every file and directory in your account can be protected from or made accessible to
other users by changing its access permissions.

You can only change the permissions for files and directories that you own.

      Displaying access permissions
      Understanding access permissions
      Default access permissions
      Changing the group ownership of files
      Changing access permissions

Displaying access permissions
To display the access permissions of a file or directory use the command:

   ls -l filename (directory)

This displays a one line summary for each file or directory. For example:

   -rwxr-xr-x      1 erpl08     staff           3649 Feb 22 15:51 my.html

The first item -rwxr-xr-x represents the access permissions on this file. The
following items represent the number of links to it; the username of the person owning it;
the name of the group which owns it; its size; the time and date it was last changed, and
finally, its name.
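
The fields of that summary line can be separated in a script. The sketch below, assuming a POSIX shell, splits the sample line shown above on whitespace; a real script would run ls -l instead of using a fixed string.

```shell
line='-rwxr-xr-x      1 erpl08     staff           3649 Feb 22 15:51 my.html'

# Let the shell split the line into positional parameters $1 .. $9.
set -- $line
perms=$1
links=$2
owner=$3
group=$4
size=$5
name=$9

echo "$name is owned by $owner (group $group) with permissions $perms"
```

Splitting on whitespace is only safe for file names without spaces, which is an assumption of this example.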

Understanding access permissions
There are three types of permissions:

   r   read the file or directory
   w   write to the file or directory
   x   execute the file or search the directory

Each of these permissions can be set for any one of three types of user:

   u   the user who owns the file (usually you)
   g   members of the group that owns the file
   o   all other users

The access permissions for all three types of user can be given as a string of nine
characters:

   user       group     others
   r w x      r w x     r w x

These permissions have different meanings for files and directories.

Default access permissions
When you create a file or directory its access permissions are set to a default value. These
are usually:

   rw-------
     gives you read and write permission for your files; no access permissions for the
     group or others.
   rwx------
     gives you read, write and execute permission for your directories; no access
     permissions for the group or others.

Access permissions for your home directory are usually set to rwx--x--x or rwxr-

Changing group ownership of files and directories
Every user is a member of one or more groups. To find out which groups you belong to
use the command:

     groups

To find out which groups another user belongs to use the command:

     groups username

Your files and directories are owned by the group (or one of the groups) that you belong
to. This is known as their "group ownership".

To list the group ownership of your files:

     ls -gl

You can change the group ownership of a file or directory with the command:

       chgrp group_name file/directory_name

You must be a member of the group to which you are changing ownership.

Changing access permissions
To change the access permissions for a file or directory use the command

   chmod mode filename
   chmod mode directory_name

The "mode" consists of three parts: who the permissions apply to, how the permissions
are set and which permissions to set.
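
The three parts of a symbolic mode (who, how, which) can be seen in a few commands. This is a sketch, assuming a POSIX shell; the file name notes is hypothetical.

```shell
demo=$(mktemp -d)
touch "$demo/notes"

# who: u, g, o (or a for all); how: = sets, + adds, - removes;
# which: r, w, x.
chmod u=rw,g=r,o= "$demo/notes"
step1=$(ls -l "$demo/notes" | cut -c1-10)

chmod g+w "$demo/notes"          # add write for the group
step2=$(ls -l "$demo/notes" | cut -c1-10)

chmod go-rwx "$demo/notes"       # remove everything from group and others
step3=$(ls -l "$demo/notes" | cut -c1-10)

echo "$step1 $step2 $step3"
rm -rf "$demo"
```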

UNIX Groups
        About UNIX Groups
        Group Membership
        Group Ownership
        Group Permission Modes
        Some UNIX Commands for Working with Groups
        Troubleshooting

About UNIX Groups
Unix groups can be used to share files with a small number of University of
Delaware users. Each user on the central machines is associated with a list
containing at least one group, and each file or directory on the central Unix
machines is associated with one group. This is usually referred to as group
membership and group ownerships, respectively. That is, users are in groups
and files are owned by a group.
Users do not need to do anything to be in a group - this is all managed for them.
All users with an email account are in group 4000. Most students, registered for
class, are in a group created specifically for their class section. Researchers
using Strauss for computing work are in a group created for their computing
projects. Here at the University of Delaware we also use Unix groups for
accounting purposes, and that is why the group names are usually four digit
account project codes. Each accounting project has a project director who is
responsible for adding or removing members from the group. The project director
is an instructor for a class project, a principal investigator for a sponsored project,
or the university staff member originally requesting the project. Maintaining the
members of projects is done through the email account - access@udel.edu.

Managing group ownership of files and directories requires some action by the
user. All files or directories are owned by the user creating them. In addition to
being owned by a user, each file or directory is owned by a group. It is important
to have group ownership correct, if you ever want to share files with your group.
Group ownership does not imply group access, you must set the file access
permissions so your group can use the files. Permissions can be set to restrict
the type of access that group members have to your directories and files. You
can use different Unix groups to share files with separate sets of users.

Unix Group Membership
Users are organized into groups. Every user is in at least one group, and may
be in other groups. Group membership gives you special access to files and
directories which are permitted to that group.

Every user is in a primary group and may be in several secondary groups. The
user is said to be in a group if the group name is in their list of groups. You do not
have to be logged on to be in a group. When you are logged on you are assigned
a group which is called your current group. This is also termed "being in a group",
but it is better to say "your shell is assigned to the group". When you first log on,
you are assigned your primary group, which is also called your default group.
You can change your current group, i.e., start a shell with a secondary group as
the current group, with the newgrp command. You can change your primary
group, i.e., set a default group for your next login, from the UD&Me network web
page. You can see your group list or the group list of any user with the groups
command. For example

strauss<1>% groups dnairn anita
dnairn : 1864 0123 0191 0217 0361 0363 0379 0380 0400 0583 4000
anita : 1864 0123 0388 0400 0583 4000

This lists all the groups for dnairn and anita. The first group is the primary group;
the remaining groups are in alphabetic order. If you just type groups you will get
your own groups.
   Note: Currently the Unix systems are configured to allow only 16 total groups in this group list. If
   you see exactly 16 projects in your list then you may be in a project, but not in the Unix group
   for that project.

Group ownership of Files and Directories
Every file and directory has a username and a groupname associated with it. We
say the username is the owner and the groupname owns the file or directory. A
directory is a collection of files and possibly other sub-directories. There are
commands for managing group ownership for both directories and files. In the
example commands given in this document we use filename to indicate the
name of a file, but in most cases you can use the same command with the name
of a directory.

The long format of the listing command gives the permission modes, the owner
and the group for both files and directories. Use the ls -dl filename
command to get a one-line listing of a single file or directory. The command ll
(or ls -l ) will list all the files and directories in your current directory. The ones
beginning with a "d" are directories.

When a file or directory is first created it takes as its group the current group of
your shell. This is the default group for all login shells, but you can start another
shell with any group with the command newgrp project. If you are going to
create files for a secondary group then it is easier to create all these files from a
shell started with the newgrp command.

If you want to change the group associated with a file or directory that already
exists, use the command chgrp project filename. You must be the owner
of the file filename and you must be a member of the group project to make
the change. If the long listing shows a file which is not owned by the proper group
you must contact the owner of the file and get them to change the group.

In many cases the group ownership does not matter, but if you want to share a
file with a group, then it is important that you get the ownership correct.
Otherwise you may be inviting all users to put their large files in your directory.

Group permissions of Files and Directories
Just setting up a file to be owned by a group does not give your group any
access to the file. Granting and limiting access is done by setting permission
modes. You can see the permission modes as a set of 10 letters or dashes in the
long listing of a file or directory using the ls -dl command. The -dl option on
the ls command will list the information for the directory or file in long format.
Without the "d" all the files in the directory would be listed instead of just the
directory you asked for. For example, to get a long listing for a directory with the
name kneeland:

<2>% ls -dl kneeland
drwxr-x---     3 dnairn     0217            512 Aug 14 15:14 kneeland

The first string of characters is the mode, the following number is a count, the
user name is the owner and the 4 digit account code is the group.
mode: drwxr-x---
       Begins with a "d", so it is a directory. The owner, dnairn, has permission
       modes rwx, which is full access. Any other user in group 0217 has
       permission modes r-x, which is browsing access (can read and search
       without permission to add, rename or delete files in the directory). Every
       other user, that is, anyone who is not dnairn and not in group 0217, has
       permission modes ---, which is no access.
count: 3
       This is the link count. On traditional UNIX file systems the count for a
       directory is two plus the number of its subdirectories, so this directory has
       one subdirectory. The count is normally one if you are listing a file, unless
       the file has additional hard links.
username: dnairn
       The user with login name dnairn is the owner of the file. The owner will
       have permission modes according to the first three codes after the "d".
       The owner can always change permission modes with the chmod command.
groupname: 0217
       The directory is said to be owned by this group. Any user in group 0217 ,
       except dnairn, will have permissions granted according to the middle
       three codes in the permission modes.

Some UNIX Commands for Working with Groups
Command   Description                                               Example
chdgrp    List groups with title and remaining balance              chdgrp
groups    See groups to which you belong with primary group first   groups
id        See current group as part of your id                      id
newgrp    Start a shell in a different group                        newgrp 1234
chmod     Change permissions for directories and files              chmod g+rwx
chgrp     Change group ownership of directories and files           chgrp 1234 myfile
ls        List file permissions                                     ls -l

    You are automatically assigned to a primary group when your userid is
    created. All faculty, staff and students are put into project 4000. This
    primary group is the group assigned to any login shell. It is also called
    your default group. Use the network web page to choose your default
    group for all subsequent logins. This will be your current group at your
    next login.

    Just by itself, the chdgrp command will list your current groups with a short
    title and the remaining balance:
    Project   Title                  Remaining   Valid on hosts
    0068      WWW-IDEA CENTER           100.00   mahler strauss
    1864      US-STAFF                 1740.58   mahler strauss
    0123      RESTRICTED DATA            50.00   mahler strauss
    0583      WWW-IT                     89.57   mahler strauss
    0191      USMAILTEST-ALIAS          100.00   mahler strauss
    0217      WWWMAINT                  100.00   mahler strauss
    0380      US-QUOTA-REQUESTS         100.00   mahler strauss
    0400      US-ALTERNATE INBOX        200.00   mahler strauss
    4000      U. OF D. E-MAIL            50.00   mahler strauss

    Your default group is currently 1864.

    To change your default group please go to http://www.udel.edu/network

    This is helpful if you forgot which project number to use for your groups.

    Use the groups command to see which groups you belong to:

    <4>% groups
    1864 0123 0217 0380 0400 0583 4000

    The first group which is listed is your primary group. That may be the only
    group to which you belong.

               Both chdgrp and groups commands will list your groups
               and tell you which one is the default group. However the
               chdgrp command cannot be used to get information about
               another account, whereas the groups command can be
               used to list the groups of any user.

    Use the id command to see your current group, which is part of your identification.
    Your current group is the group name after the gid=number. This is
    usually a four digit project code.

    <54>% id
    uid=7101(dnairn) gid=1267(1864)

    The current group is the project code 1864.

   When you login, you are automatically given your primary group as your
    current group. If you belong to other groups, you can use the newgrp
    command to start a new shell with a different current group. For example,
    suppose you are a member of the 0217 group, then you can use the
    following command to start a new shell in that group:

    <5>% newgrp 0217
    <1>% groups
    1864 0123 0191 0217 0361 0363 0379 0380 0400 0583 4000
    <2>% id
    uid=7101(dnairn) gid=1829(0217)
    <3>% exit
    <6>% id
    uid=7101(dnairn) gid=1267(1864)

    Use the exit command to exit the shell and your current group will be
    restored to what it was before the newgrp command.

               The first group in the groups list is your primary group,
               whereas the group in the id information is your current group.
               You can also find all your groups with the id -a command.
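
The same distinction can be checked in a script. This sketch assumes a POSIX shell and the standard id options -g (current group ID) and -G (all group IDs); it works with numeric IDs so that it does not depend on how groups are named at a particular site.

```shell
current=$(id -g)     # numeric ID of the current group
all=$(id -G)         # numeric IDs of every group in the list

# Check that the current group is one of the groups in the list.
case " $all " in
  *" $current "*) in_list=yes ;;
  *)              in_list=no  ;;
esac

echo "current group: $current"
echo "all groups:    $all"
echo "current group is in the list: $in_list"
```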

   You can use the chmod command to set permission modes for selected
    directories and files. In general, you need to set at least read and execute
    permissions for the directories and read permissions for the files.

    The command syntax to enable all members of a group to read some file is:

           chmod g+r filename

    where filename is the name of the file you want to share. The file is now
    readable to the group associated with the file filename.

    Once you check to make sure a directory and all its files and sub-
    directories are owned by the correct group you can set the permission
    modes for everything with the one command

           chmod -R g+rX dirname

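    where dirname is the name of the directory that contains the files you
    want to share. The -R option applies the change recursively, and the capital X
    sets execute permission only on directories and on files that are already
    executable.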

    The chmod command can also be used to allow members of a group to
    put files in a directory. The owner of the directory can open a directory for
    shared writing with the command:

           chmod g=swrx,+t dirname

       where dirname is the name of the directory you want members of your
       group to create files in. The "s" is the group set-ID setting, which means all
       new files in this directory will be owned by the user putting them there, but the
       group ownership will be set to match the group of the directory, not the
       current group of the owner. This is the recommended way to keep all the
       group ownerships correct. The "+t" makes this a sticky directory. This
       means only the owner of a file (or the owner of the directory) can delete or
       rename a file. This is recommended if several users will be putting files in
       the same directory.
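
The result of this command can be inspected on a scratch directory. The following is a sketch, assuming a POSIX shell and a file system with the usual set-group-ID directory behaviour; the names shared and report are made up for illustration.

```shell
demo=$(mktemp -d)
mkdir "$demo/shared"
chmod 755 "$demo/shared"          # start from a known mode

chmod g=swrx,+t "$demo/shared"    # group rwx with set-group-ID, plus sticky
dir_mode=$(ls -ld "$demo/shared" | cut -c1-10)

# A file created inside the directory takes the directory's group.
touch "$demo/shared/report"
dir_gid=$(ls -ldn "$demo/shared" | awk '{print $4}')
file_gid=$(ls -n "$demo/shared/report" | awk '{print $4}')

echo "$dir_mode"
echo "directory group $dir_gid, new file group $file_gid"
rm -rf "$demo"
```

In the mode string the s appears in the group execute position and the t in the others execute position.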

      Use the chgrp command to change group ownership of a directory or file.
       You need to use this command to share files with users who are in the
       same UNIX group as you, when that group is not your primary group.

       The syntax for the chgrp command is:

               chgrp groupname filename

       where groupname is the name of the group with which you would like to
       share a file named filename.

                   Whereas the chmod command determines the type of access
                   that group members may have to a file or directory, the
                   chgrp command determines which group may access that
                   file or directory.
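
A quick way to confirm that chgrp did what was intended is to compare the group shown by ls with the group you asked for. This sketch assumes a POSIX shell and uses your own current group (from id -g), since that is a group you are certain to belong to; the file name shared.txt is hypothetical.

```shell
demo=$(mktemp -d)
touch "$demo/shared.txt"

target=$(id -g)                     # numeric ID of your current group
chgrp "$target" "$demo/shared.txt"

# ls -n prints numeric owner and group IDs; the group is the fourth field.
actual=$(ls -n "$demo/shared.txt" | awk '{print $4}')

echo "requested group $target, file group $actual"
rm -rf "$demo"
```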

      Use the ls command to get a long formatted listing of a file or directory.

               ls -l

       will list all the files and directories in the current directory. You can use this
       command to verify that:

          1. the files which you want to share have at least read permissions;
          2. all of the directories in the search path for those files have at least
             execute permissions;
          3. those files are owned by the group with which you want to share.

You can use a UNIX group to share an unlimited number of files on an ongoing
basis with others who have their own central UNIX account and are members of
the same UNIX group.
One of the most common mistakes in sharing files on a UNIX system is to forget
to set file permissions or to set them incorrectly. If permissions are not set
correctly, then a user will see the following message or a similar one when they
try to access your directory or files:

       permission denied

   1. Make sure you have a proper group for sharing. You must have a group
      which both of you are in, but not 4000 since every user with an e-mail
      account is in group 4000. You can check this with the command groups
      $USER username where the second username is the user name of the
      user who got the "permission denied" message. You must pick a group on
      both lists. For example, I want to share with the user anita:

      <1>% groups $USER anita
      dnairn : 1864 0123 0191 0217 0361 0363 0379 0380 0400 0583 4000
      anita : 1864 0123 0388 0400 0583 4000

      Project code 0123 is a good group name.

   2. Check to make sure the correct group owns the file with the ls -dl
      filename command. You should see this project number in this long
      formatted list as the group name.

      <2>% ls -dl myfile
      -rw-r-----   1 dnairn    1864           0 Dec 21 15:09 myfile

   3. Check to make sure the "r" code appears in the middle three permission
      modes, in this same ls command. If this is not correct type:

      chmod g+r myfile

   4. Finally check to make sure every directory above your current directory
      has the "x" permission in all three locations. This is called execute
      permissions for all, or symbolically "a+x". You can use the . as the current
      directory and .. for parent directory to list several levels:

      <2>% ls -dl . .. ../.. ../../..
      drwxrwsr-x    2 dnairn   1864      512 Oct 16 10:42 .
      drwxrwsr-t    3 dnairn   1864      512 Oct 16 10:26 ..
      drwxr-xr-x   84 dnairn   1864     6656 Dec 21 11:07 ../..
      drwxr-xr-x  198 root     root     9216 Aug 22 04:10 ../../..

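
Instead of typing a chain of .. entries, a short loop can list every directory from a starting point up to the root. This is a sketch, assuming a POSIX shell; the scratch directories a and b are hypothetical and stand in for the directory you want to share.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/b"

# Resolve to an absolute pathname so that dirname eventually reaches /.
dir=$(cd "$demo/a/b" && pwd)

levels=0
while :; do
  ls -ld "$dir"                 # check the x permissions on this level
  levels=$((levels + 1))
  [ "$dir" = "/" ] && break
  dir=$(dirname "$dir")
done

echo "checked $levels directories"
rm -rf "$demo"
```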
Another common problem is to set file permissions for existing files, but to
neglect to set permissions for newly created files. By default, others cannot
access your files. You must give explicit permissions to each file when it is
created.

Users, Groups, and the Superuser
      Users and Groups
      Special Usernames
      su: Changing Who You Claim to Be

In Chapter 3, Users and Passwords, we explained that every UNIX user has a user name
to define an account. In this chapter, we'll describe how the operating system views users,
and we'll discuss how accounts and groups are used to define access privileges for users.
We will also discuss how you may assume the identity of another user or group so as to
temporarily use their access rights.

4.1 Users and Groups
Although every UNIX user has a username of up to eight characters, inside the
computer UNIX represents each user by a single number: the user identifier (UID).
Usually, the UNIX system administrator gives every user on the computer a different
UID.
UNIX also uses special usernames for a variety of system functions. As with usernames
associated with human users, system usernames usually have their own UIDs as well.
Here are some common "users" on various versions of UNIX:

      root, the superuser, which performs accounting and low-level system functions.
      daemon or sys, which handles some aspects of the network. This username is also
       associated with other utility systems, such as the print spoolers, on some versions
       of UNIX.
      agent, which handles aspects of electronic mail. On many systems, agent has the
       same UID as daemon.
      guest, which is used for site visitors to access the system.
      ftp, which is used for anonymous FTP access.
      uucp, which manages the UUCP system.
      news, which is used for Usenet news.
      lp, which is used for the printer system.[1]

               [1] lp stands for line printer, although these days most people seem
               to be using laser printers.

      nobody, which is a user that owns no files and is sometimes used as a default user
       for unprivileged operations.

Here is an example of an /etc/passwd file containing these system users:
       ftp:*:3:3:FTP User:/usr/spool/ftp:

Notice that most of these accounts do not have "people names," and that all except root
have a password field of *. This prevents people from logging into these accounts from
the UNIX login: prompt, as we'll discuss later.[2]

       [2] This does not prevent people from logging in if there are trusted
       hosts/users on that account; we'll describe these later in the book.

       NOTE: There is nothing magical about these particular account names.
       All UNIX privileges are determined by the UID (and sometimes the group
       ID, or GID), and not directly by the account name. Thus, an account with
       name root and UID 1005 would have no special privileges, but an account
       named mortimer with UID 0 would be a superuser. In general, you should
       avoid creating users with a UID of 0 other than root, and you should avoid
       using the name root for a regular user account. In this book, we will use
       the terms "root" and "superuser" interchangeably.

4.1.1 User Identifiers (UIDs)

UIDs are historically unsigned 16-bit integers, which means they can range from 0 to
65535. UIDs between 0 and 9 are typically used for system functions; UIDs for humans
usually begin at 20 or 100. Some versions of UNIX are beginning to support 32-bit UIDs.
In a few older versions of UNIX, UIDs are signed 16-bit integers, usually ranging from
-32768 to 32767.

UNIX keeps the mapping between usernames and UIDs in the file /etc/passwd. Each
user's UID is stored in the field after the one containing the user's encrypted password.
For example, consider the sample /etc/passwd entry presented in Chapter 3:


In this example, Rachel's username is rachel and her UID is 181.

The UID is the actual information that the operating system uses to identify the user;
usernames are provided merely as a convenience for humans. If two users are assigned
the same UID, UNIX views them as the same user, even if they have different usernames
and passwords. Two users with the same UID can freely read and delete each other's files
and can kill each other's programs. Giving two users the same UID is almost always a
bad idea; we'll discuss a few exceptions in the next section.
4.1.2 Multiple Accounts with the Same UID

There are two exceptions when having multiple usernames with the same UID is sensible.
The first is for logins used for the UUCP system. In this case, it is desirable to have
multiple UUCP logins with different passwords and usernames, but all with the same
UID. This allows you to track logins from separate sites, but still allows each of them
access to the shared files. Ways of securing the UUCP system are described in detail in
Chapter 15, UUCP.

The second exception to the rule about only one username per UID is when you have
multiple people with access to a system account, including the superuser account, and
you want to track their activities via the audit trail. By creating separate usernames with
the same UID, and giving the users access to only one of these identities, you can do
some monitoring of usage. You can also disable access for one person without disabling
it for all.

As an example, consider the case where you may have three people helping administer
your Usenet news software and files. The password file entry for news is duplicated in the
/etc/passwd file as follows:

       ftp:*:3:3:FTP User:/usr/spool/ftp:
       newsa:Wx3uoih3B.Aee:6:6:News co-admin
       newsb:ABll2qmPi/fty:6:6:News co-admin
       newsc:x/qnr4sa70uQz:6:6:News co-admin

Each of the three helpers has a unique password, so they can be shut out of the news
account, if necessary, without denying access to the others. Also, the activities of each
can now be tracked if the audit mechanisms record the account name instead of the UID
(most do, as we describe in Chapter 10, Auditing and Logging). Because the first entry in
the passwd file for UID 6 has the account name news, any listing of file ownership will
show files belonging to user news, not to newsb or one of the other users. Also note that
each user can pick his or her own command interpreter (shell) without inflicting that
choice on the others.

This approach should only be used for system-level accounts, not for personal accounts.
Furthermore, you should institute rules in your organization that require users (Sabrina,
Rachel, and Fred) to log in to their own personal accounts first, then su to their news
maintenance accounts - this provides another level of accountability and identity
verification. (See the discussion of su later in this chapter.) Unfortunately, in most
versions of UNIX, there is no way to enforce this requirement, except by preventing root
from logging on to particular devices.
4.1.3 Groups and Group Identifiers (GIDs)

Every UNIX user belongs to one or more groups. Like user accounts, groups have both a
groupname and a group identification number (GID). GID values are also historically
16-bit integers.

As the name implies, UNIX groups are used to group users together. As with usernames,
groupnames and numbers are assigned by the system administrator when each user's
account is created. Groups can be used by the system administrator to designate sets of
users who are allowed to read, write, and/or execute specific files, directories, or devices.

Each user belongs to a primary group that is stored in the /etc/passwd file. The GID of
the user's primary group follows the user's UID. Consider, again, our /etc/passwd
example:


In this example, Rachel's primary GID is 100.

Groups provide a handy mechanism for treating a number of users in a certain way. For
example, you might want to set up a group for a team of students working on a project so
that students in the group, but nobody else, can read and modify the team's files.

Groups can also be used to restrict access to sensitive information or specially licensed
applications to a particular set of users: for example, many UNIX computers are set up so
that only users who belong to the kmem group can examine the operating system's kernel
memory. The ingres group is commonly used to allow only registered users to execute
the commercial Ingres database program. And a sources group might be limited to people
who have signed nondisclosure forms so as to be able to view the source code for some
software.

       NOTE: Some special versions of UNIX support MAC (Mandatory Access
       Controls), which have controls based on data labeling instead of, or in
       addition to, the traditional UNIX DAC (Discretionary Access Controls).
       MAC-based systems do not use traditional UNIX groups. Instead, the GID
       values and the /etc/group file may be used to specify security access
       control labeling or to point to capability lists. If you are using one of these
       systems, you should consult the vendor documentation to ascertain what
       the actual format and use of these values might be.

The /etc/group file

The /etc/group file contains the database that lists every group on your computer and its
corresponding GID. Its format is similar to the format used by the /etc/passwd file.[3]
       [3] As with the password file, if your site is running NIS, NIS+, NetInfo,
       or DCE, the /etc/group file may be incomplete or missing. See the
       discussion in "The /etc/passwd File and Network Databases" in Chapter 3.

Here is a sample /etc/group file that defines five groups: wheel, uucp, vision, startrek, and
users:


The first line of this file defines the wheel group. The fields are explained in Table 4.1.

Table 4.1: Wheel Group Fields
Field Contents Description
wheel          The group name
*              The group's "password" (described below)
0              The group's GID
root, rachel   The list of the users who are in the group


Most versions of UNIX use the wheel group[4] as the list of all of the computer's system
administrators (in this case, rachel and the root user are the only members). The second
line of this file defines the uucp group. The only member in the uucp group is the uucp
user. The third line defines the users group; the users group does not explicitly list any
users; each user on this particular system is a member of the users group by virtue of their
individual entries in the /etc/passwd file.

       [4] Not all versions of UNIX call this group wheel; this is group 0,
       regardless of what it is named.

The remaining two lines define two groups of users. The vision group includes the users
keith, arlin and janice. The startrek group contains the users janice, karen, and arlin.
Notice that the order in which the usernames are listed on each line is not important.
(This group is depicted graphically in Figure 4.1.)

Remember, the users mentioned in the /etc/group file are in these groups in addition to
the groups mentioned as their primary groups in the file /etc/passwd. For example,
Rachel is in the users group even though she does not appear in that group in the file
/etc/group because her primary group number is 100. On some versions of UNIX, you
can issue the groups command or the id command to list which groups you are currently
in.

Groups are handled differently by versions of System V UNIX before Release 4 and by
Berkeley UNIX; SVR4 incorporates the semantics of BSD groups.

       NOTE: It is not necessary for there to be an entry in the /etc/group file for
       a group to exist! As with UIDs and account names, UNIX actually uses
       only the integer part of the GID for all settings and permissions. The name
       in the /etc/group file is simply a convenience for the users - a means of
       associating a mnemonic with the GID value.

Figure 4.1 illustrates how users can be included in multiple groups.

Figure 4.1: Users and groups

Groups and older AT&T UNIX

Under versions of AT&T UNIX before SVR4, a user can occupy only a single group at a
time. To change your current group, you must use the newgrp command. The newgrp
command takes a single argument: the name of the group that you're attempting to
change into. If the newgrp command succeeds, it execs a shell that has a different GID,
but the same UID:

       $ newgrp news

This is similar to the su command used to change UID.

Usually, you'll want to change into only those groups in which you're already a member;
that is, groups that have your username mentioned on their line in the /etc/group file.
However, the newgrp command also allows you to change into a group of which you're
not normally a member. For this purpose, UNIX uses the group password field of the
/etc/group file. If you try to change into a group of which you're not a member, the
newgrp command will prompt you for that group's password. If the password you type
agrees with the password for the group stored in the /etc/group file, the newgrp command
temporarily puts you into the group by spawning a subshell with that group:

       $ newgrp fiction
       password: rates34

You're now free to exercise all of the rights and privileges of the fiction group.

The password in the /etc/group file is interpreted exactly like the passwords in the
/etc/passwd file, including salts (described in Chapter 8, Defending Your Accounts).
However, most systems do not have a program to install or change the passwords in this
file. To set a group password, you must first assign it to a user with the passwd command,
then use a text editor to copy the encrypted password out of the /etc/passwd file and into
the /etc/group file. Alternatively, you can encode the password using the /usr/lib/makekey
program (if present) and edit the result into the /etc/group file in the appropriate place.[5]

       [5] We suspect that passwords have seldom been used in the group file.
       Otherwise, by now someone would have developed an easier, one-step
       method of updating the passwords. UNIX gurus tend to write tools for
       anything they have to do more than twice and that require more than a few
       simple steps. Updating passwords in the group file is an obvious
       candidate, but a corresponding tool has not been developed. Ergo, the
       operation must not be common.

       NOTE: Some versions of UNIX, such as AIX, do not support group
       passwords.

Groups and BSD or SVR4 UNIX

One of the many enhancements that the Berkeley group made to the UNIX operating
system was to allow users to reside in more than one group at a time. When a user logs in
to a Berkeley UNIX system, the program /bin/login scans the entire /etc/group file and
places the user into all of the groups in which that user is listed.[6] The user is also placed
in the primary group listed in the user's /etc/passwd file entry. When the system needs to
determine access rights to something based on the user's membership in a group, it
checks all the current groups for the user to determine if that access should be granted (or
denied).

       [6] If you are on a system that uses NIS, NIS+ or some other system for
       managing user accounts throughout a network, these network databases
         will be referenced as well. For more information, see Chapter 19, RPC,
         NIS, NIS+, and Kerberos.

Thus, Berkeley and SVR4 UNIX have no obvious need for the newgrp command -
indeed, many of the versions do not include it. However, there may be a need for it in
some cases. If you have a group entry with no users listed but a valid password field, you
might want to have some users run the newgrp program to enter that group. This action
will be logged in the audit files, and can be used for accounting or activity tracking.
However, situations where you might want to use this are likely to be rare. Note,
however, that some systems, including AIX, do not support use of a password in the
/etc/group file, although they may allow use of the newgrp command to change primary
group.

From Wikipedia, the free encyclopedia


On many computer operating systems, the superuser, or root, is a special user account
used for system administration.

Many older operating systems on computers intended for personal and home use,
including MS-DOS and Windows 9x, do not have the concept of multiple accounts and
thus have no separate administrative account; anyone using the system has full privileges.
Separation of administrative privileges from normal user privileges makes an operating
system more resistant to viruses and other malware, and the lack of this separation in
these operating systems has been cited as one major source of their insecurity.
However, requiring a user to validate his superuser status for simple administrative
functions can inconvenience the administrator as he is required to repeatedly enter his
login information.


Unix and Unix-like
In Unix-style computer operating systems, root is the conventional name of the user who
has all rights or permissions (to all files and programs) in all modes (single- or
multi-user). Alternative names include baron in BeOS and avatar on some Unix variants. BSD
often provides a toor ("root" backwards) account in addition to a root account for better
usability while performing administrative tasks. The root user can do many things an
ordinary user cannot, such as changing the ownership of files and binding to ports
numbered below 1024. The etymology of the term may be that root is the only user
account with permission to modify the root directory of a Unix system.[1]

It is never good practice for anyone to use root as their normal user account, since simple
typographical errors in entering commands can cause major damage to the system. It is
advisable to create a normal user account instead and then use the su command to switch
when necessary. The sudo utility can also be used instead to allow a measure of
graduated access.

Many operating systems, such as Mac OS X and some Linux distributions, allow
administrator accounts which provide greater access while shielding the user from most
of the pitfalls of full root access. In some cases (as in Ubuntu[2]), the root account is
disabled by default, and must be specifically enabled. In a few systems, such as Plan 9,
there is no superuser at all.

Software defects which allow a user to "gain root" (to execute with superuser privileges
code supplied by that user) are a major computer security issue, and the fixing of such
software is a major part of maintaining a secure system. One common way of gaining
root is to cause a buffer overflow in a program already running with superuser privileges.
This is often avoided in modern operating systems by running critical services, such as
httpd, under a unique limited account. A related term is rootkit: software that uses root
privileges to conceal certain data from the system administrator.

Windows NT
In Windows NT and later systems derived from it (Windows 2000, Windows XP,
Windows Server 2003 and Windows Vista), there may or may not be a superuser. By
default, there is a superuser named Administrator, although it is not an exact analogue
of the Unix root superuser account. Administrator does not have all the privileges of
root because some superuser privileges are assigned to the Local System account in
Windows NT. The user may gain access to the Local System account by making Task
Scheduler start a command prompt. Since Task Scheduler starts programs as Local
System, the user can run any program as Local System. However, this may be regarded
as a vulnerability.

In Windows Vista or later, you can use User Account Control to run a process with
elevated privileges, for example by right-clicking the program and selecting Run as
administrator (Windows 2000 users must hold the SHIFT key while right-clicking). In
earlier versions of Windows, the command runas fulfils this task (see Microsoft's
documentation for runas for more details).

The superuser is a privileged user who has unrestricted access to the whole system; all
commands and all files regardless of their permissions. By convention the username for
the superuser account is root.

The root account is necessary as many system administration files and programs need to
be kept separate from the executables available to non-privileged users. Unix allows users
to set permissions on the files they own. A system administrator may need to override
those permissions.

Access to the root account is restricted by a password. Because the superuser has the
potential to affect the security of the entire system, it is recommended that this password
be given only to people who absolutely need it, such as the system administrator. It is
also a good idea to change the password on this account often. On BSD derivative
systems, users who have access to the root account are frequently listed as members of
group 0, also known as wheel.

There are several ways to log into the root account. If a system comes up in single user
mode, whoever is logged in automatically has root privileges. When a system is already
up in multi user mode, a user can log in directly as root. This is not recommended as it is
easy to take superuser privileges for granted and perform mundane tasks in this mode.
When a user is already logged in, issuing the su command, without options, will cause the
system to prompt for the root password. Once it is given the user becomes root.

The root account has its own shell and frequently displays a prompt that is different from
the normal user prompt. If this is not the case, changing the default shell for the root
account will change the prompt. Commands and programs that a system administrator
will need as root are kept in /etc to decrease the chances of a user trying them by
accident. For example, /etc contains /etc/passwd, which holds a list of all users who have
permission to use the system.

Because the root account has extensive privileges, it has an equal potential for destruction.
This is amplified by the fact that safeguards built into some commands do not apply to
this account. For example, the superuser may change another user's password without
knowing the old password. The superuser can also mount and unmount file systems,
remove any file or directory, and shut down the entire system. The root account should be
used with caution and only when necessary to perform a given task. A misplaced
keystroke in this mode can have disastrous results.

Further protection can be obtained by placing restrictions on the root account. On BSD
based systems like SunOS, the su program can be modified so that any user who tries the
command will be checked to see if they are a member of group 0. BSD 4.3 has an
additional security measure known as the "secure terminal concept". In this case the
terminal itself will not accept a login as root unless it is designated as secure in /etc/ttys.

Superuser (aka "root") is the UNIX System Manager

On any system someone must be able to kill any runaway program, purge corrupted files,
reset passwords when users forget them, remove users' permission to use the system, and
a myriad of other system management tasks. On MPE this person is called the System
Manager (actually, any user with SM capability).

On UNIX this special user is known as superuser or root (not to be confused with the
root directory). Superuser can override file security and do almost anything she wants on
the system (she cannot see your password, since it is encrypted, but she can change it). In
fact, any user with a userid of 0 is a superuser. Naturally, such users should always have
a password.

It is not good practice for the system administrator to always log on as superuser. It is too
easy to make a trivial mistake and damage the system, perhaps by rm * in an important
directory. Instead, log on as a regular user, then switch to superuser with the su command
when you need it.

Root is Also the Start of the Directory

In a Hierarchical File System, one directory is the root or start of the tree. Other
directories hang off root and they in turn can have subdirectories. On UNIX and POSIX,
root is specified as a forward slash "/". On DOS, root is specified as a backward slash
"\". This meaning of root should not be confused with the alternate meaning of root as
the UNIX system manager (that is, superuser).

Users, Groups and File Ownership in Unix

Every UNIX system has an account called root, or "Super User". This is an account
that has almost complete control of the system, and is in charge of maintaining it. It is
not constrained by any of the permissions or ownership of a file, it is able to create,
destroy, modify, and view any file on the system. In addition, the account is able to
perform functions such as adding/deleting users, setting usage limits, disk
administration, accounting, and a whole slew of other administrative tasks. Root may
be one person, or a group of people that are in charge of a system. Usually, the
system administrator will have a normal user account for day-to-day use that does not
have the responsibilities or privileges (and the dangers that accompany them) of root.

Since UNIX is a multiuser system, it needs a way to keep track of all the different
users. Each user account has a unique name, called a login ID, that is used to log in to
the system. That name, along with other information about the account
(encrypted password, real name, shell, etc.) is usually kept in a file called
/etc/passwd. Some UNIX systems keep account information in different places. When
an account on a UNIX system is created, it is assigned a UID which is a number that
the system uses to keep track of who you are. The owner of each file is stored in the
inode as a numeric UID, not as a login ID. For example, the root account has a
UID of 0. If you do an ls -l on any files that are owned by root, UNIX sees that the
file is owned by UID 0, so it looks up 0 in the password information. It then pulls out
the login-id of the UID 0 and prints root as the owner.

When an account is created it is assigned to a default group depending on what type
of account it is, and what it is to be used for. Like login ids, groups are also stored as
numbers. Only the root account has the privilege of creating, deleting, or assigning
group membership. If users need to be in additional groups, then they must be
assigned to a secondary group; in most cases this means the administrator adds
them to the /etc/group file. /etc/group is a list of group names, group passwords
(mostly unused), group ID numbers (GID), and a list of members of the groups.

Each file in the UNIX file system has two types of owners:

      User: The user who owns a file is the only user (other than root) who has the
       privilege to change the permission and group ownership of a file.
      Group: Group ownership on the other hand is merely a way of granting
       privileges to a group of users.

If a file is readable by its group but not by everyone else, then, apart from the owner
and root, only users in that group can read the file. While users cannot change the
ownership of a file, an owner of a file is permitted to change the group association,
provided they are in the new group. If a user is in two groups, firstaid and student,
and a file they own is in the student group, chgrp will allow them to change the
file's group ownership to firstaid. Even though users cannot manipulate
groups and group membership on standard UNIX machines, AFS supports these
features, and is available on the WAM machines.

To find out what group a person is in, use the groups command. By itself, it will tell
you what group(s) you are in, or if you use another username as an argument, it will
tell you what group(s) that person is in. The command whoami will tell you your
username, although you probably already know it.


fork: a system call that creates a new process under the UNIX operating system

Crucial warning: For CS330 Labs, you ABSOLUTELY MUST use a Linux machine
instead of hercules for any program containing a fork() system call. This requirement is
due to the possibility of a fork bomb, which is a runaway process that creates too many
other processes either directly or indirectly.

At present, the CC compiler does not run on linux. Use g++ instead. The g++ on linux is

For example, if a program contains a call to fork( ), the execution of the program results
in the execution of two processes. One process is created to start executing the program.
When the fork( ) system call is executed, another process is created. The original process
is called the parent process and the second process is called the child process. The child
process is an almost exact copy of the parent process. Both processes continue executing
from the point where the fork( ) call returns execution to the main program. Since UNIX
is a time-shared operating system, the two processes can execute concurrently.

      fork2.cpp program creates two processes, as does fork3.cpp
      The value returned by fork( ) is stored in a variable of type pid_t, which is really
       an integer. Since the value of this variable is not used, we could have ignored the
       result of fork(), i.e., type cast it to void
      With fork2.cpp, each process keeps printing its process id (pid) over and over
      You have to kill these processes with control-c
      For documentation on fork( ), see
         man 2 fork

       To find out the id of a process, the getpid( ) system call is used
       For documentation on getpid( ), see

         man 2 getpid
Some differences between the child and parent process are:

       different pids
       in the parent, fork( ) returns the pid of the child process if a child process is
        created
       in the child, fork( ) always returns 0
       separate copies of all data, including variables with their current values
       separate program counter (PC) indicating where to execute next; originally both
        have the same value but they are thereafter separate
       after fork, the two processes do not share variables

fork returns:

       the pid of the new child process: to the parent process; this is equivalent to telling
        the parent the name of its child.
       0: to the child process
       -1: if there is an error; i.e., fork( ) failed because a new process could not be
        created


wait:

       a system call that causes the calling process to wait until one of its child
        processes terminates (the OS signals the parent when this happens)
       most commonly used in a parent process to wait for the signal that the OS sends
        to a parent when its child is terminated
       returns the pid of the child process that terminated
       see

         man 2 wait

        for documentation

Suppose we want to have the parent wait for a signal from the child process:

       In this simple example, the parent waits for the child to finish.
       The parent process enters a do/while loop. When it receives a signal, it checks the
        wait_result, which tells it which process sent the signal. If it is equal to the child's
        pid, the parent can terminate. Otherwise, it waits for another signal.

Suppose we want the new process to do something quite different from the parent
process, namely run a different program. The execl system call loads a new executable
into memory and associates it with the current process. In other words, it changes things
so that this process starts executing executable code from a different file.

        In the following program, execl is called to load the file /bin/ls into memory. The
         argument vector is set to the words "ls" and "-l".
        argv[0] is set to "ls"
        argv[1] is set to "-l"
        The number of arguments to execl can vary so we end with NULL to show that
         there are no more arguments.

A sample program with fork/exec and wait can execute the system program named ls,
using the executable file called /bin/ls and using one parameter "-l".
The overall result is equivalent to doing: ls -l

process control block (pcb): data structure in OS that represents one process

Fork Bombs

        You are ABSOLUTELY FORBIDDEN to experiment with programs involving
         fork on hercules.
        Why? Because almost every term, some CS330 student crashes hercules with a
         fork bomb, often at a crucial time for other classes.
        So test your programs involving fork on a Linux workstation
        Below is a program with a loop that creates 2^n processes for n iterations through
         the loop. I have restricted n to 4 or less (16 processes or less)
        If argv[1] is 1, we get 2 processes, if it is 2, we get 4 processes, if it is 3, we get 8
         processes, and if it is 4, we get 16 processes.
        The danger is that inadvertently, when experimenting with fork, you will set up a
         process that creates a child process, and then both of these processes will create
         child processes, and then all of these processes will create child processes, and so
         on until the system is overwhelmed. These processes are hard for the system
         manager to kill because they keep creating more processes whenever resources
         are available.
        Such a program is called a fork bomb.
        On the other hand, it is fine for you to experiment with fork on the Linux
         machines. These Linux machines have names like "a035178", whereas hercules
         has the name "hercules".

To help understand the forkbomb, consider this series of simple programs, which create
2^0 = 1, 2^1 = 2, 2^2 = 4, and 2^3 = 8 processes because they have 0, 1, 2, and 3 fork system
calls, respectively:

simplefork0.cpp - no fork system call

simplefork1.cpp - one fork system call

simplefork2.cpp - two fork system calls

simplefork3.cpp - three fork system calls
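
The simplefork sources are not included here. As an illustration only, the two-fork case can be sketched as straight-line code in which each process counts itself plus whatever its children report through their exit statuses. The function name count_two_forks and this counting scheme are assumptions, not the contents of simplefork2.cpp.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: executes two fork() calls in a row and returns
// the total number of processes that resulted (4 if both forks succeed
// in every process).
int count_two_forks() {
    pid_t root = getpid();
    fork();  // 1 process becomes 2
    fork();  // each of those 2 forks again, giving 4
    // Count this process, then add the subtotal each child reports.
    int total = 1;
    int status = 0;
    while (wait(&status) > 0) {
        if (WIFEXITED(status)) total += WEXITSTATUS(status);
    }
    // Descendants report their subtotal upward and stop here.
    if (getpid() != root) _exit(total);
    return total;
}
```

Adding a third fork() line before the counting code would double the total again to 8, which is the pattern the series of programs demonstrates.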


Implementation of the fork() system call:

      In UNIX, fork is a system call that creates a new pcb.
      It copies most information from the current process's pcb into the next free location
       in the process table.
      The parent and child processes will now both be ready to execute.
      One can be left/placed in the RUNNING state and the other in the READY state.
      Henceforth, both will take turns in using the processor (along with all other
       processes).
      Which one is chosen to run first depends on the exact version of the operating
       system.
      In Windows, the CreateProcess system call creates a new process initialized to
       default/null entries, but in UNIX the entries are copied from an existing process,
       namely the parent process.
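
The copying step can be pictured with a toy model. The sketch below is not kernel code: the types Pcb and ProcessTable, the function toy_fork, the field names, and the table size of 8 are all invented for illustration, and a real pcb holds far more state.

```cpp
#include <array>
#include <cstddef>

// Toy model only: a real kernel tracks much more per process.
enum class State { FREE, READY, RUNNING };

struct Pcb {
    int pid = 0;
    int ppid = 0;
    State state = State::FREE;
    int program_counter = 0;  // stands in for the saved register set
};

struct ProcessTable {
    std::array<Pcb, 8> slots{};  // every slot starts out FREE
    int next_pid = 1;
};

// Copies the parent's pcb into the first free slot of the table,
// then fixes up the fields that must differ in the child.
// Returns the child's slot index, or -1 if the table is full
// or parent_slot does not name a live process.
int toy_fork(ProcessTable& table, int parent_slot) {
    if (parent_slot < 0 ||
        parent_slot >= static_cast<int>(table.slots.size())) return -1;
    const Pcb parent = table.slots[parent_slot];
    if (parent.state == State::FREE) return -1;
    for (std::size_t i = 0; i < table.slots.size(); ++i) {
        if (table.slots[i].state == State::FREE) {
            Pcb child = parent;              // copy most information
            child.pid = table.next_pid++;    // new identity
            child.ppid = parent.pid;         // remember the parent
            child.state = State::READY;      // parent stays RUNNING here
            table.slots[i] = child;
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In this model the parent keeps the RUNNING state and the child is placed in READY; as the notes say, a real system may choose either one to run first.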

Fork (operating system)
From Wikipedia, the free encyclopedia


In computing, when a process forks, it creates a copy of itself, which is called a "child
process". The original process is then called the "parent process". More generally, a fork
in a multithreading environment means that a thread of execution is duplicated, creating a
child thread from the parent thread.
Under Unix and Unix-like operating systems, the parent and the child processes can tell
each other apart by examining the return value of the fork() system call. In the child
process, the return value of fork() is 0, whereas the return value in the parent process is
the PID of the newly created child process.
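
A small sketch can make the two return values concrete. Here the child sends its own PID back through a pipe so the parent can compare it with what fork() returned. The function name fork_return_matches_child_pid is hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns true if the value fork() gave the parent
// equals the PID the child reports for itself.
bool fork_return_matches_child_pid() {
    int fds[2];
    if (pipe(fds) < 0) return false;
    pid_t ret = fork();
    if (ret < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (ret == 0) {
        // Child branch: fork() returned 0 in this process.
        close(fds[0]);
        pid_t me = getpid();
        ssize_t w = write(fds[1], &me, sizeof me);
        _exit(w == static_cast<ssize_t>(sizeof me) ? 0 : 1);
    }
    // Parent branch: fork() returned the child's PID.
    close(fds[1]);
    pid_t reported = -1;
    ssize_t r = read(fds[0], &reported, sizeof reported);
    close(fds[0]);
    waitpid(ret, nullptr, 0);  // reap the child
    return r == static_cast<ssize_t>(sizeof reported) && reported == ret;
}
```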

The fork operation creates a separate address space for the child. The child process has an
exact copy of all the memory segments of the parent process, though if copy-on-write
semantics are implemented, actual physical memory may not be assigned (i.e., both
processes may share the same physical memory segments for a while). Both the parent
and child processes possess the same code segments, but execute independently of each
other.
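
The separation of address spaces can be observed directly. In the sketch below (the function name parent_value_after_child_write is hypothetical), the child overwrites a variable and reports the new value through its exit status, while the parent's copy keeps its original value.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the parent's copy of `value` after the
// child has modified its own copy, or -1 if anything went wrong.
int parent_value_after_child_write() {
    int value = 1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        value = 99;    // changes only the child's copy
        _exit(value);  // report the child's view via the exit status
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    // Confirm the child really saw 99 in its own address space.
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 99) return -1;
    return value;      // the parent's copy was never touched
}
```

The same behavior holds whether or not the kernel uses copy-on-write; that is an optimization of when pages are physically copied, not of what each process sees.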

Importance of Fork In Unix
Forking is an important part of Unix, critical to the support of its design philosophy,
which encourages the development of filters. In Unix, a filter is a small program that
reads its input from stdin, and writes its output to stdout. A pipeline of these commands
can be strung together by a shell to create new commands. For example, one can string
together the output of the find(1) command and the input of the wc(1) command to
create a new command that will print a count of files ending in ".cpp" found in the
current directory and any subdirectories, as follows:

$ find . -name "*.cpp" -print | wc -l

In order to accomplish this, the shell forks itself and uses pipes, a form of interprocess
communication, to tie the output of the find command to the input of the wc command.
Two child processes are created, one for each command (find and wc). These child
processes are overlaid with the code associated with the programs they are intended to
execute, using the exec(3) family of system calls: in the above example, find overlays
the first child process and wc overlays the second.
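
What the shell does for a two-command pipeline can be sketched as follows. This is a simplified illustration, not how any particular shell is implemented; the function name run_pipeline is hypothetical, and error handling is kept minimal.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `left | right` and returns the exit status
// of the right-hand command, or -1 on failure. Each argument vector
// must end with a null pointer, as execvp requires.
int run_pipeline(const char* const left[], const char* const right[]) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    pid_t lp = fork();
    if (lp < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (lp == 0) {
        // First child: its standard output becomes the pipe's write end.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], const_cast<char* const*>(left));
        _exit(127);  // reached only if exec failed
    }

    pid_t rp = fork();
    if (rp < 0) {
        close(fds[0]);
        close(fds[1]);
        waitpid(lp, nullptr, 0);
        return -1;
    }
    if (rp == 0) {
        // Second child: its standard input becomes the pipe's read end.
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], const_cast<char* const*>(right));
        _exit(127);
    }

    // Parent (the "shell"): close both ends so the reader sees EOF,
    // then wait for both children.
    close(fds[0]);
    close(fds[1]);
    int status = 0;
    waitpid(lp, nullptr, 0);
    if (waitpid(rp, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the example in the text, left would be {"find", ".", "-name", "*.cpp", "-print", nullptr} and right would be {"wc", "-l", nullptr}; the count is then printed by wc on the parent's standard output.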

More generally, forking is also performed by the shell each time a user issues a
command. A child process is created by forking the shell, and the child process is
overlaid, once again by exec, with the code associated with the program to be executed.
