Unix Free Tutorial

Web based School



4 — Listing Files

This chapter covers some useful commands and constructs that help you better manage your data files. As a system matures, the file system becomes an eclectic collection of data files—some old, some new, some borrowed, some blue. The file system maintains information about data files such as file ownership, the size of the file, and the access dates. All of this information is useful in helping to manage your data. You'll learn more about ls, the directory list command. In addition, you'll learn about the find command, which you can use to locate files even when you don't know the complete path name.

Sometimes you want to limit the scope of a command so that the output from the command is more focused. You accomplish this by using partial filenames and some special wildcard characters. This chapter discusses three ways of causing the system to make filename substitutions.

You'll also look at two of the most powerful features of UNIX—redirection and piping—which are methods for rerouting the input and output of most commands.

Listing Files and Directories: ls Revisited

As you learned in Chapter 3, "The UNIX File System: Go Climb a Tree," the ls command lists the names of files and directories. This section reviews the basics of ls and provides examples of its options.

ls The Short and Long of It

In its simplest form, the ls command without arguments displays the names of the files and directories in the current working directory in alphabetical order by name. For example,

$ ls

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p11         t11         users

On some systems, the default output from ls is a single column of output. Most of the examples in this chapter use the columnar format to conserve space.

The ls command can also accept a filename as a command line parameter. For example,

$ ls marsha.pds

marsha.pds

If the command line parameter is a directory name, all the files in that directory are listed. For example,

$ ls users

dave     marsha     mike

Notice that the files are listed in order by collating sequence. That is, files beginning with numbers come first; files beginning with uppercase characters come next; and files beginning with lowercase characters come last. Also notice that although this format displays your filenames in a compact fashion, it doesn't give you much information about the files. You can get more detail about the files by requesting a long listing with the -l option. For example,

$ ls -l

-rwxr-xr--   1 asm      adept       512 Dec 14 16:16 21x

-rw-rw-r--   1 marsha   adept      1024 Jan 20 14:14 LINES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 LINES.idx

-rw-rw-r--   1 marsha   adept       256 Jan 20 14:14 PAGES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 PAGES.idx

-rw-rw-r--   1 marsha   acct        240 May  5  1992 acct.pds

-rw-rw-r--   1 marsha   adept      1024 Nov 22 15:42 marsha.pds

-rwxrwxr--   4 root     sys      243072 Aug 22  1991 p11

-rwxrwxr--   4 root     sys      256041 Aug 22  1991 t11

drw-rw-r--   1 marsha   adept      3072 Oct 12 11:42 users

A long listing displays seven columns of information about each file. In the first line of the listing,

-rwxr-xr--

indicates the file's type and permissions

1

indicates the number of links to the file

asm

is the user ID of the file's owner

adept

is the group ID of the group to which the file belongs

512

is the size of the file in bytes

Dec 14 16:16

is the time stamp—the date and time when the file was last modified

21x

is the name of the file (refer to Figure 3.4 in Chapter 3)

The first and second columns require a bit more explanation. The first column is a ten-character field that indicates the file's mode: its type and its permissions. In the first line of the list, the file's mode is -rwxr-xr--. The first character tells the file type, which is a hyphen (-) for regular files, and d for directories. In this example, the first nine items in the list are all ordinary files, and the last item is a directory.

The next nine characters of the entry are the file's permissions—three sets of three characters that control which users may access a file and what they can do with it. The first set of three characters controls what the file's owner can do; the second set of three characters controls what others in the group can do; and the third set of three characters controls what all other users can do. Each set of three characters shows read (r), write (w), and execute (x) permission, in that order. A hyphen (-) means that the permission is denied.

The second column of the long listing is the number of links to this file. All the files except two—p11 and t11—are pointed to only from this directory. p11 and t11 have entries in three other directories, for a total of four links.

You should refer to the "Keeping Secrets—File and Directory Permissions" section in Chapter 3 for a complete description of file types and for further details on file permissions. File links are covered in the "Hard and Symbolic Links" section of Chapter 3.
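The mode and link-count columns can be checked on a file you create yourself. The following is a minimal sketch, assuming a Bourne-style shell with mktemp available; the names report.txt and report.lnk are made up for the illustration.

```shell
# Work in a scratch directory so no real files are touched.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

echo "hello" > report.txt      # hypothetical file name
chmod 754 report.txt           # owner rwx, group r-x, others r--
ln report.txt report.lnk       # add a second hard link to the same file

# Column 1 of the long listing is the mode. Keep only its first ten
# characters, because some systems append an extra flag character.
mode=$(ls -l report.txt | awk '{print $1}' | cut -c1-10)
# Column 2 is the number of links.
links=$(ls -l report.txt | awk '{print $2}')

echo "mode:  $mode"
echo "links: $links"

cd / && rm -rf "$scratch"
```

The octal value 754 corresponds to the three permission sets rwx, r-x, and r--, which is the same mode shown for 21x in the listing above.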

Other ls Options

The ls command has several options. This section covers many of the ones more frequently used.

Showing Hidden Files with -a

The ls command doesn't normally list files that begin with a period. Suppose that the directory displayed in the previous section also contained a file named .profile. In that case, you would see

$ ls -a

.           ..          .profile    21x          LINES.dat

LINES.idx   PAGES.dat   PAGES.idx   acct.pds    marsha.pds

p11         t11         users

Note that the files . and .. represent the current and parent directories, respectively.

You can combine options, as in this example:

$ ls -al

-rw-r--r--   1 marsha   adept      2156 Jul 19 1991  .

-rw-r--r--   1 marsha   adept      2246 Jul 19 1991  ..

-rw-r--r--   1 marsha   adept       117 Jul 19 1991  .profile

-rwxr-xr--   1 asm      adept       512 Dec 14 16:16 21x

-rw-rw-r--   1 marsha   adept      1024 Jan 20 14:14 LINES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 LINES.idx

-rw-rw-r--   1 marsha   adept       256 Jan 20 14:14 PAGES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 PAGES.idx

-rw-rw-r--   1 marsha   acct        240 May  5  1992 acct.pds

-rw-rw-r--   1 marsha   adept      1024 Nov 22 15:42 marsha.pds

-rwxrwxr--   4 root     sys      243072 Aug 22  1991 p11

-rwxrwxr--   4 root     sys      256041 Aug 22  1991 t11

drw-rw-r--   1 marsha   adept      3072 Oct 12 11:42 users

Showing File Types with -F

Another useful option is -F, which distinguishes directory and executable files from ordinary files. The -F option causes a slash (/) to be appended to the filename for directories and an asterisk (*) to be appended to files which are executable. For example,

$ ls -F

21x*       LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p11*        t11*        users/

Listing Files Whose Names Contain Nonprintable Characters with -q

When a file is created, the filename can inadvertently acquire nonprintable characters. Suppose that a filename contained a backspace character (represented here as ^H). The file named abcd^Hefg would display in a normal ls command as abcefg. Because you cannot see the backspace character, you might be confused about the actual filename. With the ls -q option, this filename would display as abcd?efg.

Even if you don't know what the mystery character is, you can still work with the file by using filename substitution (discussed in the next section). If you need to know the exact nature of the mystery character, you can use the -b option, which causes the nonprintable character to print in octal mode. With the -b option, the filename would display as abcd\010efg, in which \010 is the octal representation of a backspace.

Other Useful ls Options

Additional ls options include the following:

-u

Used with -l, causes the last access time stamp to be displayed instead of the last modification time.

-s

Displays the size of each file in blocks. Used with -l, the block count appears in addition to the size in bytes.

-t

Sorts the output by time stamp instead of name, most recent first. Used with -u, sorts the output by access time.

-r

Reverses the order of the output. By itself, displays the output in reverse alphabetic order; used with -t, displays the output with the oldest time stamp first.

-x

Forces the output into multicolumn format, with entries sorted across the rows.
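To see several of these options side by side, you can run them against a small scratch directory. This is only a sketch, assuming a Bourne-style shell; the file names and time stamps are invented, and LC_ALL=C is set so that the sort order is the same everywhere.

```shell
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

touch -t 202001010000 old.dat   # last modified 1 Jan 2020
touch -t 202201010000 new.dat   # last modified 1 Jan 2022
touch .profile                  # hidden file
mkdir users                     # directory

# paste joins the one-name-per-line output into a single line.
plain=$(ls | paste -s -d ' ' -)
all=$(ls -a | paste -s -d ' ' -)
typed=$(ls -F | paste -s -d ' ' -)
newest_first=$(ls -t old.dat new.dat | paste -s -d ' ' -)
oldest_first=$(ls -rt old.dat new.dat | paste -s -d ' ' -)

echo "ls     : $plain"
echo "ls -a  : $all"
echo "ls -F  : $typed"
echo "ls -t  : $newest_first"
echo "ls -rt : $oldest_first"

cd / && rm -rf "$scratch"
```

Note that -t alone puts the most recently modified file first, and adding -r reverses that order.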

Using Metacharacters When Referring to Filenames

So far you've learned how to work with files by referring to their complete names. Sometimes, however, it is useful to refer to several files without having to name each one of them. Likewise, if you can remember only part of a filename, it is useful to list all the files whose names contain that part. UNIX provides metacharacters, also known as wildcards, which enable you to refer to files in these ways.

There are two metacharacters: the question mark (?) and the asterisk (*). In addition to metacharacters, filename substitution can be done on character sets. For more information about metacharacters, see Chapter 11, "Bourne Shell," Chapter 12, "Korn Shell," and Chapter 13, "C Shell."

Pattern Matching on a Single Character

In filename substitution, the question mark (?) stands for any single character. Consider the following directory:

$ ls

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p10         p101        p11

t11        z11

You can use the question mark (?) in any position. For example,

$ ls ?11

p11    t11    z11

You can also use more than one question mark in a single substitution. For example,

$ ls p??

p10    p11

The following command gives you all three-character filenames:

$ ls ???

21x    p10    p11    t11    z11

Suppose that you wanted to list all of the files that begin with LINES. You could do this successfully with

$ ls LINES.???

LINES.dat    LINES.idx

Now suppose that you wanted to find the files that end in .pds. The following two commands illustrate how to do this:

$ ls ????.pds

acct.pds

$

$ ls ??????.pds

marsha.pds

Pattern Matching on a Group of Characters

In the previous example, to list all of the files ending in .pds using single character substitution, you would have to know exactly how many characters precede the period. To overcome this problem, you use the asterisk (*), which matches a character string of any length, including a length of zero. Consider the following two examples:

$ ls *.pds

acct.pds    marsha.pds

$ ls p10*

p10    p101

As with single character substitution, more than one asterisk (*) can be used in a single substitution. For example,

$ ls *.*

LINES.dat   LINES.idx   PAGES.dat   PAGES.idx   acct.pds

marsha.pds

Pattern Matching on Character Sets

You have seen how you can access a group of files whose names are similar. What do you do, though, if you need to be more specific? Another way to do filename substitution is by matching on character sets. A character set is any number of single alphanumeric characters enclosed in square brackets—[ and ].

Suppose that you wanted a list of all the filenames that start with p or t followed by 11. You could use the following command:

$ ls [pt]11

p11   t11

You can combine character sets with the metacharacters. To list the names of all the files that begin with p or t, you could use

$ ls [pt]*

p10         p101        p11      t11

Now suppose that you wanted a list of all the filenames that begin with an uppercase alphabetic character. You could use

$ ls [ABCDEFGHIJKLMNOPQRSTUVWXYZ]*

LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

If you're guessing that there might be a better way to do this, you're right. When the characters in a character set substitution are in sequence, you can use a hyphen (-) to denote all of the characters in the sequence. Therefore, you can abbreviate the previous command in this way:

$ ls [A-Z]*

If a character sequence is broken, you can still use the hyphen for the portion of the character set that is in sequence. For example, the following command lists all the three-character filenames that begin with p, q, r, s, t, or z:

$ ls [p-tz]??

p10    p11    t11    z11
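The patterns in this section can be tried safely with echo in a scratch directory. The sketch below assumes a Bourne-style shell and recreates the empty files from the example directory; LC_ALL=C is set so that ranges such as [A-Z] and the sort order behave as described here.

```shell
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch 21x LINES.dat LINES.idx PAGES.dat PAGES.idx \
      acct.pds marsha.pds p10 p101 p11 t11 z11

single=$(echo ?11)          # one character, then 11
three=$(echo ???)           # exactly three characters
suffix=$(echo *.pds)        # anything ending in .pds
set_any=$(echo [pt]*)       # p or t, then anything
range=$(echo [p-tz]??)      # p through t or z, then two characters
upper=$(echo [A-Z]*)        # any uppercase first letter

echo "?11      : $single"
echo "???      : $three"
echo "*.pds    : $suffix"
echo "[pt]*    : $set_any"
echo "[p-tz]?? : $range"
echo "[A-Z]*   : $upper"

cd / && rm -rf "$scratch"
```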

How File Substitution Works

It is important to understand how file substitution actually works. In the previous examples, the ls command doesn't do the work of file substitution—the shell does. (Refer to Chapter 10, "What Is a Shell," for more information.) Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*

p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter so that the command sees only a list of valid filenames. If no filenames match the pattern, the Bourne shell passes the pattern to the command unchanged, and the C shell reports an error instead of running the command.

The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

$ ls LINES. *

LINES.: not found

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p10         p101        p11

t11        z11

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories.

When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .
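Before running a destructive command with a pattern, it can help to count the arguments that the shell would actually hand over. The following sketch assumes a Bourne-style shell; it uses set -- to load the expanded words into the positional parameters, and the five file names are borrowed from the examples in this chapter.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch LINES.dat LINES.idx PAGES.dat PAGES.idx acct.pds

# Intended pattern: expands to the two LINES files.
set -- LINES.*
intended=$#

# Stray space: the word LINES. is passed as typed, and the lone
# asterisk expands to every file in the directory.
set -- LINES. *
accidental=$#

echo "LINES.*  expands to $intended arguments"
echo "LINES. * expands to $accidental arguments"

cd / && rm -rf "$scratch"
```

Had the command been rm rather than ls, the second form would have removed every file in the directory.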

The find Command

One of the wonderful things about UNIX is its unlimited path names. A directory can have a subdirectory that itself has a subdirectory, and so on. This provides great flexibility in organizing your data.

Unlimited path names have a drawback, though. To perform any operation on a file that is not in your current working directory, you must have its complete path name. Disk files are a lot like flashlights: You store them in what seem to be perfectly logical places, but when you need them again, you can't remember where you put them. Fortunately, UNIX has the find command.

The find command begins at a specified point on a directory tree and searches all lower branches for files that meet some criteria. Since find searches by path name, the search crosses file systems, including those residing on a network, unless you specifically instruct it otherwise. Once it finds a file, find can perform operations on it.

Suppose you have a file named urgent.todo, but you cannot remember the directory where you stored it. You can use the find command to locate the file.

$ find / -name urgent.todo -print

/usr/home/stuff/urgent.todo

The syntax of the find command is a little different, but the remainder of this section should clear up any questions.

The find command is different from most UNIX commands in that each of the argument expressions following the beginning path name is considered a Boolean expression. At any given stop along a branch, the entire expression is true—file found—if all of the expressions are true; or false—file not found—if any one of the expressions is false. In other words, a file is found only if all the search criteria are met. For example,

$ find /usr/home -user marsha -size +50

is true for every file beginning at /usr/home that is owned by Marsha and is larger than 50 blocks. It is not true for Marsha's files that are 50 or fewer blocks long, nor is it true for large files owned by someone else.

An important point to remember is that expressions are evaluated from left to right. Since the entire expression is false if any one expression is false, the program stops evaluating a file as soon as it fails to pass a test. In the previous example, a file that is not owned by Marsha is not evaluated for its size. If the order of the expressions is reversed, each file is evaluated first for size, and then for ownership.

Another unusual thing about the find command is that it has no natural output. In the previous example, find dutifully searches all the paths and finds all of Marsha's large files, but it takes no action. (Versions of find that conform to the POSIX standard print the path names when no action is given, but you should not rely on this on older systems.) For the find command to be useful, you must specify an expression that causes an action to be taken. For example,

$ find /usr/home -user me -size +50 -print

/usr/home/stuff/bigfile

/usr/home/trash/bigfile.old

first finds all the files beginning at /usr/home that are owned by me and are larger than 50 blocks. Then it prints the full path name. (Actually, the full path name of the found files is sent to the standard output file, which is discussed later in this chapter.)

The argument expressions for the find command fall into three categories:

  • Search criteria

  • Action expressions

  • Search qualifiers

Although the three types of expressions have different functions, each is still considered a Boolean expression and must be found to be true before any further evaluation of the entire expression can take place. (The significance of this is discussed later.) Typically, a find operation consists of one or more search criteria, a single action expression, and perhaps a search qualifier. In other words, it finds a file and takes some action, even if that action is simply to print the path name. The rest of this section describes each of the categories of the find options.

Search Criteria

The first task of the find command is to locate files according to some user-specified criteria. You can search for files by name, file size, file ownership, and several other characteristics.

Finding Files with a Specific Name: -name fname

Often, the one thing that you know about a file for which you're searching is its name. Suppose that you wanted to locate—and possibly take some action on—all the files named core. You might use the following command:

$ find / -name core -print

This locates all the files on the system that exactly match the name core, and it prints their complete path names.

The -name option makes filename substitutions. The command

$ find /usr/home -name "*.tmp" -print

prints the names of all the files that end in .tmp. Notice that when filename substitutions are used, the substitution string is enclosed in quotation marks. This is because the UNIX shell attempts to make filename substitutions before it invokes the command. If the quotation marks were omitted from "*.tmp" and if the working directory contained more than one *.tmp file, the actual argument passed to the find command might look like this:

$ find /usr/home -name a.tmp b.tmp c.tmp -print

This would cause a syntax error to occur.
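You can watch the shell's part in this by using echo to show the command line that find would receive. This is a minimal sketch, assuming a Bourne-style shell; the directory home and the .tmp files are invented for the demonstration.

```shell
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir -p home/dave home/marsha
touch a.tmp b.tmp                       # .tmp files in the working directory
touch home/dave/a.tmp home/marsha/b.tmp home/marsha/notes.txt

# Unquoted: the shell expands *.tmp against the working directory
# before find ever runs. echo shows the resulting command line.
unquoted=$(echo find home -name *.tmp -print)

# Quoted: find receives the pattern itself and applies it to every
# file name it visits under home.
quoted=$(find home -name "*.tmp" -print | sort | paste -s -d ' ' -)

echo "unquoted command line: $unquoted"
echo "quoted search result : $quoted"

cd / && rm -rf "$scratch"
```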

Locating Files of a Specific Size: -size n

Another useful feature of find is that it can locate files of a specific size. The -size n expression is a good example of a search criterion that is evaluated numerically. The numeric portion of the expression may be integers in the form n, -n, or +n. An integer without a sign matches if the file is exactly n. An integer preceded by a minus sign matches if the requested file is smaller than n. An integer preceded by a plus sign matches if the file is larger than n. For example,

$ find / -size +100 -print

prints the names of all the files that are more than 100 blocks long.

In the -size expression, the integer may be suffixed with the character c. With the suffix, the match is made on the file size in characters. Without the suffix, the match is made on the file size in blocks. Therefore, the command

$ find / -size -512c -print

prints the names of all the files that are less than 512 bytes long.
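The difference between block and byte units is easy to test with two files of known size. The sketch below assumes a Bourne-style shell, 512-byte blocks for the unsuffixed form, and the invented file names small and big.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

printf 'hello\n' > small                              # 6 bytes
dd if=/dev/zero of=big bs=1024 count=1 2>/dev/null    # 1024 bytes

# With the c suffix the comparison is in bytes: fewer than 512.
under=$(find . -type f -size -512c -print)

# Without a suffix the comparison is in blocks: more than 1 block.
# small occupies 1 block (rounded up), big occupies 2.
over=$(find . -type f -size +1 -print)

echo "smaller than 512 bytes: $under"
echo "larger than 1 block   : $over"

cd / && rm -rf "$scratch"
```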

Other search criteria include:

-user uname

Looks for files that are owned by the user with the login name of uname. If uname is numeric it is compared to the user number.

-group gname

Looks for files that belong to the group gname. If gname is numeric, it is compared to the group number.

-atime n

Looks for files that were last accessed n days ago. n must be an integer. It can take the form n, -n, or +n.

-mtime n

Looks for files that were last modified n days ago. n must be an integer.

-perm onum

Looks for files whose permission flags match the octal number onum. If onum is preceded by a minus sign, the match will be made if the permission flag has the bit(s) set that matches the bit(s) in onum. For example, the expression -perm -100 will be true for any file that is executable by its owner.

-links n

A match if the file has n links. n must be an integer. It can take the form n, -n, or +n.

-type x

Looks for files that are of type x. Valid values for x are: b for a block special file, c for a character special file, d for a directory, p for a fifo (named pipe), and f for an ordinary file.

-newer fname

Looks for files that have been modified more recently than the file fname.

-local

Looks for files that reside on the local system as opposed to a remote site.
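A few of these criteria can be exercised together in a scratch directory. This is only a sketch, assuming a Bourne-style shell; the file names and time stamps are made up, and touch -t is used to give each file a known modification time.

```shell
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

mkdir sub
printf '#!/bin/sh\n' > tool
touch old.txt ref.txt new.txt
chmod 644 old.txt ref.txt new.txt
chmod 700 tool                      # executable by its owner
touch -t 201901010000 tool
touch -t 202001010000 old.txt
touch -t 202101010000 ref.txt       # the reference file for -newer
touch -t 202201010000 new.txt

dirs=$(find . -type d -print | sort | paste -s -d ' ' -)
execs=$(find . -type f -perm -100 -print)
newer=$(find . -type f -newer ref.txt -print)

echo "directories           : $dirs"
echo "owner-executable files: $execs"
echo "newer than ref.txt    : $newer"

cd / && rm -rf "$scratch"
```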

Action Expressions

Once the find command has located a file, it must be told what to do with it. The expressions that do this are called action expressions.

Displaying the Path Names of Found Files: -print

As you know, it does little good to locate a file, and then take no action. One commonly used action is the -print expression, which causes the complete path name to be printed when a file is found. This is useful if you want to check for the existence of a file before deciding to take some other action.

Executing a UNIX Command on the Found Files: -exec cmd \;

Sometimes you know what action you want to take once you find a file. In those cases, you can use the expression

-exec cmd \;

where cmd is any UNIX command. \; tells the find command to take the action specified between -exec and \;. find then continues to evaluate argument expressions.

The most powerful aspect of the find command is the unique file substitution method found within the -exec cmd expression. In any cmd statement, the argument {} is replaced with the name of the currently matched file. For example, suppose that the command

$ find /usr/home -name core -print

gives the following results:

/usr/home/dave/core

/usr/home/marsha/core

/usr/home/mike/core

The command

$ find /usr/home -name core -exec rm {} \;

has the same effect as issuing these commands:

$ rm /usr/home/dave/core

$ rm /usr/home/mike/core

$ rm /usr/home/marsha/core

Executing a UNIX Command on Found Files, But Querying First: -ok cmd \;

The -ok expression works exactly like the -exec expression, except that the execution of the command is optional. When it encounters an -ok expression, the find program displays the generated command, with all substitutions made, and prints a question mark. If the user types y, the command is executed.
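The effect of -exec can be rehearsed on a throwaway directory tree before it is used on real data. The following sketch assumes a Bourne-style shell; the directory home and its contents are invented to mirror the core example. It uses -exec rather than -ok because -ok would stop to ask a question for each file.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir -p home/dave home/marsha home/mike
touch home/dave/core home/marsha/core home/mike/core home/mike/notes

# Count the core files, remove them with -exec, then count again.
before=$(find home -name core -print | wc -l | tr -d ' ')
find home -name core -exec rm {} \;
after=$(find home -name core -print | wc -l | tr -d ' ')
kept=$(find home -type f -print)

echo "core files before: $before"
echo "core files after : $after"
echo "files left alone : $kept"

cd / && rm -rf "$scratch"
```

Running the same search with -print first, as the text does, is a reasonable habit before substituting a command such as rm.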

Writing Found Files to a Device: -cpio device

The -cpio device action expression causes a file to be written to a given device in cpio form. (This expression is not available in every version of find.) For example, the command

$ find /usr/home -cpio /dev/rmt0

writes all the files in /usr/home and all its subdirectories to the magnetic tape device /dev/rmt0. This is a good way to back up data files. It is a shorthand equivalent of

$ find /usr/home -print | cpio -o >/dev/rmt0

Search Qualifiers

There are times when you may want the find command to alter its normal search path. This is accomplished by adding search qualifiers to the find command.

Searching for Files on Only the Current File System: -mount

The -mount search qualifier restricts the search to the file system named in the starting point. For example, the command

$ find / -mount -type d -print

prints the names of all the directories in only the root file system.

Altering the Search Path with -depth

The -depth search qualifier alters the seek order to a depth-first search. With -depth, the find command processes the files in a directory before it processes the directory itself. This helps in finding files to which the user has access, even if the user's access to the directory is restricted. To see the difference, try the following two commands. Remember that -print is always true.

$ find /usr -print

$ find /usr -depth -print
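On a large tree such as /usr the difference is hard to spot, so a tiny tree makes it clearer. This is a minimal sketch, assuming a Bourne-style shell; the directory tree and its single file are invented for the illustration.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir -p tree/sub
touch tree/sub/file

# Default order: a directory is reported before its contents.
default_first=$(find tree -print | head -n 1)

# With -depth: the contents are reported before the directory.
depth_first=$(find tree -depth -print | head -n 1)
depth_last=$(find tree -depth -print | tail -n 1)

echo "default, first line: $default_first"
echo "-depth, first line : $depth_first"
echo "-depth, last line  : $depth_last"

cd / && rm -rf "$scratch"
```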

Combining Search Criteria

You can combine search criteria in a single command. Because the expressions in a find command are evaluated from left to right and the search fails when any one expression fails, the effect is a logical AND. For example, the command

$ find /usr/home -name "*.tmp" -atime +7 -exec rm {} \;

removes all the files that end in .tmp and that have not been accessed in the last 7 days.

Suppose, though, that you wanted to locate files ending in either .tmp or .temp. You could use the expression -name "*mp", but you might find files that you didn't expect. The solution is to combine search criteria in a logical OR expression. The syntax is

\( expression -o expression \)

The \ in front of the parentheses is an escape character; it prevents the shell from misinterpreting the parentheses. The following command line, for example, finds files ending in either .tmp or .temp:

$ find /usr/home \( -name "*.tmp" -o -name "*.temp" \) -print

Negating Expressions to Find Files That Don't Meet Criteria

Suppose that Marsha wanted to see whether anyone was putting files into her personal directory. She could use the negation operator (!), as in

$ find /usr/home/marsha ! -user marsha -print

/usr/home/marsha/daves.todo

Specifying More Than One Path to Search

By specifying a directory in which the find command should begin searching, you can control the scope of the search. The find command actually takes a list of directories to be searched, but you must specify all paths before you supply any expression arguments. For example, the command

$ find /usr/home/mike /usr/home/dave -print

produces a list of all the files in Mike's and Dave's directories.


NOTE: You must specify at least one directory for a search. To specify the current directory for a search, use . (a period) as the path name.
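Several starting paths, the OR grouping, and the negation operator can all appear in one command. The sketch below assumes a Bourne-style shell and an invented pair of directories; the file e.amp is included to show why the looser pattern "*mp" would catch too much.

```shell
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir -p home/mike home/dave
touch home/mike/a.tmp home/mike/b.temp home/mike/c.txt
touch home/dave/d.tmp home/dave/e.amp

# Two starting paths, files ending in .tmp OR .temp.
either=$(find home/mike home/dave \( -name "*.tmp" -o -name "*.temp" \) -print |
         sort | paste -s -d ' ' -)

# Same paths, ordinary files matching NEITHER pattern.
neither=$(find home/mike home/dave -type f ! -name "*.tmp" ! -name "*.temp" -print |
          sort | paste -s -d ' ' -)

echo "tmp or temp: $either"
echo "all others : $neither"

cd / && rm -rf "$scratch"
```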

Controlling Input and Output

One thing common to almost all computer programs is that they accept some kind of input and produce some kind of output. UNIX commands are no different. In this section, you'll discover how you can control the source of input and the destination of output.

One reason why UNIX is so flexible is that each program is automatically assigned three standard files: the standard input file, the standard output file, and the standard error file. Programmers are not restricted to using only these files. However, programs and commands that use only the standard files permit maximum flexibility. The three standard files also can be redirected. When it is not redirected, the standard input file is the user's keyboard, and both standard output and standard error go to the user's screen.

Output Redirection

Two operators enable you to redirect output to a file: > and >>. The > operator either creates a new file that contains the redirected output, or overwrites an existing file with the redirected output. The >> operator appends the new output to the end of the specified file. That is, if the file already contains data, the new output is added to the end of it.

To divert the standard output from your screen, use the > operator. Consider the directory used in an example at the beginning of this chapter. To redirect the output to a file named dirfile, you would use the command

$ ls >dirfile

Now you could use dirfile in another command. For example,

$ cat dirfile

21x

LINES.dat

LINES.idx

PAGES.dat

PAGES.idx

acct.pds

dirfile

marsha.pds

p11

t11

users

NOTE: Notice that the specified output file, dirfile, already appears in the listing. This is because the shell creates the output file before it runs ls.


NOTE: When the output of ls is redirected, the default output is in a single column. This is useful if the result is to be processed by another command that looks for one filename per line.

The > operator causes a new file to be created. If you had already created a file named dirfile, it would be deleted and replaced with the new data. If you wanted to add the new data to the old dirfile, you could use the >> operator. For example:

$ ls -x >dirfile

$ ls -x >>dirfile

$ cat dirfile

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   dirfile      marsha.pds  p11         t11

users

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   dirfile      marsha.pds  p11         t11

users
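The behavior of the two operators can be confirmed with a pair of short files. This is a minimal sketch, assuming a Bourne-style shell; the file name log is invented, and dirfile is used as in the text.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch 21x acct.pds

# The output file exists by the time ls runs, so it lists itself.
ls > dirfile
self_listed=$(grep -c '^dirfile$' dirfile)

# > replaces the previous contents of the file.
echo "first"  > log
echo "second" > log
after_overwrite=$(cat log)

# >> adds to the end of the file.
echo "third" >> log
line_count=$(cat log | wc -l | tr -d ' ')

echo "dirfile lists itself $self_listed time(s)"
echo "after overwrite: $after_overwrite"
echo "lines after append: $line_count"

cd / && rm -rf "$scratch"
```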

Input File Redirection

There are two possible sources of input for UNIX commands. Programs such as ls and find get their input from the command line in the form of options and filenames. Other programs, such as cat, can get their data from the standard input as well as from the command line. Try the cat command with no arguments on the command line:

$ cat

There is no response. Because no files are specified with the command, cat waits to get its input from your keyboard, the standard input file. The program will accept input lines from the keyboard until it sees a line which begins with Ctrl+D, which is the end-of-file signal for standard input.

To redirect the standard input, you use the < operator. For example, if you wanted cat to get its input from dirfile, you could use the command

$ cat <dirfile

The difference between this command and

$ cat dirfile

is a subtle one. In filenames provided as options to a command, you can use filename substitution. When redirecting input, you must use the name of an existing file or device. Therefore, the following command is a valid UNIX command:

$ cat dir*

You cannot, however, use the following command, for it is an invalid UNIX command:

$ cat <dir*
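There is one more practical difference between the two forms: a command that reads redirected input is never told the name of the file. The sketch below assumes a Bourne-style shell and uses wc -l, which prints the file name only when it is given one as an argument.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
printf 'one\ntwo\nthree\n' > dirfile

# File named on the command line: wc prints the count and the name.
named_fields=$(wc -l dirfile | awk '{print NF}')

# File supplied on standard input: wc prints only the count.
stdin_fields=$(wc -l < dirfile | awk '{print NF}')
count=$(wc -l < dirfile | tr -d ' ')

echo "fields printed for 'wc -l dirfile'  : $named_fields"
echo "fields printed for 'wc -l <dirfile' : $stdin_fields"
echo "line count: $count"

cd / && rm -rf "$scratch"
```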

Redirecting Error Messages

Most commands have two possible types of output: normal or standard output, and error messages. Normally, error messages display to the screen, but error messages can also be redirected.

Earlier in this chapter, you saw the following example with a space between the partial filename and the metacharacter:

$ ls LINES. *

LINES.: not found

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p10         p101        p11

t11        z11

It appears that all of the output in this example is on the standard output. However, if you change the command slightly, you get different results:

$ ls LINES. * >dirfile

LINES.: not found

What has happened is that the legitimate output from the ls command has been redirected to dirfile, and the error message has been sent to the standard error file.

To redirect error messages, use the > operator prefixed with a 2. For example,

$ ls LINES. * 2>errmsg

21x        LINES.dat    LINES.idx   PAGES.dat   PAGES.idx

acct.pds   marsha.pds   p10         p101        p11

t11        z11

Now the error message has been directed to the file errmsg, and the legitimate output has gone to the standard output file.

You can redirect both standard output and standard error for the same command. For example,

$ ls LINES. * >dirfile 2>errmsg

Redirecting the same standard file twice in one command is not useful. In the Bourne shell, the following command creates both files but writes the listing only to the last one, anotherdir:

$ ls >dirfile >anotherdir

If you wanted to discard all error messages, you could use the following form:

$ ls LINES. * >dirfile  2>/dev/null
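
The redirections above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a Bourne-compatible shell and the mktemp command; no_such_file is a hypothetical name chosen because it does not exist, so ls has an error to report.

```shell
# Hypothetical scratch directory, used only for this sketch.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

touch LINES.dat LINES.idx

# One name exists and one does not, so ls produces both kinds of output.
# ls exits with a non-zero status here; "|| true" ignores that.
ls LINES.dat no_such_file >dirfile 2>errmsg || true

echo "standard output went to dirfile:"
cat dirfile
echo "standard error went to errmsg:"
cat errmsg

# Keep the listing but discard the error message.
ls LINES.dat no_such_file >dirfile 2>/dev/null || true
```

The exact wording of the error message differs from system to system, but it should always end up in errmsg rather than in dirfile.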

NOTE: The standard error redirection operator (2>) is actually the same operator as standard output redirection (>). When a UNIX program opens a file, the file is given an integer number, called a file descriptor. The three standard files are numbered 0 (standard input), 1 (standard output), and 2 (standard error).

0 is assumed for input redirection. 1 is assumed for output redirection; therefore, redirection of standard output can also be written as 1>. Redirection is not restricted to only the first three files. However, to redirect higher-numbered files, the user would need to know how they are used within the program.
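
The numbering described in this note can be checked with a short experiment. The following is a minimal sketch, assuming a Bourne-compatible shell and the mktemp command. It uses the >&n form, standard Bourne shell syntax that this chapter does not otherwise cover, to make a command write to an already open file number n. The choice of file number 3 and all the output filenames are arbitrary.

```shell
# Hypothetical scratch directory, used only for this sketch.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# 1> is the explicit spelling of >, so both commands redirect standard output.
echo "implicit descriptor" >out_implicit
echo "explicit descriptor" 1>out_explicit

# A command group that writes to files 1, 2, and 3.
# The group has to know that file 3 exists and what it is for,
# which is why higher-numbered files are rarely redirected blindly.
{
    echo "to standard output"
    echo "to standard error" >&2
    echo "to descriptor 3" >&3
} 1>out1 2>out2 3>out3

cat out1 out2 out3
```

Each of the three messages lands in its own file, which shows that the number in front of > selects which open file is redirected.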


NOTE: For C shell users. In the C shell, error messages cannot be redirected separately from standard output. You can, however, include error output with standard output by adding an ampersand (&) to the redirection symbol.

$ ls LINES. * >& dirfile

This command would redirect both standard output and error messages to dirfile.

Using Pipes to Pass Files Between Programs

Suppose that you wanted a directory listing that was sorted by the mode—file type plus permissions. To accomplish this, you might redirect the output from ls to a data file and then sort that data file. For example,

$ ls -l >tempfile

$ sort <tempfile

-rw-rw-r--   1 marsha   adept      1024 Jan 20 14:14 LINES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 LINES.idx

-rw-rw-r--   1 marsha   adept       256 Jan 20 14:14 PAGES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 PAGES.idx

-rw-rw-r--   1 marsha   acct        240 May  5  1992 acct.pds

-rw-rw-r--   1 marsha   adept      1024 Nov 22 15:42 marsha.pds

-rw-rw-r--   1 marsha   adept         0 Jan 21 10:22 tempfile

-rwxr-xr--   1 asm      adept       512 Dec 14 16:16 21x

-rwxrwxr--   4 root     sys      243072 Aug 22  1991 p11

-rwxrwxr--   4 root     sys      256041 Aug 22  1991 t11

drw-rw-r--   1 marsha   adept      3072 Oct 12 11:42 users

Although you get the result that you wanted, there are three drawbacks to this method:

  • You might end up with a lot of temporary files in your directory. You would have to go back and remove them.

  • The sort program doesn't begin its work until the first command is complete. This isn't too significant with the small amount of data used in this example, but it can make a considerable difference with larger files.

  • The final output contains the name of your tempfile, which might not be what you had in mind.

Fortunately, there is a better way.

The pipe symbol (|) causes the standard output of the program on its left side to be passed directly to the standard input of the program on its right side. To get the same sorted listing as before without a temporary file, you can connect the two commands with a pipe. For example,

$ ls -l | sort

-rw-rw-r--   1 marsha   adept      1024 Jan 20 14:14 LINES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 LINES.idx

-rw-rw-r--   1 marsha   adept       256 Jan 20 14:14 PAGES.dat

-rw-rw-r--   1 marsha   adept      3072 Jan 20 14:14 PAGES.idx

-rw-rw-r--   1 marsha   acct        240 May  5  1992 acct.pds

-rw-rw-r--   1 marsha   adept      1024 Nov 22 15:42 marsha.pds

-rwxr-xr--   1 asm      adept       512 Dec 14 16:16 21x

-rwxrwxr--   4 root     sys      243072 Aug 22  1991 p11

-rwxrwxr--   4 root     sys      256041 Aug 22  1991 t11

drw-rw-r--   1 marsha   adept      3072 Oct 12 11:42 users

You have accomplished your purpose elegantly, without cluttering your disk. It is not readily apparent, but you have also worked more efficiently. Consider the following example:

$ ls -l | sort >dirsort & ps

PID TTY  STAT  TIME COMMAND

13678 003  R     2:13 sh

15476 003  R     0:01 ls

15477 003  R     0:00 sort

15479 003  R     0:00 ps

Both ls and sort are executing simultaneously, which means that sort can begin processing its input, even before ls has finished its output. A program, such as sort, that takes standard input and creates standard output is sometimes called a filter.
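
The fact that both sides of a pipe run at the same time can also be observed without ps. The following is a minimal sketch, assuming a Bourne-compatible shell and a date command that supports the +%s format (seconds since 1970, available on most current systems but not guaranteed everywhere). The producer and consumer here are stand-ins invented for the illustration, not ls and sort.

```shell
# The producer emits one line, pauses for two seconds, then emits another.
# The consumer stamps each line with the time (in seconds) at which it arrives.
stamps=$(
    { echo first; sleep 2; echo second; } |
    while read -r line; do
        echo "$(date +%s) $line"
    done
)

echo "$stamps"
```

If the consumer had to wait for the producer to finish, both lines would carry about the same time stamp. Instead, the two stamps differ by roughly the length of the pause, which shows that the right side of the pipe was already reading while the left side was still running.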

The capability to string commands together in a pipeline, combined with the capability to redirect input and output, is part of what gives UNIX its great power. Instead of having large, comprehensive programs perform a task, several simpler programs can be strung together, giving the end user more control over the results. It is not uncommon in the UNIX environment to see something like this:

$ cmd1 <infile | cmd2 -options | cmd3 | cmd4 -options >outfile
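
A concrete pipeline of this shape might look like the following. This is a minimal sketch, assuming a Bourne-compatible shell and the standard sort, uniq, and head commands, two of which are not covered in this chapter. The contents of infile are made up for the illustration.

```shell
# Hypothetical scratch directory, used only for this sketch.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Sample input: one group name per line, with repeats.
printf 'adept\nacct\nadept\nsys\nadept\nsys\n' >infile

# sort brings identical lines together, uniq -c counts each run of them,
# sort -rn puts the largest count first, and head -n 1 keeps only that line.
sort <infile | uniq -c | sort -rn | head -n 1 >outfile

cat outfile
```

Each program does one small job, and only the first and last stages touch a file. The result in outfile is the most frequent line of infile together with its count.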

Summary

In this chapter, you learned how to use UNIX commands to list filenames with ls and to locate files based on search criteria with find. You also learned how to supply partial filenames to a command by using filename substitution. Finally, you learned how to reroute input and output by using standard file redirection and piping.
