Unix Free Tutorial

Web based School

    • 28

      Tools for Writers

      By Susan Peppard

      Using spell

      You've gone to a lot of trouble to prepare a document that looks splendid and you don't want it to be marred by spelling mistakes. The spell program will catch most of your typos. An interactive version of spell, called ispell, also is available on some systems.


      NOTE: spell will not find errors such as is for in or affect for effect. You still have to proofread your document carefully.

      spell uses a standard dictionary. It checks the words in your file against this dictionary and outputs a list of words not found in the dictionary. You can create a personal dictionary to make spell more useful, as you'll learn in the next section.

      If your file includes .so or .nx requests, spell also checks the files they source in.

      To invoke spell, type spell and your filename. All the words that spell doesn't recognize are displayed on your screen, one word per line. This list of unrecognized words is arranged in ASCII order. That is, special characters and numbers come first, uppercase letters come next, and then lowercase letters. In other words, the words are not in the order in which they occur in your file. Each unrecognized word appears only once. Therefore, if you typed teh for the 15 times, teh appears only once in the spell output.

      The list of unrecognized words can be very long, especially if your text is full of acronyms or proper names or if you don't type well. The first few screens will speed by at what seems like 1,000 miles per hour, and you won't be able to read them at all. To read all the screens, redirect the output of spell to a file:

      $ spell filename > outputfilename

      TIP: Use a short name for this output file. w—for wrong—works well. You can open a file with a short name more quickly, and delete it more quickly, too. It's also less embarrassing to have a file called w in all your directories instead of one called misspelled_words.

      After you create the file of unrecognized words, you can handle it in several ways:

      • You can print the file.

      • You can vi the file and try to remember the misspellings—or scribble them on a slip of paper.

      • You can vi the file in another window if you are using a window-like environment.

      Now correct your mistakes. The list probably contains a number of words that are perfectly legitimate. For example, spell refuses to recognize the words diskette and detail. There is no good reason for this, but it may spur you to create a personal dictionary.

      To correct your mistakes, first vi your file. Next, do one of the following:

      • Search for the misspelling—/teh—and correct it—cw the. Then search for the next occurrence of teh—n—and correct it with the . command. Continue doing this until the search produces pattern not found.

      • Globally change all occurrences of teh to the with :1,$s/teh/the/g.

      There's a risk associated with the global method. For example, if I ran spell on this chapter, teh would appear on the list of unrecognized words. Then if I globally changed all occurrences of teh to the, this chapter, or at least this section, would be virtually incomprehensible. The moral is, use global substitutions wisely, and never use them on someone else's files.

      After you correct your file, run it through spell once more just to be sure. The new output overwrites the old file.


      TIP: If you're a less-than-perfect typist—or if you have fat fingertips—unwanted characters can sneak into words—for example, p[rint. When this happens, rint appears on spell's list of unrecognized words. Just search for rint. However, if you type p[lace, spell won't help you, because lace is a perfectly good word.

      Occasionally, spell finds something like ne. Searching for all instances of ne isn't fun, especially in a file with 2,000 lines. You can embed spaces in your search: /[space]ne[space]. However, this is rarely helpful, because spell ignores punctuation marks and special characters. If you typed This must be the o ne at the end of a sentence, /[space]ne[space] won't find it, because a period rather than a space follows ne. You can try searching with one embedded space, /[space]ne or /ne[space], but you still may not find the offender. Try /\<ne\>. This will find ne as a complete word, that is, surrounded by spaces; at the beginning or end of a line; or followed by punctuation.
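
      Outside vi, grep can list the lines that contain ne as a complete word. The following is a minimal sketch, assuming your grep supports the -w (whole word) option, as the GNU and BSD versions do. The filename draft and its contents are invented for illustration.

```shell
# Build a small sample file (hypothetical contents).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
    'Everyone was nearly done.' \
    'This must be the o ne.' \
    'No one objected.' > draft

# -n prints line numbers; -w matches ne only as a complete word,
# so Everyone, nearly, done, and one are all skipped.
hits=$(grep -n -w 'ne' draft)
echo "$hits"
```

      The line numbers tell you where to jump in vi (for example, 2G goes to line 2).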


      TIP: Even if you added only half a line, run spell once more after you've edited a chapter. You always find a mistake.

      Creating a Personal Dictionary

      If your name is Leee—with three es—and you get tired of seeing it in the list of unrecognized words, you can add Leee to a personal dictionary.

      To create a personalized dictionary, follow these steps:

      1. Create a file called mydict. Of course, you may call it anything you like.

      2. Invoke spell with $ spell +mydict inputfile > w.

      Your personal dictionary doesn't have to be in the same directory as your input files. If it isn't, however, you must specify a path on the command line, as in

      $ spell +/dict/mydict inputfile > w

      Creating Specialized Dictionaries

      Personalized dictionaries are a great help if you're working on several writing projects, each of which has a specialized vocabulary. For example, if you're working on the XYZZY project, and the same words keep turning up in your w file—words that are perfectly O.K. in the context of the XYZZY system but not O.K. in any other files—you can create an xyzzy.dict.

      An easy way to automate some of the steps necessary for creating a specialized dictionary is to run spell on your first file. For example,

      $ spell ch01 > w

      Then run it on all the rest of your files. Append the output to w, instead of replacing w. For example,

      $ spell ch02 >> w

      At this point, you'll have a long file that contains all the words that spell doesn't recognize. First, you need to sort the file and get rid of the duplicates. (Refer to the sort command in Chapter 6, "Popular File Tools.")

      $ sort -u w > sorted.w

      Here, the -u option stands for unique. sort drops all the duplicates from the list.
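
      The append-then-sort step can be tried without running spell at all. This is a sketch under stated assumptions: the two printf lines stand in for spell's output, the word lists are invented, and LC_ALL=C is added only to force the ASCII ordering described earlier.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-ins for "spell ch01 > w" and "spell ch02 >> w";
# the word lists are invented sample output.
printf '%s\n' XYZZY diskette teh > w
printf '%s\n' XYZZY plugh teh >> w

# -u keeps one copy of each line. LC_ALL=C sorts by byte value,
# so uppercase words come before lowercase ones.
LC_ALL=C sort -u w > sorted.w
cat sorted.w
```

      Six lines go in and four come out: XYZZY and teh each appear once.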

      Now edit sorted.w, deleting all the misspelled words and all words not specific to your XYZZY project. The words that remain form the basis of xyzzy.dict. Change the name of sorted.w to xyzzy.dict by using mv sorted.w xyzzy.dict. You can add words to or delete words from this file as necessary.

      Repeat this process to create additional specialized dictionaries. And if you're a nice person, you'll share your specialized dictionaries with your colleagues.

      Using ispell

      ispell is an interactive version of spell. It works like the spell checkers that come with word processing applications. That is, it locates the first word in your file that it doesn't recognize—ispell uses the same dictionary as spell—and stops there. Then you can correct the word or press Enter to continue.

      To invoke ispell, do one of the following:

      • Enter ispell ch01.

      • vi your first chapter. Then from within vi, escape to the shell and invoke ispell with :!ispell.

      Although some people prefer ispell, unadorned, ordinary spell is more useful if you want to create personal or specialized dictionaries or if you want to make global changes to your input file.

      /dev/null: The Path to UNIX Limbo

      As you're surely tired of hearing, UNIX views everything as a file, including devices (such as your terminal or the printer you use). Device files are stored neatly in the /dev directory.

      Occasionally, you specify devices by their filenames (for example, when you're reading a tape or mounting a disk drive), but most often you don't bother to think about device files.

      There's one device file, however, that you may want to use: /dev/null.

      The null file in the /dev directory is just what it sounds like: nothing. It's the equivalent of the fifth dimension or the incinerator chute. If you send something there, you can't get it back—ever.

      Why would you want to send output to /dev/null? If you've just created a complex table (or picture, graph, or equation), you can process your creation without wasting paper. Just direct the output to /dev/null:

      $ tbl filename > /dev/null

      $ eqn filename > /dev/null

      $ pic filename > /dev/null

      You'll see any error messages on your screen. This is usually more reliable than checkeq. And you can use it for text files.
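
      Error messages still appear because > redirects only standard output, while they are written to standard error. The sketch below illustrates this with a hypothetical function, fake_formatter, standing in for tbl, eqn, or pic. The function and its messages are invented for the demonstration.

```shell
# Hypothetical stand-in for a formatter: normal output goes to
# standard output, complaints go to standard error.
fake_formatter() {
    echo "formatted output for $1"
    echo "warning: $1, line 12: unbalanced table" >&2
}

# Discard standard output and capture only what was written to
# standard error, which is what would reach your screen.
kept=$(fake_formatter report.t 2>&1 >/dev/null)
echo "$kept"
```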

      Counting Words with wc

      Sometimes you need to count the words in a document. UNIX has the tool for you. The wc shell command counts lines, words, and characters. It can give you a total if you specify more than one file as input.

      To count the words in ch01, enter wc -w ch01.

      You can count lines by using the -l option, or characters by using the -c option. Bear in mind, however, that wc counts all your macros as words. (Refer to Chapter 6 for more details on wc.)
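
      To see how much the macros inflate a count, compare the raw total with a count taken after dropping macro lines. This is a rough sketch, assuming every macro call starts a line with a period. Note that it also drops any heading text carried on those lines. The sample file is invented.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '.H 1 "Introduction"' 'The XYZZY system is small.' > ch01

# Reading from standard input makes wc print only the number; the
# arithmetic expansion strips padding that some versions of wc add.
all_words=$(( $(wc -w < ch01) ))

# Drop every line that begins with a period before counting.
text_words=$(( $(grep -v '^\.' ch01 | wc -w) ))

echo "with macros: $all_words, without: $text_words"
```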

      Using grep

      The grep command is an invaluable aid to writers. It is used primarily for checking the organization of a file or collection of files, and for finding occurrences of a character string.

      Checking the Organization of a Document

      If you're writing a long, complex document—especially one that uses three or more levels of headings—you can make sure that your heading levels are correct and also produce a rough outline of your document at the same time.


      NOTE: This technique is useful only if you are using a macro package—a reasonable assumption for a long, complex document. If you've formatted your document with embedded troff commands, this technique won't work.

      For example, if your heading macros take the form

      .H n "heading"

      a first-level heading might be

      .H 1 "Introduction to the XYZZY System"

      If your chapters are named ch01, ch02, and so on through chn, the following command will search all your chapter files for all instances of the .H macros. It will also print the filename and the line that contains the .H macro in a file called outline.

      $ grep "\.H " ch* > outline

      The backslash is needed to escape the special meaning of the period. The space after H is needed so that you don't inadvertently include another macro or macros with names such as .HK or .HA. The quotation marks are used to include that space.

      You can view your outline file with vi, or you can print it. At a glance, you're able to see whether you've mislabeled a heading in Chapter 1, omitted a third-level heading in Chapter 4, and so forth. You also have an outline of your entire document. Of course, you can edit the outline file to produce a more polished version.
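
      The outline command can be tried on a pair of throwaway files. In this sketch the chapter contents are invented, and the .HK line is there only to show that the trailing space in the pattern keeps other macros out.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '.H 1 "Introduction to the XYZZY System"' 'Some text.' \
    '.HK "not a heading"' '.H 2 "Installing XYZZY"' > ch01
printf '%s\n' '.H 1 "Using XYZZY"' 'More text.' > ch02

# With more than one input file, grep prefixes each matching line
# with the name of the file it came from.
grep "\.H " ch* > outline
cat outline
```

      If ordinary text in your files ever mentions .H, anchoring the pattern as "^\.H " restricts matches to the start of a line.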

      Finding Character Strings

      If you've just finished a 1,000-page novel and suddenly decide—or are told by your editor—to change a minor character's name from Pansy to Scarlett, you might vi every one of your 63 files, search for Pansy, and change it to Scarlett. But Scarlett isn't in every chapter—unless you've written Scarlett II. So why aggravate yourself by viing 63 files when you need to vi only six? grep can help you.

      To use grep to find out which files contain the string Pansy, enter the following:

      $ grep "Pansy" ch* > pansylist

      Here, the quotation marks aren't strictly necessary, but it's a good idea to get into the habit of using them. In other situations, such as the previous example, you need them.

      This command creates a file called pansylist, which looks something like this:

      ch01:no longer sure that Pansy was
      
      ch01:said Pansy.
      
      ch07:wouldn't dream of wearing the same color as Pansy O'Hara.
      
      ch43:Pansy's dead. Pansy O'Hara is dead.
      
      ch57:in memory of Pansy. The flowers were deep purple and yellow

      Now you know which chapters have to be edited: 1, 7, 43, and 57. To change Pansy to Scarlett globally, vi one of the files that contains the string Pansy and enter the following command. Make sure that you're in Command mode, not Insert mode.

      :1,$ s/Pansy/Scarlett/g

      The g at the end of this code line is important. If the string Pansy occurs more than once in a line, as it does in Chapter 43, g ensures that all instances are changed to Scarlett.


      NOTE: The same cautions about making global changes apply here. You might be referring to the flower, not the character; therefore, you'll want to retain Pansy. grep usually gives you enough context to alert you to potential problems.
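
      When all you need is the list of files to edit, grep's -l option prints each matching filename once and omits the matching lines. The sketch below uses three invented chapter files. You lose the context that the full listing gives you, so it complements that listing rather than replacing it.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'no longer sure that Pansy was' 'said Pansy.' > ch01
printf '%s\n' 'Nothing to see here.' > ch02
printf '%s\n' "Pansy's dead. Pansy O'Hara is dead." > ch43

# -l lists the names of files that contain at least one match.
files=$(grep -l "Pansy" ch*)
echo "$files"
```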

      Using sed

      The UNIX stream editor, sed, provides another method of making global changes to one or more files. sed is described in Chapter 7, "Editing Text Files."


      CAUTION: Don't use sed unless you understand the perils of overwriting your original file with an empty file.

      There are two ways to use sed: on the command line or with a sed script. (The example given here uses the command line form, not because it is preferable, but because it is easier to see what is going on.) The script, called substitute, changes all occurrences of the first argument to the second argument. You wouldn't want to go to all this trouble just to change Pansy to Scarlett. However, because you can specify more than one command with sed—in the command line form and in the sed script form—sed is a useful and powerful tool.
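
      As an illustration only, here is one way a substitute helper could look. This is a sketch, not the chapter's original script: it is written as a shell function, it assumes neither argument contains a slash or a regular-expression special character, and it writes each result to a new file with a .new suffix so that the original is never overwritten.

```shell
# Hypothetical helper: substitute OLD NEW FILE...
substitute() {
    old=$1
    new=$2
    shift 2
    for f in "$@"
    do
        # Write to a new file. "sed ... $f > $f" would empty $f
        # before sed ever read it.
        sed "s/$old/$new/g" "$f" > "$f.new"
    done
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' "Pansy's dead. Pansy O'Hara is dead." > ch43
substitute Pansy Scarlett ch43
cat ch43.new
```

      After checking ch43.new, you can replace the original with mv ch43.new ch43.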

      Using diffmk

      diffmk comes from the many diff commands offered by the UNIX system. Its purpose is to diffmark text—that is, to mark text that has changed from one version of a file to another. The text is marked with a vertical bar in the right margin. Sometimes, other characters creep in, especially with tables.

      Use diffmk like this:

      $ diffmk oldfile newfile difffile

      The order is important. If you get it wrong, diffmk blithely prints your old file with diffmarks on it. That's probably not what you want.

      Often your files are in two different directories—possibly because the files have the same names. Suppose that you have a ch01 in the draft2 directory and in the draft3 directory. You can specify a pathname for diffmk, and you can even write the diffmarked files into a third directory. The third directory must already exist; diffmk won't create it for you. The following command diffmarks files in two directories and writes them into a third directory. It assumes that your current directory is draft3.

      $ diffmk ../draft2/file1 file1 ../diffdir/dfile1

      If you have many chapters, you might want to consider a shell script. To create a shell script that diffmarks all the files in the draft3 directory against the files in the draft2 directory, follow these steps:

      1. Make sure that you're in the draft3 directory—that is, the directory for the new file.

      2. List the files in draft3:

        $ ls > difflist

      3. Create the following shell script in a file called diffscript:

        # Diffmark every file listed in difflist against its draft2 version.
        for i in `cat difflist`
        do
        diffmk ../draft2/$i $i ../diffdir/d$i
        done

      4. Make the script executable:

        $ chmod +x diffscript

      5. Put diffscript in your bin:

        $ mv diffscript $HOME/bin

      6. Execute diffscript:

        $ diffscript
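
      Before running the loop for real, you can print the commands it would execute. The following dry-run sketch makes two assumptions that are not part of the steps above: it builds a throwaway directory tree with empty chapter files, and it uses the pattern ch* in place of difflist. The list made with ls > difflist includes difflist itself, because the shell creates that file before ls runs. A pattern avoids that.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/draft2" "$workdir/draft3" "$workdir/diffdir"
touch "$workdir/draft2/ch01" "$workdir/draft2/ch02" \
      "$workdir/draft3/ch01" "$workdir/draft3/ch02"
cd "$workdir/draft3" || exit 1

# echo prints each diffmk command instead of running it.
for i in ch*
do
    echo diffmk "../draft2/$i" "$i" "../diffdir/d$i"
done > "$workdir/plan"
cat "$workdir/plan"
```

      When the printed commands look right, remove the word echo to run them.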

      The man Command

      The man command consults a database of stored UNIX system commands—basically everything that is in the system reference manuals—and nroffs it to your screen. If you don't have all that documentation on a shelf in your office, the man command can save the day.

      man is simple to use:

      man commandname

      The output is far from beautiful, and it's slow. It's paged to your screen, so you press Enter when you're ready to go on to the next page. You can't backtrack, though. Once you leave the first screen—that is, the one with the command syntax on it—the only way you can see it again is to run the man command a second time.

      If your terminal has windowing or layering capabilities, man is more useful, because you can look at it and type on your command line at the same time.

      You can also print the output from man, but you may not know which printer the output is going to. If you work in a multi-printer environment, this can be a nuisance. Check with your system administrator.

      Using SCCS to Control Documentation

      Although the Source Code Control System—SCCS for short—was written to keep track of program code, it also makes a good archiving tool for documentation. It saves each version of a text file—code, troff input, and so on—and essentially enables only the owner to change the contents of the file. SCCS is described in detail in Chapter 30, "SCCS Version Control." You can use SCCS to control versions of a document that you often revise. You can also use SCCS on drafts of a document. If you work with a publications group and your group doesn't have a good archiving and document control system, look into SCCS.
