CGI Perl Tutorial

Web based School

Chapter 13

Debugging CGI Programs



There's nothing better than writing four or five pages of Perl code, putting it online, and watching it work flawlessly the first time. Unfortunately, you won't always be this lucky; debugging is an important part of any programming project.

In this chapter, you'll learn about the following:

  • Testing and debugging your CGI programs
  • Investigating some common errors and their causes
  • Using tools to make the debugging process less painful

Even if you've debugged programs before, you will find the process of debugging CGI programs a different kind of challenge. You should not be surprised if CGI programs are harder to debug than anything else you've encountered. Nevertheless, it can be done. CGI programs often are hard to debug because you don't have as many clues as you might expect. If you receive an error message when submitting a form, it might mean that your program has a syntax error, that it is not creating the output, or that it simply doesn't exist.

Several basic steps exist in the debugging process. The following list is a suggested method of finding the problem by process of elimination; as you develop and debug a few programs of your own, you'll grow to recognize certain kinds of problems and will be able to skip many of these steps:

  • Test the program and keep track of any problems you encounter.
  • If your project includes multiple Perl programs, determine which program is causing the error.
  • Determine whether the program is executing at all.
  • Check for syntax errors.
  • Determine whether the program is producing valid HTML output.
  • Check whether the correct data is being sent to the program from the form.
  • Pinpoint the location of the problem and fix it.

First, you'll look at the basic steps of this process in the following sections.

Determining Which Program Has a Problem

In a large CGI project, you may have several programs interacting with each other. It is important to determine which of them is executing when the problem occurs. This might be a simple process: for example, if your project uses only one program, or if the output stops halfway through a certain program's text. Other situations can be more difficult. Imagine an HTML document that includes several Server Side Include commands, for example, or a combination of programs that access a database. If you enter a record (using one program) and then cannot successfully recall it (using another program), the cause may be one of two things: the record isn't being written, or it isn't being read.

In order to pinpoint the incriminating program, you might want to try these tips:

  • Try to isolate the program; test it without the use of any other programs. In the database example, you might test the "enter" program by entering a record and then viewing the file to see whether the data is there.
  • Add print statements to make it clearer which program is executing. (This helps only if you are able to view the program's output at all.)

Determining Whether the Program Is Being Executed

Here, you run into one of the idiosyncrasies of the CGI environment. In a typical programming language, it's usually obvious that the program is running. With a CGI program, however, you can't take this for granted. Many factors can cause your program to not run at all, and, unfortunately, the error message you get is usually the same one you'll get if your program runs into a problem.

The error message you'll usually see when your program is not executing follows:

    This server has encountered an internal error which prevents it from fulfilling your request. The most likely cause is a misconfiguration. Please ask the administrator to look for messages in the server's error log.

The most likely cause of this error, unfortunately, is not a misconfiguration; it's a CGI problem. The next step is to determine whether your program is executing at all. The following are some situations that could prevent your program from executing. It's best to quickly check each of these first when you encounter a problem.

Note
With some HTTP servers, a second error message is possible specifying that the file was not found. This is a sure indication of one of the first two conditions that follow.

  • The program file doesn't exist, or you've specified the wrong name in the Form tag, linked URI, or SSI declaration.
  • The permissions on the file are set incorrectly. It will need the Execute permission for all users; the easiest way to set this is with this UNIX command:

chmod a+x programname

  • The program is not located in a directory that allows CGI programs. For many servers, the /cgi-bin directory is the only directory that allows this. Some servers require a specific extension, such as cgi, for CGI programs.
  • The Perl interpreter isn't being found to run the program. Be sure that Perl is installed on the system and that the first line of your program contains the correct location for Perl. You might have to ask your Administrator for the correct location. Here's an example of a typical location:
    #!/usr/bin/perl
  • Your program contains a syntax error. Perl checks syntax before executing the program and quits if it finds any errors. Check the program's syntax, as described in the next section.
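Several of the checks in this list can be automated. The following is only a minimal sketch: the subroutine name check_cgi_file and its messages are made up for illustration. It tests whether the file exists, whether it is executable by the user running the check (who is not necessarily the user the HTTP server runs as), and whether the first line names an interpreter that exists.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# check_cgi_file is a hypothetical helper; it returns a list of problems found.
sub check_cgi_file {
    my ($path) = @_;
    my @problems;

    return ("$path does not exist") unless -e $path;
    push @problems, "$path is not executable (try chmod a+x)" unless -x $path;

    open(my $fh, '<', $path) or return (@problems, "cannot read $path: $!");
    my $first = <$fh>;
    close($fh);
    $first = '' unless defined $first;

    # The first line should look like #!/usr/bin/perl
    if ($first =~ /^#!\s*(\S+)/) {
        push @problems, "interpreter $1 not found" unless -e $1;
    } else {
        push @problems, "first line does not start with #!";
    }
    return @problems;
}

# Usage: perl checkcgi.pl programname
if (@ARGV) {
    my @problems = check_cgi_file($ARGV[0]);
    print @problems ? join("\n", @problems) . "\n" : "No obvious problems found.\n";
}
```

The sketch cannot tell whether the directory allows CGI programs; that depends on the server's configuration.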

Note
You'll look at the server's error log later in this chapter, in "Reading the Server Error Log." It can be an invaluable resource if you happen to be the System Administrator or have the time to contact her.

Checking the Program's Syntax

The first step in debugging a Perl program is to check its syntax. Perl is very strict about syntax. A simple misspelling or a misplaced punctuation character can cause you hours of frustration if you aren't careful. In this section, you'll learn how to check your program's syntax and how to spot (and avoid) some of the most common syntax errors.

Note
Technically, a syntax error is an error in the language, formatting, or punctuation used to write the program. These errors often are typographical errors.

Checking Syntax at the Command Line

With Perl, it's quite easy to check your program's syntax. You should do this as part of the editing process. Personally, I check the syntax each time I make a change. Type this command:

    perl -c programname

This checks the syntax of the program without executing any of the commands. Alternatively, you simply can execute the program by typing its name. Perl checks the syntax before executing it and displays any errors it finds.

You can use the -w switch to make the Perl interpreter print warnings about questionable code. This option watches for variables that are used only once and for other common errors. This command, for example, executes a Perl script called register.cgi and displays any warnings:

    perl -w register.cgi

Interpreting Perl Error Messages

A typical error message produced when you check syntax follows:

    syntax error at test.cgi line 29, near "while"
    syntax error at test.cgi line 129, near "}"
    test.cgi had compilation errors.
    Exit -1

As you can see, Perl doesn't spell out the exact cause and location of the error. However, it does give you two important clues: the line number where the error was detected, and a bit of text near it. These are not exact; the line number often is a line or more past the real mistake, and the quoted code often is unrelated to (but next to) the code with the problem. It's best to consider this a starting point for your debugging process.

As this sample message illustrates, Perl often displays more than one error message. A good general rule is to ignore all but the first message in the list. Why? Often, an error at one point in the program causes a later section to appear wrong, creating a second error. Fixing the first error often eliminates the second, so it's best to fix one error at a time and then to check the syntax again to see whether you receive a different message.
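The "fix the first message only" habit can be built into a small wrapper. This is a sketch under assumptions: the subroutine name first_report_line is hypothetical, and the command line it builds expects a UNIX-style shell. It runs perl -c on a file and returns only the first line of the report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# first_report_line is a hypothetical helper: it runs "perl -c" on a file
# and returns the first line that the syntax check prints.
sub first_report_line {
    my ($file) = @_;
    # $^X is the path of the Perl interpreter running this script.
    # 2>&1 is needed because perl -c writes its report to standard error.
    my $cmd = qq{"$^X" -c "$file" 2>&1};
    my @report = `$cmd`;
    return @report ? $report[0] : '';
}

# Usage: perl firsterr.pl programname
if (@ARGV) {
    print first_report_line($ARGV[0]);
}
```

When the program has no syntax errors, the line returned is Perl's "syntax OK" message instead.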

Looking At the Causes of Common Syntax Errors

Some syntax errors are very easy to spot; for example, if you misspell the word print. Perl has some tricky syntax, however, and some errors are much harder to detect.

Now you'll look at some of the most common errors you can make when creating a Perl program and the error messages or other symptoms they are likely to produce.

Note
Not all syntax errors produce an error message. If a section of your program doesn't work or behaves in an unexpected manner, watch out for one of the errors described in this section.

Punctuation Problems

One of the most basic syntax errors is incorrect punctuation. Because these errors can be created by a simple missed key on the keyboard, they are quite common. Perl uses certain characters to indicate sections of the program or parts of a command. Table 13.1 lists some errors to watch out for.

Table 13.1. Common punctuation errors in Perl.

Symbol | Name | Description
; | semicolon | Each command in your Perl program must end with a semicolon. Unfortunately, the error message you get doesn't give you any hints. The error message in the "Interpreting Perl Error Messages" section was caused by this very error. The line listed in the error message is usually the line after the line missing the semicolon.
{ } | braces | Used to delimit sections of the program. The most common problem is leaving off a closing brace to correspond with an opening brace. Fortunately, the error message is right on target: Missing right bracket (Perl 5 words it as Missing right curly or square bracket). Remember that you need to use braces after each if, while, or sub statement.
( ) | parentheses | Most of the commands in Perl do not require parentheses. However, an if statement must use parentheses around the condition.
" " | double quotation marks | Perl allows quoted strings to include multiple lines. This means that if you leave off a closing double quotation mark, the rest of your entire program might be considered part of the string.

Assignment and Equality Operators

Operators are used to form a relationship between two values in the program. The most common operator syntax error is also the hardest to notice. Remember that Perl uses two kinds of equal sign:

  • The assignment operator (=) is used to assign a value to a variable.
  • The equality operator (==) is used in an if statement's condition to test equality between two numbers.

If you're like me, you'll run into this error constantly; usually it's a simple typing mistake. What makes it so troublesome is that the incorrect operator often does not cause a syntax error; instead, it just works differently than you are expecting. Consider the following sample code:

    if ($result = 5) {
        print "The result is 5.";
    }

This looks like a correct section of code; in fact, it would be perfectly acceptable in some languages. However, note that the assignment operator (=) has been used in the if statement when the equality operator (==) should have been used.

What does this mean to the program? Well, instead of comparing the $result variable to the constant 5, the program assigns it the value 5. Worse, Perl allows the assignment to be used as a condition. The value that was assigned determines whether the condition is true; in other words, instead of saying "if the result is 5," you're saying "make the result 5, and then continue if 5 counts as true."

Needless to say, this creates a problem. First of all, your condition always will be considered true, because the assigned value, 5, is neither zero nor an empty string. Second, and worse, your $result variable will be assigned the value 5, losing its previous value.

Based on this scenario, you should remember the following clues, which might let you know that you have mistakenly used the wrong type of equal sign:

  • An if statement is treated as if it is always true.
  • A variable changes value unexpectedly after a comparison.
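Both clues can be seen in a few lines. This is a small demonstration rather than code from a real project; the @log array is only there to record which branches ran.

```perl
use strict;
use warnings;

my $result = 3;
my @log;

# Wrong: = assigns 5 to $result, and the value 5 counts as true.
# With warnings enabled, Perl reports "Found = in conditional, should be ==".
if ($result = 5) {
    push @log, "assignment branch ran, result is now $result";
}

$result = 3;
# Right: == compares the two numbers and leaves $result alone.
if ($result == 5) {
    push @log, "comparison branch ran";
}
push @log, "after comparison, result is still $result";

print "$_\n" for @log;
```

The warning mentioned in the comment is a good reason to check your programs with the -w switch or a use warnings line.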

String and Numeric Equality Operators

Before you consider that if statement to be good, there's one more thing to check. Perl, unlike some languages, uses separate operators to refer to strings and numbers. The equality operator, ==, is strictly for numbers.

The operators are easy to remember, because the string operators use strings (combinations of letters) instead of the normal punctuation. Table 13.2 gives a summary of the different operators for strings and numbers.

Table 13.2. String and numeric operators in Perl.

Condition | Numeric Operator | String Operator
Is equal to | == | eq
Does not equal | != | ne
Is greater than | > | gt
Greater than or equal | >= | ge
Is less than | < | lt
Less than or equal | <= | le

Tip
The assignment operator = is the same for both numbers and strings.
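The difference matters most when a numeric operator is used on text. In this short demonstration (the sample values are arbitrary), == reports that two different words match, because both convert to the number 0.

```perl
use strict;
use warnings;

my $entered = "Utah";
my $stored  = "Ohio";

my $numeric_match;
{
    # Without this line, warnings would flag the mistake below with an
    # "isn't numeric" message.
    no warnings 'numeric';
    # Wrong for strings: both values convert to the number 0, so == says they match.
    $numeric_match = ($entered == $stored) ? 1 : 0;
}

# Right for strings: eq compares the characters.
my $string_match = ($entered eq $stored) ? 1 : 0;

print "== says match: $numeric_match\n";
print "eq says match: $string_match\n";
```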

Variable Syntax Errors

Another common syntax problem is in variable names. All variables in Perl start with a character that indicates the type of variable. You often can refer to a variable in more than one way. Table 13.3 lists the characters used with the three types of variables.

Table 13.3. The syntax used for different Perl variable types.

Variable Type | Character | Example
Scalar | $ | $result
Array (entire array) | @ | @data
Array (one element) | $ | $data[4]
Associative array (entire array) | % | %value
Associative array (one element) | $ | $value{"key"}

The simplest variable syntax error is to leave the character off the beginning of the variable, like this:

    result = 1

Again, if you're used to another language, you will run into this problem frequently. A more complicated issue involves using the correct character to refer to an entire array or a single element. A good rule of thumb is that the dollar sign ($) should be used any time you are referring to one element. You must include brackets [ ] for an array or curly braces { } for an associative array; this is how Perl can tell to which type of variable you are referring.
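The forms in Table 13.3 look like this in practice. The variable names come from the table; the values are invented for the example.

```perl
use strict;
use warnings;

my $result = 1;                      # scalar
my @data   = (10, 20, 30, 40, 50);   # entire array
my %value  = (key => "apple");       # entire associative array

my $fifth = $data[4];                # one array element: $ with [ ]
my $item  = $value{"key"};           # one associative array element: $ with { }
my $count = scalar @data;            # the array in scalar context gives its size

print "result=$result fifth=$fifth item=$item count=$count\n";
```

Note that $data[4] is the fifth element, because Perl numbers array elements starting at 0.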

Viewing HTML Sources of Output

Many CGI problems can cause you to receive no output at all or simply an error message. The most common error message was shown at the beginning of this chapter. That message is repeated here:

    This server has encountered an internal error which prevents it from fulfilling your request. The most likely cause is a misconfiguration. Please ask the administrator to look for messages in the server's error log.

As mentioned earlier, this error message can be caused by your program failing to execute at all, and you should check for that first. Even if your program does execute, however, it can produce this error if it does not output correct HTML and headers.

Using MIME Headers

As you learned earlier in this guide, the first output your CGI program should produce is a MIME header to indicate the type of output. This usually is HTML, but your program can output anything: text, a downloadable file, or even a graphic. Most of your CGI scripts use a header like the following (the beginning of the actual HTML is included for clarity):

    Content-type: text/html

    <HTML>

Note the blank line after the Content-type header and before the HTML document begins. This is mandatory. If the blank line is not included, you receive the error message just discussed.

Alternatively, your program might return a reference to an existing URI. The output should look something like this:

    Content-type: text/html
    Location: URI of referenced document

    <HTML>

Note that you still include the beginning of an HTML document. It's best to include a small HTML document with the reference. The reason? First of all, if it is mistakenly interpreted as actual HTML, you'll have some hint as to what's going on. Second, some browsers won't accept the headers, including the all-important Location, unless they're followed by at least one line of text. The blank line after the headers still is required.
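One way to avoid forgetting the blank line is to produce the headers in one place. The two subroutines below are hypothetical helpers, not part of Perl or of any library; they are a sketch of the two header forms described in this section.

```perl
use strict;
use warnings;

# Returns the header for a normal document; the type defaults to text/html.
sub content_header {
    my ($type) = @_;
    $type = "text/html" unless defined $type;
    return "Content-type: $type\n\n";    # the second \n produces the blank line
}

# Returns a reference to another URI, followed by a small HTML document.
sub location_response {
    my ($uri) = @_;
    return "Content-type: text/html\n"
         . "Location: $uri\n"
         . "\n"
         . "<HTML><BODY>The document is at <A HREF=\"$uri\">$uri</A>.</BODY></HTML>\n";
}

print content_header();
print "<HTML><BODY>This is a simple test document.</BODY></HTML>\n";
```

To send a reference instead, a script would print location_response() with the target URI as its only output.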

Examining Problems in the HTML Output

If your program is outputting the correct headers, you still might not receive any output. The most likely cause is incorrect HTML in the output after the header. Some browsers are forgiving and will display incorrect HTML; others will ignore it completely or display it incorrectly. If your browser allows you to view HTML source, you can quickly pinpoint the problem. Here are some common HTML mistakes you should check for:

  • Be sure that you include the HTML tag as the first element and end it properly with the </HTML> tag at the end of the output.
  • Although the Head and Body elements are not required, they can cause problems if they are included but not closed.
  • Watch for punctuation problems. These can be hard to spot when your program produces the HTML in print statements. Be sure that each < character is followed by a > character to end the tag. Also watch for quotation marks that are not closed.
  • Be sure that you aren't producing any non-ASCII characters as output.
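If you have saved the program's output in a file or a variable, a rough version of this checklist can be run automatically. The subroutine html_problems below is a hypothetical helper, and its checks are deliberately crude: it counts characters rather than parsing the HTML, so treat its findings as hints.

```perl
use strict;
use warnings;

# html_problems is a hypothetical helper; it returns a list of suspected problems.
sub html_problems {
    my ($html) = @_;
    my @problems;

    # Every < should have a matching >.
    my $open  = () = $html =~ /</g;
    my $close = () = $html =~ />/g;
    push @problems, "unbalanced < and > ($open vs $close)" if $open != $close;

    push @problems, "output does not start with <HTML>" unless $html =~ /^\s*<HTML[\s>]/i;
    push @problems, "missing </HTML> at the end"        unless $html =~ m{</HTML>\s*$}i;

    # Head and Body are optional, but must be closed if they are present.
    for my $tag (qw(HEAD BODY)) {
        my $has_open  = $html =~ /<${tag}[\s>]/i;
        my $has_close = $html =~ m{</${tag}>}i;
        push @problems, "$tag element is not closed" if $has_open && !$has_close;
    }

    my $quotes = () = $html =~ /"/g;
    push @problems, "odd number of quotation marks" if $quotes % 2;

    push @problems, "non-ASCII characters present" if $html =~ /[^\x00-\x7F]/;
    return @problems;
}

print "$_\n" for html_problems("<HTML><BODY>Test</HTML>");
```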

If you still have problems or are using a browser that doesn't allow you to view the source, there are two tricks that might be helpful, as described in the next sections.

Displaying the Output as Text

As you learned in the previous section, the MIME header your program outputs tells the browser what sort of content to expect and how to display it. You can take advantage of this and force the browser to display the output as text. This makes it easy to determine whether an HTML element is causing the problem. Change your header to the following (text/plain is the standard MIME type for plain text):

    Content-type: text/plain

    <HTML>

Using the Direct Method: Testing with Telnet

Are you still stuck trying to view your program's output without interference from the browser? If you have access to the telnet command, you can view the output without using a browser at all. This makes it easy to narrow down the problem.

Tip
The telnet command described here works under UNIX systems. If you use a Macintosh or Windows system to connect to the Internet, you can use one of the publicly available Telnet utilities.

First, use this command to open a session with the HTTP server:

    telnet sitename.com 80

The 80 specifies the port on which the HTTP server is running. This is typically 80 but might be different on your server; the Administrator might have chosen a different port number for security or for a special purpose. After you establish a connection, type a GET request like this, and then press Enter twice (the blank line tells the server that the request is complete):

    GET /cgi-bin/directory/scriptname HTTP/1.0

This is not a complete URI; instead, it is the location in which to find the document. Use the exact directory that your script is in; this is equivalent to the URI you use to access your script from a browser but does not include the http: identifier or the site name.

After your GET request (note that the capital letters are required), your program executes and the output appears as HTML source. It should be easy to find the error. You should note two considerations:

  • Even this method produces an error if your program does not include the correct header.
  • If the telnet command fails to connect at all, it's a good indication that the HTTP server is down. This means the problem might not be in your program at all.

As a final example, here is the captured output of executing a CGI script from a successful Get request through the telnet command:

    Trying 198.60.22.4 ...
    Connected to www.xmission.com.
    Escape character is '^]'.
    GET /cgi-bin/users/mgm/randquote
    <HTML>
    This is a simple test document.
    </HTML>
    Connection closed by foreign host.
    Exit 1
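If telnet is not available, the same raw request can be sent from a short Perl script. This is a sketch, assuming the standard IO::Socket::INET module is installed; the subroutine names and the script name rawget.pl are made up for the example.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Builds the request text. HTTP lines end in CR LF, and an empty line
# marks the end of the request.
sub build_request {
    my ($path) = @_;
    return "GET $path HTTP/1.0\r\n\r\n";
}

# Connects, sends the request, and returns everything the server sends back.
sub fetch_raw {
    my ($host, $port, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or die "Cannot connect to $host:$port: $@\n";
    print {$sock} build_request($path);
    my @lines = <$sock>;
    close($sock);
    return join('', @lines);
}

# Usage: perl rawget.pl sitename.com 80 /cgi-bin/directory/scriptname
if (@ARGV == 3) {
    print fetch_raw(@ARGV);
}
```

As with telnet, a failure to connect suggests that the problem is with the HTTP server rather than with your program.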

Viewing the CGI Program's Environment

The next step in determining the cause of a problem with your CGI program is to view the input going into the program. This is usually the data entered in a form after a Get or Post query, or a QUERY_STRING that is appended directly to the URI.

Displaying the Raw Environment

The easiest way to determine the environment going into the program is to display it. This means using a different program temporarily: one that is intended simply to display the environment. Listing 13.1 shows a Perl program that simply displays the environment variables available to the program as an HTML file.


Listing 13.1. A CGI program to display environment variables.

    #!/usr/bin/perl

    MAIN: {
        # The \n\n ends the header with the required blank line.
        print "Content-type: text/html\n\n";
        print "<HTML><HEAD><TITLE>Environment Display</TITLE>";
        print "</HEAD><BODY>";
        # Print each environment variable and its value on a separate line.
        while (($key,$value) = each %ENV) {
            print "$key=$value<BR>\n";
        }
        print "</BODY></HTML>";
        exit 0;
    }


Listing 13.2 shows the typical output of this program. In this case, the CGI program was accessed directly; no form was used.


Listing 13.2. Output of the program in Listing 13.1.
    SERVER_SOFTWARE=Apache/0.8.13
    GATEWAY_INTERFACE=CGI/1.1
    DOCUMENT_ROOT=/usr/local/lib/httpd/htdocs
    REMOTE_ADDR=204.228.136.119
    SERVER_PROTOCOL=HTTP/1.0
    REQUEST_METHOD=GET
    REMOTE_HOST=slc119.xmission.com
    QUERY_STRING=
    HTTP_USER_AGENT=Mozilla/1.22 (Windows; I; 16bit)
    PATH=/usr/local/bin:/usr/sbin:/usr/local/sbin/:s/.
    HTTP_ACCEPT=*/*, image/gif, image/x-xbitmap, image/jpeg
    SCRIPT_FILENAME=/usr/local/lib/httpd/cgi-bin/users
    SCRIPT_NAME=/cgi-bin/users/mgm/test.cgi
    HTTP_PRAGMA=no-cache
    SERVER_NAME=www.xmission.com
    PATH_INFO=
    SERVER_PORT=8000
    PATH_TRANSLATED=/usr/local/lib/httpd/htdocs/mgm/test.cgi
    SERVER_ADMIN=www@xmission.com


As you can see, this gives you quite a bit of information. Here are some of the items it lets you check when you are tracking down a problem:

  • The HTTP server software version. You might run into some servers that behave differently than others; it's good to know which server is running.
  • The request method. Get is the default; you should use Post for most forms.
  • The translated path, which tells you exactly where the CGI script is located so that you can be sure you're editing the right one.
  • The QUERY_STRING variable holds the data sent with a Get request; for a Post request, the CONTENT_LENGTH variable gives the number of bytes of data that were sent. This is useful for debugging a form; simply make the script in Listing 13.1 the Action attribute of the form using the Get method.

Displaying Name/Value Pairs

A more useful debugging script displays the name and value pairs that were submitted. You easily can make such a script. Use the same code you usually do to split the name/value pairs, and use a section of code like this to display them:

    while (($key,$value) = each %entries) {
        print "$key=$value<BR>\n";
    }

In this example, the name/value pairs are contained in the associative array %entries. The each keyword allows you to display each element in the array without knowing its key. To use this script to debug a form, simply point the Action field to this script instead of your normal script.

Here is an example of the output of this script, using a form with the Post method and several text fields:

    Name = John Smith
    Address = 221b Baker Street
    Phone = 801-555-1245
    Interests = Computers, Hiking, Bad Poetry
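For a self-contained version of such a script, the splitting code has to be included too. The following sketch handles Get data only. The subroutine parse_query is a hypothetical stand-in for whatever splitting code your own scripts use; it fills the %entries array used in this section.

```perl
use strict;
use warnings;

# parse_query is a hypothetical helper that splits a query string into
# name/value pairs and decodes them.
sub parse_query {
    my ($query) = @_;
    my %entries;
    for my $pair (split /&/, $query) {
        next if $pair eq '';
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                                # + stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX is a hex-encoded character
        }
        $entries{$key} = $value;
    }
    return %entries;
}

my %entries = parse_query(defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '');

print "Content-type: text/html\n\n<HTML><BODY>\n";
for my $key (sort keys %entries) {
    print "$key=$entries{$key}<BR>\n";
}
print "</BODY></HTML>\n";
```

Sorting the keys is a design choice for readability; the each loop shown earlier works just as well but returns the pairs in no particular order.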

Debugging at the Command Line

If you are allowed access to the UNIX command line or shell, you can access some additional debugging features. These include testing the program without involving the HTTP server and using Perl's powerful debug mode to find bugs in your program.

Testing without the HTTP Server

Although your CGI program is intended to work with an HTTP server across the Internet, there are some advantages to testing it without involving the HTTP server at all:

  • You can view exact error messages when they occur.
  • You can see the program's output, even if it is not correct HTML or does not contain the correct headers.
  • You can eliminate problems that might be caused by bugs in the HTTP server itself.

If your program is a simple SSI program, it's easy to test at the command line. Simply type the name of the program at the command line. If the current directory is not in your PATH environment variable, you might need to include a directory name in your command. This command executes a program called test.cgi in the current directory:

    ./test.cgi

The period in this example is interpreted by UNIX to mean the current directory. You also could type the entire path to the program file.

This method also works if your program does not accept any parameters; in other words, if it is intended to give information that is not based on input from a form or from the URI. If your program does expect input, you'll need to do something a bit more tricky: simulate a Get request.

Simulating a Get Request

If you are using the Post method with your script, there is no easy way to test it at the command line. The Get method is easy to simulate, however. You can change the method to Get temporarily in order to use this technique.

In a Get request, these environment variables are set:

    REQUEST_METHOD = GET
    QUERY_STRING = data

You can set these manually to fool your program into working at the command line. For the variables in the QUERY_STRING, you need to use the & character between variable/value pairs and the = character between variables and their values. Suppose that you want to send this data to the script:

    Name: John Smith
    Address: 321 Elm Street
    City: Metropolis

You would use these variable settings:

    REQUEST_METHOD = GET
    QUERY_STRING = Name=John Smith&Address=321 Elm Street&City=Metropolis

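In actuality, things are a bit more difficult, because the & characters and the spaces are interpreted as special characters by the shell. Here are the actual commands to use to set these variables in the C shell (csh or tcsh):

    setenv REQUEST_METHOD GET
    setenv QUERY_STRING "Name=John Smith&Address=321 Elm Street&City=Metropolis"

The quotation marks around the string are required. They make the shell treat each & as an ordinary character rather than as a request to run a command in the background, and they allow the spaces to be treated as spaces; otherwise, the command would end with the first space. Do not add a backslash (\) before each & inside the quotation marks, because the shell would keep the backslash as part of the data. If you use a Bourne-style shell such as bash instead, the equivalent commands are these:

    export REQUEST_METHOD=GET
    export QUERY_STRING="Name=John Smith&Address=321 Elm Street&City=Metropolis"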

After typing the earlier commands, verify your settings by typing the setenv command by itself. This displays the entire environment; the last two entries should be the ones you added. Make sure that the data is listed correctly.

After the environment is set up correctly, you can invoke the Perl interpreter to execute the program. For example, this command tests the program test.cgi:

    perl test.cgi

If your program outputs a complex HTML document, it might not be easy to interpret its output. One solution to this is to redirect the program's output to an HTML file that you can view with the browser. This command executes test.cgi and stores the output in test.html:

    perl test.cgi >test.html

The Content-type header appears as ordinary text at the top of the file, because no HTTP server is involved to interpret it.

This method is particularly useful when it's necessary to debug the program without placing it online, such as in situations where the server's Administrator must place scripts online manually. It is also handy because, after you set the variables as listed earlier, you can test the program repeatedly without having to retype the data.
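The same simulation can be driven from Perl itself, which avoids the shell's quoting rules altogether. This is a sketch for UNIX systems; the subroutine run_as_get and the script name runget.pl are hypothetical.

```perl
use strict;
use warnings;

# run_as_get is a hypothetical helper. It runs a CGI script with the two
# environment variables of a Get request set, and returns the script's output.
sub run_as_get {
    my ($script, $query) = @_;
    # local restores the previous values when the subroutine returns.
    local $ENV{REQUEST_METHOD} = 'GET';
    local $ENV{QUERY_STRING}   = $query;
    # The list form of open runs the command without a shell, so the
    # & characters need no quoting. $^X is the running Perl interpreter.
    open(my $out, '-|', $^X, $script) or die "Cannot run $script: $!\n";
    my @lines = <$out>;
    close($out);
    return join('', @lines);
}

# Usage: perl runget.pl test.cgi "Name=John+Smith&City=Metropolis"
if (@ARGV == 2) {
    print run_as_get(@ARGV);
}
```

Because the output comes back as a string, you can also search it for the text you expect instead of reading it by eye.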

Using Perl's Debug Mode

Another advantage of debugging a CGI program at the command line is that you can use the debug mode available with Perl. This gives you much greater control over the execution of the program. You can step through each command individually, examine variable values along the way, and narrow down the source of an error or incorrect result.

Before you begin, set the environment variables to simulate a Get request if your program needs it, as described in the previous section. Then type this command to start the program in debug mode:

    perl -d programname

After you type this command, Perl loads your program, displays the first statement, and stops before executing it. Perl then asks you for a command. You can enter Perl commands here and they are executed. More important, you can enter special debug commands. Table 13.4 lists the most useful commands.

Table 13.4. Useful Perl debug commands.

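Command | Mnemonic | Explanation
/text | search | Searches forward for the text in the program
?text | search back | Searches backward for the text
b | break | Sets a breakpoint at the current line; b line sets one at the specified line
b sub | break sub | Sets a breakpoint at the start of a subroutine
c | continue | Continues to the next breakpoint
<CR> | next | Repeats the last "next" or "step" command
d line | delete break | Deletes the breakpoint at line, or at the current line if no line is given
D | delete all | Deletes all breakpoints
f | finish | Executes statements until the end of the current routine
h | help | Displays a list of debug commands
l number | list | Lists line number of the program; l by itself lists the next few lines
l sub | list sub | Lists a named subroutine
n | next | Advances to the next statement, stepping over subroutine calls
p | print | Displays a variable or an expression's value
q | quit | Exits the debugger and quits the program
s | step | Executes a single statement (a single step), stepping into subroutine calls
S | subroutines | Lists the names of all subroutines
t | trace | Turns trace mode on or off; trace mode displays commands as they execute
V | variables | Lists all variables

Note
The command letters vary between versions of the debugger. In current Perl 5 releases, for example, r (return) runs until the current subroutine returns, f switches to another file, and B deletes breakpoints. Type h to see the list for your version.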

As a quick introduction to the debugger, here are the actions you will perform in a typical debugging session:

  • Use the -d option to start the program under the debugger.
  • Step through the program with the s command. This makes it easy to see when an error happens.
  • If you are testing a certain routine, use the b routine command to set a breakpoint at the start of the routine, and then use the c (continue) command to continue until the breakpoint is reached.
  • If you are testing a certain command, set a breakpoint at that command. This is particularly useful in loops. To do this, use the s command to move to the statement, and then use the b command to set a breakpoint.
  • While stepping through a program, use the p command to test the current values of variables. For example, p $result displays the value of the variable $result. You can use any expression; for example, p $correct / $possible.
  • The t (trace) command provides an easy way to know when the program is crashing. Simply type t to begin tracing, and then c to continue execution. The last trace message displayed lets you know which command was executing when the program stopped.

As a final bit of explanation, Listing 13.3 shows the output of the beginning of a typical debug session. The first statement in this program sets a variable called $sendmail. The prompt is the DB<1> at the end of the output. This is where you type debug commands.


Listing 13.3. Starting a Perl debug session.
    perl -d jobqry.cgi
    Loading DB routines from $RCSfile: perl5db.pl,v $$Revision: 4.1 $$Date: 92/08/07 18:24:07 $
    Emacs support available.
    Enter h for help.
    main::(jobqry.cgi:10): $sendmail = "/usr/lib/sendmail";
    DB<1>


Reading the Server Error Log

One of the tools you might have available is the HTTP server's error log. This is a text file that lists all the errors that have occurred. Each time your CGI script produces an error, a message is added to this log.

Unfortunately, you often will not have access to the error log. You can ask your Administrator to view it or to give you access, though. Of course, if you have your own server, you will have no problem. Listing 13.4 shows a sample of part of an error log. This is from a particularly busy server; all these errors happened within about six and a half hours.


Listing 13.4. A section of an HTTP server's error log.
    [20/Apr/1995:17:50:17 +0500] [OK] [host: dsouza.interlog.com referer: http://webcrawler.cs.washington.edu/cgi-bin/WebQuery] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:18:15:29 +0500] [OK] [host: cleta.chinalake.navy.mil referer: http://webcrawler.cs.washington.edu/cgi-bin/WebQuery] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:20:55:17 +0500] [OK] [host: mac1223.botany.iastate.edu referer: http://webcrawler.cs.washington.edu/cgi-bin/WebQuery] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:21:09:26 +0500] [OK] [host: slip16.docker.com referer: http://webcrawler.cs.washington.edu/cgi-bin/WebQuery] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:21:14:46 +0500] [OK] [host: ip-pdx8-30.teleport.com referer: http://webcrawler.cs.washington.edu/cgi-bin/WebQuery] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:22:45:38 +0500] [OK] [host: alpha10.scs.carleton.ca] Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0
    [20/Apr/1995:23:04:53 +0500] [MULTI FAILED] [host: opentext.uunet.ca] /robots.txt
    [20/Apr/1995:23:36:54 +0500] [OK] [host: macsf47.med.nyu.edu referer: http://charlotte.med.nyu.edu/getstats] Connection interrupted [SIGPIPE], req: GET /getstats/statform HTTP/1.0
    [20/Apr/1995:23:42:15 +0500] [OK] [host: macsf47.med.nyu.edu referer: http://charlotte.med.nyu.edu/getstats/statform.asp] Bad script request - none of '/opt/cern_httpd_3.0/cgi-bin/getstats' and '/opt/cern_httpd_3.pp.pp' is executable (500) "POST /cgi-bin/getstats HTTP/1.0"
    [20/Apr/1995:23:54:39 +0500] [OK] [host: macsf47.med.nyu.edu referer: http://charlotte.med.nyu.edu/getstats/statform.asp] Bad script request - none of '/opt/cern_httpd_3.0/cgi-bin/getstats' and '/opt/cern_httpd_3.pp.pp' is executable (500) "POST /cgi-bin/getstats HTTP/1.0"
    [21/Apr/1995:00:28:39 +0500] [OK] [host: charlotte.med.nyu.edu] Invalid request "" (unknown method)


Tip
If you are the Administrator, you should keep an eye on the size of the error log. You can quickly run out of disk space if you aren't careful.

The error log typically is found in a directory under the httpd directory. In a typical server setup, the directory is /usr/local/lib/httpd/logs.

You need to ask your Administrator to tell you the exact location of the log file and to give you access to it. As you can see, several items are logged for each error message:

  • The date and time when the error occurred
  • The host that requested the data
  • The type of error that was encountered
  • The method (Get or Post)

The exact messages listed in the error log depend on the type of HTTP server you are running. The example in Listing 13.4 was produced by the CERN HTTP server. You should browse the log after experiencing various errors to get an idea of what events they cause. In Listing 13.4, the message Bad script request is a particularly useful message; it indicates that the script file was not found or is not executable.
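If the log is long, a short Perl script can pull these items out of each line for you. The following is a minimal sketch, assuming the CERN-style layout shown in Listing 13.4 (a bracketed date, a bracketed status, a bracketed host with an optional referer, and then the message). The subroutine name parse_error_line and the pattern itself are illustrative assumptions; other servers format their logs differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one CERN-style error log line into fields.
# The pattern is an assumption based on the sample lines in Listing 13.4.
sub parse_error_line {
    my ($line) = @_;
    return unless $line =~ m{
        ^\[([^\]]+)\]\s+                # date and time
        \[([^\]]+)\]\s+                 # status, such as OK
        \[host:\s*(\S+?)                # requesting host
        (?:\s+referer:\s*([^\]]*))?     # optional referring URL
        \]\s*(.*)                       # rest of the line is the message
    }x;
    return {
        date    => $1,
        status  => $2,
        host    => $3,
        referer => defined $4 ? $4 : '',
        message => $5,
    };
}

# Sample line copied from the listing; in practice you would read the log file.
my $sample = '[20/Apr/1995:22:45:38 +0500] [OK] [host: alpha10.scs.carleton.ca] '
           . 'Connection interrupted [SIGPIPE], req: GET /89-94.refs.asp HTTP/1.0';

my $entry = parse_error_line($sample);
print "$entry->{date} | $entry->{host} | $entry->{message}\n" if $entry;
```

Lines that do not fit the pattern return nothing, so you can skip them rather than guess at their contents.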

Debugging with the Print Command

If you don't have access to the error log and don't find it convenient (or possible) to test your script at the command line, you might try debugging "the hard way" with simple print commands. In fact, this method is often the easiest to use and can quickly narrow down the source of a problem.

Note
Some Internet providers give you access to your own directory to run CGI scripts but don't allow access to the command line. This is a difficult situation; the print command method is one of the debugging methods that still is available to you in this circumstance.

As an example, Listing 13.5 shows a section of a script used to search for jobs matching certain criteria. To be completely realistic, I've even included a bug in the code. Can you find it?


Listing 13.5. A simple CGI program with a bug in it.
    # State must match if entered
    if ($rqpairs{"State"} gt " ") {
        if ($rqpairs{"State"} ne $data{"ST"}) {
            $match = 0;
        }
    }
    # Zip code must match if entered
    if ($rqpairs{"Zip_Code"} gt " ") {
        if ($rqpairs{"Zip_Code"} ne $data{"Z"}) {
            $match = 0;
        }
    }
    # Country must match if entered
    if ($rqpairs{"Country"} gt " ") {
        if ($rqpairs{"Country"} != $data{"C"}) {
            $match = 0;
        }
    }


As you can see, this code is comparing several values entered in a form, stored in the associative array %rqpairs, with values in a database, stored in the associative array %data. The $match variable is used to indicate whether the record matches the criteria. The $match variable defaults to 1 and is changed to 0 if any of the criteria do not match.

The symptoms: When the code in Listing 13.5 executes, $match always ends up being 0. The search is never successful, even if the exact values for State, Zip_Code, and Country are entered.

To fix this problem with the debugger, you simply can step through each if statement block and display the value of the $match variable after each one. You can do the same thing with print statements. Listing 13.6 shows the section of code in Listing 13.5 with print statements inserted. I left the print statements non-indented to make them easy to see.

Note
Be sure that your program outputs a correct MIME header before the output so that you will be able to view the results of the print statements on your browser. If your program already outputs HTML, you probably won't need to add anything.


Listing 13.6. Adding print statements to show data as the program executes.
    # State must match if entered
    if ($rqpairs{"State"} gt " ") {
        if ($rqpairs{"State"} ne $data{"ST"}) {
            $match = 0;
        }
    }
    print "After State: match=$match";
    # Zip code must match if entered
    if ($rqpairs{"Zip_Code"} gt " ") {
        if ($rqpairs{"Zip_Code"} ne $data{"Z"}) {
            $match = 0;
        }
    }
    print "After Zip: match=$match";
    # Country must match if entered
    if ($rqpairs{"Country"} gt " ") {
        if ($rqpairs{"Country"} != $data{"C"}) {
            $match = 0;
        }
    }
    print "After Country: match=$match";


As you can see, I displayed the $match variable after each criterion is checked. The text in the print statement lets you know which of the print statements is being executed. Here is the output the print statements produce:

    After State: match=1
    After Zip: match=1
    After Country: match=0

Aha! It looks like the check for the Country field always results in a match value of 0. If you're very observant, you've probably found the error already. Look at this line again:

    if ($rqpairs{"Country"} != $data{"C"}) {

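Here, I accidentally used the numeric inequality operator (!=) when I should have used the string inequality operator (ne). It's a common mistake. The != operator converts both values to numbers before comparing them, so the result no longer reflects whether the two strings contain the same text.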
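Perl can point out this kind of mistake for you if warnings are turned on. The following is a minimal sketch, assuming sample values for the form and database fields; it runs the Country test from Listing 13.5 and collects the warnings Perl raises when a numeric operator is applied to text. The warning handler and the sample data are illustrative, not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;   # enables warnings, much like the -w switch

# Collect warnings in an array so they can be shown in the output.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Illustrative values standing in for a form field and a database field.
my %rqpairs = (Country => "USA");
my %data    = (C       => "USA");

# The test from Listing 13.5: != converts both strings to numbers first.
my $match = 1;
if ($rqpairs{"Country"} != $data{"C"}) {
    $match = 0;
}

print "Perl reported ", scalar(@warnings), " warning(s):\n";
print "  $_" foreach @warnings;
```

Each warning names the value that is not numeric and the operator involved, which usually leads you straight to the line that needs ne instead of !=.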

You can follow this same method and use as many print statements as you need to diagnose the problem. After you finish debugging, you need to remove every one of the print statements. In the final section of this chapter, you'll learn about an alternative print routine called bugprint that you can use for this purpose and then easily turn off.

Note
Because the output of your CGI program is being interpreted as HTML, it helps to include HTML codes, such as <BR> for a line break, in the text of your print statements.
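The two notes in this section can be combined in a small helper. The following is a minimal sketch, assuming a hypothetical subroutine named debug_line: it prints the MIME header first, ends every debugging line with <BR>, and escapes the characters &, <, and > so that values containing HTML show up literally in the browser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format one debugging line for display in a browser.
# It escapes &, < and > so the value appears literally, then adds <BR>.
sub debug_line {
    my ($label, $value) = @_;
    $value = '(undefined)' unless defined $value;
    for ($label, $value) {
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
    }
    return "$label: $value<BR>\n";
}

# The MIME header must come first, followed by a blank line.
print "Content-type: text/html\n\n";

my $match = 1;   # sample value for illustration
print debug_line("After State", "match=$match");
print debug_line("Raw input", "<b>bold?</b>");
```

If your program already prints its own header, leave out the Content-type line so that it is not sent twice.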

Looking At Useful Code for Debugging

In this section, you'll learn about some handy Perl programs you can use to assist in your debugging. They are short and easy to type in and use, and they can save you hours of time. Each program is explained and presented here.

Note
These programs have been tested under Perl 5.0 on a UNIX system. You need to specify the correct location for the Perl interpreter on the first line of the program, and you might need to modify it slightly for your system.

Show Environment

The program shown in Listing 13.7 displays the environment available when a CGI program executes. A shortened version of this was presented in the section "Viewing the CGI Program's Environment," earlier in this chapter. This version is a bit longer but displays more readable HTML.


Listing 13.7. A CGI program to display the environment.
    #!/usr/bin/perl

    MAIN: {
        print "Content-type: text/html\n\n";
        print "<HTML><HEAD><TITLE>Environment Display</TITLE>";
        print "</HEAD><BODY>";
        print "<H1>Environment Variables</H1>";
        print "The following variables are present in the current environment:";
        print "<UL>";
        # List every variable the server passed to this program
        while (($key,$value) = each %ENV) {
            print "<LI>$key = $value\n";
        }
        print "</UL>";
        print "End of environment.";
        print "</BODY></HTML>";
        exit 0;
    }


Show Get Values

Listing 13.8 shows a simple script that displays all the variables from a form using the Get method. To use it, simply set the Action field of the form to this program instead of your normal program, as in this example:

    <FORM METHOD="GET" ACTION="/cgi-bin/show_get">


Listing 13.8. A program to display Get values.
    #!/usr/bin/perl

    MAIN: {
        print "Content-type: text/html\n\n";
        print "<HTML><HEAD><TITLE>GET Variables</TITLE>";
        print "</HEAD><BODY>";
        print "<H1>GET Method Variables</H1>";
        print "The following variables were sent:";
        print "<UL>";
        $request = $ENV{'QUERY_STRING'};
        # Split request into names and values
        @fields = split(/[&=]/, $request);
        # Convert URI syntax to ASCII
        foreach (@fields) {
            tr/+/ /;
            s/%(..)/pack("c",hex($1))/ge;
        }
        # Pair up the decoded names and values
        %rqpairs = @fields;
        # Display each value
        while (($key,$value) = each %rqpairs) {
            print "<LI>$key = $value\n";
        }
        print "</UL>";
        print "End of variables.";
        print "</BODY></HTML>";
        exit 0;
    }


Show Post Values

The program shown in Listing 13.9 is similar to Listing 13.8, but it displays values for a Post query. This is a bit more complicated. Again, simply point the Action field of your form to the location of this program, as in this example:

    <FORM METHOD="POST" ACTION="/cgi-bin/show_post">


Listing 13.9. A program to display Post values.
    #!/usr/bin/perl

    MAIN: {
        print "Content-type: text/html\n\n";
        print "<HTML><HEAD><TITLE>POST Variables</TITLE>";
        print "</HEAD><BODY>";
        print "<H1>POST Method Variables</H1>";
        print "The following variables were sent:";
        print "<UL>";
        # Read POST data from standard input.
        # The CONTENT_LENGTH variable tells us how
        # many bytes to read.
        read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
        # Split request into names and values
        @fields = split(/[&=]/, $request);
        # Convert URI syntax to ASCII
        foreach (@fields) {
            tr/+/ /;
            s/%(..)/pack("c",hex($1))/ge;
        }
        # Pair up the decoded names and values
        %rqpairs = @fields;
        # Display each value
        while (($key,$value) = each %rqpairs) {
            print "<LI>$key = $value\n";
        }
        print "</UL>";
        print "End of variables.";
        print "</BODY></HTML>";
        exit 0;
    }
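Listings 13.8 and 13.9 split the request on & and = in a single pass, which relies on every field arriving as exactly one name followed by one value. If you want something a little more forgiving, the following is a minimal sketch of one alternative: it splits the request into pairs on & first and then splits each pair on its first = sign. The subroutine names url_decode and parse_form are hypothetical, and the sample request string is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded name or value: + becomes a space,
# and %XX becomes the character with that hexadecimal code.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical helper: turn "name=value&name2=value2" into a hash.
# Repeated names keep only the last value in this simple sketch.
sub parse_form {
    my ($request) = @_;
    my %pairs;
    foreach my $pair (split /&/, $request) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        $pairs{ url_decode($name) } = url_decode($value);
    }
    return %pairs;
}

# Made-up request string with an empty field and an encoded percent sign.
my %rqpairs = parse_form("State=New+York&Zip_Code=&Note=50%25+off");
foreach my $key (sort keys %rqpairs) {
    print "$key = [$rqpairs{$key}]\n";
}
```

The brackets in the output make empty values and stray spaces easy to see. You can pass this routine either the QUERY_STRING value or the data read from standard input.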


Display Debugging Data

The Display Debugging Data program is the simplest program in this section, but you might find it (or your own modified version) very useful. In the "Debugging with the Print Command" section, you learned about using print statements to display variables during sections of the program. You can use the bugprint subroutine shown in Listing 13.10 instead. It offers a simple advantage: You can turn it off.

The bugprint routine prints, but only if the variable $debug is set to 1. This means that you can quickly remove all the debugging from your program simply by setting $debug to 0. In addition, because it uses a different keyword than print, you quickly can search through the program to remove the debug commands when you're finished.

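This routine also displays the value of the Perl internal variable $!, which holds the message for the most recent system error and may provide some insight into the problem. (The value is meaningful only immediately after an operation, such as opening a file, has failed.) Finally, it adds the <BR> HTML tag to separate lines of output.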

Listing 13.10 shows the code for the bugprint routine. It could really fit on a single line, but I've stretched it out to make its meaning clear.


Listing 13.10. A program to display variables for debugging.
    sub bugprint {
        # Print only while debugging is switched on
        if ($debug == 1) {
            print "Debug: ";
            print @_;
            print "<BR>\n";
            print "Last error: $!<BR>\n";
        }
    }


To use this subroutine, simply insert the code in Listing 13.10 at the end of your program. Then add the following command to the start of your program to turn on debugging:

    $debug = 1;

After you're through debugging, you can change the $debug value to 0 to deactivate all the debugging output. This makes it easy to quickly switch between the debug output and the normal output.

Remember that, because bugprint is a subroutine, you must put parentheses around its argument when you call it; the & prefix is optional in Perl 5. (The older do keyword form is deprecated in Perl 5 and is rejected by recent releases.) You can use variables in the statement, just as you would with print. Here are two examples:

    bugprint("The current value is: $result");
    &bugprint("Key: $key Value: $value");

A Final Word about Debugging

And now, a final word about debugging. Three words, to be exact: Don't give up. Debugging can be a long, time-consuming process with little reward. You can spend hours staring at code and testing it over and over before finally noticing one tiny typing mistake. Here are a few tips for the human side of debugging:

  • Take a break. If you've got time, wait a day or two, get some sleep, and then start debugging with a fresh mind. You'll be amazed at how much easier it is.
  • Don't be afraid to ask for help. Your System Administrator might be able to answer questions; in addition, several useful newsgroups are available in which you can ask questions.
  • If you have a friend who knows Perl, even just a little, have them look at the program. A fresh set of eyes often spots mistakes very quickly.
  • As a last resort, rewrite. If a section is giving you nothing but trouble, delete it and rewrite it. You'll know better how to do it, and you might make fewer mistakes, or at least mistakes that are easier to find.
  • Remember that debugging is part of the programming process. Don't be upset if you spend time debugging; plan on it. If you are being paid for your work, include debugging time in your estimate. As you become more experienced, you'll be able to better estimate this time, but even the experts still have to spend time debugging.

If you don't give up, you'll get through it and the program will work beautifully. Good luck and happy debugging!

Summary

In this chapter, you were introduced to the not-so-glamorous world of debugging CGI programs in Perl. You learned about many of the common mistakes you can make in a Perl program and many methods you can use to pinpoint the part of your program that is causing an error.

You also looked at several techniques that can make it easier to narrow down an error. These include the HTTP error log, the source of the HTML output, the environment provided to the CGI program, and the good old-fashioned print statement.

Finally, you learned about several code segments and complete programs that can be helpful in debugging your own CGI programs or HTML forms.

Q&A

Q
My program worked when I tested it, but it doesn't work now that it's in use. What could be the problem?
A
This is common for two reasons:

  • You might have developed the program on one server and moved it to another; there may be a difference in compatibility between the servers. There is also the possibility that the permissions were set incorrectly when your program was moved to the new server.
  • When the program is used in the real world, it may encounter a wide variety of data that you didn't use in the testing process. Look for a statement that fails when the data reaches a certain value. Adding print statements at key points may help.

Q
Are any new syntax errors possible with Perl 5?
A
Yes, but not too many. Certain errors have been eliminated, although the condition of an if statement still must be enclosed in parentheses. The main cause for errors is the @ character. Perl 5 interprets @ as the start of an array variable, even in a double-quoted string. This means that if you include this character in such a string (an e-mail address, for example), you must be sure to escape it with a backslash: \@. Previous versions of Perl accepted an unescaped @ here.
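As a brief illustration, the following sketch shows two ways to write an e-mail address so that Perl 5 does not treat the @ as the start of an array name. The address is a made-up placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical address used only for illustration.
my $escaped = "webmaster\@example.com";   # backslash stops interpolation
my $quoted  = 'webmaster@example.com';    # single quotes never interpolate

print "Escaped: $escaped\n";
print "Quoted:  $quoted\n";
```

Both variables end up holding the same text, so you can use whichever form reads better in your program.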
Q
Will future versions of HTML, such as HTML 3.2, affect my CGI scripts?
A
The only effect will be how the browser interprets the output of your program. The HTML 3.2 standard allows most valid HTML 2.0 tags, so there is little chance that your program will become completely unusable; however, you might want to modify it to take advantage of new HTML tags.
Q
The data from a Post form doesn't seem to reach my CGI program at all. What's wrong?
A
This may be a browser problem or a misconfigured HTTP server. In addition, if the URI you are using to access your program is forwarded to another URI, the Post data might not be forwarded properly. Try using the other URI in the Action field of your form.
Q
What are the most common HTTP servers?
A
You shouldn't have to worry, because the CGI standard is supported by most servers; however, some servers (particularly brand new versions) might have trouble with your CGI program. The most common UNIX-based server in use at this writing is Apache. Other common servers are the older ones from CERN and NCSA. Netscape Corporation's server, NetSite, also is becoming more popular.
Q
My script works at the command line but can't read from or write to a file when I run it online. What causes this?
A
Remember that most servers run CGI scripts as the user nobody. A file that you can access is not necessarily accessible to other users. Be sure to allow the Read and Write rights, if necessary, to all users; this is the only way to be sure that the file can be used from the CGI script.
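To see the problem from the server's point of view, you can run a small test script online. The following is a minimal sketch, assuming a UNIX system and a placeholder file name; the subroutine describe_access is hypothetical. It reports which user the script runs as, what that user may do with the file, and the system's reason if the file cannot be opened.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: describe what the current user may do with a file.
sub describe_access {
    my ($path) = @_;
    return "$path does not exist" unless -e $path;
    my @rights;
    push @rights, "readable" if -r $path;
    push @rights, "writable" if -w $path;
    return "$path is " . (@rights ? join(" and ", @rights) : "not accessible");
}

print "Content-type: text/html\n\n";

# Name of the user the script is running as (UNIX only).
my $user = getpwuid($>);
$user = "uid $>" unless defined $user;
print "Running as: $user<BR>\n";

# Placeholder path; substitute the data file your script uses.
my $file = "/tmp/jobs.dat";
print describe_access($file), "<BR>\n";

# Report the system's reason when the open fails.
if (open(my $fh, '<', $file)) {
    print "Opened $file for reading.<BR>\n";
    close($fh);
} else {
    print "Cannot open $file: $!<BR>\n";
}
```

Run the same script at the command line and compare the two results; any difference in the user name or the access rights usually explains the failure.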