Welcome to the last afternoon! In this chapter, you'll learn a few tips that can help you get the most out of CGI programming and find out where to go for more information. You also will examine some of the exciting developments that await in the future of CGI and the World Wide Web.
Many Web browsers include support for tags that aren't part of the HTML specification. This has caused quite a "Tower of Babel" in the WWW, because your Web browser isn't guaranteed to display every page it encounters.
One browser that supports many non-standard tags is Netscape Navigator. Although it's currently the most popular browser, many page authors forget that there are non-Netscape users out there. Some even go as far as to exclude non-Netscape users entirely by suggesting that they download Netscape to read the page. More recently, Microsoft's Internet Explorer (MSIE), included with Windows 95, has added non-standard tags of its own, and they are not the same ones as Netscape's. MSIE is quickly gaining in popularity and may overtake Netscape.
This issue is further complicated by the emerging client-side Web languages: Java (supported on Netscape and MSIE), JavaScript (supported in Netscape and partially supported by MSIE), and ActiveX (supported only by MSIE).
Why use non-standard tags and languages? Well, the answer is simple: You can do all sorts of things to make your pages look better and include additional features. It seems a shame not to take advantage of these features, but how do you support all the users?
One answer is to include browser-specific versions of each page. Although it's a lot of work, many people consider this a worthwhile task. Their Web pages often include links such as "Click here for the non-Netscape version" or even "Select your browser from the list below."
You can take this one step further in a CGI program. The HTTP_USER_AGENT environment variable can let your CGI program know which browser is being used and change the output accordingly. Listing 14.1 shows a simple Perl program that displays different versions of a page, depending on the user agent.
Listing 14.1. A simple program to display different pages, depending on the browser.
#!/usr/bin/perl

MAIN: {
    print "Content-type: text/html\n\n";
    my $agent = $ENV{"HTTP_USER_AGENT"} || "";
    # index() returns -1 when the string is not found, so test for >= 0.
    # MSIE also reports "Mozilla", so it must be checked first.
    if (index($agent, "MSIE") >= 0) {
        # Internet Explorer Specific Page
        print_page("explorer/thispage.asp");
    }
    elsif (index($agent, "Mozilla") >= 0) {
        # Netscape Specific page
        print_page("netscape/thispage.asp");
    }
    else {
        # non-specific page for other browsers
        print_page("thispage.asp");
    }
    exit 0;
}

# Send the contents of the named file to the browser
sub print_page {
    my ($file) = @_;
    open(PAGE, $file) || die "Cannot open $file: $!";
    print while <PAGE>;
    close(PAGE);
}
Note |
Notice that the string "Mozilla" is used here rather than "Netscape" to detect Netscape; Navigator identifies itself as Mozilla in the user agent value. The word "Mozilla" can't be used on its own to differentiate between browsers, however, because some browsers (including MSIE) include it to indicate that they are compatible. For that reason, the program tests for "MSIE" first. |
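The same selection logic can also be sketched in Python, the alternative CGI language discussed later in this chapter. This is only a minimal sketch under stated assumptions: the function name choose_page is hypothetical, the page paths are the ones used in Listing 14.1, and MSIE is assumed to send a user agent value such as "Mozilla/2.0 (compatible; MSIE 3.0; Windows 95)", which is why the MSIE test comes first.

```python
import os


def choose_page(user_agent):
    """Return the path of the page version to send for a user agent value.

    choose_page is a hypothetical helper for illustration; the paths are
    the ones used in Listing 14.1.
    """
    # Assumption: MSIE reports itself as "Mozilla/x.x (compatible; MSIE ...)",
    # so it has to be tested before the plain "Mozilla" check.
    if "MSIE" in user_agent:
        return "explorer/thispage.asp"
    if "Mozilla" in user_agent:
        return "netscape/thispage.asp"
    # Anything else gets the non-specific page.
    return "thispage.asp"


if __name__ == "__main__":
    # A CGI program receives the browser's identification in the environment.
    agent = os.environ.get("HTTP_USER_AGENT", "")
    print("Content-type: text/html\n")
    print("<!-- would send: " + choose_page(agent) + " -->")
```

Keeping the decision in a small function that takes the user agent as an argument makes it easy to try the logic with sample values before installing the program on a server.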
You could even take this one step further and print an error message if anyone tries to access your page with a certain browser. If you're a big fan of Netscape, you could disallow access by non-Netscape users; if you dislike Netscape, you could disallow it.
Neither of these approaches is recommended, however. The WWW is intended to be platform-independent. Although you might use this trick to take advantage of browser-specific features, why exclude anyone?
When you're writing a Perl program, one thing you might not think of is how the program looks. This makes sense, most of the time; if it works, why change it?
There is a benefit to readable code, however. It's easier to be sure that the program does what you intended it to. Debugging is easier, because you can isolate individual statements. Finally, if you ever have to debug someone else's Perl program or make a change to it, you'll wish it were written in a readable style.
With that in mind, take a look at a particularly bad example of Perl style and see what you can do to improve it. Listing 14.2 is probably the shortest complete program for parsing and displaying name/value pairs you've ever seen.
Listing 14.2. A short (and confusing) program to display name/value pairs.
#!/usr/bin/perl
MAIN: { print <<EOF;
Content-type: text/html

<HTML><HEAD><TITLE>GET Variables</TITLE></HEAD>
<BODY><H1>GET Method Variable Display</H1><UL>
EOF
foreach (%rqpairs = split(/[&=]/, $ENV{"QUERY_STRING"})) {
tr/+/ /;
s/%(..)/pack("c",hex($1))/ge; }
while (($key,$value) = each %rqpairs) {
print "<LI>$key = $value\n"; }
print "</UL>End of variables.</BODY></HTML>";
exit 0; }
As you can see, this isn't the easiest program to read. No wonder Perl is known in some circles as a difficult language. You can follow several tips to keep your Perl programs from looking like Listing 14.2:
Listing 14.3 shows the modified program. You should find it much easier to read, and much easier to modify for your needs.
Listing 14.3. The same program as Listing 14.2, modified for clarity and readability.
#!/usr/bin/perl

MAIN: {
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD><TITLE>GET Variables</TITLE>";
    print "</HEAD><BODY>";
    print "<H1>GET Method Variables</H1>";
    print "The following variables were sent:";
    print "<UL>";
    # GET data is in the environment variable
    $request = $ENV{'QUERY_STRING'};
    # Split request into name/value pairs
    %rqpairs = split(/[&=]/, $request);
    # Convert URI syntax to ASCII
    foreach (%rqpairs) {
        # plus signs become spaces
        tr/+/ /;
        # %nn (hex code) becomes ASCII character
        s/%(..)/pack("c",hex($1))/ge;
    }
    # Display each value
    while (($key,$value) = each %rqpairs) {
        print "<LI>$key = $value\n";
    }
    print "</UL>";
    print "End of variables.";
    print "</BODY></HTML>";
    exit 0;
}
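For comparison, the decoding steps described in the comments (split the request into name/value pairs, turn plus signs into spaces, and convert %nn codes into characters) can be sketched in Python, the alternative language introduced later in this chapter. This is a sketch rather than a drop-in replacement: the function names parse_query and render_list are hypothetical, each field is decoded after it has been split into a name and a value, and the HTML escaping step is an addition that the Perl listing does not perform.

```python
import html
from urllib.parse import unquote_plus


def parse_query(query_string):
    """Split a GET query string into a list of decoded (name, value) pairs."""
    pairs = []
    for field in query_string.split("&"):
        if not field:
            continue  # skip empty fields, such as one left by a trailing "&"
        # Split on the first "=" only, so a value may itself contain "=".
        name, _, value = field.partition("=")
        # unquote_plus turns "+" into a space and %nn into the character.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


def render_list(pairs):
    """Format the pairs as an HTML list like the one in Listing 14.3."""
    # Escaping <, >, and & is an addition that is not in the Perl listing.
    items = ["<LI>%s = %s" % (html.escape(name), html.escape(value))
             for name, value in pairs]
    return "<UL>\n" + "\n".join(items) + "\n</UL>"


if __name__ == "__main__":
    # Sample request, as it would appear in QUERY_STRING.
    print(render_list(parse_query("name=John+Smith&city=New%20York")))
```

Separating the parsing from the output keeps each step small enough to check by hand, which is the same goal the comments in Listing 14.3 serve.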
You used some of the new features of Perl 5, the latest version, earlier in this guide. Here is a summary of some of the important new features available:
You've heard of Perl, but have you heard of Python? Python is a language that has some similarity to Perl. Like Perl, it's interpreted and has an easy syntax. Python does have some definite advantages and is designed for easy CGI programming. You might want to consider it as an alternative to Perl.
Python is a relatively new language. It was developed by Guido van Rossum in Amsterdam, the Netherlands, and is copyrighted by a company called Stichting Mathematisch Centrum.
Although Python is being considered by more and more users as an alternative to Perl for CGI programs, it is still young; Perl is still the most popular language by far. However, Python may have advantages for your programs if your server supports it.
Like Perl, Python is an interpreted language. In order to use it, you must have installed a copy of the Python interpreter. If you want to use Python for CGI programs, you usually will need the help of the System Administrator to add Python support to the server.
Python includes a wide variety of features. Among the most important is that it is an object-oriented language. Although the latest version of Perl (Perl 5) includes object-oriented functions, Python was built from the ground up as an object-oriented language, making it more efficient and more extensible.
Another feature is the extensive library of functions available for Python. These include functions that enable you to communicate over networks and access system-specific functions. Most important for CGI programmers, a CGI library is available that makes everything easy.
The CGI library includes the following functions:
Additional libraries are available for working with URIs and for communicating with HTTP, FTP, and Gopher servers. This makes Python an ideal choice when building WWW search engines; in fact, one popular search engine uses Python for all its programs.
Another feature of Python is that its language is much more readable than Perl in most cases. It doesn't include brackets, excessive parentheses, or punctuation-named variables (such as $_ , used in most Perl programs). Listing 14.4 shows an example of a Python function that inverts a dictionary (similar to an associative array). Keys are converted to values and vice versa.
Listing 14.4. A simple Python program.
def invert(table):
    index = {}
    for key in table.keys():
        value = table[key]
        # Start a new list the first time a value is seen
        if value not in index:
            index[value] = []
        index[value].append(key)
    return index
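To see what invert produces, the sketch below restates the function from Listing 14.4 for current Python versions and calls it on a small dictionary. The sample data and the use of setdefault are choices made for this illustration, not part of the original listing.

```python
def invert(table):
    """Map each value of table to the list of keys that had that value."""
    index = {}
    for key, value in table.items():
        # setdefault creates the empty list the first time a value is seen.
        index.setdefault(value, []).append(key)
    return index


# Hypothetical sample data: fruit name -> color.
colors = {"apple": "red", "cherry": "red", "banana": "yellow"}
by_color = invert(colors)
print(by_color)
```

Because two keys can share one value, each value in the inverted dictionary maps to a list of keys rather than to a single key.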
You'll notice several things about the language. First of all, notice the lack of braces, begin statements, and end statements. This is because Python uses indentation to define the start and end of blocks of code, including functions.
Look at the following statements, for example:
if value == 5:
    print(value)
    return value
print("value is not 5")
The only thing telling Python which statements should be executed if the condition is true and which should be executed otherwise is the indentation. The print and return statements are considered part of a block after the if statement, because they are indented below it.
The advantage of this indentation-based syntax is that the code
is very clean and readable. The semicolon (;),
used to end each and every statement in C and Perl, is not necessary
in Python.
Warning |
There is a major disadvantage to this feature: You might end up spending hours testing a section of code, only to discover that the indentation is wrong on one of the lines. If you've ever programmed in COBOL, an ancient language used for business applications, you'll remember just how troublesome indentation problems can be. |
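The effect of indentation on program flow can be seen in a short sketch. The two functions below are hypothetical examples written for this comparison; they differ only in the indentation of one line, yet they behave differently.

```python
def report_indented(value):
    """The second message is part of the if block."""
    messages = []
    if value == 5:
        messages.append("value is 5")
        messages.append("check complete")  # runs only when value is 5
    return messages


def report_dedented(value):
    """The second message follows the if block."""
    messages = []
    if value == 5:
        messages.append("value is 5")
    messages.append("check complete")  # runs for every value
    return messages


print(report_indented(3))
print(report_dedented(3))
```

With any value other than 5, the first function returns an empty list, while the second still reports that the check is complete. This is exactly the kind of difference that is hard to spot when a line is indented by mistake.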
Python is available for several platforms, including UNIX, Windows NT, Macintosh, and DOS. In order to use Python, you'll need to install and compile the Python interpreter.
In order to use Python for CGI programs, you must install it at the system level on the UNIX (or other) machine that acts as an HTTP server. If you have your own server, you can set this up; otherwise, contact your System Administrator and ask him to install Python.
The normal HTML of the WWW is static. You access one page, then click on a link, and another page appears. With CGI programming, things get a bit more exciting: pages can be generated dynamically, include updated data, and interact with user-entered data. Nevertheless, it still appears as a page of text.
Java, a new language developed by Sun Microsystems, Mountain View, CA, takes the concept one step further. Imagine updated stock information appearing "live" on your browser window. Imagine accessing a page containing animated icons instead of static ones. All of this, and much more, is possible with Java.
If you're frustrated with the limitations of CGI programming,
Java might be the answer to your problems. Because data no longer
is restricted to a page-by-page display, you can do almost anything.
Java is explored further in the next sections. You'll also look
at JavaScript, a simple scripting language based on Java.
Note |
Although the Java language is simple, its features easily could fill a guide this size. This section is intended to give you a basic familiarity with the concepts behind Java and to give you an awareness of the impact it will have on the WWW and on you as a CGI programmer. To learn about the Java language, read Teach Yourself Java in 21 Days or Presenting Java, both published by Sams.net. |
Java isn't really a replacement for CGI programming; it's a completely different concept. Instead of executing on the HTTP server, a Java application actually is downloaded and executed by the Web browser.
Java can be used for two types of programs: applets, which are embedded in Web pages with the <APPLET> tag, and full-scale applications, which can be used with the Java interpreter.
When you access a Web page that includes a Java applet, the entire application is downloaded to your browser. The browser then executes the code. In order to do this, the browser must include a Java interpreter.
Because you download and execute an entire program, the stateless programming model that you've dealt with in CGI programming doesn't apply to Java. Your program can ask for input from the user, accept it, calculate other data, display it, and ask for more input, all without communicating with the HTTP server.
A simple Java application, for example, might enable you to fill out an order form. The browser would download the Java applet, and then you would fill in the fields to specify your order. You then could click a Total button and receive a total for the order; this would be done by the Java applet and would require no communication with the server. When you are finished, the final order would be transmitted to the server.
As you probably know, there are two types of computer languages: compiled languages and interpreted languages.
Java actually fits into both categories. Before you can use a Java applet, you must compile it using the Java compiler. However, the applet isn't compiled into machine language, at least not for any particular machine. It's compiled into a virtual machine code; effectively, it's machine language for an imaginary, simple machine.
The Java interpreter and the interpreter built into a Web browser act as a virtual machine to run the Java code. This means that the language is fast, like a compiled language, but also is platform-independent.
In order for a Java applet to work on any particular machine,
the interpreter (or virtual machine) just has to be written for
that platform. Best of all, the same compiled applet can be run
on any type of system without recompiling it, which is essential
for the Internet.
Note |
The latest versions of Netscape and MSIE include just-in-time compilers for Java; these translate the Java bytecodes into native machine language for faster execution. |
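The idea of code compiled for an imaginary machine can be illustrated with a toy interpreter, sketched here in Python. This is an illustration only: the function run and the opcodes push, add, and mul are invented for this example and do not correspond to the real Java virtual machine code.

```python
def run(program):
    """Execute a list of (opcode, argument) instructions on a stack machine.

    The opcodes are invented for this illustration; they are not Java
    bytecodes.
    """
    stack = []
    for opcode, argument in program:
        if opcode == "push":
            stack.append(argument)
        elif opcode == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "mul":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
    return stack


# "Compiled" form of (2 + 3) * 4 for the imaginary machine.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
print(run(program))
```

The list of instructions plays the role of the compiled applet: it contains nothing specific to one processor, so any system with an implementation of run can execute it unchanged.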
The Java language includes many commands for a variety of purposes. I won't go into the details of the commands here, but I will explain briefly what a Java applet looks like. See "Finding Useful Internet Sites for CGI Programmers," later in this chapter, for sources of additional information about the language.
Java is an object-oriented language; it treats all elements of the program as objects. An object can be a variable, a subroutine, or your application itself. The idea behind object-oriented languages is that an object can include both data and code; a "number" object, for example, would include the value of the number and the code needed to display it.
Listing 14.5 shows an example of a short Java applet. This program simply displays the text Hello World in large text on the browser's screen.
Listing 14.5. A simple Java applet.
import java.applet.Applet;
import java.awt.Graphics;

public class HelloWorld extends Applet {
    // Called once when the applet is loaded
    public void init() {
        resize(150, 25);
    }
    // Called whenever the applet needs to be drawn
    public void paint(Graphics g) {
        g.drawString("Hello world!", 50, 25);
    }
}
As you can see, the language isn't too hard to understand, and it uses a syntax similar to Perl (actually based on C++) to delimit subroutines and sections of code.
Although it sounds like Java is a ready-made language for the Internet, it wasn't designed for that purpose. Originally, it was intended for use in embedded systems (home appliances, stereos, toasters, traffic lights, and so on). Sun has modified Java to be easy to use with the WWW, however, and it turns out that it works very well.
In this section, you'll see what you need to get Java up and running on your system, whether you want to create your own custom Java applets or simply view applets created by others.
As mentioned in the previous section, you need a browser that supports Java in order to view and execute Java applets on the WWW. Because Java is widely regarded as the "next big thing" on the Internet, you'll no doubt see many browsers supporting it soon. Right now, three browsers support Java:
In order to develop a Java application, you'll need the Java compiler.
The compiler is available from Sun's Web site, listed earlier.
The compiler currently runs only on UNIX, Windows 95,
and Windows NT systems. The compiler is available as part of
the Java Development Kit (JDK).
After you create your source code, you use the compiler to generate the virtual machine code, called a class. You then can include the application on your Web page. To do this, you use the new <APPLET> tag. Listing 14.6 shows an example of a short WWW page with a Java applet.
Listing 14.6. Embedding a Java applet in a Web page.
<HTML>
<HEAD>
<TITLE> Java Applet Sample </TITLE>
</HEAD>
<BODY>
The program output will appear below.
<HR>
<APPLET CODE="HelloWorld.class" WIDTH=150 HEIGHT=25> </APPLET>
</BODY>
</HTML>
JavaScript (originally called LiveScript) is a scripting language developed by Netscape Communications Corporation and supported in the latest Netscape browsers. It now has been endorsed by Sun and many other companies as an ideal scripting language for the Internet. JavaScript also is supported in MSIE 3.0.
JavaScript is based on Java's syntax, but it is a different language with different uses. The basic differences follow:
The simplest use for JavaScript is to add validation to HTML forms. For example, a script could check the number you enter in a field, and if it is outside the valid range, warn you immediately via a dialog box.
To use JavaScript, you embed it in the HTML of a page using the <SCRIPT> tag. Listing 14.7 shows a simple page that includes a very short script.
Listing 14.7. HTML with an embedded JavaScript program.
<HTML>
<HEAD><TITLE>Simple JavaScript Output</TITLE>
</HEAD>
<BODY>
<SCRIPT LANGUAGE="JavaScript">
document.write("This is the output of the script.<BR>")
</SCRIPT>
Here's the body of the WWW page.
</BODY>
</HTML>
This page displays the script's output, a simple text string, before the actual body of the page.
Many more complicated and exciting things can be done with JavaScript. It can be used to act on forms as a substitute for CGI in some cases. The latest version of JavaScript, implemented in Netscape 3.0, adds features for working with images, multimedia, and plug-ins; in addition, it can communicate with CGI programs and Java applets for additional capabilities.
To learn more about JavaScript, see Netscape's JavaScript Authoring
Guide on the Web.
Tip |
You also can learn more about JavaScript by reading Teach Yourself JavaScript or Laura Lemay's Web Workshop: JavaScript, both published by Sams.net. |
Needless to say, this guide can't go into every detail about Perl,
CGI, and the other products, such as Java and JavaScript, mentioned
here. Thanks to the WWW, however, that information is easily available.
The following sections include URIs for some of the sites that
can be useful for CGI programmers. I'll also list some useful
Usenet newsgroups where you can ask questions about these subjects.
Tip |
Although these sites were accurate at the time of this writing, the WWW changes every day, and some sites may no longer be available or may have different addresses. If one of them is no longer accurate, try a Web search engine, such as AltaVista. |
First take a look at a few sites that include helpful information about CGI programming. These range from tutorials to detailed technical specifications.
This is an excellent resource with information about all areas of CGI programming and links to many other useful sites. This is also the headquarters for the CGI FAQs (frequently asked questions), a useful compilation of questions and answers.
This is a huge collection of links covering all aspects of WWW development, HTML, and CGI programming. Additional issues, such as security, and new languages, such as Java, also are represented.
This is the most frequently cited reference for CGI programming. It is hosted by the National Center for Supercomputing Applications, developers of the original NCSA Mosaic and the NCSA HTTP server, which is used on a large number of Web servers. This is a tutorial explanation of the CGI standard, forms, and other features.
This includes a large collection of useful information about the WWW and links, as well as a complete section on CGI programming.
Here are a few sites you might find useful for information about the Perl language itself. Although they are not written specifically for CGI programming, they should help you understand the language and answer any questions you have about syntax.
This site contains a great deal of information about the Perl language, its uses, and resources for learning more about it. You can reach it at http://www.perl.com/
This is the official manual for Perl 4, converted to an online, searchable form. It isn't the ideal user interface, but it does include all the important information. It is the best resource for checking the syntax of commands. You can access the Perl Reference Manual at http://www-cgi.cs.cmu.edu/cgi-bin/perl-man
For an online version of the Perl 5 manual pages, see this site: http://www.perl.com/perl/manual/frames.asp
This is the place to go for the latest information about Perl 5. It includes links to the full documentation, along with an easy-to-read, hyperlinked list of new features in Perl 5. This page is updated as new features are added to the language. You can access the Perl 5 WWW Page at http://www.metronet.com/perlinfo/perl5.asp
This includes a list of references for Perl, with hyperlinks to many useful sites and an emphasis on learning Perl. You can reach Learning Perl at http://www.teleport.com/~rootbeer/perl.asp
The misc Perl newsgroup is the best place to ask questions about Perl. Many experts are willing to answer your questions, and you often can find someone else already asking the same question. You can reach the Perl newsgroups at comp.lang.perl.misc
and comp.lang.perl.announce
The following sites can help you learn about the various products and languages mentioned in this chapter.
The official site for Python is the Python Language home page at http://www.python.org/
You can find just about any type of information you need there. You also might want to try the following sites:
You learned about Java and JavaScript earlier in this chapter. Here are some sites you can access to find more information:
Although Netscape has no official relation to CGI programming, it is the most popular WWW browser, and the features that Netscape chooses to include are currently a driving force on the Internet. Netscape already has introduced support for Java and JavaScript. To keep track of developments in the WWW from Netscape's point of view, use these sites:
Microsoft Internet Explorer, particularly version 3.0, is serious competition for Netscape Navigator, with support for Java, JavaScript, and new technologies such as ActiveX. To learn more about MSIE or to download a copy, see Microsoft's Internet Explorer Web site: http://www.microsoft.com/ie/
This chapter examined some tips you can use to write good CGI programs. In addition, you learned about several developments that may affect the future of CGI:
You took a look at the new features of the Perl 5 language, which introduces new capabilities to simplify programming.
Finally, you learned about a number of Internet sites (WWW pages and Usenet newsgroups) that will help you keep track of current developments in the CGI field and learn more about the other topics introduced here.
The most important thing to remember with CGI programming and other Internet tasks is to keep learning. Things change often, and if you don't follow new developments, you'll be left behind. Good luck in your CGI programming!
With the popularity of Netscape Navigator, is there any reason to bother supporting other browsers?
Yes. At this writing, MSIE is gaining popularity quickly and eventually may overtake Netscape. Another issue is that many users are using older versions of Navigator, so it's a good idea to design for more than one browser.
I use different conventions for simplifying Perl code. Is there anything wrong with using other methods?
Not at all. Perl is an extremely flexible language, and you can write code in many different styles. Anything that works for you and your company is fine, but it's always a good idea to choose a style and use it consistently.
You mentioned Python as an alternative to Perl for CGI programs. Are there any other alternative languages?
Yes. Many CGI programmers prefer to use C or C++, which can generate faster code than Perl or Python. UNIX shell languages, such as sh and csh, also are common and easy to use for simple scripts. Any language can be used, as long as the server supports it.