3 The Guestbook CGI Application

This chapter presents a simple Tcl program that computes a Web page. The chapter provides a brief background to HTML and the CGI interface to Web servers.
This chapter presents a complete but simple guestbook program that computes an HTML document, or Web page, based on the contents of a simple database. The basic idea is that a user with a Web browser visits a page that is computed by the program. The details of how the page gets from your program to the user with the Web browser vary from system to system. You can use these scripts on your own Web server, but you will need help from your Webmaster to set things up. The goal of the examples in the chapter is to demonstrate a few nontrivial programming techniques with a real example.

The chapter provides a very brief introduction to HTML and CGI programming. HTML is a way to specify text formatting, including hypertext links to other pages on the World Wide Web. CGI is a standard for communication between a Web server that delivers documents and a program that computes documents for the server. There are many books on these subjects alone. CGI Developers Resource, Web Programming with Tcl and Perl (Prentice Hall, 1997) by John Ivler is a good reference for details that are left unexplained here.

A guestbook is a place for visitors to sign their name and perhaps provide other information. We will build a guestbook that takes advantage of the World Wide Web. Our guests can leave their address as a Uniform Resource Locator (URL). The guestbook will be presented as a page that has hypertext links to all these URLs so that other guests can visit them. The program works by keeping a simple database of the guests, and it generates the guestbook page from the database.

The Tcl scripts described in this chapter use commands and techniques that are described in more detail in later chapters. The goal of the examples is to demonstrate the power of Tcl without explaining every detail. If the examples in this chapter raise questions, you can follow the references to examples in other chapters that do go into more depth.

A Quick Introduction to HTML

Web pages are written in a text markup language called HTML (HyperText Markup Language). The idea of HTML is that you annotate, or mark up, regular text with special tags that indicate structure and formatting. For example, the title of a Web page is defined like this:

<TITLE>My Home Page</TITLE>
The tags provide general formatting guidelines, but the browsers that display HTML pages have freedom in how they display things. This keeps the markup simple. The general syntax for HTML tags is:

<tag parameters>normal text</tag>
As shown here, the tags usually come in pairs. The open tag may have some parameters, and the close tag name begins with a slash. Tag names are not case sensitive, so <title>, <Title>, and <TITLE> are all valid and mean the same thing. The corresponding close tag could be </title>, </Title>, </TITLE>, or even </TiTlE>.

The <A> tag defines hypertext links that reference other pages on the Web. The hypertext links connect pages into a Web so you can move from page to page to page and find related information. It is the flexibility of the links that makes the Web so interesting. The <A> tag takes an HREF parameter that defines the destination of the link. If you wanted to link to the Sun home page, you would put this in your page:

<A HREF="http://www.sun.com/">Sun Microsystems</A>
When this construct appears in a Web page, your browser typically displays "Sun Microsystems" in blue underlined text. When you click on that text, your browser switches to the page at the address "http://www.sun.com/". There is a lot more to HTML, of course, but this should give you a basic idea of what is going on in the examples. The following list summarizes the HTML tags that will be used in the examples:

HTML tags used in the examples.

HTML       Main tag that surrounds the whole document.
HEAD       Delimits head section of the HTML document.
TITLE      Defines the title of the page.
BODY       Delimits the body section. Lets you specify page colors.
H1 - H6    HTML defines 6 heading levels: H1, H2, H3, H4, H5, H6.
P          Start a new paragraph.
B          Bold text.
I          Italic text.
A          Used for hypertext links.
IMG        Specify an image.
DL         Definition list.
DT         Term clause in a definition list.
DD         Definition clause in a definition list.
FORM       Defines a data entry form.
INPUT      A one-line entry field, checkbox, radio button, or submit button.
TEXTAREA   A multiline text field.

CGI for Dynamic Pages

There are two classes of pages on the Web, static and dynamic. A static page is written and stored on a Web server, and the same thing is returned each time a user views the page. This is the easy way to think about Web pages. You have some information to share, so you compose a page and tinker with the HTML tags to get the information to look good. If you have a home page, it is probably in this class.

In contrast, a dynamic page is computed each time it is viewed. This is how pages that give up-to-the-minute stock prices work, for example. A dynamic page does not mean it includes animations; it just means that a program computes the page contents when a user visits the page. The advantage of this approach is that a user might see something different each time they visit the page. As we shall see, it is also easier to maintain information in a database of some sort and generate the HTML formatting for the data with a program.

A CGI (Common Gateway Interface) program is used to compute Web pages. The CGI standard defines how inputs are passed to the program and a way to identify different types of results, such as images, plain text, or HTML markup. A CGI program simply writes the contents of the document to its standard output, and the Web server takes care of delivering the document to the user's Web browser. The following is a very simple CGI script:

Example 3-1 A simple CGI script.
puts "Content-Type: text/html"
puts ""
puts "<TITLE>The Current Time</TITLE>"
puts "The time is <B>[clock format [clock seconds]]</B>"
The program computes a simple HTML page that has the current time. Each time a user visits the page they will see the current time on the server. The server that has the CGI program and the user viewing the page might be on different sides of the planet. The output of the program starts with a Content-Type line that tells your Web browser what kind of data comes next. This is followed by a blank line and then the contents of the page.

The clock command is used twice: once to get the current time in seconds, and a second time to format the time into a nice looking string. The clock command is described in detail on page 145. Fortunately there is no conflict between the markup syntax used by HTML and the Tcl syntax for embedded commands, so we can mix the two in the argument to the puts command. Double quotes are used to group the argument to puts so that the clock commands will be executed. When run, the output of the program will look like this:

Example 3-2 Output of Example 3-1.
Content-Type: text/html

<TITLE>The Current Time</TITLE>
The time is <B>Wed Oct 16 11:23:43  1996</B>
This example is a bit sloppy in its use of HTML, but it should display properly in most Web browsers. The next example includes all the required tags for a proper HTML document.

The guestbook.cgi Script

The guestbook.cgi script computes a page that lists all the registered guests. The example is shown first, and then each part of it is discussed in more detail later. One thing to note right away is that the HTML tags are generated by procedures that hide the details of the HTML syntax. The first lines of the script use the UNIX trick to have tclsh interpret the script. This trick is described on page 24:

Example 3-3 The guestbook.cgi script.
#!/bin/sh
# guestbook.cgi
# \
exec tclsh "$0" ${1+"$@"}

# Implement a simple guestbook page.
# The set of visitors is kept in a simple database.
# The newguest.cgi script will update the database.
#
source /usr/local/lib/cgilib.tcl

Cgi_Header "Brent's Guestbook" {BGCOLOR=white TEXT=black}
P
set datafile [file join \
	[file dirname [info script]] guestbook.data]
if {![file exists $datafile]} {
	puts "No registered guests, yet."
	P 
	puts "Be the first [Link {registered guest!} newguest.html]"
} else {
	puts "The following folks have registered in my GuestBook."
	P 
	puts [Link Register newguest.html]
	H2 Guests
	catch {source $datafile}
	foreach name [lsort [array names Guestbook]] {
		set item $Guestbook($name)
		set homepage [lindex $item 0]
		set markup [lindex $item 1]
		H3 [Link $name $homepage]
		puts $markup
	}
}
Cgi_End

Beginning the HTML Page

The script uses a number of Tcl procedures that make working with HTML and the CGI interface easier. These procedures are kept in the cgilib.tcl file. You must update the complete pathname of this file to match your system. The script starts by sourcing the cgilib.tcl file and generating the standard information that comes at the beginning of an HTML page:

source /usr/local/lib/cgilib.tcl
Cgi_Header "Brent's Guestbook" {BGCOLOR=white TEXT=black}
The Cgi_Header procedure takes as arguments the title for the page and some optional parameters for the HTML <Body> tag that set the page background and text color. Here we specify black text on a white background to avoid the standard grey background of most browsers. An empty default value is specified for the bodyparams so you do not have to pass those to Cgi_Header. Default values for procedure parameters are described on page 75.

Example 3-4 The Cgi_Header procedure.
proc Cgi_Header {title {bodyparams {}}} {
    puts stdout \
"Content-Type: text/html

<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY $bodyparams>
<H1>$title</H1>"
}
The Cgi_Header procedure just contains a single puts command that generates the standard boilerplate that appears at the beginning of the output. Note that several lines are grouped together with double quotes. Double quotes are used so that the variable references mixed into the HTML are substituted properly.

The output begins with the CGI content-type information, a blank line, and then the HTML. The HTML is divided into a head and body part. The <TITLE> tag goes in the head section of an HTML document. Finally, browsers display the title in a different place than the rest of the page, so I always want to repeat the title as a level-one heading (i.e., H1) in the body of the page.

Simple Tags and Hypertext Links

The next thing the program does is see if there are any registered guests or not. The file command, which is described in detail on page 94, is used to see if there is any data:

if {![file exists $datafile]} {

If the database file does not exist, a different page is displayed to encourage a registration. The page includes a hypertext link to a registration page. The newguest.html page will be described in more detail later:

puts "No registered guests, yet."
P
puts "Be the first [Link {registered guest!} newguest.html]"

The P command generates the HTML for a paragraph break. This trivial procedure saves us a few keystrokes:

proc P {} {
	puts <P>
}

The Link command formats and returns the HTML for a hypertext link. Instead of printing the HTML directly, it is returned so you can include it in-line with other text you are printing:

Example 3-5 The Link command formats a hypertext link.
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}
The output of the program would be this if there were no data:

Example 3-6 Initial output of guestbook.cgi.
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
No registered guests, yet.
<P>
Be the first <A HREF="newguest.html">registered guest!</A>
</BODY>
</HTML>
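
Because Link returns its result instead of printing it, a call to Link can be nested in square brackets inside any quoted string, which is how the "Be the first" line above is produced. The following is a minimal sketch of that idea. It repeats the Link definition so that it runs on its own, and the variable name line is introduced only for illustration:

```tcl
# Link as defined in this chapter: it returns the markup and prints nothing.
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}

# The bracketed call is replaced by Link's return value before puts runs.
set line "Visit [Link {Sun Microsystems} http://www.sun.com/] today."
puts $line
```

If Link called puts itself, the link would be printed before the surrounding sentence rather than inside it.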
If the database file exists, then the real work begins. We first generate a link to the registration page, and a level-two header to separate that from the guest list:

puts [Link Register newguest.html]
H2 Guests
The H2 procedure handles the detail of including the matching close tag:

proc H2 {string} {
	puts "<H2>$string</H2>"
}

Using a Tcl Array for the Database

The datafile contains Tcl commands that define an array that holds the guestbook data. If this file is kept in the same directory as the guestbook.cgi script, then you can compute its name. The info script command returns the file name of the script. The file dirname and file join commands manipulate file names in a platform-independent way. They are described on page 94:

set datafile [file join \
	[file dirname [info script]] guestbook.data]
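
To see what these commands compute, the following sketch applies them to a fixed path. The path /usr/local/cgi-bin/guestbook.cgi is a made-up location that stands in for the result of info script, and the variable names script and dir are used only for this illustration:

```tcl
# A hypothetical script location, used here in place of [info script].
set script /usr/local/cgi-bin/guestbook.cgi

# file dirname strips the last component of the path.
set dir [file dirname $script]

# file join adds a component using the platform's path conventions.
set datafile [file join $dir guestbook.data]

puts $dir
puts $datafile
```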
By using Tcl commands to represent the data, we can load the data with the source command. The catch command is used to protect the script from a bad data file, which will show up as an error from the source command. Catching errors is described in detail on page 73:

catch {source $datafile}
The Guestbook variable is the array defined in guestbook.data. Array variables are the topic of Chapter 8. Each element of the array is defined with a Tcl command that looks like this:

set {Guestbook(Brent Welch)} {
	http://www.beedub.com/
	{<img src=http://www.beedub.com/welch.gif>}
}
The person's name is the array index, or key. The value of the array element is a Tcl list with two elements: their URL and some additional HTML markup that they can include in the guestbook. Tcl lists are the topic of Chapter 5. The spaces in the name result in some awkward syntax that is explained on page 84. Do not worry about this now. We will see on page 40 that all the braces in the previous statement are generated automatically. The main point is that the person's name is the key, and the value is a list with two elements.
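
As a quick check of this structure, the following sketch runs the same set command directly, instead of loading it from guestbook.data, and then inspects the result. The variable names names, name, and item are chosen for this illustration:

```tcl
# The same kind of command that guestbook.data contains.
set {Guestbook(Brent Welch)} {
    http://www.beedub.com/
    {<img src=http://www.beedub.com/welch.gif>}
}

# The array now has one key, which is the guest's name.
set names [array names Guestbook]
set name [lindex $names 0]

# The value is a list with two elements: the URL and the markup.
set item $Guestbook($name)
puts $name
puts [llength $item]
puts [lindex $item 0]
puts [lindex $item 1]
```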

The array names command returns all the indices, or keys, in the array, and the lsort command sorts these alphabetically. The foreach command loops over the sorted list, setting the loop variable name to each key in turn:

foreach name [lsort [array names Guestbook]] {
Given the key, we get the value like this:

set item $Guestbook($name)
The two list elements are extracted with lindex, which is described on page 57.

set homepage [lindex $item 0]
set markup [lindex $item 1]
We generate the HTML for the guestbook entry as a level-three header that contains a hypertext link to the guest's home page. We follow the link with any HTML markup text that the guest has supplied to embellish their entry. The H3 procedure is similar to the H2 procedure already shown, except it generates <H3> tags:

H3 [Link $name $homepage]
puts $markup
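
The H3 procedure itself is not listed in this chapter. The following is a sketch of what it might look like, written by analogy with H2; the definition in cgilib.tcl may differ. Link is repeated here so the sketch runs on its own:

```tcl
# Hypothetical H3, modeled on the H2 procedure shown earlier.
# The actual definition in cgilib.tcl is not shown and may differ.
proc H3 {string} {
    puts "<H3>$string</H3>"
}

# Link as defined earlier in this chapter.
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}

# Link runs first and returns the markup, which H3 then wraps and prints.
H3 [Link {Brent Welch} http://www.beedub.com/]
```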

Sample Output

The last thing the script does is call Cgi_End to output the proper closing tags. An example of the output of the guestbook.cgi script is shown in Example 3-7:

Example 3-7 Output of guestbook.cgi.
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
The following folks have registered in my GuestBook.
<P>
<A HREF="newguest.html">Register</A>
<H2>Guests</H2>
<H3><A HREF="http://www.beedub.com/">Brent Welch</A></H3>
<img src=http://www.beedub.com/welch.gif>
</BODY>
</HTML>
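
The Cgi_End procedure is also part of cgilib.tcl and is not listed in this chapter. Judging only from the last two lines of the output above, a minimal version could look like the following sketch; treat it as an assumption, not as the library's definition:

```tcl
# Hypothetical Cgi_End: closes the BODY and HTML tags opened by Cgi_Header.
# The actual definition in cgilib.tcl is not shown and may do more.
proc Cgi_End {} {
    puts "</BODY>\n</HTML>"
}

Cgi_End
```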

Defining Forms and Processing Form Data

The guestbook.cgi script only generates output. The other half of CGI deals with input from the user. Input is more complex for two reasons. First, we have to define another HTML page that has a form for the user to fill out. Second, the data from the form is organized and encoded in a standard form that must be decoded by the script. Example 3-8 on page 38 defines a very simple form, and the procedure that decodes the form data is described in detail in Example 11-4 on page 129.

The guestbook page contains a link to newguest.html. This page contains a form that lets a user register their name, home page URL, and some additional HTML markup. The form has a submit button. When a user clicks that button in their browser, the information from the form is passed to the newguest.cgi script. This script updates the database and computes another page for the user that acknowledges their contribution.
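
To give a feel for the encoding, the following sketch decodes a form submission by hand. It is not the Cgi_Parse procedure from cgilib.tcl, which uses regular expressions and is covered in Chapter 11. The procedure names DemoUrlDecode and DemoParseQuery and the sample query string are made up for this illustration. In the encoded data, fields are separated by &, names and values are separated by =, spaces are sent as +, and other special characters are sent as % followed by two hexadecimal digits:

```tcl
# Hypothetical helper: undo the + and %XX encoding of one name or value.
proc DemoUrlDecode {str} {
    set str [string map {+ " "} $str]
    set out ""
    set i 0
    set n [string length $str]
    while {$i < $n} {
        set c [string index $str $i]
        set hex [string range $str [expr {$i + 1}] [expr {$i + 2}]]
        if {$c eq "%" && [regexp {^[0-9a-fA-F][0-9a-fA-F]$} $hex]} {
            # Convert the two hex digits to the character they encode.
            scan $hex %x code
            append out [format %c $code]
            incr i 3
        } else {
            append out $c
            incr i
        }
    }
    return $out
}

# Hypothetical helper: split a query string into a name-value list.
proc DemoParseQuery {query} {
    set result {}
    foreach pair [split $query &] {
        set sep [string first = $pair]
        if {$sep < 0} {
            continue
        }
        set key [string range $pair 0 [expr {$sep - 1}]]
        set value [string range $pair [expr {$sep + 1}] end]
        lappend result [DemoUrlDecode $key] [DemoUrlDecode $value]
    }
    return $result
}

# A made-up submission with a name and a URL.
array set form [DemoParseQuery \
    "name=Brent+Welch&url=http%3A%2F%2Fwww.beedub.com%2F"]
puts $form(name)
puts $form(url)
```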

The newguest.html Form

An HTML form is defined with tags that define data entry fields, buttons, checkboxes, and other elements that let the user specify values. For example, a one-line entry field that is used to enter the home page URL is defined like this:

<INPUT TYPE=text NAME=url>
The INPUT tag is used to define several kinds of input elements, and its TYPE parameter indicates what kind. In this case, TYPE=text creates a one-line text entry field. The submit button is defined with an INPUT tag that has TYPE=submit, and the VALUE parameter becomes the text that appears on the button:

<INPUT TYPE=submit NAME=submit VALUE=Register>
A general type-in window is defined with the TEXTAREA tag. This creates a multiline, scrolling text field that is useful for specifying lots of information, such as a free-form comment. In our case we will let guests type in HTML that will appear with their guestbook entry. The text between the open and close TEXTAREA tags is inserted into the type-in window when the page is first displayed.

<TEXTAREA NAME=markup ROWS=10 COLS=50>Hello.</TEXTAREA>
A common parameter to the form tags is NAME=something. This name identifies the data that will come back from the form. The tags also have parameters that affect their display, such as the label on the submit button and the size of the text area. Those details are not important for our example. The complete form is shown in Example 3-8:

Example 3-8 The newguest.html form.
<!Doctype HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>Register in my Guestbook</TITLE>
<!-- Author: bwelch -->
<META HTTP-Equiv=Editor Content="SunLabs WebTk 1.0beta 10/11/96">
</HEAD>
<BODY>

<FORM ACTION="newguest.cgi" METHOD="POST">

<H1>Register in my Guestbook</H1>
<UL>
<LI>Name <INPUT TYPE="text" NAME="name" SIZE="40">
<LI>URL  <INPUT TYPE="text" NAME="url" SIZE="40">
<P>
If you don't have a home page, you can use an email URL like 
"mailto:welch@acm.org"
<LI>Additional HTML to include after your link:
<BR>

<TEXTAREA NAME="html" COLS="60" ROWS="15">
</TEXTAREA>
<LI><INPUT TYPE="submit" NAME="new" VALUE="Add me to your guestbook">
<LI><INPUT TYPE="submit" NAME="update" VALUE="Update my guestbook entry">
</UL>
</FORM>

</BODY>
</HTML>

The newguest.cgi Script

When the user clicks the Submit button in their browser, the data from the form is passed to the program identified by the ACTION parameter of the FORM tag. That program takes the data, does something useful with it, and then returns a new page for the browser to display. In our case the FORM tag names newguest.cgi as the program to handle the data:

<FORM ACTION=newguest.cgi METHOD=POST>
The CGI specification defines how the data from the form is passed to the program. The data is encoded and organized so that the program can figure out the values the user specified for each form element. The encoding is handled rather nicely with some regular expression tricks that are done in Cgi_Parse. Cgi_Parse saves the form data, and you use Cgi_Value to get a form value in your script. These procedures are described in Example 11-4 on page 129. Example 3-9 starts out by calling Cgi_Parse:

Example 3-9 The newguest.cgi script.
#!/bin/sh
# \
exec tclsh "$0" ${1+"$@"}
# source cgilib.tcl from the same directory as newguest.cgi
source [file join \
	[file dirname [info script]] cgilib.tcl]

set datafile [file join \
	[file dirname [info script]] guestbook.data]

Cgi_Parse

# Open the datafile in append mode

if [catch {open $datafile a} out] {
	Cgi_Header "Guestbook Registration Error" \
		{BGCOLOR=black TEXT=red}
	P
	puts "Cannot open the data file"
	P
	puts $out	;# the error message
	exit 0
}

# Append a Tcl set command that defines the guest's entry

puts $out ""
puts $out [list set Guestbook([Cgi_Value name]) \
	[list [Cgi_Value url] [Cgi_Value html]]]
close $out

# Return a page to the browser

Cgi_Header "Guestbook Registration Confirmed" \
	{BGCOLOR=white TEXT=black}

puts "
<DL>
<DT>Name
<DD>[Cgi_Value name]
<DT>URL
<DD>[Link [Cgi_Value url] [Cgi_Value url]]
</DL>
[Cgi_Value html]
"

Cgi_End
The main idea of the newguest.cgi script is that it saves the data to a file as a Tcl command that defines an element of the Guestbook array. This lets the guestbook.cgi script simply load the data by using the Tcl source command. This trick of storing data as a Tcl script saves us from the chore of defining a new file format and writing code to parse it. Instead, we can rely on the well-tuned Tcl implementation to do the hard work for us efficiently.

The script opens the datafile in append mode so it can add a new record to the end. Opening files is described in detail on page 101. The script uses a catch command to guard against errors. If an error occurs, a page explaining the error is returned to the user. Working with files is one of the most common sources of errors (permission denied, disk full, file-not-found, and so on), so I always open the file inside a catch statement:

if [catch {open $datafile a} out] {
	# an error occurred
} else {
	# open was ok
}
In this command, the variable out gets the result of the open command, which is either a file descriptor or an error message. This style of using catch is described in detail in Example 6-14 on page 71.
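
The following sketch shows both outcomes in one place. The path /no/such/directory/guestbook.data is made up so that the open is expected to fail, and the variable status is introduced only for this illustration:

```tcl
# A path that should not exist; the directory name is made up for the demo.
set datafile /no/such/directory/guestbook.data

if {[catch {open $datafile a} out]} {
    # catch returned nonzero: out holds the error message from open.
    set status "error: $out"
} else {
    # catch returned zero: out holds the file descriptor, so close it.
    set status "ok"
    close $out
}
puts $status
```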

The script writes the data as a Tcl set command. The list command is used to format the data properly:

puts $out [list set Guestbook([Cgi_Value name]) \
	[list [Cgi_Value url] [Cgi_Value html]]]
There are two lists. First the url and html are formatted into one list. This list will be the value of the array element. Then, the whole Tcl command is formed as a list. In simplified form, the command is generated from this:

list set variable value
Using the list command ensures that the result will always be a valid Tcl command that sets the variable to the given value. The list command is described in more detail on page 55.
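
The following sketch checks that claim with awkward input. The form values are made up and contain spaces, quotes, braces, and a dollar sign. Evaluating the generated command with eval stands in for loading it from the data file with source:

```tcl
# Made-up form values that contain characters special to Tcl.
set name {Brent "B" Welch}
set url  http://www.beedub.com/
set html {<B>Costs $5 {or so}</B>}

# Build the command the same way newguest.cgi does.
set cmd [list set Guestbook($name) [list $url $html]]

# Evaluating the command stands in for sourcing the data file.
eval $cmd

# The values come back exactly as they went in.
set item $Guestbook($name)
puts [lindex $item 0]
puts [lindex $item 1]
```

Because list adds whatever braces or backslashes are needed, the special characters in the guest's input are stored as plain data and are not executed when the file is loaded.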

Next Steps

There are a number of details that could be added to this example. A user may want to update their entry, for example. They could do that now, but they would have to retype everything. They might also like a chance to check the results of their registration and make changes before committing them. This requires another page that displays their guest entry as it would appear on a page, and also has the fields that let them update the data.

The details of how a CGI script is hooked up with a Web server vary from server to server. You should ask your local Webmaster for help if you want to try this out on your Web site.

Don Libes has created a nice package for CGI scripts, cgi.tcl, and you can find it on the Web at http://expect.nist.gov/cgi.tcl/.

The next few chapters describe basic Tcl commands and data structures. We return to this example in Chapter 11 on regular expressions.




welch@acm.org
Copyright © 1997, Brent Welch. All rights reserved.
This will be published by Prentice Hall as the 2nd Edition of
Practical Programming in Tcl and Tk