set d [exec date]
The standard output of the program is returned as the value of the exec command. However, if the program writes to its standard error channel or exits with a non-zero status code, then exec raises an error. If you do not care about the exit status, or you use a program that insists on writing to standard error, then you can use catch to mask the errors:
catch {exec program arg arg} result
The exec command supports a full set of I/O redirection and pipeline syntax. Each process normally has three I/O channels associated with it: standard input, standard output, and standard error. With I/O redirection you can divert these I/O channels to files or to I/O channels you have opened with the Tcl open command. A pipeline is a chain of processes that have the standard output of one command hooked up to the standard input of the next command in the pipeline. Any number of programs can be linked together into a pipeline.
set n [exec sort < /etc/passwd | uniq | wc -l 2> /dev/null]
Example 9-1 uses exec to run three programs in a pipeline. The first program is sort, which takes its input from the file /etc/passwd. The output of sort is piped into uniq, which suppresses duplicate lines. The output of uniq is piped into wc, which counts the lines. The error output of the command is diverted to the null device to suppress any error messages.
Table 9-1 provides a summary of the syntax understood by the exec command. Note that a trailing & causes the program to run in the background. In this case the process identifier is returned by the exec command. Otherwise, the exec command blocks during execution of the program and the standard output of the program is the return value of exec. The trailing newline in the output is trimmed off, unless you specify -keepnewline as the first argument to exec.
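For example, the two commands below illustrate these cases in a minimal sketch. The program name long_job is a hypothetical placeholder:
# Run a program in the background; exec returns its process ID.
set pid [exec long_job &]
# Keep the trailing newline in the output.
set d [exec -keepnewline date]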
If you look closely at the I/O redirection syntax, you'll see that it is built up from a few basic building blocks. The basic idea is that | stands for pipeline, > for output, and < for input. The standard error is joined to the standard output by &. Standard error is diverted separately by using 2>. You can use your own I/O channels by using @.
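As an illustrative sketch (the file names here are hypothetical), the following commands read input from in.txt, write the sorted output to out.txt, and divert error messages to a Tcl channel opened earlier:
set log [open /tmp/exec.log w]
# < redirects input, > redirects output, and 2>@ diverts
# standard error to an open Tcl channel.
exec sort < in.txt > out.txt 2>@ $log
close $log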
The auto_noexec Variable
The Tcl shell programs are set up by default to attempt to execute unknown Tcl commands as programs. For example, you can get a directory listing by typing:
ls
instead of:
exec ls
This is handy if you are using the Tcl interpreter as a general shell. It can also cause unexpected behavior when you are just playing around. To turn this off, define the auto_noexec variable:
set auto_noexec anything
The good news is that Windows NT cleans up most of these problems. Windows 95 also handles exec better, but still simulates pipes by using temporary files.
Cross-Platform File Naming
Files are named differently on UNIX, Windows, and Macintosh. UNIX separates file name components with forward slash (/), Macintosh separates components with colon (:), and Windows separates components with backslash (\). In addition, the way that absolute and relative names are distinguished is different. For example, these are absolute pathnames for the Tcl script library (i.e., $tcl_library) on Macintosh, Windows, and UNIX respectively:
Disk:System Folder:Extensions:Tool Command Language:tcl7.6
c:\Program Files\Tcl\lib\Tcl7.6
/usr/local/tcl/lib/tcl7.6
The good news is that Tcl provides operations that let you deal with file pathnames in a platform independent manner. A few of the options operate on pathnames as opposed to returning information about the file itself. You can use these commands on any string; there is no requirement that the pathnames refer to an existing file.
The file operations described in this chapter allow either native format or the UNIX naming convention. I really appreciate the ability to type this on Windows:
c:/myprog/lib
However, there are some ambiguous cases that can only be specified with native pathnames. On my Macintosh, Tcl and Tk are installed in a directory that has a slash in it. You can only name it with the native Macintosh name:
Disk:Applications:Tcl/Tk 4.2
Simply joining components with a slash, as in the following command, is not platform independent:
set file $tcl_library/init.tcl
The platform independent way to construct file names is with file join. The following command returns the name of the init.tcl file in native format:
set file [file join $tcl_library init.tcl]
The file join operation can join any number of pathname components. In addition, it has the feature that an absolute pathname overrides any previous components. For example (on UNIX), /b/c is an absolute pathname, so it overrides any paths that come before it in the arguments to file join:
file join a b/c d
=> a/b/c/d
file join a /b/c d
=> /b/c/d
On Macintosh, a relative pathname starts with a colon, and an absolute pathname does not. To specify an absolute path you put a trailing colon on the first component so it is interpreted as a volume specifier. These relative components are joined into a relative pathname:
file join a :b:c d
=> :a:b:c:d
In the next case, b:c is an absolute pathname with b: as the volume specifier. The absolute name overrides the previous relative name:
file join a b:c d
=> b:c:d
The file join operation converts UNIX-style pathnames to native format. For example, on Macintosh you get this:
file join /usr/local/lib
=> usr:local:lib
The inverse of file join is file split, which divides a pathname into its components. For example, on Macintosh:
file split "/Disk/System Folder/Extensions"
=> Disk: {System Folder} Extensions
A common reason to split up a pathname is to divide it into the directory part and the file part. These special cases are handled directly by the dirname and tail operations. The dirname operation returns the parent directory of a pathname, while tail returns the trailing component of the pathname:
file dirname /a/b/c
=> /a/b
file tail /a/b/c
=> c
For a pathname with a single component, the dirname option returns "." on UNIX and Windows, or ":" on Macintosh. This is the name of the current directory.
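For example, on UNIX:
file dirname foo.txt
=> .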
The extension and root options are also complementary. The extension option returns everything from the last period in the name to the end (i.e., the file suffix, including the period). The root option returns everything up to, but not including, the last period in the pathname:
file root /a/b.c
=> /a/b
file extension /a/b.c
=> .c
Copying Files
The file copy operation copies files and directories. The following example copies file1 to file2. If file2 already exists, the operation raises an error unless the -force option is specified:
file copy ?-force? file1 file2
Several files can be copied into a destination directory. The names of the source files are preserved. The -force option indicates that files under directory can be replaced:
file copy ?-force? file1 file2 ... directory
Directories can be recursively copied. The -force option indicates that files under dir2 can be replaced:
file copy ?-force? dir1 dir2
Creating Directories
The file mkdir operation creates one or more directories:
file mkdir dir dir ...
It is not an error if the directory already exists. Furthermore, intermediate directories are created if needed. This means you can always make sure a directory exists with a single mkdir operation. Suppose /tmp has no subdirectories at all. The following command creates /tmp/sub1 and /tmp/sub1/sub2:
file mkdir /tmp/sub1/sub2
The -force option is not understood by file mkdir, so the following command accidentally creates a folder named -force, as well as one named oops:
file mkdir -force oops
Deleting Files
The file delete operation deletes files and directories:
file delete ?-force? name name ...
To delete a file or directory named -force, you must specify a non-existent file before the -force to prevent it from being interpreted as a flag (-force -force won't work):
file delete xyzzy -force
Renaming Files and Directories
The file rename operation changes a file's name:
file rename ?-force? old new
Using file rename is the best way to update an existing file. First generate the new version of the file in a temporary file. Then use file rename to replace the old version with the new version. This ensures that any other programs that access the file will not see the new version until it is complete.
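A minimal sketch of this idiom follows. The file name data.txt and the variable newContents, which is assumed to hold the new data, are illustrative assumptions:
# Write the new version to a temporary file, then
# atomically replace the old version.
set tmp data.txt.[pid]
set out [open $tmp w]
puts $out $newContents
close $out
file rename -force $tmp data.txt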
The following procedure uses the file exists and file mtime operations to determine whether file1 is newer than file2:
proc newer { file1 file2 } {
if ![file exists $file2] {
return 1
} else {
# Assume file1 exists
expr [file mtime $file1] > [file mtime $file2]
}
}
The most general file command options are stat and lstat. They take a third argument that is the name of an array variable, and they initialize that array with elements and values corresponding to the results of the stat system call. The array elements defined are: atime, ctime, dev, gid, ino, mode, mtime, nlink, size, type, and uid. All the element values are decimal strings, except for type, which can have the values returned by the type option. See the UNIX manual page on the stat system call for a description of these attributes. Example 9-3 uses the device (dev) and inode (ino) attributes of a file to determine if two pathnames reference the same file.
proc fileeq { path1 path2 } {
file stat $path1 stat1
file stat $path2 stat2
expr {$stat1(ino) == $stat2(ino) && \
$stat1(dev) == $stat2(dev)}
}
Opening Files for I/O
The open command sets up an I/O channel to either a file or a pipeline of processes. The return value of open is an identifier for the I/O channel. Store the result of open in a variable, and use the variable like you used the stdout, stdin, and stderr identifiers in the examples so far. The basic syntax is:
open what ?access? ?permissions?
The what argument is either a file name or a pipeline specification similar to that used by the exec command. The access argument can take two forms, either a short character sequence that is compatible with the fopen library routine, or a list of POSIX access flags. Table 9-4 summarizes the first form, while Table 9-5 summarizes the POSIX flags. If access is not specified, it defaults to read. The permissions argument is a value used for the permission bits on a newly created file. The default permission bits are 0666, which grant read/write access to everybody. Example 9-4 specifies 0600 so that the file is only readable and writeable by the owner. Remember to specify the leading zero to get an octal number as used in the chmod documentation. Consult the manual page on the UNIX chmod command for more details about permission bits.
set fileId [open /tmp/foo w 0600]
puts $fileId "Hello, foo!"
close $fileId
(You should consult your system's manual page for the open system call to determine the precise effects of the NOCTTY and NONBLOCK flags.)
set fileId [open /tmp/bar {RDWR CREAT}]
if [catch {open /tmp/data r} fileId] {
puts stderr "Cannot open /tmp/data: $fileId"
} else {
# Read and process the file, then...
close $fileId
}
set input [open "|sort /etc/passwd" r]
set contents [split [read $input] \n]
close $input
You can open a pipeline for both read and write by specifying the r+ access mode. In this case you need to worry about buffering. After a puts, the data may still be in a buffer in the Tcl library. Use the flush command to force the data out to the spawned processes before you try to read any output from the pipeline. Remember that this will not work at all with Windows 95 because pipes are simulated with files. On UNIX, the expect extension, which is described in the O'Reilly book Exploring Expect, provides a much more powerful way to interact with other programs.
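Here is a minimal sketch of the read-write case. It assumes a hypothetical filter program, server, that writes one line of output for each line of input and flushes its own output after each line:
set io [open "|server" r+]
puts $io "a request"
# Force the request out of Tcl's buffers before reading.
flush $io
set reply [gets $io]
close $io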
Event-driven I/O is also very useful with pipes. It means you can do other processing while the pipeline executes, and just respond when the pipe generates data. This is described in Chapter 14.
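As a preview, a minimal sketch of event-driven reading from a pipeline might look like the following. The du command and the Reader procedure name are illustrative assumptions, and the event loop must be running (for example, in wish or via vwait) before the callback fires:
set pipe [open "|du -s /usr/local" r]
fileevent $pipe readable [list Reader $pipe]
proc Reader { pipe } {
    if {[gets $pipe line] < 0} {
        # End of file on the pipeline
        close $pipe
        return
    }
    # Process the line here
    puts $line
}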
Reading and Writing
The standard I/O channels are already open for you. There is a standard input channel, a standard output channel, and a standard error output channel. These channels are identified by stdin, stdout, and stderr, respectively. Other I/O channels are returned by the open command, and by the socket command described on page 164. There are several commands used with channel identifiers.
The puts and gets Commands
The puts command writes a string and a newline to the output channel. There are a couple of details about the puts command that we have not yet used. It takes a -nonewline argument that prevents the newline character that is normally appended to the output channel. This will be used in the prompt example below. The second feature is that the channel identifier is optional, defaulting to stdout if not specified.
puts -nonewline "Enter value: "
flush stdout ;# Necessary in Tcl 7.5 and Tcl 7.6
set answer [gets stdin]
The gets command reads a line of input, and it has two forms. In the previous example, with just a single argument, gets returns the line read from the specified I/O channel. It discards the trailing newline from the return value. If end-of-file is reached, an empty string is returned. You must use the eof command to tell the difference between a blank line and end-of-file. (eof returns 1 if there is end-of-file.) Given a second varName argument, gets stores the line into the named variable and returns the number of bytes read. It discards the trailing newline, which is not counted. A -1 is returned if the channel has reached end of file.
while {[gets $channel line] >= 0} {
# Process line
}
close $channel
foreach line [split [read $channel] \n] {
# Process line
}
close $channel
For moderate-sized files, it is about 10% faster to loop over the lines in a file using the read loop in the second example. In this case, read returns the whole file, and split chops the file into list elements, one for each line. For small files (less than 1K) it doesn't really matter. For large files (megabytes) you might induce paging with this approach.
During output, text lines are generated in the platform-native format. The automatic handling of line formats means that it is easy to convert a file to native format. You just need to read it in and write it out:
puts -nonewline $out [read $in]
To suppress conversions, use the fconfigure command, which is described in more detail on page X. There are several properties of a channel you can set. Specify binary translations to do no conversions:
fconfigure channelID -translation binary
Example 9-10 demonstrates a File_Copy procedure that translates files to native format. It is complicated because it handles directories:
proc File_Copy {src dest} {
if [file isdirectory $src] {
file mkdir $dest
foreach f [glob -nocomplain [file join $src *]] {
File_Copy $f [file join $dest [file tail $f]]
}
return
}
if [file isdirectory $dest] {
set dest [file join $dest [file tail $src]]
}
set in [open $src]
set out [open $dest w]
puts -nonewline $out [read $in]
close $out
close $in
}
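For example, a hypothetical call that copies an entire directory tree, converting each file to native line format:
File_Copy /projects/src /tmp/src-native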
Matching File Names with glob
The glob command expands a pattern into the set of matching file names. The general form of the glob command is:
glob ?flags? pattern ?pattern? ...
The pattern syntax is similar to the string match patterns: * matches any number of characters, ? matches a single character, [abc] matches one character from a set, and {a,b,c} matches any of the alternatives a, b, or c.
The -- flag must be used if the pattern begins with a -.
Unlike the glob matching in csh, the Tcl glob command only matches the names of existing files. In csh, the {a,b} construct can match non-existent names. In addition, the results of glob are not sorted. Use the lsort command to sort the result if the order is important.
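For example, to get a sorted list of the Tcl scripts in the current directory:
set files [lsort [glob -nocomplain *.tcl]]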
The FindFile procedure below searches a directory hierarchy for file names that match a pattern:
proc FindFile { startDir namePat } {
set pwd [pwd]
if [catch {cd $startDir} err] {
puts stderr $err
return
}
foreach match [glob -nocomplain -- $namePat] {
puts stdout [file join $startDir $match]
}
foreach file [glob -nocomplain *] {
if [file isdirectory $file] {
FindFile [file join $startDir $file] $namePat
}
}
cd $pwd
}
The FindFile procedure traverses the file system hierarchy using recursion. At each iteration it saves its current directory and then attempts to change to the next subdirectory. A catch guards against bogus names. The glob command matches file names. FindFile is called recursively on each subdirectory.
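A hypothetical invocation that prints all Tcl scripts under /usr/local:
FindFile /usr/local *.tcl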
The pid command returns the process ID of the current process. This can be useful as the seed for a random number generator because it changes each time you run your script. It is also common to embed the process ID in the name of temporary files.
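For example, a temporary file name that embeds the process ID (the /tmp/data prefix is an arbitrary choice):
set tmpfile /tmp/data[pid]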
The printenv procedure below uses the global env array, which holds the environment variables of the process, to print selected environment values:
proc printenv { args } {
global env
set maxl 0
if {[llength $args] == 0} {
set args [lsort [array names env]]
}
foreach x $args {
if {[string length $x] > $maxl} {
set maxl [string length $x]
}
}
incr maxl 2
foreach x $args {
puts stdout [format "%*s = %s" $maxl $x $env($x)]
}
}
printenv USER SHELL TERM
=>
USER = welch
SHELL = /bin/csh
TERM = tx