Notes

{Note Agenda}

In fact, I have an additional hidden agenda. I do believe that computational agents should be expressed as procedures or procedure libraries, not as programs. Scsh is intended to be an incremental step in this direction, one that is integrated with Unix. Writing a program as a Scheme 48 module should allow the user to make it available both as a subroutine library callable from other Scheme 48 programs or the interactive read-eval-print loop, and, by adding a small top-level, as a standalone Unix program. So Unix programs written this way will also be usable as linkable subroutine libraries -- giving the programmer module interfaces superior to Unix's "least common denominator" of ASCII byte streams sent over pipes.

{Note No port sync}

In scsh, Unix's stdio file descriptors and Scheme's standard i/o ports (i.e., the values of (current-input-port), (current-output-port) and (error-output-port)) are not necessarily synchronised. Keeping them synchronised is impossible in general, since some Scheme ports are not representable as Unix file descriptors. For example, many Scheme implementations provide "string ports," that is, ports that collect characters sent to them into memory buffers. The accumulated characters can later be retrieved from the port as a string. If a user were to bind (current-output-port) to such a port, it would be impossible to associate file descriptor 1 with this port, as it cannot be represented in Unix. So, if the user subsequently forked off some other program as a subprocess, that program would of course not see the Scheme string port as its standard output.

To keep stdio synced with the values of Scheme's current i/o ports, use the special redirection stdports. This causes file descriptors 0, 1, and 2 to be redirected from the current Scheme standard ports. It is equivalent to the three redirections:


(= 0 ,(current-input-port))
(= 1 ,(current-output-port))
(= 2 ,(error-output-port))

The redirections are done in the indicated order. This will cause an error if one of the current i/o ports isn't a Unix port (e.g., if one is a string port). This Scheme/Unix i/o synchronisation can also be had in Scheme code (as opposed to a redirection spec) with the (stdports->stdio) procedure.

{Note Normal order}

Having to explicitly shift between processes and functions in scsh is in part due to the arbitrary-size nature of a Unix stream. A better, more integrated approach might be to use a lazy, normal-order language as the glue or shell language. Then files and process output streams could be regarded as first-class values, and treated like any other sequence in the language. However, I suspect that the realities of Unix, such as side-effects, will interfere with this simple model.

{Note On-line streams}

The (port->list reader port) procedure is a batch processor: it reads the port all the way to eof before returning a value. As an alternative, we might write a procedure to take a port and a reader, and return a lazily-evaluated list of values, so that I/O can be interleaved with element processing. A nice example of the power of Scheme's abstraction facilities is the ease with which we can write this procedure: it can be done with five lines of code.


;;; A <lazy-list> is either 
;;;     (delay '()) or
;;;     (delay (cons data <lazy-list>)).

(define (port->lazy-list reader port)
  (let collector ()
    (delay (let ((x (reader port)))
             (if (eof-object? x) '()
                 (cons x (collector)))))))
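
To see the interleaving in action, the following sketch forces only a prefix of a lazy list and then checks that the rest of the port is still unread. It is an illustration under assumptions, not part of scsh: lazy-take is a hypothetical helper, and the example assumes the Scheme in use provides open-input-string (as R7RS and SRFI 6 do). The definition of port->lazy-list is repeated so the sketch stands alone.

```scheme
;; Definition repeated from the note above so this sketch is self-contained.
(define (port->lazy-list reader port)
  (let collector ()
    (delay (let ((x (reader port)))
             (if (eof-object? x) '()
                 (cons x (collector)))))))

;; Hypothetical helper: force at most N elements of a <lazy-list>
;; and return them as an ordinary list. When N is 0 nothing is forced,
;; so no further input is read.
(define (lazy-take lazy-list n)
  (if (= n 0)
      '()
      (let ((cell (force lazy-list)))
        (if (null? cell)
            '()
            (cons (car cell)
                  (lazy-take (cdr cell) (- n 1)))))))

;; A string port stands in for a file or process output stream.
(define demo-port (open-input-string "1 2 3 4 5"))
(define demo-items (port->lazy-list read demo-port))

(define first-two (lazy-take demo-items 2))   ; reads only two data
(define next-datum (read demo-port))          ; the third datum is still unread
```

Here first-two is (1 2) and next-datum is 3: the port was read only as far as the consumer demanded, whereas port->list would have drained it to eof before returning.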

{Note Tempfile example}

For a more detailed example showing the advantages of higher-order procedures in Unix systems programming, consider the task of making random temporary objects (files, directories, fifos, etc.) in the file system. Most Unixes simply provide a function such as tmpnam() that generates an unusual filename, and hope for the best. Other Unixes provide functions that avoid the race condition between determining the temporary file's name and creating it, but they do not provide equivalent features for non-file objects, such as directories or symbolic links. This functionality is easily generalised with the procedure

(temp-file-iterate maker [template])

This procedure can be used to perform atomic transactions on the file system involving filenames, such as linking a file to a fresh backup name or creating an unused temporary directory.

The string template is a format control string used to generate a series of trial filenames; it defaults to

"/usr/tmp/<pid>.~a"

where <pid> is the current process's process ID. Filenames are generated by calling format to instantiate the template's ~a field with a varying string. (It is not necessary for the process's pid to be a part of the filename for the uniqueness guarantees to hold. The pid component of the default prefix simply serves to scatter the name searches into sparse regions, so that collisions are less likely to occur. This speeds things up, but does not affect correctness.)

The maker procedure is called on each generated filename in turn. It must return at least one value; it may return multiple values. If the first return value is #f, or if maker raises the "file already exists" syscall error exception, temp-file-iterate will loop, generating a new filename and calling maker again. If the first return value is true, the loop is terminated, returning whatever maker returned.

After a number of unsuccessful trials, temp-file-iterate may give up and signal an error.
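
The retry protocol just described can be modelled in a few lines of portable Scheme. The sketch below is a simplified illustration under assumptions, not scsh's implementation: iterate-names, its numeric name generation, and the max-tries limit are all hypothetical, and it handles only the #f return case, ignoring the "file already exists" exception. A list of strings stands in for the file system so that the sketch runs without side effects on disk.

```scheme
;; Hypothetical model of the retry loop: try PREFIX0, PREFIX1, ...
;; until MAKER's first return value is true, then return all of
;; MAKER's values. Give up with an error after MAX-TRIES attempts.
(define (iterate-names maker prefix max-tries)
  (let loop ((i 0))
    (if (>= i max-tries)
        (error "Can't create temporary object" prefix)
        (let ((name (string-append prefix (number->string i))))
          (call-with-values
            (lambda () (maker name))
            (lambda (first . rest)
              (if first
                  (apply values first rest)   ; success: pass values through
                  (loop (+ i 1)))))))))       ; #f: try the next name

;; A list of names stands in for the file system (an assumption of
;; this sketch; the real maker would perform a system call).
(define taken-names '("tmp.0" "tmp.1"))

;; Claim NAME if it is free; return #f if it is already taken.
(define (claim-name name)
  (if (member name taken-names)
      #f
      (begin (set! taken-names (cons name taken-names))
             name)))

(define fresh-name (iterate-names claim-name "tmp." 10))
```

With "tmp.0" and "tmp.1" already taken, fresh-name is "tmp.2". The important design point carries over from the real procedure: the test for availability and the act of claiming the name happen inside a single call to maker, so the caller never acts on a name that was merely observed to be free.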

To rename a file to a temporary name, we write:


(temp-file-iterate (lambda (backup-name)
                     (create-hard-link old-file
                                       backup-name)
                     backup-name)
                   ".#temp.~a") ; Keep link in cwd.
(delete-file old-file)

Note the guarantee: if temp-file-iterate returns successfully, then the hard link was definitely created, so we can safely delete the old link with the following delete-file.

To create a unique temporary directory, we write:

(temp-file-iterate (lambda (dir) (create-directory dir) dir))

Similar operations can be used to generate unique symlinks and fifos, or to return values other than the new filename (e.g., an open file descriptor or port).