SCSH-Process

  1. SCSH-Process
    1. Description
    2. Requirements
    3. Documentation
      1. Caveats
        1. Forking and threads
        2. Signal handling
        3. Process reaping
      2. Macros
        1. Basic process macros
        2. Process macros for interfacing with Scheme
        3. Macros for conditional process control
        4. Collecting multiple outputs
      3. Procedural interface
        1. Basic forking and pipeline primitives
        2. Executing programs from the path
        3. Pipeline procedures
        4. Collecting multiple outputs
    4. Changelog
    5. Author
    6. Repository
    7. License

Description

A reimplementation for Chicken of SCSH's process notation.

Requirements

llrb-tree

Documentation

This egg implements all the special forms required to implement SCSH's convenient process notation. This notation is called "EPF", which stands for extended process form.

The procedural equivalents for these special forms are also available. Where it's required or makes sense, a handful of other procedures are also provided. This egg does not strive for full SCSH compatibility; it exists only to bring the convenience of EPF notation to Chicken.

Caveats

There are a few important caveats, especially related to threading. Please read this section thoroughly before using scsh-process! If none of this makes sense yet, read the rest of the manual first, then come back. The issues are listed in order of decreasing importance.

Forking and threads

If you are using threads, it is important to realize that thread killing is completely insensitive to your delicate synchronization primitives or resource management. Ensure that all locks are released and you don't have any non-GCable memory allocated in other threads. If you have finalizers registered, those will still run, though.

Signal handling

Beware that if you set a signal/chld handler, you will need to remember the original handler and call it from your handler. If you don't, you must manually wait for all children. If you've installed a handler before scsh-process is loaded, it will be automatically chained by the handler installed by scsh-process.

Process reaping

Loading scsh-process will transparently cause the posix process-wait procedure to be replaced so it can update the bookkeeping information about reaped child processes. See the description in the SCSH manual about how SCSH performs process reaping. Currently only the "early" reaping strategy is supported.

Macros

The special forms provided by this egg are the most convenient to use, but also slightly less powerful than the underlying procedures they use.

Basic process macros
[syntax] (run pf [redirection ...])
[syntax] (& pf [redirection ...])
[syntax] (exec-epf pf [redirection ...])

These forms run a new process pipeline, described by process form pf, with optional redirections of input and output descriptors indicated by any number of redirection patterns. These processes don't interact with the calling process; if you want to capture their output there are other forms, like run/string, run/port and friends. Please also beware that these forms don't do anything to indicate nonzero exit statuses in pipelines. The && and || macros might help if you need to do status checks.

The run form simply runs the process and waits for the final process in the pipeline to exit. It returns three values relating to this final process:

You'll note that these are just the same values returned by process-wait, but in a slightly different order (wait utilises this modified order, as well). This is for compatibility reasons: In SCSH this form returns only the exit status, but because Chicken accepts multiple values in single value contexts (discarding all but the first), we can provide the other ones as extra values, thereby avoiding gratuitous incompatibility.

The & form simply runs the process in the background and returns one value: a process object representing the last process in the pipeline.

The exec-epf form never returns; it replaces the current process by the last process in the pipeline. The others are implemented in terms of this form:

(& . epf) => (process-fork (lambda () (exec-epf . epf)) #t)
(run . epf) => (process-wait (& . epf))

A process form followed by a set of redirections is called "extended process form" in SCSH terminology. A process form can be one of the following:

(<program> <arg> ...)
(pipe <pf> ...)
(pipe+ <connect-list> <pf> ...)
(epf <pf> <redirection> ...)
(begin <s-expr> ...)

The basic building blocks are <program> and <begin>, which always correspond to one process in a pipeline. The other rules are ways to combine these two.

<program> gives the name of a program to run. If it starts with a slash, it's an absolute filename. If it contains a slash but doesn't start with one, it's a relative filename. If it doesn't contain a slash, it's a command name that is located in PATH. One or more <arg>s can be given as arguments to the program.

<program> and each <arg> are implicitly quasiquoted. <program> can be a string, symbol, number, or an unquote expression ,e that evaluates to a value of one of those types. Each <arg> follows the same rules as <program>, but can additionally be an unquote-splicing expression ,@e that evaluates to a list of multiple arguments where each element is a string, symbol, or number.

NOTE: An unquote-splicing expression is not recognized in <program> position, so it's not possible to obtain <program> and <arg>s from the same ,@ expression. If you have both the program name and the arguments in the same list, use car and cdr to destructure the list before passing it to run.

The pipe rule will hook up standard output and standard error of each pf to the next pf's standard input, just like in a regular Unix shell pipeline.

The pipe+ rule is like pipe, but it allows you to hook up arbitrary file descriptors between two neighbouring processes. This is done through the connect-list, which is a list of fd-mappings describing how ports are connected from one process to the next. It has the form ((from-fd1 from-fd2 ... to-fd) ...). The from-fds correspond to outbound file descriptors in one process, the to-fds correspond to inbound file descriptors in the other process.

The epf rule is to get extended process forms in contexts where only process forms are accepted, like the pipe and pipe+ subforms, and in the && and || macros (so you can do file redirects here).

The begin rule allows you to write scheme code which will be run in a forked process, having its current-input-port, current-output-port and current-error-port hooked up to its neighbouring processes in the pipeline.

A redirection can be one of the following:

(> [<fd>] <file-name>)      ; Write fd (default: 1) to the given filename
(>> [<fd>] <file-name>)     ; Like >, but append instead of overwriting
(< [<fd>] <file-name>)      ; Read fd (default: 0) from the filename
(<< [<fd>] <scheme-object>) ; Like <, but use object's printed representation
(= <fd> <fd-or-port>)       ; Redirect fd to fd-or-port
(- <fd-or-port>)            ; Close fd
stdports                    ; Duplicate fd 0, 1, 2 from standard Scheme ports

The arguments to redirection rules are also implicitly quasiquoted.

To tie it all together, here are a few examples:

(import scsh-process)

;; Writes "1235" to a file called "out" in the current directory.
;; Shell equivalent:  echo 1234 + 1 | bc > out
(run (pipe (echo "1234" + 1) ("bc")) (> out))

(define message "hello, world")

;; Writes 13 to stdout, with a forked Scheme process writing the data.
;; Shell equivalent (sort of): echo 'hello, world' | wc -c
(run (pipe (begin (display message) (newline)) (wc -c)))

;; A verbose way of doing the same, using pipe+.  It connects the {{begin}}
;; form's standard output and standard error to standard input of {{wc}}:
(run (pipe+ ((1 2 0)) (begin (display message) (newline)) (wc -c)))

;; Same as above, using redirection instead of port writing:
(run (wc -c) (<< ,(string-append message "\n")))

;; Writes nothing because stdout is closed:
(run (wc -c) (<< ,message) (- 1))

;; A complex toy example using nested pipes, with input/output redirection.
;; Closest shell equivalent:
;; ((sh -c "echo foo >&2") 2>&1 | cat) | cat
(run (pipe+ ((1 0))
            (pipe+ ((2 0)) (sh -c "echo foo >&2") (cat))
            (cat)))
Process macros for interfacing with Scheme
[syntax] (run/port pf [redirection ...])
[syntax] (run/file pf [redirection ...])
[syntax] (run/string pf [redirection ...])
[syntax] (run/strings pf [redirection ...])
[syntax] (run/sexp pf [redirection ...])
[syntax] (run/sexps pf [redirection ...])

These forms are equivalent to run, except they wire up the current process to the endpoint of the pipeline, allowing you to read the standard output from the pipeline as a whole. If you also need standard error or have even more specialized needs, take a look at the run/collecting form.

The difference between these forms is in how this output is returned, and when the call returns:

Macros for conditional process control
[syntax] (&& pf ...)
[syntax] (|| pf ...)

These macros act like their counterpart shell operators; they run the given process forms in sequence and stop either on the first "false" value (nonzero exit) or on the first "true" value (zero exit), respectively.

The result value of these is #f or #t, so they act a lot like regular Scheme and and or.

Note: The name of the || macro is really the empty symbol whereas in SCSH's reader, it reads as a symbol consisting of two pipe characters. The name of these macros may change in the future if it turns out to cause too much trouble.

Collecting multiple outputs
[syntax] (run/collecting fds pf ...)

This form runs the pf form, redirecting each the file descriptors in the fds list to a separate tempfile, and waits for the process to complete.

The result of this expression is (status file ...). status is the exit status of the process. Each file entry is an opened input port for the temporary file that belongs to the file descriptor at the same offset in the fds list. If you close the port, the tempfile is removed.

See the SCSH documentation for an extended rationale of why this works the way it does.

Procedural interface

These procedures form the basis for the special forms documented above, and can be used to implement your own, more specialized macros.

Basic forking and pipeline primitives
[procedure] (fork [thunk [continue-threads?]])
[procedure] (%fork [thunk [continue-threads?]])

If thunk is provided and not #f, the child process will invoke the thunk and exit when it returns. If continue-threads is provided and #t, all existing threads will be kept alive in the child process; by default only the current thread will be kept alive.

fork differs from the regular process-fork in its return value. Instead of a pid value, this returns a process object representing the child is returned in the parent process. When thunk is not provided, #f (not zero!) is returned in the child process.

Currently %fork is just an alias for fork.

[procedure] (wait [pid-or-process [nohang]])

Like process-wait, but nonblocking: Suspends the current process until the child process described by pid-or-process (either a numerical process ID or a scsh-process object) has terminated using the UNIX system call waitpid(). If pid-or-process is not given, then this procedure waits for any child process. If nohang is given and not #f then the current process is not suspended.

This procedure returns three values, in a different order from process-wait:

All values are #f, if nohang is true and the child process has not terminated yet.

It is allowed to wait multiple times for the same process after it has completed, if you pass a process object. Process IDs can only be waited for once after they have completed and will cause an error otherwise.

This procedure is nonblocking, which means you can wait for a child in a thread and have the other threads continue to run; the process isn't suspended, but a signal handler will take care of unblocking the waiting thread.

[procedure] (signal-process proc signal)

Like process-signal from the POSIX unit, but accepts a process object (proc) instead of a pid. Sends signal (an integer) to the given process.

[procedure] (process-sleep sec)

Put the entire process to sleep for sec seconds. Just an alias for sleep from the POSIX unit.

[procedure] (process? object)
[procedure] (proc? object)

Is object an object representing a process? The process? predicate is deprecated; proc? is API-compatible with SCSH.

[procedure] (proc:pid proc)

Retrieve the process id (an integer) from the process object proc.

[procedure] (fork/pipe [thunk [continue-threads?]])
[procedure] (%fork/pipe [thunk [continue-threads?]])

These fork the process as per fork or %fork, but additionally they set up a pipe between parent and child. The child's standard output is set up to write to the pipe, while the parent's standard input is set up to read to the pipe. Standard error is inherited from the parent.

The return value is a process object or #f.

Currently fork%/pipe is just an alias for fork/pipe.

Important: These procedures only set up the file descriptors, not the Scheme ports. current-input-port, current-output-port and current-error-port still refer to their old file descriptors after a fork. This means that you'll need to reopen the descriptors to get a Scheme port that reads from the child or writes to the parent:

(import scsh-process (chicken file posix))

(process-wait
  (fork/pipe (lambda ()
               (with-output-to-port (open-output-file* 1)
                 (lambda () (display "Hello, world.\n"))))))

(read-line (open-input-file* 0)) => "Hello, world"
[procedure] (fork/pipe+ conns [thunk [continue-threads?]])
[procedure] (fork%/pipe+ conns [thunk [continue-threads?]])

These are like fork/pipe and fork%/pipe, except they allow you to control how the file descriptors are wired. Conns is a list of lists, of the form ((from-fd1 from-fd2 ... to-fd) ...). See the description of pipe+ under the run special form for more information.

Currently fork%/pipe+ is just an alias for fork/pipe+.

Executing programs from the path
[procedure] (exec-path program [args ...])
[procedure] (exec-path* program arg-list)

This will simply execute program with the given arguments in args or args-list. The program replaces the currently running process. All arguments (including program) must be strings, symbols or numbers, and they will automatically be converted to strings before invoking the program.

The program is looked up in the user's path, which is simply the $PATH environment variable. There currently is no separately maintained path list like in SCSH. The difference between the two procedures is that exec-path accepts a variable number of arguments and exec-path* requires them pre-collected into a list.

Pipeline procedures
[procedure] (run/port* thunk)
[procedure] (run/file* thunk)
[procedure] (run/string* thunk)
[procedure] (run/strings* thunk)
[procedure] (run/sexp* thunk)
[procedure] (run/sexps* thunk)

These set up a pipe between the current process and a forked off child process which runs the thunk. See the "unstarred" versions run/port, run/file ... run/sexps for more information about the semantics of these procedures.

Collecting multiple outputs
[procedure] (run/collecting* fds thunk)

Like run/collecting, but use a thunk instead of a process form.

Changelog

Author

Peter Bex

Repository

http://code.more-magic.net/scsh-process

License

 Copyright (c) 2012-2020, Peter Bex
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 
 THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.