Spiffy

  1. Spiffy
    1. Description
    2. Author
    3. Requirements
    4. Documentation
    5. Starting the server
    6. Configuration parameters
    7. Handlers
    8. Runtime information
    9. Virtual hosts
    10. Access files
    11. Procedures and macros
    12. Modules
      1. ssp-handler
      2. web-scheme-handler
      3. cgi-handler
      4. simple-directory-handler
        1. Configuration
        2. Procedures
    13. Examples
      1. Quick config for serving up a docroot
      2. Network tweaks
      3. Redirecting to another domain
    14. Changelog
    15. License

Description

A small web-server written in Chicken.

Author

Felix Winkelmann. Currently maintained by Peter Bex.

Requirements

Requires the intarweb, uri-common, uri-generic and sendfile extensions.

Documentation

Spiffy is a web-server library for the Chicken Scheme system. It's quite easy to set up and use (whether as a library or a standalone server application) and it can be customized in numerous ways.

To test it out immediately, try the following command after installing:

 $ csi -e "(use spiffy) (start-server)"

This starts up a spiffy server which listens on port 8080, and serves documents from the default docroot, which is a directory called web underneath the current working directory.

A typical httpd would be a more complete program which first configures a few things like the server-port and root-path parameters and sets up some logging through access-log and error-log. It could also configure some extensions (like simple-directory-handler) and register some user-defined path handlers by adding them to vhost-map. Finally, it could daemonize by forking and exiting. The mentioned parameters are explained below.
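
As a rough illustration of the kind of program described above, the following minimal sketch sets a port, a docroot and two log files before starting the server. The directory and log file locations are hypothetical placeholders, and daemonizing is left out:

```scheme
(use spiffy)

;; Hypothetical locations; adjust these to your own system.
(server-port 8080)
(root-path "/var/www/example")
(access-log "/var/log/spiffy/access.log")
(error-log "/var/log/spiffy/error.log")

;; Blocks and serves requests until the process is stopped.
(start-server)
```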

Starting the server

[procedure] (start-server [port: port-number] [bind-address: address] [listen: listen-procedure] [accept: accept-procedure] [addresses: addresses-procedure])

This is the most convenient way to get a server up and running quickly. It starts the server to listen on the given port. Other configuration can be tweaked through SRFI-39 parameters. These are listed below. Once the server is started, server behaviour can be controlled through these parameters as well. After the listener is started, when spiffy-user and/or spiffy-group are provided, this procedure will drop privileges before starting the accept loop.

By default, Spiffy will only serve static files. On directories, it will give a "403 forbidden", unless there is an index-file. If there is, that file's contents will be shown.

All arguments directly supplied to start-server override the configuration parameter values and will be parameterized to reflect this new setting.

port-number defaults to the value of server-port (see below). bind-address defaults to the value of server-bind-address (see below). listen defaults to tcp-listen and should accept a port number, backlog and bind address. accept defaults to tcp-accept, and is passed on as-is to accept-loop. addresses-procedure defaults to a procedure which works like tcp-addresses but can also detect SSL ports and return the addresses of the underlying TCP connection.
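
A minimal sketch of passing these arguments directly is shown below. The port and address are arbitrary example values:

```scheme
(use spiffy)

;; Listen on the loopback interface only, on port 8081.
;; Both values override server-port and server-bind-address
;; for the duration of this call.
(start-server port: 8081 bind-address: "127.0.0.1")
```

Setting (server-port 8081) and (server-bind-address "127.0.0.1") before calling (start-server) without arguments should have the same effect.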

[procedure] (accept-loop listener accept [addresses])

This procedure starts the loop which accepts incoming connections and fires off threads to handle requests on those connections. You can use it if you need more control over the startup process than start-server offers.

The listener object should be an object which is accepted by the accept procedure, which should return two values: an input and an output port which represent an incoming connection from a client. The optional addresses procedure should accept the input port returned by the accept procedure and return two strings: the local and remote addresses of the server and client, respectively.

For example, you can set up an SSL context and drop privileges, and possibly load extra code before starting the accept loop (Spiffy contains the required code to detect SSL ports, and will handle those more-or-less transparently):

(use chicken-syntax) ; This must be done at the very toplevel so that it is available in the interaction-environment.
(use spiffy openssl)

(server-port 443)
(spiffy-user "www")
(spiffy-group "www")

;; Bind the port as root, before we drop privileges
(define listener (ssl-listen (server-port)))

;; Load the certificate files as root so we can secure their permissions
(ssl-load-certificate-chain! listener "server.pem")
(ssl-load-private-key! listener "server.key")

;; Drop root privileges
(switch-user/group (spiffy-user) (spiffy-group))
  
;; We don't want to load this extra code as root!
(load "extra-code.scm")

;; Done! Start listening for connections.
(accept-loop listener ssl-accept)

[procedure] (switch-user/group user group)

This is a helper procedure which allows you to easily drop privileges before running the accept loop. The user and group must be either strings or UID/GID numbers which indicate the username and groupname to which you want to switch. Either is also allowed to be #f, if you don't want to switch that aspect of the process.

Configuration parameters

The following parameters can be used to control spiffy's behaviour. Besides these parameters, you can also influence spiffy's behaviour by tweaking the intarweb parameters.

[parameter] (server-software [product])

The server software product description. This should be a valid product value as used in the server and user-agent headers by intarweb; this is a list of lists. The inner lists contain the product name, the product version and a comment, all either a string or #f. Default: (("Spiffy" "a.b" "Running on Chicken x.y")), with a.b being the Spiffy major/minor version and x.y being Chicken's.

[parameter] (root-path [path])

The path to the document root, for the current vhost. Defaults to "./web".

[parameter] (server-port [port-number])

The port number on which to listen. Defaults to 8080.

[parameter] (server-bind-address [address])

The IP address on which to listen, or all addresses if #f. Defaults to #f.

[parameter] (max-connections [number])

The maximum number of simultaneously active connections. Defaults to 1024.

Any new connection that comes in when this number is reached must wait until one of the active connections is closed.

[parameter] (spiffy-user [name-or-uid])

The name or UID of a user to switch to just after binding the port. This only works if you start Spiffy as root, so it can bind port 80 and then drop privileges. If #f, no switch will occur. Defaults to #f.

[parameter] (spiffy-group [name-or-gid])

The name or GID of a group to switch to just after binding the port. This only works if you start Spiffy as root, so it can bind port 80 and then drop privileges. If #f, it will be set to the primary group of spiffy-user if the user was selected. Otherwise, no change will occur. Defaults to #f.

[parameter] (index-files [file-list])

A list of filenames which are to be used as index files to serve when the requested URL identifies a directory. Defaults to '("index.html" "index.xhtml")

[parameter] (mime-type-map [extension->mimetype-list])

An alist of extensions (strings) to mime-types (symbols), to use for the content-type header when serving up a static file. Defaults to

 '(("html" . text/html)
   ("xhtml" . application/xhtml+xml)
   ("js"  . application/javascript)
   ("css" . text/css)
   ("png" . image/png)
   ("xml" . application/xml)
   ("pdf" . application/pdf)
   ("jpeg" . image/jpeg)
   ("jpg" . image/jpeg)
   ("gif" . image/gif)
   ("ico" . image/vnd.microsoft.icon)
   ("svg" . image/svg+xml)
   ("txt" . text/plain))

See also file-extension->mime-type for a procedure which can look up file extensions for you.
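
To serve additional file types, one possible approach is to prepend entries to the existing map rather than replacing it. This is only a sketch; the two added extensions are examples chosen for illustration:

```scheme
(use spiffy)

;; Prepend two extra mappings, keeping the defaults listed above.
(mime-type-map
  (append '(("webm" . video/webm)
            ("json" . application/json))
          (mime-type-map)))

;; Look up an extension that was just added.
(file-extension->mime-type "json")

;; An extension that is not in the map falls back to default-mime-type.
(file-extension->mime-type "unknown-ext")
```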

[parameter] (default-mime-type [mime-type])

The mime-type (a symbol) to use if none was found in the mime-type-map. Defaults to 'application/octet-stream

[parameter] (default-host [hostname])

The host name to use when no virtual host could be determined from the request. See the section on virtual hosts below.

[parameter] (vhost-map [host-regex->vhost-handler])

A mapping of virtual hosts (regex) to handlers (procedures of one argument; a continuation thunk). See the section on virtual hosts below. Defaults to `((".*" . ,(lambda (continue) (continue))))

[parameter] (file-extension-handlers [extension->handler-list])

An alist mapping file extensions (strings) to handler procedures (lambdas of one argument; the file name relative to the webroot). Defaults to '(). If no handler was found, defaults to just sending a static file.
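
As a small sketch of such a handler, the following refuses to serve files with certain extensions. The procedure name refuse-backup-file and the chosen extensions are hypothetical:

```scheme
(use spiffy)

;; Hypothetical handler: never serve backup files.
;; It receives the file name relative to the webroot, which is ignored here.
(define (refuse-backup-file path)
  (send-status 'forbidden "Backup files are not served."))

(file-extension-handlers
  `(("bak"  . ,refuse-backup-file)
    ("orig" . ,refuse-backup-file)))
```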

[parameter] (access-log [log-file-or-port])

Filename (string) or port to append access log output to. Default: #f (disabled)

[parameter] (error-log [log-file-or-port])

Filename (string) or port to which error messages from evaluated code should be output. Default: (current-error-port)

[parameter] (debug-log [log-file-or-port])

Filename (string) or port to write debugging messages to. Default: #f (disabled)

[parameter] (access-file [string])

The name of an access file, or #f if not applicable. This file is read when the directory is entered by the directory traversal system, and allows you to write dynamic handlers that can assign new values for parameters only for resources below that directory, very much like adding parameters in code before calling a procedure. See the section "Access files" for more information.

[parameter] (trusted-proxies [list-of-strings])

When an incoming request is first accepted, the remote-address is initialized to the IP address of the remote peer. When this peer is a reverse proxy in an internal network, that value is not so useful because all requests would seem to come from there.

If you want to have a more meaningful value, you can add the IP addresses of proxies to this list, and X-Forwarded-For entries from these proxies will be stripped, and the first entry just before the most-distant trusted proxy will be used.

Be careful: all IP addresses in this list will be trusted on their word.

Default: () (trust no one)
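
A configuration for a setup with two reverse proxies might look like the sketch below. The addresses are hypothetical:

```scheme
(use spiffy)

;; Hypothetical internal addresses of two reverse proxies.
(trusted-proxies '("10.0.0.1" "10.0.0.2"))
```

Reading the rule above, a request arriving from peer 10.0.0.1 with the header "X-Forwarded-For: 203.0.113.7, 10.0.0.2" would be expected to end up with remote-address set to "203.0.113.7", because both proxy addresses are trusted and stripped.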

Handlers

Besides "static" configuration, Spiffy also has several handlers for when something is to be served.

[parameter] (handle-directory [proc])

The handler for directory entries. If the requested URL points to a directory which has no index file, this handler is invoked. It is a procedure of one argument, the path (a string) relative to the webroot. Defaults to a procedure which returns a "403 forbidden".

[parameter] (handle-file [proc])

The handler for files. If the requested URL points to a file, this handler is invoked to serve the file. It is a procedure of one argument, the path (a string) relative to the webroot. Defaults to a procedure which sets the content-type and determines a handler based on the file-extension-handlers, or send-static-file if none was found and the method was HEAD or GET (otherwise it replies with 405 "Method Not Allowed").

[parameter] (handle-not-found [proc])

The handler for nonexistent files. If the requested URL does not point to an existing file or directory, this procedure is called. It is a procedure of one argument, the path (a string) to the first missing file in the request path. This path should be interpreted as being relative to the webroot (even though it points to no existing file). Defaults to a procedure which returns a "404 Not found".
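
For instance, a custom handler could mention the missing path in the error page. This is a sketch only; the path is passed through htmlize because it originates from the client:

```scheme
(use spiffy)

(handle-not-found
  (lambda (path)
    ;; path is the first missing file in the request path
    (send-status 'not-found
                 (string-append "No such resource: " (htmlize path)))))
```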

[parameter] (handle-exception [proc])

The handler for when an exception occurs. This defaults to a procedure that logs the error to the error log. While debugging or developing, it may be more convenient to use a procedure that sends the error back to the client:

(handle-exception
  (lambda (exn chain)
    (send-status 'internal-server-error (build-error-message exn chain))))

[parameter] (handle-access-logging [proc])

The handler for access logging. This is a procedure of zero arguments which should write a line to the access log. Defaults to a procedure which writes a line to access-log which looks like this:

  127.0.0.1 [Sun Nov 16 15:16:01 2008] "GET http://localhost:8080/foo?bar HTTP/1.1" 200 "http://localhost:8080/referer" "Links (2.2; NetBSD 5.99.01 macppc; x)"
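
A custom logging handler can be sketched with log-to and the runtime parameters described in this document. The log file name and the line format here are arbitrary choices; request-method and response-code are taken from intarweb, and uri->string from uri-common:

```scheme
(use spiffy intarweb uri-common)

(access-log "access.log")  ; hypothetical file name

(handle-access-logging
  (lambda ()
    ;; Writes one line per request: client address, method, URI, status code
    (log-to (access-log) "~A ~A ~A ~A"
            (remote-address)
            (request-method (current-request))
            (uri->string (request-uri (current-request)))
            (response-code (current-response)))))
```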

Runtime information

During the handling of a request, Spiffy adds more information to the environment by parameterizing the following parameters whenever the information becomes available:

[parameter] (current-request [request])

An intarweb request-object that defines the current request. Available from the moment the request comes in and is parsed. Contains, among other things, the query parameters and the request-headers, in fully parsed form (as intarweb returns them).

The URI is automatically augmented with the host, scheme and port if it is not an absolute URI.

[parameter] (current-response [response])

An intarweb response-object that defines the current response. Available from the same time current-request is available. This keeps getting updated along the way, while the response data is being refined (like when headers are being added).

[parameter] (current-file [path])

The path to the requested file (a string). Available from the moment Spiffy determined the requested URL points to a file (just before the handle-file procedure is called). This path is relative to the root-path.

[parameter] (current-pathinfo [path])

The trailing path fragments (a list of strings) that were passed in the URL after the requested filename. Available from the moment Spiffy determined the requested URL points to a file (just before the handle-file procedure is called).

[parameter] (remote-address [address])

The IP address (a string) of the user-agent performing the current request. See also trusted-proxies.

[parameter] (local-address [address])

The IP address (a string) on which the current request came in.

[parameter] (secure-connection? [boolean])

#t when the current connection is a secure one (SSL), #f if it isn't (regular HTTP). This pertains only to the direct connection itself, so if Spiffy is behind a proxy this will be #f even if the proxy itself is connected to the client over SSL.

One way to get around this is to always add a custom header to your reverse proxy's configuration file. Then you can read this out in Spiffy and set secure-connection? to #t or #f, as well as updating the request URI's scheme. There is no standardized header for this, so the default Spiffy won't do this.

An easier way around this is to set up two spiffies listening on different ports and configure one to have secure-connection? set to #t, which you redirect incoming HTTPS requests to.

This parameter may disappear or change in the future, when there are more smart people using Spiffy who know how to deal with this or the Spiffy maintainer has a moment of clarity and decides how to do this cleanly.

Virtual hosts

Spiffy has support for virtual hosting, using the HTTP/1.1 Host header. This allows you to use one Spiffy instance running on one IP address/port number to serve multiple webpages, as determined by the hostname that was requested.

The virtual host is defined by a procedure, which can set arbitrary parameters on-the-fly. It is passed a continuation thunk, which it should explicitly call if it wants the processing to continue. The most used parameter in virtual host setups is the root-path parameter, so that another docroot can be selected based on the requested hostname, showing different websites for different hosts:

(vhost-map `(("foo\\.bar\\.com" .
               ,(lambda (continue)
                  ;; Requires the spiffy-dynamic-handlers egg...
                  (parameterize ((file-extension-handlers
                                   `(("ssp" . ,ssp-handler) ("ws" . ,web-scheme-handler)))
                                 (root-path "/var/www/domains/foo.bar.com"))
                     (continue))))
             (,(glob->regexp "*.domain.com") .
                ,(lambda (continue)
                   ;; Requires the spiffy-cgi-handlers egg...
                   (parameterize ((file-extension-handlers
                                    `(("php" . ,(cgi-handler* "/usr/pkg/libexec/cgi-bin/php"))))
                                  ;; You can also change PHP's arg_separator.input
                                  ;; to be ";&" instead of this parameter
                                  (form-urlencoded-separator "&")
                                  (root-path "/var/www/domains/domain.com"))
                     (continue))))))

In this example, if a client accesses foo.bar.com/mumble/blah.html, the file /var/www/domains/foo.bar.com/mumble/blah.html will be served. Any files ending in .ssp or .ws will be served by the corresponding file type handler. If there's any PHP file, its source will simply be displayed.

In case of my.domain.com/something/bar.html, the file /var/www/domains/domain.com/something/bar.html will be served. If there's a .ssp or .ws file there, it will not be interpreted. Its source will be displayed instead. A .php file, on the other hand, will be passed via CGI to the program /usr/pkg/libexec/cgi-bin/php.

Domain names are mapped to a lambda that can set up any parameters it wants to override from the defaults. The host names are matched using irregex-match. If the host name is not yet a regexp, it will be converted to a case-insensitive regexp.

Access files

Fine-grained access-control can be implemented by using so-called access files. When a request for a specific file is made and a file with the name given in the access-file parameter exists in any directory between the root-path of that vhost and the directory in which the file resides, then the access file is loaded as an s-expression containing a function and is evaluated with a single argument, the function that should be called to continue processing the request.

This works just like vhosting. The function that gets called can call parameterize to set additional constraints on the code that handles deeper directories.

For example, if we evaluate (access-file ".access") before starting the server, and we put the following code in a file named .access into the root-directory, then all accesses to any file in the root-directory or any subdirectory will be denied unless the request comes from localhost:

 (lambda (continue)
   (if (string=? (remote-address) "127.0.0.1")
       (continue)
       (send-status 'forbidden "Sorry, you're not allowed here")))

If we only want to deny access to files that start with an X, put this in the .access file:

 (lambda (continue)
   (let ((old-handler (handle-file)))
     (parameterize ((handle-file
                      (lambda (path)
                        (if (string-prefix? "X" (pathname-file path))
                            (send-status 'forbidden "No X-files allowed!")
                            (old-handler path)))))
       (continue))))

Of course, access files can be used for much more than just access checks. One can put anything in them that could be put in vhost configuration or in top-level configuration.

They are very useful for making deployable web applications, so you can just drop a directory on your server which has its own configuration embedded in an access file in the root directory of the application, without having to edit the server's main configuration files.
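
As an illustration of such embedded configuration, an access file shipped with an application might look like the sketch below. The index file name and the message are hypothetical:

```scheme
;; Contents of a hypothetical .access file in the application's top directory
(lambda (continue)
  (parameterize ((index-files '("main.html"))
                 (handle-not-found
                   (lambda (path)
                     (send-status 'not-found
                                  "No such page in this application."))))
    ;; Everything below this directory is handled with the settings above
    (continue)))
```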

Procedures and macros

The following procedures and macros can be used in dynamic web programs, or dynamic server configuration:

[procedure] (with-headers new-headers thunk)

Call thunk with the header list new-headers. This parameterizes the current response to contain the new headers. The existing headers are extended with new-headers through intarweb's headers procedure.

[procedure] (write-logged-response)

This procedure simply writes current-response after calling handle-access-logging. Responses should always go through this procedure instead of directly using write-response from intarweb.

If you have a response body to write, you still need to remember to call finish-response-body (from intarweb) after doing so.

[procedure] (log-to log format . rest)

Write a printf-style format string to the specified log (one of access-log, error-log or debug-log). format is a printf-style format string, and rest arguments should match the arguments one would pass to printf. A newline is appended to the end of the log message automatically.

[procedure] (send-response #!key status code reason body headers)

Easy way to send string data to the client, with additional headers. It will add appropriate headers and will automatically detect HEAD requests. If BODY is #f, no body is sent and the content-length header is set to zero.

The status is a symbol describing the response status, which is looked up in intarweb's http-status-codes parameter. If code and/or reason are supplied, these take precedence. If status and code are missing, the default status is ok.
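
A minimal sketch of a dynamic response is shown below: a vhost handler that answers one path itself and lets Spiffy handle everything else. The "/ping" path and the response text are made up for the example:

```scheme
(use spiffy intarweb uri-common)

(vhost-map
  `((".*" . ,(lambda (continue)
               (let ((path (uri-path (request-uri (current-request)))))
                 (if (equal? path '(/ "ping"))
                     ;; Answer directly with a small plain-text body
                     (send-response status: 'ok
                                    body: "pong\n"
                                    headers: '((content-type text/plain)))
                     ;; Anything else goes through normal processing
                     (continue)))))))

(start-server)
```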

[procedure] (send-status code reason [message])
[procedure] (send-status status [message])

Easy way to send a page and a status code to the client. The optional message is a string containing HTML to add in the body of the response. Some structure will be added around the message, so message should only be the actual message you want to send.

This can be called either with a numeric code, string reason and optional message or with a symbolic status and optional message.

Example:

(send-status 404 "Not found"
 "Sorry, page not found! Please try <a href='/search.ws'>our search page</a>")

;; Alternative way of doing this:
(send-status 'not-found
 "Sorry, page not found! Please try <a href='/search.ws'>our search page</a>")

[procedure] (send-static-file filename)

Send a file to the client. This sets the content-length header and tries to send the file as quickly as possible to the client. The filename is interpreted relative to root-path.

[procedure] (file-extension->mime-type EXT)

Looks up the file extension EXT (without a leading dot) in mime-type-map, or uses default-mime-type when the extension can't be found.

If EXT is #f, it'll look up the extension that is the empty string.

This returns a symbol which indicates the mime-type which is matched to the extension (for example text/html).

[procedure] (restart-request request)

Restart the entire request-handling starting at the point where the request was just parsed. The argument is the new request to use. Be careful, this makes it very easy to introduce unwanted endless loops!
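
One way this could be used is an internal rewrite of a single retired path, sketched below. The file names are hypothetical; update-request comes from intarweb and update-uri from uri-common. The restart only happens for the old path, so a missing new page results in a normal 404 rather than a loop:

```scheme
(use spiffy intarweb uri-common)

(handle-not-found
  (lambda (path)
    (let* ((req (current-request))
           (uri (request-uri req)))
      (if (equal? (uri-path uri) '(/ "old-page.html"))
          ;; Re-run request handling as if /new-page.html had been requested
          (restart-request
            (update-request req
                            uri: (update-uri uri path: '(/ "new-page.html"))))
          (send-status 'not-found)))))
```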

[procedure] (htmlize string) => string

Encode "special" html symbols like tag and attribute characters so they will not be interpreted by the browser.

[procedure] (build-error-message exn chain [raw-output])

Build an error message for the exception exn, with call chain chain. Defaults to HTML output, unless raw-output is given and nonfalse.

Modules

This section will describe what the various modules that come with Spiffy are and how they work.

ssp-handler

This was moved to the spiffy-dynamic-handlers egg.

web-scheme-handler

This was moved to the spiffy-dynamic-handlers egg.

cgi-handler

This was moved to the spiffy-cgi-handlers egg.

simple-directory-handler

In order to get directory listings, you can use simple-directory-handler. Just parameterize handle-directory's value with the simple-directory-handler procedure and you're set.

This directory handler truly is very simple and limited in what you can customize. For a more flexible directory handler, see the spiffy-directory-listing egg.

Configuration

The simple directory handler has a few configuration options:

[parameter] (simple-directory-dotfiles? [dotfiles?])

Determines if dotfiles should show up in the directory listings. Default: #f

[parameter] (simple-directory-display-file [displayer])

A lambda that accepts three arguments: the remote filename, the local filename and a boolean that says if the file is a directory. This lambda should output a table row with the desired information. Defaults to a lambda that prints the name, size and date when the file was last modified.

Procedures

The simple-directory handler adds only one procedure to the environment:

[procedure] (simple-directory-handler pathname)

The handler itself, which should be used in the handle-directory parameter.
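
Putting this together, enabling directory listings might look like the following sketch. It assumes the handler lives in a module named simple-directory-handler that is installed along with Spiffy:

```scheme
(use spiffy simple-directory-handler)

;; Show a listing instead of "403 forbidden" for directories without an index file
(handle-directory simple-directory-handler)

;; The documented default, repeated here only to show where it is set
(simple-directory-dotfiles? #f)

(start-server)
```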

Examples

Quick config for serving up a docroot

Spiffy is very easy to use for simple cases:

(use spiffy)

(server-port 80)
(root-path "/var/www")
;; When dropping privileges, switch to this user
(spiffy-user "httpd")
(spiffy-group "httpd")
(start-server)

One could also use parameterize (according to taste):

(use spiffy)

(parameterize ((server-port 80)
               (spiffy-user "httpd")
               (spiffy-group "httpd")
               (root-path "/var/www"))
  (start-server))

If you put this in /usr/local/libexec/spiffy.scm you can use this example init.d script or these example systemd scripts as-is, to launch Spiffy.

Network tweaks

Spiffy does not activate Chicken's TCP buffering, which results in extra traffic: one packet sent per header line. With a TCP buffer size greater than the total header length, all headers will be coalesced into a single write; generally the response body will be coalesced as well. For example:

(tcp-buffer-size 2048)   ; from unit tcp
(start-server)

Redirecting to another domain

When you have a domain you want to canonicalize so that it will always have www in front of it, you can set up a conditional redirect in your vhosts section:

(use spiffy intarweb uri-common)

(vhost-map `(("example.com"
               . ,(lambda (continue)
                    (let* ((old-u (request-uri (current-request)))
                           (new-u (update-uri old-u host: "www.example.com")))
                      (with-headers `((location ,new-u))
                        (lambda () (send-status 'moved-permanently))))))
             ("www.example.com" . ,(lambda (continue) (continue)))))

(start-server)

Alternatively the following version can be used to generate appropriate handlers for several domains:

(use spiffy intarweb uri-common)

;; Generates a handler that can be used in vhost-map that will cause all
;; requests to that URL to be rewritten to the domain specified in 'to'.
(define (canonicalise-domain to)
  (let ((to (uri-reference to)))
    (assert (equal? '(/ "") (uri-path to)))
    (assert (null? (uri-query to)))
    ;; We don't see fragments on the server and choose not to care about
    ;; usernames and password fields.
    (lambda (continue)
      (let* ((old-u (request-uri (current-request)))
             (new-u (update-uri old-u
                                scheme: (or (uri-scheme to) (uri-scheme old-u))
                                port:   (or (uri-port   to) (uri-port   old-u))
                                host:   (or (uri-host   to) (uri-host   old-u)))))
        (with-headers `((location ,new-u))
          (lambda () (send-status 'moved-permanently)))))))

(vhost-map `(("example.com"
               . ,(canonicalise-domain "http://www.example.com"))
             ("www.example.com" . ,(lambda (continue) (continue)))))

The above code hooks a handler for the regular expression "example.com", checks the request's URI and then updates only the host component, keeping all other components (like port and path) intact. Then it sends a simple status response that indicates a 301 "Moved Permanently" response, using the new URI in the Location: header.

The second entry just tells Spiffy to continue serving the request when it's made to www.example.com. A request on any other host will receive a 404 not found response due to not having a vhost entry. More elaborate setups can parameterize some other aspects for each host before calling continue.

Changelog

License

 Copyright (c) 2005-2013, Felix L. Winkelmann and Peter Bex
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are
 met:
 
 Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 
 Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 
 Neither the name of the author nor the names of its contributors may
 be used to endorse or promote products derived from this software
 without specific prior written permission.
 
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 OF THE POSSIBILITY OF SUCH DAMAGE.