You are looking at historical revision 15156 of this page. It may differ significantly from its current revision.
Intarweb is an advanced HTTP library. It parses all headers into more useful Scheme values.
The intarweb egg is designed to be used in a variety of situations. For this reason, it does not try to be a full HTTP client or server. If you need that kind of functionality, see eggs like spiffy or http-client.
A request object (a defstruct-type record) can be created using the following procedure:
[procedure] (make-request #!key uri port (method 'GET) (major 1) (minor 1) (headers (make-headers '())))
An existing request can be picked apart using the following procedures:
- <procedure>(request-uri REQUEST) => URI</procedure>
- <procedure>(request-port REQUEST) => PORT</procedure>
- <procedure>(request-method REQUEST) => SYMBOL</procedure>
- <procedure>(request-major REQUEST) => NUMBER</procedure>
- <procedure>(request-minor REQUEST) => NUMBER</procedure>
- <procedure>(request-headers REQUEST) => HEADERS</procedure>
The uri identifies the entity to retrieve on the server; it should be a uri-common-type URI object. The port is the Scheme I/O port to which the request is written or from which it is read. The method is a symbol that names the HTTP method to use (case sensitive). major and minor identify the major and minor version of HTTP to use. Currently, 0.9, 1.0 and 1.1 are supported (but be careful with 0.9: it has some weird consequences and is not widely supported). headers must be a headers object, which is described below.
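As an illustration of the constructor and accessors described above, here is a minimal sketch. It assumes CHICKEN 4 style loading with `use`, that `uri-reference` from the uri-common egg is used to build the URI, and that the `headers` procedure (described further down) is used to build the headers object. The host name and path are placeholders.

```scheme
;; CHICKEN 4 style loading; on CHICKEN 5 this would be (import intarweb uri-common).
(use intarweb uri-common)

;; example.com and /index.html are placeholders, not part of the egg.
(define req
  (make-request uri: (uri-reference "http://example.com/index.html")
                port: (current-output-port)
                method: 'GET
                headers: (headers '((host ("example.com" . 80))))))

(request-method req)  ; the symbol GET
(request-major req)   ; 1, the documented default
(request-minor req)   ; 1, the documented default
(request-uri req)     ; the uri-common object passed in above
```

Nothing is written to the port at this point; the record only describes the request.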
The client will generally write requests, while the server will read them. To write a request, use the following procedure:
[procedure] (write-request REQUEST) => REQUEST
This will write a request line with headers to the server. If the request is of a type that has body data, that data should be written to the request's port. Beware that this port can be modified by write-request, so be sure to write to the port of the request object returned by write-request!
[procedure] (read-request PORT) => REQUEST
Reads a request object from the given input-port. An optional request body can be read from the request-port after calling this procedure.
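The write and read sides can be tried together without a network connection. The following is a sketch only: it assumes string ports are acceptable as request ports and that the request line produced by write-request is one that read-request accepts. The URI is a placeholder.

```scheme
(use intarweb uri-common)

;; Client side: write the request line and headers into a string port.
(define out (open-output-string))

(define sent
  (write-request
   (make-request uri: (uri-reference "http://example.com/")
                 port: out
                 method: 'GET
                 headers: (headers '((host ("example.com" . 80)))))))

;; Body data, if any, would be written to (request-port sent) rather than
;; to `out` directly, because write-request may have replaced the port.

;; Server side: parse what was written.
(define received
  (read-request (open-input-string (get-output-string out))))

(request-method received)
```

Writing to the port of the returned request, rather than the original port, follows the warning above.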
Requests are parsed using parse procedures, which can be customized by overriding this parameter:
[parameter] (request-parsers [LIST])
The value is a list of procedures. Each accepts a request line string and returns a request object, or #f if the request is not of the type handled by that procedure.
The predefined request parsers are the following:
- <procedure>(http-0.9-request-parser STRING) => REQUEST</procedure>
- <procedure>(http-1.x-request-parser STRING) => REQUEST</procedure>
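Because request-parsers is an ordinary parameter, the set of accepted request types can be narrowed for a dynamic extent. The sketch below accepts only HTTP/1.x request lines; `read-http-1.x-request` and `in-port` are hypothetical names, and what read-request does when no parser accepts the line is not specified on this page.

```scheme
(use intarweb)

;; in-port is any input port carrying a request (hypothetical argument).
(define (read-http-1.x-request in-port)
  ;; Only the HTTP/1.x parser is consulted while read-request runs.
  (parameterize ((request-parsers (list http-1.x-request-parser)))
    (read-request in-port)))
```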
Requests are written using unparse procedures, which can be customized by overriding this parameter:
[parameter] (request-unparsers [LIST])
The value is a list of procedures. Each accepts a request object, writes it to the request's output port and returns the new, possibly updated request object. If the handler does not unparse the request object, it returns #f.
The predefined request unparsers are the following:
- <procedure>(http-0.9-request-unparser REQUEST) => REQUEST</procedure>
- <procedure>(http-1.x-request-unparser REQUEST) => REQUEST</procedure>
They return the request, and as a side effect they write the request to the request object's port.
A response is also a defstruct-type record, much like a request:
[procedure] (make-response #!key port (code 200) (reason "OK") (major 1) (minor 1) (headers (make-headers '())))
An existing response can be picked apart using the following procedures:
- <procedure>(response-port RESPONSE) => PORT</procedure>
- <procedure>(response-code RESPONSE) => NUMBER</procedure>
- <procedure>(response-reason RESPONSE) => STRING</procedure>
- <procedure>(response-class RESPONSE-OR-CODE) => NUMBER</procedure>
- <procedure>(response-major RESPONSE) => NUMBER</procedure>
- <procedure>(response-minor RESPONSE) => NUMBER</procedure>
- <procedure>(response-headers RESPONSE) => HEADERS</procedure>
The port, major, minor and headers are the same as for requests. code and reason are an integer status code and the short message that belongs to it, as defined in the spec (examples include: 200 OK, 301 Moved Permanently, etc). class is the major class of the response code (100, 200, 300, 400 or 500). response-class can be called either on a response object or directly on a response code number.
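A short sketch of the response accessors, assuming the port may be left out when the response is only inspected and not written. The expected values in the comments follow from the description above rather than from a test run.

```scheme
(use intarweb)

(define not-found
  (make-response code: 404 reason: "Not Found"))

(response-code not-found)    ; 404
(response-reason not-found)  ; "Not Found"
(response-class not-found)   ; 400, going by the description above
(response-class 301)         ; 300, called directly on a status code
```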
A server will usually write a response, a client will read it. To write a response, use the following procedure:
[procedure] (write-response RESPONSE) => RESPONSE
If there is a response body, this must be written to the response-port after sending the response headers.
[procedure] (read-response PORT) => RESPONSE
Reads a response object from the port. An optional response body can be read from the response-port after calling this procedure.
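Writing a response with a body might look like the following sketch. It assumes the `headers` procedure described further down and uses the current output port as a stand-in for a client connection. The body is plain ASCII, so its string length equals its byte count.

```scheme
(use intarweb)

(define body "Hello, world!\n")

(define res
  (write-response
   (make-response port: (current-output-port)
                  code: 200
                  reason: "OK"
                  headers: (headers `((content-type text/plain)
                                      (content-length ,(string-length body)))))))

;; The body goes to the port of the response returned by write-response,
;; after the status line and headers have been sent.
(display body (response-port res))
```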
Responses are parsed using parse procedures, which can be customized by overriding this parameter:
[parameter] (response-parsers [LIST])
The value is a list of procedures. Each accepts a response line string and returns a response object, or #f if the response is not of the type handled by that procedure.
The predefined response parsers are the following:
- <procedure>(http-0.9-response-parser STRING) => RESPONSE</procedure>
- <procedure>(http-1.x-response-parser STRING) => RESPONSE</procedure>
Responses are written using unparse procedures, which can be customized by overriding this parameter:
[parameter] (response-unparsers [LIST])
The value is a list of procedures. Each accepts a response object, writes it to the response's output port and returns the new, possibly updated response object. If the handler does not unparse the response object, it returns #f.
The predefined response unparsers are the following:
- <procedure>(http-0.9-response-unparser RESPONSE) => RESPONSE</procedure>
- <procedure>(http-1.x-response-unparser RESPONSE) => RESPONSE</procedure>
Requests and responses contain HTTP headers wrapped in a special header-object to ensure they are properly normalized.
[procedure] (headers ALIST [HEADERS]) => HEADERS
This creates a header object based on an input list.
[procedure] (headers->list HEADERS) => ALIST
This converts the header object back to a list.
The above mentioned lists have header names (symbols) as keys, and lists of values as values:
(headers `((host ("example.com" . 8080)) (accept #(text/html ((q . 0.5))) #(text/xml ((q . 0.1))))) old-headers)
This adds the named headers to the existing headers in old-headers. The host header is either a string with the hostname or a pair of hostname/port. The accept header is a list of allowed mime-type symbols. As can be seen here, optional parameters or "attributes" can be added to a header value by wrapping the value in a vector of length 2. The first entry in the vector is the header value, the second is an alist of attribute name/value pairs.
To obtain the value of any particular header, you can use
[procedure] (header-values NAME HEADERS) => LIST
The name of the header is a symbol, and it will return all the values of the header (for example, the Accept header will have several values that indicate the set of acceptable mime-types).
If you know in advance that a header has only one value, you can use:
[procedure] (header-value NAME HEADERS [DEFAULT]) => value
This will return the first value in the list, or the provided default if there is no value for that header.
- <procedure>(header-params NAME HEADERS) => ALIST</procedure>
This will return all the params for a given header, assuming there is only one header. An empty list is returned if the header does not exist.
- <procedure>(header-param NAME PARAM HEADERS [DEFAULT]) => value</procedure>
This will return a specific parameter for the header, or the DEFAULT if the parameter isn't present or the header does not exist. This also assumes there's only one header.
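The shortcut accessors can be exercised on a small headers object. This is a sketch based on the descriptions above. The header values are made up, and the results in the comments are what those descriptions imply, not captured output.

```scheme
(use intarweb)

(define hdrs
  (headers '((content-type #(text/html ((charset . utf-8))))
             (content-length 10)
             (accept text/html #(text/xml ((q . 0.1)))))))

(header-value 'content-length hdrs)   ; 10
(header-value 'content-type hdrs)     ; text/html
(header-params 'content-type hdrs)    ; ((charset . utf-8))
(header-value 'etag hdrs 'none)       ; none, the supplied default
(header-values 'accept hdrs)          ; all values of the accept header
```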
The procedures mentioned above are just shortcuts; the underlying procedures to query the raw contents of a header are these:
- <procedure>(header-contents NAME HEADERS) => VECTOR</procedure>
- <procedure>(get-value VECTOR) => value</procedure>
- <procedure>(get-params VECTOR) => ALIST</procedure>
- <procedure>(get-param PARAM VECTOR [DEFAULT]) => value</procedure>
Header contents are 2-element vectors; the first value containing the value for the header and the second value containing an alist with "parameters" for that header value. Parameters are attribute/value pairs that define further specialization of a header's value. For example, the accept header consists of a list of mime-types, which optionally can have a quality parameter that defines the preference for that mime-type. All parameter names are downcased symbols, just like header names.
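Since header contents are plain 2-element vectors, the low-level accessors can be tried on a hand-built one. This sketch constructs the vector directly instead of fetching it with header-contents. The `level` parameter name is made up to show the default.

```scheme
(use intarweb)

;; A value with one parameter, in the layout described above:
;; element 0 is the value, element 1 is the parameter alist.
(define contents (vector 'text/xml '((q . 0.1))))

(get-value contents)           ; text/xml
(get-params contents)          ; ((q . 0.1))
(get-param 'q contents)        ; 0.1
(get-param 'level contents 1)  ; 1, the supplied default
```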
The headers all have their own different types. Here follows a list of headers with their value types:
|Header name||Value type||Example value|
|accept||List of mime-types (symbols), with optional q attribute indicating "quality" (preference level)||(text/html #(text/xml ((q . 0.1))))|
|accept-charset||List of charset-names (symbols), with optional q attribute||(utf-8 #(iso-8859-5 ((q . 0.1))))|
|accept-encoding||List of encoding-names (symbols), with optional q attribute||(gzip #(identity ((q . 0))))|
|accept-language||List of language-names (symbols), with optional q attribute||(en-gb #(nl ((q . 0.5))))|
|accept-ranges||List of range types acceptable (symbols). The spec only defines bytes and none.||(bytes)|
|age||Age in seconds (number)||(3600)|
|allow||List of methods that are allowed (symbols).||(GET POST PUT DELETE)|
|authorization||Authorization information. This consists of a symbol identifying the authentication scheme, with scheme-specific attributes. basic is handled specially, as if it were a regular symbol with two attributes: username and password.||(#(basic ((username . "foo") (password . "bar"))))|
|cache-control||An alist of key/value pairs. If no value is applicable, it is #t||((public . #t) (max-stale . 10) (no-cache . (age set-cookie)))|
|connection||A list of connection options (symbols)||(close)|
|content-encoding||A list of encodings (symbols) applied to the entity-body.||(deflate gzip)|
|content-language||The natural language(s) of the "intended audience" (symbols)||(de nl en-gb)|
|content-length||The number of bytes (an exact number) in the entity-body||(10)|
|content-location||A location that the content can be retrieved from (a uri-common object)||(<#uri-common# ...>)|
|content-md5||The MD5 checksum (a string) of the entity-body||("12345ABCDEF")|
|content-range||Content range (pair with start- and endpoint) of the entity-body, if partially sent||((25 . 120))|
|content-type||The mime type of the entity-body (a symbol)||(text/html)|
|date||A timestamp (10-element vector, see string->time) at which the message originated||(#(42 23 15 20 6 108 0 309 #f 0))|
|etag||An entity-tag (pair, car being either the symbol weak or strong, cdr being a symbol) that uniquely identifies the resource contents.||((strong . foo123))|
|expect||Expectations of the server's behaviour (alist of symbol-string pairs), possibly with parameters.||(#(((100-continue . #t)) ()))|
|expires||Expiry timestamp (10-element vector, see string->time) for the entity||(#(42 23 15 20 6 108 0 309 #f 0))|
|from||The e-mail address (a string) of the human user who controls the client||("firstname.lastname@example.org")|
|host||The host to use (for virtual hosting). This is a pair of hostname and port||(("example.com" . 80))|
|if-match||Either '* (a wildcard symbol) or a list of entity-tags (pair, weak/strong symbol and unique entity identifier symbol).||((strong . foo123) (strong . bar123))|
|if-modified-since||Timestamp (10-element vector, see string->time) which indicates since when the entity must have been modified.||(#(42 23 15 20 6 108 0 309 #f 0))|
|if-none-match||Either '* (a wildcard symbol) or a list of entity-tags (pair, weak/strong symbol and unique entity identifier symbol).||((strong . foo123) (strong . bar123))|
|if-range||The range to request, if the entity was unchanged||TODO|
|if-unmodified-since||A timestamp (10-element vector, see string->time) since which the entity must not have been modified||(#(42 23 15 20 6 108 0 309 #f 0))|
|last-modified||A timestamp (10-element vector, see string->time) when the entity was last modified||(#(42 23 15 20 6 108 0 309 #f 0))|
|location||A location (a URI object) to which to redirect||(<#uri-object ...>)|
|max-forwards||The maximum number of proxies that can forward a request||(2)|
|pragma||An alist of symbols containing implementation-specific directives.||((no-cache . #t) (my-extension . my-value))|
|proxy-authenticate||Proxy authentication request. Equivalent to www-authenticate, for proxies.||(#(digest ((username . "foo"))))|
|proxy-authorization||The answer to a proxy-authentication request. Equivalent to authorization, for proxies.||(#(basic ((username . "foo") (password . "bar"))))|
|range||The range of bytes (a pair of start and end) to request from the server.||((25 . 120))|
|referer||The referring URL (uri-common object) that linked to this one.||(<#uri-object ...>)|
|retry-after||Timestamp (10-element vector, see string->time) after which to retry the request if unavailable now.||(#(42 23 15 20 6 108 0 309 #f 0))|
|server||List of products the server uses (list of 3-tuple lists of strings; product name, product version, comment. Version and/or comment may be #f). Note that this is a single header, with a list inside it!||((("Apache" "2.2.9" "Unix") ("mod_ssl" "2.2.9" #f) ("OpenSSL" "0.9.8e" #f) ("DAV" "2" #f) ("mod_fastcgi" "2.4.2" #f) ("mod_apreq2-20051231" "2.6.0" #f)))|
|te||Allowed transfer-encodings (symbols, with optional q attribute) for the response||(deflate #(gzip ((q . 0.2))))|
|trailer||Names of header fields (symbols) available in the trailer/after body||(range etag)|
|transfer-encoding||The encodings (symbols) used in the body||(chunked)|
|upgrade||Product names to which must be upgraded (strings)||TODO|
|user-agent||List of products the user agent uses (list of 3-tuple lists of strings; product name, product version, comment. Version and/or comment may be #f). Note that this is a single header, with a list inside it!||((("Mozilla" "5.0" "X11; U; NetBSD amd64; en-US; rv:18.104.22.168") ("Gecko" "2008110501" #f) ("Minefield" "3.0.3" #f)))|
|vary||The names of headers that define variation in the resource body, to determine cachability (symbols)||(range etag)|
|via||The intermediate hops through which the message is forwarded (strings)||TODO|
|warning||Warning code for special status||TODO|
|www-authenticate||If unauthorized, a challenge to authenticate (symbol, with attributes)||(#(digest ((username . "foo"))))|
|set-cookie||Cookies to set (name (symbol)/value (string) pair, with attributes)||(#((foo . "bar") ((max-age . 10) (port . '(80 8080)))))|
|cookie||Cookies that were set (name/value string pair, with attributes)||(#((foo . "bar") ((version . 1) (path . "/") (domain . "foo.com"))))|
Any unrecognised headers are assumed to be multi-headers, and the entire header lines are put unparsed into a list, one entry per line.
Header parsers and unparsers
The parsers and unparsers used to read and write header values can be customized with the following parameters:
- <parameter>(header-parsers [ALIST])</parameter>
- <parameter>(header-unparsers [ALIST])</parameter>
These parameters hold alists in which each key is a header name (a symbol) and each value is the (un)parser procedure for that header.
A header parser accepts the contents of the header (a string, without the leading header name and colon) and returns a list of vectors which represents the values of the header. For headers that are supposed to only have a single value, the last value in the list will be stored as the value (as determined by single-headers).
A header unparser accepts two arguments: the name of the header (a symbol) and the header's contents (a vector). It should return a string which represents the header contents (without the header name).
The parser driver will call update-header-contents! with the parser's result.
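A custom parser can be registered for the duration of a read by extending the alist. The sketch below is built on assumptions: `x-request-id` is a made-up header, `request-id-parser` and `read-request-with-id` are hypothetical names, and the alist entries are assumed to be (NAME . PROCEDURE) pairs.

```scheme
(use intarweb)

;; Hypothetical parser: keeps the raw header string as the value and
;; attaches no parameters. It returns a list of 2-element vectors, as
;; described above.
(define (request-id-parser contents)
  (list (vector contents '())))

;; in-port is any input port carrying a request (hypothetical argument).
(define (read-request-with-id in-port)
  ;; Assumption: entries are (NAME . PROCEDURE) pairs.
  (parameterize ((header-parsers
                  (cons (cons 'x-request-id request-id-parser)
                        (header-parsers))))
    (read-request in-port)))
```

Consing onto the existing alist keeps all predefined parsers available.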
When there's an error parsing a given header, the following parameter's procedure will be invoked:
[parameter] (header-parse-error-handler [HANDLER])
HANDLER is a procedure accepting four values: the header name, the header contents, the current headers and the exception object. The procedure must return the new headers. Defaults to a procedure that simply returns the current headers. When an error occurs while parsing the header line itself (for example when a colon is missing between the header name and contents), the error will not be caught.
In such a case, servers should return a 400 Bad Request error and clients should error out. The reason that errors in header contents are ignored is that several servers and clients send header content values that are slightly off, even though the rest of the request is OK. In the interest of the "robustness principle", it's best to simply ignore these headers with "bad" content values.
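A handler that logs the offending header and then behaves like the documented default could be sketched as follows. The name `logging-parse-error-handler` is hypothetical.

```scheme
(use intarweb)

;; Accepts the four documented values and must return the new headers.
;; Here the headers are returned unchanged, which drops the bad header.
(define (logging-parse-error-handler name contents headers exn)
  (display "ignoring malformed header: " (current-error-port))
  (write name (current-error-port))
  (newline (current-error-port))
  headers)

;; Install the handler globally by calling the parameter with a value.
(header-parse-error-handler logging-parse-error-handler)
```

Using `parameterize` instead of the final call would limit the handler to a dynamic extent.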
- <procedure>(replace-header-contents NAME CONTENTS HEADERS) => HEADERS</procedure>
- <procedure>(replace-header-contents! NAME CONTENTS HEADERS) => HEADERS</procedure>
- <procedure>(update-header-contents NAME CONTENTS HEADERS) => HEADERS</procedure>
- <procedure>(update-header-contents! NAME CONTENTS HEADERS) => HEADERS</procedure>
The replace procedures replace any existing contents of the named header with new ones; the update procedures add these contents to the existing header. The procedures with a name ending in a bang are linear update variants of the ones without the bang. The header contents have to be normalized to be a 2-element vector, with the first element being the actual value and the second element being an alist (possibly empty) of parameters/attributes for that value.
The update procedures append the value to the existing header if it is a multi-header, and act as a simple replace in the case of a single-header.
Whether a header is allowed once or multiple times in a request or response is determined by this parameter:
[parameter] (single-headers [LIST])
The value is a list of symbols that define header-names which are allowed to occur only once in a request/response.
- <procedure>(http-name->symbol-name STRING) => SYMBOL</procedure>
- <procedure>(symbol->http-name SYMBOL) => STRING</procedure>
These procedures convert strings containing the name of a header or attribute (parameter name) to symbols representing the same. The symbols are completely downcased. When converting this symbol back to a string, the initial letters of all the words in the header name or attribute are capitalized.
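For the symbol-to-string direction, the capitalization rule described above suggests the following results. This is a sketch; the strings in the comments are derived from that description, not from a test run.

```scheme
(use intarweb)

(symbol->http-name 'content-type)       ; "Content-Type"
(symbol->http-name 'if-modified-since)  ; "If-Modified-Since"
```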
- <procedure>(remove-header name headers) => headers</procedure>
- <procedure>(remove-header! name headers) => headers</procedure>
These two procedures remove all headers with the given name.
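Removing a header and checking that it is gone might look like this. It is a sketch in which the header values are placeholders and `'absent` is just a marker passed as the default.

```scheme
(use intarweb)

(define hdrs
  (headers '((content-length 10)
             (content-type text/html))))

(define stripped (remove-header 'content-length hdrs))

(header-value 'content-length stripped 'absent)  ; absent
(header-value 'content-type stripped)            ; text/html
```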
Other procedures
[procedure] (keep-alive? request-or-response)
Returns #t when the given request or response object belongs to a connection that should be kept alive, #f if not. Remember that both parties must agree on whether the connection is to be kept alive or not; HTTP/1.1 defaults to keep alive unless a Connection: close header is sent, HTTP/1.0 defaults to closing the connection, unless a Connection: Keep-Alive header is sent.
[procedure] (safe? request-or-method)
Returns #t when the given request object or symbol (method) is a safe method. A method is defined to be safe when a request of this method will have no side-effects on the server. In practice this means that you can send this request from anywhere at any time and cause no damage.
Important: Quite a lot of software does not abide by these rules! This is not necessarily a reason to treat all methods as unsafe, however. In the words of the standard, "the user did not request the side-effects, so therefore cannot be held accountable for them". If a safe method produces side-effects, that's the server-side script developer's fault, and the developer should fix the code.
[parameter] (safe-methods [symbols])
A list of methods which are to be considered safe. Defaults to '(GET HEAD OPTIONS TRACE).
[procedure] (idempotent? request-or-method)
Returns #t when the given request object or symbol (method) is an idempotent method. A method is defined to be idempotent when a series of identical requests of this method in succession causes the exact same side-effect as just one such request. In practice this means that you can safely retry such a request when an error occurs, for example.
Important: Just as with the safe methods, there is no guarantee that methods that should be idempotent really are idempotent in any given web application. Furthermore, a sequence of requests which each are individually idempotent is not necessarily idempotent as a whole. This means that you cannot replay requests starting anywhere in the chain. To be on the safe side, only retry the last request in the chain.
[parameter] (idempotent-methods [symbols])
A list of methods which are to be considered idempotent. Defaults to '(GET HEAD PUT DELETE OPTIONS TRACE).
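The predicates can be called directly on method symbols, and the method lists can be overridden like any other parameter. This is a sketch assuming the default lists given above; the results in the comments follow from those defaults.

```scheme
(use intarweb)

(safe? 'GET)         ; true with the default safe-methods
(safe? 'POST)        ; false
(idempotent? 'PUT)   ; true with the default idempotent-methods
(idempotent? 'POST)  ; false

;; Treat only GET and HEAD as safe within this dynamic extent.
(define options-safe-when-restricted
  (parameterize ((safe-methods '(GET HEAD)))
    (safe? 'OPTIONS)))  ; false here, true with the defaults
```

A client could use `idempotent?` to decide whether a failed request may be retried, keeping in mind the caveats above.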
- 0.1 Initial version
Copyright (c) 2008-2009, Peter Bex
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.