http-client

  1. http-client
    1. Description
    2. Author
    3. Requirements
    4. Documentation
      1. Main request procedures
        1. Examples
      2. Request handling parameters
      3. Connection management
      4. Cookie management
      5. Authentication support
      6. Proxy support
    5. Changelog
    6. License

Description

http-client is a high-level HTTP client library.

Author

Peter Bex

Requirements

Requires the intarweb, sendfile and md5 extensions.

The openssl extension is optional as of 0.7; if it's not installed you'll get an error when trying to access an HTTPS URI.

Documentation

Main request procedures

[procedure] (call-with-response request writer reader)

This is the core http-client procedure. It is only necessary to use this when you want the most control over the request/response cycle. request is the request object that contains information about the request to perform. writer is a procedure that receives the request object and should write the request body. reader is a procedure that receives the response object and should read the entire response body; any leftover data will cause errors on subsequent requests with keepalive connections.

The writer should be prepared to be called several times; if the response is a redirect or some other status that indicates the server wants the client to perform a new request, the writer should be ready to write a request body for this new request. In case digest authentication with message integrity checking is used, writer is always invoked at least twice: once to determine the message digest of the request body and once to actually write it.

Returns three values: the result of the call to reader (or #f if there is no message body in the response), the request-uri of the last request and the response object. The request-uri is useful because it is to be used as the base URI of the document; it can differ from the URI of the initial request in the presence of redirects.

If there is no response body to read (as determined by intarweb's response-has-message-body-for-request?), the reader procedure is not invoked at all.

If successive requests cause more than max-redirect-depth redirect responses to occur, a condition of type (exn http redirect-depth-exceeded) is raised.

If the request's URI or the URI of a used proxy is of an unsupported type, a condition of type (exn http unsupported-uri-scheme) is raised. This can also occur when the initial URI is correct but the server redirects to a URI with an unsupported scheme.

When the request requires authentication of an unsupported type, a condition of type (exn http unknown-authtype) is raised.

[procedure] (call-with-input-request uri-or-request writer reader)

This procedure is a convenience wrapper around call-with-response.

It is much less strict: uri-or-request can be an intarweb request object, a uri-common object, or even a string containing the URI. In the latter two cases a request object is automatically constructed around the URI, using the GET method when writer is #f or the POST method when writer is not #f.

writer can be either #f (in which case nothing is written and the GET method chosen), a string containing the raw data to send, an alist, or a procedure that accepts a port and writes the request data to it. If you supply a procedure, do not forget to set the content-length header! In the other cases, whenever possible, the length is calculated and the header automatically set for you.
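
As an illustration of the procedure form of writer, the following sketch sets content-length by hand before writing the body. put-text is a hypothetical helper name, and the sketch assumes that string-length gives the size of the body in bytes, which holds for plain ASCII text.

```scheme
(use http-client intarweb uri-common)

;; put-text is a hypothetical helper, not part of http-client.
;; It sends BODY with the PUT method and returns the response body.
(define (put-text uri-string body)
  (call-with-input-request
   (make-request method: 'PUT
                 uri: (uri-reference uri-string)
                 ;; Assumption: string-length equals the byte count,
                 ;; which is true for single-byte (ASCII) text.
                 headers: (headers
                           `((content-length ,(string-length body)))))
   (lambda (port) (display body port))      ; writer: receives the request port
   (lambda (port) (read-string #f port))))  ; reader: receives the response port
```

It could be called as (put-text "http://example.com/blabla" "Page contents"), where the URI is only a placeholder.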

If you supplied an alist, the content-type header is automatically set to application/x-www-form-urlencoded unless there's an alist entry whose value is a list starting with the keyword file:, in which case multipart/form-data is used. See the examples for with-input-from-request below. If the data cannot be form-encoded, a condition of type (exn http formdata-error) is raised.

reader is either #f or a procedure which accepts a port and reads out the data. If there is data left in the port when the reader returns (or #f was supplied), this will be automatically discarded to avoid problems.

Returns three values: The result of the call to reader (or #f if there is no message body in the response), the request-uri of the last request and the response object. If the response code is not in the 200 class, it will raise a condition of type (exn http client-error), (exn http server-error) or (exn http unexpected-server-response), depending on the response code. This includes 404 not found (which is a client-error).
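
These conditions can be caught with condition-case. The following is a minimal sketch; fetch-or-false is a hypothetical helper name and the choice to return #f on errors is only for illustration.

```scheme
(use http-client)

;; fetch-or-false is a hypothetical helper, not part of http-client.
;; It returns the response body as a string, or #f when the server
;; answers with a client-error or server-error status.
(define (fetch-or-false uri)
  (condition-case
      (receive (body final-uri response)
          (call-with-input-request uri #f
                                   (lambda (port) (read-string #f port)))
        body)                          ; keep only the first of three values
    ((exn http client-error) #f)       ; for example 404 Not Found
    ((exn http server-error) #f)))
```

Conditions of type (exn http unexpected-server-response) are deliberately not caught here, so they still propagate to the caller.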

If there is no response body to read (as determined by intarweb's response-has-message-body-for-request?), the reader procedure is not invoked at all.

When posting multipart form data, the value of a file entry is a list of keyword-value pairs. The following keywords are recognised:

file:
    This indicates the file to read from. Can be either a string or a port. This must be specified, everything else is optional.
filename:
    This indicates the filename to pass on to the server. If not specified or #f, the file:'s string (or port-name in case of a port) will be used.
headers:
    Additional headers to send for this entry (an intarweb headers-object).

[procedure] (with-input-from-request uri-or-request writer-thunk reader-thunk)

Same as call-with-input-request, except when you pass a procedure as reader-thunk or writer-thunk it has to be a thunk (lambda of no arguments) instead of a procedure of one argument. These thunks will be executed with the current input (or output) port to the request or response port, respectively.

You can still pass #f for both or an alist or string for writer-thunk.

Examples
(use http-client)

;; Start with a simple GET request:
(with-input-from-request "http://wiki.call-cc.org/" #f read-string)
 => ;; [the chicken wiki page HTML contents]

;; Perform a POST of the key "test" with value "value" to an echo service:
(with-input-from-request "http://localhost/echo-service"
                         '((test . "value")) read-string)
 => "You posted: test=value"

;; Posting a file to the same echo service:
(with-input-from-request "http://localhost/echo-service"
                         '((test . "value")
                           (test-file file: "/tmp/myfile" filename: "hello.txt"
                                      headers: ((content-type text/plain))))
                         read-string)
 => "You posted: test=value and a file named \"hello.txt\""


;; Performing a PUT request (a less commonly used method) requires
;; constructing your request object manually:

(use intarweb uri-common)  ; Required for "make-request" and "uri-reference"

(with-input-from-request
  (make-request method: 'PUT
                uri: (uri-reference "http://example.com/blabla"))
  (lambda () (print "Page contents"))
  read-string)

Request handling parameters

[parameter] (max-retry-attempts [number])

When a request fails because of an I/O or network problem (or simply because the remote end closed a persistent connection while we were doing something else), the library will try to establish a new connection and perform the request again. This parameter controls how many times this is allowed to be done. If #f, it will never give up.

Defaults to 1.

[parameter] (retry-request? [predicate])

This procedure is invoked when a retry should take place, to determine if it should take place at all. It should be a procedure accepting a request object and returning #f or a true value. If the value is true, the new request will be sent. Otherwise, the error that caused the retry attempt will be re-raised.

Defaults to idempotent?, from intarweb. This is because non-idempotent requests cannot be safely retried when it is unknown whether the previous request reached the server or not.
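
The two retry parameters can be combined. The sketch below raises the retry limit and narrows the retry predicate to GET and HEAD requests; the chosen values are arbitrary examples, not recommendations from the library.

```scheme
(use http-client intarweb)

;; Allow up to three new connection attempts instead of the default of one.
(max-retry-attempts 3)

;; Only retry GET and HEAD requests. For any other method the predicate
;; returns #f, so the error that triggered the retry is re-raised.
(retry-request?
 (lambda (request)
   (memq (request-method request) '(GET HEAD))))
```

Because both are parameters, parameterize can be used instead to limit the change to a single dynamic extent.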

[parameter] (max-redirect-depth [number])

The maximum number of allowed redirects, or #f if there is no limit. Currently there's no automatic redirect loop detection algorithm implemented. If zero, no redirects will be followed at all.

Defaults to 5.

When the redirect limit is reached, call-with-response raises a condition of type (exn http redirect-depth-exceeded).
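
A sketch of limiting redirects for a single request and handling the resulting condition follows. fetch-with-redirect-limit is a hypothetical helper name.

```scheme
(use http-client)

;; fetch-with-redirect-limit is a hypothetical helper, not part of
;; http-client. It returns the response body, or #f when more than
;; LIMIT redirects were encountered.
(define (fetch-with-redirect-limit uri limit)
  (parameterize ((max-redirect-depth limit))
    (condition-case
        (receive (body final-uri response)
            (with-input-from-request uri #f read-string)
          body)
      ((exn http redirect-depth-exceeded)
       (print "Too many redirects while fetching " uri)
       #f))))
```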

[parameter] (client-software [software-spec])

These are the names, versions and comments of the software packages that the client is using, for use in the user-agent header which is automatically added to each request.

Defaults to (("Chicken Scheme HTTP-client" VERSION #f)), where VERSION is the version of this egg.
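
Judging by the default value, each entry is a list of a name, a version and a comment. The sketch below prepends an entry for a hypothetical application called "my-crawler" while keeping the library's own entry.

```scheme
(use http-client)

;; "my-crawler" and "1.0" are made-up values for illustration.
;; The third element is the comment; #f means no comment.
(client-software
 (cons '("my-crawler" "1.0" #f)
       (client-software)))
```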

Connection management

[procedure] (close-connection! uri)

Close the connection to the server associated with the URI.

[procedure] (close-all-connections!)

Close all connections to all servers.
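
For a short-lived script it can be convenient to make sure no connections are left open after a request. A minimal sketch, in which fetch-once is a hypothetical helper name:

```scheme
(use http-client)

;; fetch-once is a hypothetical helper, not part of http-client.
;; It performs one GET request and afterwards closes every connection
;; the library still holds, even if the request raised an error.
(define (fetch-once uri)
  (dynamic-wind
    void                                              ; nothing to set up
    (lambda () (with-input-from-request uri #f read-string))
    close-all-connections!))
```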

Cookie management

http-client's cookie management is supposed to be as automatic and DWIMmy as possible. This means it will write any cookie as instructed by a server and all stored cookies are automatically sent back to the server upon a new request.

However, in some cases you may want to take control of how cookies are stored.

The API described here should be considered unstable and it may change dramatically when someone comes up with a better way to handle cookies.

[procedure] (get-cookies-for-uri uri)

Fetch a list of all cookies which ought to be sent to the given URI. Cookies are vectors of two elements: a name/value pair and an alist of attributes. In other words, these are the exact same values you can put in a cookie header.
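
The vector layout described above can be taken apart as follows. print-cookies-for is a hypothetical helper name, and the sketch assumes the URI is passed as a uri-common object.

```scheme
(use http-client uri-common)

;; print-cookies-for is a hypothetical helper, not part of http-client.
(define (print-cookies-for uri-string)
  (for-each
   (lambda (cookie)
     (let ((name+value (vector-ref cookie 0))    ; pair: (name . value)
           (attributes (vector-ref cookie 1)))   ; alist of attributes
       (print (car name+value) "=" (cdr name+value)
              " (" (length attributes) " attributes)")))
   ;; Assumption: get-cookies-for-uri takes a uri-common object.
   (get-cookies-for-uri (uri-reference uri-string))))
```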

[procedure] (store-cookie! cookie-info set-cookie)

Store a cookie in the cookiejar corresponding to the Set-Cookie header given by set-cookie. This overwrites any cookie that is equal to this cookie, as defined by RFC 2965, section 3.3.3. Practically, this means that when the cookie's name, domain and path are equal to those of an existing one, it will be overwritten by the new one. These attributes are taken from the cookie-info alist and are expected to be there.

Generally, attributes should be taken from set-cookie, but if missing they ought to be taken from the request URI that responded with the set-cookie.

(store-cookie! `((path . ,(make-uri path: '(/ "")))
                 (domain . "some.host.com")
                 (secure . #t))
               `#(("COOKIE_NAME" . "cookie-value")
                  ((path . ,(make-uri path: '(/ ""))))))

[procedure] (delete-cookie! cookie-name cookie-info)

Removes any cookie from the cookiejar that is equal to the given cookie (again, in the sense of RFC 2965, section 3.3.3). The cookie-name must match and the path and domain values for the cookie-info alist must match.
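
As a sketch, a cookie named "COOKIE_NAME" stored for some.host.com under the root path could be removed like this. The name, domain and path are placeholder values, and forget-example-cookie! is a hypothetical helper name.

```scheme
(use http-client uri-common)

;; forget-example-cookie! is a hypothetical helper, not part of
;; http-client. The path and domain must match those of the stored cookie.
(define (forget-example-cookie!)
  (delete-cookie! "COOKIE_NAME"
                  `((path . ,(make-uri path: '(/ "")))
                    (domain . "some.host.com"))))
```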

Authentication support

When a 401 Unauthorized response is received, in most interactive clients, the user is normally asked to authenticate. To support this type of interaction, http-client offers the following parameter:

[parameter] (determine-username/password [HANDLER])

The procedure in this parameter is called whenever the remote host requests authentication via a 401 Unauthorized response.

The HANDLER is a procedure of two arguments; the URI for the resource currently being requested and the realm (a string) which wants credentials. The procedure should return two string values: the username and the password to use for authentication.

The default value is a procedure which extracts the username and password components from the URI.

For proxy authentication support, see determine-proxy-username/password in the next section.
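
A handler could, for instance, look credentials up by realm and fall back to the components of the URI. This is only a sketch: the credentials table, the realm name and the account are invented for illustration.

```scheme
(use http-client uri-common)

;; Hypothetical credentials table: realm -> (username . password).
(define credentials
  '(("Staff area" . ("alice" . "secret"))))

(determine-username/password
 (lambda (uri realm)
   (let ((entry (assoc realm credentials)))
     (if entry
         (values (cadr entry) (cddr entry))
         ;; Unknown realm: use whatever the URI itself carries.
         (values (uri-username uri) (uri-password uri))))))
```

In a real program the credentials would come from a prompt or a configuration store rather than a literal in the source.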

[parameter] (http-authenticators [AUTHENTICATORS])

This parameter allows for pluggable authentication schemes. AUTHENTICATORS is an alist mapping authentication scheme name to a procedure of 7 arguments:

(lambda (response response-header new-request request-header uri realm writer) ...)

Here, response is the response object, response-header is the name of the response header which required authentication - a symbol which is either www-authenticate or proxy-authenticate.

new-request is the request that will be sent next, to be populated with additional headers by the authenticator procedure, and request-header is the name of the request header which is expected to be provided and supplied with extra details by the authenticator - also a symbol, which is either authorization or proxy-authorization.

uri is the URI which was requested when the authorization was demanded (in case of www-authenticate, the protected resource) and realm is the authentication realm (a string).

Finally writer is the writer procedure passed by the user or fabricated by call-with-input-request based on the user's form arguments. It's always a procedure accepting a request object. This is only needed when full-request authentication is desired, to obtain a request body.

Proxy support

http-client has support for sending requests through proxy servers.

[parameter] (determine-proxy [HANDLER])

Whenever a request is sent, the library invokes the procedure stored in this parameter to determine through what proxy to send the request, if any.

The HANDLER procedure receives one argument, the URI about to be requested, and returns either a uri-common absolute URI object representing the proxy or #f if no proxy should be used.

The URI's path and query, if present, are ignored; only the scheme and authority (host, port, username, password) are used.

The default value of this parameter is determine-proxy-from-environment.

(determine-proxy
 (lambda (url)
   (uri-reference "http://127.0.0.1:8888/")))
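
Because the handler receives the URI being requested, it can also decide per request. The sketch below bypasses the proxy for local hosts; the host names and the proxy address are example values.

```scheme
(use http-client uri-common)

(determine-proxy
 (lambda (uri)
   (if (member (uri-host uri) '("localhost" "127.0.0.1"))
       #f                                           ; connect directly
       (uri-reference "http://127.0.0.1:8888/"))))  ; everything else via proxy
```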

If you just want to disable proxy support, you can do:

(determine-proxy (constantly #f))   ; From unit data-structures

[procedure] (determine-proxy-from-environment URI)

This procedure implements the common behaviour of HTTP software under UNIX:

Some UNIX software expects plain hostnames or hostname port combinations separated by colons, but (currently) this library expects full URIs, like most modern UNIX programs.

[parameter] (determine-proxy-username/password [HANDLER])

The procedure in this parameter is called whenever the proxy requests authentication via a 407 Proxy Authentication Required response. This basically works the same as authentication against an origin server.

The HANDLER is a procedure of two arguments; the URI for the proxy currently being used and the realm (a string) which wants credentials. The procedure should return two string values: the username and the password to use for authentication.

The default value is a procedure which extracts the username and password components from the proxy's URI.
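
There are thus two ways to supply proxy credentials, sketched below with made-up account details: embed them in the proxy URI so the default handler finds them, or install a handler of your own. In practice only one of the two would be used.

```scheme
(use http-client uri-common)

;; Option 1: put the credentials in the proxy URI itself; the default
;; handler extracts them from there.
(determine-proxy
 (lambda (uri)
   (uri-reference "http://proxyuser:proxypass@127.0.0.1:8888/")))

;; Option 2: return the credentials from a custom handler.
(determine-proxy-username/password
 (lambda (proxy-uri realm)
   (values "proxyuser" "proxypass")))
```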

Changelog

License

 Copyright (c) 2008-2014, Peter Bex
 Parts copyright (c) 2000-2004, Felix L. Winkelmann
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are
 met:
 
 Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 
 Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 
 Neither the name of the author nor the names of its contributors may
 be used to endorse or promote products derived from this software
 without specific prior written permission.
 
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 OF THE POSSIBILITY OF SUCH DAMAGE.