
abnf

Description

abnf is a collection of combinators that help construct parsers for Augmented Backus-Naur Form (ABNF) grammars (RFC 4234).

Library Procedures

The combinator procedures in this library are based on the interface provided by the lexgen library.

Terminal values

[procedure] (char CHAR) => MATCHER

Procedure char builds a pattern matcher function that matches a single character.

[procedure] (lit STRING) => MATCHER

lit matches a literal string (case-insensitive).
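
As a minimal sketch of the two terminal constructors (the names colon and get-method are made up for illustration, and the abnf: prefix follows the convention used in the Examples section of this page):

```scheme
(use abnf)

;; Matcher for the single character ":"
(define colon (abnf:char #\:))

;; Matcher for the literal "GET"; because lit is case-insensitive,
;; it should also accept "get", "Get", and so on
(define get-method (abnf:lit "GET"))
```

Both definitions only build matchers; nothing is consumed until a matcher is applied to an input stream.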

Operators

[procedure] (concatenation MATCHER-LIST) => MATCHER

concatenation matches an ordered list of rules. (RFC 4234, Section 3.1)

[procedure] (alternatives MATCHER-LIST) => MATCHER

alternatives matches any one of the given list of rules. (RFC 4234, Section 3.2)
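
The sketch below shows how concatenation and alternatives might be combined to express a small grammar fragment. The rule names sign and signed-digit are hypothetical, and the matchers are passed as separate arguments, as in the Examples section of this page:

```scheme
(use abnf)

;; ABNF: sign = "+" / "-"
(define sign
  (abnf:alternatives (abnf:char #\+) (abnf:char #\-)))

;; ABNF: signed-digit = sign DIGIT
(define signed-digit
  (abnf:concatenation sign abnf:decimal))
```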

[procedure] (range C1 C2) => MATCHER

range matches a range of characters. (RFC 4234, Section 3.4)

[procedure] (variable-repetition MIN MAX MATCHER) => MATCHER

variable-repetition matches between MIN and MAX consecutive elements that match the given rule. (RFC 4234, Section 3.6)

[procedure] (repetition MATCHER) => MATCHER

repetition matches zero or more consecutive elements that match the given rule.

[procedure] (repetition1 MATCHER) => MATCHER

repetition1 matches one or more consecutive elements that match the given rule.

[procedure] (repetition-n N MATCHER) => MATCHER

repetition-n matches exactly N consecutive occurrences of the given rule. (RFC 4234, Section 3.7)

[procedure] (optional-sequence MATCHER) => MATCHER

optional-sequence matches zero or one occurrence of the given rule. (RFC 4234, Section 3.8)
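
To make the correspondence with ABNF notation concrete, here is a short sketch that pairs each repetition operator with the ABNF form it is meant to express. The names on the left are illustrative only:

```scheme
(use abnf)

;; ABNF: 1*2DIGIT  (one or two decimal digits)
(define one-or-two-digits (abnf:variable-repetition 1 2 abnf:decimal))

;; ABNF: *DIGIT    (zero or more decimal digits)
(define any-digits (abnf:repetition abnf:decimal))

;; ABNF: 1*DIGIT   (one or more decimal digits)
(define some-digits (abnf:repetition1 abnf:decimal))

;; ABNF: 4DIGIT    (exactly four decimal digits)
(define four-digits (abnf:repetition-n 4 abnf:decimal))

;; ABNF: [ "-" ]   (an optional minus sign)
(define optional-minus (abnf:optional-sequence (abnf:char #\-)))
```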

Core rules

The following primitive parsers match the core rules described in RFC 4234, Appendix B.1.

[procedure] (alpha STREAM-LIST) => STREAM-LIST

Matches any alphabetic character, that is, [A..Z] and [a..z].

[procedure] (binary STREAM-LIST) => STREAM-LIST

Matches [0..1].

[procedure] (decimal STREAM-LIST) => STREAM-LIST

Matches [0..9].

[procedure] (hexadecimal STREAM-LIST) => STREAM-LIST

Matches [0..9] and [A..F,a..f].

[procedure] (char STREAM-LIST) => STREAM-LIST

Matches any 7-bit US-ASCII character except for NUL (ASCII value 0).

[procedure] (cr STREAM-LIST) => STREAM-LIST

Matches the carriage return character.

[procedure] (lf STREAM-LIST) => STREAM-LIST

Matches the line feed character.

[procedure] (crlf STREAM-LIST) => STREAM-LIST

Matches the Internet newline, that is, a carriage return followed by a line feed.

[procedure] (ctl STREAM-LIST) => STREAM-LIST

Matches any US-ASCII control character. That is, any character with a decimal value in the range of [0..31,127].

[procedure] (dquote STREAM-LIST) => STREAM-LIST

Matches the double quote character.

[procedure] (htab STREAM-LIST) => STREAM-LIST

Matches the tab character.

[procedure] (lwsp STREAM-LIST) => STREAM-LIST

Matches linear white-space. That is, any number of consecutive wsp, optionally followed by a crlf and (at least) one more wsp.

[procedure] (sp STREAM-LIST) => STREAM-LIST

Matches the space character.

[procedure] (vspace STREAM-LIST) => STREAM-LIST

Matches any printable ASCII character. That is, any character in the decimal range of [33..126].

[procedure] (wsp STREAM-LIST) => STREAM-LIST

Matches space or tab.

[procedure] (quoted-pair STREAM-LIST) => STREAM-LIST

Matches a quoted pair. Any characters (excluding CR and LF) may be quoted.

[procedure] (quoted-string STREAM-LIST) => STREAM-LIST

Matches a quoted string. The backslash and double quote characters must be escaped inside a quoted string; CR and LF are not allowed at all.
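
Because the core rules are themselves matchers, they can be passed directly to the operators. The following sketch assumes that the core rules are exported with the same abnf: prefix as abnf:decimal in the Examples section; the rule name field is hypothetical:

```scheme
(use abnf)

;; ABNF: field = 1*ALPHA ":" *WSP 1*DIGIT CRLF
(define field
  (abnf:concatenation
   (abnf:repetition1 abnf:alpha)    ; field name
   (abnf:char #\:)                  ; separator
   (abnf:repetition abnf:wsp)       ; optional spaces or tabs
   (abnf:repetition1 abnf:decimal)  ; numeric value
   abnf:crlf))                      ; line terminator
```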

Additional convenience procedures and parser combinators

[procedure] (pass) => MATCHER

This matcher returns without consuming any input.

[procedure] (set CHAR-SET) => MATCHER

Matches any character from an SRFI-14 character set.

[procedure] (set-from-string STRING) => MATCHER

Matches any character from a set defined as a string.
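
A brief sketch of the two set constructors, building the same character class in both ways. The names arith-op and arith-op/cs are illustrative; string->char-set is the standard SRFI-14 conversion procedure:

```scheme
(use abnf srfi-14)

;; Any one of the four arithmetic operator characters,
;; with the set given as a string
(define arith-op (abnf:set-from-string "+-*/"))

;; The same class, built explicitly as an SRFI-14 character set
(define arith-op/cs (abnf:set (string->char-set "+-*/")))
```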

[procedure] (bind F P) => MATCHER

Given a rule P and function F, returns a matcher that first applies P to the input stream, then applies F to the returned list of consumed tokens, and returns the result and the remainder of the input stream.

[procedure] (drop-consumed P) => MATCHER

Given a rule P, returns a matcher that always returns an empty list of consumed tokens when P succeeds.
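
The sketch below illustrates how bind and drop-consumed might be used together. It assumes that F receives a list of consumed tokens and should return a list that takes its place; the exact token representation and ordering are not specified on this page, so the tagging function is purely hypothetical:

```scheme
(use abnf)

;; Recognise ":" in the input but keep it out of the consumed tokens
(define separator (abnf:drop-consumed (abnf:char #\:)))

;; Hypothetical post-processing step: wrap the tokens consumed by a
;; four-digit year in a tagged entry
(define (tag-year tokens)
  (list (cons 'year tokens)))

(define tagged-year
  (abnf:bind tag-year (abnf:repetition-n 4 abnf:decimal)))
```

Using drop-consumed for punctuation keeps the consumed token list limited to the parts of the input that carry data, which is the pattern the date and time example on this page follows.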

Examples

The following parser libraries have been implemented with abnf, in order of complexity:

Parsing date and time


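(use abnf)

;; NOTE: fws and cfws (folding whitespace and comments, RFC 5322,
;; Section 3.2.2) are used below but are not defined in this excerpt;
;; they must be defined as abnf matchers before this code is loaded.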

(define (between-fws p)
  (abnf:concatenation
   (abnf:drop-consumed (abnf:optional-sequence fws)) p 
   (abnf:drop-consumed (abnf:optional-sequence fws))))

;; Date and Time Specification from RFC 5322 (Internet Message Format)

;; The following abnf parser combinators parse a date and time
;; specification of the form
;;
;;   Thu, 19 Dec 2002 20:35:46 +0200
;;
;; where the weekday specification is optional.
			     
;; Match the abbreviated weekday names

(define day-name 
  (abnf:alternatives
   (abnf:lit "Mon")
   (abnf:lit "Tue")
   (abnf:lit "Wed")
   (abnf:lit "Thu")
   (abnf:lit "Fri")
   (abnf:lit "Sat")
   (abnf:lit "Sun")))

;; Match a day-name, optionally wrapped in folding whitespace

(define day-of-week (between-fws day-name))


;; Match a four digit decimal number

(define year (between-fws (abnf:repetition-n 4 abnf:decimal)))

;; Match the abbreviated month names

(define month-name (abnf:alternatives
		    (abnf:lit "Jan")
		    (abnf:lit "Feb")
		    (abnf:lit "Mar")
		    (abnf:lit "Apr")
		    (abnf:lit "May")
		    (abnf:lit "Jun")
		    (abnf:lit "Jul")
		    (abnf:lit "Aug")
		    (abnf:lit "Sep")
		    (abnf:lit "Oct")
		    (abnf:lit "Nov")
		    (abnf:lit "Dec")))

;; Match a month-name, optionally wrapped in folding whitespace

(define month (between-fws month-name))


;; Match a one or two digit number

(define day (abnf:concatenation
	     (abnf:drop-consumed (abnf:optional-sequence fws))
	     (abnf:alternatives 
	      (abnf:variable-repetition 1 2 abnf:decimal)
	      (abnf:drop-consumed fws))))

;; Match a date of the form dd Mon yyyy (for example, 19 Dec 2002)
(define date (abnf:concatenation day month year))

;; Match a two-digit number 

(define hour      (abnf:repetition-n 2 abnf:decimal))
(define minute    (abnf:repetition-n 2 abnf:decimal))
(define isecond   (abnf:repetition-n 2 abnf:decimal))

;; Match a time-of-day specification of hh:mm or hh:mm:ss.

(define time-of-day (abnf:concatenation
		     hour (abnf:drop-consumed (abnf:char #\:))
		     minute (abnf:optional-sequence 
			     (abnf:concatenation (abnf:drop-consumed (abnf:char #\:))
						 isecond))))

;; Match a timezone specification of the form
;; +hhmm or -hhmm 

(define zone (abnf:concatenation 
	      (abnf:drop-consumed fws)
	      (abnf:alternatives (abnf:char #\-) (abnf:char #\+))
	      hour minute))

;; Match a time-of-day specification followed by a zone.

(define itime (abnf:concatenation time-of-day zone))

;; Match a complete date and time specification: an optional
;; day-of-week followed by a comma, then a date, a time with zone,
;; and optional trailing comments or folding whitespace.

(define date-time (abnf:concatenation
		   (abnf:optional-sequence
		    (abnf:concatenation
		     day-of-week
		     (abnf:drop-consumed (abnf:char #\,))))
		   date
		   itime
		   (abnf:drop-consumed (abnf:optional-sequence cfws))))

Requires

Version History

License

 Copyright 2009 Ivan Raikov and the Okinawa Institute of Science and Technology.
 This program is free software: you can redistribute it and/or
 modify it under the terms of the GNU General Public License as
 published by the Free Software Foundation, either version 3 of the
 License, or (at your option) any later version.
 This program is distributed in the hope that it will be useful, but
 WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 General Public License for more details.
 A full copy of the GPL license can be found at
 <http://www.gnu.org/licenses/>.