Editing page: Nitrate - The CHICKEN Scheme wiki

You can edit this page using wiki syntax for markup.

Article contents:

== Nitrate

Combinator based pattern matching

[[toc:]]

=== Description

Nitrate is a combinator (procedure) based backtracking pattern matcher. Being procedure based, as opposed to syntax based, the matcher is extensible with any predicate or record type and its basic components are composable and can represent recursive structure.

This could also be called monadic pattern matching or a parser-combinator library for Scheme values.

Nitrate has been tested to work on Chibi Scheme in addition to CHICKEN 5.

==== Why a Procedure-Based Pattern Matcher?

Scheme has no standard pattern matcher. A commonly used one is the “Wright” pattern matcher, with a notable syntax-rules implementation using CPS macros by Alex Shinn. Such patterns matchers are not extensible. Some proposals for high-end, extensible pattern matchers include SRFI 257 and SRFI 262.

These matchers are all at the syntax level and are very difficult to implement and extend as a result. A static-typed language like Standard ML or Haskell requires a matcher as a syntactic construct because it has no way to discriminate against sum types. But Scheme is a latently typed language, so it does not have this issue. Higher-order procedures should be able to do the same thing that pattern matching syntax can, with the added benefit of extensibility.

This procedure-based pattern matcher is very composable. For instance, the pattern {{or~}} could be defined as:

<enscript language="scheme">
(define or~
  (case-lambda
    (() _~)
    ((x~) x~)
    ((x~ . spats) (if~ x~ _~ (apply or~ spats)))))
</enscript>

This code almost looks like how one would write or in a call-by-name version of Scheme. Since procedures are composed before matching, the code that makes up the patterns passed to {{or~}} are not run until then.

The composition is done over a finite list (the argument list), hence the recursion is done at a meta-level to the matching. But with this system, one can also write recursive patterns at the matching level. Consider the following possible implementation of {{member~}}:

<enscript language="scheme">
(define-recpat (member~ pat~)
  (or~ (cons~ pat~ _~) (cons~ _~ (member~ pat~))))
</enscript>

The macro {{define-recpat}} uses an eta-expansion trick to allow for a composition of matching procedures to be recursive.

A procedure-based pattern matcher would have a uniform syntax for matching any structured data. This may make the basic case of lists and vectors less ergonomic than a specialized solution, but this could encourage people to use abstract patterns instead of concrete ones.

One thing that a procedure-based pattern matcher can't do that a syntactic pattern matcher can is to bind values to identifiers, because identifiers are a syntactic construct. This has its upsides, though: a binding can appear zero or more times, and the client code can extract values as desired.

==== Design of the Matcher

The matcher is a depth-first, left-to-right traversal. Many R7RS-small systems are not going to take much opportunity to optimize anyways by re-arranging the evaulation order, so by fixing an evaluation order the matcher can express things such as cuts.

The matcher threads an immutable map through the match. Its immutability makes it easy to backtrack. The map is returned at the end of the match. The matcher procedures either return the map, or #f.

The matcher backtracks by maintaining a stack of continuations captured by {{call/cc}}. This means that failures jump, they do not flow up. This matcher only uses escape continuations by default, so even for implementations with an inefficient {{call/cc}} this implementation can still work with something like {{call/ec}} or {{call/1cc}}.

The default matcher is {{call/cc}} safe. That means that a matcher using only the procedures here can be entered and exited multiple times. This could be used, for instance, to match one branch, do some processing, and cut back to try another match if something goes wrong in the client code.
Future Directions

One idea is to use identifiers (R6RS) instead of symbols. This would make the patterns easier to read and help with possible scope and hygiene issues. A more radical change would be to use SRFI 247 style syntactic monads.

Another possibility is to do a version of SRFI 262 lowered to the procedure level. SRFI 262 is designed to be compilable efficiently, which suggests a route for optimizing compilers: given a pattern (i.e. in a procedure), lift it such that it only needs to be evaluated once, evaluate derived matching procedures into matching primitives (maybe record types), optimize the primitives, and then output a matching procedure with compiler hints to aggresively inline.

=== Author

Peter McGoron

=== Repository

https://codeberg.org/phm/nitrate/

=== Requirements

* [[/egg/r7rs|r7rs]]
* [[/egg/srfi-225|SRFI 225]]

=== Examples

Match a list with specified atoms:

<enscript language="scheme">
(define (one-two-three-four lst)
  (match lst dict
    ((list~ 1 2 3 4) #t)
    (_~ #f)))
(one-two-three-four '(1 2 3 4)) ; => #t
(one-two-three-four '#(1 2 3 4)) ; => #f
</enscript>

Match a list, mapping values to symbols:

<enscript language="scheme">
(define (simple lst)
  (match lst dict
    ((list~ 1 (b~ 'second) 3) (car (dict-ref (dto) dict 'second)))
    (_~ #f)))
(simple '(1 100 3)) ; => 100
(simple '(1 100 5)) ; => #f
</enscript>

Match as many numbers at the head of a list as possible:

<enscript language="scheme">
(define (leading lst)
  (match lst dict
    ((not~ (?~ pair?)) (values '() '()))
    ((list*~ (and~ (?~ number?) (b~ 'n)) (b~ 'rest))
     (values (dict-ref/default (dto) dict 'n '())
             (car (dict-ref (dto) dict 'rest))))))

(leading '(1 2 3 4)) ; => (values '(1 2 3 4) '())
(leading '(1 2 3 x y 5)) ;=> (values '(1 2 3) '(x y 5))
(leading '(x 1 2)) ; => (values '() '(x 1 2))
(leading 'abcd) ; => (values '() '())
</enscript>

Partition a list using a predicate:

<enscript language="scheme">
(define (partition predicate? lst)
  (match lst dict
    ((list*~ (or~ (and~ (?~ predicate?) (b~ 'hit))
                  (b~ 'miss))
             '())
     (values (dict-ref/default (dto) dict 'hit '())
             (dict-ref/default (dto) dict 'miss '())))))
(partition positive? '(1 2 3 -5 -6 7 8 -9))
; => (values '(1 2 3 7 8) '(-5 -6 -9))
</enscript>

=== API

TODO: port the API documentation to the CHICKEN wiki format. The full API is documented on a [[https://florida.moe/nitrate/|separate website]].

== Version History

; 1.0.0 : Initial Release

=== License

Permission to use, copy, modify, and/or distribute this software for
any purpose with or without fee is hereby granted.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Description of your changes:

I would like to authenticate

Authentication

Username:Password:

Spam control

What do you get when you subtract 16 from 0?