Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
Regex-literals
A reader extension that provides precompiled regular expression literals, for example #/[a-z0-9]+/i and #r{^/path/(to)/file$}.
Examples
Using regular expression literals in the interpreter
Loading regex-literals also loads the regex unit and allows convenient use of regular expression literals as follows:
#;1> (use regex-literals)
#;2> #/[A-Za-z0-9]+/
#<regexp>
#;3> ,x #/^[a-z0-9]+$/i
(regexp "^[a-z0-9]+$" #t #f #f)
#;4> (string-match #/^(\d{2}):(\d{2})(..)/ "11:59pm")
("11:59pm" "11" "59" "pm")
#;5> (string-split-fields #/[^\s]+/ "the quick brown fox jumps over the lazy dog")
("the" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog")
#;6> (string-split-fields #r{[^/]+} "/path/to/file")
("path" "to" "file")
#;7> (string-substitute #/(\w+)\s+(\w+)/u "\\2, \\1" "John Smith")
"Smith, John"
Using regular expression literals with the compiler
Passing a -X regex-literals command-line option to csc allows you to conveniently make use of regular expression literals in your egg or compiled program without making the regex-literals egg a runtime dependency.
(See the php-s11n egg for an example of building with regex-literals.)
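A minimal sketch of the compile-time workflow, assuming a hypothetical source file named clock.scm and a hypothetical helper parse-time. Because a literal expands to a call to regexp (as the ,x output in the interpreter session shows), the compiled program is assumed to need only the regex unit at run time.

```scheme
;; clock.scm (hypothetical file name)
;; Assumed build command: csc -X regex-literals clock.scm
(use regex)  ; provides regexp and string-match at run time

;; The reader extension, loaded via -X, expands this literal at compile time.
(define time-rx #/^(\d{2}):(\d{2})(..)/)

;; Returns the full match followed by each captured group, or #f on failure.
(define (parse-time str)
  (string-match time-rx str))

(write (parse-time "11:59pm"))  ; ("11:59pm" "11" "59" "pm"), as in the session above
(newline)
```

Binding the literal to a top-level name keeps the pattern in one place, so every call to parse-time reuses the same regular expression object.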
Authors
Arto Bendiken
Requires
Reader extensions
This egg installs a reader extension for #\/ that reads a regular expression literal as described below in read-regex-literal, and another reader extension for #\r that works similarly but supports a generalized delimiter syntax as described in read-regex-literal/general.
Note that there are some caveats to using reader extensions when compiling; for more details, refer to the relevant FAQ entry.
Input and output
read-regex-literal
[procedure] (read-regex-literal [PORT])
Reads a regular expression literal of the form #/.../ from PORT, which defaults to the value of (current-input-port). The literal is converted to a precompiled regular expression object using the regexp procedure provided by the regex unit.
Regular expression literals may include one or more options that modify the way the pattern matches strings. The options are one or more characters placed immediately after the terminator:
- #/.../i PCRE_CASELESS: case-insensitive mode; letters in the pattern match both uppercase and lowercase letters.
- #/.../x PCRE_EXTENDED: extended mode; you can insert spaces, newlines, and comments in the pattern to make a complex regular expression more readable.
- #/.../u PCRE_UTF8: UTF-8 mode; the pattern and the strings it is matched against are treated as UTF-8 text rather than as single bytes.
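The three options can be tried side by side as follows. This is a hedged sketch for CHICKEN 4: the names greeting-rx, range-rx, and name-rx are hypothetical, and only the last result is taken from the interpreter session shown earlier; the other two are left without expected output.

```scheme
(use regex-literals)  ; CHICKEN 4; also loads the regex unit

;; i: "HELLO" should match even though the pattern is lowercase.
(define greeting-rx #/hello/i)

;; x: the spaces inside the pattern are layout only and are not matched.
(define range-rx #/ (\d+) - (\d+) /x)

;; u: pattern and subject are treated as UTF-8 text.
(define name-rx #/(\w+)\s+(\w+)/u)

(write (string-match greeting-rx "HELLO"))
(newline)
(write (string-match range-rx "10-20"))
(newline)
(write (string-substitute name-rx "\\2, \\1" "John Smith"))  ; "Smith, John" per the session above
(newline)
```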
read-regex-literal/general
[procedure] (read-regex-literal/general [PORT])
Reads a regular expression literal of the form #r(...) from PORT, which defaults to the value of (current-input-port). This works otherwise similarly to read-regex-literal but supports a generalized delimiter syntax as follows:
- Matching delimiter pairs: #r{...}, #r(...), #r[...] and #r<...>
- Any arbitrary character: #r!...!, #r|...|, #r@...@, and so forth.
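Both delimiter styles are useful when the pattern itself contains slashes. The sketch below assumes CHICKEN 4 with the egg installed; segment-rx and usr-rx are hypothetical names, and the second pattern is an illustration that does not come from the egg's own examples.

```scheme
(use regex-literals)  ; CHICKEN 4; also loads the regex unit

;; Paired delimiters: braces leave the slash in the pattern unescaped.
(define segment-rx #r{[^/]+})

;; Arbitrary delimiter: the same character opens and closes the literal.
(define usr-rx #r!^/usr/(\w+)!)

(write (string-split-fields segment-rx "/path/to/file"))  ; ("path" "to" "file") per the session above
(newline)

;; string-search returns the match and its groups, or #f if nothing matches.
(write (string-search usr-rx "/usr/local/bin"))
(newline)
```

Choosing a delimiter that does not occur in the pattern avoids having to think about escaping it at all.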
License
Copyright (c) 2006-2007 Arto Bendiken.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Version history
- 1.0.3
- Ported to CHICKEN 4 (thanks to Christian Kellermann)
- 1.0.2
- Support for generalized #r(...) delimiters (by Zbigniew)
- 1.0.1
- Added support for the #/.../i, #/.../x and #/.../u options.
- 1.0.0
- Initial release of the regex-literals egg.