string-utils
Documentation
Memoized String
Usage
(import memoized-string)
make-string+
[procedure] (make-string+ COUNT [FILL]) -> string
An interning make-string.
FILL is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.
string+
[procedure] (string+ [CHAR...]) -> string
An interning string.
CHAR is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.
global-string
[procedure] (global-string STR) -> string
Share common string space.
String Hexadecimal
Usage
(import string-hexadecimal)
string->hex
[procedure] (string->hex STRING [START [END]]) -> string
Returns a hexadecimal representation of STRING. START and END are substring limits.
STRING is treated as a string of bytes, a byte-vector.
hex->string
[procedure] (hex->string STRING [START [END]]) -> string
Returns the binary representation of a hexadecimal STRING. START and END are substring limits.
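To illustrate the byte-wise conversion described above, here is a minimal sketch in plain Scheme. It is not the egg's implementation: the names hex-digits, bytes->hex-sketch, hex-digit-value and hex->bytes-sketch are hypothetical, the lowercase digits are an assumption, and every character of the input is assumed to be a single byte (as with ASCII input).

```scheme
;; Illustrative sketch only; not the string-hexadecimal implementation.
;; Assumes every character of the input is one byte (code < 256).
(define hex-digits "0123456789abcdef")   ; lowercase is an assumption

;; Encode: two hexadecimal digits per input byte.
(define (bytes->hex-sketch str)
  (let* ((len (string-length str))
         (out (make-string (* 2 len))))
    (do ((i 0 (+ i 1)))
        ((= i len) out)
      (let ((b (char->integer (string-ref str i))))
        (string-set! out (* 2 i) (string-ref hex-digits (quotient b 16)))
        (string-set! out (+ (* 2 i) 1) (string-ref hex-digits (remainder b 16)))))))

;; Value of one hexadecimal digit, either case.
(define (hex-digit-value c)
  (let ((c (char-downcase c)))
    (cond ((and (char>=? c #\0) (char<=? c #\9))
           (- (char->integer c) (char->integer #\0)))
          ((and (char>=? c #\a) (char<=? c #\f))
           (+ 10 (- (char->integer c) (char->integer #\a))))
          (else (error "not a hexadecimal digit" c)))))

;; Decode: one output byte per pair of hexadecimal digits.
(define (hex->bytes-sketch hex)
  (let* ((len (quotient (string-length hex) 2))
         (out (make-string len)))
    (do ((i 0 (+ i 1)))
        ((= i len) out)
      (string-set! out i
                   (integer->char
                    (+ (* 16 (hex-digit-value (string-ref hex (* 2 i))))
                       (hex-digit-value (string-ref hex (+ (* 2 i) 1)))))))))

(bytes->hex-sketch "abc")      ;=> "616263"
(hex->bytes-sketch "616263")   ;=> "abc"
```

The encoded form is always twice as long as the input, which is why the lower-level procedures in the next section need a preallocated output buffer.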
Hexadecimal Procedures
Usage
(import to-hex)
str_to_hex
[procedure] (str_to_hex OUT IN OFF LEN)
Writes the ASCII hexadecimal representation of IN to OUT.
IN is a nonnull-string.
OFF is the byte offset.
LEN is the length of the bytes at OFF.
OUT is a string of length >= (* LEN 2).
blob_to_hex
[procedure] (blob_to_hex OUT IN OFF LEN)
Like str_to_hex except IN is a nonnull-blob.
u8vec_to_hex
[procedure] (u8vec_to_hex OUT IN OFF LEN)
Like str_to_hex except IN is a nonnull-u8vector.
s8vec_to_hex
[procedure] (s8vec_to_hex OUT IN OFF LEN)
Like str_to_hex except IN is a nonnull-s8vector.
mem_to_hex
[procedure] (mem_to_hex OUT IN OFF LEN)
Like str_to_hex except IN is a nonnull-c-pointer.
hex_to_str
[procedure] (hex_to_str OUT IN OFF LEN)
Reads the ASCII hexadecimal representation of IN to OUT.
IN is a nonnull-string.
OFF is the byte offset.
LEN is the length of the bytes at OFF.
OUT is a string of length >= (/ LEN 2).
hex_to_blob
[procedure] (hex_to_blob OUT IN OFF LEN)
Like hex_to_str except OUT is a blob of size >= (/ LEN 2).
Unicode Utilities
The name of this extension is misleading. Only UTF-8 is currently supported.
For a better treatment of UTF-8 see the utf-8 extension.
Usage
(import unicode-utils)
ascii-codepoint?
[procedure] (ascii-codepoint? CHAR) -> boolean
char->unicode-string
[procedure] (char->unicode-string CHAR) -> string
Returns a string formed from Unicode codepoint CHAR.
Note that the (string-length) (except under utf-8) may not be equal to 1.
Generates an error should the codepoint be out-of-range.
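The note about string-length follows from how UTF-8 encodes a codepoint in one to four bytes. The following is a minimal sketch of that encoding in plain Scheme; codepoint->utf8-bytes-sketch is a hypothetical name, not part of this egg, and the sketch does not reject surrogate codepoints.

```scheme
;; Illustrative sketch only; not the unicode-utils implementation.
;; Returns the UTF-8 encoding of codepoint CP as a list of byte values.
(define (codepoint->utf8-bytes-sketch cp)
  (cond ((< cp 0)
         (error "codepoint out of range" cp))
        ((< cp #x80)                       ; 1 byte: 0xxxxxxx
         (list cp))
        ((< cp #x800)                      ; 2 bytes: 110xxxxx 10xxxxxx
         (list (+ #xC0 (quotient cp 64))
               (+ #x80 (remainder cp 64))))
        ((< cp #x10000)                    ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         (list (+ #xE0 (quotient cp 4096))
               (+ #x80 (remainder (quotient cp 64) 64))
               (+ #x80 (remainder cp 64))))
        ((< cp #x110000)                   ; 4 bytes: 11110xxx and three 10xxxxxx
         (list (+ #xF0 (quotient cp 262144))
               (+ #x80 (remainder (quotient cp 4096) 64))
               (+ #x80 (remainder (quotient cp 64) 64))
               (+ #x80 (remainder cp 64))))
        (else
         (error "codepoint out of range" cp))))

(codepoint->utf8-bytes-sketch #x41)     ;=> (65)
(codepoint->utf8-bytes-sketch #x3BB)    ;=> (206 187)
(codepoint->utf8-bytes-sketch #x20AC)   ;=> (226 130 172)
```

A byte-oriented string-length counts each of these bytes, so a single non-ASCII codepoint yields a length of 2, 3 or 4.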
unicode-string
[procedure] (unicode-string [CHAR...]) -> string
Returns a string formed from Unicode codepoints CHAR...
Note that the (string-length) (except under utf-8) may not be equal to the length of CHAR....
Generates an error should the codepoint be out-of-range.
*unicode-string
[procedure] (*unicode-string CHARS) -> string
Returns a string formed from Unicode codepoints CHARS, a (list-of char).
unicode-make-string
[procedure] (unicode-make-string COUNT [FILL]) -> string
Returns a string formed from COUNT occurrences of the Unicode codepoint FILL. The FILL default is #\space.
Note that the (string-length) (except under utf-8) may not be equal to COUNT.
Generates an error should the codepoint be out-of-range.
unicode-surrogate?
[procedure] (unicode-surrogate? NUM) -> boolean
unicode-surrogates->codepoint
[procedure] (unicode-surrogates->codepoint HIGH LOW) -> (or boolean fixnum)
Returns the codepoint for the valid surrogate pair HIGH and LOW. Otherwise returns #f.
String Utilities
Usage
(import string-utils)
string-split-chars
[procedure] (string-split-chars STR [DELIMITERS]) -> (list-of string) (list-of char)
Returns a list of substrings of STR & a list of the characters, from DELIMITERS, separating those substrings.
- STR: string ; version string.
- DELIMITERS: string ; string of version component delimiter characters, default ".,".
(string-split-chars "a.2,c" "$,.") ;=> ("a" "2" "c") (#\. #\,)
string-unzip
[procedure] (string-unzip STR [DELIMITERS]) -> (list-of string) (list-of string)
Returns a list of substrings of STR & a list of the delimiters, from DELIMITERS, separating those substrings.
- STR: string ; version string.
- DELIMITERS: string ; string of version component delimiter characters, default ".,".
(string-unzip "a.2,c" "$,.") ;=> ("a" "2" "c") ("." ",")
string-zip
[procedure] (string-zip PARTS PUNCS) -> string
Returns a string formed from the concatenation of the PARTS and the interspersion of the PUNCS.
- PARTS: (list-of string) ; version components.
- PUNCS: (list-of string) ; version component separators.
(string-zip '("a" "2" "c") '("." ",")) ;=> "a.2,c"
string-trim-whitespace-both
[procedure] (string-trim-whitespace-both S) -> string
Returns the string S with whitespace trimmed.
list-as-string
[procedure] (list-as-string LS) -> string
Returns the list LS written to a string.
number->padded-string
[procedure] (number->padded-string N WIDTH [PADCHAR [BASE]]) -> string
- N: number ; source
- WIDTH: fixnum ; field width
- PADCHAR: char ; padding character
- BASE: fixnum ; number conversion base
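The parameters above suggest the usual convert-then-pad approach. The sketch below shows one way it could work in plain Scheme. It is not the egg's code: padded-number-sketch is a hypothetical name, padding on the left is an assumption, and all four arguments are required here because the egg's defaults for PADCHAR and BASE are not documented above.

```scheme
;; Illustrative sketch only; not the string-utils implementation.
;; Converts N in BASE, then pads on the left (an assumption) up to WIDTH.
(define (padded-number-sketch n width padchar base)
  (let* ((digits (number->string n base))
         (missing (- width (string-length digits))))
    (if (> missing 0)
        (string-append (make-string missing padchar) digits)
        digits)))                       ; already wide enough: left unchanged

(padded-number-sketch 42 5 #\0 10)      ;=> "00042"
(padded-number-sketch 12345 3 #\0 10)   ;=> "12345"
```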
string-fixed-length
[procedure] (string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string
Returns the string S with the string-length fixed to N.
A shorter string is padded. A longer string is truncated, & suffixed with the trailing string.
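The pad-or-truncate rule can be sketched in plain Scheme as below. This is not the egg's code: fixed-length-sketch is a hypothetical name with positional rather than keyword arguments, and it assumes that padding goes on the right and that the trailing string counts toward the final length N.

```scheme
;; Illustrative sketch only; not the string-utils implementation.
;; Assumptions: pads on the right; the trailing marker is included in N.
(define (fixed-length-sketch s n pad-char trailing)
  (let ((len (string-length s)))
    (cond ((= len n) s)
          ((< len n)
           (string-append s (make-string (- n len) pad-char)))
          (else
           ;; Keep room for the trailing marker; if N is shorter than the
           ;; marker itself, this sketch returns the marker alone.
           (let ((keep (max 0 (- n (string-length trailing)))))
             (string-append (substring s 0 keep) trailing))))))

(fixed-length-sketch "abc" 5 #\space "...")          ;=> "abc  "
(fixed-length-sketch "abcdefghij" 6 #\space "...")   ;=> "abc..."
```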
string-longest-common-prefix
[procedure] (string-longest-common-prefix STRINGS) -> string
Returns the longest common prefix of STRINGS.
- STRINGS: (list-of string)
string-longest-common-suffix
[procedure] (string-longest-common-suffix STRINGS) -> string
Returns the longest common suffix of STRINGS.
- STRINGS: (list-of string)
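A longest common prefix can be computed by narrowing a candidate prefix against each string in turn. The following plain Scheme sketch illustrates the idea. It is not the egg's code: common-prefix-length and longest-common-prefix-sketch are hypothetical names, and returning "" for an empty list is an assumption. The suffix case is the mirror image, comparing from the ends of the strings.

```scheme
;; Illustrative sketch only; not the string-utils implementation.
;; Number of leading characters that A and B share.
(define (common-prefix-length a b)
  (let ((limit (min (string-length a) (string-length b))))
    (let loop ((i 0))
      (if (and (< i limit)
               (char=? (string-ref a i) (string-ref b i)))
          (loop (+ i 1))
          i))))

;; Narrow the candidate prefix against each remaining string.
(define (longest-common-prefix-sketch strings)
  (if (null? strings)
      ""                                ; assumption: empty input gives ""
      (let loop ((prefix (car strings)) (rest (cdr strings)))
        (if (null? rest)
            prefix
            (loop (substring prefix 0 (common-prefix-length prefix (car rest)))
                  (cdr rest))))))

(longest-common-prefix-sketch '("interstellar" "internet" "interval"))   ;=> "inter"
(longest-common-prefix-sketch '("abc" "xyz"))                            ;=> ""
```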
string-longest-prefix
[procedure] (string-longest-prefix CANDIDATE OTHERS) -> (or boolean string)
Returns the member with the longest common prefix of CANDIDATE from OTHERS, or #f.
- CANDIDATE: string
- OTHERS: (list-of string)
string-longest-suffix
[procedure] (string-longest-suffix CANDIDATE OTHERS) -> (or boolean string)
Returns the member with the longest common suffix of CANDIDATE from OTHERS, or #f.
- CANDIDATE: string
- OTHERS: (list-of string)
String Interpolation
Extends the read-syntax with #"..." where tagged scheme expressions in the string are evaluated at runtime:
#"@ #(+ 1 2)## (#'and #1 #2) = #(and 1 2) trailing #" ;=> "@ 3# (and 1 2) = 2 trailing #"
Similar to the #<# multi-line string.
See Multiline String Constant with Embedded Expressions.
Note: Support for the #{<sexpr>} subform has been dropped, so SRFI 105 can work as expected:
(import (srfi-105 extra))
#"1 + 3 = #{1 + 3}" ;=> "1 + 3 = 4"
#"An \"#{string-append(\"Hello, \" \"World\")}\" example" ;=> "An \"Hello, World\" example"
Usage
(import string-interpolation)
or using UTF8
(import utf8-string-interpolation)
Compiler Command-Line
csc -extend [utf8-]string-interpolation ...
Interpreter Command-Line
csi -require-extension [utf8-]string-interpolation ...
Activates string-interpolation #"..." syntax.
String Interpolation Syntax
Usage
(import string-interpolation-syntax)
set-sharp-string-interpolation-syntax
[procedure] (set-sharp-string-interpolation-syntax PROC)
Extends the read-syntax with #"..." where the "..." is evaluated using (PROC "...").
- PROC: #f ; read-syntax is cleared.
- PROC: #t ; PROC is identity.
- PROC: procedure ; interpolation function.
String Interpolator
Usage
(import string-interpolator)
or using UTF8
(import utf8-string-interpolator)
string-interpolate
[procedure] (string-interpolate STR [eval-tag: EVAL-TAG]) -> list
Performs substitution of embedded Scheme expressions, prefixed with EVAL-TAG. Two consecutive EVAL-TAGs are translated to a single EVAL-TAG. A trailing EVAL-TAG is taken literally.
- STR: string.
- EVAL-TAG: character, default #\#.
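The three tag rules above (tagged expression, doubled tag, trailing tag) can be illustrated with a small scanner in plain Scheme. This is only a sketch of the rules, not the egg's implementation: interpolation-parts-sketch is a hypothetical name, and it returns the literal chunks and the unevaluated expressions as a flat list, which is not necessarily the shape of the list that string-interpolate returns.

```scheme
;; Illustrative sketch only; not the string-interpolator implementation.
;; Splits STR into literal strings and the data that follow each TAG.
(define (interpolation-parts-sketch str tag)
  (let ((in (open-input-string str)))
    (let loop ((chars '()) (parts '()))
      ;; Push the pending literal characters, if any, onto PARTS.
      (define (flush)
        (if (null? chars)
            parts
            (cons (list->string (reverse chars)) parts)))
      (let ((c (read-char in)))
        (cond ((eof-object? c) (reverse (flush)))
              ((not (char=? c tag)) (loop (cons c chars) parts))
              (else
               (let ((next (peek-char in)))
                 (cond ((eof-object? next)       ; trailing tag: literal
                        (loop (cons tag chars) parts))
                       ((char=? next tag)        ; doubled tag: one literal tag
                        (read-char in)
                        (loop (cons tag chars) parts))
                       (else                     ; tag followed by an expression
                        (loop '() (cons (read in) (flush))))))))))))

(interpolation-parts-sketch "@ #(+ 1 2)## x #" #\#)
;=> ("@ " (+ 1 2) "# x #")
```

Evaluating each non-string element and concatenating the results would then give the interpolated string.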
Rabin Karp String Search
Usage
(import rabin-karp)
make-string-search
[procedure] (make-string-search STRINGS [COMPARE [HASH]]) -> SEARCHER
- STRINGS: (list-of string)
- COMPARE: (string string --> boolean)
- HASH: (string [BOUNDS []]) ; SRFI-69 hash procedure.
- SEARCHER: (string [START [END]]) --> RESULT
- RESULT: (or #f (STRING . (START . END))) ; success or failure result
collect-string-search
[procedure] (collect-string-search SEARCHER TARGET) -> (list-of RESULT)
Performs an exhaustive search of the TARGET, returning a list of RESULT.
- SEARCHER: from make-string-search
- TARGET: string ; search within
- RESULT: (or #f (STRING . (START . END))) ; success or failure result
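For readers unfamiliar with the technique, the following is a minimal single-pattern Rabin-Karp sketch in plain Scheme. It is not the egg's implementation, which accepts several search strings plus custom COMPARE and HASH procedures. The names rk-base, rk-modulus, hash-range and rabin-karp-sketch are hypothetical, and treating START as inclusive and END as exclusive in the result is an assumption.

```scheme
;; Illustrative sketch only; not the rabin-karp module's implementation.
(define rk-base 256)
(define rk-modulus 1000003)

;; Polynomial hash of STR over [START, END).
(define (hash-range str start end)
  (let loop ((i start) (h 0))
    (if (= i end)
        h
        (loop (+ i 1)
              (modulo (+ (* h rk-base) (char->integer (string-ref str i)))
                      rk-modulus)))))

;; First match as (PATTERN . (START . END)), or #f when there is none.
(define (rabin-karp-sketch pattern target)
  (let ((m (string-length pattern))
        (n (string-length target)))
    (if (or (= m 0) (> m n))
        #f
        (let ((want (hash-range pattern 0 m))
              ;; rk-base^(m-1) mod rk-modulus: weight of the outgoing character.
              (high (let loop ((i 1) (p 1))
                      (if (= i m) p (loop (+ i 1) (modulo (* p rk-base) rk-modulus))))))
          (let loop ((start 0) (h (hash-range target 0 m)))
            (cond ((and (= h want)
                        ;; Hashes can collide, so confirm with a real comparison.
                        (string=? pattern (substring target start (+ start m))))
                   (cons pattern (cons start (+ start m))))
                  ((= (+ start m) n) #f)
                  (else
                   ;; Roll the window: drop the outgoing char, add the incoming one.
                   (loop (+ start 1)
                         (modulo (+ (* (- h (* (char->integer (string-ref target start)) high))
                                       rk-base)
                                    (char->integer (string-ref target (+ start m))))
                                 rk-modulus)))))))))

(rabin-karp-sketch "needle" "haystack with a needle inside")   ;=> ("needle" 16 . 22)
(rabin-karp-sketch "xyz" "abc")                                ;=> #f
```

The rolling update keeps each window step constant-time, and the explicit string=? check guards against hash collisions.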
Requirements
check-errors miscmacros srfi-1 srfi-13 srfi-69 utf8
Author
Version history
- 2.7.4: More fixnum, add default delimiter for string-split-chars/string-unzip.
- 2.7.3: Add tests, more fixnum, fix signatures.
- 2.7.2: Fix signatures, new test-runner.
- 2.7.1: Fix version.
- 2.7.0: Add rabin-karp module.
- 2.6.0: Remove #{...} support.
- 2.5.6: Reflow.
- 2.5.5: Update test-runner.
- 2.5.4: UTF8.
- 2.5.3: Add string-split-chars.
- 2.5.2: Fix potential buffer overflow in to-hex.
- 2.5.0: Add string-zip & string-unzip.
- 2.4.0: Add string-longest-common-prefix/suffix, string-longest-prefix/suffix, number->padded-string, list-as-string, string-trim-whitespace-both.
- 2.3.2: Deprecate unicode-char->string, fixes for memoized-string & string-utils modules, ascii-codepoint? & unicode-surrogate? are not predicates.
- 2.3.1: Minor optimization.
- 2.3.0: Deprecate #{...} support. Add string-interpolator modules.
- 2.2.0: Fix string-interpolation.
- 2.1.0: Add utf8-string-interpolation.
- 2.0.0: C5 release.
License
Copyright (C) 2010-2024 Kon Lovett. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.