string-utils

Memoized String

Usage

(import memoized-string)

make-string+

[procedure] (make-string+ COUNT [FILL]) -> string

An interning make-string.

FILL is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.

string+

[procedure] (string+ [CHAR...]) -> string

An interning string.

CHAR is any valid char, including codepoints outside of the ASCII range, which produce UTF-8 strings.

global-string

[procedure] (global-string STR) -> string

Share common string space.

String Hexadecimal

Usage

(import string-hexadecimal)

string->hex

[procedure] (string->hex STRING [START [END]]) -> string

Returns a hexadecimal representation of STRING. START and END are substring limits.

STRING is treated as a string of bytes, a byte-vector.
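As an illustration of the byte-wise encoding described above, here is a minimal sketch in portable Scheme. It is not the egg's implementation: sketch-string->hex and hex-digits are hypothetical names, lowercase digits are an assumption, and the input is assumed to be a CHICKEN 5 byte string, where every character code fits in one byte.

```scheme
(import scheme (chicken base))

;; Hypothetical lookup table; lowercase output is an assumption.
(define hex-digits "0123456789abcdef")

;; Sketch only: encode every byte of STR as two hexadecimal digits.
;; Assumes each character code is in the range 0..255.
(define (sketch-string->hex str)
  (let* ((len (string-length str))
         (out (make-string (* 2 len))))   ; two digits per input byte
    (do ((i 0 (+ i 1)))
        ((= i len) out)
      (let ((b (char->integer (string-ref str i))))
        (string-set! out (* 2 i) (string-ref hex-digits (quotient b 16)))
        (string-set! out (+ (* 2 i) 1) (string-ref hex-digits (remainder b 16)))))))

(sketch-string->hex "AZ")
;=> "415a"
```

The START and END limits would correspond to applying the same loop to (substring str start end).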

hex->string

[procedure] (hex->string STRING [START [END]]) -> string

Returns the binary representation of the hexadecimal STRING. START and END are substring limits.
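The reverse direction can be sketched the same way. This is an illustrative sketch, not the egg's code: sketch-hex->string is a hypothetical name, and an even number of valid hexadecimal digits is assumed.

```scheme
(import scheme (chicken base))

;; Sketch only: decode pairs of hexadecimal digits back into bytes.
;; Assumes HEX has even length and contains only hexadecimal digits.
(define (sketch-hex->string hex)
  (let* ((len (string-length hex))
         (out (make-string (quotient len 2))))   ; one byte per two digits
    (assert (even? len))
    (do ((i 0 (+ i 2)))
        ((>= i len) out)
      (string-set! out
                   (quotient i 2)
                   (integer->char
                    (string->number (substring hex i (+ i 2)) 16))))))

(sketch-hex->string "415a")
;=> "AZ"
```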

Hexadecimal Procedures

Usage

(import to-hex)

str_to_hex

[procedure] (str_to_hex OUT IN OFF LEN)

Writes the ASCII hexadecimal representation of IN to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the length of the bytes at OFF.

OUT is a string of length >= (* LEN 2), since each byte is written as two hexadecimal digits.

blob_to_hex

[procedure] (blob_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-blob.

u8vec_to_hex

[procedure] (u8vec_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-u8vector.

s8vec_to_hex

[procedure] (s8vec_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-s8vector.

mem_to_hex

[procedure] (mem_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-c-pointer.

hex_to_str

[procedure] (hex_to_str OUT IN OFF LEN)

Reads the ASCII hexadecimal representation in IN and writes the decoded bytes to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the length of the bytes at OFF.

OUT is a string of length >= (/ LEN 2).

hex_to_blob

[procedure] (hex_to_blob OUT IN OFF LEN)

Like hex_to_str except OUT is a blob of size >= (/ LEN 2).

Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the utf-8 extension.

Usage

(import unicode-utils)

ascii-codepoint?

[procedure] (ascii-codepoint? CHAR) -> boolean

char->unicode-string

[procedure] (char->unicode-string CHAR) -> string

Returns a string formed from Unicode codepoint CHAR.

Note that the string-length of the result (except under utf-8) may not be equal to 1, since a codepoint outside the ASCII range is encoded as more than one byte.

Generates an error should the codepoint be out-of-range.
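To make the length caveat concrete, the following sketch computes how many bytes UTF-8 uses for a codepoint. The helper name sketch-utf8-byte-count is hypothetical and not part of this extension. The ranges are those of the UTF-8 encoding itself.

```scheme
(import scheme (chicken base))

;; Sketch only: number of bytes UTF-8 needs for a non-negative CODEPOINT.
(define (sketch-utf8-byte-count codepoint)
  (cond ((< codepoint #x80) 1)        ; ASCII
        ((< codepoint #x800) 2)
        ((< codepoint #x10000) 3)
        ((< codepoint #x110000) 4)
        (else (error "codepoint out of range" codepoint))))

(sketch-utf8-byte-count #x41)    ;=> 1
(sketch-utf8-byte-count #x20AC)  ;=> 3
```

Under a byte-oriented string-length, a one-character result for U+20AC would therefore report a length of 3.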

unicode-string

[procedure] (unicode-string [CHAR...]) -> string

Returns a string formed from Unicode codepoints CHAR...

Note that the string-length of the result (except under utf-8) may not be equal to the number of CHAR arguments.

Generates an error should any codepoint be out-of-range.

*unicode-string

[procedure] (*unicode-string CHARS) -> string

Returns a string formed from Unicode codepoints CHARS, a (list-of char).

unicode-make-string

[procedure] (unicode-make-string COUNT [FILL]) -> string

Returns a string formed from COUNT occurrences of the Unicode codepoint FILL. The FILL default is #\space.

Note that the string-length of the result (except under utf-8) may not be equal to COUNT.

Generates an error should the codepoint be out-of-range.

unicode-surrogate?

[procedure] (unicode-surrogate? NUM) -> boolean

unicode-surrogates->codepoint

[procedure] (unicode-surrogates->codepoint HIGH LOW) -> (or boolean fixnum)

Returns the codepoint for the valid surrogate pair HIGH and LOW. Otherwise returns #f.
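The mapping from a surrogate pair to a codepoint follows the standard UTF-16 arithmetic. A minimal sketch, using the hypothetical name sketch-surrogates->codepoint rather than the egg's own procedure:

```scheme
(import scheme (chicken base))

;; Sketch only: combine a UTF-16 surrogate pair into a codepoint.
;; Returns #f unless HIGH is in D800..DBFF and LOW is in DC00..DFFF.
(define (sketch-surrogates->codepoint high low)
  (and (<= #xD800 high #xDBFF)
       (<= #xDC00 low #xDFFF)
       (+ #x10000
          (* (- high #xD800) #x400)   ; high surrogate supplies the top 10 bits
          (- low #xDC00))))           ; low surrogate supplies the bottom 10 bits

(sketch-surrogates->codepoint #xD83D #xDE00)
;=> 128512, that is #x1F600
```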

String Utilities

Usage

(import string-utils)

string-split-chars

[procedure] (string-split-chars STR [DELIMITERS]) -> (list-of string) (list-of char)

Returns a list of substrings of STR & a list of the characters, from DELIMITERS, separating those substrings.

STR: string ; version string.
DELIMITERS: string ; string of version component delimiter characters, default ".,".

(string-split-chars "a.2,c" "$,.")
;=> ("a" "2" "c") (#\. #\,)
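The two results are returned as multiple values. The sketch below re-implements the splitting in portable Scheme to show the shape of both lists. sketch-split-chars is a hypothetical name, and its handling of edge cases such as adjacent delimiters is an assumption that may differ from the egg.

```scheme
(import scheme (chicken base))

;; Sketch only: split STR at any character found in DELIMS and
;; return two values, the substrings and the delimiters met.
(define (sketch-split-chars str delims)
  (let ((delim-list (string->list delims)))
    (let loop ((cs (string->list str)) (cur '()) (parts '()) (seps '()))
      (cond ((null? cs)
             (values (reverse (cons (list->string (reverse cur)) parts))
                     (reverse seps)))
            ((memv (car cs) delim-list)
             (loop (cdr cs)
                   '()
                   (cons (list->string (reverse cur)) parts)
                   (cons (car cs) seps)))
            (else
             (loop (cdr cs) (cons (car cs) cur) parts seps))))))

;; Both values are received with call-with-values.
(call-with-values
  (lambda () (sketch-split-chars "a.2,c" "$,."))
  (lambda (parts seps) (list parts seps)))
;=> (("a" "2" "c") (#\. #\,))
```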

string-unzip

[procedure] (string-unzip STR [DELIMITERS]) -> (list-of string) (list-of string)

Returns a list of substrings of STR & a list of the delimiters, from DELIMITERS, separating those substrings.

STR: string ; version string.
DELIMITERS: string ; string of version component delimiter characters, default ".,".

(string-unzip "a.2,c" "$,.")
;=> ("a" "2" "c") ("." ",")

string-zip

[procedure] (string-zip PARTS PUNCS) -> string

Returns a string formed from the concatenation of the PARTS and the interspersion of the PUNCS.

PARTS: (list-of string) ; version components.
PUNCS: (list-of string) ; version component separators.

(string-zip '("a" "2" "c") '("." ","))
;=> "a.2,c"
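The interspersion can be pictured as alternating one part with one separator. A minimal sketch, under the assumption that leftover parts are simply appended once the separators run out. sketch-zip is a hypothetical name, not the egg's procedure.

```scheme
(import scheme (chicken base))

;; Sketch only: concatenate PARTS, placing the next element of PUNCS
;; after each part for as long as separators remain.
(define (sketch-zip parts puncs)
  (let loop ((ps parts) (qs puncs) (acc '()))
    (cond ((null? ps) (apply string-append (reverse acc)))
          ((null? qs) (loop (cdr ps) qs (cons (car ps) acc)))
          (else (loop (cdr ps)
                      (cdr qs)
                      (cons (car qs) (cons (car ps) acc)))))))

(sketch-zip '("a" "2" "c") '("." ","))
;=> "a.2,c"
```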

string-trim-whitespace-both

[procedure] (string-trim-whitespace-both S) -> string

Returns the string S with whitespace trimmed from both ends.

list-as-string

[procedure] (list-as-string LS) -> string

Returns the list LS written to a string.

number->padded-string

[procedure] (number->padded-string N WIDTH [PADCHAR [BASE]]) -> string

N: number ; source
WIDTH: fixnum ; field width
PADCHAR: char ; padding character
BASE: fixnum ; number conversion base

string-fixed-length

[procedure] (string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string

Returns the string S with the string-length fixed to N.

A shorter string is padded. A longer string is truncated, & suffixed with the trailing.

string-subsequence?

[procedure] (string-subsequence? S T) -> boolean

Returns whether the characters of S occur, in order, through T. The characters need not be adjacent in T.
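The subsequence test can be sketched as a single left-to-right scan of T. This is an illustration in portable Scheme, not the egg's code, and sketch-subsequence? is a hypothetical name.

```scheme
(import scheme (chicken base))

;; Sketch only: #t when every character of S appears in T in the
;; same order, possibly with other characters in between.
(define (sketch-subsequence? s t)
  (let ((slen (string-length s))
        (tlen (string-length t)))
    (let loop ((i 0) (j 0))
      (cond ((= i slen) #t)                       ; all of S matched
            ((= j tlen) #f)                       ; T exhausted first
            ((char=? (string-ref s i) (string-ref t j))
             (loop (+ i 1) (+ j 1)))
            (else (loop i (+ j 1)))))))

(sketch-subsequence? "ace" "abcde") ;=> #t
(sketch-subsequence? "aec" "abcde") ;=> #f
```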

string-longest-common-prefix

[procedure] (string-longest-common-prefix STRINGS) -> string

Returns the longest common prefix of STRINGS.

STRINGS: (list-of string)
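One way to picture the result is to shrink a candidate prefix against each string in turn. The sketch below does that in portable Scheme. The names common-prefix-length and sketch-longest-common-prefix are hypothetical, and returning "" for an empty list is an assumption.

```scheme
(import scheme (chicken base))

;; Hypothetical helper: number of leading characters A and B share.
(define (common-prefix-length a b)
  (let ((n (min (string-length a) (string-length b))))
    (let loop ((i 0))
      (if (and (< i n) (char=? (string-ref a i) (string-ref b i)))
          (loop (+ i 1))
          i))))

;; Sketch only: fold the list, trimming the prefix at each step.
(define (sketch-longest-common-prefix strings)
  (if (null? strings)
      ""
      (let loop ((prefix (car strings)) (rest (cdr strings)))
        (if (null? rest)
            prefix
            (loop (substring prefix 0 (common-prefix-length prefix (car rest)))
                  (cdr rest))))))

(sketch-longest-common-prefix '("interstellar" "internet" "interval"))
;=> "inter"
```

The suffix variant would follow the same scheme, comparing from the ends of the strings.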

string-longest-common-suffix

[procedure] (string-longest-common-suffix STRINGS) -> string

Returns the longest common suffix of STRINGS.

STRINGS: (list-of string)

string-longest-prefix

[procedure] (string-longest-prefix CANDIDATE OTHERS) -> (or boolean string)

Returns the member with the longest common prefix of CANDIDATE from OTHERS, or #f.

CANDIDATE: string
OTHERS: (list-of string)

string-longest-suffix

[procedure] (string-longest-suffix CANDIDATE OTHERS) -> (or boolean string)

Returns the member with the longest common suffix of CANDIDATE from OTHERS, or #f.

CANDIDATE: string
OTHERS: (list-of string)

String Interpolation

Extends the read-syntax with #"..." where tagged scheme expressions in the string are evaluated at runtime:

#"@ #(+ 1 2)## (#'and #1 #2) = #(and 1 2) trailing #"
;=> "@ 3# (and 1 2) = 2 trailing #"

Similar to the #<# multi-line string.

See Multiline String Constant with Embedded Expressions.

Note: support for the #{<sexpr>} subform has been dropped, so that SRFI 105 can work as expected:

(import (srfi-105 extra))
#"1 + 3 = #{1 + 3}"
;=> "1 + 3 = 4"
#"An \"#{string-append(\"Hello, \" \"World\")}\" example"
;=> "An \"Hello, World\" example"

Usage

(import string-interpolation)

or using UTF8

(import utf8-string-interpolation)

Compiler Command-Line

csc -extend [utf8-]string-interpolation ...

Interpreter Command-Line

csi -require-extension [utf8-]string-interpolation ...

Activates string-interpolation #"..." syntax.

String Interpolation Syntax

Usage

(import string-interpolation-syntax)

set-sharp-string-interpolation-syntax

[procedure] (set-sharp-string-interpolation-syntax PROC)

Extends the read-syntax with #"..." where the "..." is evaluated using (PROC "...").

PROC: #f ; read-syntax is cleared.
PROC: #t ; PROC is identity.
PROC: procedure ; interpolation function.

String Interpolator

Usage

(import string-interpolator)

or using UTF8

(import utf8-string-interpolator)

string-interpolate

[procedure] (string-interpolate STR [eval-tag: EVAL-TAG]) -> list

Performs substitution of embedded Scheme expressions, prefixed with EVAL-TAG. Two consecutive EVAL-TAGs are translated to a single EVAL-TAG. A trailing EVAL-TAG is taken literally.

STR: string.
EVAL-TAG: character, default #\#.

Rabin Karp String Search

Usage

(import rabin-karp)

make-string-search

[procedure] (make-string-search STRINGS [COMPARE [HASH]]) -> SEARCHER

STRINGS: (list-of string) ;
COMPARE: (string string --> boolean) ;
HASH: (string [BOUNDS []]) ; SRFI-69 hash procedure.
SEARCHER: (string [START [END]]) --> RESULT
RESULT: (or #f (STRING . (START . END))) ; success or failure result

collect-string-search

[procedure] (collect-string-search SEARCHER TARGET) -> (list-of RESULT)

Performs an exhaustive search of the TARGET, returning a list of RESULT.

SEARCHER: from make-string-search
TARGET: string ; search within
RESULT: (or #f (STRING . (START . END))) ; success or failure result
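For readers unfamiliar with the technique, here is a minimal single-pattern Rabin-Karp sketch in portable Scheme. It is not the egg's implementation, which accepts several STRINGS plus custom COMPARE and HASH procedures. The names rk-base, rk-modulus, hash-range and sketch-rabin-karp are hypothetical, and treating END as exclusive in the result is an assumption.

```scheme
(import scheme (chicken base))

(define rk-base 256)         ; radix of the polynomial hash
(define rk-modulus 1000003)  ; a prime keeping hash values small

;; Hash of the LEN characters of STR beginning at START.
(define (hash-range str start len)
  (let loop ((i 0) (h 0))
    (if (= i len)
        h
        (loop (+ i 1)
              (modulo (+ (* h rk-base)
                         (char->integer (string-ref str (+ start i))))
                      rk-modulus)))))

;; Sketch only: first match of PATTERN in TARGET as
;; (PATTERN . (START . END)), or #f when there is none.
(define (sketch-rabin-karp pattern target)
  (let ((m (string-length pattern))
        (n (string-length target)))
    (and (<= m n)
         (let ((phash (hash-range pattern 0 m))
               ;; rk-base^(m-1) mod rk-modulus: weight of the outgoing char
               (high (let loop ((i 1) (p 1))
                       (if (>= i m)
                           p
                           (loop (+ i 1) (modulo (* p rk-base) rk-modulus))))))
           (let loop ((start 0) (h (hash-range target 0 m)))
             (cond ((and (= h phash)   ; hashes agree, so confirm by comparing
                         (string=? pattern
                                   (substring target start (+ start m))))
                    (cons pattern (cons start (+ start m))))
                   ((>= (+ start m) n) #f)
                   (else
                    ;; Roll the window: drop target[start], add target[start+m].
                    (loop (+ start 1)
                          (modulo
                           (+ (* (- h (* (char->integer (string-ref target start))
                                         high))
                                 rk-base)
                              (char->integer (string-ref target (+ start m))))
                           rk-modulus)))))))))

(sketch-rabin-karp "lo w" "hello world")
;=> ("lo w" 3 . 7)
```

The rolling update is what makes each window hash cheap to compute. The explicit string=? check guards against hash collisions.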

Requirements

check-errors miscmacros srfi-1 srfi-13 srfi-69 utf8

test test-utils

Version history

2.8.1: Feature string-interpolation.
2.8.0: Added .
2.7.5: Fixed .
2.7.4: More fixnum, add string-subsequence?, better string-longest-common-prefix/suffix.
2.7.4: More fixnum, add default delimiter for string-split-chars/string-unzip.
2.7.3: Add tests, more fixnum, fix signatures.
2.7.2: Fix signatures, new test-runner.
2.7.1: Fix version.
2.7.0: Add rabin-karp module.
2.6.0: Remove #{...} support.
2.5.6: Reflow.
2.5.5: Update test-runner.
2.5.4: UTF8.
2.5.3: Add string-split-chars.
2.5.2: Fix potential buffer overflow in to-hex.
2.5.0: Add string-zip & string-unzip.
2.4.0: Add string-longest-common-prefix/suffix, string-longest-prefix/suffix, number->padded-string, list-as-string, string-trim-whitespace-both.
2.3.2: Deprecate unicode-char->string, fixes for memoized-string & string-utils modules, ascii-codepoint? & unicode-surrogate? are not predicates.
2.3.1: Minor optimization.
2.3.0: Deprecate #{...} support. Add string-interpolator modules.
2.2.0: Fix string-interpolation.
2.1.0: Add utf8-string-interpolation.
2.0.0: C5 release.

License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

 Redistributions of source code must retain the above copyright notice, this list of conditions and the following
   disclaimer.
 Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
   disclaimer in the documentation and/or other materials provided with the distribution.
 Neither the name of the author nor the names of its contributors may be used to endorse or promote
   products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
