string-utils

Documentation

Memoized String

Usage

(import memoized-string)

make-string+

[procedure] (make-string+ COUNT [FILL]) -> string

A tabling (memoizing) variant of make-string.

FILL is any valid char, including codepoints outside of the ASCII range. As such UTF-8 strings can be memoized.
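The tabling idea can be sketched with an association list keyed on the arguments. This is only an illustration under stated assumptions: sketch-make-string+ and *string-table* are hypothetical names, and the egg's actual table structure and sharing behaviour may differ.

```scheme
;; Hypothetical sketch of a tabling make-string; not the egg's implementation.
;; Results are cached in an association list keyed on (COUNT . FILL).
(define *string-table* '())

(define (sketch-make-string+ count fill)
  (let* ((key (cons count fill))
         (hit (assoc key *string-table*)))   ; assoc compares keys with equal?
    (if hit
        (cdr hit)                            ; reuse the cached string
        (let ((s (make-string count fill)))
          (set! *string-table* (cons (cons key s) *string-table*))
          s))))
```

In this sketch repeated calls with the same arguments return the same string object, so a caller that mutates the result would affect every other holder of it.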

string+

[procedure] (string+ [CHAR...]) -> string

A tabling (memoizing) variant of string.

CHAR is any valid char, including codepoints outside of the ASCII range. As such UTF-8 strings can be memoized.

global-string

[procedure] (global-string STR) -> string

Share common string space.

make-string* (DEPRECATED)

[procedure] (make-string* COUNT [FILL]) -> string

String Hexadecimal

Usage

(import string-hexadecimal)

string->hex

[procedure] (string->hex STRING [START [END]]) -> string

Returns a hexadecimal representation of STRING. START and END are substring limits.

STRING is treated as a string of bytes, a byte-vector.
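The byte-to-hexadecimal mapping can be sketched in plain Scheme. This is a minimal illustration, not the egg's code: sketch-string->hex is a hypothetical name, the START and END arguments are omitted, the input is assumed to hold only characters with codes below 256, and the choice of lowercase digits is an assumption.

```scheme
;; Hypothetical sketch of hexadecimal encoding; not the egg's implementation.
;; Each input byte becomes two hexadecimal digits, so the output is twice as long.
(define (sketch-string->hex str)
  (let* ((digits "0123456789abcdef")         ; lowercase is an assumption
         (len (string-length str))
         (out (make-string (* 2 len))))
    (do ((i 0 (+ i 1)))
        ((= i len) out)
      (let ((byte (char->integer (string-ref str i))))
        ;; high nibble first, then low nibble
        (string-set! out (* 2 i) (string-ref digits (quotient byte 16)))
        (string-set! out (+ (* 2 i) 1) (string-ref digits (remainder byte 16)))))))

(sketch-string->hex "abc") ; => "616263"
```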

hex->string

[procedure] (hex->string STRING [START [END]]) -> string

Returns the binary representation of a hexadecimal STRING. START and END are substring limits.
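The reverse mapping can be sketched the same way: every pair of hexadecimal digits yields one byte. The names sketch-hex->string and hex-digit-value are hypothetical, START and END are omitted, and how the egg treats odd-length or malformed input is not stated in this page, so the behaviour here (ignore a dangling digit, signal an error on a non-digit) is an assumption.

```scheme
;; Hypothetical sketch of hexadecimal decoding; not the egg's implementation.
(define (hex-digit-value ch)
  (let ((c (char-downcase ch)))
    (cond ((and (char>=? c #\0) (char<=? c #\9))
           (- (char->integer c) (char->integer #\0)))
          ((and (char>=? c #\a) (char<=? c #\f))
           (+ 10 (- (char->integer c) (char->integer #\a))))
          (else (error "not a hexadecimal digit" ch)))))

(define (sketch-hex->string hex)
  (let* ((len (string-length hex))
         (out (make-string (quotient len 2))))
    ;; consume two digits per output byte; a dangling final digit is ignored
    (do ((i 0 (+ i 2)))
        ((>= (+ i 1) len) out)
      (string-set! out (quotient i 2)
                   (integer->char
                    (+ (* 16 (hex-digit-value (string-ref hex i)))
                       (hex-digit-value (string-ref hex (+ i 1)))))))))

(sketch-hex->string "616263") ; => "abc"
```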

Hexadecimal Procedures

Usage

(import to-hex)

str_to_hex

[procedure] (str_to_hex OUT IN OFF LEN)

Writes the ASCII hexadecimal representation of IN to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the length of the bytes at OFF.

OUT is a string of length >= (* LEN 2).

blob_to_hex

[procedure] (blob_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-blob.

u8vec_to_hex

[procedure] (u8vec_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-u8vector.

s8vec_to_hex

[procedure] (s8vec_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-s8vector.

mem_to_hex

[procedure] (mem_to_hex OUT IN OFF LEN)

Like str_to_hex except IN is a nonnull-c-pointer.

hex_to_str

[procedure] (hex_to_str OUT IN OFF LEN)

Reads the ASCII hexadecimal representation in IN and writes the decoded bytes to OUT.

IN is a nonnull-string.

OFF is the byte offset.

LEN is the length of the bytes at OFF.

OUT is a string of length >= (/ LEN 2).

hex_to_blob

[procedure] (hex_to_blob OUT IN OFF LEN)

Like hex_to_str except OUT is a blob of size >= (/ LEN 2).

Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the utf-8 extension.

Usage

(import unicode-utils)

ascii-codepoint?

[procedure] (ascii-codepoint? CHAR) -> boolean

unicode-char->string

[procedure] (unicode-char->string CHAR) -> string

Returns a string formed from Unicode codepoint CHAR.

Note that the string-length of the result (except under utf-8) may not be equal to 1.

Generates an error should the codepoint be out-of-range.
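The reason the byte length can exceed 1 is that UTF-8 encodes a codepoint in one to four bytes. The sketch below shows the standard UTF-8 encoding rule as a list of byte values; sketch-utf8-bytes is a hypothetical name, and the sketch does not claim to mirror how the egg builds its strings.

```scheme
;; Hypothetical sketch of UTF-8 encoding for a single codepoint.
;; Returns the encoded bytes as a list of integers.
(define (sketch-utf8-bytes cp)
  (cond ((< cp #x80)                          ; 1 byte: 0xxxxxxx
         (list cp))
        ((< cp #x800)                         ; 2 bytes: 110xxxxx 10xxxxxx
         (list (+ #xC0 (quotient cp #x40))
               (+ #x80 (remainder cp #x40))))
        ((< cp #x10000)                       ; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         (list (+ #xE0 (quotient cp #x1000))
               (+ #x80 (remainder (quotient cp #x40) #x40))
               (+ #x80 (remainder cp #x40))))
        ((< cp #x110000)                      ; 4 bytes: 11110xxx then three 10xxxxxx
         (list (+ #xF0 (quotient cp #x40000))
               (+ #x80 (remainder (quotient cp #x1000) #x40))
               (+ #x80 (remainder (quotient cp #x40) #x40))
               (+ #x80 (remainder cp #x40))))
        (else (error "codepoint out of range" cp))))

(sketch-utf8-bytes #x03BB) ; => (206 187), i.e. #xCE #xBB
```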

unicode-string

[procedure] (unicode-string [CHAR...]) -> string

Returns a string formed from Unicode codepoints CHAR...

Note that the string-length of the result (except under utf-8) may not be equal to the number of CHAR arguments.

Generates an error should the codepoint be out-of-range.

*unicode-string

[procedure] (*unicode-string CHARS) -> string

Returns a string formed from Unicode codepoints CHARS, a (list-of char).

unicode-make-string

[procedure] (unicode-make-string COUNT [FILL]) -> string

Returns a string formed from COUNT occurrences of the Unicode codepoint FILL. The FILL default is #\space.

Note that the string-length of the result (except under utf-8) may not be equal to COUNT.

Generates an error should the codepoint be out-of-range.

unicode-surrogate?

[procedure] (unicode-surrogate? NUM) -> boolean

unicode-surrogates->codepoint

[procedure] (unicode-surrogates->codepoint HIGH LOW) -> (or boolean fixnum)

Returns the codepoint for the valid surrogate pair HIGH and LOW. Otherwise returns #f.
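The surrogate-pair arithmetic follows the UTF-16 definition: a high surrogate lies in #xD800..#xDBFF, a low surrogate in #xDC00..#xDFFF, and each contributes 10 bits to a codepoint above #xFFFF. The sketch below assumes HIGH and LOW are integers; sketch-surrogate? and sketch-surrogates->codepoint are hypothetical names, not the egg's code.

```scheme
;; Hypothetical sketch of UTF-16 surrogate handling.
(define (sketch-surrogate? n)
  (and (>= n #xD800) (<= n #xDFFF)))

(define (sketch-surrogates->codepoint high low)
  ;; `and` returns #f as soon as a range check fails,
  ;; otherwise the value of the final expression.
  (and (>= high #xD800) (<= high #xDBFF)
       (>= low #xDC00) (<= low #xDFFF)
       (+ #x10000
          (* (- high #xD800) #x400)          ; high 10 bits
          (- low #xDC00))))                  ; low 10 bits

(sketch-surrogates->codepoint #xD83D #xDE00) ; => 128512, i.e. #x1F600
```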

String Utilities

Usage

(import string-utils)

string-fixed-length

[procedure] (string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string

Returns the string S with the string-length fixed to N.

A shorter string is padded with pad-char. A longer string is truncated and suffixed with the trailing string.
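The pad-or-truncate rule can be sketched as follows. Several details are assumptions, since this page does not state them: padding is added on the right, the trailing string counts toward N, and when N is too small to hold the trailing string the input is simply cut. sketch-fixed-length is a hypothetical name and takes positional arguments rather than keywords.

```scheme
;; Hypothetical sketch of fixing a string's length; not the egg's implementation.
(define (sketch-fixed-length s n pad-char trailing)
  (let ((len (string-length s))
        (tlen (string-length trailing)))
    (cond ((= len n) s)
          ((< len n)                          ; too short: pad on the right
           (string-append s (make-string (- n len) pad-char)))
          ((<= n tlen)                        ; no room for the trailing string
           (substring s 0 n))
          (else                               ; too long: truncate, then suffix
           (string-append (substring s 0 (- n tlen)) trailing)))))

(sketch-fixed-length "abc" 5 #\space "...")        ; => "abc  "
(sketch-fixed-length "abcdefghij" 6 #\space "...") ; => "abc..."
```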

string-longest-common-prefix

[procedure] (string-longest-common-prefix CANDIDATE OTHERS) -> (or boolean string)

Returns the member of OTHERS with the longest common prefix of CANDIDATE, or #f.

CANDIDATE: string
OTHERS: (list-of string)
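One way to read this description is sketched below: measure the shared prefix between CANDIDATE and each member of OTHERS and keep the member with the longest one. The names are hypothetical, and the tie-breaking rule (first member wins) and the #f result when no member shares any prefix are assumptions.

```scheme
;; Hypothetical sketch; not the egg's implementation.
;; Number of leading characters that A and B have in common.
(define (common-prefix-length a b)
  (let ((limit (min (string-length a) (string-length b))))
    (let loop ((i 0))
      (if (and (< i limit) (char=? (string-ref a i) (string-ref b i)))
          (loop (+ i 1))
          i))))

(define (sketch-longest-common-prefix candidate others)
  (let loop ((rest others) (best #f) (best-len 0))
    (if (null? rest)
        best                                  ; #f when nothing matched
        (let ((n (common-prefix-length candidate (car rest))))
          (if (> n best-len)
              (loop (cdr rest) (car rest) n)
              (loop (cdr rest) best best-len))))))

(sketch-longest-common-prefix "foobar" '("fix" "foo" "bar")) ; => "foo"
```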

String Interpolation

Extends the read-syntax with #"..." where tagged Scheme expressions in the string are evaluated at runtime.

Similar to the #<# multi-line string.

See Multiline String Constant with Embedded Expressions.

(import utf8-string-interpolation)

#"@ #(+ 1 2)## (#'and #1 #2) = #(and 1 2) trailing #"
;=> "@ 3# (and 1 2) = 2 trailing #"

Note: Support for the #{<sexpr>} subform is deprecated; use the #<sexpr> form.

Usage

(import string-interpolation) ;or (import utf8-string-interpolation)
csc -extend string-interpolation ...
csi -require-extension string-interpolation ...

Activates string-interpolation.

Automatically invokes (set-sharp-string-interpolation-syntax string-interpolate) on load.

Usage

(import string-interpolation-syntax)

set-sharp-string-interpolation-syntax

[procedure] (set-sharp-string-interpolation-syntax PROC)

Extends the read-syntax with #"..." where the "..." is evaluated using (PROC "...").

PROC: #f ; read-syntax is cleared.
PROC: #t ; PROC is identity.
PROC: procedure ; interpolation function.

Usage

(import string-interpolator) ;or (import utf8-string-interpolator)

string-interpolate

[procedure] (string-interpolate STR [eval-tag: EVAL-TAG]) -> list

Performs substitution of embedded Scheme expressions, prefixed with EVAL-TAG. Two consecutive EVAL-TAGs are translated to a single EVAL-TAG. A trailing EVAL-TAG is taken literally.

STR: string.
EVAL-TAG: character, default #\#.
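The tag rules described above (a tagged expression, a doubled tag, a trailing tag) can be sketched with a string port and read. This is an illustration only: sketch-interpolate-parts is a hypothetical name, it returns a flat list of literal strings and unevaluated expressions, and it does not reproduce the form that string-interpolate actually returns.

```scheme
;; Hypothetical sketch of splitting a string at an eval tag; not the egg's code.
(define (sketch-interpolate-parts str tag)
  (let ((in (open-input-string str)))
    (let loop ((chars '()) (parts '()))
      ;; push the pending literal characters, if any, onto the parts list
      (define (flush)
        (if (null? chars)
            parts
            (cons (list->string (reverse chars)) parts)))
      (let ((c (read-char in)))
        (cond ((eof-object? c) (reverse (flush)))
              ((not (char=? c tag)) (loop (cons c chars) parts))
              (else
               (let ((next (peek-char in)))
                 (cond ((eof-object? next)    ; trailing tag: taken literally
                        (loop (cons c chars) parts))
                       ((char=? next tag)     ; doubled tag: one literal tag
                        (read-char in)
                        (loop (cons c chars) parts))
                       (else                  ; tagged expression: read one datum
                        (loop '() (cons (read in) (flush))))))))))))

(sketch-interpolate-parts "a #(+ 1 2)## b #" #\#)
;; => ("a " (+ 1 2) "# b #")
```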

Requirements

check-errors miscmacros srfi-1 srfi-13 srfi-69 utf8

test

Author

Kon Lovett

Version history

2.3.0: Deprecate #{...} support. Add string-interpolator modules.
2.2.0: Fix string-interpolation.
2.1.0: Add utf8-string-interpolation.
2.0.0: C5 release.
1.6.0: Add string-utils-extensions.
1.5.6: Add types.
1.5.5
1.5.4
1.5.3: memorize-string -> global-string.
1.5.2: Fix string+ & memorize-string.
1.5.1: Fix string+ unicode support.
1.5.0: Deprecate make-string* for make-string+, add memorize-string & string+.
1.4.0: Add string-interpolation modules.
1.3.1: Fix hex_to_str, hex_to_blob.
1.3.0: Add hex->string, hex_to_str, hex_to_blob.
1.2.5: Remove lookup-table.
1.2.2: Unicode string construction a little faster. Removed blob->hex.
1.2.1: Added blob->hex.
1.2.0: Added "generic" bytes to hexadecimal string.
1.1.0: Split into separate modules. Added some UTF-8 support.
1.0.0: Hello

License

Copyright (C) 2010-2020 Kon Lovett. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.