
icu

CHICKEN bindings to the ICU Unicode library

Selected bindings to the ICU Unicode library.

Introduction

This library is partially inspired by Python's unicodedata library. Because it deals with Unicode, it also re-exports the utf8 egg for ease of use.

Procedures

Names

[procedure] (char-from-name name)

Returns the character whose Unicode name is the string name. name is passed through string-upcase first, so the lookup is case-insensitive.

(char-from-name "fire") ;; => #\x1f525
(char-from-name "FIRE") ;; => #\x1f525
[procedure] (char-string-name char)

Returns the Unicode name of char as a string.

(char-string-name #\x1f525) ;; => "FIRE"
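
The two procedures above can be combined to canonicalize a name supplied in any case. This is a minimal sketch, assuming CHICKEN 5 and that the egg installs a module named icu; canonical-name is a hypothetical helper, not part of the egg.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; Hypothetical helper: look the character up by name, then ask for
;; its name again to get the canonical (uppercase) spelling.
(define (canonical-name name)
  (char-string-name (char-from-name name)))

(canonical-name "fire") ;; => "FIRE"
```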

Decomposition and Normalization

[procedure] (char-decomposition char)

Returns the decomposition mapping of char as a list of characters.

For example, for ¼, VULGAR FRACTION ONE QUARTER:

(char-decomposition #\xBC) ;; => '(#\1 #\x2044 #\4)
[procedure] (string-normalize input #!optional (form "nfkc"))

Returns the normalized form of input according to form.

form
Any of "nfc", "nfkc", "nfd", or "nfkd"
(string-normalize "¼") ;; => "1⁄4" (the slash is U+2044 FRACTION SLASH)
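
A common use of normalization is comparing strings that look identical but are encoded differently. The sketch below assumes CHICKEN 5 and a module named icu; nfc-equal? is a hypothetical helper introduced only for illustration.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; "é" written two ways: precomposed U+00E9, and "e" followed by
;; U+0301 COMBINING ACUTE ACCENT.
(define precomposed "\u00e9")
(define decomposed "e\u0301")

;; Hypothetical helper: compare two strings after normalizing both to NFC.
(define (nfc-equal? a b)
  (string=? (string-normalize a "nfc")
            (string-normalize b "nfc")))

(string=? precomposed decomposed)   ;; => #f
(nfc-equal? precomposed decomposed) ;; => #t
```

Normalizing both sides to the same form before comparing avoids depending on how the input happened to be encoded.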

Numbers

[procedure] (char-digit-value char)

Binding for u_charDigitValue. Returns the decimal digit value of a decimal digit character.

(char-digit-value #\4) ;; => 4
[procedure] (char-numeric-value char)

Binding for u_getNumericValue. Returns the numeric value (as a double) of a Unicode code point as defined in the Unicode Character Database.

(char-numeric-value #\4) ;; => 4.0
(char-numeric-value #\xBC) ;; => 0.25
[procedure] (char-digit char radix)

Binding for u_digit. Returns the value of the code point as a digit in the specified radix.

(char-digit #\f 16) ;; => 15
[procedure] (char-for-digit digit radix)

Binding for u_forDigit. Determines the character representation for a specific digit in the specified radix.

(char-for-digit 15 16) ;; => #\f
[procedure] (char-digit? char)

Binding for u_isdigit. Determines whether the specified code point is a digit character according to Java.

[procedure] (char-xdigit? char)

Binding for u_isxdigit. Determines whether the specified code point is a hexadecimal digit.
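
As an illustration of char-digit, a string of digits can be folded into an integer. This is a sketch under stated assumptions: CHICKEN 5, a module named icu, and input that contains only valid digits for the radix. digits->integer is a hypothetical helper, not part of the egg.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; Hypothetical helper: parse a string of digits in the given radix.
;; It assumes every character is a valid digit for that radix; what
;; char-digit returns for an invalid character is not documented here,
;; so validate the input first if it may be malformed.
(define (digits->integer str radix)
  (let loop ((chars (string->list str)) (acc 0))
    (if (null? chars)
        acc
        (loop (cdr chars)
              (+ (* acc radix) (char-digit (car chars) radix))))))

(digits->integer "ff" 16) ;; => 255
```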

Operators and transformers

[procedure] (char-mirror char)

Binding for u_charMirror. Maps the specified character to a "mirror-image" character.

[procedure] (char-bidi-paired-bracket char)

Binding for u_getBidiPairedBracket. Maps the specified character to its paired bracket character.

[procedure] (char->lower char)
[procedure] (char->upper char)
[procedure] (char->title char)

Bindings for u_tolower, u_toupper, and u_totitle, respectively.
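
A short sketch of these mappings in use, assuming CHICKEN 5, a module named icu, and that each binding returns a character as the underlying ICU function does. upcase-chars is a hypothetical helper.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; ICU maps an opening parenthesis to its mirror image.
(char-mirror #\() ;; => #\)

;; Hypothetical helper: apply the single-character uppercase mapping
;; to every character of a string. This is a per-character mapping,
;; not full Unicode string case conversion.
(define (upcase-chars str)
  (list->string (map char->upper (string->list str))))

(upcase-chars "hello") ;; => "HELLO"
```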

Properties

[procedure] (char-category char)

Binding for u_charType. Returns the general category value for the code point (an integer, see below).

You can convert this to a symbol with integer->category, and vice versa with category->integer.

Categories:

[constant] category/unassigned
[constant] category/uppercase-letter
[constant] category/lowercase-letter
[constant] category/titlecase-letter
[constant] category/modifier-letter
[constant] category/other-letter
[constant] category/non-spacing-mark
[constant] category/enclosing-mark
[constant] category/combining-spacing-mark
[constant] category/decimal-digit-number
[constant] category/letter-number
[constant] category/other-number
[constant] category/space-separator
[constant] category/line-separator
[constant] category/paragraph-separator
[constant] category/control-char
[constant] category/format-char
[constant] category/private-use-char
[constant] category/surrogate
[constant] category/dash-punctuation
[constant] category/start-punctuation
[constant] category/end-punctuation
[constant] category/connector-punctuation
[constant] category/other-punctuation
[constant] category/math-symbol
[constant] category/currency-symbol
[constant] category/modifier-symbol
[constant] category/other-symbol
[constant] category/initial-punctuation
[constant] category/final-punctuation
[constant] category/char-category-count
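
The category constants can be compared against the result of char-category. The sketch below assumes CHICKEN 5, a module named icu, and that the constants are plain integers comparable with =. uppercase-letter? is a hypothetical predicate.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; Hypothetical predicate: is char in the general category
;; "uppercase letter"? Assumes the category/... constants are integers.
(define (uppercase-letter? char)
  (= (char-category char) category/uppercase-letter))

(uppercase-letter? #\A) ;; => #t
(uppercase-letter? #\a) ;; => #f
```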

[procedure] (char-direction char)

Binding for u_charDirection. Returns the bidirectional category value for the code point, which is used in the Unicode bidirectional algorithm (an integer, see below).

You can convert this to a symbol with integer->direction, and vice versa with direction->integer.

Directions:

[constant] direction/left-to-right
[constant] direction/right-to-left
[constant] direction/european-number
[constant] direction/european-number-separator
[constant] direction/european-number-terminator
[constant] direction/arabic-number
[constant] direction/common-number-separator
[constant] direction/block-separator
[constant] direction/segment-separator
[constant] direction/white-space-neutral
[constant] direction/other-neutral
[constant] direction/left-to-right-embedding
[constant] direction/left-to-right-override
[constant] direction/right-to-left-arabic
[constant] direction/right-to-left-embedding
[constant] direction/right-to-left-override
[constant] direction/pop-directional-format
[constant] direction/dir-non-spacing-mark
[constant] direction/boundary-neutral
[constant] direction/first-strong-isolate
[constant] direction/left-to-right-isolate
[constant] direction/right-to-left-isolate
[constant] direction/pop-directional-isolate
[constant] direction/char-direction-count
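
Direction values can be tested the same way, for example to detect strongly right-to-left characters. This sketch assumes CHICKEN 5, a module named icu, and integer constants; rtl-char? is a hypothetical predicate.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; Hypothetical predicate: does char have a strong right-to-left
;; direction? Hebrew letters report right-to-left, Arabic letters
;; report right-to-left-arabic, so both are checked.
(define (rtl-char? char)
  (let ((dir (char-direction char)))
    (or (= dir direction/right-to-left)
        (= dir direction/right-to-left-arabic))))

(rtl-char? #\x5d0) ;; HEBREW LETTER ALEF => #t
(rtl-char? #\a)    ;; => #f
```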

[procedure] (char-combining-class char)

Binding for u_getCombiningClass. Returns the combining class of the code point as specified in UnicodeData.txt.

Predicates

[procedure] (char-mirrored? char)
[procedure] (char-ualphabetic? char)
[procedure] (char-ulowercase? char)
[procedure] (char-uuppercase? char)
[procedure] (char-uwhitespace? char)
[procedure] (char-whitespace? char)
[procedure] (char-java-space? char)
[procedure] (char-space? char)
[procedure] (char-blank? char)
[procedure] (char-lower? char)
[procedure] (char-upper? char)
[procedure] (char-alpha? char)
[procedure] (char-alnum? char)
[procedure] (char-punct? char)
[procedure] (char-graph? char)
[procedure] (char-defined? char)
[procedure] (char-cntrl? char)
[procedure] (char-iso-control? char)
[procedure] (char-print? char)
[procedure] (char-base? char)
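
The predicates are not described individually above; their names suggest they wrap the corresponding ICU tests (for example u_isalpha for char-alpha?), but that is an assumption. A minimal sketch of using one as a filter, assuming CHICKEN 5, a module named icu, and that each predicate returns a boolean; count-matching is a hypothetical helper.

```scheme
;; Assumption: CHICKEN 5, with the egg available as the module icu.
(import icu)

;; Hypothetical helper: count the characters of str that satisfy pred.
(define (count-matching pred str)
  (let loop ((chars (string->list str)) (n 0))
    (cond ((null? chars) n)
          ((pred (car chars)) (loop (cdr chars) (+ n 1)))
          (else (loop (cdr chars) n)))))

;; Count the alphabetic characters in a mixed string.
(count-matching char-alpha? "abc 123")
```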

Author

Diego A. Mundo

License

ICU License

Version History

0.3.2
Document with chalk
0.3.1
Fix issue with utf8 reexports
0.3.0
Slight API change
0.2.0
Make string-normalize form parameter optional
0.1.0
Initial version