icu

CHICKEN bindings to the ICU Unicode library

  1. icu
    1. Module: icu
      1. Names
        1. char-from-name
        2. char-string-name
      2. Decomposition and Normalization
        1. char-decomposition
        2. string-normalize
      3. Numbers
        1. char-digit-value
        2. char-numeric-value
        3. char-digit
        4. char-for-digit
        5. char-digit?
        6. char-xdigit?
      4. Operators and transformers
        1. char-mirror
        2. char-bidi-paired-bracket
        3. char->lower
        4. char->upper
        5. char->title
      5. Properties
        1. char-category
        2. char-direction
        3. char-combining-class
      6. Predicates
    2. Author
    3. Version History
    4. License

Module: icu

Selected bindings to the ICU Unicode library.

Names

char-from-name
[procedure] (char-from-name name)

Returns the character whose Unicode name is the string name. name is passed through string-upcase first, so the lookup is case-insensitive.

(char-from-name "fire") ;; => #\x1f525
(char-from-name "FIRE") ;; => #\x1f525

char-string-name
[procedure] (char-string-name char)

Returns the Unicode name of char as a string.

(char-string-name #\x1f525) ;; => "FIRE"
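
The two procedures are inverses of each other. A minimal sketch of a round trip, assuming the egg is installed and imported as icu, and relying only on the results documented above (the variable names fire and name are illustrative):

```scheme
(import icu)

;; Look the character up by name; lowercase input is accepted
;; because the name is upcased before the lookup.
(define fire (char-from-name "fire"))

;; Map the character back to its Unicode name.
(define name (char-string-name fire))

(print name)                                  ; FIRE, per the examples above
(print (char=? fire (char-from-name name)))   ; #t if the round trip holds
```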

Decomposition and Normalization

char-decomposition
[procedure] (char-decomposition char)

Returns the decomposition mapping of char.

For example, for ¼, VULGAR FRACTION ONE QUARTER:

(char-decomposition #\xBC) ;; => '(#\1 #\x2044 #\4)

string-normalize
[procedure] (string-normalize input #!optional (form "nfkc"))

Returns the normalized form of the string input according to form.

form
Any of "nfc", "nfkc", "nfd", or "nfkd"

(string-normalize "¼") ;; => "1⁄4" (the slash is U+2044 FRACTION SLASH)
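
As a sketch of how form affects the result, the loop below normalizes the same string under each form and reports whether it came back unchanged. In Unicode, ¼ has only a compatibility decomposition, so the canonical forms ("nfc", "nfd") should leave it as is, while the compatibility forms ("nfkc", "nfkd") should expand it. The variable name quarter is illustrative only, and the egg is assumed to be importable as icu.

```scheme
(import icu)

(define quarter "¼")

;; For each form, report whether normalization left the string unchanged.
(for-each
 (lambda (form)
   (print form " unchanged? "
          (string=? quarter (string-normalize quarter form))))
 '("nfc" "nfd" "nfkc" "nfkd"))
```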

Numbers

char-digit-value
[procedure] (char-digit-value char)

Binding for u_charDigitValue. Returns the decimal digit value of a decimal digit character.

(char-digit-value #\4) ;; => 4

char-numeric-value
[procedure] (char-numeric-value char)

Binding for u_getNumericValue. Returns the numeric value of a Unicode code point, as a double, as defined in the Unicode Character Database.

(char-numeric-value #\4) ;; => 4.0
(char-numeric-value #\xBC) ;; => 0.25

char-digit
[procedure] (char-digit char radix)

Binding for u_digit. Returns the decimal digit value of the code point in the specified radix.

(char-digit #\f 16) ;; => 15

char-for-digit
[procedure] (char-for-digit digit radix)

Binding for u_forDigit. Returns the character that represents the integer digit in the specified radix.

(char-for-digit 15 16) ;; => #\f

char-digit?
[procedure] (char-digit? char)

Binding for u_isdigit. Determines whether the specified code point is a digit character according to Java.

char-xdigit?
[procedure] (char-xdigit? char)

Binding for u_isxdigit. Determines whether the specified code point is a hexadecimal digit.
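
As a sketch of how char-digit might be used, the helper below folds a string of digits into an integer. digits->integer is a hypothetical name, not part of the egg, and the sketch assumes every character in the string is a valid digit in the given radix; what char-digit returns for other characters is not described here.

```scheme
(import icu)

;; Hypothetical helper: accumulate digit values left to right.
;; Assumes str contains only valid digits for radix.
(define (digits->integer str radix)
  (let loop ((chars (string->list str)) (acc 0))
    (if (null? chars)
        acc
        (loop (cdr chars)
              (+ (* acc radix) (char-digit (car chars) radix))))))

;; With (char-digit #\f 16) => 15, this is 15*16 + 15.
(print (digits->integer "ff" 16))
```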

Operators and transformers

char-mirror
[procedure] (char-mirror char)

Binding for u_charMirror. Maps the specified character to a "mirror-image" character.

char-bidi-paired-bracket
[procedure] (char-bidi-paired-bracket char)

Binding for u_getBidiPairedBracket. Maps the specified character to its paired bracket character.

char->lower
[procedure] (char->lower char)
char->upper
[procedure] (char->upper char)
char->title
[procedure] (char->title char)

Bindings for u_tolower, u_toupper, and u_totitle, respectively.

Properties

char-category
[procedure] (char-category char)

Binding for u_charType. Returns the general category value for the code point (an integer, see below).

You can convert this to a symbol with integer->category, and vice versa with category->integer.
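
Since char-category returns an integer, it can be compared against the category/ constants listed in this section. A minimal sketch, assuming those constants are plain integers; letter-categories and letter-category? are hypothetical helpers, not part of the egg:

```scheme
(import icu)

;; The five general categories that make up "Letter".
(define letter-categories
  (list category/uppercase-letter
        category/lowercase-letter
        category/titlecase-letter
        category/modifier-letter
        category/other-letter))

;; Hypothetical predicate: #t if c falls in any letter category.
(define (letter-category? c)
  (and (memv (char-category c) letter-categories) #t))

(print (letter-category? #\a))   ; a is a lowercase letter in Unicode
(print (letter-category? #\4))   ; 4 is a decimal digit, not a letter
```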

Categories:

[constant] category/unassigned
[constant] category/uppercase-letter
[constant] category/lowercase-letter
[constant] category/titlecase-letter
[constant] category/modifier-letter
[constant] category/other-letter
[constant] category/non-spacing-mark
[constant] category/enclosing-mark
[constant] category/combining-spacing-mark
[constant] category/decimal-digit-number
[constant] category/letter-number
[constant] category/other-number
[constant] category/space-separator
[constant] category/line-separator
[constant] category/paragraph-separator
[constant] category/control-char
[constant] category/format-char
[constant] category/private-use-char
[constant] category/surrogate
[constant] category/dash-punctuation
[constant] category/start-punctuation
[constant] category/end-punctuation
[constant] category/connector-punctuation
[constant] category/other-punctuation
[constant] category/math-symbol
[constant] category/currency-symbol
[constant] category/modifier-symbol
[constant] category/other-symbol
[constant] category/initial-punctuation
[constant] category/final-punctuation
[constant] category/char-category-count

char-direction
[procedure] (char-direction char)

Binding for u_charDirection. Returns the bidirectional category value for the code point, which is used in the Unicode bidirectional algorithm (an integer, see below).

You can convert this to a symbol with integer->direction, and vice versa with direction->integer.

Directions:

[constant] direction/left-to-right
[constant] direction/right-to-left
[constant] direction/european-number
[constant] direction/european-number-separator
[constant] direction/european-number-terminator
[constant] direction/arabic-number
[constant] direction/common-number-separator
[constant] direction/block-separator
[constant] direction/segment-separator
[constant] direction/white-space-neutral
[constant] direction/other-neutral
[constant] direction/left-to-right-embedding
[constant] direction/left-to-right-override
[constant] direction/right-to-left-arabic
[constant] direction/right-to-left-embedding
[constant] direction/right-to-left-override
[constant] direction/pop-directional-format
[constant] direction/dir-non-spacing-mark
[constant] direction/boundary-neutral
[constant] direction/first-strong-isolate
[constant] direction/left-to-right-isolate
[constant] direction/right-to-left-isolate
[constant] direction/pop-directional-isolate
[constant] direction/char-direction-count

char-combining-class
[procedure] (char-combining-class char)

Binding for u_getCombiningClass. Returns the combining class of the code point as specified in UnicodeData.txt.

Predicates

[procedure] (char-mirrored? char)
[procedure] (char-ualphabetic? char)
[procedure] (char-ulowercase? char)
[procedure] (char-uuppercase? char)
[procedure] (char-uwhitespace? char)
[procedure] (char-whitespace? char)
[procedure] (char-java-space? char)
[procedure] (char-space? char)
[procedure] (char-blank? char)
[procedure] (char-lower? char)
[procedure] (char-upper? char)
[procedure] (char-alpha? char)
[procedure] (char-alnum? char)
[procedure] (char-punct? char)
[procedure] (char-graph? char)
[procedure] (char-defined? char)
[procedure] (char-cntrl? char)
[procedure] (char-iso-control? char)
[procedure] (char-print? char)
[procedure] (char-base? char)
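
Each predicate takes a single character, so they can be passed to ordinary higher-order code. A minimal sketch, assuming the predicates return Scheme booleans; count-matching is a hypothetical helper, not part of the egg:

```scheme
(import icu)

;; Hypothetical helper: count the characters of str that satisfy pred.
(define (count-matching pred str)
  (let loop ((chars (string->list str)) (n 0))
    (cond ((null? chars) n)
          ((pred (car chars)) (loop (cdr chars) (+ n 1)))
          (else (loop (cdr chars) n)))))

(print (count-matching char-alpha? "abc 123"))   ; letters in the string
(print (count-matching char-digit? "abc 123"))   ; digits in the string
```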

Author

Diego A. Mundo

Version History

0.3.4
Port to CHICKEN 6
0.3.3
Use custom build script
0.3.2
Document with chalk
0.3.1
Fix issue with utf8 reexports
0.3.0
Slight API change
0.2.0
Make string-normalize form parameter optional
0.1.0
Initial version

License

Unicode

COPYRIGHT AND PERMISSION NOTICE (ICU 58 and later) 
 
Copyright © 1991-2020 Unicode, Inc. All rights reserved. 
Distributed under the Terms of Use in https://www.unicode.org/copyright.html. 
 
Permission is hereby granted, free of charge, to any person obtaining 
a copy of the Unicode data files and any associated documentation 
(the "Data Files") or Unicode software and any associated documentation 
(the "Software") to deal in the Data Files or Software 
without restriction, including without limitation the rights to use, 
copy, modify, merge, publish, distribute, and/or sell copies of 
the Data Files or Software, and to permit persons to whom the Data Files 
or Software are furnished to do so, provided that either 
(a) this copyright and permission notice appear with all copies 
of the Data Files or Software, or 
(b) this copyright and permission notice appear in associated 
Documentation. 
 
THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF 
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE 
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 
NONINFRINGEMENT OF THIRD PARTY RIGHTS. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS 
NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL 
DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, 
DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 
PERFORMANCE OF THE DATA FILES OR SOFTWARE. 
 
Except as contained in this notice, the name of a copyright holder 
shall not be used in advertising or otherwise to promote the sale, 
use or other dealings in these Data Files or Software without prior 
written authorization of the copyright holder.