icu
Select bindings to the ICU Unicode library.
Introduction
This library is partially inspired by Python's unicodedata library (https://docs.python.org/3/library/unicodedata.html). Because it deals with Unicode, it also re-exports the utf8 egg for ease of use.
Procedures
Names
[procedure] (char-from-name name)
Returns the character whose Unicode name is the string name. name is passed through string-upcase, so the lookup is case-insensitive.
(char-from-name "fire") ;; => #\x1f525
(char-from-name "FIRE") ;; => #\x1f525
[procedure] (char-string-name char)
Returns the Unicode name of char as a string.
(char-string-name #\x1f525) ;; => "FIRE"
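The two procedures above are inverses of each other, which a short round trip can illustrate. This is only a sketch: it assumes CHICKEN 5, where the egg would be loaded with (import icu) (CHICKEN 4 would use (use icu) instead), and the variable name fire is purely illustrative.

```scheme
(import icu) ; assumption: CHICKEN 5 import form

;; Look a character up by name. Lower case is accepted because
;; char-from-name upcases its argument before the lookup.
(define fire (char-from-name "fire"))

;; Ask for the name back; per the examples above this is "FIRE".
(print (char-string-name fire))
```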
Decomposition and Normalization
[procedure] (char-decomposition char)
Returns the decomposition mapping of char as a list of characters.
For example, for ¼ (U+00BC VULGAR FRACTION ONE QUARTER):
(char-decomposition #\xBC) ;; => '(#\1 #\x2044 #\4)
[procedure] (string-normalize str [form])
Returns a normalized copy of str. form selects the normalization form and can be any of "nfc", "nfkc", "nfd", or "nfkd".
(string-normalize "¼") ;; => "1⁄4"
The slash in this result is U+2044 FRACTION SLASH, not the ASCII slash.
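A common use of normalization is comparing strings that look identical but are encoded differently. The sketch below assumes CHICKEN 5 ((import icu)) and that form is passed as one of the strings listed above; canonically-equal? is a hypothetical helper, not part of the egg.

```scheme
(import icu) ; assumption: CHICKEN 5 import form

;; Hypothetical helper: two strings are treated as equal if their
;; NFC forms are identical.
(define (canonically-equal? a b)
  (string=? (string-normalize a "nfc")
            (string-normalize b "nfc")))

;; U+00E9 (precomposed e with acute) versus "e" followed by
;; U+0301 COMBINING ACUTE ACCENT.
(print (canonically-equal? "\u00e9" "e\u0301"))
```

Under NFC the second string composes to U+00E9, so the comparison is expected to succeed, while a plain string=? on the unnormalized inputs would not.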
Numbers
[procedure] (char-digit-value char)
Binding for u_charDigitValue. Returns the decimal digit value of a decimal digit character.
(char-digit-value #\4) ;; => 4
[procedure] (char-numeric-value char)
Binding for u_getNumericValue. Returns the numeric value of a Unicode code point, as a double, as defined in the Unicode Character Database.
(char-numeric-value #\4) ;; => 4.0
(char-numeric-value #\xBC) ;; => 0.25
[procedure] (char-digit char radix)
Binding for u_digit. Returns the digit value of the code point in the specified radix.
(char-digit #\f 16) ;; => 15
[procedure] (char-for-digit digit radix)
Binding for u_forDigit. Returns the character that represents the given digit in the specified radix.
(char-for-digit 15 16) ;; => #\f
[procedure] (char-digit? char)
Binding for u_isdigit. Determines whether the specified code point is a digit character according to Java.
[procedure] (char-xdigit? char)
Binding for u_isxdigit. Determines whether the specified code point is a hexadecimal digit.
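As an illustration of how char-digit might be combined with ordinary list processing, the sketch below accumulates a string of digits into an integer. It assumes CHICKEN 5 ((import icu)) and that char-digit, like the underlying u_digit, returns a negative number for a character that is not a digit in the given radix; that behaviour is not stated on this page. parse-digits is a hypothetical helper.

```scheme
(import icu) ; assumption: CHICKEN 5 import form

;; Hypothetical helper: fold the digit values of str into one integer.
(define (parse-digits str radix)
  (let loop ((chars (string->list str)) (acc 0))
    (if (null? chars)
        acc
        (let ((d (char-digit (car chars) radix)))
          ;; Assumption: a negative result marks a non-digit, as in u_digit.
          (if (< d 0)
              (error "not a digit in this radix" (car chars) radix)
              (loop (cdr chars) (+ (* acc radix) d)))))))

(print (parse-digits "ff" 16))
```

Given the value 15 shown above for #\f in radix 16, the call works out to 15 * 16 + 15 = 255.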
Operators and transformers
[procedure] (char-mirror char)
Binding for u_charMirror. Maps the specified character to a "mirror-image" character.
[procedure] (char-bidi-paired-bracket char)
Binding for u_getBidiPairedBracket. Maps the specified character to its paired bracket character.
[procedure] (char->lower char)
[procedure] (char->upper char)
[procedure] (char->title char)
Bindings for u_tolower, u_toupper, and u_totitle, respectively.
Properties
[procedure] (char-category char)
Binding for u_charType. Returns the general category value for the code point (an integer, see below).
You can convert this integer to a symbol with integer->category, and a symbol back to an integer with category->integer.
Categories:
category/unassigned category/uppercase-letter category/lowercase-letter category/titlecase-letter category/modifier-letter category/other-letter category/non-spacing-mark category/enclosing-mark category/combining-spacing-mark category/decimal-digit-number category/letter-number category/other-number category/space-separator category/line-separator category/paragraph-separator category/control-char category/format-char category/private-use-char category/surrogate category/dash-punctuation category/start-punctuation category/end-punctuation category/connector-punctuation category/other-punctuation category/math-symbol category/currency-symbol category/modifier-symbol category/other-symbol category/initial-punctuation category/final-punctuation category/char-category-count
[procedure] (char-direction char)
Binding for u_charDirection. Returns the bidirectional category value for the code point, which is used in the Unicode bidirectional algorithm (an integer, see below).
You can convert this integer to a symbol with integer->direction, and a symbol back to an integer with direction->integer.
Directions:
direction/left-to-right direction/right-to-left direction/european-number direction/european-number-separator direction/european-number-terminator direction/arabic-number direction/common-number-separator direction/block-separator direction/segment-separator direction/white-space-neutral direction/other-neutral direction/left-to-right-embedding direction/left-to-right-override direction/right-to-left-arabic direction/right-to-left-embedding direction/right-to-left-override direction/pop-directional-format direction/dir-non-spacing-mark direction/boundary-neutral direction/first-strong-isolate direction/left-to-right-isolate direction/right-to-left-isolate direction/pop-directional-isolate direction/char-direction-count
[procedure] (char-combining-class char)
Binding for u_getCombiningClass. Returns the combining class of the code point as specified in UnicodeData.txt.
Predicates
char-mirrored? char-ualphabetic? char-ulowercase? char-uuppercase? char-uwhitespace? char-whitespace? char-java-space? char-space? char-blank? char-lower? char-upper? char-digit? char-alpha? char-alnum? char-xdigit? char-punct? char-graph? char-defined? char-cntrl? char-iso-control? char-print? char-base?