nutils
Description
nutils is a collection of convenient, easy-to-use and versatile utilities. Unquestionably, the basic ideas behind these utilities have been expressed in Scheme many times over the decades. The versions presented here offer new variations on, or enhancements to, existing ideas. It's hoped these utilities are effective and efficient tools that make Scheme programming just a bit more productive and enjoyable.
Author
Jules Altfas
Repository
https://codeberg.org/jrapdx/nutils
Dependencies
nutils api
let-kw
let-kw is somewhat like let-optionals and kin, but with features providing added utility.
[syntax] (let-kw REST ((KEYWORD/SYMBOL/STRING VARIABLE DEFAULT) ...) BODY ...)
- in - REST - a list, generally a dotted rest-list, containing arguments to parse
- in - KEYWORD/SYMBOL/STRING - the argument/value that identifies the intended input
- in - VARIABLE - an identifier naming the generated variable
- in - DEFAULT - value assigned to the variable when KEYWORD/SYMBOL/STRING is not among the inputs.
- in - BODY - like body of a let. Created variables are visible throughout body
- returns: value is result of last expression in body.
    (import scheme.base nutils)

    (define (myproc . rest)
      (let-kw rest ((k0: k0 "cat"))
        (printnl "Animal:" k0)))

    ;; (myproc k0: "dog") => Animal: dog
    ;; (myproc)           => Animal: cat
As noted, the "key" can be almost any scheme object:
    (import nutils ...)

    (define (proc2 . rest)
      (let-kw rest (('-animal k0 "cat"))
        (printnl "Animal:" k0)))

    ;; (proc2 '-animal "dog") => Animal: dog
let-kw has another capability: it can be invoked with only a variable and default value. In this case it behaves like let-optional:
    (import nutils ...)

    (define (proc-opt . rest)
      (let-kw rest ((k0 "cat"))
        (printnl "Animal:" k0)))

    ;; (proc-opt "dog") => Animal: dog
    ;; (proc-opt)       => Animal: cat
Where it differs from let-optional is that both keyed and unkeyed inputs can be used together with interesting results:
    (import nutils ...)

    (define (two-in-one . rest)
      (let-kw rest ((keyed: keyed 'big-value))
        (let-kw rest ((xxx 0) (aaa 101))
          (printnl keyed xxx aaa))))

    (two-in-one keyed: 'big-money 2578) ;; => big-money 2578 101
    (two-in-one 2578 keyed: 'big-money) ;; => big-money 2578 101
    (two-in-one 2578)                   ;; => big-value 2578 101
    (two-in-one)                        ;; => big-value 0 101
    (two-in-one tiger: 'home)           ;; => big-value tiger: home
Since let-kw removes keyed arguments (each key and the value that follows it) from the rest list, the list passed to the second let-kw no longer contains them. The second let-kw then matches the remaining args in rest by position. If there are fewer args in rest than variables in let-kw, default values are assigned to the remaining variables. This works because keyed args can't interfere with the "ordinary" optionals. There's no "rest confusion" of the kind caused by DSSSL not removing keyword args from the rest list.
Note that let-kw can be used without keyed args at all. Pure optional let-kw works as optionals/let-optionals are expected to. When using keyed and unkeyed binding lists, the keyed lists must come first, as illustrated above. Also, keyed and unkeyed bindings should not be mixed in a single bindings list.
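The removal of keyed arguments from the rest list can be sketched in portable R7RS Scheme. This is only an illustration of the idea, not the nutils implementation: `extract-keyed` and `demo` are hypothetical names, and quoted symbols stand in for keywords so the code does not depend on CHICKEN's keyword syntax.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper, NOT part of nutils: look for KEY in ARGS.
;; Returns two values: the value following KEY (or DEFAULT when KEY is
;; absent) and ARGS with the key and its value removed.
(define (extract-keyed key default args)
  (let loop ((in args) (seen '()))
    (cond ((null? in)
           (values default (reverse seen)))
          ((and (equal? (car in) key) (pair? (cdr in)))
           (values (cadr in) (append (reverse seen) (cddr in))))
          (else
           (loop (cdr in) (cons (car in) seen))))))

(define (demo . rest)
  (let-values (((keyed remaining) (extract-keyed 'keyed 'big-value rest)))
    ;; Positional optionals are taken from whatever is left.
    (let ((xxx (if (pair? remaining) (car remaining) 0)))
      (list keyed xxx))))

(display (demo 'keyed 'big-money 2578)) (newline) ; (big-money 2578)
(display (demo 2578 'keyed 'big-money)) (newline) ; (big-money 2578)
(display (demo))                        (newline) ; (big-value 0)
```

Because the key and its value are dropped before the positional step, the order in which keyed and unkeyed arguments are passed does not matter.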
foreach and foreach*
These elaborate on for-each and map, with similar basic semantics. However, foreach and foreach* invert, simplify and extend the syntax to allow accessing multiple elements of input lists at the same time.
foreach and foreach* have identical syntax except foreach* returns a list while foreach does not. In this document foreach/* refers to information that applies to both.
[syntax] (foreach/* ((VAR LIST) (VAR2 LIST2) ...) BODY ...)
- in - LIST ... - basic syntax. VAR, VAR2, etc., are variables visible in BODY. Each VAR holds one element of the respective list.
- returns: foreach - no value, foreach* list of VAR, VAR2, etc., as determined by BODY expressions. If BODY result is the symbol _##_ an element is not added to output list.
- in - LIST ... - variables A B ..., C D ... each hold one element of the respective list. In other words, any number of elements can be extracted from the lists at one time. The only restriction is that the same number of elements is taken from each list. Processing stops when the shortest list runs out of elements. In foreach*, the result of the BODY expressions is appended to the output list, except if that value is the symbol _##_.
- returns: as above.
    (import scheme.base nutils)

    (define op (open-output-file "demo1.txt"))
    (define row-vals '(123 282 366 471 543 612 798 882 936))

    (foreach ((v row-vals))
      (printnl v "divided by 3 is" (/ v 3) prn: op))

    (close-output-port op)
    ;; demo1.txt contains rows:
    ;; 123 divided by 3 is 41
    ;; 282 divided by 3 is 94
    ;; ...

    (import scheme.base nutils srfi-1)

    (define critrs '("lion" "tiger" "elephant" "gorilla" "dingo" "hippo"))

    (let ((res (foreach* (((a b c) (iota 6))
                          ((x y z) critrs))
                 (let ((f0 (cons (combine/sym 'S #\- a) x))
                       (f1 (cons (combine/sym 'T #\- b) y))
                       (f2 (cons (combine/sym 'U #\- c) z)))
                   (list f0 f1 f2)))))
      res)
    ;; res => (((S-0 . "lion") (T-1 . "tiger") (U-2 . "elephant"))
    ;;         ((S-3 . "gorilla") (T-4 . "dingo") (U-5 . "hippo")))
combine
combine is similar to conc in the chicken.string module. combine takes a number of objects, joining them into a string. What makes combine different is its 6 variations.
[procedure] combine OBJ ...
- in - OBJ ... - any number of items: strings, symbols, characters, numbers, etc.
- returns: all items as a single string without spaces between items.
    (combine 'q 245 "cat") => "q245cat"
[procedure] combine/w OBJ ...
- in - OBJ ... - items to combine
- returns: string with all items, preserves Scheme external representation.
    (combine/w 'q 245 "cat") => "q245\"cat\""
[procedure] combine/sym OBJ ...
- in - OBJ ... - items to combine
- returns: symbol with all items combined without spaces between.
    (combine/sym 'q 245 "cat") => 'q245cat
[procedure] combinesp OBJ ...
- in - OBJ ... - items to combine
- returns: string containing all items with a space between items.
    (combinesp 'q 245 "cat") => "q 245 cat"
[procedure] combinesp/w OBJ ...
- in - OBJ ... - items to combine
- returns: string with all items and space between. Preserves Scheme external representation.
    (combinesp/w 'q 245 "cat") => "q 245 \"cat\""
[procedure] combinesp/sym OBJ ...
- in - OBJ ... - items to combine
- returns: symbol with all items, space between.
    (combinesp/sym 'q 245 "cat") => |q 245 cat|
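The difference between the plain and the /w variants corresponds to the difference between display and write. A minimal sketch in portable R7RS Scheme, using string ports, shows the idea; `join-with`, `join-display` and `join-write` are hypothetical names and this is not how nutils is necessarily implemented.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper, not nutils code: print every object to a string
;; port with PRINTER (display or write) and return the accumulated string.
(define (join-with printer objs)
  (let ((port (open-output-string)))
    (for-each (lambda (obj) (printer obj port)) objs)
    (get-output-string port)))

(define (join-display . objs) (join-with display objs))
(define (join-write . objs) (join-with write objs))

(write (join-display 'q 245 "cat")) (newline) ; prints "q245cat"
(write (join-write 'q 245 "cat"))   (newline) ; prints "q245\"cat\""
```

With display, the string "cat" contributes only its characters; with write, its quotes are kept, which matches the external-representation behavior described for combine/w.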
printnl and variants
nutils provides procedures for terminal output with extended capabilities compared to the facilities of R7RS or chicken.io's print and print*.
[procedure] printnl ITEM ... prn: <output-port>
- in - ITEM ... - virtually any scheme object and any number of ITEMs. These are printed to the terminal via display with a space between ITEMs and newline after the last ITEM is printed. (printnl = print + nl, or newline.)
- in - prn: <output-port> - by default, printnl (and all other print/write... procedures) print to current-output-port. However, the prn: keyword option directs printing to any open output port.
- returns: no value.
    (import scheme.base nutils)

    (printnl 'test "my-output" 'isn\'t "bad at all!")
    ;; => test my-output isn't bad at all!

    (define op (open-output-file "file1.txt"))
    ;; print to a file:
    (printnl "Here are test results:" prn: op)
    (printnl ... prn: op)
    ...
    (close-output-port op)
[procedure] print0 ITEM ... prn: <output-port>
- in - ITEM ... - as above. print0 doesn't insert a space between ITEMs nor output a newline after the last ITEM.
- in - prn: <output-port>
- returns: no value.
- in - ITEM ... - same as printnl, except that print0nl does NOT insert a space between items. It does print a newline after the last ITEM.
- in - prn: <output-port>
- returns: no value.
- in - ITEM ... - as above. Inserts a space between items and after last item. newline is not printed after the last item. Useful for printing a list of items one or a few at a time.
- in - prn: <output-port>
- returns: no value.
- in - ITEM ... - as above. Like printsp except does NOT leave a space after the last ITEM is printed.
- in - prn: <output-port>
- returns: no value.
Each of the print... procedures has an analogous write... procedure. The difference between the series is that the latter uses write rather than display. Syntax of corresponding procedures is identical.
[procedure] writenl ITEM ...
- in - ITEM ...
- returns: no value
- in - ITEM ...
- returns: no value
- in - ITEM ...
- returns: no value
- in - ITEM ...
- returns: no value
- in - ITEM ...
- returns: no value
The prn3port parameter
While the keyword argument prn: ... allows printing to a port, it can be tedious and error-prone when many invocations of print... are necessary, for example, writing a complex document to a file.
In such cases, using the prn3port parameter via parameterize makes a non-standard output port the default. Within the parameterize body, there is no need to use prn: <port>.
    (import scheme.base nutils)

    (define op (open-output-file "complex-report-21.04.txt"))

    (parameterize ((prn3port op))
      (printnl "Title:" article-title)
      (printnl "Author:" article-author)
      (printnl "Executive Summary:\n")
      (printnl summary-text)
      ...
      (printnl "References:" reference-list))

    (close-output-port op)
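How a parameter can supply the default port is sketched below in portable R7RS Scheme. The names `out-port` and `say` are hypothetical stand-ins for prn3port and printnl, and a string port replaces the file so the example is self-contained; the actual nutils internals may differ.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical stand-in for prn3port: a parameter holding the default port.
(define out-port (make-parameter (current-output-port)))

;; Hypothetical stand-in for printnl: display ITEMS separated by spaces,
;; followed by a newline, on whatever port the parameter currently holds.
(define (say . items)
  (let ((port (out-port)))
    (let loop ((rest items))
      (unless (null? rest)
        (display (car rest) port)
        (unless (null? (cdr rest)) (display " " port))
        (loop (cdr rest))))
    (newline port)))

(define report (open-output-string))

(parameterize ((out-port report))
  (say "Title:" "Quarterly numbers")
  (say "Author:" "A. Writer"))

(say "done")                       ; outside parameterize: default port again
(write (get-output-string report)) ; the two lines captured above
(newline)
```

The binding made by parameterize is dynamic, so every call made inside its body sees the new port without it being passed explicitly.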
string/symbol/number conversions
Conversion among strings, symbols, keywords and numbers is a common operation. These conversion procedures are simple to use and reasonably efficient:
[procedure] sym/str ITEM
- in - ITEM may be a string, symbol, number, character, boolean, keyword or other scheme object.
- returns: a string representation of object. Note that converting "other scheme objects" has an adverse impact on performance. In any case, sym/str always returns a string (possibly the empty string, "").
    (sym/str 'kettle)   => "kettle"
    (sym/str skillet:)  => "skillet:"
    (sym/str 1762)      => "1762"
    (sym/str "String")  => "String"
    (sym/str #f)        => "#f"
    (sym/str #\$)       => "$"
    (sym/str #(a))      => "#(a)"
[procedure] str/sym ITEM
- in - ITEM - a string, symbol, number, or keyword
- returns: a symbol, or #f if ITEM is not one of the above.
    (str/sym "kettle") => 'kettle
    (str/sym 1762)     => |1762|
[procedure] obj->num ITEM TRIM?
- in - ITEM is a numeric string, symbol or keyword, or a number
- in - TRIM: default is #f.
- If TRIM is #t, default trim is invoked (trims #\newline, #\tab, #\space).
- If other characters need to be trimmed, use strim or trim-string, e.g., (strim "string" "characters") .
- sym/str->num is an alias for obj->num.
- returns: a number or #f if ITEM does not convert to a number.
    (obj->num "873")                => 873
    (obj->num '|873|)               => 873
    (obj->num #x873)                => 2163
    (obj->num "2163 ")              => #f
    (obj->num " 2163 " #t)          => 2163
    (obj->num (strim "\t\t 2163--- \n" "-")) => 2163
    (obj->num (trim-string '|\x9;\x9; 2163--- \xa;| "" "" "-")) => 2163
[syntax] strls->symls LIST
- in - LIST of strings, symbols, keywords
- returns: list of symbols or '() if LIST is #f.
[syntax] symls->strls LIST
- in - LIST of symbols, numbers, keywords
- returns: list of strings, or '() if LIST is #f.
[syntax] strls->numls LIST
- in - LIST containing numeric symbols or strings, or numbers
- returns: list of numbers, or '() if LIST is #f.
- objs->numls is an alias for strls->numls
Note: list conversions will insert #f when an element of the input list can't be converted to the output list type.
alist replace/append/delete
The alist is commonly used as a database in Scheme programs for small data sets, such as configuration options or temporary data. Modifying the data is frequently called for, with 3 or 4 operations being used:
- Adding a pair to the alist.
- Deleting a pair from the alist.
- Replacing the data (cdr) of a pair.
- Appending data to the existing data of a pair/list.
The semantics of these operations can be surprising. Using set-cdr! on a pair extracted from an alist by assoc changes the alist itself, because the pair is shared with the alist. Preserving the original alist requires making modifications to a "deep copy" of the original, leaving the original unaltered.
alist/r_, alist/a_ and alist/d_ provide adding, replacing, appending and removing alist data. The "+" variants return a new alist, the "!" variants change the input alist.
[procedure] alist-dup ALIST
- in - ALIST - the alist to duplicate
- returns: new alist identical to ALIST.
[syntax] alist/r+ K V ALIST
- in - K - the key, car, of a pair in the alist.
- in - V - the value, cdr, of the pair.
- in - ALIST
- alist/r+ replaces the value (cdr) of a pair with a new value, V. If the key, K, doesn't exist in any pair in the ALIST, a new K-V pair is added to the alist. In either case, a fresh ALIST is returned.
- returns: altered ALIST, original ALIST is unchanged.
[syntax] alist/r! K V ALIST
- Destructive variant of alist/r+, sets ALIST to new ALIST.
- returns: modified input alist.
[syntax] alist/a+ K V ALIST
- in - K, V as above.
- in - ALIST
- V is appended to the pair's current value. If the key is not found in the ALIST, a new K-V pair is added to the ALIST.
- returns: the amended ALIST without changing the original.
[syntax] alist/a! K V ALIST
- Destructive variant of alist/a+, setting ALIST to modified result.
- returns: altered original alist.
[procedure] alist/d+ K ALIST
- in - K - key of pair to remove from ALIST.
- if K isn't found in ALIST, unchanged duplicate ALIST is returned.
- returns: altered alist, original is unmodified.
[syntax] alist/d! K ALIST
- in - K - key of pair to remove from ALIST.
- if K isn't found in ALIST, the unchanged ALIST is returned.
- returns: modified alist.
alist query
[procedure] k->v KEY ALIST
- in - KEY - Performs case-sensitive search of ALIST for KEY, or car of pair.
- in - ALIST - ALIST to search.
- returns: value associated with KEY, or #f if key not found.
[procedure] k->vci KEY ALIST
- in - KEY - Performs case-insensitive search of ALIST for KEY, or car of pair.
- in - ALIST - ALIST to search.
- returns: value associated with KEY, or #f if not found.
list-recv, pair-recv
list-recv is analogous to receive, except list-recv destructures a list instead of values. Most useful with procedures returning lists with a small number of elements.
[syntax] list-recv (VAR1 ...) LIST BODY
- in - VAR1 ... - number of variables must match length of LIST
- in - LIST - each element is assigned to variable in VARs.
- in - BODY - VAR1 ... are variables visible throughout BODY.
- returns: result of body expressions.
    (import nutils)

    (define (myproc x y)
      (let ((a (* 101 x))
            (b (* 202 y)))
        (list a b)))

    (list-recv (var-a var-b) (myproc 20 40)
      (printnl 'var-a 'is var-a)
      (printnl "var-b is" var-b))
    ;; => var-a is 2020
    ;; => var-b is 8080
[syntax] pair-recv (K V) PAIR BODY
- in - K,V - K is assigned the key (car PAIR), V is value (cdr PAIR)
- in - PAIR - a key/value list, dotted pair, or procedure call returning a pair.
- in - BODY - variables K and V visible throughout BODY.
- returns: result of body expressions.
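For comparison, the same kind of destructuring can be written with standard procedures only. The sketch below is portable R7RS Scheme and does not use nutils; `scaled` is a hypothetical procedure chosen to mirror the list-recv example. It shows the more verbose code that list-recv and pair-recv are meant to replace.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical procedure returning a short list.
(define (scaled x y)
  (list (* 101 x) (* 202 y)))

;; Destructure a list by applying a lambda to it; the number of
;; lambda parameters must match the length of the list.
(apply (lambda (a b)
         (display (list 'a a 'b b))
         (newline))
       (scaled 20 40))                              ; prints (a 2020 b 8080)

;; Destructure a pair with car and cdr.
(let* ((pair (cons 'color "red"))
       (k (car pair))
       (v (cdr pair)))
  (display k) (display " ") (display v) (newline))  ; prints color red
```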
string indexing
[procedure] str-ndx-ls STR CHR STARTAT FROMRIGHT?- in - STR - search for indexes of this string
- in - CHR - character to search for
- in - STARTAT - (optional) index to start from, default is 0.
- Note that with STARTAT > 0, the output indexes are still counted from position 0, the same as with a 0 start. Subtracting STARTAT from each result gives its offset from STARTAT.
- in - FROMRIGHT? - (optional) if true, STARTAT and indexes are counted from end of string towards the beginning. The 0th character is at string-length - 1.
- returns: list of indexes per input criteria or null list if none found.
    (import nutils)
    (define str "abc dws qthcd po9inmhy ecvhn")
    (str-ndx-ls str #\c)              ;; => (2 11 24)
    (str-ndx-ls str #\c 0 #t)         ;; => (3 16 25)
    (str-ndx-ls str #\c 5 #t)         ;; => (16 25)
    (foreach* ((n '(16 25))) (- n 5)) ;; => (11 20)
[procedure] find-string-ndx STR CHR RT?
- in - STR - search for indexes of this string
- in - CHR - character to search for
- in - RT? - search from right (#t) or left (#f)
- returns: first index that satisfies criteria, or '() if none.
- in - STR - search for indexes of this string
- in - CH - character to search for
- returns: first rightmost index, or '() if none found.
- in - STR - search for indexes of this string
- in - CH - character to search for
- returns: first leftmost index, or '() if none found.
- in - FILENAME - a string
- returns: FILENAME without extension, or if no extension, input FILENAME.
- in - FILENAME - a string
- returns: extension, or if no extension, the empty string ("").
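The relation between left-based and right-based indexes described for str-ndx-ls (the 0th character from the right is at string-length - 1) can be checked with a small sketch in portable R7RS Scheme. `char-indexes` and `from-right` are hypothetical helpers, not nutils procedures.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper, not nutils code: indexes of CH in STR,
;; counted from the left, in ascending order.
(define (char-indexes str ch)
  (let loop ((i (- (string-length str) 1)) (acc '()))
    (cond ((< i 0) acc)
          ((char=? (string-ref str i) ch) (loop (- i 1) (cons i acc)))
          (else (loop (- i 1) acc)))))

;; Convert a left-based index to a right-based one (0 = last character).
(define (from-right str i)
  (- (string-length str) 1 i))

(define str "abc dws qthcd po9inmhy ecvhn")
(define left (char-indexes str #\c))

(display left) (newline)                                       ; (2 11 24)
(display (reverse (map (lambda (i) (from-right str i)) left))) ; (3 16 25)
(newline)
```

The two results agree with the str-ndx-ls example for the same string with FROMRIGHT? false and true.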
string processing
trim string
[procedure] strim STR OPTIONS- in - STR - the string to trim
- in - OPTIONS:
- Chars to trim are contained in strings/objects. The default set is " \t\n".
- Keyword arguments include: L: <object>, R: <object> and D: <object>
- If L: <object> is given and not false, object's characters are added to default set and applied only to left side of STR.
- If R: <object> is given and not false, characters are added to default set and used to remove characters only from right side of STR.
- If D: <object> is supplied, its characters replace the default set affecting both ends of STR.
- If an optional argument is given, that is, a string/object without a keyword, its characters are added to the default set and used for character removal from both ends of STR.
- Non-string objects (symbols, numbers, lists, etc.) must be convertible to strings.
- returns: trimmed string
    (import nutils)
    (define str "\t\t---... A string. ---+++\n\n")
    (strim str "-.+")          ;; => "A string" (note chars removed from both ends)
    (strim str "-+" L: ".")    ;; => "A string." (L: -> rm . only from left end)
    (strim str "-+" L: '(\.))  ;; => "A string."
    (strim str "-+" L: '|.|)   ;; => "A string."
[syntax] trim-string STR
[syntax] trim-string STR BOTH
[syntax] trim-string STR BOTH LEFT RIGHT
[syntax] trim-string STR BOTH LEFT RIGHT DEFAULT
- Similar to strim but more performant.
- BOTH, LEFT, RIGHT, DEFAULT are strings and may be "".
- Chars in BOTH are trimmed from left and right ends of STR.
- Chars in LEFT/RIGHT are trimmed from the respective end of STR.
- By default, chars in BOTH/LEFT/RIGHT are in addition to "\n", "\t" and " ".
- Chars in DEFAULT replace the default trim chars.
    (import nutils)
    (define str "\t\t---... A string. ---+++\n\n")
    (trim-string str)             ;; => "---... A string. ---+++"
    (trim-string str ".+-")       ;; => "A string"
    (trim-string str "+-" "" ".") ;; => "... A string" (rm "." from right only)
    (trim-string str "" "-." "-+") ;; => "A string."
    (define str2 "---...Another string.--++")
    (trim-string str2 "" "" "" "-.+") ;; => "Another string"
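The idea of separate left and right character sets can be sketched in portable R7RS Scheme as follows. `char-in?` and `trim-ends` are hypothetical helpers written for this illustration; they are not the nutils implementation and take the complete character set for each end explicitly.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: is CH one of the characters in the string CHARS?
(define (char-in? ch chars)
  (let loop ((i 0))
    (cond ((= i (string-length chars)) #f)
          ((char=? ch (string-ref chars i)) #t)
          (else (loop (+ i 1))))))

;; Hypothetical helper: drop LEFT-CHARS from the start of STR and
;; RIGHT-CHARS from its end, returning the substring in between.
(define (trim-ends str left-chars right-chars)
  (let* ((len (string-length str))
         (start (let loop ((i 0))
                  (if (and (< i len) (char-in? (string-ref str i) left-chars))
                      (loop (+ i 1))
                      i)))
         (end (let loop ((j len))
                (if (and (> j start)
                         (char-in? (string-ref str (- j 1)) right-chars))
                    (loop (- j 1))
                    j))))
    (substring str start end)))

(define ws " \t\n")
(define str "\t\t---... A string. ---+++\n\n")

(write (trim-ends str ws ws)) (newline)
;; prints "---... A string. ---+++"
(write (trim-ends str (string-append ws "-.") (string-append ws "-+")))
(newline)
;; prints "A string."
```

The second call removes "." only on the left, so the trailing period survives, as in the (trim-string str "" "-." "-+") example.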
split-string
[procedure] split-string/chr STRING SPLITAT optional: KEEPCHAR?
- in - STRING - the string to split
- in - SPLITAT - the character where split occurs. Note that by default the character does not appear in the strings of the output list.
- in - KEEPCHAR? - default is #f. Use #t to have split character remain in output strings.
- returns: list of strings or null list if SPLITAT is not found in STRING.
- in - STRING - the string to split
- in - SPLITAT - substring where split will occur. Note that by default this string will not be included in output strings.
- in - KEEPSTR? - default is #f. Use #t to retain splitter string in output.
- returns: list of strings, or null list if SPLITAT isn't in STRING.
- in - STRING - the string to split
- in - NDX integer index where string will be split. If index is >= string length or less than 0, result is the empty list.
- returns: list of strings or null list.
- in - STRING - string to split
- in - NDXLIST - list of INTEGERS where STRING is split, in ascending order.
- Indexing stops when an index
- is followed by a smaller number
- is larger than the string length
- is less than zero
- is not a number.
- in - SELECT - Optional. The select list specifies which substrings are copied to output. Selections are 0-based indexes into the list of output substrings. The 0th string extends from the first character of STRING to one less than the first index in NDXLIST. Default for SELECT is '() which makes no modification to split output.
- Substrings can be selected in any order, and can be repeated (each repetition appears in output list).
- Indexes outside the range of 0 to (number of substrings)-1 are ignored. Non-numbers are also ignored.
- in - KEEPCHR@NDX - Optional. Ordinarily the character at an index in NDXLIST is NOT included in output substrings. When KEEPCHR@NDX is #t, the character IS retained.
- returns: list of split substrings according to above inputs, or '().
    (import nutils)
    (define str1 "aa bbb cccc dd")
    (define ndxls '(2 6 11))
    (split-string/ndxls str1 ndxls)               ;; => ("aa" "bbb" "cccc" "dd")
    (split-string/ndxls str1 ndxls '(3 2 1 0))    ;; => ("dd" "cccc" "bbb" "aa")
    (split-string/ndxls str1 ndxls '(1 3 0 2))    ;; => ("bbb" "dd" "aa" "cccc")
    (split-string/ndxls str1 '())                 ;; => ("aa bbb cccc dd")
    (split-string/ndxls str1 '(abc))              ;; => ("aa bbb cccc dd")
    (split-string/ndxls str1 ndxls '(abc))        ;; => ()
    (split-string/ndxls str1 ndxls '(0 0 3 3) #t) ;; => ("aa" "aa" " dd" " dd") -- note space character in " dd"
    (split-string/ndxls str1 ndxls '(1 3 99 0 A 2 -5)) ;; => ("bbb" "dd" "aa" "cccc")
- NOTE: the following older "split" procedures are much slower than those above but retained for backward compatibility. These will be removed in a future nutils version.
- in - SYM-STR - the string or symbol to split. A symbol is converted to string.
- in - SPLITTER: - keyword, optional. The substring where the string will be split. Default is " ".
- in - KEEP-NULLS: - keyword, optional. On splitting, null strings may be formed and are ignored by default. Using KEEP-NULLS: #t will retain the null strings in the result.
- in - KEEP-SPLITTER: - keyword, optional. Ordinarily the "splitter" string is not part of the result. Using KEEP-SPLITTER: #t will keep the splitter string in the result.
- returns: list of strings divided by the SPLITTER: string or #f if splitting is not possible.
- in - STR - string to split
- in - SCHARS - string containing characters where the string should be split.
- in - KEEP - boolean, default #f. When #t, SCHARS are retained in result.
- returns: list of strings
    (import nutils)
    (define str "xyzAbcdeabcdefghijAbcde")
    (split-str-at-char str "Ad")    ;; => ("xyz" "bc" "eabc" "efghij" "bc" "e")
    (split-str-at-char str "Ad" #t) ;; => ("xyz" "Abc" "deabc" "defghij" "Abc" "de")
str/join
[procedure] str/join STRINGLIST optional: JOINSTRING
- in - STRINGLIST - non-empty list of strings to join
- in - JOINSTRING - defaults to " " (space).
- returns: the joined string.
    ;; Performance of (str/join ...) vs. srfi-13 (string-join ...)
    (import nutils srfi-13)

    (define strls '("The long" ", drawn out" ", string" ", wouldn't you know"
                    ", or maybe" ", just maybe" ", you would not."))

    (bench-test 100000 100 [(string-join strls)])
    ===================================================
    Body: (string-join strls)
    Test: 100000 iterations/sample * 100 samples.
    Mean: 524.69 +/-13.23 (nsec/iter)
    Result lo/hi: 502.67//566.97 Median: 522.98
    Faults: 0 Outliers: 0 (gen. 7.43 sec)

    (bench-test 100000 100 [(str/join strls)])
    ===================================================
    Body: (str/join strls)
    Test: 100000 iterations/sample * 100 samples.
    Mean: 177.38 +/-8.28 (nsec/iter)
    Result lo/hi: 165.01//206.19 Median: 175.31
    Faults: 0 Outliers: 0 (gen. 3.97 sec)
parse/evaluate
parse command line
[procedure] parse-cmdln CMDLINE PRMLS USAGE
- in - CMDLINE - full command line as list of strings, such as returned by command-line (from scheme.process-context).
- in - PRMLS - list of command line arguments for application
- Argument list in this form: '(-arg0 [default val0] -arg1 [default val1] ...). Default values are optional, otherwise defaults to #f. An argument with default of #f acts as switch. (See below example.)
- in - USAGE - optional. If given, refers to variable that contains string with usage/help info for application.
- returns two values: result (an alist), and a cmdarg procedure
- the result alist has keys as defined in PRMLS (without leading hyphen), and values that were entered on the command line or the default. If a default wasn't given for an argument in PRMLS, the default is #f. PRMLS defaults may be any value but typically are string, number, boolean types.
- the "cmdarg" procedure (which doesn't have to be named "cmdarg") takes a key and returns its associated value from the 'result' alist. The type of value returned by cmdarg may be string, symbol, boolean, number, etc. Conversion procedures like sym/str can be useful. When the key doesn't exist in result, cmdarg returns #f.
A typical invocation might look like this:
    ;; myapp.scm
    (import nutils)
    ;; ....
    (define-values (res cmdarg)
      (parse-cmdln (command-line) '(-a #f -b 2000 ...) usage))

    (and res
         (let ((a* (cmdarg 'a))
               (b* (cmdarg 'b)))
           (proc-a (str2bool a*))
           (proc-b (obj->num b*))
           ...))
The application could be invoked as:
$ myapp -a -b 44
With this input a* would receive #t (toggling the default #f), and b* gets the string "44". Conversions are used to assure data is a type acceptable to recipient procedures.
See "example-cmdln.scm" in the egg's "doc" directory.
parse source objects (string/file)
These procedures read input from a file or string and eval the objects returned by read.
[procedure] parse-file-src FILENAME IMPORT-STR
[procedure] parse-string STR IMPORT-STR
- in - FILENAME - name of file with source content to read/eval.
- in - STR - source is contained in STR.
- in - IMPORT-STR - optional. If supplied, contains an import expression prepended to source content.
- returns: result of eval of forms in source file or string.
qsort
qsort is a simple, lightweight implementation of the quicksort algorithm, useful for sorting the short to moderate length lists encountered in many programs.
[procedure] qsort LIST #!optional !< !>=
- in - LIST - the list to sort
- in - !< !>= - (optional) comparison procedures:
- Defaults: !< is <, !>= is >=. User-supplied compare procedures must accept data type of LIST.
- returns: sorted LIST.
- in - LIST - unsorted list of strings
- returns: sorted LIST.
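Why a quicksort over lists takes two comparison procedures can be seen in a minimal sketch in portable R7RS Scheme. This is an assumption-laden illustration, not the nutils source: `keep` and `quicksort-sketch` are hypothetical names, and the first element is used as the pivot for simplicity.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: elements of LST satisfying PRED, in order.
(define (keep pred lst)
  (cond ((null? lst) '())
        ((pred (car lst)) (cons (car lst) (keep pred (cdr lst))))
        (else (keep pred (cdr lst)))))

;; Hypothetical quicksort: LT? selects the elements placed before the
;; pivot, GE? the elements placed after it.
(define (quicksort-sketch lst lt? ge?)
  (if (or (null? lst) (null? (cdr lst)))
      lst
      (let ((pivot (car lst))
            (others (cdr lst)))
        (append
         (quicksort-sketch (keep (lambda (x) (lt? x pivot)) others) lt? ge?)
         (list pivot)
         (quicksort-sketch (keep (lambda (x) (ge? x pivot)) others) lt? ge?)))))

(display (quicksort-sketch '(5 3 9 1 5 7) < >=))
(newline)                                   ; (1 3 5 5 7 9)
(display (quicksort-sketch '("pear" "fig" "apple") string<? string>=?))
(newline)                                   ; (apple fig pear)
```

The two predicates must partition the data between them (every element is either "less than" or "greater than or equal to" the pivot), which is why user-supplied comparison procedures need to be given as a matching pair.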
sfmt
sfmt formats data like the unix command line utility "printf" according to a format string with "%..." specifiers. sfmt is geared toward simple use cases, for more complex formatting consider using extensions such as fmt, format, etc.
[procedure] sfmt FMT-STRING ITEMS
- in - FMT-STRING - a string containing "%" specifiers as documented for the unix "printf" command.
- Specifiers recognized by sfmt include d, f, s, u, c, x, X, b, o and qualifiers l, ll. Precision, +/- components are recognized. A "%" character can be inserted with "%%".
- sfmt checks if ITEMS match the types of corresponding specifiers in FMT-STRING. If a type mismatch is evident, an error message is printed to the terminal and an empty string returned.
- in - ITEMS - data to be formatted. Must match the types of specifiers. Number of specifiers in the format string must equal the number of ITEMS.
- returns: formatted string or #f on error.
    (import scheme.base nutils)

    (define op (open-output-file "Report-doc.txt"))

    ;; print headers
    (printnl (sfmt "%-18s %-14s %-15s %-13s %s"
                   "Title" "Author" "Pub. Date" "Dept." "Cost")
             prn: op)
    (printnl (sfmt "%-18s %-14s %-15s %-13s %s"
                   "------------" "-------" "----------" "------" "-------")
             prn: op)
    (printnl (sfmt "%-18s %-14s %-15s %-8d %12.2f"
                   "Report No. 122" "Sam Jones" "2024-08-15" 1405 1223.57)
             prn: op)
    (close-output-port op)

    ;; file has this content:
    ;; Title              Author         Pub. Date       Dept.         Cost
    ;; ------------       -------        ----------      ------        -------
    ;; Report No. 122     Sam Jones      2024-08-15      1405          1223.57
measuring performance
Testing the performance of procedures can be informative: besides measuring execution time, it can reveal errors that remain hidden when code is run only once or a few times. bench-test and bench-test-graph have the widest applicability.
bench-test
[syntax] bench-test [BODY ...]
[syntax] bench-test [BODY ...] OUTLIER
[syntax] bench-test ITERS [BODY ...]
[syntax] bench-test ITERS [BODY ...] OUTLIER
[syntax] bench-test ITERS NSAMP [BODY ...]
[syntax] bench-test ITERS NSAMP [BODY ...] OUTLIER
[syntax] bench-test ITERS NSAMP SHOW [BODY ...]
[syntax] bench-test ITERS NSAMP SHOW [BODY ...] OUTLIER
- in - ITERS - number of iterations/sample. Default is 10000.
- in - NSAMP - number of samples that will be run. Default is 10.
- in - SHOW - a symbol, if supplied, specifies output options:
- returnls/retls/ls -- prints summary and returns data list: sample mean, stddev, stderr, variance, number of iters/sample, faults, lowest-data-value, highest-data-value, list of sample values (sample total iteration time/num iters)
- lsonly -- returns data list without other output
- reps -- prints sample values and summary of results
- repsls -- same as reps except also returns data list
- (#f, empty) -- prints summary only. This is the default.
- in - OUTLIER - if supplied, is a ratio of the given OUTLIER to the running average of measured values. A typical OUTLIER spec is >= 6.0. The purpose is reducing the number of unusably elevated datapoints. However the greater the number of datapoints screened out, the longer it takes to finish all sample runs. An OUTLIER value is not often useful or needed.
- Summary output has a number of fields, most are self-explanatory.
- Body: the expression being measured
- Test: iterations/sample and number of samples
- Mean: arithmetic mean, +/- stddev (nanoseconds/iteration)
- Result: lowest-value//highest-value and median
- Faults: number of "inverted" results, that is, where iteration start-time > ending-time. These values are excluded. Faults occur with very short interval times and reflect randomness of processing. Greater number of iterations/sample reduce random effects. Logically, as result nanosecs/iter increase fewer faults occur. When many faults are reported, the time to complete all samples lengthens disproportionately.
- Outliers: described above.
- (gen. *** sec): report generation time.
    (import nutils)

    ;; "empty body test"
    ;; Shows minimum interval, in this case running in csi
    (bench-test 100000 100 [])
    ===================================================
    Body:
    Test: 100000 iterations/sample * 100 samples.
    Mean: 19.01 +/-2.09 (nsec/iter)
    Result lo/hi: 15.13//27.52 Median: 18.57
    Faults: 0 Outliers: 0 (gen. 2.27 sec)

    (bench-test 100000 100 [(+ 1 2)])
    ===================================================
    Body: (+ 1 2)
    Test: 100000 iterations/sample * 100 samples.
    Mean: 50.47 +/-4.59 (nsec/iter)
    Result lo/hi: 43.68//66.91 Median: 49.41
    Faults: 0 Outliers: 0 (gen. 2.60 sec)

    (bench-test 100000 100 [(* 3 (sqrt (+ 233 101)))])
    ===================================================
    Body: (* 3 (sqrt (+ 233 101)))
    Test: 100000 iterations/sample * 100 samples.
    Mean: 331.49 +/-14.78 (nsec/iter)
    Result lo/hi: 312.78//425.71 Median: 328.01
    Faults: 0 Outliers: 0 (gen. 6.14 sec)

    (bench-test 100000 100 [(combine #\. "abc")])
    ===================================================
    Body: (combine #\. "abc")
    Test: 100000 iterations/sample * 100 samples.
    Mean: 120.28 +/-7.46 (nsec/iter)
    Result lo/hi: 112.47//154.95 Median: 119.47
    Faults: 0 Outliers: 0 (gen. 3.38 sec)
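The basic measurement behind figures like these (time a fixed number of iterations, divide by the iteration count, repeat for several samples) can be sketched with the R7RS (scheme time) library. This is a simplified illustration with hypothetical names (`nsec-per-iter`, `mean`, `samples`); it is not the bench-test implementation and omits fault and outlier handling.

```scheme
(import (scheme base) (scheme write) (scheme time))

;; Hypothetical helper: run THUNK ITERS times and return the average
;; elapsed time per iteration in nanoseconds.
(define (nsec-per-iter iters thunk)
  (let ((start (current-jiffy)))
    (do ((i 0 (+ i 1)))
        ((= i iters))
      (thunk))
    (let ((elapsed (- (current-jiffy) start)))
      (/ (* 1e9 (/ elapsed (jiffies-per-second))) iters))))

;; Arithmetic mean of a non-empty list of numbers.
(define (mean xs)
  (/ (apply + xs) (length xs)))

;; Collect 10 samples of 10000 iterations each.
(define samples
  (let loop ((n 10) (acc '()))
    (if (= n 0)
        acc
        (loop (- n 1)
              (cons (nsec-per-iter 10000 (lambda () (+ 1 2))) acc)))))

(display "mean nsec/iter: ")
(display (mean samples))
(newline)
```

The printed value depends on the machine and the Scheme implementation, and the resolution of current-jiffy limits how short an interval can be measured reliably.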
bench-test-graph
Extends bench-test with visual representation of results.
[syntax] bench-test-graph [BODY ...]
[syntax] bench-test-graph [BODY ...] OUTLIER
[syntax] bench-test-graph ITERS [BODY ...]
[syntax] bench-test-graph ITERS [BODY ...] OUTLIER
[syntax] bench-test-graph ITERS NSAMP [BODY ...]
[syntax] bench-test-graph ITERS NSAMP [BODY ...] OUTLIER
[syntax] bench-test-graph ITERS NSAMP ROWS [BODY ...]
[syntax] bench-test-graph ITERS NSAMP ROWS [BODY ...] OUTLIER
[syntax] bench-test-graph ITERS NSAMP ROWS GRPHCHR [BODY ...]
[syntax] bench-test-graph ITERS NSAMP ROWS GRPHCHR [BODY ...] OUTLIER
- in - BODY ... - expression(s) to measure
- in - ITERS - same as bench-test
- in - NSAMP - same as bench-test
- in - ROWS - number of rows in graph, defaults to 16.
- in - GRPHCHR - character used to draw graph, defaults to #.
- in - OUTLIER - same as bench-test
- bench-test-graph prints a summary identical to that of bench-test.
- In addition, bench-test-graph prints the following info:
- Range-size: time interval for each row of graph
- Graph symbols:
- stddevs from mean indicated by numbers -5 to +6, with 0 adjacent to row containing the mean.
- mode is marked with <
- median is marked with =
- >=low <high range limits (for each row)
    (bench-test-graph 100000 100 [(+ 1 2)])
    ===================================================
    Body: (+ 1 2)
    Test: 100000 iterations/sample * 100 samples.
    Mean: 49.86 +/-4.42 (nsec/iter)
    Result lo/hi: 43.65//63.89
    Median: 49.33
    Faults: 0 Outliers: 0 (gen. 2.56 sec)
    ---------------------------------------------------
    Range-size: 1.3
    Graph symbols: sd -5..0..+6 >=low <high mode '<' median '='
    ---------------------------------------------------
    43.6 44.9 |#################
    44.9 46.1 |######## -1
    46.1 47.4 |####
    47.4 48.7 |#########
    48.7 50.0 |##################### 0 < =
    50.0 51.2 |#########
    51.2 52.5 |############
    52.5 53.8 |#
    53.8 55.0 |##### +1
    55.0 56.3 |#######
    56.3 57.6 |#
    57.6 58.9 |# +2
    58.9 60.1 |#
    60.1 61.4 |##
    61.4 62.7 |#
    62.7 63.9 |# +3
Note: the shape of the graph reflects both iterations/sample and the number of samples. The "height" of the graph increases with the number of samples; the bar lengths sum to the number of samples. The number of iterations has a major effect on graph shape: generally, a graph more closely resembles a normal distribution as the number of iterations increases. Depending on system capabilities, >=10e6 iterations/sample can take a long time to complete. However, info such as average time is mostly independent of graph shape.
    (bench-test-graph 1000 100 8 [(split-string/chr "A big, long, string without obvious, visible ending" #\,)])
    ===================================================
    Body: (split-string/chr "A big, long, string without obvious, visible ending" #\,)
    Test: 1000 iterations/sample * 100 samples.
    Mean: 589.56 +/-25.07 (nsec/iter)
    Result lo/hi: 558.42//734.32
    Median: 582.77
    Faults: 0 Outliers: 0 (gen. 0.08 sec)
    ---------------------------------------------------
    Range-size: 22.0
    Graph symbols: sd -5..0..+6 >=low <high mode '<' median '='
    ---------------------------------------------------
    558.4 580.4 |######################################## -1
    580.4 602.4 |########################################### 0 < =
    602.4 624.4 |###### +1
    624.4 646.4 |######## +2
    646.4 668.4 |# +3
    668.4 690.4 |# +4
    690.4 712.4 |
    712.4 734.4 |# +5

    (bench-test-graph 100000 100 [(split-string/chr "A big, long, string without obvious, visible ending" #\,)])
    ===================================================
    Body: (split-string/chr "A big, long, string without obvious, visible ending" #\,)
    Test: 100000 iterations/sample * 100 samples.
    Mean: 597.43 +/-15.73 (nsec/iter)
    Result lo/hi: 569.59//648.58
    Median: 595.37
    Faults: 0 Outliers: 0 (gen. 8.28 sec)
    ---------------------------------------------------
    Range-size: 4.9
    Graph symbols: sd -5..0..+6 >=low <high mode '<' median '='
    ---------------------------------------------------
    569.5 574.5 |####
    574.5 579.4 |######
    579.4 584.4 |########## -1
    584.4 589.3 |###############
    589.3 594.3 |##############
    594.3 599.2 |######## 0 =
    599.2 604.1 |################ <
    604.1 609.1 |#######
    609.1 614.0 |##### +1
    614.0 619.0 |######
    619.0 623.9 |##
    623.9 628.9 |####
    628.9 633.8 | +2
    633.8 638.7 |#
    638.7 643.7 |#
    643.7 648.6 |# +3
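The Range-size values in the sample output are consistent with (hi - lo) / ROWS, for example (63.89 - 43.65) / 16 ≈ 1.3. The following is a minimal sketch of that kind of binning, assuming equal-width rows between the lowest and highest sample. The names `bin-samples` and `print-bars` are hypothetical and this is not the nutils implementation; it only illustrates why the bar lengths add up to the number of samples.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper, not part of nutils: count how many samples fall
;; into each of ROWS equal-width ranges between the lowest and highest
;; sample. Each range is >=low <high, except the last one, which also
;; takes the highest sample.
(define (bin-samples samples rows)
  (let* ((lo (apply min samples))
         (hi (apply max samples))
         (width (/ (- hi lo) rows))
         (counts (make-vector rows 0)))
    (for-each
     (lambda (x)
       (let* ((raw (if (zero? width)
                       0
                       (exact (floor (/ (- x lo) width)))))
              (ndx (min raw (- rows 1))))   ; clamp the highest sample
         (vector-set! counts ndx (+ 1 (vector-ref counts ndx)))))
     samples)
    (values width (vector->list counts))))

;; Hypothetical helper: draw one bar of '#' characters per row.
(define (print-bars samples rows)
  (call-with-values
      (lambda () (bin-samples samples rows))
    (lambda (width counts)
      (for-each (lambda (n)
                  (display "|")
                  (display (make-string n #\#))
                  (newline))
                counts))))

(print-bars '(1 2 3 4 5 6 7 8 9) 4)
;; prints:
;; |##
;; |##
;; |##
;; |###
```

With nine samples and four rows the row width is 2, and the four bars contain 2 + 2 + 2 + 3 = 9 marks, one per sample.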
low-level procedures/syntax
[procedure] curr-nanosec
- returns: the current time in nanoseconds as reported by the OS. Resolution varies among systems, but is generally 1-100 nanoseconds.
- in - CALIB - optional. If not #f (the default) should be a number to set as calibration interval, generally 20-30 (nanoseconds). Setting this number too high causes increased faults in bench-test.
- returns: current calibration setting, or default setting if CALIB is #f or not given.
    (import nutils)
    (calibrate) ;; => calibration: base=25.43, curr-total=25.43, bias-setting: 0.00
    (calibrate 30) ;; => Settings: base=25.38, bias=4.62, **calibration**=30.00
[syntax] verbosity ON?
- in - ON? - (boolean) turns on/off verbose output. Default is off.
- in - BODY ... - target expressions
- returns: run interval (nanoseconds)
[syntax] benchmark N-ITERATIONS (BODY ...)
- in - N-ITERATIONS - number of times the execution interval is measured. The reported time is the average of N-ITERATIONS trials. Defaults to 1000.
- Benchmark is primarily an internal routine.
- in - BODY ... - expressions to measure.
- returns: average execution time of BODY ....
- in - DATALIST - list of numeric data from which statistics are computed.
- Integer and float data are accepted.
- returns: list of the sample mean, standard deviation, standard error, variance, and median.
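As an illustration of the statistics listed above, here is a minimal sketch in portable R7RS Scheme. The names `sketch-stats`, `median-of`, `sort-numbers` and `insert-sorted` are hypothetical, and the use of the sample (n - 1) variance is an assumption; the actual nutils procedure may compute these values differently.

```scheme
(import (scheme base) (scheme inexact))

;; Insert X into the already sorted list SORTED.
(define (insert-sorted x sorted)
  (cond ((null? sorted) (list x))
        ((<= x (car sorted)) (cons x sorted))
        (else (cons (car sorted) (insert-sorted x (cdr sorted))))))

;; Simple insertion sort, adequate for small sample lists.
(define (sort-numbers lst)
  (let loop ((rest lst) (acc '()))
    (if (null? rest)
        acc
        (loop (cdr rest) (insert-sorted (car rest) acc)))))

;; Middle value, or the mean of the two middle values for even lengths.
(define (median-of lst)
  (let* ((sorted (sort-numbers lst))
         (n (length sorted))
         (mid (quotient n 2)))
    (if (odd? n)
        (list-ref sorted mid)
        (/ (+ (list-ref sorted (- mid 1)) (list-ref sorted mid)) 2))))

;; Hypothetical stand-in, not the nutils procedure. Returns
;; (mean stddev stderr variance median). Assumes sample variance
;; (divides by n - 1), so DATA needs at least two numbers.
(define (sketch-stats data)
  (let* ((n (length data))
         (mean (/ (apply + data) n))
         (variance (/ (apply + (map (lambda (x) (square (- x mean))) data))
                      (- n 1)))
         (sd (sqrt variance))
         (se (/ sd (sqrt n))))
    (list mean sd se variance (median-of data))))

(sketch-stats '(3 3 5 7 7))
```

For the data (3 3 5 7 7) the mean is 5, the variance 4, the standard deviation 2, the median 5, and the standard error 2/√5 ≈ 0.894. Both integer and float data work, matching the note above.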
misc
[syntax] inc! NUMBER AMT
[syntax] dec! NUMBER AMT
- in - NUMBER - object with numeric value to increment/decrement
- in - AMT - optional. If given, the amount to increment/decrement number. Default is 1.
- returns: the resulting numeric value to which the input NUMBER object is also set.
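Because inc! and dec! both update the variable and return the new value, they have to be syntax rather than procedures (a procedure cannot assign to its caller's variable). A minimal sketch of how such forms could be written with syntax-rules follows; `my-inc!` and `my-dec!` are hypothetical names, and the nutils definitions may differ.

```scheme
(import (scheme base))

;; Hypothetical stand-ins, not the nutils source.
;; With one argument the amount defaults to 1.
(define-syntax my-inc!
  (syntax-rules ()
    ((_ var) (my-inc! var 1))
    ((_ var amt) (begin (set! var (+ var amt)) var))))

(define-syntax my-dec!
  (syntax-rules ()
    ((_ var) (my-dec! var 1))
    ((_ var amt) (begin (set! var (- var amt)) var))))

(define n 24)
(my-inc! n)    ; n is now 25, and 25 is also the value of the form
(my-dec! n 5)  ; n is now 20
```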
    (import nutils)
    (define n 24)
    (define p 43.5)
    (inc! n) ;; => 25
    n ;; => 25
    (dec! p 22.8) ;; => 20.7
    p ;; => 20.7
[syntax] mkexact NUMBER
- in - NUMBER - exact or inexact number
- returns: NUMBER itself if it is exact. Otherwise NUMBER is rounded with round and the result is converted with exact.
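The "round, then exact" rule can be sketched as below. `my-mkexact` is a hypothetical procedure written only to illustrate the description; it assumes the standard Scheme round, which resolves ties to the even neighbor, and nutils itself defines mkexact as syntax.

```scheme
(import (scheme base))

;; Hypothetical stand-in for the behavior described above.
(define (my-mkexact n)
  (if (exact? n)
      n
      (exact (round n))))

(my-mkexact 7)    ; => 7  (already exact, returned unchanged)
(my-mkexact 2.4)  ; => 2
(my-mkexact 2.5)  ; => 2  (round picks the even neighbor on ties)
(my-mkexact 3.5)  ; => 4
```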
- in - TEST - a boolean expression
- in - BODY - BODY is evaluated until TEST is #f. The procedure break ... can be used to terminate the loop at any point.
- returns: value of BODY's last expression, or value input to break if it's used.
- in - ITEM - a scheme object
- returns: a symbol describing the type of the item, or 'unknown if the item's type can't be determined. type-of can identify most scheme objects. Number-vectors above u8 are identified as "number-vector".
    (import nutils chicken.number-vector)
    (type-of 'cat) ; => 'symbol
    (type-of 22.22) ; => 'float
    (type-of 22) ; => 'exact-int
    (type-of #u8(0 0 0 0)) ;; => 'bytevector
    (type-of #u32(0 0 0 0)) ;; => 'number-vector
[procedure] deg2rad DEG
- in - DEG - angle (degrees)
- returns: angle as radians
- in - RAD - angle (radians)
- returns: angle as degrees
- in - VAR - the "variable", name of generated procedure
- in - VAL - constant value returned from generated procedure
- returns: a toplevel procedural constant (a procedure of no arguments that returns VAL)
- generated by mkconst
- returns: value of PI, 3.14159265359
(import nutils) (print0nl "pi is approx " (PI)) ;; => pi is approx 3.14159265359
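The idea of a procedural constant, and the degree/radian conversions described earlier, can be sketched together. Everything named `my-...` or `MY-PI` below is hypothetical and written for illustration under the assumption that a constant is a procedure of no arguments; it is not the nutils source.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical stand-in for mkconst: defines VAR as a procedure of no
;; arguments that always returns VAL. VAL is evaluated once, when the
;; constant is defined.
(define-syntax my-mkconst
  (syntax-rules ()
    ((_ var val)
     (define var
       (let ((v val))
         (lambda () v))))))

(my-mkconst MY-PI 3.14159265359)

;; Degrees to radians: multiply by pi/180.
(define (my-deg2rad deg) (* deg (/ (MY-PI) 180)))

;; Radians to degrees: multiply by 180/pi.
(define (my-rad2deg rad) (* rad (/ 180 (MY-PI))))

(display (my-deg2rad 180))  ; approximately the value of (MY-PI)
(newline)
```

Calling the constant as `(MY-PI)` rather than referencing a variable mirrors the `(PI)` usage shown in the example above.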
LICENSE
BSD-2
Versions
- 0.4.0 - Revised/enhanced string and conversion routines
- 0.3.2 - Added parse-cmdln, parse-str, revised bench-test
- 0.3.1 - Bugfix, enhancements to string-processing/bench-test
- 0.3.0 - New: bench-test/bench-test-graph, revised str-ndx-ls, sfmt, and others.
- 0.2.8.1 - Bugfix release
- 0.2.8 - Error fixes, added alist/d*
- 0.2.6 - Fixed tests
- 0.2.5 - initial public release