csv-abnf
Description
The csv-abnf library contains procedures for parsing and formatting comma-separated values (CSV) as described in RFC 4180. The library differs from the RFC in several ways:
- The RFC prescribes CRLF standard network line breaks, but many CSV files have platform-dependent line endings, so this library accepts any sequence of CRs and LFs as a line break.
- In the RFC, a header line has exactly the same format as a regular record, and the presence of a header can only be determined from the MIME type. This library treats all lines as regular records.
- The formal grammar specifies that fields can contain only certain US-ASCII characters, but the specification of the MIME type allows for other character sets. This library allows all characters in fields, except that unquoted fields cannot contain the field delimiter character, CRs, or LFs.
- According to the RFC, all records must have the same number of fields. This library allows variable-length records.
- The delimiter character is specified by the user and can be a character other than comma, or an SRFI-14 character set.
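The relaxations listed above can be tried out with a short sketch. It assumes the egg is importable as in the parsing example later in this document; the input string and the names `input` and `rows` are made up for illustration, and no particular output is claimed here.

```scheme
(import abnf csv-abnf)

;; With no argument, the field delimiter defaults to the comma.
(define parse-csv (make-parser))

;; Hypothetical input: mixed CRLF and LF line breaks,
;; and records with different numbers of fields.
(define input "name,city\r\nalice,Berlin,extra\nbob")

;; The parser takes a list of characters and returns csv-record objects;
;; csv-record->list unwraps each record into a plain list of fields.
(define rows
  (map csv-record->list (parse-csv (string->list input))))
```

If the relaxations behave as described, `rows` should hold one list per input line, with the lists differing in length.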
See also csv-xml.
Library Procedures
[procedure] (csv-record? X) => BOOL
Returns #t if the given object is a csv-record, #f otherwise.
[procedure] (list->csv-record LIST) => CSV-RECORD
Takes in a list of values and creates a csv-record object.
[procedure] (csv-record->list CSV-RECORD) => LIST
Returns the list of values contained in the given csv-record object.
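A minimal sketch of how these three procedures fit together, assuming the egg is installed and importable as csv-abnf. The name `rec` and the field values are illustrative only.

```scheme
(import csv-abnf)

;; Wrap a plain list of field values in a csv-record object.
(define rec (list->csv-record '("alice" "42" "Berlin")))

;; The predicate distinguishes csv-record objects from plain lists.
(csv-record? rec)              ; #t
(csv-record? '("alice" "42"))  ; #f

;; Unwrapping gives back the list of field values.
(csv-record->list rec)
```

Wrapping rows in csv-record objects is what the formatting procedures described later in this document expect as input.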
Parsing Procedures: Preliminaries
The parsing procedures in csv-abnf are based on abnf, which provides the core parsing primitives used to build the CSV grammar parser (see the abnf library for more information).
Parsing Procedures: csv-abnf
[procedure] (make-parser [DELIMITER]) => PARSER
make-parser returns a CSV parsing procedure. The optional argument DELIMITER specifies the field delimiter (comma by default). DELIMITER can be a character or an SRFI-14 character set. The returned procedure takes in an input stream and returns a list of the form:
((<#csv-record (FIELD1 FIELD2 ...)>) (<#csv-record ... >))
where FIELD represents the field values in a record.
The following example creates a parser that uses | as the field delimiter and applies it to a list of characters.
(import abnf csv-abnf)
(define parse-csv (make-parser #\|))
(parse-csv (string->list "a|b|c"))
(map csv-record->list (parse-csv (string->list "a|b|c")))
==> (("a" "b" "c"))
Formatting procedures
[procedure] (make-format [DELIMITER]) => FORMAT-CELL * FORMAT-RECORD * FORMAT-CSV
Returns three procedures for outputting individual field values, CSV records, and lists of CSV records, where each record is printed on a separate line.
Procedure FORMAT-CELL takes in a value, obtains its string representation via format, and surrounds the string with quotes if it contains characters that need to be escaped (such as quote characters, the delimiter character, or newlines).
Procedure FORMAT-RECORD takes in a record of type csv-record and returns its string representation, based on the strings produced by FORMAT-CELL and the delimiter character.
Procedure FORMAT-CSV takes in a list of csv-record objects and produces a string representation using FORMAT-RECORD.
Example:
(use csv-abnf)

(define-values (fmt-cell fmt-record fmt-csv) (make-format ";"))

(fmt-cell "hello") => "hello"

;; This is quoted because it contains delimiter-characters
(fmt-cell "one;two;three") => "\"one;two;three\""

;; This is quoted because it contains quotes, which are then doubled for escaping
(fmt-cell "say \"hi\"") => "\"say \"\"hi\"\"\""

;; Converts one line at a time (useful when converting data in a streaming manner)
(fmt-record (list->csv-record '("hi there" "let's say \"hello world\" again" "until we are bored")))
=> "hi there;\"let's say \"\"hello world\"\" again\";until we are bored"

;; And an example of how to quickly convert a list of lists
;; to a CSV string containing the entire CSV file
(fmt-csv (map list->csv-record '(("one" "two") ("and another \"line\"" "of csv stuff"))))
=> "one;two\r\n\"and another \"\"line\"\"\";of csv stuff\r\n"
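In the example above, the string returned by fmt-record carries no line break, while fmt-csv ends each record with "\r\n". A streaming writer built on fmt-record therefore has to emit the line break itself. The following is a sketch under that assumption; `write-csv-rows` is a hypothetical helper, not part of the library, and the `import` form assumes CHICKEN 5.

```scheme
(import csv-abnf)

(define-values (fmt-cell fmt-record fmt-csv) (make-format ";"))

;; Hypothetical helper: write ROWS (lists of field values) to PORT,
;; one record at a time, without building the whole file in memory.
(define (write-csv-rows rows port)
  (for-each
   (lambda (row)
     (display (fmt-record (list->csv-record row)) port)
     ;; fmt-record adds no line break in the example above,
     ;; so terminate each record with CRLF here.
     (display "\r\n" port))
   rows))

(write-csv-rows '(("one" "two") ("three" "four"))
                (current-output-port))
```

For data that already fits in memory, calling fmt-csv on the full list of records is the simpler choice.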
Repository
https://github.com/iraikov/chicken-csv-abnf
Version History
- 5.3 Added module csv-char-list
- 5.1 utf8-related bug fixes
- 5.0 Compatibility with abnf 6.0 / using utf8 for char operations
- 4.7 Created module csv-string
- 4.6 Ensure unit test script returns proper exit code
- 4.5 Compatibility with CharLex->CoreABNF constructor in abnf
- 4.4 Added CR and LF to the set of characters allowed in escaped strings
- 4.3 Added csv-record? to list of exported identifiers
- 4.1 Fixes in the handling of escaped strings
- 4.0 Compatibility with abnf 5
- 3.2 Fixes to reflect changes in the regex API in Chicken 4.6.0
- 3.1 Added regex as an explicit dependency
- 3.0 Implemented typeclass interface
- 2.0 Added formatting routines
- 1.0 Initial Release
License
Copyright 2009-2018 Ivan Raikov
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A full copy of the GPL license can be found at <http://www.gnu.org/licenses/>.