Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
libxml2
Libxml2 is an XML C parser and toolkit with DOM, SAX and text-reader APIs.
- Outdated egg!
- libxml2
- LibXML2
- Author
- Upstream
- Egg Source Code
- Miscellaneous
- DOM Parser
- Example
- Node Types
- dom:element-node
- dom:attribute-node
- dom:text-node
- dom:cdata_section_node
- dom:entity-ref-node
- dom:entity-node
- dom:pi-node
- dom:comment-node
- dom:document-node
- dom:document-type-node
- dom:document-frag-node
- dom:notation-node
- dom:html-document-node
- dom:dtd-node
- dom:element-decl
- dom:attribute-decl
- dom:entity-decl
- dom:namespace-decl
- dom:xinclude-start
- dom:xinclude-end
- API
- dom:is-element-node?
- dom:is-text-node?
- dom:is-attribute-node?
- dom:parse-string
- dom:parse-string-default
- dom:cleanup-parser
- dom:parse-file
- dom:free-doc
- dom:make-parser-context
- dom:read-file-with-context
- dom:is-valid?
- dom:free-parser-context
- dom:to-string
- dom:next-node
- dom:node-content
- dom:node-children
- dom:node-name
- dom:is-element-name?
- dom:get-attribute
- dom:attributes
- SAX Parser
- Text Reader Parser
- Example
- Node Types
- text-reader:none
- text-reader:element
- text-reader:attribute
- text-reader:text
- text-reader:cdata
- text-reader:entity-reference
- text-reader:entity
- text-reader:processing-instruction
- text-reader:comment
- text-reader:document
- text-reader:document-type
- text-reader:document-fragmenta
- text-reader:notation
- text-reader:whitespace
- text-reader:significant-whitespace
- text-reader:end-element
- text-reader:end-entity
- text-reader:xml-declaration
- API
- text-reader:element-to-string
- text-reader:end-element-is?
- text-reader:start-element-is?
- text-reader:end-element-node?
- text-reader:text-node?
- text-reader:element-node?
- text-reader:make
- text-reader:read-more
- text-reader:free
- text-reader:node-type
- text-reader:empty-element?
- text-reader:move-to-attribute
- text-reader:all-attributes
- text-reader:move-to-next-attribute
- text-reader:move-to-first-attribute
- text-reader:move-to-element
- text-reader:next
- text-reader:next-sibling
- text-reader:name
- text-reader:value
- About this egg
LibXML2
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform); it is free software available under the MIT License. XML itself is a metalanguage for designing markup languages, i.e. text languages where semantics and structure are added to the content using extra 'markup' information enclosed between angle brackets. HTML is the most well-known markup language. Though the library is written in C, a variety of language bindings make it available in other environments.
Author
David Ireland (djireland79 at gmail dot com)
Upstream
Egg Source Code
https://gitlab.com/maxwell79/chicken-libxml2
libxml
[module] libxml
- attributes->string
- text-reader:element-to-string
- text-reader:end-element-is?
- text-reader:start-element-is?
- text-reader:end-element-node?
- text-reader:text-node?
- text-reader:element-node?
- text-reader:make
- text-reader:read-more
- text-reader:free
- text-reader:depth
- text-reader:node-type
- text-reader:empty-element?
- text-reader:move-to-attribute
- text-reader:all-attributes
- text-reader:move-to-next-attribute
- text-reader:move-to-first-attribute
- text-reader:move-to-element
- text-reader:next
- text-reader:next-sibling
- text-reader:name
- text-reader:value
- sax:attributes->list
- sax:parse-file
- sax:parse-string
- sax:make-handler
- sax:free-handler
- dom:is-element-node?
- dom:is-text-node?
- dom:is-attribute-node?
- dom:parse-string
- dom:parse-string-default
- dom:cleanup-parser
- dom:memory-dump
- dom:parse-file
- dom:free-doc
- dom:make-parser-context
- dom:read-file-with-context
- dom:is-valid?
- dom:free-parser-context
- dom:to-string
- dom:copy-doc
- dom:root-element
- dom:copy-node
- dom:copy-node-list
- dom:next-node
- dom:node-content
- dom:node-children
- dom:node-type
- dom:node-name
- dom:is-element-name?
- dom:get-attribute
- dom:attributes
Miscellaneous
attributes->string
[procedure] (attributes->string attributes) → string
Converts an attribute list to a string.
- attributes: List of attributes
Examples
Example:
(attributes->string `(("id1" . "value1") ("id2" . "value2"))) => " id2=\"value2\" id1=\"value1\""
DOM Parser
DOM stands for the Document Object Model; this is an API for accessing XML or HTML structured documents.
Example
(define (dom-demo)
  ;; Walk the tree depth-first, printing each element, its attributes,
  ;; and any text content.
  (define (print-element-names node)
    (let loop ((n node))
      (when n
        (when (dom:is-element-node? n)
          (print "element <" (dom:node-name n) ">")
          (print "@ => " (dom:attributes n)))
        (when (dom:is-text-node? n)
          (print "content => " (dom:node-content n)))
        (print-element-names (dom:node-children n))
        (loop (dom:next-node n)))))
  (define ctx (dom:make-parser-context))
  (define doc (dom:read-file-with-context ctx "foo.xml" #f 0))
  (define root (dom:root-element doc))
  (define valid? (dom:is-valid? ctx))
  (print "XML is valid?: " valid?)
  (print "root: " root)
  (print-element-names root)
  ;; Release the document, then the parser's global state.
  (dom:free-doc doc)
  (dom:cleanup-parser))
Node Types
dom:element-node
[constant] dom:element-node → 1
DOM element node
dom:attribute-node
[constant] dom:attribute-node → 2
DOM attribute node
dom:text-node
[constant] dom:text-node → 3
DOM text node
dom:cdata_section_node
[constant] dom:cdata_section_node → 4
DOM CDATA section node
dom:entity-ref-node
[constant] dom:entity-ref-node → 5
DOM entity reference node
dom:entity-node
[constant] dom:entity-node → 6
DOM entity node
dom:pi-node
[constant] dom:pi-node → 7
DOM processing instruction node
dom:comment-node
[constant] dom:comment-node → 8
DOM comment node
dom:document-node
[constant] dom:document-node → 9
DOM document node
dom:document-type-node
[constant] dom:document-type-node → 10
DOM document type node
dom:document-frag-node
[constant] dom:document-frag-node → 11
DOM document fragment node
dom:notation-node
[constant] dom:notation-node → 12
DOM notation node
dom:html-document-node
[constant] dom:html-document-node → 13
DOM HTML document node
dom:dtd-node
[constant] dom:dtd-node → 14
DOM DTD node
dom:element-decl
[constant] dom:element-decl → 15
DOM element declaration
dom:attribute-decl
[constant] dom:attribute-decl → 16
DOM attribute declaration
dom:entity-decl
[constant] dom:entity-decl → 17
DOM entity declaration
dom:namespace-decl
[constant] dom:namespace-decl → 18
DOM namespace declaration
dom:xinclude-start
[constant] dom:xinclude-start → 19
DOM XInclude start declaration
dom:xinclude-end
[constant] dom:xinclude-end → 20
DOM XInclude end declaration
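When debugging, it can help to turn the numeric values listed above back into readable names. The following is a minimal sketch in plain Scheme that uses only the numbers from the list; `dom-node-type-names` and `dom-node-type-name` are hypothetical helpers, not part of the egg.

```scheme
;; Hypothetical helper (not part of the egg): readable names for the
;; numeric DOM node types 1..20 listed above, in order.
(define dom-node-type-names
  '#(element-node attribute-node text-node cdata-section-node
     entity-ref-node entity-node pi-node comment-node document-node
     document-type-node document-frag-node notation-node
     html-document-node dtd-node element-decl attribute-decl
     entity-decl namespace-decl xinclude-start xinclude-end))

(define (dom-node-type-name type)
  ;; Returns a symbol for a known type, or #f for anything else.
  (and (integer? type)
       (exact? type)
       (<= 1 type (vector-length dom-node-type-names))
       (vector-ref dom-node-type-names (- type 1))))
```

With the egg loaded, the numeric type of a node would presumably come from `dom:node-type`, which the module exports but this page does not document, so its argument form is an assumption.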
API
dom:is-element-node?
[procedure] (dom:is-element-node? node) → boolean
Checks if the specified dom:node is an element node.
- node: A dom:xml-node
dom:is-text-node?
[procedure] (dom:is-text-node? node) → boolean
Checks if the specified dom:node is a text node.
- node: A dom:xml-node
dom:is-attribute-node?
[procedure] (dom:is-attribute-node? node) → boolean
Checks if the specified dom:node is an attribute node.
- node: A dom:xml-node
dom:parse-string
[procedure] (dom:parse-string xml-string xml-size URL encoding options) → dom:doc
Parses a string using the DOM parser API.
- xml-string: XML string
- xml-size: Size of the XML string
- URL: XML URL
- encoding: Encoding
- options: Options
dom:parse-string-default
[procedure] (dom:parse-string-default str) → dom:doc
Parses a string using the DOM parser API with default options and encoding.
- str: XML string
dom:cleanup-parser
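[constant] dom:cleanup-parser → (foreign-lambda void xmlCleanupParser)
Cleans up the global memory allocated by the libxml2 parser. It does not free any dom:doc; use dom:free-doc for that.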
dom:parse-file
[procedure] (dom:parse-file filename) → dom:doc
Parses a file using the DOM parser API.
- filename: XML file
dom:free-doc
[procedure] (dom:free-doc doc) → unspecified
Frees the dom:doc.
- doc: The dom:doc to free
dom:make-parser-context
[procedure] (dom:make-parser-context) → dom:parser-context
Creates a DOM parser context.
dom:read-file-with-context
[procedure] (dom:read-file-with-context context filename encoding options) → dom:doc
Parses an XML file using the given DOM parser context.
- context: DOM parser context
- filename
- encoding
- options
dom:is-valid?
[procedure] (dom:is-valid? context) → boolean
Checks if the parser context is valid after parsing a file.
- context: DOM parser context
dom:free-parser-context
[procedure] (dom:free-parser-context) → unspecified
Frees the dom:parser-context.
dom:to-string
[procedure] (dom:to-string) → string
Converts a dom:node to a string, including its child nodes.
dom:next-node
[procedure] (dom:next-node node) → dom:node
Returns the next dom:node at the same level (the following sibling).
- node: A dom:node
dom:node-content
[procedure] (dom:node-content node) → string
Returns the contents (text) of the dom:node.
- node: A dom:node
dom:node-children
[procedure] (dom:node-children node) → dom:node
Returns the first child node.
- node: A dom:node
dom:node-name
[procedure] (dom:node-name node) → string
Returns the name of the dom:node.
- node: A dom:node
dom:is-element-name?
[procedure] (dom:is-element-name? name dom:node) → boolean
Checks if the name of the dom:node matches the specified string.
- name: Name (string) to match
- dom:node
dom:get-attribute
[procedure] (dom:get-attribute key dom:node) → string
Returns the value of the attribute with the specified key.
- key: string
- dom:node
dom:attributes
[procedure] (dom:attributes n) → association list
Returns the complete set of XML attributes for the given node.
- n: dom:node
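The traversal procedures and the attribute accessors combine naturally. The following is a minimal sketch that collects one attribute from every element with a given name, in document order. `attribute-values` is a hypothetical helper, not part of the egg. It assumes the egg's dom: bindings are already loaded and that `dom:node-children` and `dom:next-node` return #f when there is nothing further, as the example above relies on. What `dom:get-attribute` returns for a missing attribute is not documented here, so callers may want to check the result.

```scheme
;; Assumption: the egg's dom:* bindings are already loaded and imported.
;; Hypothetical helper: collect the value of attribute `key` from every
;; element named `element-name`, starting at `root`, in document order.
(define (attribute-values root element-name key)
  (reverse
   (let loop ((n root) (acc '()))
     (if (not n)
         acc
         (let* ((acc (if (and (dom:is-element-node? n)
                              (dom:is-element-name? element-name n))
                         ;; Note the argument order: key first, node second.
                         (cons (dom:get-attribute key n) acc)
                         acc))
                ;; Visit the children before moving on to the next sibling.
                (acc (loop (dom:node-children n) acc)))
           (loop (dom:next-node n) acc))))))
```

A call might look like `(attribute-values (dom:root-element doc) "item" "id")`, where the element and attribute names are placeholders.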
SAX Parser
Sometimes the DOM tree output is just too large to fit reasonably into memory. In that case (and if you don't expect to save back the XML document loaded using libxml), it's better to use the SAX interface of libxml. SAX is a callback-based interface to the parser. Before parsing, the application layer registers a customized set of callbacks which are called by the library as it progresses through the XML input.
Example
(define (sax-demo)
  (define sax
    (sax:make-handler
     ;; on-start: element name and its attributes
     (lambda (localname attribute-list)
       (print "<" localname ">")
       (print "@ => " attribute-list))
     ;; on-end: element name
     (lambda (localname) (print "<" localname "/>"))
     ;; on-characters: text content
     (lambda (characters) (print "[on-chars]: characters: " characters))))
  (sax:parse-file sax #f "foo.xml")
  (sax:free-handler sax))
sax:parse-file
[procedure] (sax:parse-file handler user-data filename) → number
Parses an XML file using the SAX handler.
- handler: SAX handler
- user-data: SAX parser context
- filename: XML file
sax:parse-string
[procedure] (sax:parse-string sax-handler user-data xml-string size) → number
Parses an XML string using the SAX handler.
- sax-handler
- user-data: SAX parser context
- xml-string
- size: The size of the XML string
sax:make-handler
[procedure] (sax:make-handler on-start on-end on-characters) → sax-handler
Makes a SAX handler.
- on-start: λ called on start of element
- on-end: λ called on end of element
- on-characters: λ called on start of reading characters
sax:free-handler
[procedure] (sax:free-handler sax-handler) → unspecified
Frees the SAX handler.
- sax-handler
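Because the callbacks are ordinary closures, they can share state between them. The following is a minimal sketch that counts start-element events. `make-element-counter` and `count-elements` are hypothetical helpers, not part of the egg. The callback argument lists and the three-argument call to `sax:parse-file` are taken from the example above, and the egg's sax: bindings are assumed to be loaded already.

```scheme
;; Hypothetical helper: build the three callbacks expected by
;; sax:make-handler, plus a thunk that reads the shared counter.
(define (make-element-counter)
  (let ((count 0))
    (values
     (lambda (localname attribute-list)   ; on-start: count the element
       (set! count (+ count 1)))
     (lambda (localname) #t)              ; on-end: nothing to do
     (lambda (characters) #t)             ; on-characters: nothing to do
     (lambda () count))))                 ; read the running total

;; Assumption: the egg's sax:* bindings are loaded, and sax:parse-file
;; takes (handler user-data filename) as in the example above.
(define (count-elements filename)
  (call-with-values make-element-counter
    (lambda (on-start on-end on-chars total)
      (let ((handler (sax:make-handler on-start on-end on-chars)))
        (sax:parse-file handler #f filename)
        (sax:free-handler handler)
        (total)))))
```

Keeping the counter inside a closure avoids global state, so several files can be counted independently.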
Text Reader Parser
Libxml2's main API is tree based: the parsing operation loads the document completely into memory and exposes it as a tree of nodes that are all available at the same time. This is very simple and quite powerful, but has the major limitation that the size of the document that can be handled is limited by the amount of memory available. Libxml2 also provides a SAX-based API, but that version was designed upon one of the early expat versions of SAX, and SAX is not formally defined for C. SAX basically works by registering callbacks which are called directly by the parser as it progresses through the document stream. The problem is that this programming model is relatively complex, is not well standardized, cannot provide validation directly, and makes entity, namespace and base processing relatively hard.
The text-reader API provides a far simpler programming model. The API acts as a cursor going forward on the document stream and stopping at each node on the way. The user's code keeps control of the progress and simply calls a read procedure (text-reader:read-more in this egg) repeatedly to advance to each node in document order. There is direct support for namespaces, xml:base and entity handling, and adding DTD validation on top of it was relatively simple. This API is really close to the DOM Core specification. It provides a far more standard, easy-to-use and powerful API than the existing SAX. Moreover, integrating extension features based on the tree seems relatively easy.
In a nutshell, the text-reader API provides a simpler, more standard and more extensible interface for handling large documents than the existing SAX version.
Example
(define (text-reader-demo)
  (define tr (text-reader:make "foo.xml"))
  (define (helper tr)
    (when (text-reader:element-node? tr)
      (print "<" (text-reader:name tr) ">")
      (print "@ => " (text-reader:all-attributes tr)))
    (when (text-reader:text-node? tr)
      (print "value =>" (text-reader:value tr)))
    ;; Keep going while read-more reports that another node was read.
    (if (> (text-reader:read-more tr) 0) (helper tr)))
  (helper tr)
  (text-reader:free tr))
Node Types
text-reader:none
[constant] text-reader:none → 0
Text-Reader none
text-reader:element
[constant] text-reader:element → 1
Text-Reader element
text-reader:attribute
[constant] text-reader:attribute → 2
Text-Reader attribute
text-reader:text
[constant] text-reader:text → 3
Text-Reader text
text-reader:cdata
[constant] text-reader:cdata → 4
Text-Reader CDATA
text-reader:entity-reference
[constant] text-reader:entity-reference → 5
Text-Reader entity reference
text-reader:entity
[constant] text-reader:entity → 6
Text-Reader entity
text-reader:processing-instruction
[constant] text-reader:processing-instruction → 7
Text-Reader processing instruction
text-reader:comment
[constant] text-reader:comment → 8
Text-Reader comment
text-reader:document
[constant] text-reader:document → 9
Text-Reader document
text-reader:document-type
[constant] text-reader:document-type → 10
Text-Reader document type
text-reader:document-fragmenta
[constant] text-reader:document-fragmenta → 11
Text-Reader document fragment
text-reader:notation
[constant] text-reader:notation → 12
Text-Reader notation
text-reader:whitespace
[constant] text-reader:whitespace → 13
Text-Reader whitespace
text-reader:significant-whitespace
[constant] text-reader:significant-whitespace → 14
Text-Reader significant whitespace
text-reader:end-element
[constant] text-reader:end-element → 15
Text-Reader element end
text-reader:end-entity
[constant] text-reader:end-entity → 16
Text-Reader entity end
text-reader:xml-declaration
[constant] text-reader:xml-declaration → 17
Text-Reader XML declaration
API
text-reader:element-to-string
[procedure] (text-reader:element-to-string r) → string
Converts the current element of the text reader to a string, including child nodes.
- r: text-reader
text-reader:end-element-is?
[procedure] (text-reader:end-element-is? name reader) → boolean
Checks if the current node is an end element with the specified name.
- name: Element name (string)
- reader: text-reader
text-reader:start-element-is?
[procedure] (text-reader:start-element-is? name reader) → boolean
Checks if the current node is a start element with the specified name.
- name: Element name (string)
- reader: text-reader
text-reader:end-element-node?
[procedure] (text-reader:end-element-node? reader) → boolean
Checks if the current node is an end element.
- reader
text-reader:text-node?
[procedure] (text-reader:text-node? reader) → boolean
Checks if the current node is a text node.
- reader
text-reader:element-node?
[procedure] (text-reader:element-node? reader) → boolean
Checks if the current node is an element.
- reader
text-reader:make
[procedure] (text-reader:make filename) → text-reader
Makes a new text-reader.
- filename
text-reader:read-more
[procedure] (text-reader:read-more text-reader) → number
Reads the next node in the text-reader. The example above treats a result greater than 0 as meaning that a node was read.
- text-reader
text-reader:free
[procedure] (text-reader:free text-reader) → unspecified
Frees the specified text-reader.
- text-reader
text-reader:node-type
[procedure] (text-reader:node-type text-reader) → node type (number)
Returns the node type of the current node.
- text-reader
text-reader:empty-element?
[procedure] (text-reader:empty-element? text-reader) → boolean
Checks if the current element of the text-reader is empty.
- text-reader
text-reader:move-to-attribute
[procedure] (text-reader:move-to-attribute text-reader attribute-name) → number
Moves text-reader to the specified attribute.
- text-reader
- attribute-name: (string)
text-reader:all-attributes
[procedure] (text-reader:all-attributes r) → list
Extracts all the attributes from the element. Attributes are placed into an association list.
- r: text-reader
text-reader:move-to-next-attribute
[procedure] (text-reader:move-to-next-attribute text-reader) → number
Moves text-reader to the next attribute.
- text-reader
text-reader:move-to-first-attribute
[procedure] (text-reader:move-to-first-attribute text-reader) → number
Moves text-reader to the first attribute.
- text-reader
text-reader:move-to-element
[procedure] (text-reader:move-to-element text-reader) → number
Moves text-reader back to the element that contains the current attribute.
- text-reader
text-reader:next
[procedure] (text-reader:next text-reader) → number
Moves text-reader to the next node.
- text-reader
text-reader:next-sibling
[procedure] (text-reader:next-sibling text-reader) → number
Moves text-reader to the next sibling node.
- text-reader
text-reader:name
[procedure] (text-reader:name text-reader) → string
Returns the name of the node.
- text-reader
text-reader:value
[procedure] (text-reader:value text-reader) → string
Returns the value of the node.
- text-reader
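The cursor style makes it easy to return data instead of printing it. The following is a minimal sketch that gathers the names of all start elements in a file. `element-names` is a hypothetical helper, not part of the egg. It assumes the egg's text-reader: bindings are already loaded and that `text-reader:read-more` returns a number greater than 0 while nodes remain, as the example above relies on.

```scheme
;; Assumption: the egg's text-reader:* bindings are already loaded.
;; Hypothetical helper: return the names of all start elements in
;; `filename`, in document order.
(define (element-names filename)
  (let ((tr (text-reader:make filename)))
    (let loop ((acc '()))
      ;; Inspect the current node first, then advance the cursor,
      ;; mirroring the order used in the example above.
      (let ((acc (if (text-reader:element-node? tr)
                     (cons (text-reader:name tr) acc)
                     acc)))
        (if (> (text-reader:read-more tr) 0)
            (loop acc)
            (begin
              ;; No more nodes: release the reader before returning.
              (text-reader:free tr)
              (reverse acc)))))))
```

Only the accumulated names are held in memory, so this approach stays usable for documents too large for the DOM API.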
About this egg
Author
Colophon
Documented by hahn.