== Outdated egg!

This is an egg for CHICKEN 4, the unsupported old release.  You're almost certainly looking for [[/eggref/5/libxml2|the CHICKEN 5 version of this egg]], if it exists.

If it does not exist, there may be equivalent functionality provided by another egg; have a look at the [[https://wiki.call-cc.org/chicken-projects/egg-index-5.html|egg index]]. Otherwise, please consider porting this egg to the current version of CHICKEN.

== libxml2

Libxml2 is an XML C parser and toolkit with DOM, SAX and text-reader APIs.
[[toc:]]
=== LibXML2
Libxml2 is the XML C parser and toolkit developed for the Gnome project
(but usable outside of the Gnome platform). It is free software available under
the MIT License. XML itself is a metalanguage for designing markup languages,
i.e. text languages where semantics and structure are added to the content
using extra 'markup' information enclosed between angle brackets. HTML is the
most well-known markup language. Though the library is written in C, a variety
of language bindings make it available in other environments.

=== Author
David Ireland (djireland79 at gmail dot com)

=== Upstream
[[http://xmlsoft.org/]]

=== Egg Source Code
[[https://gitlab.com/maxwell79/chicken-libxml2]]

==== {{libxml}}
'''[module]''' {{libxml}}

* [[#attributes->string]]
* [[#text-reader:element-to-string]]
* [[#text-reader:end-element-is?]]
* [[#text-reader:start-element-is?]]
* [[#text-reader:end-element-node?]]
* [[#text-reader:text-node?]]
* [[#text-reader:element-node?]]
* [[#text-reader:make]]
* [[#text-reader:read-more]]
* [[#text-reader:free]]
* [[#text-reader:depth]]
* [[#text-reader:node-type]]
* [[#text-reader:empty-element?]]
* [[#text-reader:move-to-attribute]]
* [[#text-reader:all-attributes]]
* [[#text-reader:move-to-next-attribute]]
* [[#text-reader:move-to-first-attribute]]
* [[#text-reader:move-to-element]]
* [[#text-reader:next]]
* [[#text-reader:next-sibling]]
* [[#text-reader:name]]
* [[#text-reader:value]]
* [[#sax:attributes->list]]
* [[#sax:parse-file]]
* [[#sax:parse-string]]
* [[#sax:make-handler]]
* [[#sax:free-handler]]
* [[#dom:is-element-node?]]
* [[#dom:is-text-node?]]
* [[#dom:is-attribute-node?]]
* [[#dom:parse-string]]
* [[#dom:parse-string-default]]
* [[#dom:cleanup-parser]]
* [[#dom:memory-dump]]
* [[#dom:parse-file]]
* [[#dom:free-doc]]
* [[#dom:make-parser-context]]
* [[#dom:read-file-with-context]]
* [[#dom:is-valid?]]
* [[#dom:free-parser-context]]
* [[#dom:to-string]]
* [[#dom:copy-doc]]
* [[#dom:root-element]]
* [[#dom:copy-node]]
* [[#dom:copy-node-list]]
* [[#dom:next-node]]
* [[#dom:node-content]]
* [[#dom:node-children]]
* [[#dom:node-type]]
* [[#dom:node-name]]
* [[#dom:is-element-name?]]
* [[#dom:get-attribute]]
* [[#dom:attributes]]
=== Miscellaneous
==== {{attributes->string}}
<procedure>(attributes->string attributes) → string</procedure>
Converts an attribute list to string
; {{attributes}} : List of attributes
===== Examples
Example: 
 (attributes->string `(("id1" . "value1") ("id2" . "value2")))
  => " id2=\"value2\" id1=\"value1\""
 
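The output format shown above (a leading space before each pair, with the pairs in reverse order) can be reproduced in plain Scheme. The following is only an illustrative sketch and not the egg's actual implementation; the name {{attributes->string*}} is hypothetical, and it assumes every key and value is already a string.

<enscript highlight="scheme">;; Hypothetical re-implementation, for illustration only.
;; Each pair is prepended to the accumulator, so the result lists the
;; attributes in reverse order, as in the example above.
(define (attributes->string* attributes)
  (let loop ((rest attributes) (acc ""))
    (if (null? rest)
        acc
        (loop (cdr rest)
              (string-append " " (caar rest) "=\"" (cdar rest) "\"" acc)))))
</enscript>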
=== DOM Parser
DOM stands for the Document Object Model; this is an API for accessing
XML or HTML structured documents.

==== Example
<enscript highlight="scheme">(define (dom-demo)
  (define (print-element-names node)
    (let loop ((n node))
      (when n
            (when (dom:is-element-node? n)
                  (print "element <" (dom:node-name n) ">")
                  (print "@ => " (dom:attributes n)))
            (when (dom:is-text-node? n)
                  (print "content => " (dom:node-content n)))
            (print-element-names (dom:node-children n))
            (loop (dom:next-node n)))))
  (define ctx (dom:make-parser-context))
  (define doc (dom:read-file-with-context ctx "foo.xml" #f 0))
  (define root (dom:root-element doc))
  (define valid? (dom:is-valid? ctx))
  (print "XML is valid?: " valid?)
  (print "root: " root)
  (print-element-names root)
  (dom:free-doc doc)
  (dom:cleanup-parser))
</enscript>
==== Node Types
===== {{dom:element-node}}
<constant>dom:element-node → 1</constant>
DOM element node
===== {{dom:attribute-node}}
<constant>dom:attribute-node → 2</constant>
DOM attribute node
===== {{dom:text-node}}
<constant>dom:text-node → 3</constant>
DOM text node
===== {{dom:cdata_section_node}}
<constant>dom:cdata_section_node → 4</constant>
DOM CData node
===== {{dom:entity-ref-node}}
<constant>dom:entity-ref-node → 5</constant>
DOM Entity reference node
===== {{dom:entity-node}}
<constant>dom:entity-node → 6</constant>
DOM entity node
===== {{dom:pi-node}}
<constant>dom:pi-node → 7</constant>
DOM processing instruction node
===== {{dom:comment-node}}
<constant>dom:comment-node → 8</constant>
DOM comment node
===== {{dom:document-node}}
<constant>dom:document-node → 9</constant>
DOM document node
===== {{dom:document-type-node}}
<constant>dom:document-type-node → 10</constant>
DOM document type node
===== {{dom:document-frag-node}}
<constant>dom:document-frag-node → 11</constant>
DOM document frag node
===== {{dom:notation-node}}
<constant>dom:notation-node → 12</constant>
DOM notation node
===== {{dom:html-document-node}}
<constant>dom:html-document-node → 13</constant>
DOM HTML document node
===== {{dom:dtd-node}}
<constant>dom:dtd-node → 14</constant>
DOM DTD node
===== {{dom:element-decl}}
<constant>dom:element-decl → 15</constant>
DOM element declaration
===== {{dom:attribute-decl}}
<constant>dom:attribute-decl → 16</constant>
DOM attribute declaration
===== {{dom:entity-decl}}
<constant>dom:entity-decl → 17</constant>
DOM entity declaration
===== {{dom:namespace-decl}}
<constant>dom:namespace-decl → 18</constant>
DOM namespace declaration
===== {{dom:xinclude-start}}
<constant>dom:xinclude-start → 19</constant>
DOM xinclude start declaration
===== {{dom:xinclude-end}}
<constant>dom:xinclude-end → 20</constant>
DOM xinclude end declaration
==== API
===== {{dom:is-element-node?}}
<procedure>(dom:is-element-node? node) → boolean</procedure>
Checks if specified dom:node is an element node
; {{node}} : A dom:xml-node
===== {{dom:is-text-node?}}
<procedure>(dom:is-text-node? node) → boolean</procedure>
Checks if specified dom:node is a text node
; {{node}} : A dom:xml-node
===== {{dom:is-attribute-node?}}
<procedure>(dom:is-attribute-node? node) → boolean</procedure>
Checks if specified dom:node is an attribute node
; {{node}} : A dom:xml-node
===== {{dom:parse-string}}
<procedure>(dom:parse-string xml-string xml-size URL encoding options) → dom:doc</procedure>
Parse string using the DOM parser API
; {{xml-string}} : XML string
; {{xml-size}} : Size of the XML string
; {{URL}} : XML URL
; {{encoding}} : Encoding
; {{options}} : Options
===== {{dom:parse-string-default}}
<procedure>(dom:parse-string-default str) → dom:doc</procedure>
Parse string using the DOM parser API with default options and encoding
; {{str}} : XML string
===== {{dom:cleanup-parser}}
<constant>dom:cleanup-parser → (foreign-lambda void xmlCleanupParser)</constant>
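Cleans up the memory allocated by the libxml2 parser itself (wraps {{xmlCleanupParser}}). Call it as a procedure with no arguments once parsing is finished, as in the example above.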
===== {{dom:parse-file}}
<procedure>(dom:parse-file filename) → dom:doc</procedure>
Parse a file using the DOM parser API
; {{filename}} : XML file
===== {{dom:free-doc}}
<procedure>(dom:free-doc doc) → unspecified</procedure>
Free the dom:doc
; {{doc}} : The dom:doc to free

===== {{dom:make-parser-context}}
<procedure>(dom:make-parser-context) → dom:parser-context</procedure>
Create a DOM parser context

===== {{dom:read-file-with-context}}
<procedure>(dom:read-file-with-context context filename encoding options) → dom:doc</procedure>
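Parse an XML file using the given DOM parser context
; {{context}} : DOM parser context
; {{filename}} : XML file to parse
; {{encoding}} : Document encoding (the example above passes {{#f}})
; {{options}} : Parser options (the example above passes {{0}})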
===== {{dom:is-valid?}}
<procedure>(dom:is-valid? context) → boolean</procedure>
Checks if the parser context is valid after parsing a file
; {{context}} : DOM parser context
===== {{dom:free-parser-context}}
<procedure>(dom:free-parser-context context) → unspecified</procedure>
Free the dom:parser-context
; {{context}} : DOM parser context to free

===== {{dom:to-string}}
<procedure>(dom:to-string node) → string</procedure>
Convert a dom:node to string including the children nodes
; {{node}} : A dom:node

===== {{dom:next-node}}
<procedure>(dom:next-node node) → dom:node</procedure>
Returns the node that follows the given dom:node
; {{node}} : A dom:node

===== {{dom:node-content}}
<procedure>(dom:node-content node) → string</procedure>
Returns the contents (text) of the dom:node
; {{node}} : A dom:node

===== {{dom:node-children}}
<procedure>(dom:node-children node) → dom:node</procedure>
Returns the first child node
; {{node}} : A dom:node

===== {{dom:node-name}}
<procedure>(dom:node-name node) → string</procedure>
Returns the name of the dom:node
; {{node}} : A dom:node

===== {{dom:is-element-name?}}
<procedure>(dom:is-element-name? name dom:node) → boolean</procedure>
Checks if the name of the dom:node matches the specified string
; {{name}} : Name (string) to match
; {{dom:node}} : The dom:node to check
===== {{dom:get-attribute}}
<procedure>(dom:get-attribute key dom:node) → string</procedure>
Returns the value of the attribute with the specified key
; {{key}} : Attribute name (string)
; {{dom:node}} : The dom:node to query
===== {{dom:attributes}}
<procedure>(dom:attributes n) → Association list</procedure>
Returns the complete set of XML attributes for the given node
; {{n}} : A dom:node
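The accessors above can be combined to query a document held in a string. This is a minimal sketch rather than an official recipe: {{root-attribute}} is a hypothetical helper name, the {{libxml}} module is assumed to be loaded already, {{dom:root-element}} is used as in the example at the start of this section, and the attribute value is assumed to be returned as a Scheme string that stays valid after the document is freed.

<enscript highlight="scheme">;; Sketch only: root-attribute is a hypothetical helper, not part of the egg.
;; Returns the value of attribute KEY on the root element if the root is
;; named ELEMENT-NAME, otherwise #f. Parse errors are not handled here.
(define (root-attribute xml-string element-name key)
  (let* ((doc   (dom:parse-string-default xml-string))
         (root  (dom:root-element doc))
         (value (if (dom:is-element-name? element-name root)
                    (dom:get-attribute key root)
                    #f)))
    ;; Release the document once the value has been extracted.
    (dom:free-doc doc)
    value))
</enscript>

A call such as {{(root-attribute "<book id=\"42\"/>" "book" "id")}} would be expected to look up the {{id}} attribute of the root element.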
=== SAX Parser
Sometimes the DOM tree output is just too large to fit reasonably into
memory. In that case (and if you don't expect to save back the XML document
loaded using libxml), it's better to use the SAX interface of libxml. SAX is a
callback-based interface to the parser. Before parsing, the application layer
registers a customized set of callbacks which are called by the library as it
progresses through the XML input.

==== Example
<enscript highlight="scheme">(define (sax-demo)
  (define sax
    (sax:make-handler
      (lambda (localname attribute-list)
        (print "<" localname ">")
        (print "@ => " attribute-list))
      (lambda (localname) (print "<" localname "/>"))
      (lambda (characters) (print "[on-chars]: characters: " characters))))
  (sax:parse-file sax #f "foo.xml")
  (sax:free-handler sax))
</enscript>
===== {{sax:parse-file}}
<procedure>(sax:parse-file handler user-data filename) → number</procedure>
Parse an XML file using the SAX handler
; {{handler}} : SAX handler
; {{user-data}} : SAX parser context (the example above passes {{#f}})
; {{filename}} : XML file to parse
===== {{sax:parse-string}}
<procedure>(sax:parse-string sax-handler user-data xml-string size) → number</procedure>
Parse an XML string using the SAX handler
; {{sax-handler}} : SAX handler
; {{user-data}} : SAX parser context
; {{xml-string}} : XML string to parse
; {{size}} : The size of the XML string
===== {{sax:make-handler}}
<procedure>(sax:make-handler on-start on-end on-characters) → sax-handler</procedure>
Makes a SAX handler
; {{on-start}} : λ called on start of element
; {{on-end}} : λ called on end of element
; {{on-characters}} : λ called on start of reading characters
===== {{sax:free-handler}}
<procedure>(sax:free-handler sax-handler) → unspecified</procedure>
Frees the SAX handler
; {{sax-handler}} : SAX handler created with {{sax:make-handler}}
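Because SAX hands each event to a callback, state is typically accumulated in variables that the callbacks close over. The sketch below counts start-element events in an in-memory string. It is illustrative only: {{sax-count-elements}} is a hypothetical name, the callback arities are taken from the example above, {{#f}} is passed as user-data as in that example, and {{string-length}} is assumed to give the size {{sax:parse-string}} expects (which holds for plain ASCII input).

<enscript highlight="scheme">;; Sketch only: sax-count-elements is a hypothetical helper, not part of the egg.
(define (sax-count-elements xml)
  (define count 0)
  (define sax
    (sax:make-handler
      ;; on-start: called once per opening tag
      (lambda (localname attribute-list) (set! count (+ count 1)))
      ;; on-end and on-characters are not needed for counting
      (lambda (localname) (void))
      (lambda (characters) (void))))
  (sax:parse-string sax #f xml (string-length xml))
  (sax:free-handler sax)
  count)
</enscript>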
=== Text Reader Parser
The main libxml2 API is tree-based: parsing loads the whole document into
memory and exposes it as a tree of nodes that are all available at the same
time. This is simple and quite powerful, but the size of the document that can
be handled is limited by the amount of memory available. Libxml2 also provides
a SAX-based API, but it was modelled on one of the early expat versions of SAX,
and SAX is not formally defined for C. SAX works by registering callbacks that
the parser calls directly as it progresses through the document stream. This
programming model is relatively complex, not well standardized, cannot provide
validation directly, and makes entity, namespace and base processing relatively
hard.

The text-reader API provides a far simpler programming model. It acts as a
cursor that moves forward over the document stream and stops at each node on
the way. The user's code keeps control of the progress and simply calls a read
procedure ({{text-reader:read-more}} in this egg) repeatedly to advance to each
node in document order. There is direct support for namespaces, xml:base and
entity handling, and adding DTD validation on top of it was relatively simple.
The API is close to the DOM Core specification, which makes it more standard,
easier to use and more powerful than the existing SAX API. Integrating
extension features based on the tree also seems relatively easy.

In a nutshell, the text-reader API provides a simpler, more standard and more
extensible interface for handling large documents than the existing SAX version.

==== Example
<enscript highlight="scheme">(define (text-reader-demo)
  (define tr (text-reader:make "foo.xml"))
  (define (helper tr)
    (when (text-reader:element-node? tr)
          (print "<" (text-reader:name tr) ">")
          (print "@ => " (text-reader:all-attributes tr)))
    (when (text-reader:text-node? tr)
          (print "value =>" (text-reader:value tr)))
    (if (> (text-reader:read-more tr) 0) (helper tr)))
  (helper tr)
  (text-reader:free tr))
</enscript>
==== Node Types
===== {{text-reader:none}}
<constant>text-reader:none → 0</constant>
Text-Reader none
===== {{text-reader:element}}
<constant>text-reader:element → 1</constant>
Text-Reader element
===== {{text-reader:attribute}}
<constant>text-reader:attribute → 2</constant>
Text-Reader attribute
===== {{text-reader:text}}
<constant>text-reader:text → 3</constant>
Text-Reader text
===== {{text-reader:cdata}}
<constant>text-reader:cdata → 4</constant>
Text-Reader cdata
===== {{text-reader:entity-reference}}
<constant>text-reader:entity-reference → 5</constant>
Text-Reader entity reference
===== {{text-reader:entity}}
<constant>text-reader:entity → 6</constant>
Text-Reader entity
===== {{text-reader:processing-instruction}}
<constant>text-reader:processing-instruction → 7</constant>
Text-Reader processing instruction
===== {{text-reader:comment}}
<constant>text-reader:comment → 8</constant>
Text-Reader comment
===== {{text-reader:document}}
<constant>text-reader:document → 9</constant>
Text-Reader document
===== {{text-reader:document-type}}
<constant>text-reader:document-type → 10</constant>
Text-Reader document type
===== {{text-reader:document-fragmenta}}
<constant>text-reader:document-fragmenta → 11</constant>
Text-Reader document fragments
===== {{text-reader:notation}}
<constant>text-reader:notation → 12</constant>
Text-Reader notation
===== {{text-reader:whitespace}}
<constant>text-reader:whitespace → 13</constant>
Text-Reader whitespace
===== {{text-reader:significant-whitespace}}
<constant>text-reader:significant-whitespace → 14</constant>
Text-Reader significant whitespace
===== {{text-reader:end-element}}
<constant>text-reader:end-element → 15</constant>
Text-Reader element end
===== {{text-reader:end-entity}}
<constant>text-reader:end-entity → 16</constant>
Text-Reader entity end
===== {{text-reader:xml-declaration}}
<constant>text-reader:xml-declaration → 17</constant>
Text-Reader XML declaration
==== API
===== {{text-reader:element-to-string}}
<procedure>(text-reader:element-to-string r) → string</procedure>
Converts the current element of the text reader to a string, including child nodes
; {{r}} : A text-reader
===== {{text-reader:end-element-is?}}
<procedure>(text-reader:end-element-is? name reader) → boolean</procedure>
Checks if the current node is an end element with the specified name
; {{name}} : Element name (string)
; {{reader}} : A text-reader
===== {{text-reader:start-element-is?}}
<procedure>(text-reader:start-element-is? name reader) → boolean</procedure>
Checks if the current node is a start element with the specified name
; {{name}} : Element name (string)
; {{reader}} : A text-reader
===== {{text-reader:end-element-node?}}
<procedure>(text-reader:end-element-node? reader) → boolean</procedure>
Checks if node is an end element
; {{reader}} : 
===== {{text-reader:text-node?}}
<procedure>(text-reader:text-node? reader) → boolean</procedure>
Checks for text node
; {{reader}} : 
===== {{text-reader:element-node?}}
<procedure>(text-reader:element-node? reader) → boolean</procedure>
Checks if node is an element
; {{reader}} : 
===== {{text-reader:make}}
<procedure>(text-reader:make filename) → text-reader</procedure>
Makes a new text-reader
; {{filename}} : Path of the XML file to read
===== {{text-reader:read-more}}
<procedure>(text-reader:read-more text-reader) → number</procedure>
Reads the next node in the text-reader. The example above treats a result greater than 0 as meaning that a node was read.
; {{text-reader}} : A text-reader
===== {{text-reader:free}}
<procedure>(text-reader:free text-reader) → unspecified</procedure>
Free the specified text-reader
; {{text-reader}} : 
===== {{text-reader:node-type}}
<procedure>(text-reader:node-type text-reader) → Node type (number)</procedure>
Returns the node type
; {{text-reader}} : 
===== {{text-reader:empty-element?}}
<procedure>(text-reader:empty-element? text-reader) → boolean</procedure>
Checks if the current element of the text-reader is empty
; {{text-reader}} : 
===== {{text-reader:move-to-attribute}}
<procedure>(text-reader:move-to-attribute text-reader attribute-name) → number</procedure>
Moves text-reader to the specified attribute
; {{text-reader}} : 
; {{attribute-name}} : (string)
===== {{text-reader:all-attributes}}
<procedure>(text-reader:all-attributes r) → list</procedure>
Extracts all the attributes from the element. Attributes are placed
into an association list
; {{r}} : A text-reader
===== {{text-reader:move-to-next-attribute}}
<procedure>(text-reader:move-to-next-attribute text-reader) → number</procedure>
Moves text-reader to the next attribute
; {{text-reader}} : 
===== {{text-reader:move-to-first-attribute}}
<procedure>(text-reader:move-to-first-attribute text-reader) → number</procedure>
Moves text-reader to the first attribute
; {{text-reader}} : 
===== {{text-reader:move-to-element}}
<procedure>(text-reader:move-to-element text-reader) → number</procedure>
Moves text-reader to first element
; {{text-reader}} : 
===== {{text-reader:next}}
<procedure>(text-reader:next text-reader) → number</procedure>
Moves text-reader to next node
; {{text-reader}} : 
===== {{text-reader:next-sibling}}
<procedure>(text-reader:next-sibling text-reader) → number</procedure>
Moves text-reader to next sibling node
; {{text-reader}} : 
===== {{text-reader:name}}
<procedure>(text-reader:name text-reader) → string</procedure>
Returns the name of the node
; {{text-reader}} : 
===== {{text-reader:value}}
<procedure>(text-reader:value text-reader) → string</procedure>
Returns the value of the node
; {{text-reader}} : 
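The cursor model described at the start of this section lends itself to a simple accumulate-while-reading loop. The following sketch collects the names of all start elements in document order. It is a hedged illustration: {{element-names}} is a hypothetical helper, and, as in the example above, a result greater than 0 from {{text-reader:read-more}} is assumed to mean that another node was read.

<enscript highlight="scheme">;; Sketch only: element-names is a hypothetical helper, not part of the egg.
;; Returns the names of all start elements in FILENAME, in document order.
(define (element-names filename)
  (define tr (text-reader:make filename))
  (let loop ((names '()))
    (if (> (text-reader:read-more tr) 0)
        ;; Keep the name only when the cursor sits on a start element.
        (loop (if (text-reader:element-node? tr)
                  (cons (text-reader:name tr) names)
                  names))
        (begin
          (text-reader:free tr)
          (reverse names)))))
</enscript>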
=== About this egg

==== Author

[[/users/djireland|David Ireland]]

==== Colophon

Documented by [[/egg/hahn|hahn]].
