Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
link-grammar
Bindings for the CMU link-grammar parser system.
- Outdated egg!
- link-grammar
- Link Grammar
- Author
- Upstream
- Egg Source Code
- Example Usage
- Simple Use
- Sentences
- Dictionary
- Linkages
- create-linkage
- delete-linkage!
- num-words
- num-links
- link-length
- get-lword
- get-rword
- link-label
- link-llabel
- link-rlabel
- num-domains
- link-domain-names
- get-words
- get-word
- disjunct-str
- disjunct-cost
- disjunct-corpus-score
- get-constituents
- get-diagram
- get-postscript
- get-disjuncts
- get-links-domains
- unused-word-cost
- disjunct-cost
- link-cost
- corpus-cost
- linkage->eps-file
- get-version
- get-dictionary-version
- get-dictionary-locale
- display-off
- display-multi-line
- display-bracket-tree
- display-single-line
- display-max-styles
- set-display-morphology!
- get-display-morphology
- Parse Options
- License
- About this egg
Link Grammar
The link grammar parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a 'constituent' representation of a sentence (showing noun phrases, verb phrases, etc.).
Author
David Ireland (djireland79 at gmail dot com)
Upstream
https://www.abisource.com/projects/link-grammar/
Egg Source Code
https://gitlab.com/maxwell79/chicken-link-grammar
link-grammar
[module] link-grammar
Documentation
- parse-with-default
- parse-sentence
- display-off
- display-multi-line
- display-bracket-tree
- display-single-line
- display-max-styles
- create-default-dictionary
- create-dictionary-with-language
- get-verbosity
- get-version
- get-dictionary-version
- get-dictionary-locale
- get-dictionary-language
- get-dictionary-data-dir
- set-dictionary-data-dir!
- delete-dictionary!
- create-sentence
- split-sentence
- sentence-length
- sentence-null-count
- sentence-disjunct-cost
- sentence-link-cost
- linkages-found
- linkages-post-processed
- linkages-violated
- valid-linkages
- delete-sentence!
- create-linkage
- corpus-cost
- get-lword
- get-rword
- get-words
- get-word
- get-constituents
- get-diagram
- get-postscript
- get-disjuncts
- get-links-domains
- get-violation-name
- link-length
- link-label
- link-llabel
- link-rlabel
- link-cost
- link-domain-names
- num-words
- num-links
- num-domains
- unused-word-cost
- delete-linkage!
- init-opts
- set-max-parse-time!
- set-linkage-limit!
- set-short-length!
- set-disjunct-cost!
- set-min-null-count!
- set-max-null-count!
- set-islands-ok!
- set-verbosity!
- resources-exhausted?
- memory-exhausted?
- timer-expired?
- reset-resources!
- delete-parse-options!
Example Usage
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Print the constituents and diagram of every linkage from index onwards.
(define (display-linkage sentence opts index)
  (let ((links-found (lg:linkages-found sentence)))
    ;; Valid linkage indices run from 0 to links-found - 1.
    (when (< index links-found)
      (let ((linkage (lg:create-linkage index sentence opts)))
        (when linkage
          (let ((constituents (lg:get-constituents linkage lg:display-multi-line))
                (diagram (lg:get-diagram linkage #t 80)))
            (print constituents)
            (print diagram)
            (lg:delete-linkage! linkage))))
      (display-linkage sentence opts (+ index 1)))))

;; Parse text; if no linkage is found, retry with null links allowed.
(define (parse text dictionary opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (num-linkages (lg:parse-sentence sentence opts)))
    (when (= num-linkages 0)
      (lg:set-min-null-count! opts 1)
      (lg:set-max-null-count! opts (lg:sentence-length sentence))
      (set! num-linkages (lg:parse-sentence sentence opts)))
    (display-linkage sentence opts 0)
    (lg:delete-sentence! sentence)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(lg:set-verbosity! opts 1)
(lg:set-max-parse-time! opts 30)
(lg:set-linkage-limit! opts 1000)
(lg:set-min-null-count! opts 0)
(lg:set-max-null-count! opts 0)
(lg:set-short-length! opts 16)
(lg:set-islands-ok! opts #f)

(parse "The black fox ran from the hunters" dictionary opts)

;; Release the parse options and the dictionary when done.
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
(S (NP the black.a fox.n) (VP ran.v-d (PP from (NP the hunters.n))))
    +------------------------Xp------------------------+
    +----------->WV----------->+                       |
    +---------Wd--------+      |                       |
    |      +----Ds**x---+      |      +----Jp----+     |
    |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
    |      |     |      |      |      |   |      |     |      |
LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
(S (NP the black.a fox.n) (VP ran.v-d (PP from (NP the hunters.n))))
    +------------------------Xp------------------------+
    +---------Wd--------+                              |
    |      +----Ds**x---+             +----Jp----+     |
    |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
    |      |     |      |      |      |   |      |     |      |
LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
Simple Use
Parse a text using default values for the dictionary and parser
parse-with-default
[procedure] (parse-with-default text) → (values words links diagrams postscript)
Parses text using default values.
- text
- string to parse
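As a rough illustration, the four return values can be received with call-with-values. The wrapper name show-default-parse is hypothetical, and since the shape of each returned value is not specified here, this sketch only prints them.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; show-default-parse is a hypothetical helper, not part of the egg.
(define (show-default-parse text)
  (call-with-values
    (lambda () (lg:parse-with-default text))
    ;; Bind the four documented return values in order.
    (lambda (words links diagrams postscript)
      (print words)
      (print links)
      (print diagrams))))

(show-default-parse "The black fox ran from the hunters")
```

This assumes a default dictionary can be found for the current environment.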
Sentences
A sentence is the API's representation of an input string, tokenized and interpreted according to a specific Dictionary. After a Sentence is created and parsed, various attributes of the resulting set of linkages can be obtained.
create-sentence
[procedure] (create-sentence input dictionary) → sentence
Creates a sentence object from the input string, using the dictionary that was created earlier to tokenize and define words.
- input
- Input string (string)
- dictionary
- dictionary to use
delete-sentence!
[procedure] (delete-sentence! sentence) → unspecified
Deletes the specified sentence.
- sentence
- Sentence to be deleted (sentence)
split-sentence
[procedure] (split-sentence sentence parse-options) → number
Splits (tokenizes) the sentence into its component words and punctuation. This includes splitting up certain run-on expressions, such as '12ft.', which is split into '12' and 'ft.'. If spell-guessing is enabled in the parse options, the tokenizer will also separate most run-on words, i.e. pairs of words without an intervening space. Returns zero if successful, or a non-zero value if an error occurred.
- sentence
- Sentence to split (sentence)
- parse-options
parse-sentence
[procedure] (parse-sentence sentence parse-options) → number
This routine represents the heart of the program. Several things are done when a sentence is parsed:
1. Word expressions are extracted from the dictionary and pruned.
2. Disjuncts are built.
3. A series of pruning operations is carried out.
4. The linkages having the minimal number of null links are counted.
5. A 'parse set' of linkages is built.
6. The linkages are post-processed.
The 'parse set' is attached to the sentence, and this is one of the key reasons that the API is flexible and modular. All of the necessary information for building linkages is stored in the parse set. This means that other sentences can be parsed, possibly using different dictionaries and other parameters, without disturbing the information obtained from a call to parse-sentence. If another call to parse-sentence is made on the same sentence, the parsing information for the previous call is deleted. Like almost all of the other routines, this call is thread-safe: that is, sentences can be parsed concurrently in multiple threads.
- sentence
- parse-options
sentence-length
[procedure] (sentence-length sentence) → number
Returns the length of the sentence.
- sentence
sentence-null-count
[procedure] (sentence-null-count) → number
Returns the number of words that failed to be linked into the rest of the sentence during parsing. This number is greater than zero whenever a word doesn't seem to fit anywhere in the parse, either due to poor grammar or due to a shortcoming of the dictionary.
linkages-found
[procedure] (linkages-found sentence) → number
Returns the number of linkages that the search found.
valid-linkages
[procedure] (valid-linkages) → number
Returns the number of linkages that had no post-processing violations.
linkages-post-processed
[procedure] (linkages-post-processed) → number
Returns the number of linkages that were actually post-processed.
linkages-violated
[procedure] (linkages-violated) → number
Returns the number of post-processing violations that the i-th linkage had during the last call to parse-sentence.
sentence-disjunct-cost
[procedure] (sentence-disjunct-cost sentence index) → number
Returns the sum of the costs of all of the disjuncts used in the i-th linkage of the sentence. The higher the cost, the less likely it is that the parse is correct. Very roughly, this can be interpreted as the negative log-likelihood of the parse being correct.
- sentence
- index
sentence-link-cost
[procedure] (sentence-link-cost sentence index) → number
Returns the sum of the lengths of the links in the i-th parse. The ratio of this length to the total length of the sentence gives a rough measure of the complexity of the sentence. That is, long-range links between distant words indicate that the sentence may be hard to understand; alternatively, they may indicate that the parse is not very accurate.
- sentence
- index
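The sentence procedures above can be combined into a small report, sketched below. The helper names link-cost-ratio and report-sentence are hypothetical, and linkage indices are assumed to start at 0, as in the Example Usage section.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: total link length divided by sentence length,
;; the rough complexity measure described for sentence-link-cost.
(define (link-cost-ratio total-link-length sentence-length)
  (/ total-link-length sentence-length))

;; Hypothetical helper: parse text and print a few sentence attributes.
(define (report-sentence text dictionary opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (found (lg:parse-sentence sentence opts)))
    (print "words: " (lg:sentence-length sentence))
    (print "linkages: " found)
    (when (> found 0)
      ;; Index 0 is assumed to be the first linkage.
      (print "disjunct cost: " (lg:sentence-disjunct-cost sentence 0))
      (print "link length ratio: "
             (link-cost-ratio (lg:sentence-link-cost sentence 0)
                              (lg:sentence-length sentence))))
    (lg:delete-sentence! sentence)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(report-sentence "The black fox ran from the hunters" dictionary opts)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```

The sentence is deleted inside the helper so that each call cleans up after itself; the dictionary and parse options are shared and released once at the end.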
Dictionary
A Dictionary is the programmer's handle on the set of word definitions that defines the grammar. A user creates a Dictionary from a grammar file and post-process knowledge file, and then passes it to the various parsing routines.
create-dictionary-with-language
[procedure] (create-dictionary-with-language language) → dictionary
Creates a dictionary with the specified language.
- language
- Language to use (string)
create-default-dictionary
[procedure] (create-default-dictionary) → dictionary
Looks for a dictionary in the same language as the current environment and, if one is found, creates a dictionary object.
get-dictionary-language
[procedure] (get-dictionary-language dictionary) → string
Returns the language of the specified dictionary.
- dictionary
- specified dictionary (dictionary)
delete-dictionary!
[procedure] (delete-dictionary! dictionary) → unspecified
Deletes the specified dictionary.
- dictionary
- specified dictionary (dictionary)
set-dictionary-data-dir!
[procedure] (set-dictionary-data-dir! path) → unspecified
Specifies the file path to the dictionaries to use; to be effective, this routine must be called before the dictionaries are opened.
- path
- Filename with path
get-dictionary-data-dir
[procedure] (get-dictionary-data-dir) → string
Returns the file path to the dictionaries.
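A minimal sketch of opening a dictionary for a named language and inspecting it follows. The language code "en" and the helper name describe-dictionary are assumptions, and the check for a false result is only defensive, since the behaviour on failure is not documented here.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: print what the API reports about a dictionary.
(define (describe-dictionary dictionary)
  (print "language: " (lg:get-dictionary-language dictionary))
  (print "version:  " (lg:get-dictionary-version dictionary))
  (print "data dir: " (lg:get-dictionary-data-dir)))

;; "en" is an assumed language code; use one installed on your system.
(define dictionary (lg:create-dictionary-with-language "en"))

;; Defensive check: the failure behaviour is an assumption.
(when dictionary
  (describe-dictionary dictionary)
  (lg:delete-dictionary! dictionary))
```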
Linkages
create-linkage
[procedure] (create-linkage index sentence parse-options) → linkage
Creates the index-th linkage from the (parsed) sentence. Several operations can be carried out on the resulting linkage; for example, it can be printed, post-processed with a different post-processor, or information on individual links can be extracted. If the parse has a conjunction, then the linkage will be made up of two or more sublinkages.
delete-linkage!
[procedure] (delete-linkage! linkage) → unspecified
Deletes the given linkage.
- linkage
num-words
[procedure] (num-words linkage) → number
The number of words in the sentence for which this is a linkage.
- linkage
num-links
[procedure] (num-links linkage) → number
The number of links used in the linkage.
- linkage
link-length
[procedure] (link-length linkage index) → number
Returns the number of words spanned by the index-th link of the linkage.
- linkage
- index
- (number)
get-lword
[procedure] (get-lword) → number
The value returned is the number of the word on the left end of the index-th link of the current sublinkage.
get-rword
[procedure] (get-rword) → number
The value returned is the number of the word on the right end of the index-th link of the current sublinkage.
link-label
[procedure] (link-label linkage index) → string
The label on a link in a diagram is constructed by taking the 'intersection' of the left and right connectors that comprise the link. For example, in 'I.p eat, therefore I.p think.v', the Sp*i label on the link between the words I.p and eat is constructed from the Sp*i connector on its left word and the Sp connector on its right word. So, for this example, both link-label and link-llabel return 'Sp*i', while link-rlabel returns 'Sp' for this link.
- linkage
- index
link-llabel
[procedure] (link-llabel linkage index) → string
See link-label.
- linkage
- index
link-rlabel
[procedure] (link-rlabel linkage index) → string
See link-label.
- linkage
- index
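To see how the three label procedures relate, the links of a linkage can be listed one by one, as in the sketch below. The helper names print-links and with-first-linkage are hypothetical, and link indices are assumed to be zero-based.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: print the left connector, right connector,
;; combined label and length of every link (zero-based index assumed).
(define (print-links linkage)
  (let loop ((i 0))
    (when (< i (lg:num-links linkage))
      (print i ": " (lg:link-llabel linkage i)
             " + " (lg:link-rlabel linkage i)
             " -> " (lg:link-label linkage i)
             " (length " (lg:link-length linkage i) ")")
      (loop (+ i 1)))))

;; Hypothetical helper: parse text, apply proc to the first linkage,
;; then release everything that was created.
(define (with-first-linkage text proc)
  (let* ((dictionary (lg:create-default-dictionary))
         (opts (lg:init-opts))
         (sentence (lg:create-sentence text dictionary)))
    (when (> (lg:parse-sentence sentence opts) 0)
      (let ((linkage (lg:create-linkage 0 sentence opts)))
        (when linkage
          (proc linkage)
          (lg:delete-linkage! linkage))))
    (lg:delete-sentence! sentence)
    (lg:delete-parse-options! opts)
    (lg:delete-dictionary! dictionary)))

(with-first-linkage "The black fox ran from the hunters" print-links)
```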
num-domains
[procedure] (num-domains linkage index) → number
num-domains and link-domain-names allow access to most of the domain structure extracted during post-processing. The index parameter in these two calls specifies which link in the linkage to extract the information for. In the 'I eat therefore I think' example above, the link between the words therefore and I.p belongs to two 'm' domains. If the linkage violated any post-processing rules, the name of the violated rule in the post-process knowledge file can be determined by a call to get-violation-name.
- linkage
- index
link-domain-names
[procedure] (link-domain-names linkage word-index) → list
Gets the domain structure extracted during post-processing.
- linkage
- word-index
- Specifies which link in the linkage to extract the information for.
get-words
[procedure] (get-words linkage) → list
Returns the list of word spellings for the linkage. These are the subscripted spellings, such as 'dog.n'. The original spellings can be obtained by calls to sentence-get-word.
- linkage
get-word
[procedure] (get-word linkage word-number) → string
Returns the word spelling of an individual word.
- linkage
- word-number
- The specific word
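The two word accessors can be compared side by side with a sketch like the following. The helper name print-words is hypothetical, and word numbers are assumed to be zero-based.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: print all spellings at once, then one by one.
(define (print-words linkage)
  (print (lg:get-words linkage))
  (let loop ((i 0))
    ;; Zero-based word numbers are an assumption.
    (when (< i (lg:num-words linkage))
      (print i " " (lg:get-word linkage i))
      (loop (+ i 1)))))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      (print-words linkage)
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```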
disjunct-str
[procedure] (disjunct-str linkage word-number) → string
Returns a string showing the disjuncts that were actually used in association with the specified word in the current linkage. The string shows the disjuncts in proper order; that is, left-to-right, in the order in which they link to other words. The returned string can be thought of as a very precise part-of-speech-like label for the word, indicating how it was used in the given sentence; this can be useful for corpus statistics.
- linkage
- The specific linkage
- word-number
- The specific word
disjunct-cost
[procedure] (disjunct-cost) → number
Returns the cost of a word as used in a particular linkage, based on the dictionary.
disjunct-corpus-score
[procedure] (disjunct-corpus-score) → number
Returns the cost based on the corpus-statistics database.
get-constituents
[procedure] (get-constituents linkage display-style) → string
Returns the constituents for a particular linkage.
- linkage
- display-style
- (number)
get-diagram
[procedure] (get-diagram linkage display-walls? screen-width) → string
Returns the linkage diagram.
- linkage
- display-walls?
- A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
- screen-width
- The screen-width is an integer, indicating the number of columns that should be used during printing; long sentences that are wider than the number of columns will be automatically wrapped so that they always fit.
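The display-style constants and the diagram options can be tried out together, as sketched below. The helper name show-linkage and the width of 60 columns are illustrative choices, not values taken from the egg.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: print one linkage in several representations.
(define (show-linkage linkage)
  ;; Constituents in two of the documented display styles.
  (print (lg:get-constituents linkage lg:display-bracket-tree))
  (print (lg:get-constituents linkage lg:display-single-line))
  ;; Diagram without wall words, wrapped at 60 columns (arbitrary width).
  (print (lg:get-diagram linkage #f 60)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      (show-linkage linkage)
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```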
get-postscript
[procedure] (get-postscript linkage display-walls? print-ps-header?) → string
Returns the macros needed to print out the linkage in a postscript file.
- linkage
- display-walls?
- A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
- print-ps-header?
- A boolean that indicates whether or not postscript header boilerplate should be included.
get-disjuncts
[procedure] (get-disjuncts linkage) → string
Returns a string that shows all of the disjuncts, and their costs, that were used to create the linkage.
- linkage
get-links-domains
[procedure] (get-links-domains linkage) → string
Returns a string that lists all of the links and domain names for the linkage.
- linkage
unused-word-cost
[procedure] (unused-word-cost linkage) → number
Should return the same value as sentence-null-count.
- linkage
disjunct-cost
[procedure] (disjunct-cost linkage) → number
Should return the same value as sentence-disjunct-cost.
- linkage
link-cost
[procedure] (link-cost linkage) → number
Should return the same value as sentence-link-cost.
- linkage
corpus-cost
[procedure] (corpus-cost linkage) → number
Returns the total cost of this particular linkage, based on the cost of disjuncts stored in the corpus-statistics database.
- linkage
linkage->eps-file
[procedure] (linkage->eps-file filename postscript) → unspecified
Saves a linkage to a postscript file.
- path
- filename
- postscript
- Postscript string
get-version
[procedure] (get-version) → string
Gets the link-grammar version.
get-dictionary-version
[procedure] (get-dictionary-version dictionary) → string
Gets the dictionary version.
- dictionary
- Dictionary
get-dictionary-locale
[procedure] (get-dictionary-locale) → string
Gets the dictionary locale.
display-off
[constant] display-off → 0
Turn off display.
display-multi-line
[constant] display-multi-line → 1
Print diagram across multiple lines.
display-bracket-tree
[constant] display-bracket-tree → 2
Use brackets when printing diagram.
display-single-line
[constant] display-single-line → 3
Print diagram on a single line.
display-max-styles
[constant] display-max-styles → 3
The largest display-style value; equal to display-single-line.
set-display-morphology!
[procedure] (set-display-morphology! parse-options value) → unspecified
Sets display morphology in parse-options.
- parse-options
- value
- (number)
get-display-morphology
[procedure] (get-display-morphology parse-options) → number
Gets the display morphology value.
- parse-options
Parse Options
Parse-options specify the different parameters that are used to parse sentences. Examples of the kinds of things that are controlled by parse-options include maximum parsing time and memory, whether to use null-links, and whether or not to use 'panic' mode. This data structure is passed in to the various parsing and printing routines along with the sentence.
The default values for parse-option members are:
verbosity → 0
linkage-limit → 10000
min-null-count → 0
max-null-count → 0
null-block → 1
islands-ok → #f
short-length → 6
all-short → #f
display-short → #t
display-word-subscripts → #t
display-link-subscripts → #t
display-walls → #f
allow-null → #t
echo-on → #f
batch-mode → #f
panic-mode → #f
screen-width → 79
display-on → #t
display-postscript → #f
display-bad → #f
display-links → #f
init-opts
[procedure] (init-opts) → parse-options
Initialises parse-options to default values.
set-max-parse-time!
[procedure] (set-max-parse-time! parse-options value) → unspecified
Sets the maximum parse time.
- parse-options
- value
- (number)
set-linkage-limit!
[procedure] (set-linkage-limit! parse-options linkage-limit) → unspecified
Sets the linkage limit.
- parse-options
- linkage-limit
- (number)
set-short-length!
[procedure] (set-short-length! parse-options short-length) → unspecified
The short-length parameter determines how long the links are allowed to be. The intended use of this is to speed up parsing by not considering very long links for most connectors, since they are very rarely used in a correct parse. An entry for UNLIMITED-CONNECTORS in the dictionary will specify which connectors are exempt from the length limit.
- parse-options
- short-length
- (number)
set-disjunct-cost!
[procedure] (set-disjunct-cost! parse-options disjunct-cost) → unspecified
Determines the maximum disjunct cost used during parsing, where the cost of a disjunct is equal to the maximum cost of all of its connectors. By default, only disjuncts up to a cost of 2.9 are considered.
- parse-options
- disjunct-cost
set-min-null-count!
[procedure] (set-min-null-count! parse-options null-count) → unspecified
When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.
- parse-options
- null-count
set-max-null-count!
[procedure] (set-max-null-count! parse-options null-count) → unspecified
When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.
- parse-options
- null-count
reset-resources!
[procedure] (reset-resources! parse-options) → unspecified
Resets acquired resources.
- parse-options
resources-exhausted?
[procedure] (resources-exhausted? parse-options) → boolean
Checks whether resources were exhausted during parsing, that is, whether memory-exhausted? or timer-expired? holds.
- parse-options
memory-exhausted?
[procedure] (memory-exhausted? parse-options) → number
Checks whether the memory was exhausted during parsing.
- parse-options
timer-expired?
[procedure] (timer-expired? parse-options) → number
Checks whether the timer was exceeded during parsing.
- parse-options
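The resource checks are typically consulted right after a parse. The sketch below wraps that pattern in a hypothetical helper, parse-with-time-limit; it assumes the maximum parse time is given in seconds and relies only on resources-exhausted?, which is documented as returning a boolean.

```scheme
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Hypothetical helper: parse text under a time limit. Returns the
;; number of linkages, or #f if resources ran out while parsing.
(define (parse-with-time-limit text dictionary opts limit)
  ;; The unit of limit is assumed to be seconds.
  (lg:set-max-parse-time! opts limit)
  (lg:reset-resources! opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (found (lg:parse-sentence sentence opts))
         (exhausted? (lg:resources-exhausted? opts)))
    (lg:delete-sentence! sentence)
    (if exhausted? #f found)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(print (parse-with-time-limit "The black fox ran from the hunters"
                              dictionary opts 30))
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```

Returning #f on exhaustion is a design choice of this sketch; a caller might instead keep whatever linkages were found before the limit was hit.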
set-islands-ok!
[procedure] (set-islands-ok! parse-options islands-ok?) → unspecified
This option determines whether or not 'islands' of links are allowed.
- parse-options
- islands-ok?
- A boolean to indicate whether islands are allowed
set-verbosity!
[procedure] (set-verbosity! parse-options verbosity-level) → unspecified
Sets the level of description printed to stderr/stdout about the parsing process.
- parse-options
- verbosity-level
get-verbosity
[procedure] (get-verbosity parse-options) → number
Gets the verbosity level.
- parse-options
delete-parse-options!
[procedure] (delete-parse-options! parse-options) → number
Deletes a parse-options object.
- parse-options
License
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
About this egg
Author
Repository
https://gitlab.com/maxwell79/chicken-link-grammar
License
LGPL-2.1
Dependencies
Versions
Colophon
Documented by hahn.