Bindings for the CMU link-grammar parser system.

  1. link-grammar
    1. Link Grammar
    2. Author
    3. Upstream
    4. Egg Source Code
      1. link-grammar
    5. Example Usage
    6. Simple Use
      1. parse-with-default
    7. Sentences
      1. create-sentence
      2. delete-sentence!
      3. split-sentence
      4. parse-sentence
      5. sentence-length
      6. sentence-null-count
      7. linkages-found
      8. valid-linkages
      9. linkages-post-processed
      10. linkages-violated
      11. sentence-disjunct-cost
      12. sentence-link-cost
    8. Dictionary
      1. create-dictionary-with-language
      2. create-default-dictionary
      3. get-dictionary-language
      4. delete-dictionary!
      5. set-dictionary-data-dir!
      6. get-dictionary-data-dir
    9. Linkages
      1. create-linkage
      2. delete-linkage!
      3. num-words
      4. num-links
      5. link-length
      6. get-lword
      7. get-rword
      8. link-label
      9. link-llabel
      10. link-rlabel
      11. num-domains
      12. link-domain-names
      13. get-words
      14. get-word
      15. disjunct-str
      16. disjunct-cost
      17. disjunct-corpus-score
      18. get-constituents
      19. get-diagram
      20. get-postscript
      21. get-disjuncts
      22. get-links-domains
      23. unused-word-cost
      24. disjunct-cost
      25. link-cost
      26. corpus-cost
      27. linkage->eps-file
      28. get-version
      29. get-dictionary-version
      30. get-dictionary-locale
      31. display-off
      32. display-multi-line
      33. display-bracket-tree
      34. display-single-line
      35. display-max-styles
      36. set-display-morphology!
      37. get-display-morphology
    10. Parse Options
      1. init-opts
      2. set-max-parse-time!
      3. set-linkage-limit!
      4. set-short-length!
      5. set-disjunct-cost!
      6. set-min-null-count!
      7. set-max-null-count!
      8. reset-resources!
      9. resources-exhausted?
      10. memory-exhausted?
      11. timer-expired?
      12. set-islands-ok!
      13. set-verbosity!
      14. get-verbosity
      15. delete-parse-options!
    11. License
    12. About this egg
      1. Author
      2. Repository
      3. License
      4. Dependencies
      5. Versions
      6. Colophon

Link Grammar

The link grammar parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a 'constituent' representation of a sentence (showing noun phrases, verb phrases, etc.).

Author

David Ireland (djireland79 at gmail dot com)

Upstream

https://www.abisource.com/projects/link-grammar/

Egg Source Code

https://gitlab.com/maxwell79/chicken-link-grammar

[module] link-grammar

Documentation

Example Usage

(import scheme)
(cond-expand
 (chicken-4
   (use (prefix link-grammar lg:)))
 (chicken-5
   (import (prefix link-grammar lg:))))

;; Print the constituent tree and the link diagram of every linkage,
;; starting at the given index.
(define (display-linkage sentence opts index)
  (let* ((links-found (lg:linkages-found sentence))
         (linkage (lg:create-linkage index sentence opts)))
    (when linkage
          (let ((constituents
                  (lg:get-constituents linkage lg:display-multi-line))
                (diagram (lg:get-diagram linkage #t 80)))
            (print constituents)
            (print diagram)
            (lg:delete-linkage! linkage)))
    ;; Linkage indices run from 0 to links-found - 1.
    (when (< (+ index 1) links-found)
          (display-linkage sentence opts (+ index 1)))))

;; Parse text; if no complete linkage exists, retry with null links allowed.
(define (parse text dictionary opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (num-linkages (lg:parse-sentence sentence opts)))
    (when (= num-linkages 0)
          (lg:set-min-null-count! opts 1)
          (lg:set-max-null-count! opts (lg:sentence-length sentence))
          (set! num-linkages (lg:parse-sentence sentence opts)))
    (display-linkage sentence opts 0)
    (lg:delete-sentence! sentence)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(lg:set-verbosity! opts 1)
(lg:set-max-parse-time! opts 30)
(lg:set-linkage-limit! opts 1000)
(lg:set-min-null-count! opts 0)
(lg:set-max-null-count! opts 0)
(lg:set-short-length! opts 16)
(lg:set-islands-ok! opts #f)

(parse "The black fox ran from the hunters." dictionary opts)

;; Release the foreign objects once parsing is finished.
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
 (S (NP the black.a fox.n)
    (VP ran.v-d
        (PP from
            (NP the hunters.n))))
 +------------------------Xp------------------------+       
 +----------->WV----------->+                       |       
 +---------Wd--------+      |                       |       
 |      +----Ds**x---+      |      +----Jp----+     |       
 |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
 |      |     |      |      |      |   |      |     |      |
 LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
 (S (NP the black.a fox.n)
    (VP ran.v-d
        (PP from
            (NP the hunters.n))))
 +------------------------Xp------------------------+       
 +---------Wd--------+                              |       
 |      +----Ds**x---+             +----Jp----+     |       
 |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
 |      |     |      |      |      |   |      |     |      |
 LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL

Simple Use

Parse a text using the default dictionary and the default parse options.

parse-with-default

[procedure] (parse-with-default text) → (values words links diagrams postscript)

Parses text using default values and returns four values: the words, the links, the diagrams and the PostScript output.

text
string to parse
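
The call can be sketched as follows. This is a minimal illustration, assuming the egg is installed and a default dictionary is available; the exact shape of each returned value (list or string) is not stated above, so the sketch only prints two of them.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

;; parse-with-default returns four values, so receive them with
;; call-with-values rather than a plain define.
(call-with-values
  (lambda () (lg:parse-with-default "The black fox ran from the hunters."))
  (lambda (words links diagrams postscript)
    ;; The structure of each value is an assumption; print them as-is.
    (print words)
    (print diagrams)))
```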

Sentences

A sentence is the API's representation of an input string, tokenized and interpreted according to a specific Dictionary. After a Sentence is created and parsed, various attributes of the resulting set of linkages can be obtained.

create-sentence

[procedure] (create-sentence input dictionary) → sentence

Creates a sentence object from the input string, using a previously created dictionary to tokenize and define the words.

input
Input string (string)
dictionary
dictionary to use

delete-sentence!

[procedure] (delete-sentence! sentence) → unspecified

Deletes the specified sentence.

sentence
Sentence to be deleted (sentence)

split-sentence

[procedure] (split-sentence sentence parse-options) → number

Splits (tokenizes) the sentence into its component words and punctuation. This includes splitting certain run-on expressions, such as '12ft.', which is split into '12' and 'ft.'. If spell-guessing is enabled in the parse-options, the tokenizer will also separate most run-on words, i.e. pairs of words without an intervening space. This routine returns zero if successful, or a non-zero value if an error occurred.

sentence
Sentence to split (sentence)
parse-options

parse-sentence

[procedure] (parse-sentence sentence parse-options) → number

This routine is the heart of the program. Several things are done when a sentence is parsed:

  1. Word expressions are extracted from the dictionary and pruned.
  2. Disjuncts are built.
  3. A series of pruning operations is carried out.
  4. The linkages having the minimal number of null links are counted.
  5. A 'parse set' of linkages is built.
  6. The linkages are post-processed.

The 'parse set' is attached to the sentence, and this is one of the key reasons that the API is flexible and modular. All of the information needed to build linkages is stored in the parse set. This means that other sentences can be parsed, possibly using different dictionaries and other parameters, without disturbing the information obtained from a call to parse-sentence. If parse-sentence is called again on the same sentence, the parsing information from the previous call is deleted. Like almost all of the other routines, this call is thread-safe: sentences can be parsed concurrently in multiple threads.

sentence
parse-options

sentence-length

[procedure] (sentence-length sentence) → number

Returns the length of the sentence, that is, the number of tokens (words and punctuation) it contains after tokenization.

sentence

sentence-null-count

[procedure] (sentence-null-count) → number

Returns the number of words that failed to be linked into the rest of the sentence during parsing. This number is greater than zero whenever a word doesn't seem to fit anywhere in the parse, either due to poor grammar or due to a shortcoming of the dictionary.

linkages-found

[procedure] (linkages-found sentence) → number

Returns the number of linkages that the search found.

sentence

valid-linkages

[procedure] (valid-linkages) → number

Returns the number of linkages that had no post-processing violations

linkages-post-processed

[procedure] (linkages-post-processed) → number

Returns the number of linkages that were actually post-processed

linkages-violated

[procedure] (linkages-violated) → number

Returns the number of post-processing violations that the i-th linkage had during the last call to parse-sentence.

sentence-disjunct-cost

[procedure] (sentence-disjunct-cost sentence index) → number

Returns the sum of the costs of all of the disjuncts used in the i-th linkage of the sentence. The higher the cost, the less likely it is that the parse is correct. Very roughly, this can be interpreted as (minus) the log-likelihood of the parse being correct.

sentence
index

sentence-link-cost

[procedure] (sentence-link-cost sentence index) → number

Returns the sum of the lengths of the links in the i-th parse. The ratio of this sum to the total length of the sentence gives a rough measure of the complexity of the sentence. That is, long-range links between distant words indicate that the sentence may be hard to understand; alternatively, they may indicate that the parse is not very accurate.

sentence
index
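
The two cost procedures can be used to compare the linkages of a parsed sentence. The following is a minimal sketch, assuming that linkage indices run from 0 to one less than the value returned by parse-sentence and that a default dictionary is installed.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters." dict))
(define num-linkages (lg:parse-sentence sentence opts))

;; Print both costs for every linkage; indices are assumed to be
;; 0 .. num-linkages - 1.
(let loop ((i 0))
  (when (< i num-linkages)
    (print "linkage " i
           ": disjunct cost " (lg:sentence-disjunct-cost sentence i)
           ", link cost " (lg:sentence-link-cost sentence i))
    (loop (+ i 1))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dict)
```

A lower disjunct cost suggests a more plausible parse, so a caller could keep the index with the smallest value.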

Dictionary

A Dictionary is the programmer's handle on the set of word definitions that defines the grammar. A user creates a Dictionary from a grammar file and post-process knowledge file, and then passes it to the various parsing routines.

create-dictionary-with-language

[procedure] (create-dictionary-with-language language) → dictionary

Creates a dictionary with the specified language

language
Language to use (string)

create-default-dictionary

[procedure] (create-default-dictionary) → dictionary

Looks for a dictionary in the same language as the current environment, and if one is found, creates a dictionary object.

get-dictionary-language

[procedure] (get-dictionary-language dictionary) → string

Returns the language of the specified dictionary

dictionary
specified dictionary (dictionary)

delete-dictionary!

[procedure] (delete-dictionary! dictionary) → unspecified

Deletes the specified dictionary

dictionary
specified dictionary (dictionary)

set-dictionary-data-dir!

[procedure] (set-dictionary-data-dir! path) → unspecified

Specify the file path to the dictionaries to use; to be effective, this routine must be called before the dictionaries are opened.

path
Filename with path

get-dictionary-data-dir

[procedure] (get-dictionary-data-dir) → string

Returns the file path to the dictionaries
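
A dictionary for a named language can be opened and inspected as sketched below. The language code "en" is the one used by the upstream English dictionary; treating a false return value as "dictionary not found" is an assumption about this binding.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-dictionary-with-language "en"))

(if dict
    (begin
      (print "language: " (lg:get-dictionary-language dict))
      (print "version:  " (lg:get-dictionary-version dict))
      (print "data dir: " (lg:get-dictionary-data-dir))
      ;; The dictionary is a foreign object and must be freed explicitly.
      (lg:delete-dictionary! dict))
    ;; Assumption: a failed lookup yields #f rather than raising an error.
    (print "could not open the \"en\" dictionary"))
```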

Linkages

create-linkage

[procedure] (create-linkage index sentence parse-options) → linkage

Creates the index-th linkage from the (parsed) sentence. Several operations can be carried out on the resulting linkage; for example, it can be printed, post-processed with a different post-processor, or information on individual links can be extracted. If the parse has a conjunction, then the linkage will be made up of two or more sublinkages.

index
(number)
sentence
parse-options

delete-linkage!

[procedure] (delete-linkage! linkage) → unspecified

Deletes the given linkage.

linkage

num-words

[procedure] (num-words linkage) → number

The number of words in the sentence for which this is a linkage.

linkage

num-links

[procedure] (num-links linkage) → number

The number of links used in the linkage.

linkage

link-length

[procedure] (link-length linkage index) → number

The value returned by link-length is the number of words spanned by the index-th link of the linkage.

linkage
index
(number)

get-lword

[procedure] (get-lword) → number

The value returned is the number of the word on the left end of the index-th link of the current sublinkage.

get-rword

[procedure] (get-rword) → number

The value returned is the number of the word on the right end of the index-th link of the current sublinkage.

link-label

[procedure] (link-label linkage index) → string

The label on a link in a diagram is constructed by taking the 'intersection' of the left and right connectors that comprise the link. For example, in 'I.p eat, therefore I.p think.v', the Sp*i label on the link between the words I.p and eat is constructed from the Sp*i connector on its left word and the Sp connector on its right word. So, for this link, both link-label and link-llabel return 'Sp*i', while link-rlabel returns 'Sp'.

linkage
index

link-llabel

[procedure] (link-llabel linkage index) → string

See link-label

linkage
index

link-rlabel

[procedure] (link-rlabel linkage index) → string

See link-label

linkage
index
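
The link accessors can be combined to list every link of a linkage. This is a minimal sketch, assuming that link indices run from 0 to one less than the value returned by num-links, and that create-linkage takes the index, the sentence and the parse-options as in the Example Usage section.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters." dict))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      (let ((n (lg:num-links linkage)))
        ;; Print the combined label, both connector labels and the length
        ;; of each link.
        (let loop ((i 0))
          (when (< i n)
            (print i ": " (lg:link-label linkage i)
                   " (left " (lg:link-llabel linkage i)
                   ", right " (lg:link-rlabel linkage i)
                   ", length " (lg:link-length linkage i) ")")
            (loop (+ i 1)))))
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dict)
```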

num-domains

[procedure] (num-domains linkage index) → number

num-domains and link-domain-names allow access to most of the domain structure extracted during post-processing. The index parameter in both calls specifies which link in the linkage to extract the information for. In the 'I eat therefore I think' example above, the link between the words therefore and I.p belongs to two 'm' domains. If the linkage violated any post-processing rules, the name of the violated rule in the post-process knowledge file can be determined by a call to get-violation-name.

linkage
index

link-domain-names

[procedure] (link-domain-names linkage word-index) → list

Gets the domain structure extracted during post-processing.

linkage
word-index
Specifies which link in the linkage to extract the information for.

get-words

[procedure] (get-words linkage) → list

Returns the list of word spellings for the linkage. These are the subscripted spellings, such as 'dog.n'. The original spellings can be obtained by calls to sentence-get-word.

linkage

get-word

[procedure] (get-word linkage word-number) → string

Returns the word spelling of an individual word

linkage
word-number
The specific word
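
Both accessors can be tried on the first linkage of a sentence, as sketched below. The sketch assumes that word numbers run from 0 to one less than the value returned by num-words, and that create-linkage is called as in the Example Usage section.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters." dict))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      ;; All subscripted spellings at once, as a list.
      (print (lg:get-words linkage))
      ;; The same spellings one at a time, by word number.
      (let loop ((i 0))
        (when (< i (lg:num-words linkage))
          (print i ": " (lg:get-word linkage i))
          (loop (+ i 1))))
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dict)
```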

disjunct-str

[procedure] (disjunct-str linkage word-number) → string

Returns a string showing the disjuncts that were actually used in association with the specified word in the current linkage. The string shows the disjuncts in proper order; that is, left-to-right, in the order in which they link to other words. The returned string can be thought of as a very precise part-of-speech-like label for the word, indicating how it was used in the given sentence; this can be useful for corpus statistics.

linkage
The specific linkage
word-number
The specific word

disjunct-cost

[procedure] (disjunct-cost) → number

Returns the cost of a word as used in a particular linkage, based on the dictionary.

disjunct-corpus-score

[procedure] (disjunct-corpus-score) → number

Returns the cost based on the corpus-statistics database.

get-constituents

[procedure] (get-constituents linkage display-style) → string

Returns the constituents for a particular linkage

linkage
display-style
(number) One of the display style constants, for example display-multi-line

get-diagram

[procedure] (get-diagram linkage display-walls? screen-width) → string

Returns the linkage diagram

linkage
display-walls?
A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
screen-width
The screen-width is an integer, indicating the number of columns that should be used during printing; long sentences that are wider than the number of columns will be automatically wrapped so that they always fit.

get-postscript

[procedure] (get-postscript linkage display-walls? print-ps-header?) → string

Returns the macros needed to print out the linkage in a postscript file.

linkage
display-walls?
A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
print-ps-header?
A boolean that indicates whether or not postscript header boilerplate should be included.

get-disjuncts

[procedure] (get-disjuncts linkage) → string

Returns a string that shows all of the disjuncts, and their costs, that were used to create the linkage.

linkage

get-links-domains

[procedure] (get-links-domains linkage) → string

Returns a string that lists all of the links and domain names for the linkage.

linkage

unused-word-cost

[procedure] (unused-word-cost linkage) → number

Should return the same value as sentence-null-count.

linkage

disjunct-cost

[procedure] (disjunct-cost linkage) → number

Should return the same value as sentence-disjunct-cost.

linkage

link-cost

[procedure] (link-cost linkage) → number

Should return the same value as sentence-link-cost.

linkage

corpus-cost

[procedure] (corpus-cost linkage) → number

Returns the total cost of this particular linkage, based on the cost of disjuncts stored in the corpus-statistics database.

linkage

linkage->eps-file

[procedure] (linkage->eps-file filename postscript) → unspecified

Saves the PostScript string of a linkage to a file.

filename
Path of the file to write
postscript
Postscript string
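
get-postscript and linkage->eps-file are naturally used together. The following is a minimal sketch; the output filename linkage.eps is a placeholder chosen for illustration, and create-linkage is called as in the Example Usage section.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters." dict))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      ;; First #t: draw the wall words; second #t: include the header.
      (let ((ps (lg:get-postscript linkage #t #t)))
        ;; "linkage.eps" is a hypothetical output path.
        (lg:linkage->eps-file "linkage.eps" ps))
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dict)
```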

get-version

[procedure] (get-version) → string

Gets link-grammar version

get-dictionary-version

[procedure] (get-dictionary-version dictionary) → string

Gets dictionary version

dictionary
Dictionary

get-dictionary-locale

[procedure] (get-dictionary-locale) → string

Gets dictionary locale

display-off

[constant] display-off → 0

Turn off display

display-multi-line

[constant] display-multi-line → 1

Print diagram across multiple lines

display-bracket-tree

[constant] display-bracket-tree → 2

Use brackets when printing diagram

display-single-line

[constant] display-single-line → 3

Print diagram on single line

display-max-styles

[constant] display-max-styles → 3

The highest valid display style value

set-display-morphology!

[procedure] (set-display-morphology! parse-options value) → unspecified

Sets display morphology in parse-options

parse-options
value
(number)

get-display-morphology

[procedure] (get-display-morphology parse-options) → number

Gets display morphology value

parse-options

Parse Options

Parse-options specify the different parameters that are used to parse sentences. Examples of the kinds of things that are controlled by parse-options include maximum parsing time and memory, whether to use null-links, and whether or not to use 'panic' mode. This data structure is passed in to the various parsing and printing routines along with the sentence.

Default values for the parse-options members are:

verbosity → 0

linkage-limit → 10000

min-null-count → 0

max-null-count → 0

null-block → 1

islands-ok → #f

short-length → 6

all-short → #f

display-short → #t

display-word-subscripts → #t

display-link-subscripts → #t

display-walls → #f

allow-null → #t

echo-on → #f

batch-mode → #f

panic-mode → #f

screen-width → 79

display-on → #t

display-postscript → #f

display-bad → #f

display-links → #f

init-opts

[procedure] (init-opts) → parse-options

Initialise parse-options to default values

set-max-parse-time!

[procedure] (set-max-parse-time! parse-options value) → unspecified

Set maximum parse time

parse-options
value
(number)

set-linkage-limit!

[procedure] (set-linkage-limit! parse-options linkage-limit) → unspecified

Set linkage limit

parse-options
linkage-limit
(number)

set-short-length!

[procedure] (set-short-length! parse-options short-length) → unspecified

The short-length parameter determines how long the links are allowed to be. Its intended use is to speed up parsing by not considering very long links for most connectors, since they are very rarely used in a correct parse. An entry for UNLIMITED-CONNECTORS in the dictionary specifies which connectors are exempt from the length limit.

parse-options
short-length
(number)

set-disjunct-cost!

[procedure] (set-disjunct-cost! parse-options disjunct-cost) → unspecified

Determines the maximum disjunct cost used during parsing, where the cost of a disjunct is equal to the maximum cost of all of its connectors. The default is that only disjuncts up to a cost of 2.9 are considered.

parse-options
disjunct-cost

set-min-null-count!

[procedure] (set-min-null-count! parse-options null-count) → unspecified

When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.

parse-options
null-count

set-max-null-count!

[procedure] (set-max-null-count! parse-options null-count) → unspecified

When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.

parse-options
null-count

reset-resources!

[procedure] (reset-resources! parse-options) → unspecified

Reset acquired resources

parse-options

resources-exhausted?

[procedure] (resources-exhausted? parse-options) → boolean

Returns true if either memory-exhausted? or timer-expired? holds.

parse-options

memory-exhausted?

[procedure] (memory-exhausted? parse-options) → number

Checks whether the memory was exhausted during parsing

parse-options

timer-expired?

[procedure] (timer-expired? parse-options) → number

Checks whether the timer was exceeded during parsing.

parse-options
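
The resource checks are typically consulted right after a parse. The sketch below assumes that the value passed to set-max-parse-time! is in seconds, as in the upstream C API; only resources-exhausted? is tested because it is the one documented above as returning a boolean.

```scheme
(import scheme)
(cond-expand
 (chicken-4 (use (prefix link-grammar lg:)))
 (chicken-5 (import (chicken base) (prefix link-grammar lg:))))

(define dict (lg:create-default-dictionary))
(define opts (lg:init-opts))
;; Assumption: the limit is expressed in seconds.
(lg:set-max-parse-time! opts 5)

(define sentence
  (lg:create-sentence "The black fox ran from the hunters." dict))
(define num-linkages (lg:parse-sentence sentence opts))

(if (lg:resources-exhausted? opts)
    (begin
      (print "parse stopped early: time or memory limit reached")
      ;; Clear the exhausted state before reusing opts for another parse.
      (lg:reset-resources! opts))
    (print num-linkages " linkage(s) found"))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dict)
```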

set-islands-ok!

[procedure] (set-islands-ok! parse-options islands-ok?) → unspecified

This option determines whether or not 'islands' of links are allowed.

parse-options
islands-ok?
A boolean to indicate whether islands are allowed

set-verbosity!

[procedure] (set-verbosity! parse-options verbosity-level) → unspecified

Sets the level of description printed to stderr/stdout about the parsing process.

parse-options
verbosity-level

get-verbosity

[procedure] (get-verbosity parse-options) → number

Get the verbosity level

parse-options

delete-parse-options!

[procedure] (delete-parse-options! parse-options) → number

Delete a parse-option object

parse-options

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

About this egg

Author

David Ireland

Repository

https://gitlab.com/maxwell79/chicken-link-grammar

License

LGPL-2.1

Dependencies

Versions

1.6

Colophon

Documented by hahn.