gen_fst v0.4.1 GenFST View Source
GenFST implements a generic finite state transducer with customizable rules expressed in a DSL.
A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape.
A FST will read a set of strings on the input tape and generates a set of relations on the output tape. An FST can be thought of as a translator or relater between strings in a set.
In morphological parsing, an example would be inputting a string of letters into the FST, the FST would then output a string of morphemes.
Example
Here we implement a simple morphological parser for English language. This morphological parser recognize different inflectional morphology of the verbs.
fst = GenFST.new
|> GenFST.rule(["play", {"s", "^s"}])
|> GenFST.rule(["act", {"s", "^s"}])
|> GenFST.rule(["act", {"ed", "^ed"}])
|> GenFST.rule(["act", {"ing", ""}])
assert "play^s" == fst |> GenFST.parse("plays")
For example if we pass the third-person singluar tense of the verb act,
GenFST.parse(fst, "acts")
, the morphological parser will output
"act^s"
. The semantic of rule definition is given at rule/2
.
Link to this section Summary
Functions
Create a new finite state transducer
Parse the input by transducing it with the given fst
Define a transducing rule, adding it to the fst
Link to this section Types
Link to this section Functions
Create a new finite state transducer.
See example usage in Module Example
Parse the input by transducing it with the given fst.
See example usage in Module Example
Define a transducing rule, adding it to the fst
A transducing rule is a List
of String.t | {String.t, String.t}
.
For example: rule fst, ["play", {"s", "^s"}]
means outputing "play"
verbatimly,
and transform "s"
into "^s"
. If a finite state transducer built with
this rule is fed with string "plays"
, then the output will be "play^s"
See example usage in Module Example