Lemma (Lemma v0.1.2)

A morphological parser (analyzer) and lemmatizer implemented with the standard textbook method, using an abstraction called a finite state transducer (FST).

The FST is implemented in the gen_fst package.

A parser can be initialized for the desired language using Lemma.new/1. The initialized parser can then be used to parse words with Lemma.parse/2.

Examples

en_parser = Lemma.new(:en)
# en_parser now holds a parser of type GenFST.fst()
en_parser |> Lemma.parse("plays")
#=> "play"

About morphological parsing / lemmatization

For grammatical reasons, documents are going to use different forms of a word, such as organize, organizes, and organizing. Additionally, there are families of derivationally related words with similar meanings, such as democracy, democratic, and democratization. In many situations, it seems as if it would be useful for a search for one of these words to return documents that contain another word in the set.

The goal of both stemming and lemmatization is to reduce inflectional forms and sometimes derivationally related forms of a word to a common base form. For instance:

am, are, is ⇒ be
car, cars, car's, cars' ⇒ car

The result of this mapping of text will be something like:

the boy's cars are different colors ⇒ the boy car be differ color

-- Stanford NLP Group
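The mappings in the quotation above can be imitated with a very small toy sketch. This is not how Lemma works internally (Lemma uses a finite state transducer from gen_fst). The `ToyLemmatizer` module, its lookup table, and its suffix list are all hypothetical and exist only to illustrate what "reducing a word to a base form" means.

```elixir
defmodule ToyLemmatizer do
  # Hypothetical module for illustration only. It is NOT Lemma's
  # implementation, which is built on a finite state transducer.

  # Irregular forms are looked up directly.
  @irregular %{"am" => "be", "are" => "be", "is" => "be"}

  # Possessive and plural endings, tried in this order.
  @suffixes ["s'", "'s", "s"]

  # Accept a list of words, mirroring the word-or-list shape of Lemma.parse/2.
  def lemmatize(words) when is_list(words), do: Enum.map(words, &lemmatize/1)

  # Accept a single word: use the irregular table first, then strip a suffix.
  def lemmatize(word) when is_binary(word) do
    Map.get_lazy(@irregular, word, fn -> strip_suffix(word) end)
  end

  # Remove the first matching suffix, or return the word unchanged.
  defp strip_suffix(word) do
    case Enum.find(@suffixes, &String.ends_with?(word, &1)) do
      nil -> word
      suffix -> String.replace_suffix(word, suffix, "")
    end
  end
end

ToyLemmatizer.lemmatize(["am", "are", "is"])
#=> ["be", "be", "be"]
ToyLemmatizer.lemmatize(["car", "cars", "car's", "cars'"])
#=> ["car", "car", "car", "car"]
```

Naive suffix stripping like this over-strips words such as "this" or "glass", which is one reason a real morphological parser encodes proper rules instead.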

Summary

Functions

new/1: Initialize a morphological parser for the given language.

parse/2: Use the given parser to parse a word or a list of words.

Functions

@spec new(atom()) :: GenFST.fst()

Initialize a morphological parser for the given language.

Only English (:en) is supported currently.

@spec parse(GenFST.fst(), String.t() | [String.t()]) :: String.t() | [String.t()]

Use the given parser to parse a word or a list of words.