Text.Corpus behaviour (Text v0.2.0)

Defines the behaviour for a language corpus, with convenience functions that simplify the creation of corpus vocabularies.

Summary

Callbacks

Classifies the natural language of a given text into an ordered list.

Detects the most likely natural language of a given text.

Returns a list of languages for a corpus.

Returns a list of vocabularies for a corpus.

Returns the natural language training text for a given language in the corpus.

Normalizes the text used for training and for classification.

Functions

Builds the vocabulary for all known vocabulary modules.

Builds a vocabulary for a given vocabulary module.

Callbacks

Specs

classify(String.t(), Keyword.t()) ::
  [Text.frequency_tuple(), ...] | {:error, {module(), String.t()}}

Classifies the natural language of a given text into an ordered list.

See Text.Language.classify/2 for the options that may be passed.

Specs

detect(String.t(), Keyword.t()) ::
  {:ok, Text.language()} | {:error, {module(), String.t()}}

Detects the most likely natural language of a given text.

See Text.Language.detect/2 for the options that may be passed.
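
The two return shapes in the spec above can be handled with a single `case`. The following is a minimal sketch, not the library's implementation: `DetectResultSketch` and its stub `detect/2` are hypothetical stand-ins that only mimic the documented shapes, and representing a language as a string such as `"en"` is an assumption.

```elixir
defmodule DetectResultSketch do
  # Hypothetical stand-in for a corpus module's detect/2 callback. It only
  # mimics the documented return shapes; it performs no real detection.
  def detect("", _options), do: {:error, {ArgumentError, "text must not be empty"}}
  def detect(_text, _options), do: {:ok, "en"}

  # Turns either documented return shape into a human-readable string.
  def describe(text, options \\ []) do
    case detect(text, options) do
      {:ok, language} ->
        "detected: #{inspect(language)}"

      {:error, {exception, message}} ->
        "#{inspect(exception)}: #{message}"
    end
  end
end
```

Matching on `{:error, {exception, message}}` keeps the module and the message separate, so a caller can log the message or inspect the module as needed.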

Specs

known_languages() :: [Text.language(), ...]

Returns a list of languages for a corpus.

Specs

known_vocabularies() :: [Text.vocabulary(), ...]

Returns a list of vocabularies for a corpus.

language_content(language)

Specs

language_content(Text.language()) :: String.t()

Returns the natural language training text for a given language in the corpus.

Specs

normalize_text(String.t()) :: String.t()

Normalizes the text used for training and for classification.
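
To illustrate how some of these callbacks might fit together, here is a minimal sketch of a toy corpus. `ToyCorpus` is a hypothetical module; the string language identifiers, the inline training texts, and the normalization rule (lowercase, then collapse whitespace) are all assumptions made for illustration and may differ from what real corpus modules do. The sketch is standalone so that it runs without the library.

```elixir
defmodule ToyCorpus do
  # Hypothetical module mirroring three of the callbacks documented above.
  # In a real project it would declare `@behaviour Text.Corpus` and also
  # implement the remaining callbacks.

  # Tiny inline training texts; a real corpus would load much larger ones.
  @content %{
    "en" => "The quick brown fox jumps over the lazy dog",
    "fr" => "Portez ce vieux whisky au juge blond qui fume"
  }

  # Languages are the sorted keys of the training-text map.
  def known_languages, do: @content |> Map.keys() |> Enum.sort()

  # Raises KeyError for a language that is not in the corpus.
  def language_content(language), do: Map.fetch!(@content, language)

  # Assumed normalization: lowercase, then collapse runs of whitespace.
  def normalize_text(text) do
    text
    |> String.downcase()
    |> String.split()
    |> Enum.join(" ")
  end
end
```

Applying the same `normalize_text/1` to both training text and input text keeps the two comparable, which is the purpose the callback description states.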

Functions

build_vocabularies(corpus, options \\ [])

Builds the vocabulary for all known vocabulary modules.

build_vocabulary(corpus, vocabulary, options \\ [])

Builds a vocabulary for a given vocabulary module.