Paasaa (Paasaa v0.6.0)

Provides functions for detecting the language and script of a text.

Examples

iex> Paasaa.detect "Detect this!"
"eng"

Summary

Functions

all/2

Detects a language. Returns a list of languages scored by probability.

detect/2

Detects a language. Returns a string with an ISO 639-3 language code (e.g. "eng").

detect_script/1

Detects a script.

Types

@type options() :: [
  min_length: integer(),
  max_length: integer(),
  whitelist: [String.t()],
  blacklist: [String.t()]
]
@type result() :: [{language :: String.t(), score :: number()}]

Functions

all(str, options \\ [min_length: 10, max_length: 2048, whitelist: [], blacklist: []])

@spec all(str :: String.t(), options()) :: result()

Detects a language. Returns a list of languages scored by probability.

Parameters

  • str - a text string
  • options - a keyword list with options, see detect/2 for details.

Examples

Detect language and limit results to 5:

iex> Paasaa.all("Detect this!") |> Enum.take(5)
[
  {"eng", 1.0},
  {"sco", 0.8230731943771207},
  {"nob", 0.6030053320407174},
  {"nno", 0.5525933107125545},
  {"swe", 0.508482792050412}
]
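
Because the result is a plain list of `{language, score}` tuples, it can be post-processed with ordinary pattern matching. The sketch below is a hypothetical helper (`ResultHelpers` is not part of Paasaa) that accepts the top language only when it leads the runner-up by a given margin. It assumes the list is sorted best-first, as in the example above, and it reuses that example's scores as literal data so it runs without the library.

```elixir
defmodule ResultHelpers do
  # Hypothetical helper, not part of Paasaa.
  # Assumes `results` is sorted best-first, as in the example output above.

  # Two or more candidates: accept the leader only if the gap is large enough.
  def top_with_margin([{lang, s1}, {_, s2} | _], margin) when s1 - s2 >= margin,
    do: {:ok, lang}

  # A single candidate has no competitor.
  def top_with_margin([{lang, _}], _margin), do: {:ok, lang}

  # Empty list, or the leader is too close to the runner-up.
  def top_with_margin(_results, _margin), do: :ambiguous
end

# Scores copied from the documented example output.
scores = [
  {"eng", 1.0},
  {"sco", 0.8230731943771207},
  {"nob", 0.6030053320407174},
  {"nno", 0.5525933107125545},
  {"swe", 0.508482792050412}
]

IO.inspect(ResultHelpers.top_with_margin(scores, 0.1))
# {:ok, "eng"}  (gap between "eng" and "sco" is about 0.177)

IO.inspect(ResultHelpers.top_with_margin(scores, 0.25))
# :ambiguous
```

The margin values are arbitrary illustrations; a suitable threshold depends on the texts being classified.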

detect(str, options \\ [min_length: 10, max_length: 2048, whitelist: [], blacklist: []])

@spec detect(str :: String.t(), options()) :: language :: String.t()

Detects a language. Returns a string with an ISO 639-3 language code (e.g. "eng").

Parameters

  • str - a text string
  • options - a keyword list with options:
    • :min_length - Minimum length to accept. If the text is shorter than :min_length, "und" is returned. Default: 10.
    • :max_length - Maximum length to analyze. Default: 2048.
    • :whitelist - Languages to allow, given as a list of language codes. Default: [].
    • :blacklist - Languages to disallow, given as a list of language codes. Default: [].
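
The options are an ordinary keyword list, so they can be built once and passed to both detect/2 and all/2. The following is a minimal sketch, assuming the package is published on Hex as `:paasaa` and is pulled in with `Mix.install/1` (Elixir 1.12 or later). The candidate codes and the `min_length` value are illustrative choices only, and no particular output is claimed.

```elixir
# Assumption: the Hex package name is :paasaa; the version matches this page.
Mix.install([{:paasaa, "~> 0.6.0"}])

# Only the keys being overridden are listed here, as in the documented
# examples that pass a single option.
options = [min_length: 5, whitelist: ["eng", "deu", "fra"]]

text = "Detect this!"

# Best single guess, restricted by the options above.
IO.inspect(Paasaa.detect(text, options), label: "best guess")

# Full scored list under the same options.
IO.inspect(Paasaa.all(text, options), label: "all candidates")
```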

Examples

Detect a string:

iex> Paasaa.detect "Detect this!"
"eng"

With the :blacklist option:

iex> Paasaa.detect "Detect this!", blacklist: ["eng"]
"sco"

With the :min_length option:

iex> Paasaa.detect "Привет", min_length: 6
"rus"

It returns und for undetermined language:

iex> Paasaa.detect "1234567890"
"und"
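
Since an undetermined result is signalled by the ordinary string "und" rather than by an error tuple, callers may want to turn it into a tagged result. The wrapper below is a hypothetical convenience (`LangGuess` is not part of Paasaa), sketched under the assumption that the package is available from Hex as `:paasaa`.

```elixir
# Assumption: the Hex package name is :paasaa; the version matches this page.
Mix.install([{:paasaa, "~> 0.6.0"}])

defmodule LangGuess do
  # Hypothetical wrapper, not part of Paasaa.
  # Converts the "und" sentinel into :error so callers can use `with`/`case`.
  def guess(text, options \\ []) do
    case Paasaa.detect(text, options) do
      "und" -> :error
      code -> {:ok, code}
    end
  end
end

IO.inspect(LangGuess.guess("Detect this!"))
# {:ok, "eng"}, following the documented detect/2 example

IO.inspect(LangGuess.guess("1234567890"))
# :error, because detect/2 returns "und" for this input
```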

detect_script(str)

@spec detect_script(str :: String.t()) :: {String.t(), number()}

Detects a script.

Parameters

  • str - a text string

Examples

iex> Paasaa.detect_script("Detect this!")
{"Latin", 0.8333333333333334}
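
The returned tuple can be destructured directly. The sketch below, which again assumes the package is installed from Hex as `:paasaa`, prints the script name with its score as a percentage. The score in the example above is consistent with 10 of the 12 characters in "Detect this!" being Latin letters, though that reading is an inference from the output and not something this page states.

```elixir
# Assumption: the Hex package name is :paasaa; the version matches this page.
Mix.install([{:paasaa, "~> 0.6.0"}])

{script, score} = Paasaa.detect_script("Detect this!")

# The spec allows any number(), so multiply by 100.0 to guarantee a float
# before rounding.
IO.puts("#{script}: #{Float.round(score * 100.0, 1)}%")
# Latin: 83.3%
```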