Text.Sentiment (Text v0.5.0)

Sentiment analysis with multilingual support.

Two backends ship with Text:

  • Text.Sentiment.Backends.Lexicon (the default) — fast, deterministic, multilingual lexicon-based scoring backed by the bundled AFINN lexicons (English, Danish, Finnish, French, Polish, Swedish, Turkish, plus a language-agnostic emoticon lexicon). Apache 2.0. Sub-millisecond per call. No model download, no optional dependencies.

  • Text.Sentiment.Backends.Bumblebee (optional) — neural sentiment via Bumblebee and a pre-trained multilingual transformer (XLM-RoBERTa). Higher quality, slower (single-digit ms per call after a 10-30 s cold start), and requires adding :bumblebee and :exla to your deps.
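As a rough sketch, opting into the Bumblebee backend means listing the two optional packages alongside :text in mix.exs. The version requirements below are deliberately open placeholders, not values taken from this documentation; check Hex for the releases that match your Text version.

```elixir
# mix.exs (fragment)
defp deps do
  [
    {:text, "~> 0.5"},
    # Optional: only needed for Text.Sentiment.Backends.Bumblebee.
    # Placeholder requirements; pin to real versions in your project.
    {:bumblebee, ">= 0.0.0"},
    {:exla, ">= 0.0.0"}
  ]
end
```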

The default backend is the lexicon. To switch globally:

# config/config.exs
config :text, :sentiment_backend, Text.Sentiment.Backends.Bumblebee

To switch per call:

Text.Sentiment.analyze(text, backend: Text.Sentiment.Backends.Bumblebee)
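The two switches suggest a precedence of per-call option, then application config, then the lexicon default. The sketch below illustrates that lookup order under that assumption; BackendResolutionSketch and resolve/1 are hypothetical names, not part of Text.Sentiment, and the library's internal resolution may differ.

```elixir
defmodule BackendResolutionSketch do
  # Hypothetical helper: per-call option first, then application config,
  # then the documented default backend.
  @default Text.Sentiment.Backends.Lexicon

  def resolve(options \\ []) do
    Keyword.get_lazy(options, :backend, fn ->
      Application.get_env(:text, :sentiment_backend, @default)
    end)
  end
end

# With no config set and no option given, the default is returned.
BackendResolutionSketch.resolve()
# A per-call option takes priority over any configured value.
BackendResolutionSketch.resolve(backend: Text.Sentiment.Backends.Bumblebee)
```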

Custom backends can be supplied by implementing the Text.Sentiment.Backend behaviour.

Three-line summary

iex> result = Text.Sentiment.analyze("This is a great day!")
iex> result.label
:positive

Multilingual usage

Pass language: <tag> to pick a bundled AFINN lexicon. The tag matches the ISO 639-1 codes AFINN itself uses:

iex> Text.Sentiment.analyze("Ce produit est excellent et magnifique!", language: :fr).label
:positive

iex> Text.Sentiment.analyze("Detta är en mycket dålig idé.", language: :sv).label
:negative

For languages outside the bundled set, supply your own lexicon — any %{token => number} map works:

custom = %{"awesome" => 4, "garbage" => -4}
Text.Sentiment.analyze("That feature is awesome", lexicon: custom)
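To make the %{token => number} shape concrete, here is a minimal sketch of what lexicon-based scoring does with such a map: tokenize, look each token up, and sum the hits. It is an illustration only, not the library's implementation; LexiconScoreSketch and its tokenizer are assumptions, and it ignores negators, intensifiers, and the compound score.

```elixir
defmodule LexiconScoreSketch do
  # Lowercases the text, splits on anything that is not a letter, digit,
  # or apostrophe, then sums the scores of tokens found in the lexicon.
  def score(text, lexicon) when is_binary(text) and is_map(lexicon) do
    tokens =
      text
      |> String.downcase()
      |> String.split(~r/[^\p{L}\p{N}']+/u, trim: true)

    scores = for token <- tokens, is_map_key(lexicon, token), do: Map.fetch!(lexicon, token)

    %{sum: Enum.sum(scores), tokens: length(tokens), matched: length(scores)}
  end
end

custom = %{"awesome" => 4, "garbage" => -4}
LexiconScoreSketch.score("That feature is awesome", custom)
# => %{matched: 1, sum: 4, tokens: 4}
```

Only "awesome" has an entry, so one of the four tokens contributes to the sum.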

Combining lexicons (e.g. emoticons)

The bundled :emoticon lexicon is language-agnostic. Merge it with any per-language lexicon when scoring informal text:

lexicon = Text.Sentiment.lexicon_for(:en, with_emoticons: true)
Text.Sentiment.analyze("That movie was awful :-(", lexicon: lexicon)

End-to-end with Text.Language.Classifier.Fasttext

When the language is unknown, detect it first and route to the matching lexicon:

{:ok, model} = Text.Language.Classifier.Fasttext.ModelLoader.load(path)
{:ok, detection} = Text.Language.Classifier.Fasttext.detect(text, model)
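# :language accepts string tags, so pass the detected tag through as-is
# instead of creating atoms from runtime input with String.to_atom/1.
Text.Sentiment.analyze(text, language: detection.language)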

When the detected language is not bundled, analyze/2 falls back to the language given in :fallback_language (default :en).

Types

label()

@type label() :: :positive | :negative | :neutral

Convenience type for a positive/negative/neutral label.

result()

@type result() :: %{
  sum: float(),
  compound: float(),
  label: label(),
  language: atom(),
  tokens: non_neg_integer(),
  matched: non_neg_integer()
}

Result map returned by analyze/2.

Functions

analyze(text, options \\ [])

@spec analyze(
  String.t(),
  keyword()
) :: result()

Analyzes the sentiment of text.

Arguments

  • text is a UTF-8 string.

Options

  • :language — accepts an atom (:fr), a string ("fr", "fr-CA"), or a Localize.LanguageTag struct when the optional :localize dependency is loaded. The tag is normalised to its language subtag before lookup. Bundled AFINN tags are [:af, :am, :ar, :az, :be, :bg, :bn, :bs, :ca, :ceb, :co, :cs, :cy, :da, :de, :el, :emoji, :emoticon, :en, :eo, :es, :et, :eu, :fa, :fi, :fr, :fy, :ga, :gd, :gl, :gu, :ha, :haw, :hi, :hmn, :hr, :ht, :hu, :hy, :id, :ig, :it, :iw, :ja, :jw, :ka, :kk, :km, :kn, :ko, :ku, :ky, :la, :lb, :lo, :lt, :lv, :mg, :mi, :mk, :ml, :mn, :mr, :ms, :mt, :my, :ne, :nl, :no, :ny, :pa, :pl, :ps, :pt, :ro, :ru, :sd, :si, :sk, :sl, :sm, :sn, :so, :sq, :sr, :st, :su, :sv, :sw, :ta, :te, :tg, :th, :tl, :tr, :uk, :ur, :uz, :vi, :xh, ...]. Defaults to :en. Ignored when :lexicon is given.

  • :lexicon — a %{token => number} map. Overrides :language if given. Use lexicon_for/2 to compose a bundled language with the emoticon lexicon, or supply your own.

  • :fallback_language — the bundled tag to fall back to if :language is given but not bundled. Same shapes as :language. Defaults to :en.

  • Any of the keyword options accepted by Text.Sentiment.Lexicon.score/3 (:tokenizer, :fold_case, :negators, :intensifiers, :diminishers, etc.) are forwarded through.
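The interplay of :language normalisation and :fallback_language can be sketched as follows. This is an assumption-laden illustration: LanguageFallbackSketch, its function names, and the short @bundled list are hypothetical, and the library's own normalisation (for example via Localize) may handle more cases. The sketch returns strings so that no atoms are created from runtime input.

```elixir
defmodule LanguageFallbackSketch do
  # Hypothetical subset of bundled tags, for illustration only.
  @bundled ~w(da en fi fr pl sv tr)

  # Reduces an atom or string tag such as "fr-CA" to its language subtag.
  def language_subtag(tag) when is_atom(tag), do: language_subtag(Atom.to_string(tag))

  def language_subtag(tag) when is_binary(tag) do
    tag
    |> String.split(["-", "_"], parts: 2)
    |> hd()
    |> String.downcase()
  end

  # Returns the subtag if it is bundled, otherwise the fallback's subtag.
  def resolve(tag, fallback \\ :en) do
    subtag = language_subtag(tag)
    if subtag in @bundled, do: subtag, else: language_subtag(fallback)
  end
end

LanguageFallbackSketch.resolve("fr-CA")    # => "fr"
LanguageFallbackSketch.resolve(:xx)        # => "en"
LanguageFallbackSketch.resolve("xx", :sv)  # => "sv"
```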

Returns

A map with:

  • :sum, :compound, :label, :tokens, :matched — same fields as Text.Sentiment.Lexicon.score/3.

  • :language — the language tag actually used (after fallback).
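One way to consume the result map is to pattern match on its fields, for example to flag results where no token matched the lexicon. The sketch below uses a hand-written map with made-up values; only the keys come from the result() type, and ResultSketch is a hypothetical helper.

```elixir
defmodule ResultSketch do
  # Share of tokens that had a lexicon entry; 0.0 for empty input.
  def coverage(%{tokens: 0}), do: 0.0
  def coverage(%{tokens: tokens, matched: matched}), do: matched / tokens

  # A result with no lexicon hits carries little information.
  def describe(%{matched: 0}), do: "no lexicon hits"

  def describe(%{label: label} = result) do
    "#{label} (coverage #{Float.round(coverage(result), 2)})"
  end
end

# Made-up values for illustration; a real map comes from analyze/2.
result = %{sum: 3.0, compound: 0.6, label: :positive, language: :en, tokens: 5, matched: 1}

ResultSketch.describe(result)
# => "positive (coverage 0.2)"
```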

Examples

iex> Text.Sentiment.analyze("I really love this product!").label
:positive

iex> Text.Sentiment.analyze("This was a bad experience.").label
:negative

iex> Text.Sentiment.analyze("The package arrived today.").label
:neutral

label(text, options \\ [])

@spec label(
  String.t(),
  keyword()
) :: label()

Returns just the sentiment label for text.

Arguments

  • text is a UTF-8 string.

Options

Same as analyze/2.

Returns

  • :positive, :negative, or :neutral.

Examples

iex> Text.Sentiment.label("This is amazing!")
:positive

iex> Text.Sentiment.label("This is awful.")
:negative

lexicon_for(language, options \\ [])

@spec lexicon_for(
  Text.Language.input(),
  keyword()
) :: %{required(String.t()) => number()}

Builds a composite lexicon for the given language.

Useful for combining a language-specific lexicon with the language-agnostic emoticon lexicon (or any other extension lexicon).

Arguments

  • language is a bundled tag.

Options

  • :with_emoticons — when true (default false), merges the emoticon lexicon into the result. Emoticon entries override language entries on conflict, but in practice there are none.

  • :overrides — an extra %{token => number} map merged on top. Useful for domain-specific terms (industry jargon, product names, brand-charged language).
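The merge order described by these options can be pictured as successive Map.merge/2 calls, where later maps win on key conflicts. This is a sketch under that assumption, using toy lexicons rather than the bundled data; LexiconMergeSketch.compose/3 is a hypothetical stand-in, not the body of lexicon_for/2.

```elixir
defmodule LexiconMergeSketch do
  # Later maps win on key conflicts: language < emoticons < overrides.
  def compose(language_lexicon, emoticon_lexicon, options \\ []) do
    with_emoticons = Keyword.get(options, :with_emoticons, false)
    overrides = Keyword.get(options, :overrides, %{})

    language_lexicon
    |> Map.merge(if(with_emoticons, do: emoticon_lexicon, else: %{}))
    |> Map.merge(overrides)
  end
end

# Toy lexicons; the real bundled entries are not reproduced here.
english = %{"good" => 3, "awful" => -3}
emoticons = %{":-)" => 2, ":-(" => -2}

LexiconMergeSketch.compose(english, emoticons,
  with_emoticons: true,
  overrides: %{"good" => 1}
)
# => %{":-(" => -2, ":-)" => 2, "awful" => -3, "good" => 1}
```

Because the overrides map is merged last, a domain-specific score for "good" replaces the language entry while the emoticon entries are left untouched.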

Returns

A %{token => number} map suitable for analyze/2's :lexicon option.

Examples

iex> lexicon = Text.Sentiment.lexicon_for(:en, with_emoticons: true)
iex> Map.get(lexicon, "good")
3
iex> Map.get(lexicon, ":-)")
2

iex> lexicon = Text.Sentiment.lexicon_for(:en, overrides: %{"foo" => 5})
iex> Map.get(lexicon, "foo")
5