Text.Emoji (Text v0.5.0)

Emoji detection and short-name conversion.

Detection uses the Unicode Extended_Pictographic property, so it picks up the full emoji repertoire including newer additions without needing a per-release data update.

Short-name conversion (demojize/2, emojize/2) uses a small bundled lookup of the most common emoji. It is not a complete CLDR annotation set; rare emoji round-trip as themselves. Users can extend the lookup at runtime via add_emoji/1.

Sentiment scoring is provided by sentiment/1 and text_sentiment/1, backed by the bundled Emoji Sentiment Ranking v1.0 (Kralj Novak et al., 2015; see priv/emoji_sentiment/). The ranking covers ~750 emoji, each with negative/neutral/positive counts and an aggregate score in [-1.0, 1.0].

Summary

Functions

add_emoji(entries)

Adds project-specific emoji to the runtime short-name lookup.

contains?(text)

Returns true when the text contains at least one emoji.

count(text)

Returns the number of emoji in the text.

demojize(text, options \\ [])

Replaces emoji in the text with :short_name: placeholders.

emojize(text, options \\ [])

Replaces :short_name: placeholders with their emoji.

extract(text)

Returns a list of every emoji found in the text, in order of appearance.

sentiment(emoji)

Returns the bundled sentiment record for a single emoji.

strip(text)

Removes every emoji from the text.

text_sentiment(text)

Returns the aggregate sentiment of every known emoji in a text.

Functions

add_emoji(entries)

@spec add_emoji(map() | keyword()) :: :ok

Adds project-specific emoji to the runtime short-name lookup.

Useful for custom platform emoji or for filling in gaps in the bundled set.

Arguments

  • entries is a map or keyword list of emoji => short_name pairs (where short_name is the bare name without colons).

Returns

  • :ok on success.
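
The following is a minimal sketch of how a runtime-extendable lookup of this kind could work. It is not the library's implementation: the module name RuntimeLookupSketch, the use of an Agent, and the short_name/1 helper are all assumptions made for illustration. It only shows that both a map and a keyword list of emoji => short_name pairs can be normalized into one string-keyed map.

```elixir
defmodule RuntimeLookupSketch do
  # Hypothetical stand-in for a runtime short-name lookup; not Text.Emoji itself.
  use Agent

  def start_link(initial \\ %{}) do
    Agent.start_link(fn -> initial end, name: __MODULE__)
  end

  # Accepts a map or a keyword list of emoji => short_name pairs.
  # Keyword keys are atoms, so both sides are converted to strings.
  def add_emoji(entries) when is_map(entries) or is_list(entries) do
    normalized =
      Map.new(entries, fn {emoji, name} -> {to_string(emoji), to_string(name)} end)

    # Agent.update/2 returns :ok, mirroring the documented return value.
    Agent.update(__MODULE__, &Map.merge(&1, normalized))
  end

  # Looks up the bare short name for an emoji, or nil when it is unknown.
  def short_name(emoji), do: Agent.get(__MODULE__, &Map.get(&1, emoji))
end

{:ok, _pid} = RuntimeLookupSketch.start_link()
:ok = RuntimeLookupSketch.add_emoji(%{"🪿" => "goose"})
RuntimeLookupSketch.short_name("🪿")
```

In this sketch the short names are stored bare, without colons, as the Arguments section describes.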

contains?(text)

@spec contains?(String.t()) :: boolean()

Returns true when the text contains at least one emoji.

Examples

iex> Text.Emoji.contains?("hello 😀")
true

iex> Text.Emoji.contains?("hello")
false

count(text)

@spec count(String.t()) :: non_neg_integer()

Returns the number of emoji in the text.

Examples

iex> Text.Emoji.count("Hello 😀 world 🎉")
2

demojize(text, options \\ [])

@spec demojize(
  String.t(),
  keyword()
) :: String.t()

Replaces emoji in the text with :short_name: placeholders.

Emoji that are neither in the bundled lookup nor added via add_emoji/1 are left as-is.

Arguments

  • text is the input string.

Options

  • :delimiter is the delimiter character used around the short name. The default, ":", produces placeholders such as :grinning_face:.

Returns

  • The text with known emoji replaced by their short names.

Examples

iex> Text.Emoji.demojize("Hello 😀 world")
"Hello :grinning_face: world"

iex> Text.Emoji.demojize("rare emoji 🪿")
"rare emoji 🪿"

emojize(text, options \\ [])

@spec emojize(
  String.t(),
  keyword()
) :: String.t()

Replaces :short_name: placeholders with their emoji.

Unknown short names are left as-is.

Arguments

  • text is the input string.

Options

  • :delimiter is the delimiter character around the short name. Default ":".

Returns

  • The text with short names replaced by emoji.
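
The round trip between demojize/2 and emojize/2 can be pictured with the sketch below. It is an illustration under stated assumptions, not the library's code: ShortNameSketch, its two-entry lookup, and the party_popper short name are hypothetical, and the matching rule for short names (lowercase letters, digits, underscores) is a guess.

```elixir
defmodule ShortNameSketch do
  # Hypothetical two-entry lookup; the real bundled set is larger.
  @lookup %{"😀" => "grinning_face", "🎉" => "party_popper"}

  # Replaces each known emoji with delimiter <> short_name <> delimiter.
  def demojize(text, options \\ []) do
    delimiter = Keyword.get(options, :delimiter, ":")

    text
    |> String.graphemes()
    |> Enum.map_join(fn grapheme ->
      case Map.fetch(@lookup, grapheme) do
        {:ok, name} -> delimiter <> name <> delimiter
        :error -> grapheme
      end
    end)
  end

  # Replaces delimited short names with emoji; unknown names stay as-is.
  def emojize(text, options \\ []) do
    delimiter = Regex.escape(Keyword.get(options, :delimiter, ":"))
    reverse = Map.new(@lookup, fn {emoji, name} -> {name, emoji} end)
    pattern = Regex.compile!(delimiter <> "([a-z0-9_]+)" <> delimiter)

    Regex.replace(pattern, text, fn whole, name -> Map.get(reverse, name, whole) end)
  end
end

ShortNameSketch.demojize("Hello 😀 world")
# "Hello :grinning_face: world"
ShortNameSketch.emojize("Hello :grinning_face: and :unknown:")
# "Hello 😀 and :unknown:"
```

Escaping the delimiter before building the pattern keeps characters such as "|" from being read as regex syntax.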

Examples

iex> Text.Emoji.emojize("Hello :grinning_face: world")
"Hello 😀 world"

extract(text)

@spec extract(String.t()) :: [String.t()]

Returns a list of every emoji found in the text, in order of appearance.

Arguments

  • text is the input string.

Returns

  • A list of single-emoji strings. Emoji ZWJ sequences (e.g. family emoji) currently appear as their constituent pictographs rather than as a single grouped sequence.

Examples

iex> Text.Emoji.extract("Hello 😀 world 🎉")
["😀", "🎉"]

iex> Text.Emoji.extract("no emoji here")
[]
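
To make the note about ZWJ sequences concrete, the sketch below splits a family emoji into its constituent pictographs using only the standard String module. ZwjSketch is a hypothetical helper, not part of Text.Emoji, and the sketch does not claim to reproduce how extract/1 works internally.

```elixir
defmodule ZwjSketch do
  # U+200D ZERO WIDTH JOINER glues pictographs into one displayed emoji.
  @zwj "\u200D"

  # Splits a ZWJ sequence into its parts by dropping the joiner code points.
  def constituents(sequence) do
    sequence
    |> String.codepoints()
    |> Enum.reject(&(&1 == @zwj))
  end
end

# Man + ZWJ + woman + ZWJ + girl, written with escapes so the joiners are visible.
family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"

ZwjSketch.constituents(family)
# a list of three single-pictograph strings: man, woman, girl
```

A caller that needs the grouped form could try String.graphemes/1 instead, which on recent Elixir versions is expected to keep a ZWJ sequence together as one grapheme.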

sentiment(emoji)

@spec sentiment(String.t()) ::
  %{
    emoji: String.t(),
    occurrences: non_neg_integer(),
    negative: non_neg_integer(),
    neutral: non_neg_integer(),
    positive: non_neg_integer(),
    score: float(),
    name: String.t()
  }
  | nil

Returns the bundled sentiment record for a single emoji.

The data is the Emoji Sentiment Ranking v1.0 (Kralj Novak et al., 2015), licensed CC-BY-SA 3.0. Coverage is ~750 of the most-used emoji in tweets at the time of the study; rare emoji return nil.

Arguments

  • emoji is a single-emoji string. Surface form must match the upstream entry exactly (no skin-tone or ZWJ-sequence variants).

Returns

  • A map with keys :emoji, :occurrences, :negative, :neutral, :positive, :score (range [-1.0, 1.0]), and :name (Unicode name). Returns nil if the emoji is not in the ranking.

Examples

iex> %{score: score} = Text.Emoji.sentiment("😂")
iex> score > 0.0
true

iex> Text.Emoji.sentiment("not_an_emoji")
nil
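
This page does not state how :score relates to the three counts. The sketch below assumes one plausible relation, the share of positive occurrences minus the share of negative ones, purely to show how the fields of a record fit together. SentimentScoreSketch and the sample counts are invented for illustration, and the bundled scores may be computed differently (for example with smoothing).

```elixir
defmodule SentimentScoreSketch do
  # Assumption: score = (positive - negative) / occurrences.
  # This keeps the result inside [-1.0, 1.0] whenever the counts sum to occurrences.
  def score(%{occurrences: 0}), do: nil

  def score(%{occurrences: n, negative: neg, positive: pos}) do
    (pos - neg) / n
  end

  # Converts the raw counts into per-class proportions.
  def proportions(%{occurrences: n, negative: neg, neutral: neu, positive: pos})
      when n > 0 do
    %{negative: neg / n, neutral: neu / n, positive: pos / n}
  end
end

# Invented counts, not taken from the ranking.
record = %{occurrences: 100, negative: 20, neutral: 30, positive: 50}

SentimentScoreSketch.score(record)
SentimentScoreSketch.proportions(record)
```

With these invented counts the assumed score comes out at about 0.3, and the proportions sum to 1.0.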

strip(text)

@spec strip(String.t()) :: String.t()

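Removes every emoji from the text. Surrounding whitespace is left untouched, so an emoji that sat between two spaces leaves both spaces behind.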

Examples

iex> Text.Emoji.strip("Hello 😀 world 🎉!")
"Hello  world !"

text_sentiment(text)

@spec text_sentiment(String.t()) :: {float(), pos_integer()} | nil

Returns the aggregate sentiment of every known emoji in a text.

Each emoji's score is weighted by its corpus :occurrences so that high-confidence emoji dominate noisy ones, matching the weighted-average approach used by the original paper. Emoji not in the ranking are skipped.

Arguments

  • text is the input string.

Returns

  • {score, count} where score is a float in [-1.0, 1.0] and count is the number of emoji that contributed. Returns nil when the text contains no scoreable emoji.

Examples

iex> {score, 2} = Text.Emoji.text_sentiment("Great day 😂❤")
iex> score > 0.0
true

iex> Text.Emoji.text_sentiment("no emoji here")
nil
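
The occurrence-weighted average described above can be sketched as follows. This is an illustration, not the library's implementation: WeightedSentimentSketch is hypothetical, its two records use invented numbers, and splitting the text with String.graphemes/1 is an assumption about how emoji are located.

```elixir
defmodule WeightedSentimentSketch do
  # Hypothetical records carrying the :occurrences and :score keys that
  # sentiment/1 is documented to return. The numbers are invented.
  @records %{
    "\u{1F602}" => %{occurrences: 1000, score: 0.2},
    # U+2764 HEAVY BLACK HEART without a variation selector
    "\u2764" => %{occurrences: 500, score: 0.8}
  }

  def text_sentiment(text) do
    # Keep only graphemes that have a record; everything else is skipped.
    records =
      text
      |> String.graphemes()
      |> Enum.flat_map(fn grapheme ->
        case Map.fetch(@records, grapheme) do
          {:ok, record} -> [record]
          :error -> []
        end
      end)

    case records do
      [] ->
        nil

      _ ->
        total = Enum.reduce(records, 0, fn r, acc -> acc + r.occurrences end)
        weighted = Enum.reduce(records, 0.0, fn r, acc -> acc + r.score * r.occurrences end)
        # Weighted mean of the scores, plus the number of contributing emoji.
        {weighted / total, length(records)}
    end
  end
end

WeightedSentimentSketch.text_sentiment("Great day \u{1F602}\u2764")
```

With the invented numbers the weighted mean is (0.2 × 1000 + 0.8 × 500) / 1500, about 0.4, which sits closer to the score of the more frequently observed emoji than the plain average of 0.5 would.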