essence v0.2.0 Essence.Vocabulary

This module exports helper functions for working with vocabularies.

Summary

Functions

freq_dist(tokens)

The freq_dist function calculates the frequency distribution of tokens in the given text.

lexical_richness(text)

The lexical_richness function computes the lexical richness of a given text.

top_tokens(document, filter_fun \\ &always_true/1)

Returns a list of {int, token} pairs, ordered by their token frequency in the given Essence.Document. Optionally supply a filter function such as Essence.Token.is_word?/1 to exclude unwanted tokens from the calculation.

vocabulary(frequency_distribution)

The vocabulary function computes the vocabulary of a given Essence.Document. The vocabulary is the unique set of dictionary words in that text.

Functions

freq_dist(tokens)

The freq_dist function calculates the frequency distribution of tokens in the given text.
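As a rough illustration of what a frequency distribution is, the sketch below counts how often each token occurs in a plain list of strings. The module name FreqDistSketch and the choice of a map as the return type are assumptions made for this example; they are not taken from the Essence source.

```elixir
defmodule FreqDistSketch do
  # Hypothetical stand-in for illustration only, not Essence's implementation.
  # Assumes tokens are plain strings and that the result is a map of
  # token => number of occurrences.
  def freq_dist(tokens) when is_list(tokens) do
    Enum.reduce(tokens, %{}, fn token, acc ->
      # Start a new token at 1, otherwise increment its existing count.
      Map.update(acc, token, 1, &(&1 + 1))
    end)
  end
end

FreqDistSketch.freq_dist(["the", "cat", "saw", "the", "dog"])
# => %{"cat" => 1, "dog" => 1, "saw" => 1, "the" => 2}
```

On Elixir 1.10 and later, Enum.frequencies/1 produces the same map in a single call.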

lexical_richness(text)

The lexical_richness function computes the lexical richness of a given text.
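The page does not say which formula lexical_richness uses. One common measure is the type-token ratio, the number of distinct tokens divided by the total number of tokens. The sketch below shows that ratio under the assumption that the text is already a list of string tokens; RichnessSketch and type_token_ratio are hypothetical names, and Essence may compute a different quantity (for example, the inverse ratio).

```elixir
defmodule RichnessSketch do
  # Hypothetical illustration of the type-token ratio.
  # This is an assumed definition, not confirmed to match Essence.

  # An empty text has no tokens, so return 0.0 instead of dividing by zero.
  def type_token_ratio([]), do: 0.0

  def type_token_ratio(tokens) when is_list(tokens) do
    unique_count = tokens |> Enum.uniq() |> length()
    unique_count / length(tokens)
  end
end

RichnessSketch.type_token_ratio(["the", "cat", "saw", "the", "dog"])
# => 0.8 (4 distinct tokens out of 5)
```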

top_tokens(document, filter_fun \\ &always_true/1)

Returns a list of {int, token} pairs, where int is the frequency of the token, ordered by token frequency in the given Essence.Document. Optionally supply a filter function such as Essence.Token.is_word?/1 to exclude unwanted tokens from the calculation.
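The filter-then-rank behaviour described above can be sketched as follows. This example works on a plain list of string tokens rather than an Essence.Document. The module name TopTokensSketch, the descending sort order, the alphabetical tie-break, and the regex-based word filter are all assumptions made for illustration.

```elixir
defmodule TopTokensSketch do
  # Hypothetical illustration only. Takes a list of string tokens,
  # whereas the documented function takes an Essence.Document.
  def top_tokens(tokens, filter_fun \\ fn _token -> true end) do
    tokens
    # Drop tokens the filter rejects before counting.
    |> Enum.filter(filter_fun)
    # Count occurrences of each remaining token.
    |> Enum.reduce(%{}, fn token, acc -> Map.update(acc, token, 1, &(&1 + 1)) end)
    # Turn {token, count} map entries into {count, token} pairs.
    |> Enum.map(fn {token, count} -> {count, token} end)
    # Assumed ordering: highest count first, ties broken alphabetically.
    |> Enum.sort_by(fn {count, token} -> {-count, token} end)
  end
end

# A simple word filter standing in for Essence.Token.is_word?/1.
word? = fn token -> String.match?(token, ~r/^\w+$/) end

TopTokensSketch.top_tokens(["the", "cat", ",", "the", "dog", ","], word?)
# => [{2, "the"}, {1, "cat"}, {1, "dog"}]
```

Filtering before counting keeps punctuation such as "," out of the ranking entirely, which is the effect the optional filter function is meant to have.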

vocabulary(frequency_distribution)

Specs

vocabulary(Essence.Document.t) :: List.t
vocabulary(Map.t) :: List.t

The vocabulary function computes the vocabulary of a given Essence.Document, or of a frequency distribution supplied as a map. The vocabulary is the unique set of dictionary words in that text.
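The two specs suggest one function with two clauses, one per input type. A minimal sketch of that shape is shown below. It uses a plain token list in place of an Essence.Document, and the module name VocabularySketch is hypothetical. The sketch does not check tokens against a dictionary; it only removes duplicates.

```elixir
defmodule VocabularySketch do
  # Hypothetical illustration only, not Essence's implementation.

  # Clause for a frequency distribution: the keys of the map are the
  # distinct tokens. Sorting makes the result order predictable.
  def vocabulary(freq_dist) when is_map(freq_dist) do
    freq_dist |> Map.keys() |> Enum.sort()
  end

  # Clause for a token list (standing in for an Essence.Document):
  # keep the first occurrence of each token.
  def vocabulary(tokens) when is_list(tokens) do
    Enum.uniq(tokens)
  end
end

VocabularySketch.vocabulary(%{"the" => 2, "cat" => 1})
# => ["cat", "the"]

VocabularySketch.vocabulary(["the", "cat", "the"])
# => ["the", "cat"]
```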