Readability metrics for English text.
Implements the classic readability indices used to estimate the reading difficulty of a passage:
- flesch/2: Flesch Reading Ease (higher = easier).
- flesch_kincaid/2: Flesch-Kincaid Grade Level (US school grade).
- gunning_fog/2: Gunning-Fog Index (years of education).
- smog/2: SMOG Index (years of education; needs ~30 sentences).
- ari/2: Automated Readability Index (US school grade).
- coleman_liau/2: Coleman-Liau Index (US school grade).
- lix/2: LIX (Läsbarhetsindex; language-agnostic-ish).
- dale_chall/2: Dale-Chall (US school grade); uses the bundled ~3,000-word easy-words list.
- spache/2: Spache (US school grade, K-3 readers); uses the bundled ~1,000-word easy-words list.
Use metrics/2 to compute every index in one pass over the text,
and statistics/2 to inspect the raw counts (words, sentences,
syllables, characters, polysyllables, long words) the metrics are
built from.
Sentence and word segmentation use Text.Segment (UAX #29 with
CLDR abbreviation suppressions). Syllable counting uses
Text.Syllable, which currently supports English only — so all
metrics that depend on syllable counts (Flesch, Flesch-Kincaid,
Gunning-Fog, SMOG) are English-only. ARI, Coleman-Liau, and LIX
are character/length-based and work for any whitespace-segmented
language. Dale-Chall and Spache are English-only and use the
bundled easy-word lists in priv/readability/.
Types
@type metrics() :: %{
  flesch: float(),
  flesch_kincaid: float(),
  gunning_fog: float(),
  smog: float(),
  ari: float(),
  coleman_liau: float(),
  lix: float(),
  dale_chall: float(),
  spache: float()
}
Map of every readability metric computed for a text.
@type statistics() :: %{
  characters: non_neg_integer(),
  letters: non_neg_integer(),
  words: non_neg_integer(),
  sentences: non_neg_integer(),
  syllables: non_neg_integer(),
  polysyllables: non_neg_integer(),
  long_words: non_neg_integer(),
  difficult_words: non_neg_integer(),
  unfamiliar_words: non_neg_integer(),
  average_sentence_length: float(),
  average_syllables_per_word: float()
}
Raw text statistics used to compute the readability metrics.
Functions
@spec ari( String.t() | statistics(), keyword() ) :: float()
Returns the Automated Readability Index.
Output is a US school grade level. Character-based rather than syllable-based, so it works for any whitespace-segmented language.
Formula: 4.71 × (letters/words) + 0.5 × (words/sentences) − 21.43.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the segmentation locale, default :en.
Returns
- A float grade level. Returns 0.0 for empty input.
Examples
iex> Text.Readability.ari("The cat sat on the mat.") |> Float.round(1)
-5.1
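The example result can be reproduced by hand from the formula above. The following is a minimal sketch, assuming the counts are already known; AriSketch is a hypothetical module, not part of Text.Readability, and the counts for the example sentence (17 letters, 6 words, 1 sentence) were tallied manually.

```elixir
# Hypothetical helper (not part of Text.Readability): ARI from raw counts.
defmodule AriSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def ari(_letters, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def ari(letters, words, sentences) do
    4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43
  end
end

# "The cat sat on the mat." has 17 letters, 6 words, 1 sentence.
AriSketch.ari(17, 6, 1) |> Float.round(1)
# => -5.1
```

Very short, simple sentences fall below zero because of the constant −21.43 term, so a negative grade level here is expected rather than an error.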
@spec coleman_liau( String.t() | statistics(), keyword() ) :: float()
Returns the Coleman-Liau Index.
Output is a US school grade level. Character-based; no syllable counting required.
Formula: 0.0588 × L − 0.296 × S − 15.8, where L is letters
per 100 words and S is sentences per 100 words.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the segmentation locale, default :en.
Returns
- A float grade level. Returns 0.0 for empty input.
Examples
iex> Text.Readability.coleman_liau("The cat sat on the mat.") |> Float.round(1)
-4.1
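To make the L and S terms concrete, here is a minimal sketch that evaluates the formula above from raw counts. ColemanLiauSketch is a hypothetical module, not part of Text.Readability, and the counts for the example sentence were tallied by hand.

```elixir
# Hypothetical helper (not part of Text.Readability): Coleman-Liau from raw counts.
defmodule ColemanLiauSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def coleman_liau(_letters, 0, _sentences), do: 0.0

  def coleman_liau(letters, words, sentences) do
    l = letters / words * 100    # letters per 100 words
    s = sentences / words * 100  # sentences per 100 words
    0.0588 * l - 0.296 * s - 15.8
  end
end

# "The cat sat on the mat." has 17 letters, 6 words, 1 sentence.
ColemanLiauSketch.coleman_liau(17, 6, 1) |> Float.round(1)
# => -4.1
```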
@spec dale_chall( String.t() | statistics(), keyword() ) :: float()
Returns the Dale-Chall readability score.
Output is a US school grade level. Computed from the percentage of
difficult words — words not in the bundled ~3,000-word
easy-words list — and the average sentence length. Uses the
Chall-Dale 1995 adjustment: the raw score is shifted by +3.6365
when more than 5% of words are difficult.
Formula: 0.1579 × (PDW × 100) + 0.0496 × ASL [+ 3.6365 if PDW > 0.05],
where PDW is the proportion of difficult words and ASL is
average sentence length.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the segmentation locale, default :en. The bundled easy-words list is English-only.
Returns
- A float grade level. Returns 0.0 for empty input.
Examples
iex> Text.Readability.dale_chall("The cat sat on the mat.") |> is_float()
true
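The effect of the 5% threshold is easiest to see with numbers. The sketch below evaluates the formula above from raw counts; DaleChallSketch is a hypothetical module, not part of Text.Readability, and the counts are invented for illustration rather than taken from any real passage.

```elixir
# Hypothetical helper (not part of Text.Readability): Dale-Chall from raw counts.
defmodule DaleChallSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def dale_chall(_difficult, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def dale_chall(difficult, words, sentences) do
    pdw = difficult / words   # proportion of difficult words
    asl = words / sentences   # average sentence length
    raw = 0.1579 * (pdw * 100) + 0.0496 * asl
    # The adjustment applies only when more than 5% of words are difficult.
    if pdw > 0.05, do: raw + 3.6365, else: raw
  end
end

# Illustrative counts: 100 words in 5 sentences.
DaleChallSketch.dale_chall(10, 100, 5) |> Float.round(2)
# => 6.21  (10% difficult, adjustment applied)
DaleChallSketch.dale_chall(4, 100, 5) |> Float.round(2)
# => 1.62  (4% difficult, no adjustment)
```

Because the adjustment is a step rather than a slope, two passages that differ by a single difficult word near the 5% boundary can score several grade levels apart.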
@spec flesch( String.t() | statistics(), keyword() ) :: float()
Returns the Flesch Reading Ease score.
Higher scores mean easier text. 60-70 is "plain English"; 30-50 is "difficult"; 0-30 is "very confusing".
Formula: 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words).
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the syllable-counting language, default :en.
Returns
- A float. Returns 0.0 if the text has no sentences or words.
Examples
iex> Text.Readability.flesch("The cat sat on the mat.") |> Float.round(1)
116.1
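A score above 100 may look surprising, but it follows directly from the formula for very short, monosyllabic text. The sketch below is a hypothetical helper, not part of Text.Readability, and it assumes each of the six words in the example sentence is counted as one syllable.

```elixir
# Hypothetical helper (not part of Text.Readability): Flesch Reading Ease from raw counts.
defmodule FleschSketch do
  # Mirror the documented convention of returning 0.0 when there are no words or sentences.
  def flesch(_syllables, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def flesch(syllables, words, sentences) do
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
  end
end

# "The cat sat on the mat.": 6 syllables (assumed), 6 words, 1 sentence.
FleschSketch.flesch(6, 6, 1) |> Float.round(1)
# => 116.1
```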
@spec flesch_kincaid( String.t() | statistics(), keyword() ) :: float()
Returns the Flesch-Kincaid Grade Level.
Output is a US school grade level (e.g. 8.5 ≈ mid eighth grade).
Formula: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the syllable-counting language, default :en.
Returns
- A float grade level. Returns 0.0 for empty input.
Examples
iex> Text.Readability.flesch_kincaid("The cat sat on the mat.") |> Float.round(1)
-1.4
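For a sense of scale on longer text, the formula above can be evaluated from raw counts. This is a minimal sketch: FleschKincaidSketch is a hypothetical module, not part of Text.Readability, and the counts are invented for illustration.

```elixir
# Hypothetical helper (not part of Text.Readability): Flesch-Kincaid grade from raw counts.
defmodule FleschKincaidSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def grade(_syllables, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def grade(syllables, words, sentences) do
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
  end
end

# Illustrative counts: 150 syllables, 100 words, 5 sentences
# (20 words per sentence, 1.5 syllables per word).
FleschKincaidSketch.grade(150, 100, 5) |> Float.round(1)
# => 9.9
```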
@spec gunning_fog( String.t() | statistics(), keyword() ) :: float()
Returns the Gunning-Fog Index.
Output is roughly the years of formal education a reader needs. 12 is a US high-school senior; 17+ is graduate-level.
Formula: 0.4 × ((words/sentences) + 100 × (complex_words/words)),
where a complex word is one with 3 or more syllables.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the syllable-counting language, default :en.
Returns
- A float. Returns 0.0 for empty input.
Examples
iex> Text.Readability.gunning_fog("The complicated explanation confused everyone immediately.") |> Float.round(1)
35.7
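The sketch below evaluates the formula above from raw counts. GunningFogSketch is a hypothetical module, not part of Text.Readability. The complex-word count of 5 is an inference: it is the value that reproduces the documented 35.7 for this six-word sentence, not a number stated by the source.

```elixir
# Hypothetical helper (not part of Text.Readability): Gunning-Fog from raw counts.
defmodule GunningFogSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def gunning_fog(_complex, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def gunning_fog(complex_words, words, sentences) do
    0.4 * (words / sentences + 100 * (complex_words / words))
  end
end

# 6 words in 1 sentence, with 5 words assumed to have 3 or more syllables.
GunningFogSketch.gunning_fog(5, 6, 1) |> Float.round(1)
# => 35.7
```

A single sentence made up mostly of long words drives the complex-word percentage above 80, which is why the index lands far beyond the usual 6 to 17 range.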
@spec lix( String.t() | statistics(), keyword() ) :: float()
Returns the LIX (Läsbarhetsindex) readability score.
Designed for Swedish but used as a language-agnostic indicator. A long word is one with 7 or more characters.
Formula: (words/sentences) + 100 × (long_words/words).
Interpretation: <30 very easy, 30-40 easy, 40-50 medium, 50-60 difficult, >60 very difficult.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the segmentation locale, default :en.
Returns
- A float. Returns 0.0 for empty input.
Examples
iex> Text.Readability.lix("The cat sat on the mat.") |> Float.round(1)
6.0
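The two terms of the formula above can be checked with a minimal sketch from raw counts. LixSketch is a hypothetical module, not part of Text.Readability; the second set of counts is invented for illustration.

```elixir
# Hypothetical helper (not part of Text.Readability): LIX from raw counts.
defmodule LixSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def lix(_long_words, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def lix(long_words, words, sentences) do
    words / sentences + 100 * (long_words / words)
  end
end

# "The cat sat on the mat.": no word reaches 7 characters, 6 words, 1 sentence.
LixSketch.lix(0, 6, 1)
# => 6.0

# Illustrative counts: 20 long words among 100 words in 5 sentences.
LixSketch.lix(20, 100, 5)
# => 40.0
```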
Computes every readability metric in one pass.
Arguments
- text is a string of one or more sentences.
Options
- :language is the syllable-counting and segmentation language, default :en.
Returns
- A map with the keys :flesch, :flesch_kincaid, :gunning_fog, :smog, :ari, :coleman_liau, :lix, :dale_chall, and :spache. Empty input yields 0.0 for every metric.
Examples
iex> Text.Readability.metrics("The cat sat on the mat.") |> Map.keys() |> Enum.sort()
[:ari, :coleman_liau, :dale_chall, :flesch, :flesch_kincaid, :gunning_fog, :lix, :smog, :spache]
@spec smog( String.t() | statistics(), keyword() ) :: float()
Returns the SMOG Index (Simple Measure Of Gobbledygook).
Output is roughly the years of formal education a reader needs. Designed to be applied to passages of about 30 sentences; results on shorter passages are less reliable.
Formula: 1.0430 × √(polysyllables × 30 / sentences) + 3.1291.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the syllable-counting language, default :en.
Returns
- A float. Returns 0.0 for empty input.
Examples
iex> Text.Readability.smog("The cat sat on the mat. The dog ran fast.") |> is_float()
true
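The normalization to 30 sentences is visible when the formula above is evaluated from raw counts. SmogSketch is a hypothetical module, not part of Text.Readability, and the first set of counts is invented for illustration.

```elixir
# Hypothetical helper (not part of Text.Readability): SMOG from raw counts.
defmodule SmogSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def smog(_polysyllables, 0), do: 0.0

  def smog(polysyllables, sentences) do
    # Scale the polysyllable count to a 30-sentence sample before taking the root.
    1.0430 * :math.sqrt(polysyllables * 30 / sentences) + 3.1291
  end
end

# Illustrative counts: 30 polysyllabic words in 30 sentences.
SmogSketch.smog(30, 30) |> Float.round(2)
# => 8.84

# A passage with sentences but no polysyllabic words scores the constant term.
SmogSketch.smog(0, 2) |> Float.round(4)
# => 3.1291
```

Under this formula, any non-empty passage without polysyllabic words scores exactly 3.1291, so the index cannot distinguish between very easy texts.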
@spec spache( String.t() | statistics(), keyword() ) :: float()
Returns the Spache readability score.
Output is a US school grade level, intended for K-3 reading material. Computed from the percentage of unfamiliar words — words not in the bundled ~1,000-word Spache easy-words list — and average sentence length.
Formula: 0.121 × ASL + 0.082 × (UPW × 100) + 0.659, where ASL
is average sentence length and UPW is the proportion of
unfamiliar words.
Arguments
- text is a string of one or more sentences, or a statistics() map as returned by statistics/2.
Options
- :language is the segmentation locale, default :en. The bundled easy-words list is English-only.
Returns
- A float grade level. Returns 0.0 for empty input.
Examples
iex> Text.Readability.spache("The cat sat on the mat.") |> is_float()
true
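As a worked instance of the formula above, the sketch below computes the score from raw counts. SpacheSketch is a hypothetical module, not part of Text.Readability, and the counts are invented to resemble an early-reader passage.

```elixir
# Hypothetical helper (not part of Text.Readability): Spache from raw counts.
defmodule SpacheSketch do
  # Mirror the documented convention of returning 0.0 for empty input.
  def spache(_unfamiliar, words, sentences) when words == 0 or sentences == 0, do: 0.0

  def spache(unfamiliar, words, sentences) do
    asl = words / sentences   # average sentence length
    upw = unfamiliar / words  # proportion of unfamiliar words
    0.121 * asl + 0.082 * (upw * 100) + 0.659
  end
end

# Illustrative counts: 50 words in 10 sentences, 5 of them unfamiliar.
SpacheSketch.spache(5, 50, 10) |> Float.round(2)
# => 2.08
```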
@spec statistics( String.t(), keyword() ) :: statistics()
Returns the raw text statistics used by the readability metrics.
Arguments
- text is a string of one or more sentences.
Options
- :language is the language used for syllable counting and sentence segmentation. The default is :en. Only :en is supported by the syllable counter today.
Returns
- A map with the keys :characters, :letters, :words, :sentences, :syllables, :polysyllables, :long_words, :difficult_words, :unfamiliar_words, :average_sentence_length, and :average_syllables_per_word. Empty input returns zeroed counts and 0.0 averages.
Examples
iex> stats = Text.Readability.statistics("The cat sat on the mat. The dog ran.")
iex> stats.words
9
iex> stats.sentences
2