# `Text.Readability`
[🔗](https://github.com/kipcole9/text/blob/v0.5.0/lib/readability.ex#L1)

Readability metrics for English text.

Implements the classic readability indices used to estimate the
reading difficulty of a passage:

* `flesch/2` — Flesch Reading Ease (higher = easier).
* `flesch_kincaid/2` — Flesch-Kincaid Grade Level (US school grade).
* `gunning_fog/2` — Gunning-Fog Index (years of education).
* `smog/2` — SMOG Index (years of education; needs ~30 sentences).
* `ari/2` — Automated Readability Index (US school grade).
* `coleman_liau/2` — Coleman-Liau Index (US school grade).
* `lix/2` — LIX (Läsbarhetsindex; largely language-agnostic).
* `dale_chall/2` — Dale-Chall (US school grade); uses the bundled
  ~3,000-word easy-words list.
* `spache/2` — Spache (US school grade, K-3 readers); uses the
  bundled ~1,000-word easy-words list.

Use `metrics/2` to compute every index in one pass over the text,
and `statistics/2` to inspect the raw counts (words, sentences,
syllables, characters, polysyllables, long words) the metrics are
built from.

Sentence and word segmentation use `Text.Segment` (UAX #29 with
CLDR abbreviation suppressions). Syllable counting uses
`Text.Syllable`, which currently supports English only — so all
metrics that depend on syllable counts (Flesch, Flesch-Kincaid,
Gunning-Fog, SMOG) are English-only. ARI, Coleman-Liau, and LIX
are character/length-based and work for any whitespace-segmented
language. Dale-Chall and Spache are English-only and use the
bundled easy-word lists in `priv/readability/`.

# `metrics`

```elixir
@type metrics() :: %{
  flesch: float(),
  flesch_kincaid: float(),
  gunning_fog: float(),
  smog: float(),
  ari: float(),
  coleman_liau: float(),
  lix: float(),
  dale_chall: float(),
  spache: float()
}
```

Map of every readability metric computed for a text.

# `statistics`

```elixir
@type statistics() :: %{
  characters: non_neg_integer(),
  letters: non_neg_integer(),
  words: non_neg_integer(),
  sentences: non_neg_integer(),
  syllables: non_neg_integer(),
  polysyllables: non_neg_integer(),
  long_words: non_neg_integer(),
  difficult_words: non_neg_integer(),
  unfamiliar_words: non_neg_integer(),
  average_sentence_length: float(),
  average_syllables_per_word: float()
}
```

Raw text statistics used to compute the readability metrics.

# `ari`

```elixir
@spec ari(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Automated Readability Index.

Output is a US school grade level. Character-based rather than
syllable-based, so it works for any whitespace-segmented language.

Formula: `4.71 × (letters/words) + 0.5 × (words/sentences) − 21.43`.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the segmentation locale, default `:en`.

### Returns

* A float grade level. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.ari("The cat sat on the mat.") |> Float.round(1)
    -5.1
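
The arithmetic behind that result can be checked by hand. The sketch below is a standalone illustration, not the library's implementation: `AriSketch` is a hypothetical module that applies the formula above to counts supplied by the caller.

```elixir
defmodule AriSketch do
  # Hypothetical helper, not part of Text.Readability: applies the ARI
  # formula to counts that have already been gathered.
  def ari(letters, words, sentences) when words > 0 and sentences > 0 do
    4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43
  end

  # Mirrors the documented 0.0 result for empty input.
  def ari(_letters, _words, _sentences), do: 0.0
end

# "The cat sat on the mat." has 17 letters, 6 words and 1 sentence.
AriSketch.ari(17, 6, 1) |> Float.round(1)
# => -5.1
```

Very short, simple sentences can produce a negative grade, as here; the index is only meaningful on longer passages.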

# `coleman_liau`

```elixir
@spec coleman_liau(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Coleman-Liau Index.

Output is a US school grade level. Character-based; no syllable
counting required.

Formula: `0.0588 × L − 0.296 × S − 15.8`, where `L` is letters
per 100 words and `S` is sentences per 100 words.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the segmentation locale, default `:en`.

### Returns

* A float grade level. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.coleman_liau("The cat sat on the mat.") |> Float.round(1)
    -4.1
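
To see how `L` and `S` are derived from raw counts, here is a minimal standalone sketch. `ColemanLiauSketch` is a hypothetical module written for this illustration and is not the library's code.

```elixir
defmodule ColemanLiauSketch do
  # Hypothetical helper: Coleman-Liau from pre-computed counts.
  def coleman_liau(letters, words, sentences) when words > 0 do
    l = letters / words * 100    # letters per 100 words
    s = sentences / words * 100  # sentences per 100 words
    0.0588 * l - 0.296 * s - 15.8
  end

  # Mirrors the documented 0.0 result for empty input.
  def coleman_liau(_letters, _words, _sentences), do: 0.0
end

# "The cat sat on the mat." has 17 letters, 6 words and 1 sentence.
ColemanLiauSketch.coleman_liau(17, 6, 1) |> Float.round(1)
# => -4.1
```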

# `dale_chall`

```elixir
@spec dale_chall(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Dale-Chall readability score.

Output is a US school grade level. Computed from the percentage of
*difficult* words — words **not** in the bundled ~3,000-word
easy-words list — and the average sentence length. Uses the
Chall-Dale 1995 adjustment: the raw score is shifted by `+3.6365`
when more than 5% of words are difficult.

Formula: `0.1579 × (PDW × 100) + 0.0496 × ASL [+ 3.6365 if PDW > 0.05]`,
where `PDW` is the proportion of difficult words and `ASL` is
average sentence length.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the segmentation locale, default `:en`. The
  bundled easy-words list is English-only.

### Returns

* A float grade level. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.dale_chall("The cat sat on the mat.") |> is_float()
    true
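
The effect of the `+3.6365` adjustment is easier to see with numbers. The following sketch applies the formula above to made-up counts; `DaleChallSketch` is a hypothetical module, and the word counts are illustrative rather than taken from the bundled easy-words list.

```elixir
defmodule DaleChallSketch do
  # Hypothetical helper: Dale-Chall from pre-computed counts.
  def dale_chall(difficult_words, words, sentences)
      when words > 0 and sentences > 0 do
    pdw = difficult_words / words  # proportion of difficult words
    asl = words / sentences        # average sentence length
    raw = 0.1579 * (pdw * 100) + 0.0496 * asl

    # The adjustment applies only above the 5% threshold.
    if pdw > 0.05, do: raw + 3.6365, else: raw
  end

  # Mirrors the documented 0.0 result for empty input.
  def dale_chall(_difficult_words, _words, _sentences), do: 0.0
end

# 100 words in 5 sentences, 4 difficult words: below the threshold.
DaleChallSketch.dale_chall(4, 100, 5) |> Float.round(1)
# => 1.6

# Same passage with 10 difficult words: the adjustment is added.
DaleChallSketch.dale_chall(10, 100, 5) |> Float.round(1)
# => 6.2
```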

# `flesch`

```elixir
@spec flesch(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Flesch Reading Ease score.

Higher scores mean easier text. 60-70 is "plain English"; 30-50
is "difficult"; 0-30 is "very confusing".

Formula: `206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words)`.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the syllable-counting language, default `:en`.

### Returns

* A float. Returns `0.0` if the text has no sentences or words.

### Examples

    iex> Text.Readability.flesch("The cat sat on the mat.") |> Float.round(1)
    116.1
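
A score above 100 is possible for very simple text, as the following hand calculation shows. This is a standalone sketch with a hypothetical `FleschSketch` module; it assumes every word in the sentence counts as one syllable.

```elixir
defmodule FleschSketch do
  # Hypothetical helper: Flesch Reading Ease from pre-computed counts.
  def flesch(words, sentences, syllables) when words > 0 and sentences > 0 do
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
  end

  # Mirrors the documented 0.0 result when there are no sentences or words.
  def flesch(_words, _sentences, _syllables), do: 0.0
end

# "The cat sat on the mat.": 6 words, 1 sentence, 6 syllables.
FleschSketch.flesch(6, 1, 6) |> Float.round(1)
# => 116.1
```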

# `flesch_kincaid`

```elixir
@spec flesch_kincaid(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Flesch-Kincaid Grade Level.

Output is a US school grade level (e.g. `8.5` ≈ mid eighth grade).

Formula: `0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59`.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the syllable-counting language, default `:en`.

### Returns

* A float grade level. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.flesch_kincaid("The cat sat on the mat.") |> Float.round(1)
    -1.4
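
For a more typical passage, the formula can be worked through on illustrative counts. `FleschKincaidSketch` below is a hypothetical module for this purpose, not the library's implementation, and the counts are invented.

```elixir
defmodule FleschKincaidSketch do
  # Hypothetical helper: Flesch-Kincaid Grade Level from pre-computed counts.
  def grade(words, sentences, syllables) when words > 0 and sentences > 0 do
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
  end

  # Mirrors the documented 0.0 result for empty input.
  def grade(_words, _sentences, _syllables), do: 0.0
end

# 100 words in 5 sentences with 150 syllables:
# 0.39 * 20 + 11.8 * 1.5 - 15.59 = 9.91
FleschKincaidSketch.grade(100, 5, 150) |> Float.round(1)
# => 9.9
```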

# `gunning_fog`

```elixir
@spec gunning_fog(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Gunning-Fog Index.

Output is roughly the years of formal education a reader needs.
12 is a US high-school senior; 17+ is graduate-level.

Formula: `0.4 × ((words/sentences) + 100 × (complex_words/words))`,
where a complex word is one with 3 or more syllables.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the syllable-counting language, default `:en`.

### Returns

* A float. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.gunning_fog("The complicated explanation confused everyone immediately.") |> Float.round(1)
    35.7
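
A single sentence made mostly of long words inflates the index, which is why the score above is so high. On a longer passage the numbers are more moderate, as this standalone sketch shows. `GunningFogSketch` is a hypothetical module and the counts are invented for illustration.

```elixir
defmodule GunningFogSketch do
  # Hypothetical helper: Gunning-Fog from pre-computed counts.
  # `complex_words` is the number of words with 3 or more syllables.
  def fog(words, sentences, complex_words) when words > 0 and sentences > 0 do
    0.4 * (words / sentences + 100 * (complex_words / words))
  end

  # Mirrors the documented 0.0 result for empty input.
  def fog(_words, _sentences, _complex_words), do: 0.0
end

# 100 words in 5 sentences, 10 of them complex:
# 0.4 * (20 + 10) = 12, roughly a US high-school senior.
GunningFogSketch.fog(100, 5, 10) |> Float.round(1)
# => 12.0
```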

# `lix`

```elixir
@spec lix(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the LIX (Läsbarhetsindex) readability score.

Designed for Swedish but used as a language-agnostic indicator.
A long word is one with 7 or more characters.

Formula: `(words/sentences) + 100 × (long_words/words)`.

Interpretation: <30 very easy, 30-40 easy, 40-50 medium,
50-60 difficult, >60 very difficult.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the segmentation locale, default `:en`.

### Returns

* A float. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.lix("The cat sat on the mat.") |> Float.round(1)
    6.0
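
The score and its interpretation bands can be sketched together. `LixSketch` is a hypothetical module, not part of the library, and the way boundary values such as exactly 30 or 60 are assigned to a band is an assumption made here, since the ranges above do not specify it.

```elixir
defmodule LixSketch do
  # Hypothetical helper: LIX from pre-computed counts.
  # `long_words` is the number of words with 7 or more characters.
  def lix(words, sentences, long_words) when words > 0 and sentences > 0 do
    words / sentences + 100 * (long_words / words)
  end

  # Mirrors the documented 0.0 result for empty input.
  def lix(_words, _sentences, _long_words), do: 0.0

  # Maps a score to the bands listed above. Boundary handling is an
  # assumption of this sketch.
  def band(score) do
    cond do
      score < 30 -> :very_easy
      score < 40 -> :easy
      score < 50 -> :medium
      score <= 60 -> :difficult
      true -> :very_difficult
    end
  end
end

# "The cat sat on the mat.": 6 words, 1 sentence, no long words.
score = LixSketch.lix(6, 1, 0)
# score is 6.0
LixSketch.band(score)
# => :very_easy
```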

# `metrics`

```elixir
@spec metrics(
  String.t(),
  keyword()
) :: metrics()
```

Computes every readability metric in one pass.

### Arguments

* `text` is a string of one or more sentences.

### Options

* `:language` is the syllable-counting and segmentation language,
  default `:en`.

### Returns

* A map with the keys `:flesch`, `:flesch_kincaid`, `:gunning_fog`,
  `:smog`, `:ari`, `:coleman_liau`, `:lix`, `:dale_chall`, and
  `:spache`. Empty input yields `0.0` for every metric.

### Examples

    iex> Text.Readability.metrics("The cat sat on the mat.") |> Map.keys() |> Enum.sort()
    [:ari, :coleman_liau, :dale_chall, :flesch, :flesch_kincaid, :gunning_fog, :lix, :smog, :spache]

# `smog`

```elixir
@spec smog(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the SMOG Index (Simple Measure Of Gobbledygook).

Output is roughly the years of formal education a reader needs.
Designed to be applied to passages of about 30 sentences; results
on shorter passages are less reliable.

Formula: `1.0430 × √(polysyllables × 30 / sentences) + 3.1291`.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the syllable-counting language, default `:en`.

### Returns

* A float. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.smog("The cat sat on the mat. The dog ran fast.") |> is_float()
    true
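
The `× 30 / sentences` term scales the polysyllable count to a 30-sentence sample. A minimal standalone sketch of the formula, using a hypothetical `SmogSketch` module and invented counts:

```elixir
defmodule SmogSketch do
  # Hypothetical helper: SMOG from pre-computed counts.
  # `polysyllables` is the number of words with 3 or more syllables.
  def smog(polysyllables, sentences) when sentences > 0 do
    1.0430 * :math.sqrt(polysyllables * 30 / sentences) + 3.1291
  end

  # Mirrors the documented 0.0 result for empty input.
  def smog(_polysyllables, _sentences), do: 0.0
end

# 30 polysyllabic words across exactly 30 sentences:
# 1.0430 * sqrt(30) + 3.1291 is about 8.84
SmogSketch.smog(30, 30) |> Float.round(1)
# => 8.8
```

Note that a passage with no polysyllabic words still scores `3.1291`, the formula's constant term.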

# `spache`

```elixir
@spec spache(
  String.t() | statistics(),
  keyword()
) :: float()
```

Returns the Spache readability score.

Output is a US school grade level, intended for K-3 reading
material. Computed from the percentage of *unfamiliar* words —
words **not** in the bundled ~1,000-word Spache easy-words list —
and average sentence length.

Formula: `0.121 × ASL + 0.082 × (UPW × 100) + 0.659`, where `ASL`
is average sentence length and `UPW` is the proportion of
unfamiliar words.

### Arguments

* `text` is a string of one or more sentences, or a `statistics()`
  map as returned by `statistics/2`.

### Options

* `:language` is the segmentation locale, default `:en`. The
  bundled easy-words list is English-only.

### Returns

* A float grade level. Returns `0.0` for empty input.

### Examples

    iex> Text.Readability.spache("The cat sat on the mat.") |> is_float()
    true
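
As a worked illustration of the formula, the sketch below uses a hypothetical `SpacheSketch` module and invented counts; it does not consult the bundled easy-words list.

```elixir
defmodule SpacheSketch do
  # Hypothetical helper: Spache from pre-computed counts.
  def spache(unfamiliar_words, words, sentences)
      when words > 0 and sentences > 0 do
    asl = words / sentences         # average sentence length
    upw = unfamiliar_words / words  # proportion of unfamiliar words
    0.121 * asl + 0.082 * (upw * 100) + 0.659
  end

  # Mirrors the documented 0.0 result for empty input.
  def spache(_unfamiliar_words, _words, _sentences), do: 0.0
end

# 100 words in 10 sentences, 5 unfamiliar words:
# 0.121 * 10 + 0.082 * 5 + 0.659 = 2.279
SpacheSketch.spache(5, 100, 10) |> Float.round(1)
# => 2.3
```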

# `statistics`

```elixir
@spec statistics(
  String.t(),
  keyword()
) :: statistics()
```

Returns the raw text statistics used by the readability metrics.

### Arguments

* `text` is a string of one or more sentences.

### Options

* `:language` is the language used for syllable counting and
  sentence segmentation. The default is `:en`. Only `:en` is
  supported by the syllable counter today.

### Returns

* A map with the keys `:characters`, `:letters`, `:words`,
  `:sentences`, `:syllables`, `:polysyllables`, `:long_words`,
  `:difficult_words`, `:unfamiliar_words`,
  `:average_sentence_length`, and `:average_syllables_per_word`.
  Empty input returns zeroed counts and `0.0` averages.

### Examples

    iex> stats = Text.Readability.statistics("The cat sat on the mat. The dog ran.")
    iex> stats.words
    9
    iex> stats.sentences
    2
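
To make the counts concrete, here is a deliberately naive counter for the same text. `NaiveStats` is a hypothetical module: it splits on whitespace and terminal punctuation instead of using `Text.Segment`, so it only approximates the library's UAX #29 segmentation and can disagree with it on abbreviations, quotations, and similar cases.

```elixir
defmodule NaiveStats do
  # Hypothetical stand-in for illustration only.
  def count(text) do
    # Whitespace-delimited tokens; trailing punctuation stays attached.
    words = text |> String.split() |> length()

    # Runs of ., ! or ? end a sentence; blank fragments are dropped.
    sentences =
      text
      |> String.split(~r/[.!?]+/, trim: true)
      |> Enum.reject(&(String.trim(&1) == ""))
      |> length()

    # Unicode letters only, so digits and punctuation are excluded.
    letters = Regex.scan(~r/\p{L}/u, text) |> length()

    average = if sentences > 0, do: words / sentences, else: 0.0

    %{
      words: words,
      sentences: sentences,
      letters: letters,
      average_sentence_length: average
    }
  end
end

NaiveStats.count("The cat sat on the mat. The dog ran.")
# => %{words: 9, sentences: 2, letters: 26, average_sentence_length: 4.5}
```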

