Essence.Readability (essence v0.3.0)

The Readability module contains several functions for calculating the readability scores of a text.

Summary

Functions

ari_score(doc)

Calculates the Automated Readability Index (ARI) of a given Essence.Document.

coleman_liau(doc)

The Coleman-Liau readability test. Like the ARI but unlike most of the other indices, Coleman-Liau relies on characters instead of syllables per word. Although opinion varies on its accuracy as compared to the syllable/word and complex word indices, characters are more readily and accurately counted by computer programs than are syllables. The Coleman-Liau index was designed to be easily calculated mechanically from samples of hard-copy text. Unlike syllable-based readability indices, it does not require that the character content of words be analyzed, only their length in characters. Therefore, it could be used in conjunction with theoretically simple mechanical scanners that would only need to recognize character, word, and sentence boundaries, removing the need for full optical character recognition or manual keypunching.

dale_chall(doc)

Calculates the Dale-Chall readability score, which provides a numeric gauge of the comprehension difficulty that readers encounter when reading a text. It uses a list of 3000 words that groups of fourth-grade American students could reliably understand, considering any word not on that list to be difficult.

gunning_fog(doc)

The Gunning fog index measures the readability of English writing. The index estimates the years of formal education needed to understand the text on a first reading. A fog index of 12 requires the reading level of a U.S. high school senior (around 18 years old). The test was developed by Robert Gunning, an American businessman, in 1952.[1]

reading_time(doc, speed \\ 200)

Calculates an estimate of the time it would take an average reader to read the given Essence.Document, assuming a reading speed of 200 words per minute unless another speed is given.

smog_grade(doc)

Calculates the SMOG grade, a readability measure that estimates the years of education needed to understand a piece of writing. The SMOG grade is commonly used in rating health messages. Note that results for documents with fewer than 30 sentences are statistically invalid [1].

speaking_speed(doc, speaking_time)

Calculates the speaking speed in words per minute, given a speech described by the given Essence.Document and the recorded speaking_time in minutes.

speaking_time(doc, speed \\ 120)

Calculates an estimate of the time it would take to read the given Essence.Document as a speech, assuming a speaking speed of 120 words per minute unless another speed is given.

Functions

ari_score(doc)

The ari_score function calculates the Automated Readability Index (ARI) of a given Essence.Document.

Details

The ARI uses two quantities, mu(w) and mu(s), where mu(w) is the average number of letters per word in the given text and mu(s) is the average number of words per sentence in the given text. The ARI is then defined by the following formula:

    ari = 4.71 * mu(w) + 0.5 * mu(s) - 21.43

Commonly, the ARI score is rounded up to the next whole number and mapped to a grade level using the following table:

ARI score   Readability Level   Reader Age
---------   -----------------   ----------
1           Kindergarten        5-6
2           First Grade         6-7
3           Second Grade        7-8
4           Third Grade         8-9
5           Fourth Grade        9-10
6           Fifth Grade         10-11
7           Sixth Grade         11-12
8           Seventh Grade       12-13
9           Eighth Grade        13-14
10          Ninth Grade         14-15
11          Tenth Grade         15-16
12          Eleventh Grade      16-17
13          Twelfth Grade       17-18
14+         College             18-22
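
The ARI formula and the round-up step can be sketched in a few lines. This is a minimal illustration, not the Essence implementation: the module name AriSketch is hypothetical, it takes raw letter, word, and sentence counts instead of an Essence.Document, and clamping the grade to the 1..14 range of the table is an assumption.

```elixir
defmodule AriSketch do
  # Hypothetical helper (not part of Essence): raw ARI from plain counts.
  def score(letters, words, sentences) when words > 0 and sentences > 0 do
    mu_w = letters / words      # average letters per word
    mu_s = words / sentences    # average words per sentence
    4.71 * mu_w + 0.5 * mu_s - 21.43
  end

  # Rounds the raw score up and clamps it to the table rows 1..14
  # (the clamping is an assumption of this sketch).
  def grade(score) when is_float(score) do
    score |> Float.ceil() |> trunc() |> max(1) |> min(14)
  end
end

# 450 letters, 100 words, 5 sentences: raw score of about 9.765
AriSketch.score(450, 100, 5) |> AriSketch.grade()
# => 10 (Ninth Grade in the table)
```
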

coleman_liau(doc)

Specs

coleman_liau(%Essence.Document{
  meta: term(),
  nested_tokens: term(),
  text: term(),
  type: term(),
  uri: term()
}) :: float()

The Coleman-Liau readability test. Like the ARI but unlike most of the other indices, Coleman-Liau relies on characters instead of syllables per word. Although opinion varies on its accuracy as compared to the syllable/word and complex word indices, characters are more readily and accurately counted by computer programs than are syllables.

The Coleman-Liau index was designed to be easily calculated mechanically from samples of hard-copy text. Unlike syllable-based readability indices, it does not require that the character content of words be analyzed, only their length in characters. Therefore, it could be used in conjunction with theoretically simple mechanical scanners that would only need to recognize character, word, and sentence boundaries, removing the need for full optical character recognition or manual keypunching.

The score output approximates the U.S. grade level thought necessary to comprehend the text.
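
The formula is not stated above. The commonly published form of the index is CLI = 0.0588 * L - 0.296 * S - 15.8, where L is the number of letters per 100 words and S is the number of sentences per 100 words. The sketch below applies that published formula to raw counts; the module name ColemanLiauSketch is hypothetical, and whether Essence uses exactly these coefficients is an assumption.

```elixir
defmodule ColemanLiauSketch do
  # Hypothetical helper (not part of Essence): Coleman-Liau index from counts.
  def index(letters, words, sentences) when words > 0 do
    l = letters / words * 100      # letters per 100 words
    s = sentences / words * 100    # sentences per 100 words
    0.0588 * l - 0.296 * s - 15.8
  end
end

# 450 letters, 100 words, 5 sentences: roughly 9.18, i.e. about ninth grade
IO.inspect(ColemanLiauSketch.index(450, 100, 5))
```
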

dale_chall(doc)

Calculates the Dale-Chall readability score, which provides a numeric gauge of the comprehension difficulty that readers encounter when reading a text. It uses a list of 3000 words that groups of fourth-grade American students could reliably understand, considering any word not on that list to be difficult.

Score          Notes
------------   ------------------------------------------------------------------------
4.9 or lower   easily understood by an average 4th-grade student or lower
5.0–5.9        easily understood by an average 5th or 6th-grade student
6.0–6.9        easily understood by an average 7th or 8th-grade student
7.0–7.9        easily understood by an average 9th or 10th-grade student
8.0–8.9        easily understood by an average 11th or 12th-grade student
9.0–9.9        easily understood by an average 13th to 15th-grade (college) student
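
As a rough illustration of how such a score is produced, the sketch below uses the commonly published new Dale-Chall formula: 0.1579 times the percentage of difficult words plus 0.0496 times the average sentence length, with 3.6365 added when more than 5% of the words are difficult. The module DaleChallSketch and the three-word familiar list are hypothetical stand-ins, and it is an assumption that Essence follows this exact variant.

```elixir
defmodule DaleChallSketch do
  # Hypothetical helper (not part of Essence): score from plain counts.
  def score(difficult_words, words, sentences) when words > 0 and sentences > 0 do
    pct_difficult = difficult_words / words * 100
    raw = 0.1579 * pct_difficult + 0.0496 * (words / sentences)
    # The published formula adds a constant once difficult words exceed 5%.
    if pct_difficult > 5, do: raw + 3.6365, else: raw
  end

  # Counts tokens that are missing from the familiar-word list (case-insensitive).
  def count_difficult(tokens, familiar) do
    Enum.count(tokens, fn t -> not MapSet.member?(familiar, String.downcase(t)) end)
  end
end

# Tiny stand-in for the real 3000-word list.
familiar = MapSet.new(["the", "cat", "sat"])
IO.inspect(DaleChallSketch.count_difficult(["The", "cat", "ponders"], familiar))

# 10 difficult words out of 100, in 5 sentences: roughly 6.2
IO.inspect(DaleChallSketch.score(10, 100, 5))
```
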

gunning_fog(doc)

The Gunning fog index measures the readability of English writing. The index estimates the years of formal education needed to understand the text on a first reading. A fog index of 12 requires the reading level of a U.S. high school senior (around 18 years old). The test was developed by Robert Gunning, an American businessman, in 1952.[1]

The fog index is commonly used to confirm that text can be read easily by the intended audience. Texts for a wide audience generally need a fog index less than 12. Texts requiring near-universal understanding generally need an index less than 8.

[1] DuBay, William H. (23 March 2004). "Judges Scold Lawyers for Bad Writing". Plain Language At Work Newsletter. Impact Information (8).

Fog Index   Reading level by grade
---------   ----------------------
17          College graduate
16          College senior
15          College junior
14          College sophomore
13          College freshman
12          High school senior
11          High school junior
10          High school sophomore
9           High school freshman
8           Eighth grade
7           Seventh grade
6           Sixth grade
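
The index is usually computed as 0.4 times the sum of the average sentence length and the percentage of complex words, where complex words are generally taken to be those with three or more syllables. The sketch below applies that formula to raw counts. The module name GunningFogSketch is hypothetical, and how Essence identifies complex words is not described here, so the complex-word count is simply passed in.

```elixir
defmodule GunningFogSketch do
  # Hypothetical helper (not part of Essence): fog index from plain counts.
  def index(words, sentences, complex_words) when words > 0 and sentences > 0 do
    avg_sentence_length = words / sentences
    pct_complex = 100 * complex_words / words
    0.4 * (avg_sentence_length + pct_complex)
  end
end

# 100 words in 5 sentences with 10 complex words: about 12,
# the "High school senior" row of the table.
IO.inspect(GunningFogSketch.index(100, 5, 10))
```
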

reading_time(doc, speed \\ 200)

Calculates an estimate of the time it would take an average reader to read the given Essence.Document, assuming a reading speed of 200 words per minute unless another speed is given.
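
The estimate amounts to dividing the word count by the reading speed. A minimal sketch follows, assuming the result is expressed in minutes and that words are whitespace-separated tokens; the module ReadingTimeSketch is hypothetical and works on a plain string rather than an Essence.Document.

```elixir
defmodule ReadingTimeSketch do
  # Hypothetical helper (not part of Essence): minutes needed to read `text`
  # at `speed` words per minute, counting whitespace-separated tokens.
  def minutes(text, speed \\ 200) when speed > 0 do
    word_count = text |> String.split() |> length()
    word_count / speed
  end
end

# A 300-word text at the default 200 words per minute.
text = String.duplicate("word ", 300)
ReadingTimeSketch.minutes(text)
# => 1.5
```
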

smog_grade(doc)

The smog_grade function calculates the SMOG grade, a readability measure that estimates the years of education needed to understand a piece of writing. The SMOG grade is commonly used in rating health messages. Note that results for documents with fewer than 30 sentences are statistically invalid [1].

[1] https://en.wikipedia.org/wiki/SMOG
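
For reference, the commonly published SMOG formula is grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291, where polysyllables is the number of words with three or more syllables. The sketch below evaluates it from raw counts. The module SmogSketch is hypothetical, and it is an assumption that Essence uses this exact form.

```elixir
defmodule SmogSketch do
  # Hypothetical helper (not part of Essence): SMOG grade from plain counts.
  # `polysyllables` is the number of words with three or more syllables.
  def grade(polysyllables, sentences) when sentences > 0 do
    1.0430 * :math.sqrt(polysyllables * 30 / sentences) + 3.1291
  end
end

# 36 polysyllabic words in a 30-sentence sample: about 9.39
IO.inspect(SmogSketch.grade(36, 30))
```
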

speaking_speed(doc, speaking_time)

Calculates the speaking speed in words per minute, given a speech described by the given Essence.Document and the recorded speaking_time in minutes.

speaking_time(doc, speed \\ 120)

Calculates an estimate of the time it would take to read the given Essence.Document as a speech, assuming a speaking speed of 120 words per minute unless another speed is given.
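
speaking_speed and speaking_time are inverse views of the same ratio between word count and minutes. The sketch below shows that relationship on raw word counts. The module SpeakingSketch is hypothetical, and expressing the estimated time in minutes is an assumption.

```elixir
defmodule SpeakingSketch do
  # Hypothetical helpers (not part of Essence), working on a plain word count.

  # Words per minute, given the recorded speaking time in minutes.
  def speed(word_count, speaking_time) when speaking_time > 0 do
    word_count / speaking_time
  end

  # Estimated minutes needed to deliver the speech at `speed` words per minute.
  def time(word_count, speed \\ 120) when speed > 0 do
    word_count / speed
  end
end

# A 600-word speech delivered in 5 minutes.
SpeakingSketch.speed(600, 5)
# => 120.0

# The same speech at the default 120 words per minute.
SpeakingSketch.time(600)
# => 5.0
```
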