readability v0.3.1 Readability

Readability library for extracting & curating articles.

Example

@type html :: binary

# Extract title
Readability.title(html)

# Extract only text from article
article = html
          |> Readability.article()
          |> Readability.readable_text()

# Extract article with transformed html
article = html
          |> Readability.article()
          |> Readability.raw_html()

Summary

Functions

article(raw_html, opts \\ [])
Using a variety of metrics (content score, class name, element types), find the content the user most likely wants to read

default_options()

parse(raw_html)
Normalize binary HTML and parse it into an HTML tree (tuple or list)

raw_html(html_tree)
Return the raw HTML binary from html_tree

readable_text(html_tree)
Return only the text binary from html_tree

regexes()

title(html)
Extract the title

Types

html_tree :: tuple | list
options :: list
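
The types above are loose, so a concrete shape may help. The sketch below assumes the tree follows the `{tag, attributes, children}` tuple convention used by Floki-style parsers; that shape is an assumption for illustration, since this page only states `tuple | list`.

```elixir
# Hypothetical tree for: <div class="post"><p>Hello <b>world</b></p></div>
# The {tag, attributes, children} shape is an assumption, not taken from this page.
tree =
  {"div", [{"class", "post"}],
   [
     {"p", [], ["Hello ", {"b", [], ["world"]}]}
   ]}

# A single element is a tuple; text nodes are binaries; siblings form a list.
{tag, attrs, children} = tree

IO.inspect(tag)
IO.inspect(attrs)
IO.inspect(length(children))
```

Under this assumption, a single root element is the `tuple` case of `html_tree` and a sequence of sibling nodes is the `list` case.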

Functions

article(raw_html, opts \\ [])

Specs

article(binary, options) :: html_tree

Using a variety of metrics (content score, class name, element types), find the content the user most likely wants to read

Example

iex> article_tree = Readability.article(html_str)
# returns the article as a tuple
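
To make the idea of combining metrics concrete, here is a toy scorer. It is not Readability's actual algorithm: the module name `ScoreSketch`, the keyword lists, and every weight are hypothetical, and nodes are assumed to be `{tag, attributes, children}` tuples.

```elixir
defmodule ScoreSketch do
  # Toy heuristic only; NOT the library's scoring. Names and weights are made up.

  # Combine an element-type score, a class-name score, and a text-length score.
  def score({tag, attrs, children}) do
    tag_score(tag) + class_score(attrs) + text_score(children)
  end

  # Element type: favour containers, penalise forms and lists.
  defp tag_score("article"), do: 10
  defp tag_score("div"), do: 5
  defp tag_score(tag) when tag in ["form", "ul", "ol"], do: -3
  defp tag_score(_other), do: 0

  # Class name: reward content-like names, penalise boilerplate-like names.
  defp class_score(attrs) do
    class =
      case List.keyfind(attrs, "class", 0) do
        {"class", value} -> String.downcase(value)
        nil -> ""
      end

    cond do
      String.contains?(class, ["comment", "footer", "sidebar"]) -> -25
      String.contains?(class, ["article", "content", "post"]) -> 25
      true -> 0
    end
  end

  # Content score: one point per 100 characters of direct text, capped at 3.
  defp text_score(children) do
    chars =
      children
      |> Enum.filter(&is_binary/1)
      |> Enum.map(&String.length/1)
      |> Enum.sum()

    min(div(chars, 100), 3)
  end
end

candidates = [
  {"div", [{"class", "sidebar"}], ["Links"]},
  {"div", [{"class", "post-content"}], [String.duplicate("word ", 50)]}
]

# Pick the highest-scoring candidate as the likely article body.
best = Enum.max_by(candidates, &ScoreSketch.score/1)
IO.inspect(elem(best, 1))
```

The point of the sketch is only that several weak signals are summed per candidate node and the best-scoring node is kept.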

default_options()

parse(raw_html)

Specs

parse(binary) :: html_tree

Normalize binary HTML and parse it into an HTML tree (tuple or list)

raw_html(html_tree)

Specs

raw_html(html_tree) :: binary

Return the raw HTML binary from html_tree

readable_text(html_tree)

Return only the text binary from html_tree
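
As a rough picture of what "only text" means, the sketch below flattens a tree into its text nodes. `TextSketch` is a hypothetical helper, not part of Readability, and it assumes nodes are `{tag, attributes, children}` tuples, binaries, or lists of these; the library's own output may differ, for example in whitespace handling.

```elixir
defmodule TextSketch do
  # Hypothetical helper for illustration; not Readability's implementation.

  # A text node contributes itself.
  def text(node) when is_binary(node), do: node
  # A list of siblings contributes the concatenation of its members.
  def text(nodes) when is_list(nodes), do: Enum.map_join(nodes, "", &text/1)
  # An element contributes the text of its children; tag and attributes are dropped.
  def text({_tag, _attrs, children}), do: text(children)
  # Anything else (such as a comment node) contributes no text.
  def text(_other), do: ""
end

tree = {"p", [], ["Hello ", {"b", [], ["world"]}, "!"]}
IO.puts(TextSketch.text(tree))
```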

regexes()

title(html)

Specs

title(binary) :: binary

Extract the title

Example

iex> title = Readability.title(html_str)
"Some title in html"