HtmlQuery (HtmlQuery v0.2.1)

Some simple HTML query functions. Delegates the hard work to Floki.

Data types

Some functions accept HTML in the form of a string, a Floki HTML tree, or a Floki HTML node. Others expect only a Floki HTML node or a Floki HTML tree. See HtmlQuery.html/0.

Some functions take a CSS selector, which can be a string, a keyword list, or a list. See HtmlQuery.Css.selector/0.

Main query functions

The main query functions take an HTML string or some parsed HTML, and a selector.

all/2: returns all elements matching the selector
find/2: returns the first element that matches the selector
find!/2: returns the only element that matches the selector
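
The difference between the three query functions shows up when a selector matches several elements or none. The following is a small sketch, assuming the html_query package is available in the project; the list markup is made up for illustration. The nil result follows from the spec of find/2, and find!/2 would raise on this input when asked for "li" because two elements match.

```elixir
iex> html = ~s|<ul> <li>one</li> <li>two</li> </ul>|
iex> HtmlQuery.all(html, "li") |> length()
2
iex> HtmlQuery.find(html, "li") |> HtmlQuery.text()
"one"
iex> HtmlQuery.find(html, "table")
nil
```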

Parsing functions

parse/1: parses an HTML fragment into a Floki HTML tree
parse_doc/1: parses an HTML doc into a Floki HTML tree
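
As a rough illustration of what a Floki HTML tree looks like, the sketch below parses a one-element fragment and then queries the parsed tree instead of the original string. It assumes the html_query package is available; the markup is invented for the example.

```elixir
iex> HtmlQuery.parse("<p>hi</p>")
[{"p", [], ["hi"]}]
iex> "<p>hi</p>" |> HtmlQuery.parse() |> HtmlQuery.find("p") |> HtmlQuery.text()
"hi"
```

Parsing once and reusing the tree can be convenient when several queries run against the same HTML.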

Extraction functions

attr/2: returns the attribute value as a string
form_fields/1: returns the names and values of form fields as a map
meta_tags/1: returns the names and values of metadata fields
text/1: returns the text contents as a single string

Utility functions

inspect_html/2: prints prettified HTML with a label
normalize/1: parses and re-stringifies HTML
pretty/1: prettifies HTML

Alias

If you use HtmlQuery a lot, you may want to alias it to the recommended shortcut "Hq":

alias HtmlQuery, as: Hq
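
With the alias in place, calls become shorter. A brief sketch, assuming the html_query package is available; the markup is invented for illustration:

```elixir
iex> alias HtmlQuery, as: Hq
iex> html = ~s|<p id="color">green</p>|
iex> Hq.find(html, "#color") |> Hq.text()
"green"
```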

Examples

Get the value of a selected option:

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.find(html, "select option[selected]") |> HtmlQuery.attr("value")
"a"

Get the text of a selected option, raising if there is more than one:

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.find!(html, "select option[selected]") |> HtmlQuery.text()
"apples"

Get the text of all the options:

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.all(html, "select option") |> Enum.map(&HtmlQuery.text/1)
["apples", "bananas"]

Use a keyword list as the selector (see HtmlQuery.Css for details on selectors):

iex> html = ~s|<div> <a href="/logout" test-role="logout-link">logout</a> </div>|
iex> HtmlQuery.find!(html, test_role: "logout-link") |> HtmlQuery.attr("href")
"/logout"

Summary

Types

attr()

A string or atom representing an attribute name. If an atom, underscores are converted to dashes.

html()

A string, a struct that implements the String.Chars protocol, a Floki HTML tree, or a Floki HTML node.

Functions

all(html, selector)

Finds all elements in html that match selector, returning a Floki HTML tree.

attr(html, attr)

Returns the value of attr from the outermost element of html. If attr is an atom, any underscores are converted to dashes.

find!(html, selector)

Like find/2 but raises unless exactly one element is found.

find(html, selector)

Finds the first element in html that matches selector, returning a Floki HTML node.

form_fields(html)

Returns a map containing the form fields of the form in html.

inspect_html(html, label \\ "INSPECTED HTML")

Prints prettified html with a label, and then returns the original html.

meta_tags(html)

Extracts all the meta tags from html, returning a list of maps.

normalize(html)

Parses and then re-stringifies html, increasing the likelihood that two equivalent HTML strings can be considered equal.

parse(html)

Parses an HTML fragment using Floki.parse_fragment!/1, returning a Floki HTML tree.

parse_doc(html)

Parses an HTML document using Floki.parse_document!/1, returning a Floki HTML tree.

pretty(html)

Returns html as a prettified string, using Floki.raw_html/2 and its pretty: true option.

text(html)

Returns the text value of html.

Types

@type attr() :: binary() | atom()

A string or atom representing an attribute name. If an atom, underscores are converted to dashes.

@type html() :: binary() | String.Chars.t() | Floki.html_tree() | Floki.html_node()

A string, a struct that implements the String.Chars protocol, a Floki HTML tree, or a Floki HTML node.

Functions

all(html, selector)

Finds all elements in html that match selector, returning a Floki HTML tree.

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.all(html, "option")
[
  {"option", [{"value", "a"}, {"selected", "selected"}], ["apples"]},
  {"option", [{"value", "b"}], ["bananas"]}
]

attr(html, attr)

@spec attr(html(), attr()) :: binary()

Returns the value of attr from the outermost element of html. If attr is an atom, any underscores are converted to dashes.

iex> html = ~s|<div> <a href="/logout" test-role="logout-link">logout</a> </div>|
iex> HtmlQuery.find!(html, test_role: "logout-link") |> HtmlQuery.attr("href")
"/logout"
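
The underscore-to-dash conversion described above means an atom can name a dashed attribute. A minimal sketch, assuming the html_query package is available and reusing the link markup from the example above:

```elixir
iex> html = ~s|<a href="/logout" test-role="logout-link">logout</a>|
iex> HtmlQuery.attr(html, :test_role)
"logout-link"
iex> HtmlQuery.attr(html, "test-role")
"logout-link"
```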

find!(html, selector)

Like find/2 but raises unless exactly one element is found.

find(html, selector)

@spec find(html(), HtmlQuery.Css.selector()) :: Floki.html_node() | nil

Finds the first element in html that matches selector, returning a Floki HTML node.

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.find(html, "select option[selected]")
{"option", [{"value", "a"}, {"selected", "selected"}], ["apples"]}

form_fields(html)

@spec form_fields(html()) :: %{required(atom()) => binary() | map()}

Returns a map containing the form fields of the form in html.

iex> html = ~s|<form> <input type="text" name="color" value="green"> <textarea name="desc">A tree</textarea> </form>|
iex> html |> HtmlQuery.find("form") |> HtmlQuery.form_fields()
%{color: "green", desc: "A tree"}

If form field names are in foo[bar] format, then foo becomes a key to a nested map containing bar:

iex> html = ~s|<form> <input type="text" name="profile[name]" value="fido"> <input type="text" name="profile[age]" value="10"> </form>|
iex> html |> HtmlQuery.find("form") |> HtmlQuery.form_fields()
%{profile: %{name: "fido", age: "10"}}

inspect_html(html, label \\ "INSPECTED HTML")

@spec inspect_html(html(), binary()) :: html()

Prints prettified html with a label, and then returns the original html.
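
Because the original html is returned, inspect_html/2 can be dropped into a pipeline for debugging without changing its result. The following is a sketch under the assumption that the html_query package is available; the markup and labels are invented, and the exact printed format is not shown here.

```elixir
html = ~s|<div> <p id="color">green</p> </div>|

result =
  html
  # prints the whole document under the first label, then passes it along unchanged
  |> HtmlQuery.inspect_html("BEFORE FIND")
  |> HtmlQuery.find("#color")
  # prints only the matched element under the second label
  |> HtmlQuery.inspect_html("AFTER FIND")
  |> HtmlQuery.text()
```

Removing both inspect_html/2 steps should leave the value bound to result unchanged.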

meta_tags(html)

@spec meta_tags(html()) :: [%{required(binary()) => binary()}]

Extracts all the meta tags from html, returning a list of maps.

iex> html = ~s|<head> <meta charset="utf-8"/> <meta http-equiv="X-UA-Compatible" content="IE=edge"/> </head>|
iex> HtmlQuery.meta_tags(html)
[%{"charset" => "utf-8"}, %{"content" => "IE=edge", "http-equiv" => "X-UA-Compatible"}]

normalize(html)

@spec normalize(html()) :: binary()

Parses and then re-stringifies html, increasing the likelihood that two equivalent HTML strings can be considered equal.

iex> a = ~s|<p id="color">green</p>|
iex> b = ~s|<p  id = "color" >green</p>|
iex> a == b
false
iex> HtmlQuery.normalize(a) == HtmlQuery.normalize(b)
true

parse(html)

@spec parse(html()) :: Floki.html_tree()

Parses an HTML fragment using Floki.parse_fragment!/1, returning a Floki HTML tree.

parse_doc(html)

@spec parse_doc(html()) :: Floki.html_tree()

Parses an HTML document using Floki.parse_document!/1, returning a Floki HTML tree.

pretty(html)

@spec pretty(html()) :: binary()

Returns html as a prettified string, using Floki.raw_html/2 and its pretty: true option.

text(html)

@spec text(html()) :: binary()

Returns the text value of html.

iex> html = ~s|<select> <option value="a" selected>apples</option> <option value="b">bananas</option> </select>|
iex> HtmlQuery.find!(html, "select option[selected]") |> HtmlQuery.text()
"apples"