scrape v3.1.0 Scrape.Tools.DOM

Utility module for selecting/extracting data from a "DOM" (HTML/XML tree-like structure). Can find text values and attribute values, inspired by jQuery and implemented with Floki.

Summary

Types

dom()

DOM tree representation, same as Floki's html_tree.

Functions

attr(dom, selector, name)

Similar to text/2, but returns a chosen attribute value (or nil) instead of the node's text value.

attrs(dom, selector, name)

Similar to attr/3 but returns a list of all matching results.

first(dom, list)

Cascading query helper: applies either text/2 or attr/3 until a query returns a non-nil result or all queries have been tried.

from_string(string)

Create a DOM from a given (HTML/XML) string.

text(dom, selector)

Get the text value of a DOM node (including nested nodes).

texts(dom, selector)

Similar to text/2 but iterates over all matching nodes.

to_string(dom)

Builds a (HTML/XML) string from a DOM structure.

Types

dom()

dom() :: String.t() | tuple() | [any()]

DOM tree representation, same as Floki's html_tree.

Can be created via from_string/1.
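To make the shape of this type concrete, here is a minimal sketch in plain Elixir that walks such a tree by hand. It assumes the usual Floki convention that an element is a `{tag, attributes, children}` tuple, a text node is a string, and a sequence of nodes is a list. The module name `DomShapeSketch` is hypothetical and is not part of Scrape or Floki.

```elixir
defmodule DomShapeSketch do
  # Hypothetical helper for illustration only; not part of Scrape or Floki.
  # Collects the text of a Floki-style tree, including nested nodes.

  # A text node is a plain string.
  def text(node) when is_binary(node), do: node

  # An element is a {tag, attributes, children} tuple with a string tag.
  def text({tag, _attrs, children}) when is_binary(tag), do: text(children)

  # A sequence of nodes is a list; concatenate the text of each node.
  def text(nodes) when is_list(nodes), do: Enum.map_join(nodes, "", &text/1)

  # Anything else (for example comment nodes) contributes no text.
  def text(_other), do: ""
end

# A hand-written tree in the assumed shape: <div class="a">abc<b>def</b></div>
tree = {"div", [{"class", "a"}], ["abc", {"b", [], ["def"]}]}

IO.inspect(DomShapeSketch.text(tree))
# prints "abcdef"
```

In practice there is little reason to traverse the tree manually; text/2 and attr/3 cover the common cases, and the sketch only shows what they operate on.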

Functions

attr(dom, selector, name)

attr(dom(), String.t(), String.t()) :: nil | String.t()

Similar to text/2, but returns a chosen attribute value (or nil) instead of the node's text value.

Examples

iex> "<meta name='a' content='b' />" |> DOM.from_string |> DOM.attr("meta", "unknown")
nil

iex> "<meta name='a' content='b' />" |> DOM.from_string |> DOM.attr("meta", "content")
"b"

iex> "<meta name='a' content='b' />" |> DOM.from_string |> DOM.attr("meta[name=a]", "content")
"b"

attrs(dom, selector, name)

attrs(dom(), String.t(), String.t()) :: [String.t()]

Similar to attr/3 but returns a list of all matching results.

Examples

iex> "<p class='a'>b</p><p class='c' />" |> DOM.from_string() |> DOM.attrs("div", "class")
[]

iex> "<p class='a'>b</p><p class='c' />" |> DOM.from_string() |> DOM.attrs("p", "id")
[]

iex> "<p class='a'>b</p><p class='c' />" |> DOM.from_string() |> DOM.attrs("p", "class")
["a", "c"]

first(dom, list)

first(dom(), [{String.t()} | {String.t(), String.t()}]) :: nil | String.t()

Cascading query helper: for each query in the list, applies text/2 (for a {selector} tuple) or attr/3 (for a {selector, name} tuple) until one returns a non-nil result or all queries have been tried.

Examples

iex> DOM.first([], [])
nil

iex> DOM.first([], [{"b"}, {"i"}, {"div", "class"}])
nil

iex> "<div id='1'>abc</div>" |> DOM.from_string() |> DOM.first([{"i"}, {"div", "id"}])
"1"

iex> "<b>abc</b>" |> DOM.from_string() |> DOM.first([{"i"}, {"b"}])
"abc"
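The cascading behaviour described above can be sketched in a few lines. This is an illustrative re-implementation built on the documented text/2 and attr/3 functions, not the library's actual source. The module name `CascadeSketch` is hypothetical, and running the non-empty cases assumes the scrape package is available as a dependency.

```elixir
defmodule CascadeSketch do
  # Hypothetical re-implementation for illustration; not Scrape's own code.
  alias Scrape.Tools.DOM

  # Tries each query in order and returns the first non-nil result.
  # Enum.find_value/2 returns nil when no query yields a value,
  # which also covers the empty query list.
  def first(dom, queries) do
    Enum.find_value(queries, fn
      # One-element tuple: look up the text of the first matching node.
      {selector} -> DOM.text(dom, selector)
      # Two-element tuple: look up the named attribute instead.
      {selector, name} -> DOM.attr(dom, selector, name)
    end)
  end
end

# With no queries there is nothing to try, so the result is nil.
IO.inspect(CascadeSketch.first([], []))
# prints nil
```

A typical use of such a cascade is to order queries from most to least specific, for instance trying a metadata attribute such as `{"meta[name=a]", "content"}` first and falling back to a text query such as `{"title"}`.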

from_string(string)

from_string(String.t()) :: dom()

Create a DOM from a given (HTML/XML) string.

Examples

iex> DOM.from_string("")
[]

iex> DOM.from_string("<html></html>")
{"html", [], []}

text(dom, selector)

text(dom(), String.t()) :: nil | String.t()

Get the text value of a DOM node (including nested nodes).

If many nodes match the selector, the first one is used.

Examples

iex> "<div>abc</div>" |> DOM.from_string() |> DOM.text("p")
nil

iex> "<div>abc</div>" |> DOM.from_string() |> DOM.text("div")
"abc"

texts(dom, selector)

texts(dom(), String.t()) :: [String.t()]

Similar to text/2 but iterates over all matching nodes.

Always returns a list, with nil values filtered out.

Examples

iex> "<div>abc</div>" |> DOM.from_string() |> DOM.texts("p")
[]

iex> "<div>abc</div>" |> DOM.from_string() |> DOM.texts("div")
["abc"]

iex> "<p>a</p><p>b</p>" |> DOM.from_string() |> DOM.texts("p")
["a", "b"]

to_string(dom)

to_string(dom()) :: String.t()

Builds a (HTML/XML) string from a DOM structure.

Examples

iex> DOM.to_string([])
""

iex> DOM.to_string({"html", [], []})
"<html></html>"
