Meeseeks v0.13.1 Meeseeks.Document

A Meeseeks.Document represents a flattened, queryable view of an HTML document in which:

  • The nodes (element, comment, or text) have been provided an id
  • Parent-child relationships have been made explicit

Examples

The contents of a document quickly become unwieldy in iex, so the inspect value of a document is always #Meeseeks.Document<{...}> regardless of its content. The example below ignores this for educational purposes.

tuple_tree = {"html", [],
               [{"head", [], []},
                {"body", [],
                 [{"h1", [{"id", "greeting"}], ["Hello, World!"]},
                  {"div", [], [
                      {"p", [], ["1"]},
                      {"p", [], ["2"]},
                      {"p", [], ["3"]}]}]}]}

document = Meeseeks.Parser.parse(tuple_tree)
#=> %Meeseeks.Document{
#      id_counter: 12,
#      roots: [1],
#      nodes: %{
#        1 => %Meeseeks.Document.Element{attributes: [], children: [3, 2],
#         id: 1, namespace: nil, parent: nil, tag: "html"},
#        2 => %Meeseeks.Document.Element{attributes: [], children: [], id: 2,
#         namespace: nil, parent: 1, tag: "head"},
#        3 => %Meeseeks.Document.Element{attributes: [], children: [6, 4], id: 3,
#         namespace: nil, parent: 1, tag: "body"},
#        4 => %Meeseeks.Document.Element{attributes: [{"id", "greeting"}],
#         children: [5], id: 4, namespace: nil, parent: 3, tag: "h1"},
#        5 => %Meeseeks.Document.Text{content: "Hello, World!", id: 5, parent: 4},
#        6 => %Meeseeks.Document.Element{attributes: [], children: [7, 9, 11],
#         id: 6, namespace: nil, parent: 3, tag: "div"},
#        7 => %Meeseeks.Document.Element{attributes: [], children: [8], id: 7,
#         namespace: nil, parent: 6, tag: "p"},
#        8 => %Meeseeks.Document.Text{content: "1", id: 8, parent: 7},
#        9 => %Meeseeks.Document.Element{attributes: [], children: [10], id: 9,
#         namespace: nil, parent: 6, tag: "p"},
#        10 => %Meeseeks.Document.Text{content: "2", id: 10, parent: 9},
#        11 => %Meeseeks.Document.Element{attributes: [], children: [12], id: 11,
#         namespace: nil, parent: 6, tag: "p"},
#        12 => %Meeseeks.Document.Text{content: "3", id: 12, parent: 11}}}

Meeseeks.Document.children(document, 6)
#=> [7, 9, 11]

Meeseeks.Document.descendants(document, 6)
#=> [7, 8, 9, 10, 11, 12]

Summary

Functions

Returns the node ids of node_id's ancestors in the context of the document

Returns the node ids of node_id's children in the context of the document

Deletes the node referenced by node_id and all its descendants from the document

Returns the node ids of node_id's descendants in the context of the document

Checks if a node_id refers to a Meeseeks.Document.Element in the context of the document

Returns a tuple of {:ok, node}, where node is the node referred to by node_id in the context of the document, or a tuple of {:error, error} if no such node exists

Returns the node referred to by node_id in the context of the document, or nil

Returns all of the document's node ids

Returns all of the document's nodes

Returns a list of nodes referred to by node_ids in the context of the document

Returns all of the document's root ids

Returns all of the document's root nodes

Returns the HTML of the document

Returns the node ids of the siblings that come after node_id in the context of the document

Returns the node id of node_id's parent in the context of the document, or nil if node_id does not have a parent

Returns the node ids of the siblings that come before node_id in the context of the document

Returns the node ids of node_id's siblings in the context of the document

Returns the Meeseeks.TupleTree of the document

Types

node_t()
node_t() :: Meeseeks.Document.Node.t()

t()
t() :: %Meeseeks.Document{
  id_counter: node_id() | nil,
  nodes: %{optional(node_id()) => node_t()},
  roots: [node_id()]
}

Functions

ancestors(document, node_id)
ancestors(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of node_id's ancestors in the context of the document.

Returns the ancestors in reverse order: [parent, grandparent, ...]

Raises if node_id does not exist in the document.
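
The ordering can be illustrated with a minimal sketch. It assumes the Meeseeks dependency is available and that Meeseeks.Parser.parse/1 accepts a tuple tree as in the Examples section; the ids are discovered by walking down from the root rather than hard-coded, and the variable names are illustrative only.

```elixir
alias Meeseeks.Document

# A small tree: div > p > text
document = Meeseeks.Parser.parse({"div", [], [{"p", [], ["1"]}]})

# Walk down from the root so the sketch does not depend on concrete ids.
[div_id] = Document.get_root_ids(document)
[p_id] = Document.children(document, div_id)
[text_id] = Document.children(document, p_id)

# Ancestors are listed nearest first: parent, then grandparent.
ancestors = Document.ancestors(document, text_id)

# The first ancestor is the same id that parent/2 returns.
parent_id = Document.parent(document, text_id)

# A root node has no parent, so its ancestor list should be empty.
root_ancestors = Document.ancestors(document, div_id)
```

Here ancestors is expected to equal [p_id, div_id] and parent_id to equal p_id.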

children(document, node_id)
children(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of node_id's children in the context of the document.

Returns all children, not just those that are Meeseeks.Document.Elements.

Returns children in depth-first order.

Raises if node_id does not exist in the document.

delete_node(document, node_id)

Deletes the node referenced by node_id and all its descendants from the document.

Raises if node_id does not exist in the document.
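
A short sketch of pruning a subtree follows. The listing above shows no spec for delete_node/2, so the sketch assumes it returns the updated document and leaves the original untouched, as is usual for immutable Elixir data; it also assumes the Meeseeks dependency is available.

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse({"div", [], [{"p", [], ["keep"]}, {"p", [], ["drop"]}]})

[div_id] = Document.get_root_ids(document)
[_keep_id, drop_id] = Document.children(document, div_id)
[drop_text_id] = Document.children(document, drop_id)

# Assumption: delete_node/2 returns a new document without the node.
pruned = Document.delete_node(document, drop_id)

# Neither the deleted element nor its text child can be looked up any more.
deleted_node = Document.get_node(pruned, drop_id)
deleted_text = Document.get_node(pruned, drop_text_id)

# The original document still holds the node.
original_node = Document.get_node(document, drop_id)
```

Both deleted_node and deleted_text should be nil, while original_node is still the "p" element.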

descendants(document, node_id)
descendants(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of node_id's descendants in the context of the document.

Returns all descendants, not just those that are Meeseeks.Document.Elements.

Returns descendants in depth-first order.

Raises if node_id does not exist in the document.
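
Because descendants/2 includes text nodes, it can be combined with get_nodes/2 to collect the text under an element. The following is a minimal sketch under the assumption that the Meeseeks dependency is available; the pipeline is one possible approach, not the library's own text-extraction method.

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse(
    {"div", [], [{"p", [], ["1"]}, {"p", [], ["2"]}, {"p", [], ["3"]}]}
  )

[div_id] = Document.get_root_ids(document)

# Three p elements plus their three text nodes, in depth-first order.
descendant_ids = Document.descendants(document, div_id)

# Keep only the text nodes and read their content.
texts =
  document
  |> Document.get_nodes(descendant_ids)
  |> Enum.filter(&match?(%Document.Text{}, &1))
  |> Enum.map(& &1.content)
```

With this tree, texts should be ["1", "2", "3"].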

element?(document, node_id)
element?(Meeseeks.Document.t(), node_id()) :: boolean() | no_return()

Checks if a node_id refers to a Meeseeks.Document.Element in the context of the document.

Raises if node_id does not exist in the document.
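
Since children/2 returns text and comment nodes as well as elements, element?/2 is useful as a filter. A minimal sketch, assuming the Meeseeks dependency is available and that strings in a tuple tree become text nodes as in the Examples section:

```elixir
alias Meeseeks.Document

# A paragraph with mixed content: text, an element, then more text.
document =
  Meeseeks.Parser.parse({"p", [], ["Hello, ", {"b", [], ["World"]}, "!"]})

[p_id] = Document.get_root_ids(document)
child_ids = Document.children(document, p_id)

# Keep only the children that are elements.
element_ids = Enum.filter(child_ids, &Document.element?(document, &1))

tags = Enum.map(element_ids, fn id -> Document.get_node(document, id).tag end)
```

Of the three children, only the "b" element should pass the filter.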

fetch_node(document, node_id)
fetch_node(Meeseeks.Document.t(), node_id()) ::
  {:ok, node_t()} | {:error, Meeseeks.Error.t()}

Returns a tuple of {:ok, node}, where node is the node referred to by node_id in the context of the document, or a tuple of {:error, error} if no such node exists.

get_node(document, node_id)
get_node(Meeseeks.Document.t(), node_id()) :: node_t() | nil

Returns the node referred to by node_id in the context of the document, or nil.
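
The two lookup styles can be contrasted in a short sketch. It assumes the Meeseeks dependency is available and that node ids are integers, as in the Examples section; tag_or_missing and missing_id are illustrative names, not part of the library.

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse({"h1", [{"id", "greeting"}], ["Hello, World!"]})

[h1_id] = Document.get_root_ids(document)

# An id that is not in the document: one past the largest id in use.
missing_id = Enum.max(Document.get_node_ids(document)) + 1

# fetch_node/2 reports success or failure through a tagged tuple.
tag_or_missing = fn id ->
  case Document.fetch_node(document, id) do
    {:ok, node} -> node.tag
    {:error, _error} -> :missing
  end
end

found = tag_or_missing.(h1_id)
not_found = tag_or_missing.(missing_id)

# get_node/2 returns the node itself, or nil when the id is unknown.
maybe_node = Document.get_node(document, missing_id)
```

fetch_node/2 suits code that must distinguish a missing node explicitly, while get_node/2 is convenient when nil is an acceptable result.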

get_node_ids(document)
get_node_ids(Meeseeks.Document.t()) :: [node_id()]

Returns all of the document's node ids.

Returns node ids in depth-first order.

Returns all of the document's nodes.

Returns nodes in depth-first order.

get_nodes(document, node_ids)
get_nodes(Meeseeks.Document.t(), [node_id()]) :: [node_t()] | no_return()

Returns a list of nodes referred to by node_ids in the context of the document.

Returns nodes in the same order as node_ids.

Raises if any id in node_ids does not exist in the document.
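
The ordering guarantee can be shown with a brief sketch, assuming the Meeseeks dependency is available: reversing the ids passed in reverses the nodes that come back.

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse(
    {"ul", [], [{"li", [], ["a"]}, {"li", [], ["b"]}, {"li", [], ["c"]}]}
  )

[ul_id] = Document.get_root_ids(document)
li_ids = Document.children(document, ul_id)

# Ask for the same nodes in reverse order.
reversed_nodes = Document.get_nodes(document, Enum.reverse(li_ids))

# Each node carries its own id, so the order can be checked directly.
returned_ids = Enum.map(reversed_nodes, & &1.id)
```

Here returned_ids should equal Enum.reverse(li_ids).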

get_root_ids(document)
get_root_ids(Meeseeks.Document.t()) :: [node_id()]

Returns all of the document's root ids.

Returns root ids in depth-first order.

get_root_nodes(document)
get_root_nodes(Meeseeks.Document.t()) :: [node_t()]

Returns all of the document's root nodes.

Returns nodes in depth-first order.
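
As a small illustration of how root ids and root nodes relate, here is a sketch that assumes the Meeseeks dependency is available and uses a single-root tuple tree:

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse({"html", [], [{"head", [], []}, {"body", [], []}]})

root_ids = Document.get_root_ids(document)
root_nodes = Document.get_root_nodes(document)

# The root nodes are the nodes behind the root ids, in the same order.
root_node_ids = Enum.map(root_nodes, & &1.id)
root_tags = Enum.map(root_nodes, & &1.tag)

# Root nodes have no parent.
root_parents = Enum.map(root_ids, &Document.parent(document, &1))
```

For this tree, root_tags should be ["html"] and root_parents should be [nil].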

Returns the HTML of the document.

next_siblings(document, node_id)
next_siblings(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of the siblings that come after node_id in the context of the document.

Returns all of these siblings, not just those that are Meeseeks.Document.Elements.

Returns siblings in depth-first order.

Raises if node_id does not exist in the document.

parent(document, node_id)
parent(Meeseeks.Document.t(), node_id()) :: node_id() | nil | no_return()

Returns the node id of node_id's parent in the context of the document, or nil if node_id does not have a parent.

Raises if node_id does not exist in the document.

previous_siblings(document, node_id)
previous_siblings(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of the siblings that come before node_id in the context of the document.

Returns all of these siblings, not just those that are Meeseeks.Document.Elements.

Returns siblings in depth-first order.

Raises if node_id does not exist in the document.

siblings(document, node_id)
siblings(Meeseeks.Document.t(), node_id()) :: [node_id()] | no_return()

Returns the node ids of node_id's siblings in the context of the document.

Returns all siblings, including node_id itself, and not just those that are Meeseeks.Document.Elements.

Returns siblings in depth-first order.

Raises if node_id does not exist in the document.
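
The three sibling functions can be compared side by side. A minimal sketch, assuming the Meeseeks dependency is available; first, second, and third are illustrative names for the ids of the three p elements:

```elixir
alias Meeseeks.Document

document =
  Meeseeks.Parser.parse(
    {"div", [], [{"p", [], ["1"]}, {"p", [], ["2"]}, {"p", [], ["3"]}]}
  )

[div_id] = Document.get_root_ids(document)
[first, second, third] = Document.children(document, div_id)

# siblings/2 includes the node itself.
all_siblings = Document.siblings(document, second)

# previous_siblings/2 and next_siblings/2 exclude it.
preceding = Document.previous_siblings(document, second)
following = Document.next_siblings(document, second)
```

Taken together, preceding ++ [second] ++ following should reproduce all_siblings, that is [first, second, third].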

Returns the Meeseeks.TupleTree of the document.