SimpleXml.XmlNode (simple_xml v1.3.1)

A simple XML node representation built on the Saxy library. It avoids xmerl-based libraries, which are vulnerable because they create a new atom for each tag within the XML document.
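
To make the atom concern concrete, the following sketch uses only :xmerl_scan from Erlang/OTP to show that a parsed tag name comes back as an atom. The sample document and variable names are illustrative assumptions. Atoms are not garbage collected, so untrusted documents with many distinct tag names can keep growing the atom table.

```elixir
# xmerl ships with Erlang/OTP, so no extra dependency is needed here.
xml = ~c"<order><item>1</item></order>"
{element, _rest} = :xmerl_scan.string(xml)

# An xmlElement record is a tuple: {:xmlElement, name, ...}.
record_tag = elem(element, 0)
tag_name = elem(element, 1)

IO.inspect(record_tag)        # :xmlElement
IO.inspect(tag_name)          # :order
IO.inspect(is_atom(tag_name)) # true
```

By contrast, the nodes in this module carry tag names as strings, as in the {"bar", [], ["1"]} tuples that appear in the examples on this page.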

For simplicity, this module ignores namespaces within the document.

Summary

Functions

Obtains the value of the given attribute.

This function attempts to mimic the :xmerl_c14n.c14n() functionality.

Returns the children of the given node. To get a filtered list of children, see children/2.

Returns all children that match the given child_name filter. Filtering matches that of first_child/2. A string child tag name or a regex can be supplied for filtering.

Removes all children that match the given name. Semantics of the child_name parameter follow those of the first_child/2 function.

Obtains the first direct child of the given node with the given string tag name via case-insensitive match.

Obtains any namespace declaration attribute associated with the given node.

Returns true if the given attribute name is one that's reserved for namespaces; false otherwise.

Obtains text within the body of a tag.

Exports the given node, its attributes, and its descendants into an XML string.

Types

attributes_reduction()

@type attributes_reduction() :: {[SimpleXml.xml_attribute()], map(), map()}

xml_attribute()

@type xml_attribute() :: SimpleXml.xml_attribute()

xml_node()

@type xml_node() :: SimpleXml.xml_node()

Functions

attribute(arg, attr_name)

@spec attribute(xml_node(), String.t()) :: {:ok, String.t()} | {:error, any()}

Obtains the value of the given attribute.

Examples

Obtains the value for an attribute

iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" b="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "a")
{:ok, "1"}

Returns the first matching attribute it finds

iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" a="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "a")
{:ok, "1"}

Generates an error when the attribute is missing

iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" b="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "c")
{:error, {:attribute_not_found, "c"}}

canonicalize(node, opts \\ [])

@spec canonicalize(
  xml_node() | [xml_node()] | String.t(),
  keyword()
) :: xml_node() | [xml_node()]

This function attempts to mimic the :xmerl_c14n.c14n() functionality by:

  • placing namespace attributes first
  • sorting attributes alphabetically
  • applying namespace attributes only to the nodes where they're first used
  • moving unused named namespace attributes to the descendants where they're first used

Ultimately, :xmerl_c14n.c14n() is an implementation of the W3C XML canonicalization spec.
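
The first two rules in the list above can be pictured with plain Elixir. The sketch below is not the library's implementation. The is_ns_attr helper and the sample attribute list are assumptions made for illustration. It places namespace declarations ahead of other attributes, then orders each group by name using Elixir's default binary comparison, which puts uppercase letters before lowercase ones.

```elixir
# Hypothetical helper: true for "xmlns" and "xmlns:*" attribute names.
is_ns_attr = fn name -> name == "xmlns" or String.starts_with?(name, "xmlns:") end

# Attributes as {name, value} tuples, in document order.
attrs = [{"a", "1"}, {"xmlns", "a"}, {"B", "2"}]

# Sort key: namespace declarations first (false sorts before true),
# then by attribute name.
sorted = Enum.sort_by(attrs, fn {name, _value} -> {not is_ns_attr.(name), name} end)

IO.inspect(sorted)
# [{"xmlns", "a"}, {"B", "2"}, {"a", "1"}]
```

This is the same ordering seen in the "Attributes are sorted alphabetically" example, where B="2" precedes a="1".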

Examples

Nodes with no attributes are unaffected

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo><bar>1</bar><bar>2</bar></foo>'
iex> expected_output = ~S'<foo><bar>1</bar><bar>2</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Attributes are sorted alphabetically, with namespaces appearing first

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a" a="1" B="2"><bar xmlns="b" b="1" a="2">1</bar><bar xmlns="c">2</bar></foo>'
iex> expected_output = ~S'<foo xmlns="a" B="2" a="1"><bar xmlns="b" a="2" b="1">1</bar><bar xmlns="c">2</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Unused named namespaces are dropped

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"></foo>'
iex> expected_output = ~S'<foo></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Default namespaces are kept

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a"></foo>'
iex> expected_output = ~S'<foo xmlns="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Parent named namespace is preserved when it's used by the parent

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Child named namespace is dropped when the parent declares and uses the same namespace prefix

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a"><a:bar xmlns:a="B">1</a:bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Parent unused named namespaces are applied to namespaced children, but not namespaced grandchildren

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><a:bar>1</a:bar><a:bar><a:baz>2</a:baz></a:bar></foo>'
iex> expected_output = ~S'<foo><a:bar xmlns:a="a">1</a:bar><a:bar xmlns:a="a"><a:baz>2</a:baz></a:bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Parent unused named namespace takes precedence over the child's named namespace declaration

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><a:bar xmlns:a="b">1</a:bar><a:bar>2</a:bar></foo>'
iex> expected_output = ~S'<foo><a:bar xmlns:a="a">1</a:bar><a:bar xmlns:a="a">2</a:bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Attribute namespaces are preserved

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a" a:type="apple"><bar>1</bar></foo>'
iex> expected_output = ~S'<foo xmlns:a="a" a:type="apple"><bar>1</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Attribute namespaces within namespaced nodes are preserved

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b" b:type="apple"><bar>1</bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a" xmlns:b="b" b:type="apple"><bar>1</bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Unused attribute namespaces in the parent are applied to child nodes

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><bar a:type="apple">1</bar></foo>'
iex> expected_output = ~S'<foo><bar xmlns:a="a" a:type="apple">1</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Unused attribute namespaces in the namespaced parent are applied to child nodes

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b"><bar b:type="apple">1</bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><bar xmlns:b="b" b:type="apple">1</bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Unused attribute namespaces in the namespaced parent are applied to child nodes, but not grandchildren

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b"><bar b:type="apple"><baz b:type="food">2</baz></bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><bar xmlns:b="b" b:type="apple"><baz b:type="food">2</baz></bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output

Inclusive namespaces are preserved

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a" xmlns:b="b"><bar b:type="apple"><baz b:type="food">2</baz></bar></foo>'
iex> expected_output = ~S'<foo xmlns:a="a"><bar xmlns:b="b" b:type="apple"><baz b:type="food">2</baz></bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root, inclusive_namespaces: ["a"]) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n(false, [~c"a"]) |> to_string()
iex> output
xmerl_output

children(arg)

@spec children(xml_node()) ::
  {:ok, [String.t() | xml_node()]}
  | {:error, {:no_children_found, [String.t() | xml_node()]}}

Returns the children of the given node. To get a filtered list of children, see children/2.

Examples

Returns all children

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node)
{:ok, [{"bar", [], ["1"]}, {"baz", [], ["2"]}]}

Returns an error if the node doesn't contain a child

iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.children(node)
{:error, {:no_children_found, ["bar"]}}

children(xml_node, child_name)

@spec children(xml_node(), String.t() | Regex.t()) :: [xml_node()]

Returns all children that match the given child_name filter. Filtering matches that of first_child/2. A string child tag name or a regex can be supplied for filtering.

Examples

Returns all children by a given string name

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node, "bar")
[{"bar", [], ["1"]}]

Returns all children by a given Regex

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node, ~r/BA/i)
[{"bar", [], ["1"]}, {"baz", [], ["2"]}]

Returns an empty list if there are no matching children

iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.children(node, "bar")
[]

drop_children(arg, child_name)

@spec drop_children(xml_node(), String.t() | Regex.t()) :: xml_node()

Removes all children that match the given name. Semantics of the child_name parameter follow those of the first_child/2 function.

Examples

All matching children are removed based on a string child name

iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.drop_children(node, "*:Bar")
{"ns:foo", [], []}

All matching children are removed based on a Regex child name

iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:BAR>2</xs:BAR></ns:foo>')
iex> SimpleXml.XmlNode.drop_children(node, ~r/bar/)
{"ns:foo", [], [{"xs:BAR", [], ["2"]}]}

first_child(xml_node, child_name)

@spec first_child(xml_node(), String.t() | Regex.t()) ::
  {:ok, xml_node()} | {:error, any()}

Obtains the first direct child of the given node with the given string tag name via case-insensitive match.

Use a *: prefix on the tag name to ignore the namespace associated with it.

Alternatively, you can supply a regex to pattern match the child name. When a regex is supplied, its case sensitivity is respected.

Examples

Obtains the first child by the given name

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.first_child(node, "bar")
{:ok, {"bar", [], ["1"]}}

Returns the first matching node it finds

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><bar>2</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "bar")
{:ok, {"bar", [], ["1"]}}

Ignores case when matching tag name

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><bar>2</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "BAR")
{:ok, {"bar", [], ["1"]}}

Wildcard ignores tag namespace

iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.first_child(node, "*:Bar")
{:ok, {"xs:bar", [], ["1"]}}

Use Regex to find a child

iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.first_child(node, ~r/.*:BAR/i)
{:ok, {"xs:bar", [], ["1"]}}

Generates an error when there's no child with the given name

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "baz")
{:error, {:child_not_found, [child_name: "baz", actual_children: [{"bar", [], ["1"]}]]}}

Generates an error when there are no children

iex> {:ok, node} = SimpleXml.parse(~S'<foo></foo>')
iex> SimpleXml.XmlNode.first_child(node, "baz")
{:error, {:child_not_found, [child_name: "baz", actual_children: []]}}

namespace_attribute(arg)

@spec namespace_attribute(xml_node()) :: {:ok, xml_attribute()} | {:error, any()}

Obtains any namespace declaration attribute associated with the given node.

Examples

Returns the default namespace, if it's applicable

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:ok, {"xmlns", "a"}}

Returns the named namespace, if it's applicable

iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="b"></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:ok, {"xmlns:a", "b"}}

Returns an error if there's no namespace

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:error, :namesapce_attribute_not_found}

Returns an error if there's no applicable namespace

iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:error, :namesapce_attribute_not_found}

namespace_attribute?(arg1)

@spec namespace_attribute?(String.t()) :: boolean()

Returns true if the given attribute name is one that's reserved for namespaces; false otherwise.

Examples

Returns true for xmlns

iex> SimpleXml.XmlNode.namespace_attribute?("xmlns")
true

Returns true for xmlns:*

iex> SimpleXml.XmlNode.namespace_attribute?("xmlns:a")
true

Returns false for non-namespace attributes

iex> SimpleXml.XmlNode.namespace_attribute?("foo")
false

text(xml_node)

@spec text(xml_node()) :: {:ok, String.t()} | {:error, any()}

Obtains text within the body of a tag.

Examples

Obtains the text contents of a tag

iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.text(node)
{:ok, "bar"}

Generates an error when the tag contains no text

iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar></foo>')
iex> SimpleXml.XmlNode.text(node)
{:error, {:text_not_found, [{"bar", [], ["1"]}]}}

to_string(text)

@spec to_string(xml_node() | [xml_node()] | [xml_attribute()] | String.t()) ::
  String.t()

Exports the given node, its attributes, and its descendants into an XML string.

Examples

XML can be exported to string

iex> input = ~S'<foo>bar</foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true

Case is preserved for tag names and attributes

iex> input = ~S'<Foo A="1"><BAR>b</BAR></Foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true

Attribute order is preserved

iex> input = ~S'<foo b="b" a="a"><bar d="d" c="c">1</bar></foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true