SimpleXml.XmlNode (simple_xml v1.3.1)
A simplistic XML node representation built on the saxy library. It avoids xmerl-based libraries, which are vulnerable because they create a new atom for each tag in the XML document (atoms are never garbage collected, so untrusted input can exhaust the atom table).
For simplicity, this module ignores namespaces within the document.
Summary
Functions
Obtains the value of the given attribute.
This function attempts to mimic the functionality of :xmerl_c14n.c14n().
Returns the children of the given node. To get a filtered list of children, see children/2.
Returns all children that match the given child_name filter. Filtering matches that of first_child/2. A string child tag name or a regex can be supplied for filtering.
Removes all children that match the given name. Semantics of the child_name parameter follow those of the first_child/2 function.
Obtains the first direct child of the given node with the given string tag name via case-insensitive match.
Obtains any namespace declaration attribute associated with the given node.
Returns true if the given attribute name is one that's reserved for namespaces; false otherwise.
Obtains text within the body of a tag.
Exports the given node, its attributes, and its descendants into an XML string.
Types
@type attributes_reduction() :: {[SimpleXml.xml_attribute()], map(), map()}
@type xml_attribute() :: SimpleXml.xml_attribute()
@type xml_node() :: SimpleXml.xml_node()
Functions
Obtains the value of the given attribute.
Examples
Obtains the value for an attribute
iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" b="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "a")
{:ok, "1"}
Returns the first matching attribute it finds
iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" a="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "a")
{:ok, "1"}
Generates an error when the attribute is missing
iex> {:ok, node} = SimpleXml.parse(~S'<foo a="1" b="2"></foo>')
iex> SimpleXml.XmlNode.attribute(node, "c")
{:error, {:attribute_not_found, "c"}}
@spec canonicalize( xml_node() | [xml_node()] | String.t(), keyword() ) :: xml_node() | [xml_node()]
This function attempts to mimic the :xmerl_c14n.c14n() functionality by:
- placing namespace attributes first
- sorting attributes alphabetically
- only applying namespace attributes to nodes where they're first used
- applying unused named namespace attributes to the descendants where they're first used
Ultimately, :xmerl_c14n.c14n() is an implementation of the W3C XML canonicalization (C14N) specification.
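The first two rules in the list above (namespace attributes first, then alphabetical order) can be sketched in isolation. This is a minimal illustration, not the library's implementation: the module AttrOrderSketch and its functions are hypothetical, attributes are assumed to be {name, value} string tuples as shown elsewhere on this page, and sorting is assumed to be plain byte-wise ordering on the attribute name.

```elixir
defmodule AttrOrderSketch do
  # Hypothetical helper, not part of SimpleXml.
  # An attribute name is a namespace declaration if it is "xmlns" or "xmlns:<prefix>".
  def namespace_attribute?("xmlns"), do: true
  def namespace_attribute?("xmlns:" <> _prefix), do: true
  def namespace_attribute?(_name), do: false

  # Places namespace declarations first, then sorts each group.
  # Elixir compares binaries byte by byte, so "B" sorts before "a".
  def sort_attributes(attributes) do
    {namespaces, others} =
      Enum.split_with(attributes, fn {name, _value} -> namespace_attribute?(name) end)

    Enum.sort(namespaces) ++ Enum.sort(others)
  end
end

AttrOrderSketch.sort_attributes([{"a", "1"}, {"B", "2"}, {"xmlns", "a"}])
```

The sketch deliberately leaves out the namespace-relocation rules, which need knowledge of the whole tree rather than a single attribute list.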
Examples
Nodes with no attributes are unaffected
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo><bar>1</bar><bar>2</bar></foo>'
iex> expected_output = ~S'<foo><bar>1</bar><bar>2</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Attributes are sorted alphabetically, with namespaces appearing first
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a" a="1" B="2"><bar xmlns="b" b="1" a="2">1</bar><bar xmlns="c">2</bar></foo>'
iex> expected_output = ~S'<foo xmlns="a" B="2" a="1"><bar xmlns="b" a="2" b="1">1</bar><bar xmlns="c">2</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Unused named namespaces are dropped
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"></foo>'
iex> expected_output = ~S'<foo></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Default namespaces are kept
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a"></foo>'
iex> expected_output = ~S'<foo xmlns="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Parent named namespace is preserved when it's used by the parent
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Child named namespace is dropped when the parent declares and uses the same namespace
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a"><a:bar xmlns:a="B">1</a:bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><a:bar>1</a:bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Parent unused named namespaces are applied to namespaced children, but not namespaced grandchildren
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><a:bar>1</a:bar><a:bar><a:baz>2</a:baz></a:bar></foo>'
iex> expected_output = ~S'<foo><a:bar xmlns:a="a">1</a:bar><a:bar xmlns:a="a"><a:baz>2</a:baz></a:bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Parent's unused named namespace takes precedence over the child's named namespace declaration
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><a:bar xmlns:a="b">1</a:bar><a:bar>2</a:bar></foo>'
iex> expected_output = ~S'<foo><a:bar xmlns:a="a">1</a:bar><a:bar xmlns:a="a">2</a:bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Attribute namespaces are preserved
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a" a:type="apple"><bar>1</bar></foo>'
iex> expected_output = ~S'<foo xmlns:a="a" a:type="apple"><bar>1</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Attribute namespaces within namespaced nodes are preserved
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b" b:type="apple"><bar>1</bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a" xmlns:b="b" b:type="apple"><bar>1</bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Unused attribute namespaces in the parent are applied to child nodes
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"><bar a:type="apple">1</bar></foo>'
iex> expected_output = ~S'<foo><bar xmlns:a="a" a:type="apple">1</bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Unused attribute namespaces in the namespaced parent are applied to child nodes
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b"><bar b:type="apple">1</bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><bar xmlns:b="b" b:type="apple">1</bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Unused attribute namespaces in the namespaced parent are applied to child nodes, but not grandchildren
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="a" xmlns:b="b"><bar b:type="apple"><baz b:type="food">2</baz></bar></a:foo>'
iex> expected_output = ~S'<a:foo xmlns:a="a"><bar xmlns:b="b" b:type="apple"><baz b:type="food">2</baz></bar></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n() |> to_string()
iex> output
xmerl_output
Inclusive namespaces are preserved
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a" xmlns:b="b"><bar b:type="apple"><baz b:type="food">2</baz></bar></foo>'
iex> expected_output = ~S'<foo xmlns:a="a"><bar xmlns:b="b" b:type="apple"><baz b:type="food">2</baz></bar></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> output = XmlNode.canonicalize(root, inclusive_namespaces: ["a"]) |> XmlNode.to_string()
iex> output
expected_output
iex> {doc, _} = input |> :binary.bin_to_list() |> :xmerl_scan.string()
iex> xmerl_output = doc |> :xmerl_c14n.c14n(false, [~c"a"]) |> to_string()
iex> output
xmerl_output
@spec children(xml_node()) :: {:ok, [String.t() | xml_node()]} | {:error, {:no_children_found, [String.t() | xml_node()]}}
Returns the children of the given node. To get a filtered list of children, see children/2.
Examples
Returns all children
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node)
{:ok, [{"bar", [], ["1"]}, {"baz", [], ["2"]}]}
Returns an error if the node doesn't contain a child
iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.children(node)
{:error, {:no_children_found, ["bar"]}}
Returns all children that match the given child_name filter. Filtering matches that of first_child/2. A string child tag name or a regex can be supplied for filtering.
Examples
Returns all children by a given string name
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node, "bar")
[{"bar", [], ["1"]}]
Returns all children by a given Regex
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.children(node, ~r/BA/i)
[{"bar", [], ["1"]}, {"baz", [], ["2"]}]
Returns an empty list if there are no children
iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.children(node, "bar")
[]
Removes all children that match the given name. Semantics of the child_name parameter follow those of the first_child/2 function.
Examples
All matching children are removed based on a string child name
iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.drop_children(node, "*:Bar")
{"ns:foo", [], []}
All matching children are removed based on a Regex child name
iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:BAR>2</xs:BAR></ns:foo>')
iex> SimpleXml.XmlNode.drop_children(node, ~r/bar/)
{"ns:foo", [], [{"xs:BAR", [], ["2"]}]}
Obtains the first direct child of the given node with the given string tag name via case-insensitive match.
Use a *: prefix on the tag name to ignore any namespace prefix on the tag.
Alternatively, you can supply a Regex to pattern match the child name. When a Regex is supplied, its case sensitivity is respected.
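The matching rules described above can be sketched as a standalone predicate. This is an illustration under assumptions, not the library's code: the module NameMatchSketch and its matches?/2 function are hypothetical, and the sketch assumes that a plain string must equal the full tag name (prefix included) and that a *: pattern also matches tags without a prefix.

```elixir
defmodule NameMatchSketch do
  # Hypothetical helper, not part of SimpleXml.

  # A Regex is applied as given, so its own case sensitivity is respected.
  def matches?(tag, %Regex{} = pattern), do: Regex.match?(pattern, tag)

  # "*:name" ignores the namespace prefix and compares local names, ignoring case.
  def matches?(tag, "*:" <> local) do
    String.downcase(local_name(tag)) == String.downcase(local)
  end

  # A plain string is compared against the whole tag name, ignoring case.
  def matches?(tag, name) when is_binary(name) do
    String.downcase(tag) == String.downcase(name)
  end

  # Returns the part of the tag after the last ":" (the whole tag if there is none).
  defp local_name(tag) do
    tag |> String.split(":") |> List.last()
  end
end

NameMatchSketch.matches?("xs:bar", "*:Bar")
```

A predicate of this shape can be handed to Enum.find/2 or Enum.filter/2 over a node's children to get first-match or all-matches behavior.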
Examples
Obtains the first child by the given name
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><baz>2</baz></foo>')
iex> SimpleXml.XmlNode.first_child(node, "bar")
{:ok, {"bar", [], ["1"]}}
Returns the first matching node it finds
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><bar>2</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "bar")
{:ok, {"bar", [], ["1"]}}
Ignores case when matching tag name
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar><bar>2</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "BAR")
{:ok, {"bar", [], ["1"]}}
Wildcard ignores tag namespace
iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.first_child(node, "*:Bar")
{:ok, {"xs:bar", [], ["1"]}}
Use Regex to find a child
iex> {:ok, node} = SimpleXml.parse(~S'<ns:foo><xs:bar>1</xs:bar><xs:bar>2</xs:bar></ns:foo>')
iex> SimpleXml.XmlNode.first_child(node, ~r/.*:BAR/i)
{:ok, {"xs:bar", [], ["1"]}}
Generates an error when there's no child with the given name
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar></foo>')
iex> SimpleXml.XmlNode.first_child(node, "baz")
{:error, {:child_not_found, [child_name: "baz", actual_children: [{"bar", [], ["1"]}]]}}
Generates an error when there are no children
iex> {:ok, node} = SimpleXml.parse(~S'<foo></foo>')
iex> SimpleXml.XmlNode.first_child(node, "baz")
{:error, {:child_not_found, [child_name: "baz", actual_children: []]}}
@spec namespace_attribute(xml_node()) :: {:ok, xml_attribute()} | {:error, any()}
Obtains any namespace declaration attribute associated with the given node.
Examples
Returns the default namespace, if it's applicable
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:ok, {"xmlns", "a"}}
Returns the named namespace, if it's applicable
iex> alias SimpleXml.XmlNode
iex> input = ~S'<a:foo xmlns:a="b"></a:foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:ok, {"xmlns:a", "b"}}
Returns an error if there's no namespace
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:error, :namesapce_attribute_not_found}
Returns an error if there's no applicable namespace
iex> alias SimpleXml.XmlNode
iex> input = ~S'<foo xmlns:a="a"></foo>'
iex> {:ok, root} = SimpleXml.parse(input)
iex> XmlNode.namespace_attribute(root)
{:error, :namesapce_attribute_not_found}
Returns true if the given attribute name is one that's reserved for namespaces; false otherwise.
Examples
Returns true for xmlns
iex> SimpleXml.XmlNode.namespace_attribute?("xmlns")
true
Returns true for xmlns:*
iex> SimpleXml.XmlNode.namespace_attribute?("xmlns:a")
true
Returns false for non-namespace attributes
iex> SimpleXml.XmlNode.namespace_attribute?("foo")
false
Obtains text within the body of a tag.
Examples
Obtains the text contents of a tag
iex> {:ok, node} = SimpleXml.parse(~S'<foo>bar</foo>')
iex> SimpleXml.XmlNode.text(node)
{:ok, "bar"}
Generates an error when the tag contains no text
iex> {:ok, node} = SimpleXml.parse(~S'<foo><bar>1</bar></foo>')
iex> SimpleXml.XmlNode.text(node)
{:error, {:text_not_found, [{"bar", [], ["1"]}]}}
@spec to_string(xml_node() | [xml_node()] | [xml_attribute()] | String.t()) :: String.t()
Exports the given node, its attributes, and its descendants into an XML string.
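To make the node shape concrete, the export can be sketched as a recursive walk over the {tag, attributes, children} tuples that appear in the examples on this page. This is a minimal sketch, not the library's implementation: the module ToStringSketch and its render/1 function are hypothetical, and escaping of special characters is omitted.

```elixir
defmodule ToStringSketch do
  # Hypothetical helper, not part of SimpleXml.

  # Text children are emitted as-is (no escaping in this sketch).
  def render(text) when is_binary(text), do: text

  # Element nodes are {tag, attributes, children}; attributes keep their given order.
  def render({tag, attributes, children}) do
    attrs = Enum.map_join(attributes, fn {name, value} -> ~s( #{name}="#{value}") end)
    "<#{tag}#{attrs}>" <> Enum.map_join(children, &render/1) <> "</#{tag}>"
  end
end

ToStringSketch.render({"foo", [{"b", "b"}, {"a", "a"}], [{"bar", [], ["1"]}]})
```

Because the walk never reorders attributes or changes case, it preserves both in its output.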
Examples
XML can be exported to string
iex> input = ~S'<foo>bar</foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true
Case is preserved for tag names and attributes
iex> input = ~S'<Foo A="1"><BAR>b</BAR></Foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true
Attribute order is preserved
iex> input = ~S'<foo b="b" a="a"><bar d="d" c="c">1</bar></foo>'
iex> {:ok, node} = SimpleXml.parse(input)
iex> SimpleXml.XmlNode.to_string(node) == input
true