Saxy v1.3.0 Saxy.SimpleForm

Provides functions to parse an XML document into a simple-form data structure.

Data structure

Simple form is a basic representation of the parsed XML document. It contains a root element, and all elements are in the following format:

element = {tag_name, attributes, content}
content = (element | binary | cdata)*

See the “Types” section for more information.
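As an illustration of the format above, the following minimal sketch walks an element recursively and concatenates its text. The module name SimpleFormText and the function text/1 are hypothetical helpers for this example, not part of Saxy; the sketch only assumes the three kinds of content item listed above (element, binary, cdata).

```elixir
defmodule SimpleFormText do
  # Hypothetical helper, not part of Saxy: returns all text inside an element.
  def text({_tag_name, _attributes, content}) when is_list(content) do
    content
    |> Enum.map(&node_text/1)
    |> IO.iodata_to_binary()
  end

  # A content item is a binary, a {:cdata, binary} wrapper, or a nested element.
  defp node_text(text) when is_binary(text), do: text
  defp node_text({:cdata, text}) when is_binary(text), do: text
  defp node_text({_tag_name, _attributes, _content} = element), do: text(element)
end

element = {"p", [{"class", "intro"}], ["Hello, ", {"b", [], ["world"]}, {:cdata, "!"}]}
IO.puts(SimpleFormText.text(element))
```

Because elements are 3-tuples and CData wrappers are 2-tuples, plain pattern matching is enough to tell the content items apart.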

Summary

Functions

parse_string(data, options \\ []) - Parses the given string into simple form.

Types

attributes()
attributes() :: [{name :: String.t(), value :: String.t()}]
content()
content() :: [String.t() | {:cdata, String.t()} | t()]
tag_name()
tag_name() :: String.t()
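The types above can be checked at runtime with a small recursive predicate. This is only a sketch: the module SimpleFormCheck, its local type names, and element?/1 are hypothetical and do not exist in Saxy. It assumes an element is a {tag_name, attributes, content} tuple, as described in the “Data structure” section.

```elixir
defmodule SimpleFormCheck do
  # Hypothetical helper, not part of Saxy. Local copies of the documented shapes.
  @type attributes :: [{name :: String.t(), value :: String.t()}]
  @type content :: [String.t() | {:cdata, String.t()} | element()]
  @type element :: {String.t(), attributes(), content()}

  # Returns true only when the term has the shape of a simple-form element.
  @spec element?(term()) :: boolean()
  def element?({tag_name, attributes, content})
      when is_binary(tag_name) and is_list(attributes) and is_list(content) do
    Enum.all?(attributes, &attribute?/1) and Enum.all?(content, &content_item?/1)
  end

  def element?(_other), do: false

  defp attribute?({name, value}), do: is_binary(name) and is_binary(value)
  defp attribute?(_other), do: false

  defp content_item?(text) when is_binary(text), do: true
  defp content_item?({:cdata, text}), do: is_binary(text)
  defp content_item?(other), do: element?(other)
end

IO.inspect(SimpleFormCheck.element?({"a", [{"href", "/"}], ["home"]}))
```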

Functions

parse_string(data, options \\ [])
parse_string(data :: binary(), options :: Keyword.t()) ::
  {:ok, t()} | {:error, exception :: Saxy.ParseError.t()}

Parses the given string into simple form.

Options

  • :expand_entity - specifies how external entity references should be handled. Three strategies are supported:

    • :keep - keeps the original binary, for example Orange &reg; is expanded to "Orange &reg;". This is the default strategy.
    • :skip - skips the original binary, for example Orange &reg; is expanded to "Orange ".
    • {mod, fun, args} - uses the result of applying the specified MFA.
  • :cdata_as_characters - true to return CData as characters, false to wrap CData as {:cdata, data}. Defaults to true.
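The options above might be combined as in the following sketch. MyEntities and its resolve/2 function are hypothetical names chosen for this example. The sketch also assumes that Saxy invokes the MFA with the entity name (without the leading & and trailing ;) prepended to args; that calling convention is not stated in this page and should be verified against the Saxy source. The parse call runs only when Saxy is available.

```elixir
defmodule MyEntities do
  # Hypothetical resolver. Assumption: it receives the entity name first,
  # followed by the extra args given in the {mod, fun, args} tuple.
  def resolve(name, table) do
    # Unknown entities are replaced with an empty string, similar to :skip.
    Map.get(table, name, "")
  end
end

xml = "<brand>Orange &reg;</brand>"

options = [
  expand_entity: {MyEntities, :resolve, [%{"reg" => "®"}]},
  cdata_as_characters: false
]

# Call into Saxy only when the dependency is loaded in this environment.
if Code.ensure_loaded?(Saxy.SimpleForm) do
  IO.inspect(apply(Saxy.SimpleForm, :parse_string, [xml, options]))
end
```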

Note that disabling :cdata_as_characters is recommended if the resulting simple form is meant to be re-encoded later. In the following example, the encoded document has different semantics from the original one.

iex> xml = "<foo><![CDATA[<greeting>Hello, world!</greeting>]]></foo>"
iex> {:ok, simple_form} = Saxy.SimpleForm.parse_string(xml, cdata_as_characters: true)
{:ok, {"foo", [], ["<greeting>Hello, world!</greeting>"]}}
iex> Saxy.encode!(simple_form)
"<foo><greeting>Hello, world!</greeting></foo>"
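To see why the wrapper matters, the sketch below unwraps every {:cdata, data} node in a simple form. CDataCollapse is a hypothetical helper, not part of Saxy, and the wrapped value is an assumed result of parsing with cdata_as_characters: false, based on the option description above. Once the wrapper is removed, nothing in the data distinguishes former CData from ordinary characters.

```elixir
defmodule CDataCollapse do
  # Hypothetical helper, not part of Saxy: unwraps every {:cdata, text} node.
  def collapse({tag_name, attributes, content}) do
    {tag_name, attributes, Enum.map(content, &collapse_item/1)}
  end

  defp collapse_item({:cdata, text}), do: text
  defp collapse_item(text) when is_binary(text), do: text
  defp collapse_item(element), do: collapse(element)
end

# Assumed shape when CData is kept wrapped (cdata_as_characters: false).
wrapped = {"foo", [], [{:cdata, "<greeting>Hello, world!</greeting>"}]}
IO.inspect(CDataCollapse.collapse(wrapped))
```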

Examples

Given this XML document:

iex> xml = """
...> <?xml version="1.0" encoding="utf-8" ?>
...> <menu>
...>   <movie url="https://www.imdb.com/title/tt0120338/" id="tt0120338">
...>     <name>Titanic</name>
...>     <characters>Jack &amp; Rose</characters>
...>   </movie>
...>   <movie url="https://www.imdb.com/title/tt0109830/" id="tt0109830">
...>     <name>Forest Gump</name>
...>     <characters>Forest &amp; Jenny</characters>
...>   </movie>
...> </menu>
...> """
iex> Saxy.SimpleForm.parse_string(xml)
{:ok,
 {"menu", [],
  [
    "\n  ",
    {
      "movie",
      [
        {"url", "https://www.imdb.com/title/tt0120338/"},
        {"id", "tt0120338"}
      ],
      [
        "\n    ",
        {"name", [], ["Titanic"]},
        "\n    ",
        {"characters", [], ["Jack & Rose"]},
        "\n  "
      ]
    },
    "\n  ",
    {
      "movie",
      [
        {"url", "https://www.imdb.com/title/tt0109830/"},
        {"id", "tt0109830"}
      ],
      [
        "\n    ",
        {"name", [], ["Forest Gump"]},
        "\n    ",
        {"characters", [], ["Forest & Jenny"]},
        "\n  "
      ]
    },
    "\n"
  ]}}
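A parsed result like the one above is plain tuples and lists, so it can be consumed with pattern matching. The following sketch pulls each movie into a map and skips the whitespace-only text nodes. MenuReader and movies/1 are hypothetical names for this example, not part of Saxy, and the input is a shortened copy of the simple form shown above.

```elixir
defmodule MenuReader do
  # Hypothetical helper, not part of Saxy.
  def movies({"menu", _attributes, content}) do
    # The pattern in the generator silently skips whitespace text nodes.
    for {"movie", attributes, children} <- content do
      %{
        id: attribute(attributes, "id"),
        name: child_text(children, "name"),
        characters: child_text(children, "characters")
      }
    end
  end

  defp attribute(attributes, name) do
    case List.keyfind(attributes, name, 0) do
      {^name, value} -> value
      nil -> nil
    end
  end

  # Returns the single text node of the first child with the given tag name.
  defp child_text(children, tag_name) do
    Enum.find_value(children, fn
      {^tag_name, _attributes, [text]} when is_binary(text) -> text
      _other -> nil
    end)
  end
end

menu =
  {"menu", [],
   [
     "\n  ",
     {"movie", [{"url", "https://www.imdb.com/title/tt0120338/"}, {"id", "tt0120338"}],
      ["\n    ", {"name", [], ["Titanic"]}, "\n    ", {"characters", [], ["Jack & Rose"]}, "\n  "]},
     "\n"
   ]}

IO.inspect(MenuReader.movies(menu))
```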