Ultra-fast XML parsing for Elixir with full XPath 1.0 support.
RustyXML is a high-performance XML parser built from scratch as a Rust NIF
with SIMD acceleration. It achieves 100% W3C/OASIS XML Conformance
(1089/1089 test cases) and provides a drop-in replacement for SweetXml with
the familiar ~x sigil syntax.
Quick Start
import RustyXML
xml = ~s(<root><item id="1">Hello</item><item id="2">World</item></root>)
# Get a list of items
xpath(xml, ~x"//item"l)
#=> [{:element, "item", ...}, {:element, "item", ...}]
# Get text content as string
xpath(xml, ~x"//item/text()"s)
#=> "Hello"
# Map multiple values
xmap(xml, items: ~x"//item"l, count: ~x"count(//item)"i)
#=> %{items: [...], count: 2}
Sigil Modifiers
The ~x sigil supports modifiers for result transformation:
- e - Return entity (element) for chaining, not text value
- s - Return as string (binary)
- S - Soft string (empty string on error)
- l - Return as list
- o - Optional (return nil instead of raising on missing)
- i - Cast to integer
- I - Soft integer (0 on error)
- f - Cast to float
- F - Soft float (0.0 on error)
- k - Return as keyword list
XPath 1.0 Functions
RustyXML supports all 27+ XPath 1.0 functions including:
- Node: position(), last(), count(), local-name(), namespace-uri(), name()
- String: string(), concat(), starts-with(), contains(), substring(), etc.
- Boolean: boolean(), not(), true(), false(), lang()
- Number: number(), sum(), floor(), ceiling(), round()
Streaming
For large files, use the streaming API:
"large.xml"
|> RustyXML.stream_tags(:item)
|> Stream.each(&process_item/1)
|> Stream.run()
Types
@type document() :: RustyXML.Native.document_ref()
@type handler() :: module()
@type parse_options() :: [parse_option()]
Functions
@spec add_namespace(RustyXML.SweetXpath.t(), binary(), binary()) :: RustyXML.SweetXpath.t()
Add a namespace binding to an XPath expression.
Returns a new %SweetXpath{} with the namespace added.
Examples
xpath_with_ns = add_namespace(~x"//ns:item"l, "ns", "http://example.com/ns")
RustyXML.xpath(xml, xpath_with_ns)
Encode an XML element tree to a string.
Drop-in replacement for Saxy.encode!/2.
Examples
import RustyXML.XML
element("root", [], ["text"]) |> RustyXML.encode!()
#=> "<root>text</root>"
Encode an XML element tree to iodata.
Drop-in replacement for Saxy.encode_to_iodata!/2.
Parse an XML document.
By default, RustyXML uses strict mode to match SweetXml/xmerl behavior.
Malformed XML raises RustyXML.ParseError.
Returns an opaque document reference that can be used with xpath/2,3
for multiple queries on the same document.
Options
- :lenient - If true, accept malformed XML without raising. Useful for processing third-party or legacy XML. Default: false.
Examples
# Strict mode (default) - matches SweetXml behavior
doc = RustyXML.parse("<root><item/></root>")
RustyXML.xpath(doc, ~x"//item"l)
# Raises on malformed XML (like SweetXml)
RustyXML.parse("<1invalid/>")
#=> ** (RustyXML.ParseError) Invalid element name...
# Lenient mode - accepts malformed XML
doc = RustyXML.parse("<1invalid/>", lenient: true)
Parse an XML document, returning {:ok, doc} or {:error, reason}.
Unlike parse/2, this function returns a tuple instead of raising,
allowing pattern matching on parse results.
Examples
{:ok, doc} = RustyXML.parse_document("<root/>")
{:error, reason} = RustyXML.parse_document("<1invalid/>")
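Because the result is a tagged tuple, callers can branch on it with ordinary pattern matching instead of rescuing an exception. The sketch below is illustrative only: ParseReport and describe/1 are hypothetical names, and the result tuples are stubbed so the snippet runs without the library. The shape of reason is not specified here, so it is only passed through inspect/1.

```elixir
defmodule ParseReport do
  # Hypothetical helper: turns a parse_document-style result into a message.
  def describe(result) do
    case result do
      {:ok, _doc} -> "parsed successfully"
      {:error, reason} -> "parse failed: " <> inspect(reason)
    end
  end
end

# Stubbed tuples standing in for RustyXML.parse_document/1 return values.
IO.puts(ParseReport.describe({:ok, :stub_document}))
IO.puts(ParseReport.describe({:error, :stub_reason}))
```

With the library loaded, the same helper would presumably be called as ParseReport.describe(RustyXML.parse_document(xml)).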
@spec parse_stream(Enumerable.t(), handler(), any(), parse_options()) :: {:ok, any()} | {:halt, any()} | {:error, any()}
Parse an XML stream with a SAX event handler.
Drop-in replacement for Saxy.parse_stream/4.
Accepts any Enumerable that yields binary chunks (e.g. File.stream!/3).
Uses bounded memory via zero-copy tokenization and direct BEAM binary
encoding: when the internal buffer is empty (common case), the NIF tokenizes
the BEAM binary in-place without copying. Events are written directly into an
OwnedBinary on the BEAM heap — no intermediate Rust Vec allocation. Elixir
then decodes one event at a time via binary pattern matching, so only one
event tuple is ever live on the heap. Combined NIF + BEAM peak is ~128 KB
for a 2.93 MB document, comparable to Saxy while running ~1.8x faster.
Examples
File.stream!("large.xml", [], 64 * 1024)
|> RustyXML.parse_stream(MyHandler, initial_state)
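Since any Enumerable of binary chunks is accepted, an in-memory document can be fed through the same path as a file, which may be handy when testing a handler. The following is a minimal sketch of one way to build such an Enumerable; Chunker and chunks/2 are hypothetical names that are not part of RustyXML, and the parse_stream call is left as a comment so the snippet runs on its own.

```elixir
defmodule Chunker do
  # Lazily split a binary into pieces of at most `size` bytes.
  # Splits fall on byte boundaries, just as fixed-size file reads do.
  def chunks(binary, size) when is_binary(binary) and is_integer(size) and size > 0 do
    Stream.unfold(binary, fn
      # Nothing left: end the stream.
      <<>> -> nil
      # At least `size` bytes remain: emit one full chunk.
      <<chunk::binary-size(size), rest::binary>> -> {chunk, rest}
      # Fewer than `size` bytes remain: emit the tail as the last chunk.
      tail -> {tail, <<>>}
    end)
  end
end

xml = "<root><a/></root>"
IO.inspect(Enum.to_list(Chunker.chunks(xml, 4)))

# Assumed usage with the library loaded and a handler module MyHandler defined:
#   xml |> Chunker.chunks(4) |> RustyXML.parse_stream(MyHandler, [])
```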
@spec parse_string(binary(), handler(), any(), parse_options()) :: {:ok, any()} | {:halt, any()} | {:error, any()}
Parse an XML string with a SAX event handler.
Drop-in replacement for Saxy.parse_string/4.
The handler module must implement RustyXML.Handler (same callback as
Saxy.Handler). Events are dispatched in document order.
Options
- :cdata_as_characters - Emit CDATA as :characters events (default: false)
- :expand_entity - Accepted for Saxy API compatibility (default: :keep)
Examples
defmodule MyHandler do
@behaviour RustyXML.Handler
def handle_event(:start_element, {name, _attrs}, acc), do: {:ok, [name | acc]}
def handle_event(_, _, acc), do: {:ok, acc}
end
{:ok, names} = RustyXML.parse_string("<root><a/><b/></root>", MyHandler, [])
#=> {:ok, ["b", "a", "root"]}
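A handler is an ordinary module, so its callback can be exercised without the parser by folding a list of events through it. The sketch below counts elements by name. TagCounter is a hypothetical module, and the event list is hand-written: the :start_element shape comes from the example above, while the :end_element shape follows the Saxy convention and is an assumption for RustyXML. The @behaviour line is omitted so the snippet compiles without the library.

```elixir
defmodule TagCounter do
  # With RustyXML loaded, this module would also declare:
  #   @behaviour RustyXML.Handler

  # Bump the counter for every opening tag.
  def handle_event(:start_element, {name, _attrs}, counts) do
    {:ok, Map.update(counts, name, 1, &(&1 + 1))}
  end

  # Ignore every other event and pass the state through unchanged.
  def handle_event(_event, _data, counts), do: {:ok, counts}
end

# Hand-written events approximating <root><a/><b/><a/></root>.
# The :end_element shape is assumed (Saxy convention).
events = [
  {:start_element, {"root", []}},
  {:start_element, {"a", []}},
  {:end_element, "a"},
  {:start_element, {"b", []}},
  {:end_element, "b"},
  {:start_element, {"a", []}},
  {:end_element, "a"},
  {:end_element, "root"}
]

# Thread the state through the handler, as a SAX driver would.
counts =
  Enum.reduce(events, %{}, fn {event, data}, state ->
    {:ok, next_state} = TagCounter.handle_event(event, data, state)
    next_state
  end)

IO.inspect(counts)
```

With the library loaded, the same module would presumably be passed directly, as in RustyXML.parse_string(xml, TagCounter, %{}).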
Get the root element of a parsed document.
Examples
doc = RustyXML.parse("<root><child/></root>")
RustyXML.root(doc)
#=> {:element, "root", [], [...]}
The ~x sigil for XPath expressions.
Creates a %SweetXpath{} struct with the specified path and modifiers.
Modifiers
- e - Return entity (element) for chaining
- s - Return as string
- S - Soft string (empty on error)
- l - Return as list
- o - Optional (nil on missing)
- i - Cast to integer
- I - Soft integer (0 on error)
- f - Cast to float
- F - Soft float (0.0 on error)
- k - Return as keyword list
Examples
import RustyXML
~x"//item"l # List of items
~x"//name/text()"s # String value
~x"count(//item)"i # Integer count
~x"//optional"so # Optional string
@spec stream_tags(binary() | Enumerable.t(), atom() | binary(), keyword()) :: Enumerable.t()
Stream XML events from a file.
Returns a Stream that yields events as the file is read.
Uses bounded memory regardless of file size.
Options
- :chunk_size - Bytes to read per IO operation (default: 64KB)
- :batch_size - Accepted for SweetXml API compatibility but has no effect. RustyXML's streaming parser yields complete elements directly from Rust as they are parsed, so there is no event batching step to tune.
- :discard - Accepted for SweetXml API compatibility but has no effect. RustyXML's streaming parser already operates in bounded memory (~128 KB combined NIF + BEAM peak for a 2.93 MB document) by only materializing one element at a time, so tag discarding for memory reduction is unnecessary.
Examples
"large.xml"
|> RustyXML.stream_tags(:item)
|> Stream.each(&process/1)
|> Stream.run()
@spec stream_tags!(binary() | Enumerable.t(), atom() | binary(), keyword()) :: Enumerable.t()
Stream XML events from a file. Raises on error.
Provided for SweetXml API compatibility. Behaves identically to
stream_tags/3, which already raises on read errors.
@spec transform_by(RustyXML.SweetXpath.t(), (term() -> term())) :: RustyXML.SweetXpath.t()
Add a transformation function to an XPath expression.
The function will be applied to the result after all other modifiers.
Examples
spec = transform_by(~x"//price/text()"s, &String.to_float/1)
RustyXML.xpath(xml, spec)
#=> 45.99
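One caveat with the example above: String.to_float/1 raises ArgumentError when the text has no decimal point (for instance "45"), so a more forgiving function can be useful for real-world data. The following is a minimal sketch of such a function built on Float.parse/1; PriceParser is a hypothetical module, not part of RustyXML, and the transform_by call is left as a comment so the snippet runs on its own.

```elixir
defmodule PriceParser do
  # Parse text such as "45.99" or "45" into a float.
  # Returns nil when the text does not start with a number.
  # Note: trailing text after the number (e.g. "45 USD") is ignored.
  def to_float(text) when is_binary(text) do
    case Float.parse(String.trim(text)) do
      {value, _rest} -> value
      :error -> nil
    end
  end
end

IO.inspect(PriceParser.to_float("45.99"))
IO.inspect(PriceParser.to_float("45"))
IO.inspect(PriceParser.to_float("n/a"))

# Assumed usage with the library loaded:
#   spec = transform_by(~x"//price/text()"s, &PriceParser.to_float/1)
```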
Execute multiple XPath queries and return as a map.
Options
The third argument is accepted for SweetXml API compatibility but
is not required. Use the k sigil modifier instead for keyword output.
Examples
xml = "<root><a>1</a><b>2</b></root>"
RustyXML.xmap(xml, [
a: ~x"//a/text()"s,
b: ~x"//b/text()"s
])
#=> %{a: "1", b: "2"}
@spec xpath(binary() | document(), RustyXML.SweetXpath.t() | binary()) :: term()
Execute an XPath query on XML.
The first argument can be either:
- A raw XML binary
- A parsed document reference from parse/1
The second argument can be:
- A %SweetXpath{} struct (from ~x sigil)
- A plain XPath string (binary)
Examples
# On raw XML
RustyXML.xpath("<root>text</root>", ~x"//root/text()"s)
#=> "text"
# On parsed document
doc = RustyXML.parse("<root><a/><b/></root>")
RustyXML.xpath(doc, ~x"//a"l)
Execute an XPath query with a mapping spec for nested extraction.
The third argument is a keyword list of {name, xpath_spec} pairs
that will be evaluated for each node in the parent result.
Examples
xml = ~s(<items><item id="1"><name>A</name></item><item id="2"><name>B</name></item></items>)
RustyXML.xpath(xml, ~x"//item"l, [
id: ~x"./@id"s,
name: ~x"./name/text()"s
])
#=> [%{id: "1", name: "A"}, %{id: "2", name: "B"}]