# `RustyXML.Native`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L1)

Low-level NIF bindings for XML parsing.

This module provides direct access to the Rust NIF functions. For normal use,
prefer the higher-level `RustyXML` module with its `~x` sigil support.

## Strategies

The module exposes three parsing strategies:

  * `parse/1` + `xpath_query/2` - Structural index with XPath (main path)
  * `streaming_*` - Stateful streaming parser for large files
  * `sax_parse/1` - SAX event parser

## Memory Efficiency

The structural index (`parse/1`) uses roughly 4x the input size in memory, versus roughly 600x for SweetXml.
Strings are stored as byte offsets into the original input, not copies.

## Scheduler Behaviour

NIFs that parse raw XML input run on the dirty CPU scheduler to avoid
blocking BEAM schedulers. Query NIFs on pre-parsed documents run on
normal schedulers for sub-millisecond lookups.

# `document_ref`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L54)

```elixir
@opaque document_ref()
```

Opaque reference to a parsed XML document (structural index)

# `parser_ref`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L57)

```elixir
@opaque parser_ref()
```

Opaque reference to a streaming parser

# `xml_event`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L60)

```elixir
@type xml_event() ::
  {:start_element, binary(), [{binary(), binary()}]}
  | {:end_element, binary()}
  | {:empty_element, binary(), [{binary(), binary()}]}
  | {:text, binary()}
  | {:cdata, binary()}
  | {:comment, binary()}
```

XML event from parser
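
Because every event is a tagged tuple, ordinary pattern matching is enough to consume an event list. The sketch below is illustrative only: `EventSummary` is a hypothetical helper, not part of RustyXML, and it assumes nothing beyond the tuple shapes in the `xml_event()` type above.

```elixir
defmodule EventSummary do
  # Hypothetical helper, not part of RustyXML. It relies only on the
  # tuple shapes listed in the xml_event() type.

  # Count events by their tag (the first tuple element).
  def count_by_kind(events) do
    Enum.frequencies_by(events, &elem(&1, 0))
  end

  # Names of all opened elements, in document order. Self-closing
  # elements arrive as :empty_element, so both kinds are matched.
  # Two-element events such as {:text, _} fail the pattern and are skipped.
  def element_names(events) do
    for {kind, name, _attrs} <- events,
        kind in [:start_element, :empty_element],
        do: name
  end
end

# Hand-built events, shaped like the xml_event() type.
events = [
  {:start_element, "root", []},
  {:empty_element, "item", [{"id", "1"}]},
  {:text, "hi"},
  {:end_element, "root"}
]

EventSummary.element_names(events)
#=> ["root", "item"]
```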

# `accumulator_feed`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L363)

```elixir
@spec accumulator_feed(reference(), binary()) :: :ok
```

Feed a chunk of data to the document accumulator.

# `accumulator_new`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L357)

```elixir
@spec accumulator_new() :: reference()
```

Create a new document accumulator for streaming SimpleForm parsing.

Returns an opaque accumulator reference.

# `accumulator_to_simple_form`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L371)

```elixir
@spec accumulator_to_simple_form(reference()) :: {:ok, tuple()} | {:error, binary()}
```

Validate, index, and convert accumulated data to SimpleForm.

Returns `{:ok, tree}` or `{:error, reason}`.
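
The three accumulator functions are meant to be used together: create, feed, convert. A minimal sketch of that flow follows. `ChunkedSimpleForm` is a hypothetical wrapper, not part of RustyXML, and the sketch assumes the accumulator may be fed any number of chunks before conversion.

```elixir
defmodule ChunkedSimpleForm do
  # Hypothetical wrapper, not part of RustyXML.
  alias RustyXML.Native

  # Feeds every chunk into a fresh accumulator, then validates and
  # converts once. Returns {:ok, tree} or {:error, reason}, exactly as
  # accumulator_to_simple_form/1 does.
  def from_chunks(chunks) do
    acc = Native.accumulator_new()

    Enum.each(chunks, fn chunk ->
      :ok = Native.accumulator_feed(acc, chunk)
    end)

    Native.accumulator_to_simple_form(acc)
  end
end
```

Calling `ChunkedSimpleForm.from_chunks(["<root>", "<item/>", "</root>"])` should then yield either an `{:ok, {name, attrs, children}}` tuple or an `{:error, reason}` tuple, per the spec above.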

# `get_root`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L195)

```elixir
@spec get_root(document_ref()) :: term() | nil
```

Get the root element of a parsed document.

Returns the root element as a tuple:
`{:element, name, attributes, children}`

## Examples

    doc = RustyXML.Native.parse(~s(<root attr="value"><child/></root>))
    RustyXML.Native.get_root(doc)
    #=> {:element, "root", [{"attr", "value"}], [...]}

# `get_rust_memory`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L453)

```elixir
@spec get_rust_memory() :: non_neg_integer()
```

Get current Rust heap allocation in bytes.

Requires `memory_tracking` Cargo feature. Returns `0` otherwise.

# `get_rust_memory_peak`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L460)

```elixir
@spec get_rust_memory_peak() :: non_neg_integer()
```

Get peak Rust heap allocation since last reset.

# `parse`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L89)

```elixir
@spec parse(binary()) :: document_ref()
```

Parse XML into a structural index document.

Runs on the dirty CPU scheduler since parse time scales with input size.

Returns an opaque document reference that can be used with `xpath_query/2`
and `get_root/1`. The document is cached and can be queried multiple times.

This is the primary parse function; it uses roughly 4x the input size in memory.

## Examples

    doc = RustyXML.Native.parse(~s(<root><item id="1"/></root>))
    RustyXML.Native.xpath_query(doc, "//item")

# `parse_and_xpath`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L171)

```elixir
@spec parse_and_xpath(binary(), binary()) :: term()
```

Parse XML and execute an XPath query in one call.

Runs on the dirty CPU scheduler since it parses raw XML input.

More efficient than `parse/1` + `xpath_query/2` for single queries
since it doesn't create a persistent document reference.

## Examples

    RustyXML.Native.parse_and_xpath("<root><item/></root>", "//item")

# `parse_and_xpath_text`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L179)

```elixir
@spec parse_and_xpath_text(binary(), binary()) :: [binary()] | term()
```

Parse and immediately query, returning text values for node sets.

Optimized path for `is_value: true` — avoids building element tuples.

# `parse_strict`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L107)

```elixir
@spec parse_strict(binary()) :: {:ok, document_ref()} | {:error, binary()}
```

Parse XML in strict mode, returning `{:ok, doc}` or `{:error, reason}`.

Runs on the dirty CPU scheduler since parse time scales with input size.

Returns `{:ok, document_ref}` on success, or `{:error, reason}` if the
document is not well-formed per XML 1.0 specification.

## Examples

    {:ok, doc} = RustyXML.Native.parse_strict("<root>valid</root>")

    {:error, reason} = RustyXML.Native.parse_strict("<1invalid/>")

# `parse_to_simple_form`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L345)

```elixir
@spec parse_to_simple_form(binary()) :: {:ok, tuple()} | {:error, binary()}
```

Parse XML directly into SimpleForm `{name, attrs, children}` tree.

Bypasses the SAX event pipeline — builds the tree in Rust from the
structural index, decoding entities as needed.

Returns `{:ok, tree}` or `{:error, reason}`.

# `reset_rust_memory_stats`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L469)

```elixir
@spec reset_rust_memory_stats() :: {non_neg_integer(), non_neg_integer()}
```

Reset memory tracking statistics.

Returns `{current_bytes, previous_peak_bytes}`.
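
One way to combine the memory functions is to reset the statistics, run some work, and then read the peak. The sketch below is an assumption-laden illustration: `RustMemoryProbe` is a hypothetical helper, not part of RustyXML, and it assumes the peak counter reads `0` when the `memory_tracking` feature is disabled, as documented for `get_rust_memory/0`.

```elixir
defmodule RustMemoryProbe do
  # Hypothetical helper, not part of RustyXML.
  alias RustyXML.Native

  # Runs `fun` and returns {result, peak_bytes}, where peak_bytes is the
  # peak Rust heap allocation recorded since the reset just before the call.
  # Assumption: without the `memory_tracking` Cargo feature this is 0.
  def measure(fun) when is_function(fun, 0) do
    {_current, _previous_peak} = Native.reset_rust_memory_stats()
    result = fun.()
    {result, Native.get_rust_memory_peak()}
  end
end
```

For example, `RustMemoryProbe.measure(fn -> RustyXML.Native.parse(xml) end)` would report the peak observed while building the structural index for `xml`.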

# `sax_parse`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L383)

```elixir
@spec sax_parse(binary()) :: [tuple()]
```

Parse XML and return SAX events.

Events are returned as tuples similar to Saxy's format.

# `sax_parse_saxy`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L397)

```elixir
@spec sax_parse_saxy(binary(), boolean()) :: [tuple()]
```

Parse XML and return SAX events in Saxy-compatible format.

Events are emitted directly in Saxy format:
- `{:start_element, {name, attrs}}`
- `{:end_element, name}`
- `{:characters, content}`
- `{:cdata, content}`

Comments and processing instructions (PIs) are skipped. Empty elements emit a start event followed by an end event.
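
Since the event list uses only the four shapes listed above, it can be folded with plain `Enum` functions. The following sketch uses a hypothetical `SaxyEventFold` module (not part of RustyXML or Saxy) and hand-built events, so it makes no assumption about the boolean argument of `sax_parse_saxy/2`.

```elixir
defmodule SaxyEventFold do
  # Hypothetical helper, not part of RustyXML or Saxy. It relies only on
  # the four event shapes listed in the documentation above.

  # Concatenates all character and CDATA content, in document order.
  def text(events) do
    events
    |> Enum.flat_map(fn
      {:characters, content} -> [content]
      {:cdata, content} -> [content]
      _other -> []
    end)
    |> IO.iodata_to_binary()
  end

  # Deepest element nesting level reached. Empty elements are documented
  # to emit start + end, so they are counted like any other element.
  def max_depth(events) do
    {_depth, deepest} =
      Enum.reduce(events, {0, 0}, fn
        {:start_element, {_name, _attrs}}, {depth, deepest} ->
          {depth + 1, max(depth + 1, deepest)}

        {:end_element, _name}, {depth, deepest} ->
          {depth - 1, deepest}

        _other, acc ->
          acc
      end)

    deepest
  end
end

# Hand-built events in the Saxy-compatible shape.
events = [
  {:start_element, {"a", []}},
  {:characters, "x"},
  {:start_element, {"b", []}},
  {:cdata, "y"},
  {:end_element, "b"},
  {:end_element, "a"}
]

SaxyEventFold.text(events)
#=> "xy"
SaxyEventFold.max_depth(events)
#=> 2
```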

# `streaming_available_elements`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L330)

```elixir
@spec streaming_available_elements(parser_ref()) ::
  non_neg_integer() | {:error, :mutex_poisoned}
```

Get number of available complete elements.

# `streaming_feed`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L283)

```elixir
@spec streaming_feed(parser_ref(), binary()) ::
  {non_neg_integer(), non_neg_integer()} | {:error, :mutex_poisoned}
```

Feed a chunk of XML data to the streaming parser.

Returns `{available_events, buffer_size}` on success, or
`{:error, :mutex_poisoned}` if the parser mutex is poisoned.

# `streaming_feed_sax`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L430)

```elixir
@spec streaming_feed_sax(reference(), binary(), boolean()) :: binary()
```

Feed a chunk and return SAX events as a compact binary.

When the tail buffer is empty (common case), the NIF tokenizes the BEAM
binary in-place (zero copy) and writes events directly into an OwnedBinary
on the BEAM heap — no intermediate Rust Vec allocation. Only the
unprocessed tail (~100 bytes) is saved between calls.

Format: sequence of `<<type::8, ...>>` where type 1=start, 2=end, 3=chars, 4=cdata.

# `streaming_finalize`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L300)

```elixir
@spec streaming_finalize(parser_ref()) :: [xml_event()] | {:error, :mutex_poisoned}
```

Finalize the streaming parser and get remaining events.

Returns `{:error, :mutex_poisoned}` if the parser mutex is poisoned.

# `streaming_finalize_sax`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L439)

```elixir
@spec streaming_finalize_sax(reference(), boolean()) :: binary()
```

Finalize the streaming SAX parser, processing any remaining bytes.

Returns final events as a compact binary (same format as `streaming_feed_sax/3`).

# `streaming_new`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L257)

```elixir
@spec streaming_new() :: parser_ref()
```

Create a new streaming XML parser.

The streaming parser processes XML in chunks with bounded memory usage.

## Examples

    parser = RustyXML.Native.streaming_new()
    RustyXML.Native.streaming_feed(parser, "<root>")
    RustyXML.Native.streaming_feed(parser, "<item/></root>")
    events = RustyXML.Native.streaming_take_events(parser, 100)

# `streaming_new_with_filter`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L273)

```elixir
@spec streaming_new_with_filter(binary()) :: parser_ref()
```

Create a streaming parser with a tag filter.

Only events for the specified tag name will be emitted.
Useful for extracting specific elements from large documents.

## Examples

    parser = RustyXML.Native.streaming_new_with_filter("item")
    RustyXML.Native.streaming_feed(parser, "<root><item/><other/></root>")
    # Only item events will be returned

# `streaming_sax_new`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L417)

```elixir
@spec streaming_sax_new() :: reference()
```

Create a new streaming SAX parser.

Returns an opaque parser reference for use with `streaming_feed_sax/3`.

# `streaming_status`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L310)

```elixir
@spec streaming_status(parser_ref()) ::
  {non_neg_integer(), non_neg_integer(), boolean()} | {:error, :mutex_poisoned}
```

Get streaming parser status.

Returns `{available_events, buffer_size, has_pending}` on success, or
`{:error, :mutex_poisoned}` if the parser mutex is poisoned.

# `streaming_take_elements`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L322)

```elixir
@spec streaming_take_elements(parser_ref(), non_neg_integer()) ::
  [binary()] | {:error, :mutex_poisoned}
```

Take up to `max` complete elements from the streaming parser.

Returns a list of XML binaries for complete elements. This is faster than
using events because the element strings are built in Rust without needing
reconstruction in Elixir.
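
A possible way to pull complete elements out of a chunked input is sketched below. `ItemExtractor` is a hypothetical wrapper, not part of RustyXML. The sketch assumes that a parser created with `streaming_new_with_filter/1` also restricts which elements `streaming_take_elements/2` returns, which the documentation does not state explicitly. Note the `is_integer/1` guard: `{:error, :mutex_poisoned}` is itself a two-element tuple, so an unguarded `{a, b}` pattern would match it.

```elixir
defmodule ItemExtractor do
  # Hypothetical wrapper, not part of RustyXML.
  alias RustyXML.Native

  # Returns {:ok, elements}, each element being an XML binary, or
  # {:error, :mutex_poisoned} from the first call that reports it.
  def extract(chunks, tag) do
    parser = Native.streaming_new_with_filter(tag)

    result =
      Enum.reduce_while(chunks, {:ok, []}, fn chunk, {:ok, batches} ->
        with {events, _size} when is_integer(events) <-
               Native.streaming_feed(parser, chunk),
             elements when is_list(elements) <- drain(parser) do
          {:cont, {:ok, [elements | batches]}}
        else
          {:error, :mutex_poisoned} = error -> {:halt, error}
        end
      end)

    case result do
      {:ok, batches} -> {:ok, batches |> Enum.reverse() |> Enum.concat()}
      error -> error
    end
  end

  # Takes every element that is currently complete, if any.
  defp drain(parser) do
    case Native.streaming_available_elements(parser) do
      0 -> []
      count when is_integer(count) -> Native.streaming_take_elements(parser, count)
      {:error, :mutex_poisoned} = error -> error
    end
  end
end
```

With this in place, `ItemExtractor.extract(["<root><item/>", "<other/></root>"], "item")` would be expected to return an `{:ok, list}` tuple of element binaries.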

# `streaming_take_events`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L292)

```elixir
@spec streaming_take_events(parser_ref(), non_neg_integer()) ::
  [xml_event()] | {:error, :mutex_poisoned}
```

Take up to `max` events from the streaming parser.

Returns `{:error, :mutex_poisoned}` if the parser mutex is poisoned.
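
The feed, take, and finalize calls can be chained into a single loop that stops at the first poisoned-mutex error. The sketch below uses a hypothetical `StreamingEvents` module, not part of RustyXML. It assumes that asking for `0` events returns an empty list, and that `streaming_finalize/1` returns whatever has not yet been taken.

```elixir
defmodule StreamingEvents do
  # Hypothetical wrapper, not part of RustyXML.
  alias RustyXML.Native

  # Feeds each chunk, takes the events reported as available after every
  # feed, and appends whatever finalize returns. Returns {:ok, events} or
  # the {:error, :mutex_poisoned} tuple from the first failing call.
  def collect(chunks) do
    parser = Native.streaming_new()

    fed =
      Enum.reduce_while(chunks, {:ok, []}, fn chunk, {:ok, batches} ->
        # The integer guard keeps {:error, :mutex_poisoned} from matching
        # the two-element success pattern.
        with {available, _size} when is_integer(available) <-
               Native.streaming_feed(parser, chunk),
             events when is_list(events) <-
               Native.streaming_take_events(parser, available) do
          {:cont, {:ok, [events | batches]}}
        else
          {:error, :mutex_poisoned} = error -> {:halt, error}
        end
      end)

    # Any non-matching value (the error tuple) is returned unchanged.
    with {:ok, batches} <- fed,
         rest when is_list(rest) <- Native.streaming_finalize(parser) do
      {:ok, Enum.concat(Enum.reverse([rest | batches]))}
    end
  end
end
```

Because `collect/1` accepts any enumerable of binaries, the chunks can come from a file stream, a socket, or a plain list such as `["<root>", "<item/></root>"]`.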

# `streaming_take_saxy_events`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L404)

```elixir
@spec streaming_take_saxy_events(reference(), non_neg_integer(), boolean()) ::
  [tuple()] | {:error, :mutex_poisoned}
```

Take events from streaming parser in Saxy-compatible format.

# `xpath_query`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L126)

```elixir
@spec xpath_query(document_ref(), binary()) :: term()
```

Execute an XPath query on a parsed document.

Returns the result based on the XPath expression:
  * Node-set queries return a list of element tuples
  * String queries return a string
  * Number queries return a float
  * Boolean queries return `true` or `false`

## Examples

    doc = RustyXML.Native.parse("<root><item>text</item></root>")
    RustyXML.Native.xpath_query(doc, "//item")
    #=> [{:element, "item", [], ["text"]}]
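
Because the return type depends on the XPath expression, callers that accept arbitrary expressions may want to branch on the result's shape. A minimal sketch, using a hypothetical `XPathResult` module that is not part of RustyXML and that assumes only the four result kinds listed above:

```elixir
defmodule XPathResult do
  # Hypothetical helper, not part of RustyXML. Tags a raw xpath_query/2
  # result with the XPath value kind implied by its Elixir type.
  def classify(result) when is_list(result), do: {:node_set, result}
  def classify(result) when is_binary(result), do: {:string, result}
  def classify(result) when is_float(result), do: {:number, result}
  def classify(result) when is_boolean(result), do: {:boolean, result}
  # Anything else is passed through, since the spec is term().
  def classify(other), do: {:unknown, other}
end

XPathResult.classify([{:element, "item", [], ["text"]}])
#=> {:node_set, [{:element, "item", [], ["text"]}]}
XPathResult.classify(2.0)
#=> {:number, 2.0}
```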

# `xpath_query_raw`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L143)

```elixir
@spec xpath_query_raw(document_ref(), binary()) :: [binary()] | term()
```

Execute an XPath query returning XML strings for node sets (fast path).

Instead of building nested Elixir tuples for each element, this returns
the serialized XML string for each node. Much faster for queries returning
many elements.

## Examples

    doc = RustyXML.Native.parse("<root><item>text</item></root>")
    RustyXML.Native.xpath_query_raw(doc, "//item")
    #=> ["<item>text</item>"]

# `xpath_string_value`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L231)

```elixir
@spec xpath_string_value(binary(), binary()) :: binary()
```

Execute XPath and return string value of result.

Runs on the dirty CPU scheduler since it parses raw XML input.
For node-sets, returns text content of first node.

## Examples

    RustyXML.Native.xpath_string_value("<root>hello</root>", "//root/text()")
    #=> "hello"

# `xpath_string_value_doc`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L237)

```elixir
@spec xpath_string_value_doc(document_ref(), binary()) :: binary()
```

Execute XPath on document reference and return string value.

# `xpath_text_list`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L155)

```elixir
@spec xpath_text_list(document_ref(), binary()) :: [binary()] | term()
```

Execute XPath query returning text values for node sets (optimized fast path).

Instead of building nested Elixir tuples for each element, returns the
concatenated text content of each node as a string. Much faster for the
common case where `is_value: true` (no `e` modifier).

For non-NodeSet results (numbers, strings, booleans), returns as-is.
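
When several text-only queries target the same input, parsing once and reusing the document reference avoids re-parsing. The sketch below is illustrative: `CatalogQueries`, the element names, and the XPath expressions are hypothetical, and no particular output is claimed.

```elixir
defmodule CatalogQueries do
  # Hypothetical example module, not part of RustyXML.
  alias RustyXML.Native

  # Parses once, then runs several queries against the same document
  # reference. The element names (book, title, author) are illustrative.
  def summary(xml) do
    doc = Native.parse(xml)

    %{
      # Node-set query: documented to return a list of text values.
      titles: Native.xpath_text_list(doc, "//book/title"),
      # String value of the result, taken from the same document.
      first_author: Native.xpath_string_value_doc(doc, "//book/author"),
      # Non-node-set result: documented to be returned as-is.
      book_count: Native.xpath_text_list(doc, "count(//book)")
    }
  end
end
```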

# `xpath_with_subspecs`
[🔗](https://github.com/jeffhuen/rustyxml/blob/v0.2.3/lib/rusty_xml/native.ex#L216)

```elixir
@spec xpath_with_subspecs(binary(), binary(), [{binary(), binary()}]) :: [map()]
```

Execute parent XPath and evaluate subspecs for each result node.

Runs on the dirty CPU scheduler since it parses raw XML input.

Returns a list of maps with each subspec evaluated relative to the parent nodes.

## Examples

    xml = "<items><item><id>1</id><name>A</name></item></items>"
    RustyXML.Native.xpath_with_subspecs(xml, "//item", [{"id", "./id/text()"}, {"name", "./name/text()"}])
    #=> [%{id: "1", name: "A"}]

---

*Consult [api-reference.md](api-reference.md) for complete listing*