Expath.Document (Expath v0.2.0)

Represents a parsed XML document stored as a Rust resource.

The Expath.Document struct is created by Expath.new/1 and holds a reference to a parsed XML document kept in Rust-managed memory. This enables the parse-once, query-many pattern: the expensive parsing step runs once, and any number of XPath queries can then run against the same parsed document without re-parsing.

Key Features

  • Memory Efficient: XML stored as optimized Rust data structures
  • Thread Safe: Can be safely passed between Elixir processes
  • Automatic Cleanup: Resource automatically freed by Erlang GC
  • High Performance: No re-parsing overhead for multiple queries

Usage Pattern

# Parse XML once into a Document resource
{:ok, doc} = Expath.new(xml_string)

# Query multiple times efficiently
{:ok, titles} = Expath.query(doc, "//title/text()")
{:ok, authors} = Expath.query(doc, "//author/text()")
{:ok, [count]} = Expath.query(doc, "count(//book)")

# The document is freed by the garbage collector once nothing references it
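
The snippet above pattern-matches on {:ok, ...}, which raises a MatchError if parsing or a query fails. A minimal sketch of a non-raising variant follows. BookSummary and summarize/1 are hypothetical names, and the {:error, reason} return shape is an assumption that should be checked against the Expath.new/1 and Expath.query/2 docs.

```elixir
defmodule BookSummary do
  # Hypothetical helper, not part of Expath.
  # Assumption: Expath.new/1 and Expath.query/2 return {:error, reason}
  # on failure; the exact error shape is not confirmed by this page.
  def summarize(xml_string) do
    with {:ok, doc} <- Expath.new(xml_string),
         {:ok, titles} <- Expath.query(doc, "//title/text()"),
         {:ok, authors} <- Expath.query(doc, "//author/text()") do
      {:ok, %{titles: titles, authors: authors}}
    end

    # Any value that does not match {:ok, _} is returned unchanged by `with`,
    # so the first failure is handed straight back to the caller.
  end
end
```

Because `with` stops at the first clause that does not match, the document is still parsed only once and later queries are skipped after a failure.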

Performance Benefits

This approach is particularly beneficial when:

  • Running multiple XPath queries on the same document
  • Working with large XML documents (>1KB)
  • Processing documents in loops or concurrent operations
  • Building applications that require high XML processing throughput

Memory Management

Document resources are automatically managed:

  • Created in Rust heap memory for efficiency
  • Tracked by Erlang's garbage collector
  • Automatically freed when no longer referenced
  • Safe to pass between processes and store in ETS/GenServer state
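
Since a document can live in GenServer state and be passed between processes, one possible arrangement is a small process that owns the parsed document and hands the reference to callers. The sketch below is illustrative only; DocumentServer and its functions are hypothetical names, not part of Expath.

```elixir
defmodule DocumentServer do
  use GenServer

  # Hypothetical wrapper, not part of Expath: keeps one parsed document
  # alive for as long as the server process runs.

  # `doc` is expected to be the document returned by Expath.new/1.
  def start_link(doc), do: GenServer.start_link(__MODULE__, doc)

  # Returns the stored document so the caller can query it in its own process.
  def document(server), do: GenServer.call(server, :document)

  @impl true
  def init(doc), do: {:ok, doc}

  @impl true
  def handle_call(:document, _from, doc), do: {:reply, doc, doc}
end
```

A caller would start the server with the result of Expath.new/1, fetch the document with DocumentServer.document/1, and pass it to Expath.query/2. Returning the reference instead of running queries inside the server keeps queries from being serialized through a single process.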

Example: RSS Feed Processing

defmodule RSSProcessor do
  def extract_articles(rss_xml) do
    {:ok, doc} = Expath.new(rss_xml)

    # Multiple queries on same parsed document
    {:ok, titles} = Expath.query(doc, "//item/title/text()")
    {:ok, links} = Expath.query(doc, "//item/link/text()")
    {:ok, descriptions} = Expath.query(doc, "//item/description/text()")

    # Combine results by position; this assumes every item has exactly
    # one title, one link, and one description
    Enum.zip([titles, links, descriptions])
    |> Enum.map(fn {title, link, desc} ->
      %{title: title, link: link, description: desc}
    end)
  end
end
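
Enum.zip/1 pairs values by position and stops at the shortest list, so an item without a description would shift later descriptions onto the wrong article. A minimal sketch of a guard against that follows; ZipCheck and combine/3 are hypothetical names, not part of Expath.

```elixir
defmodule ZipCheck do
  # Hypothetical helper, not part of Expath: zips the three result lists only
  # when they have the same length, so a missing element cannot silently
  # shift values onto the wrong article.
  def combine(titles, links, descriptions) do
    lengths = Enum.map([titles, links, descriptions], &length/1)

    case Enum.uniq(lengths) do
      [_same_length] ->
        articles =
          [titles, links, descriptions]
          |> Enum.zip()
          |> Enum.map(fn {title, link, desc} ->
            %{title: title, link: link, description: desc}
          end)

        {:ok, articles}

      _mismatch ->
        # Report the three lengths so the caller can see which list is short.
        {:error, {:length_mismatch, lengths}}
    end
  end
end
```

Equal lengths do not prove that the values line up item by item, so this is only a cheap sanity check, not a full validation.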

Concurrent Usage

Document resources are thread-safe and can be shared across processes:

# Parse once
{:ok, doc} = Expath.new(large_xml)

# Query concurrently, one task per XPath expression
tasks = for xpath <- xpath_expressions do
  Task.async(fn -> Expath.query(doc, xpath) end)
end

# Task.await_many/1 uses a default timeout of 5 seconds and exits if it is exceeded
results = Task.await_many(tasks)
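
When the list of expressions is long, spawning one task per expression may start more work than there are schedulers. A sketch using Task.async_stream/3 to bound concurrency and set an explicit timeout follows. ConcurrentQueries and run/2 are hypothetical names, and the 10-second timeout is an arbitrary illustrative value.

```elixir
defmodule ConcurrentQueries do
  # Hypothetical helper, not part of Expath.
  # Runs each XPath expression against the same parsed document, with at most
  # one in-flight query per online scheduler.
  def run(doc, xpath_expressions) do
    xpath_expressions
    |> Task.async_stream(
      fn xpath -> {xpath, Expath.query(doc, xpath)} end,
      max_concurrency: System.schedulers_online(),
      timeout: 10_000,
      on_timeout: :kill_task
    )
    |> Enum.map(fn
      # Task finished: keep the expression next to its query result.
      {:ok, {xpath, result}} -> {xpath, result}
      # Task timed out or crashed: surface the exit reason instead of exiting.
      {:exit, reason} -> {:error, {:task_exit, reason}}
    end)
  end
end
```

Results come back in the same order as the input expressions, which is the default behaviour of Task.async_stream/3.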

Summary

Types

t()

@type t() :: %Expath.Document{resource: reference()}