Expath.Document (Expath v0.2.0)
Represents a parsed XML document stored as a Rust resource.
The Expath.Document struct is created by Expath.new/1 and contains
a reference to a parsed XML document stored efficiently in Rust memory.
This enables the parse-once, query-many pattern where expensive XML
parsing is done once, and multiple XPath queries can be executed
against the same parsed document without re-parsing overhead.
Key Features
- Memory Efficient: XML stored as optimized Rust data structures
- Thread Safe: Can be safely passed between Elixir processes
- Automatic Cleanup: Resource automatically freed by Erlang GC
- High Performance: No re-parsing overhead for multiple queries
Usage Pattern
# Parse XML once into a Document resource
{:ok, doc} = Expath.new(xml_string)
# Query multiple times efficiently
{:ok, titles} = Expath.query(doc, "//title/text()")
{:ok, authors} = Expath.query(doc, "//author/text()")
{:ok, [count]} = Expath.query(doc, "count(//book)")
# The document is freed by the garbage collector once nothing references it
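The snippet above matches with `=`, which raises a `MatchError` if parsing or querying fails. A minimal sketch of a non-raising variant using `with` follows. The module `SafeTitles` and the injectable `parse_fun`/`query_fun` arguments are hypothetical and exist only so the sketch can be run with stubs; they default to the `Expath.new/1` and `Expath.query/2` calls shown above. No assumption is made about the shape of error values: anything that is not `{:ok, _}` is returned unchanged.

```elixir
defmodule SafeTitles do
  # Hypothetical helper, not part of Expath.
  # parse_fun and query_fun default to the documented Expath calls and can be
  # replaced with stubs when Expath is not available.
  def titles(xml, parse_fun \\ &Expath.new/1, query_fun \\ &Expath.query/2) do
    with {:ok, doc} <- parse_fun.(xml),
         {:ok, titles} <- query_fun.(doc, "//title/text()") do
      {:ok, titles}
    end

    # With no `else` clause, the first value that does not match {:ok, _}
    # is returned as-is to the caller.
  end
end
```

Callers can then pattern match on the result with `case` instead of rescuing an exception.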
Performance Benefits
This approach is particularly beneficial when:
- Running multiple XPath queries on the same document
- Working with large XML documents (>1KB)
- Processing documents in loops or concurrent operations
- Building applications that require high XML processing throughput
Memory Management
Document resources are automatically managed:
- Created in Rust heap memory for efficiency
- Tracked by Erlang's garbage collector
- Automatically freed when no longer referenced
- Safe to pass between processes and store in ETS/GenServer state
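As an illustration of the last point, here is a minimal sketch that keeps a parsed document in an `Agent`. The module `DocStore` and its injectable `parse_fun`/`query_fun` arguments are hypothetical, not part of Expath; they default to `Expath.new/1` and `Expath.query/2` and exist only so the sketch can run with stubs. It assumes, as stated above, that the document struct can be handed to another process.

```elixir
defmodule DocStore do
  use Agent

  # Hypothetical wrapper, not part of Expath.
  # Parses once and keeps the resulting document as the Agent's state.
  def start_link(xml, parse_fun \\ &Expath.new/1) do
    case parse_fun.(xml) do
      {:ok, doc} -> Agent.start_link(fn -> doc end)
      # Any non-{:ok, _} result is passed through unchanged.
      other -> other
    end
  end

  # Fetches the document from the Agent, then runs the query in the
  # calling process so slow queries do not block the Agent.
  def query(store, xpath, query_fun \\ &Expath.query/2) do
    doc = Agent.get(store, & &1)
    query_fun.(doc, xpath)
  end
end
```

Because the state is read out and queried by the caller, the Agent itself only serves short `get` requests. The document stays alive for as long as the Agent holds it.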
Example: RSS Feed Processing
defmodule RSSProcessor do
  def extract_articles(rss_xml) do
    {:ok, doc} = Expath.new(rss_xml)

    # Multiple queries on same parsed document
    {:ok, titles} = Expath.query(doc, "//item/title/text()")
    {:ok, links} = Expath.query(doc, "//item/link/text()")
    {:ok, descriptions} = Expath.query(doc, "//item/description/text()")

    # Combine results by position. Note: this assumes every item has a title,
    # a link, and a description; otherwise the three lists will not line up.
    Enum.zip([titles, links, descriptions])
    |> Enum.map(fn {title, link, desc} ->
      %{title: title, link: link, description: desc}
    end)
  end
end
Concurrent Usage
Document resources are thread-safe and can be shared across processes:
# Parse once
{:ok, doc} = Expath.new(large_xml)
# Query concurrently
tasks =
  for xpath <- xpath_expressions do
    Task.async(fn -> Expath.query(doc, xpath) end)
  end

results = Task.await_many(tasks)
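When the list of expressions is long, starting one task per expression may create more processes than is useful. A possible alternative, sketched below, uses `Task.async_stream/3` to bound concurrency. The module `ConcurrentQueries` and its `:query_fun` option are hypothetical, not part of Expath; `:query_fun` defaults to `Expath.query/2` and is injectable only so the sketch can run with a stub.

```elixir
defmodule ConcurrentQueries do
  # Hypothetical helper, not part of Expath.
  # Runs each XPath expression against the same document with bounded
  # concurrency and returns {xpath, result} pairs in input order.
  def query_all(doc, xpaths, opts \\ []) do
    query_fun = Keyword.get(opts, :query_fun, &Expath.query/2)
    max_concurrency = Keyword.get(opts, :max_concurrency, System.schedulers_online())

    xpaths
    |> Task.async_stream(
      fn xpath -> {xpath, query_fun.(doc, xpath)} end,
      max_concurrency: max_concurrency
    )
    # async_stream wraps each successful task result in {:ok, value}.
    # With the default options, a task that exceeds the 5000 ms timeout
    # exits the caller instead of producing an element here.
    |> Enum.map(fn {:ok, pair} -> pair end)
  end
end
```

A call such as `ConcurrentQueries.query_all(doc, xpath_expressions, max_concurrency: 4)` would then run at most four queries at a time.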
Types
@type t() :: %Expath.Document{resource: reference()}