RustyXML Benchmarks

Performance comparisons against SweetXml/xmerl.

Test Environment

Elixir: 1.19.4
OTP: 28
Hardware: Apple Silicon M1 Pro (10 cores)
RustyXML: 0.2.1
SweetXml: 0.7.5
NIF memory tracking: enabled

Variance: Throughput numbers vary ±15% between runs on the same hardware depending on thermal state, background load, and GC timing. Speedup ratios (RustyXML vs baseline) are more stable than absolute iterations per second (ips). Streaming memory measurements use :erlang.memory(:total) deltas and can vary ±30%.

Parsing Performance

RustyXML's structural index uses SIMD-accelerated scanning (memchr) and zero-copy spans. The gain over SweetXml grows with document size, from 8.2x on the small document to 72x on the large one.

Small Document (14.6 KB, 50 items)

Parser	Throughput	vs SweetXml
RustyXML	9,370 ips	8.2x faster
SweetXml	1,140 ips	baseline

Medium Document (290.6 KB, 1,000 items)

Parser	Throughput	vs SweetXml
RustyXML	533 ips	9.5x faster
SweetXml	56 ips	baseline

Large Document (2.93 MB, 10,000 items)

Parser	Throughput	vs SweetXml
RustyXML	54 ips	72x faster
SweetXml	0.75 ips	baseline
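
The "vs SweetXml" column is the ratio of the two throughput figures in each table. A minimal sketch of that arithmetic, using the numbers reported above (SpeedupSketch is an illustrative name, not part of RustyXML):

```elixir
defmodule SpeedupSketch do
  # Ratio of two throughputs (iterations per second), rounded to one decimal.
  def ratio(candidate_ips, baseline_ips) do
    Float.round(candidate_ips / baseline_ips, 1)
  end
end

small = SpeedupSketch.ratio(9_370, 1_140)  # small document
medium = SpeedupSketch.ratio(533, 56)      # medium document
large = SpeedupSketch.ratio(54, 0.75)      # large document

IO.inspect({small, medium, large})
```

Because both throughputs in a ratio are measured on the same machine in the same run, run-to-run drift tends to affect both sides similarly, which is why the ratios move less than the raw ips values.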

XPath Query Performance

All queries run on pre-parsed documents.

XPath on Pre-Parsed Document (290.6 KB, 1,000 items)

Query Type	RustyXML	SweetXml	vs SweetXml
`//item` (full elements)	589 ips	397 ips	1.48x faster
`//item/name/text()`	676 ips	337 ips	2.0x faster
`//item/@id`	737 ips	433 ips	1.7x faster

Complex XPath Queries (2.93 MB, 10,000 items)

Query Type	RustyXML	SweetXml	vs SweetXml
Predicate (`[price > 50]`)	60 ips	16 ips	3.65x faster
Count function	63 ips	22 ips	2.8x faster

Lazy XPath API (290.6 KB, 1,000 items)

The lazy API keeps results in Rust memory, building BEAM terms only when accessed:

API	Latency (100 runs)	vs SweetXml
Regular `xpath/2`	104 ms	baseline
Lazy `xpath_lazy/2` (count only)	31 ms	3.0x faster
Lazy + batch accessor	31 ms	3.1x faster
Parse + lazy + batch	130 ms	4.4x faster

Recommendation: Use batch accessors (result_texts, result_attrs, result_extract) when accessing multiple items to reduce NIF call overhead.

Memory Comparison

Measurement Methodology

RustyXML "Total" = NIF peak (instantaneous high-water mark via mimalloc tracking) + BEAM allocations (Benchee GC tracing for parse; :erlang.memory(:total) delta for streaming)
SweetXml "BEAM" = Total BEAM allocations measured by Benchee's GC tracing, which includes memory that was subsequently garbage-collected — i.e. cumulative allocations, not peak
Streaming uses :erlang.memory(:total) snapshot deltas, which are noisier than Benchee's per-process GC tracing

These are not identical metrics. The ratios are directionally correct but should not be taken as exact.
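
For the streaming rows, a snapshot delta of :erlang.memory(:total) can be taken roughly as follows. This is a minimal sketch of the technique, not the benchmark script itself; MemDeltaSketch and measure/1 are hypothetical names.

```elixir
defmodule MemDeltaSketch do
  # Runs fun and returns {result, bytes}, where bytes is the change in total
  # BEAM memory across the call. The figure is VM-wide, so activity in other
  # processes and GC timing show up as noise, and the delta can be negative.
  def measure(fun) when is_function(fun, 0) do
    :erlang.garbage_collect()
    before = :erlang.memory(:total)
    result = fun.()
    {result, :erlang.memory(:total) - before}
  end
end

{items, delta_bytes} =
  MemDeltaSketch.measure(fn -> Enum.map(1..10_000, &Integer.to_string/1) end)

IO.puts("#{length(items)} items, delta #{delta_bytes} bytes")
```

Only the calling process is garbage-collected before the first snapshot, which is one reason this method is noisier than per-process GC tracing.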

Parse Memory

Document	RustyXML (NIF peak + BEAM)	SweetXml (BEAM total allocs)
Small (14.6 KB)	63.4 KB	5.65 MB
Medium (290.6 KB)	1.23 MB	112 MB
Large (2.93 MB)	12.81 MB	1,133 MB

XPath Memory

Query	RustyXML (NIF peak + BEAM)	SweetXml (BEAM total allocs)
`//item` (1K items)	475 KB	6.12 MB
`text()` (1K items)	491 KB	7.10 MB
`@id` (1K items)	491 KB	6.45 MB
Predicate (10K items)	5.96 MB	68.2 MB
Count (10K items)	5.94 MB	60.8 MB

Streaming Memory

Operation	RustyXML (NIF peak + BEAM delta)	SweetXml (BEAM delta)
Stream 10K items	128 KB	73 MB

Note: SweetXml's stream_tags retains every parsed element in xmerl's accumulator for the entire parse, so 73 MB reflects genuine SweetXml behavior but is not representative of a properly bounded streaming parser. For a fairer comparison, see the Saxy Comparison where both parsers use bounded memory (~128 KB vs ~124 KB).

Streaming Comparison

Feature Comparison

Feature	RustyXML	SweetXml
Memory model	Bounded (~128 KB)	Unbounded
`Stream.take`	Works correctly	Hangs (issue #97)
Chunk boundary handling	Handled correctly	N/A
Output format	`{tag_atom, xml_string}`	`{tag_atom, xml_string}`
Early termination	Proper cleanup	Can hang
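
Early termination with cleanup is the contract Elixir's own Stream.resource/3 provides: when a consumer such as Enum.take/2 halts the stream, the after function still runs. The sketch below illustrates that contract with a synthetic source emitting {tag_atom, xml_string} tuples. It is not RustyXML's implementation, and StreamSketch is a hypothetical name.

```elixir
defmodule StreamSketch do
  # Lazily emits {tag_atom, xml_string} tuples for ids 1..n, one per step, so
  # memory does not grow with n. `notify` is sent :cleaned_up when the stream
  # ends, whether it ran to completion or was halted early.
  def items(n, notify) do
    Stream.resource(
      fn -> 1 end,
      fn
        i when i > n -> {:halt, i}
        i -> {[{:item, ~s(<item id="#{i}"/>)}], i + 1}
      end,
      fn _last -> send(notify, :cleaned_up) end
    )
  end
end

# Take 5 of 10,000 items; the remaining items are never produced.
first_five = StreamSketch.items(10_000, self()) |> Enum.take(5)

cleaned_up? =
  receive do
    :cleaned_up -> true
  after
    0 -> false
  end

IO.inspect({length(first_five), cleaned_up?})
```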

Streaming Performance (10,000 items, 2.93 MB)

Metric	RustyXML	SweetXml	vs SweetXml
Time	23.3 ms	376.7 ms	16.2x faster
Throughput	~43/s	~2.7/s	16.2x faster
`Stream.take(5)`	Works	Hangs	RustyXML wins

Saxy Comparison

RustyXML also serves as a drop-in Saxy replacement. SAX parsing benchmarks against Saxy 1.6:

SAX Parse Performance

Measured on Apple Silicon M1 Pro. Throughput varies ±15% between runs; speedup ratios are more stable.

Operation	XML Size	RustyXML	Saxy	Speedup
`parse_string/4`	14.6 KB	~8.5K ips	~6.5K ips	~1.3x
`parse_string/4`	290.6 KB	~435 ips	~320 ips	~1.4x
`parse_string/4`	2.93 MB	~40 ips	~25 ips	~1.6x
`SimpleForm`	14.6 KB	~6.1K ips	~4.7K ips	~1.3x
`SimpleForm`	290.6 KB	~330 ips	~215 ips	~1.5x
`parse_stream/4`	2.93 MB	~41 ips	~23 ips	~1.8x

SAX Memory

Operation	XML Size	RustyXML Total	Saxy (BEAM)	Ratio
`parse_string/4`	14.6 KB	127 KB	308 KB	0.41x
`parse_string/4`	2.93 MB	26.4 MB	59.6 MB	0.44x
`SimpleForm`	290.6 KB	1.43 MB	10.7 MB	0.13x
`parse_stream/4`	2.93 MB	~130–162 KB	~124–133 KB	~1x

parse_stream memory is comparable: both parsers operate in bounded memory. RustyXML uses zero-copy tokenization and direct BEAM binary encoding to keep the NIF peak at ~67 KB. Streaming memory measurements use :erlang.memory(:total) deltas which are noisier than Benchee's per-process tracing — expect ±30% variance between runs on the same hardware.
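
An event handler for this kind of SAX parsing might look like the sketch below. It assumes the Saxy-style callback shape handle_event(event, data, state) returning {:ok, state}, which the drop-in claim implies but this page does not spell out. The RustyXML.parse_string call in the comment is likewise an assumption about argument order. The handler is driven by hand here so the block runs without either library.

```elixir
defmodule ItemCounterSketch do
  # Counts <item> start tags; every other event passes the state through.
  # The state is a single integer, so handler memory stays constant.
  def handle_event(:start_element, {"item", _attributes}, count), do: {:ok, count + 1}
  def handle_event(_event, _data, count), do: {:ok, count}
end

# Assumed usage with a Saxy-compatible parser (not verified against RustyXML):
#   {:ok, count} = RustyXML.parse_string(xml, ItemCounterSketch, 0, [])

# Hand-written Saxy-style events standing in for parser output.
events = [
  {:start_element, {"items", []}},
  {:start_element, {"item", [{"id", "1"}]}},
  {:characters, "Product 1"},
  {:end_element, "item"},
  {:start_element, {"item", [{"id", "2"}]}},
  {:end_element, "item"},
  {:end_element, "items"}
]

count =
  Enum.reduce(events, 0, fn {event, data}, acc ->
    {:ok, next} = ItemCounterSketch.handle_event(event, data, acc)
    next
  end)

IO.inspect(count)
```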

Summary

Speed Rankings

Operation	vs SweetXml
Parse large (2.93 MB)	72x faster
Streaming (10K items)	16.2x faster
Parse medium (290.6 KB)	9.5x faster
Parse small (14.6 KB)	8.2x faster
Parse + lazy + batch	4.4x faster
Complex XPath (predicate)	3.65x faster
Lazy XPath (count only)	3.0x faster
Complex XPath (count)	2.8x faster
XPath text extraction	2.0x faster
XPath attribute extraction	1.7x faster
XPath full elements	1.48x faster

Memory Rankings

Operation	Comparison	Notes
Streaming (vs Saxy)	~1x	Both properly bounded; fairest comparison
Parse (vs SweetXml)	~90x less	Different metrics; see methodology
XPath (vs SweetXml)	10-14x less	Different metrics; see methodology

Recommended API by Use Case

Use Case	Recommended
General XML processing	`parse/1` + `xpath/2`
Single query on XML string	`xpath/2` with raw XML
Large result sets, partial access	`xpath_lazy/2` + batch accessors
Count results only	`xpath_lazy/2` + `result_count/1`
Elements as XML strings	`Native.xpath_query_raw/2`
Large files (GB+)	`stream_tags/3`
Batch queries	`xmap/2`
Event-driven SAX processing	`parse_string/4` with handler
Streaming SAX (sockets, HTTP)	`parse_stream/4` or `Partial`
Simple tuple tree	`SimpleForm.parse_string/2`
Generating XML	`encode!/2` with `RustyXML.XML`

Key Findings

Parsing is 8-72x faster than SweetXml — The structural index with SIMD scanning dramatically outperforms xmerl, with gains increasing on larger documents.
SAX parsing is ~1.3-1.8x faster than Saxy — with comparable streaming memory. This is the fairest streaming comparison since both parsers are properly bounded.
All XPath queries are faster — Full elements (1.48x), text (2.0x), attributes (1.7x), predicates (3.65x), counts (2.8x) vs SweetXml.
Lazy XPath is 3-4.4x faster — Keeping node IDs in Rust and accessing on-demand eliminates BEAM term construction overhead.
Significantly less memory for parsing — The structural index uses compact spans instead of string copies. Parse memory for 2.93 MB doc: 12.8 MB (NIF peak + BEAM) vs SweetXml's 1,133 MB (BEAM total allocations). Note: these are different metrics — the magnitude of difference is real but the exact ratio is approximate.
Stream.take works correctly: early termination does not hang as it does in SweetXml (issue #97), and memory stays bounded regardless of file size.

Improvement from v0.1.1 to v0.1.2

The unified structural index brought substantial gains over the prior DOM-based approach:

Metric	v0.1.1 (DOM)	v0.1.2 (Index)	Improvement
Parse throughput (large)	30.7 ips	54.0 ips	1.76x
Parse memory (large)	30.17 MB	12.81 MB	58% less
XPath `//item`	0.83x SweetXml	1.48x SweetXml	was slower, now faster
XPath memory (medium)	28.3 MB	475 KB	60x less
Streaming throughput	21.9/s	43.0/s	1.96x
Streaming memory	52.8 MB	319 KB	165x less

Running the Benchmarks

# vs SweetXml
FORCE_RUSTYXML_BUILD=1 mix run bench/sweet_bench.exs

# vs Saxy
FORCE_RUSTYXML_BUILD=1 mix run bench/saxy_bench.exs

Enabling Memory Tracking

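# In native/rustyxml/Cargo.toml
[features]
default = ["mimalloc", "memory_tracking"]

# Rebuild the NIF from source so the feature change takes effect
FORCE_RUSTYXML_BUILD=1 mix compile --force

# In Elixir: reset the counters, run the workload, then read the stats
RustyXML.Native.reset_rust_memory_stats()
doc = RustyXML.parse(xml)                       # xml is the XML binary under test
peak = RustyXML.Native.get_rust_memory_peak()   # NIF high-water mark
current = RustyXML.Native.get_rust_memory()     # NIF memory currently allocated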

Correctness Verification

All benchmarks include correctness verification:

count(//item): RustyXML=1000, SweetXml=1000 - ok
//item[1]/name/text(): RustyXML="Product 1", SweetXml="Product 1" - ok
//item/@id count: RustyXML=1000, SweetXml=1000 - ok

Overall: ALL TESTS PASSED
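
A check of this shape can be reproduced with a small helper. This is a sketch that mirrors the output format above, not the code in the benchmark scripts; VerifySketch and check/3 are hypothetical names, and the values passed in are the ones reported above rather than fresh parser output.

```elixir
defmodule VerifySketch do
  # Formats one comparison line and marks whether the two results agree.
  def check(label, rusty, sweet) do
    status = if rusty == sweet, do: "ok", else: "MISMATCH"
    "#{label}: RustyXML=#{inspect(rusty)}, SweetXml=#{inspect(sweet)} - #{status}"
  end
end

lines = [
  VerifySketch.check("count(//item)", 1000, 1000),
  VerifySketch.check("//item[1]/name/text()", "Product 1", "Product 1"),
  VerifySketch.check("//item/@id count", 1000, 1000)
]

Enum.each(lines, &IO.puts/1)
```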
