RustyXML Benchmarks

Performance comparisons against SweetXml/xmerl and Saxy.

Test Environment

  • Elixir: 1.19.4
  • OTP: 28
  • Hardware: Apple Silicon M1 Pro (10 cores)
  • RustyXML: 0.2.1
  • SweetXml: 0.7.5
  • NIF memory tracking: enabled

Variance: Throughput numbers vary ±15% between runs on the same hardware depending on thermal state, background load, and GC timing. Speedup ratios (RustyXML vs baseline) are more stable than absolute ips. Streaming memory measurements use :erlang.memory(:total) deltas and can vary ±30%.
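As a worked example of how the speedup ratios in this document relate to the raw throughput figures, here is a minimal sketch. The module name `SpeedupSketch` is hypothetical and is not part of RustyXML or the benchmark scripts; the input numbers are taken from the tables in this document.

```elixir
defmodule SpeedupSketch do
  # Speedup = candidate throughput / baseline throughput, rounded to one decimal.
  def ratio(candidate_ips, baseline_ips) when baseline_ips > 0 do
    Float.round(candidate_ips / baseline_ips, 1)
  end

  # Convert a per-run latency in milliseconds to runs per second.
  def per_second(latency_ms) when latency_ms > 0 do
    Float.round(1000 / latency_ms, 1)
  end
end

IO.inspect(SpeedupSketch.ratio(9_370, 1_140))  # 8.2  (small document)
IO.inspect(SpeedupSketch.ratio(533, 56))       # 9.5  (medium document)
IO.inspect(SpeedupSketch.ratio(54, 0.75))      # 72.0 (large document)
IO.inspect(SpeedupSketch.per_second(23.3))     # 42.9 (streaming, 23.3 ms per run)
```

Because both throughputs in a ratio are measured in the same run, shared noise such as thermal state tends to cancel, which is one plausible reason the ratios are steadier than the absolute ips.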

Parsing Performance

RustyXML's structural index uses SIMD-accelerated scanning (memchr) and zero-copy spans. Gains increase with document size.

Small Document (14.6 KB, 50 items)

| Parser | Throughput | vs SweetXml |
|---|---|---|
| RustyXML | 9,370 ips | 8.2x faster |
| SweetXml | 1,140 ips | baseline |

Medium Document (290.6 KB, 1,000 items)

| Parser | Throughput | vs SweetXml |
|---|---|---|
| RustyXML | 533 ips | 9.5x faster |
| SweetXml | 56 ips | baseline |

Large Document (2.93 MB, 10,000 items)

| Parser | Throughput | vs SweetXml |
|---|---|---|
| RustyXML | 54 ips | 72x faster |
| SweetXml | 0.75 ips | baseline |
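The benchmark fixtures themselves are not reproduced here. As a rough sketch of how documents with 50, 1,000, and 10,000 items could be generated, the following builds a catalog with `id`, `name`, and `price` per item. The element layout and the module name `FixtureSketch` are assumptions for illustration only; the real fixtures carry more data per item, so the sizes printed will not match the 14.6 KB, 290.6 KB, and 2.93 MB quoted above.

```elixir
defmodule FixtureSketch do
  # Builds an XML catalog containing `count` <item> elements.
  # The schema here is hypothetical, not the benchmark's actual fixture.
  def build(count) when is_integer(count) and count > 0 do
    items =
      for i <- 1..count do
        ~s(  <item id="#{i}"><name>Product #{i}</name><price>#{i * 1.5}</price></item>\n)
      end

    IO.iodata_to_binary(["<catalog>\n", items, "</catalog>\n"])
  end
end

for count <- [50, 1_000, 10_000] do
  xml = FixtureSketch.build(count)
  IO.puts("#{count} items: #{Float.round(byte_size(xml) / 1024, 1)} KB")
end
```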

XPath Query Performance

All queries run on pre-parsed documents.

XPath on Pre-Parsed Document (290.6 KB, 1,000 items)

| Query Type | RustyXML | SweetXml | vs SweetXml |
|---|---|---|---|
| //item (full elements) | 589 ips | 397 ips | 1.48x faster |
| //item/name/text() | 676 ips | 337 ips | 2.0x faster |
| //item/@id | 737 ips | 433 ips | 1.7x faster |

Complex XPath Queries (2.93 MB, 10,000 items)

| Query Type | RustyXML | SweetXml | vs SweetXml |
|---|---|---|---|
| Predicate ([price > 50]) | 60 ips | 16 ips | 3.65x faster |
| Count function | 63 ips | 22 ips | 2.8x faster |

Lazy XPath API (290 KB, 1,000 items)

The lazy API keeps results in Rust memory, building BEAM terms only when accessed:

| API | Latency (100 runs) | vs SweetXml |
|---|---|---|
| Regular xpath/2 | 104 ms | baseline |
| Lazy xpath_lazy/2 (count only) | 31 ms | 3.0x faster |
| Lazy + batch accessor | 31 ms | 3.1x faster |
| Parse + lazy + batch | 130 ms | 4.4x faster |

Recommendation: Use batch accessors (result_texts, result_attrs, result_extract) when accessing multiple items to reduce NIF call overhead.
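To illustrate why deferring term construction and batching access can help, here is a pure-Elixir analogy. It is not RustyXML's implementation (which keeps results in Rust memory) and does not use its API; `LazySpansSketch` and its functions are hypothetical. The idea shown is the same in spirit: locate results as `{offset, length}` spans first, answer a count from the spans alone, and copy strings out only when asked, in one pass.

```elixir
defmodule LazySpansSketch do
  # Locate the text inside each <name>...</name> as {byte_offset, byte_length}
  # without copying any of it out of the source binary.
  def name_spans(xml) do
    ~r/<name>(.*?)<\/name>/s
    |> Regex.scan(xml, return: :index, capture: :all_but_first)
    |> Enum.map(fn [span] -> span end)
  end

  # Counting needs only the spans, never the strings.
  def count(spans), do: length(spans)

  # Batch accessor: materialize every text in a single pass.
  def texts(xml, spans) do
    Enum.map(spans, fn {offset, len} -> binary_part(xml, offset, len) end)
  end
end

xml = "<name>A</name><name>BC</name>"
spans = LazySpansSketch.name_spans(xml)
IO.inspect(LazySpansSketch.count(spans))       # 2
IO.inspect(LazySpansSketch.texts(xml, spans))  # ["A", "BC"]
```

With a NIF-backed result, each accessor call also crosses the NIF boundary, so a single batch call in place of one call per item saves that per-call overhead as well.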

Memory Comparison

Measurement Methodology

  • RustyXML "Total" = NIF peak (instantaneous high-water mark via mimalloc tracking) + BEAM allocations (Benchee GC tracing for parse; :erlang.memory(:total) delta for streaming)
  • SweetXml "BEAM" = Total BEAM allocations measured by Benchee's GC tracing, which includes memory that was subsequently garbage-collected — i.e. cumulative allocations, not peak
  • Streaming uses :erlang.memory(:total) snapshot deltas, which are noisier than Benchee's per-process GC tracing

These are not identical metrics. The ratios are directionally correct but should not be taken as exact.
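For readers unfamiliar with the snapshot-delta approach, the following is a minimal sketch of measuring a `:erlang.memory(:total)` delta around a function call. It is an assumption about the general shape of such a measurement, not the benchmark's actual harness, and `MemoryDeltaSketch` is a hypothetical name.

```elixir
defmodule MemoryDeltaSketch do
  # Returns {result, delta_bytes}: the change in total BEAM memory across `fun`.
  # The figure is VM-wide, so activity in other processes adds noise, and the
  # delta can even be negative if a garbage collection runs in between.
  def measure(fun) when is_function(fun, 0) do
    :erlang.garbage_collect()
    before = :erlang.memory(:total)
    result = fun.()
    after_bytes = :erlang.memory(:total)
    {result, after_bytes - before}
  end
end

{list, delta} = MemoryDeltaSketch.measure(fn -> Enum.to_list(1..100_000) end)
IO.puts("delta: #{delta} bytes for #{length(list)} elements")
```

Unlike Benchee's per-process GC tracing, this captures a before/after difference for the whole VM and not cumulative allocations, which is why the two kinds of figures are not directly comparable.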

Parse Memory

| Document | RustyXML (NIF peak + BEAM) | SweetXml (BEAM total allocs) |
|---|---|---|
| Small (14.6 KB) | 63.4 KB | 5.65 MB |
| Medium (290.6 KB) | 1.23 MB | 112 MB |
| Large (2.93 MB) | 12.81 MB | 1,133 MB |

XPath Memory

| Query | RustyXML (NIF peak + BEAM) | SweetXml (BEAM total allocs) |
|---|---|---|
| //item (1K items) | 475 KB | 6.12 MB |
| text() (1K items) | 491 KB | 7.10 MB |
| @id (1K items) | 491 KB | 6.45 MB |
| Predicate (10K items) | 5.96 MB | 68.2 MB |
| Count (10K items) | 5.94 MB | 60.8 MB |

Streaming Memory

| Operation | RustyXML (NIF peak + BEAM delta) | SweetXml (BEAM delta) |
|---|---|---|
| Stream 10K items | 128 KB | 73 MB |

Note: SweetXml's stream_tags retains every parsed element in xmerl's accumulator for the entire parse, so the 73 MB figure reflects genuine SweetXml behavior but is not representative of a properly bounded streaming parser. For a fairer comparison, see the Saxy Comparison, where both parsers use bounded memory (~128 KB vs ~124 KB).

Streaming Comparison

Feature Comparison

| Feature | RustyXML | SweetXml |
|---|---|---|
| Memory model | Bounded (~128 KB) | Unbounded |
| Stream.take | Works correctly | Hangs (issue #97) |
| Chunk boundary handling | Handled correctly | N/A |
| Output format | {tag_atom, xml_string} | {tag_atom, xml_string} |
| Early termination | Proper cleanup | Can hang |
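To make "early termination with proper cleanup" concrete, here is a small sketch using only the standard library. It shows the general Elixir mechanism (the cleanup callback of `Stream.resource/3` runs when a consumer such as `Enum.take/2` halts the stream); it does not show how RustyXML implements its stream, and `EarlyTerminationSketch` is a hypothetical name.

```elixir
defmodule EarlyTerminationSketch do
  # An infinite stream of integers. The third function is the cleanup callback;
  # here it just notifies `owner` so the effect is observable.
  def numbers(owner) do
    Stream.resource(
      fn -> 1 end,
      fn n -> {[n], n + 1} end,
      fn _n -> send(owner, :cleaned_up) end
    )
  end
end

first_five = EarlyTerminationSketch.numbers(self()) |> Enum.take(5)
IO.inspect(first_five)  # [1, 2, 3, 4, 5]

receive do
  :cleaned_up -> IO.puts("cleanup ran after early termination")
after
  1_000 -> IO.puts("cleanup did not run")
end
```

A streaming parser built this way can release its file handle or native parser state as soon as the consumer stops asking for elements.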

Streaming Performance (10,000 items, 2.93 MB)

| Metric | RustyXML | SweetXml | vs SweetXml |
|---|---|---|---|
| Time | 23.3 ms | 376.7 ms | 16.2x faster |
| Throughput | ~43/s | ~2.7/s | 16.2x faster |
| Stream.take(5) | Works | Hangs | RustyXML wins |
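Chunk boundary handling matters because an element can be split across two reads. The sketch below shows the general buffering idea with a deliberately naive string scan: it assumes flat, non-nested `<item>` elements and that `</item>` never appears inside text or CDATA. It is not an XML parser and not RustyXML's implementation; `ChunkBoundarySketch` is a hypothetical name.

```elixir
defmodule ChunkBoundarySketch do
  @close "</item>"

  # Turns a stream of arbitrary binary chunks into complete <item>...</item>
  # strings, carrying any incomplete tail over to the next chunk.
  def items(chunks) do
    Stream.transform(chunks, "", fn chunk, buffer ->
      extract(buffer <> chunk, [])
    end)
  end

  # Emit every complete element found in the buffer; keep the remainder.
  defp extract(buffer, acc) do
    case :binary.split(buffer, @close) do
      [complete, rest] -> extract(rest, [complete <> @close | acc])
      [_incomplete] -> {Enum.reverse(acc), buffer}
    end
  end
end

# Both the closing tag of the first item and the opening tag of the third
# are split across chunk boundaries.
chunks = ["<item>a</it", "em><item>b</item><ite", "m>c</item>"]
IO.inspect(ChunkBoundarySketch.items(chunks) |> Enum.to_list())
# ["<item>a</item>", "<item>b</item>", "<item>c</item>"]
```

The buffer never holds more than one unfinished element plus the current chunk, which is the property behind a bounded memory model.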

Saxy Comparison

RustyXML also serves as a drop-in Saxy replacement. SAX parsing benchmarks against Saxy 1.6:

SAX Parse Performance

Measured on Apple Silicon M1 Pro. Throughput varies ±15% between runs; speedup ratios are more stable.

| Operation | XML Size | RustyXML | Saxy | Speedup |
|---|---|---|---|---|
| parse_string/4 | 14.6 KB | ~8.5K ips | ~6.5K ips | ~1.3x |
| parse_string/4 | 290.6 KB | ~435 ips | ~320 ips | ~1.4x |
| parse_string/4 | 2.93 MB | ~40 ips | ~25 ips | ~1.6x |
| SimpleForm | 14.6 KB | ~6.1K ips | ~4.7K ips | ~1.3x |
| SimpleForm | 290.6 KB | ~330 ips | ~215 ips | ~1.5x |
| parse_stream/4 | 2.93 MB | ~41 ips | ~23 ips | ~1.8x |

SAX Memory

| Operation | XML Size | RustyXML Total | Saxy (BEAM) | Ratio |
|---|---|---|---|---|
| parse_string/4 | 14.6 KB | 127 KB | 308 KB | 0.41x |
| parse_string/4 | 2.93 MB | 26.4 MB | 59.6 MB | 0.44x |
| SimpleForm | 290.6 KB | 1.43 MB | 10.7 MB | 0.13x |
| parse_stream/4 | 2.93 MB | ~130–162 KB | ~124–133 KB | ~1x |

parse_stream memory is comparable: both parsers operate in bounded memory. RustyXML uses zero-copy tokenization and direct BEAM binary encoding to keep the NIF peak at ~67 KB. Streaming memory measurements use :erlang.memory(:total) deltas which are noisier than Benchee's per-process tracing — expect ±30% variance between runs on the same hardware.

Summary

Speed Rankings

| Operation | vs SweetXml |
|---|---|
| Parse large (2.93 MB) | 72x faster |
| Parse medium (290 KB) | 9.5x faster |
| Parse small (14.6 KB) | 8.2x faster |
| Streaming (10K items) | 16.2x faster |
| Parse + lazy + batch | 4.4x faster |
| Complex XPath (predicate) | 3.65x faster |
| Lazy XPath (count only) | 3.0x faster |
| Complex XPath (count) | 2.8x faster |
| XPath text extraction | 2.0x faster |
| XPath attribute extraction | 1.7x faster |
| XPath full elements | 1.48x faster |

Memory Rankings

| Operation | Comparison | Notes |
|---|---|---|
| Streaming (vs Saxy) | ~1x | Both properly bounded; fairest comparison |
| Parse (vs SweetXml) | ~90x less | Different metrics; see methodology |
| XPath (vs SweetXml) | 10-14x less | |

Recommended API by use case:

| Use Case | Recommended |
|---|---|
| General XML processing | parse/1 + xpath/2 |
| Single query on XML string | xpath/2 with raw XML |
| Large result sets, partial access | xpath_lazy/2 + batch accessors |
| Count results only | xpath_lazy/2 + result_count/1 |
| Elements as XML strings | Native.xpath_query_raw/2 |
| Large files (GB+) | stream_tags/3 |
| Batch queries | xmap/2 |
| Event-driven SAX processing | parse_string/4 with handler |
| Streaming SAX (sockets, HTTP) | parse_stream/4 or Partial |
| Simple tuple tree | SimpleForm.parse_string/2 |
| Generating XML | encode!/2 with RustyXML.XML |

Key Findings

  1. Parsing is 8-72x faster than SweetXml — The structural index with SIMD scanning dramatically outperforms xmerl, with gains increasing on larger documents.

  2. SAX parsing is ~1.3-1.8x faster than Saxy — with comparable streaming memory. This is the fairest streaming comparison since both parsers are properly bounded.

  3. All XPath queries are faster — Full elements (1.48x), text (2.0x), attributes (1.7x), predicates (3.65x), counts (2.8x) vs SweetXml.

  4. Lazy XPath is 3-4.4x faster — Keeping node IDs in Rust and accessing on-demand eliminates BEAM term construction overhead.

  5. Significantly less memory for parsing — The structural index uses compact spans instead of string copies. Parse memory for 2.93 MB doc: 12.8 MB (NIF peak + BEAM) vs SweetXml's 1,133 MB (BEAM total allocations). Note: these are different metrics — the magnitude of difference is real but the exact ratio is approximate.

  6. Stream.take works correctly — Fixes SweetXml issue #97. Bounded memory regardless of file size.

Improvement from v0.1.1 to v0.1.2

The unified structural index brought substantial gains over the prior DOM-based approach:

| Metric | v0.1.1 (DOM) | v0.1.2 (Index) | Improvement |
|---|---|---|---|
| Parse throughput (large) | 30.7 ips | 54.0 ips | 1.76x |
| Parse memory (large) | 30.17 MB | 12.81 MB | 58% less |
| XPath //item | 0.83x SweetXml | 1.48x SweetXml | was slower, now faster |
| XPath memory (medium) | 28.3 MB | 475 KB | 60x less |
| Streaming throughput | 21.9/s | 43.0/s | 1.96x |
| Streaming memory | 52.8 MB | 319 KB | 165x less |

Running the Benchmarks

# vs SweetXml
FORCE_RUSTYXML_BUILD=1 mix run bench/sweet_bench.exs

# vs Saxy
FORCE_RUSTYXML_BUILD=1 mix run bench/saxy_bench.exs

Enabling Memory Tracking

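# In native/rustyxml/Cargo.toml
[features]
default = ["mimalloc", "memory_tracking"]

# Rebuild the NIF so the feature change takes effect
FORCE_RUSTYXML_BUILD=1 mix compile --force

# Then, from Elixir (for example in IEx):
# reset the counters, run the work to measure, and read the results
RustyXML.Native.reset_rust_memory_stats()
doc = RustyXML.parse(xml)
peak = RustyXML.Native.get_rust_memory_peak()   # high-water mark since the reset
current = RustyXML.Native.get_rust_memory()     # currently allocated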

Correctness Verification

All benchmarks include correctness verification:

count(//item): RustyXML=1000, SweetXml=1000 - ok
//item[1]/name/text(): RustyXML="Product 1", SweetXml="Product 1" - ok
//item/@id count: RustyXML=1000, SweetXml=1000 - ok

Overall: ALL TESTS PASSED