Performance Benchmarks


Grove includes a comprehensive benchmark suite for measuring Eg-walker CRDT performance.

Running Benchmarks

```shell
# Run full benchmark suite
mix run benchmarks/eg_walker_bench.exs

# Run with production optimizations
MIX_ENV=prod mix run benchmarks/eg_walker_bench.exs
```

Benchmark Scenarios

| Scenario | Description | Tests |
| --- | --- | --- |
| Sequential | Single replica, 1000 ops | Fast-path optimization |
| Collaborative | 5 replicas × 200 ops, 20% reorder | Mixed fast/slow path |
| High Concurrency | 10 replicas × 100 concurrent ops | Slow-path merging |
| Large Document | 100-10k nodes + 100 ops | Scalability |
| Move-heavy | 500-1k nodes, 500-1k moves | Operation log performance |
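As a rough illustration, the collaborative workload above could be generated like this. This is a hypothetical sketch, not the benchmark suite's actual code: the op shapes and the simplification of reordering one contiguous 20% slice are assumptions.

```elixir
# Sketch: build the "collaborative" workload of 5 replicas x 200 ops each,
# then shuffle ~20% of the merged stream to simulate out-of-order delivery.
replicas = 5
ops_per_replica = 200

ops =
  for r <- 1..replicas, i <- 1..ops_per_replica do
    %{replica: r, seq: i, op: {:put_node, "r#{r}-n#{i}"}}
  end

# Reorder a 20% slice (a simplification of the benchmark's 20% reorder rate).
reorder_count = div(length(ops), 5)
{head, tail} = Enum.split(ops, reorder_count)
workload = Enum.shuffle(head) ++ tail

length(workload)
# => 1000
```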

Current Results

Date: 2026-01-03
Elixir: 1.18.4
OTP: 25

Eg-walker vs Traditional CRDT

Direct comparison using the same 1000 sequential operations:

| Implementation | ips | Average | Memory |
| --- | --- | --- | --- |
| Eg-walker (fast-path) | 619.69 | 1.61 ms | 1.43 MB |
| Traditional CRDT (slow-path) | 601.64 | 1.66 ms | 1.44 MB |

For pure sequential put_node operations, the two implementations differ by only ~3%.

Why? Simple put_node operations incur little merge overhead, so there is little for the fast-path to skip. The real benefit appears with:

  • Concurrent operations requiring merge
  • Move operations with undo-do-redo
  • Long histories requiring traversal

See the high-concurrency benchmark below for the real difference.

apply_remote Performance

| Scenario | ips | Average | Median | 99th % | Memory |
| --- | --- | --- | --- | --- | --- |
| Sequential (fast-path) | 617 | 1.62 ms | 1.51 ms | 3.00 ms | 1.43 MB |
| Collaborative (mixed) | 144 | 6.94 ms | 6.76 ms | 8.76 ms | 5.82 MB |
| High-concurrency (slow-path) | 63 | 15.76 ms | 15.59 ms | 18.48 ms | 21.84 MB |

Key insight: the fast-path is ~10x faster than the slow-path (617 vs 63 ips).

The real Eg-walker benefit:

  • Sequential → High-concurrency: 10x throughput drop, 15x memory increase
  • This is where Eg-walker shines: sequential editing has near-zero CRDT overhead

Local Operations

| Operation | ips | Average | Memory |
| --- | --- | --- | --- |
| get_node | 5.57 M | 179 ns | 0 B |
| put_node | 1.13 M | 888 ns | 1016 B |
| update_node | 0.65 M | 1537 ns | 3104 B |

Scalability (Document Size)

| Document Size | ips | Average | Memory |
| --- | --- | --- | --- |
| 100 nodes + 100 ops | 6.44 K | 155 μs | 136 KB |
| 1k nodes + 100 ops | 6.58 K | 152 μs | 146 KB |
| 10k nodes + 100 ops | 5.99 K | 167 μs | 157 KB |

Key insight: document size has minimal impact on per-operation time, thanks to O(1) node access via a Map.
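The flat scaling curve follows from Map-backed node storage. Here is a minimal sketch of the idea; the struct layout and function bodies are illustrative, not Grove's actual implementation:

```elixir
# Sketch: nodes live in a Map keyed by node ID, so get/put cost does not
# grow with document size (effectively O(1) access).
defmodule SketchTree do
  defstruct nodes: %{}

  def put_node(%__MODULE__{nodes: nodes} = tree, id, value) do
    %{tree | nodes: Map.put(nodes, id, value)}
  end

  def get_node(%__MODULE__{nodes: nodes}, id), do: Map.get(nodes, id)
end

# A 10k-node document; lookups stay cheap regardless of size.
tree =
  Enum.reduce(1..10_000, %SketchTree{}, fn i, acc ->
    SketchTree.put_node(acc, "node-#{i}", %{value: i})
  end)

SketchTree.get_node(tree, "node-9999")
# => %{value: 9999}
```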

Fast-path vs Slow-path

The Eg-walker implementation optimizes for sequential editing (the common case):

Fast-path (Sequential Events)

  • When: Events extend the frontier linearly (single user or turn-taking)
  • Performance: 617 ips, 1.43 MB memory
  • Overhead: Near-zero CRDT overhead
  • How it works: Direct application, no transformation needed

Slow-path (Concurrent Events)

  • When: Multiple replicas with divergent branches
  • Performance: 63 ips, 21.84 MB memory
  • Overhead: Full CRDT merge with undo-do-redo
  • How it works: Builds transient merge state, replays operations
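The fast/slow-path decision can be sketched as a frontier check. This is an assumed simplification, not Grove's actual dispatch code: an event whose parents equal the current frontier extends history linearly and can be applied directly; anything else requires a full merge.

```elixir
# Sketch: classify an incoming event as fast-path or slow-path by comparing
# its parent set against the replica's current frontier.
defmodule PathChoice do
  def classify(frontier, %{parents: parents}) do
    if MapSet.equal?(MapSet.new(parents), MapSet.new(frontier)) do
      # Event extends the frontier linearly: direct application.
      :fast_path
    else
      # Divergent branch: build transient merge state and replay.
      :slow_path
    end
  end
end

frontier = ["e3"]
PathChoice.classify(frontier, %{parents: ["e3"]})
# => :fast_path
PathChoice.classify(frontier, %{parents: ["e1", "e2"]})
# => :slow_path
```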

Key Takeaway

For the same sequential workload:

  • Eg-walker fast-path: 619 ips
  • Traditional CRDT (forced slow-path): 601 ips
  • Difference: Only ~3%

The massive benefit appears when comparing sequential vs concurrent:

  • Sequential (Eg-walker optimized): 617 ips, 1.43 MB
  • High-concurrency (traditional behavior): 63 ips, 21.84 MB
  • 10x throughput, 15x memory difference

Optimization History

event_index Optimization (2026-01-02)

Added a persistent event_index field to the Tree struct for O(1) event lookups.

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Sequential throughput | 604 ips | 605 ips | +0.2% |
| Collaborative throughput | 122.5 ips | 140.5 ips | +14.7% |
| Slow-path throughput | 55.5 ips | 57.4 ips | +3.4% |
| Memory (collaborative) | 5.91 MB | 5.64 MB | -4.6% |

Implementation:

  • event_index field in Tree struct (lazy construction)
  • ensure_index/1 builds index from history when needed
  • get_event/2 for O(1) lookup by event ID
  • Incremental updates in record_history/3 and record_remote_event/2
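The lazy event_index described above can be sketched as follows. The field and function names mirror this document, but the bodies are assumptions, not Grove's actual implementation:

```elixir
# Sketch: a lazily built index from event ID to event, giving O(1) lookups
# instead of scanning the history list.
defmodule IndexSketch do
  defstruct history: [], event_index: nil

  # Build the index from history only when it is first needed.
  def ensure_index(%__MODULE__{event_index: nil, history: history} = tree) do
    index = Map.new(history, fn event -> {event.id, event} end)
    %{tree | event_index: index}
  end

  def ensure_index(tree), do: tree

  # O(1) lookup by event ID once the index exists.
  def get_event(tree, id) do
    tree = ensure_index(tree)
    Map.get(tree.event_index, id)
  end
end

tree = %IndexSketch{history: [%{id: "e1", op: :put}, %{id: "e2", op: :move}]}
IndexSketch.get_event(tree, "e2")
# => %{id: "e2", op: :move}
```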

Metrics Explained

  • ips: Iterations per second (higher is better)
  • Average: Mean time per operation
  • Median: 50th percentile latency
  • 99th %: Tail latency (important for UI responsiveness)
  • Memory: Total bytes allocated per benchmark run

Future Optimizations

Remaining targets identified:

  • find_insertion_point - O(n) list scan → binary search
  • build_children_map in EventGraph - O(n) rebuild → cache
  • Pass event_index to EventGraph functions to avoid rebuilding
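For the first target, the O(n) → O(log n) change amounts to binary-searching for the insertion index. This sketch assumes the candidates are kept sorted by a comparable key (e.g. a sequence number), which may not match Grove's actual ordering rules:

```elixir
# Sketch: binary search for an insertion point over a sorted, tuple-backed
# sequence, replacing an O(n) list scan with O(log n) probes.
defmodule InsertionPoint do
  # Returns the index at which `key` should be inserted to keep
  # `sorted` in ascending order.
  def find(sorted, key) do
    do_find(sorted, key, 0, tuple_size(sorted))
  end

  defp do_find(_sorted, _key, lo, hi) when lo >= hi, do: lo

  defp do_find(sorted, key, lo, hi) do
    mid = div(lo + hi, 2)

    if elem(sorted, mid) < key do
      do_find(sorted, key, mid + 1, hi)
    else
      do_find(sorted, key, lo, mid)
    end
  end
end

InsertionPoint.find({1, 3, 5, 7}, 4)
# => 2
```

A tuple is used here because `elem/2` is O(1); a plain list would make each probe O(n) and defeat the point of the search.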