# Performance Benchmarks
Grove includes a comprehensive benchmark suite for measuring Eg-walker CRDT performance.
## Running Benchmarks

```shell
# Run full benchmark suite
mix run benchmarks/eg_walker_bench.exs

# Run with production optimizations
MIX_ENV=prod mix run benchmarks/eg_walker_bench.exs
```
## Benchmark Scenarios
| Scenario | Description | Tests |
|---|---|---|
| Sequential | Single replica, 1000 ops | Fast-path optimization |
| Collaborative | 5 replicas × 200 ops, 20% reorder | Mixed fast/slow path |
| High Concurrency | 10 replicas × 100 concurrent ops | Slow-path merging |
| Large Document | 100-10k nodes + 100 ops | Scalability |
| Move-heavy | 500-1k nodes, 500-1k moves | Operation log performance |
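As an illustration of the collaborative scenario above, the workload could be generated roughly like this: 5 replicas with 200 operations each, where about 20% of the stream is delivered out of order to force merges. This is a hedged sketch; the names and shapes are assumptions, not the actual benchmark code.

```elixir
# 5 replicas x 200 ops each (illustrative shapes, not Grove's real ops).
ops =
  for replica <- 1..5, seq <- 1..200 do
    %{replica: replica, seq: seq}
  end

# Pull out a random 20% and re-deliver those operations late, so the
# receiving replica sees divergent branches and must merge.
{late, on_time} = ops |> Enum.shuffle() |> Enum.split(div(length(ops), 5))
workload = on_time ++ late
```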
## Current Results

- Date: 2026-01-03
- Elixir: 1.18.4
- OTP: 25

### Eg-walker vs Traditional CRDT
Direct comparison using the same 1000 sequential operations:
| Implementation | ips | Average | Memory |
|---|---|---|---|
| Eg-walker (fast-path) | 619.69 | 1.61 ms | 1.43 MB |
| Traditional CRDT (slow-path) | 601.64 | 1.66 ms | 1.44 MB |
For pure sequential `put_node` operations, the difference is only ~3%.
Why? Simple `put_node` operations carry little merge overhead. The real benefit appears with:
- Concurrent operations requiring merge
- Move operations with undo-do-redo
- Long histories requiring traversal
See the high-concurrency benchmark below for the real difference.
## `apply_remote` Performance
| Scenario | ips | Average | Median | 99th % | Memory |
|---|---|---|---|---|---|
| Sequential (fast-path) | 617 | 1.62 ms | 1.51 ms | 3.00 ms | 1.43 MB |
| Collaborative (mixed) | 144 | 6.94 ms | 6.76 ms | 8.76 ms | 5.82 MB |
| High-concurrency (slow-path) | 63 | 15.76 ms | 15.59 ms | 18.48 ms | 21.84 MB |
Key insight: the fast path is ~10x faster than the slow path (617 vs 63 ips).
The real Eg-walker benefit:
- Sequential → High-concurrency: 10x throughput drop, 15x memory increase
- This is where Eg-walker shines: sequential editing has near-zero CRDT overhead
## Local Operations
| Operation | ips | Average | Memory |
|---|---|---|---|
| get_node | 5.57 M | 179 ns | 0 B |
| put_node | 1.13 M | 888 ns | 1016 B |
| update_node | 0.65 M | 1537 ns | 3104 B |
## Scalability (Document Size)
| Document Size | ips | Average | Memory |
|---|---|---|---|
| 100 nodes + 100 ops | 6.44 K | 155 μs | 136 KB |
| 1k nodes + 100 ops | 6.58 K | 152 μs | 146 KB |
| 10k nodes + 100 ops | 5.99 K | 167 μs | 157 KB |
Key insight: document size has minimal impact on operation time, thanks to O(1) node access via `Map`.
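The flat scaling above follows directly from keeping nodes in a map keyed by ID, so lookup cost is effectively independent of document size. A minimal sketch (the variable names are illustrative, not Grove's actual internals):

```elixir
# Build a document-sized map of nodes keyed by string ID.
nodes = Map.new(1..10_000, fn i -> {"node-#{i}", %{value: i}} end)

# Hash-map access: lookup time stays effectively constant whether the
# map holds 100 nodes or 10k nodes.
node = Map.fetch!(nodes, "node-9999")
```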
## Fast-path vs Slow-path
The Eg-walker implementation optimizes for sequential editing (the common case):
### Fast-path (Sequential Events)
- When: Events extend the frontier linearly (single user or turn-taking)
- Performance: 617 ips, 1.43 MB memory
- Overhead: Near-zero CRDT overhead
- How it works: Direct application, no transformation needed
### Slow-path (Concurrent Events)
- When: Multiple replicas with divergent branches
- Performance: 63 ips, 21.84 MB memory
- Overhead: Full CRDT merge with undo-do-redo
- How it works: Builds transient merge state, replays operations
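The dispatch between the two paths can be sketched as a single check, assuming each incoming event records the frontier (its parent version) it was generated against. The struct shapes here are assumptions, not Grove's real event format:

```elixir
defmodule PathChoice do
  # Fast path: the event's parents are exactly the current frontier,
  # so it extends history linearly and can be applied directly.
  def path(frontier, %{parents: parents}) when parents == frontier, do: :fast

  # Slow path: the event branches off an older version, so a full
  # undo-do-redo merge over transient state is required.
  def path(_frontier, _event), do: :slow
end
```

For example, `PathChoice.path(["e1"], %{parents: ["e1"]})` takes the fast path, while an event whose parents lag behind the frontier falls through to the slow path.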
## Key Takeaway
For the same sequential workload:
- Eg-walker fast-path: 619 ips
- Traditional CRDT (forced slow-path): 601 ips
- Difference: Only ~3%
The massive benefit appears when comparing sequential vs concurrent:
- Sequential (Eg-walker optimized): 617 ips, 1.43 MB
- High-concurrency (traditional behavior): 63 ips, 21.84 MB
- 10x throughput, 15x memory difference
## Optimization History

### event_index Optimization (2026-01-02)

Added a persistent `event_index` field to the `Tree` struct for O(1) event lookups.
| Metric | Before | After | Improvement |
|---|---|---|---|
| Sequential throughput | 604 ips | 605 ips | +0.2% |
| Collaborative throughput | 122.5 ips | 140.5 ips | +14.7% |
| Slow-path throughput | 55.5 ips | 57.4 ips | +3.4% |
| Memory (collaborative) | 5.91 MB | 5.64 MB | -4.6% |
Implementation:
- `event_index` field in `Tree` struct (lazy construction)
- `ensure_index/1` builds the index from history when needed
- `get_event/2` for O(1) lookup by event ID
- Incremental updates in `record_history/3` and `record_remote_event/2`
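A minimal sketch of the lazy-index idea described above. The map shapes and return values are assumptions; Grove's actual `ensure_index/1` and `get_event/2` may differ:

```elixir
defmodule IndexSketch do
  # Build the id => event map once from the history list...
  def ensure_index(%{event_index: nil, history: history} = tree) do
    %{tree | event_index: Map.new(history, fn ev -> {ev.id, ev} end)}
  end

  # ...and reuse it on subsequent calls (no rebuild).
  def ensure_index(tree), do: tree

  # O(1) lookup by event ID; the indexed tree is returned so the
  # caller can keep it and avoid rebuilding on the next lookup.
  def get_event(tree, id) do
    tree = ensure_index(tree)
    {tree, Map.get(tree.event_index, id)}
  end
end
```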
## Metrics Explained
- ips: Iterations per second (higher is better)
- Average: Mean time per operation
- Median: 50th percentile latency
- 99th %: Tail latency (important for UI responsiveness)
- Memory: Total bytes allocated per benchmark run
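For reference, the median and 99th-percentile figures can be derived from raw timing samples with the nearest-rank method, as in this sketch (the benchmark tool's own percentile calculation may use a different interpolation):

```elixir
defmodule Percentile do
  # Nearest-rank percentile: sort the samples and pick the value at
  # ceil(p/100 * n), clamped to at least the first sample.
  def at(samples, p) do
    sorted = Enum.sort(samples)
    rank = max(ceil(p / 100 * length(sorted)), 1)
    Enum.at(sorted, rank - 1)
  end
end
```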
## Future Optimizations

Remaining targets identified:

- `find_insertion_point` - O(n) list scan → binary search
- `build_children_map` in `EventGraph` - O(n) rebuild → cache
- Pass `event_index` to `EventGraph` functions to avoid rebuilding
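The first target above, replacing a linear scan with binary search, is only a win with a random-access container; Elixir lists have O(n) indexing, so a sketch would keep the sorted siblings in a tuple. This is an illustration of the technique, not Grove's planned implementation:

```elixir
defmodule InsertionPoint do
  # Returns the index where `key` should be inserted into the sorted
  # tuple to keep it ordered (leftmost insertion point).
  def find(tuple, key), do: find(tuple, key, 0, tuple_size(tuple))

  defp find(_tuple, _key, lo, hi) when lo >= hi, do: lo

  defp find(tuple, key, lo, hi) do
    mid = div(lo + hi, 2)

    # Halve the search range: O(log n) instead of an O(n) scan.
    if elem(tuple, mid) < key,
      do: find(tuple, key, mid + 1, hi),
      else: find(tuple, key, lo, mid)
  end
end
```

For example, `InsertionPoint.find({1, 3, 5, 7}, 4)` returns `2`, the slot between `3` and `5`.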