Core Concepts
View SourceThis guide explains the ideas behind Grove. You don't need to understand all of this to use Grove, but it helps when things get complex.
The Problem: Sync is Hard
When multiple users (or devices) edit the same data, conflicts happen:
sequenceDiagram
participant A as Alice
participant S as Server
participant B as Bob
A->>S: title = "Meeting Notes"
B->>S: title = "Project Plan"
Note over S: Which one wins?Traditional solutions all have drawbacks:
| Approach | Problem |
|---|---|
| Last-write-wins | Alice's change is silently lost |
| Lock the document | Bob has to wait, can't work offline |
| Manual conflict resolution | User sees scary merge dialogs |
| Operational Transform (Google Docs) | Complex, needs central server |
CRDTs: A Better Way
CRDT stands for "Conflict-free Replicated Data Type."
The key insight: Design your data structure so that any order of operations produces the same result.
flowchart TB
subgraph "Same Result, Any Order"
A1[Operation A] --> R1[Result: XYZ]
B1[Operation B] --> R1
A2[Operation B] --> R2[Result: XYZ]
B2[Operation A] --> R2
endThis is called commutativity — A + B = B + A.
How Grove Achieves This
- Unique IDs everywhere — Every node, every operation has a globally unique ID
- Timestamps for ordering — When two operations conflict, timestamps decide
- Merge by design — Data structures are designed to merge, not conflict
Grove's Data Model
Trees and Nodes
Grove stores data as a tree of nodes:
graph TD
R[Root] --> A[Form]
A --> B[Step 1]
A --> C[Step 2]
B --> D[Field: Name]
B --> E[Field: Email]
C --> F[Field: Address]Each node has:
| Field | Description |
|---|---|
id | Unique identifier |
type | What kind of node (atom) |
attrs | Node data (map) |
parent_id | Who is my parent? |
children | Ordered list of child IDs |
Operations
Every change creates an operation:
# These create operations internally
Grove.Tree.put_node(tree, node) # => {:put_node, id, node}
Grove.Tree.update_node(tree, id, f) # => {:update_node, id, changes}
Grove.Tree.delete_node(tree, id) # => {:delete_node, id}
Grove.Tree.move_node(tree, id, p) # => {:move_node, id, new_parent}Operations are wrapped in events with metadata:
%Grove.Tree.Event{
id: {"laptop", 42}, # {replica_id, sequence}
operation: {:put_node, ...}, # The actual change
parents: MapSet.new([...]), # Causal dependencies
timestamp: 1704067200000 # When it happened
}Replica IDs
Every device/user needs a unique replica ID:
flowchart LR
subgraph "Replica IDs"
L[Laptop: abc123]
P[Phone: def456]
T[Tablet: ghi789]
end
L --> S[Same Document]
P --> S
T --> SReplica IDs ensure:
- Node IDs are globally unique (
{replica_id}_{sequence}) - Operations can be attributed to their source
- Conflicts can be resolved deterministically
Eg-walker: Why Grove is Fast
Grove implements Eg-walker, an optimization from recent research (EuroSys 2025).
The Key Insight
Most collaborative editing is sequential — one person typing, then another. True concurrency (two people typing at the exact same moment) is rare.
pie title Typical Editing Patterns
"Sequential (fast-path)" : 85
"Concurrent (slow-path)" : 15Fast-Path vs Slow-Path
flowchart TD
E[Incoming Event] --> C{Is history linear?}
C -->|Yes| F[Fast-path: Just apply it]
C -->|No| S[Slow-path: Merge with CRDT rules]
F --> R[Result]
S --> RFast-path (sequential operations):
- Direct application
- No merge overhead
- O(1) complexity
Slow-path (concurrent operations):
- Build merge state
- Apply CRDT rules
- Still automatic, just slower
Performance Impact
| Scenario | Throughput | Memory |
|---|---|---|
| Sequential (fast-path) | 617 ops/sec | 1.4 MB |
| Concurrent (slow-path) | 63 ops/sec | 21.8 MB |
The fast-path is 10x faster and uses 15x less memory.
Key Guarantees
Grove provides these guarantees:
1. Eventual Consistency
All replicas that have seen the same operations will have the same state.
flowchart LR
subgraph "Eventually..."
A[Replica A] -.->|sync| B[Replica B]
B -.->|sync| C[Replica C]
C -.->|sync| A
end
A --> S[Same State]
B --> S
C --> S2. No Conflicts
Operations always merge. There's no "conflict" state that requires user intervention.
3. Causality Preserved
If operation B happened after seeing operation A, then B will always be applied after A.
4. Intent Preserved
Operations do what they intended:
- Adds stay added (unless explicitly deleted)
- Deletes stay deleted
- Moves go to intended parent (unless it would create a cycle)
Move Operations
Moving nodes in a distributed tree is the hardest problem. Grove implements Kleppmann's algorithm:
flowchart TD
subgraph "Concurrent Moves"
A1[Alice: Move X under Y]
B1[Bob: Move Y under X]
end
A1 --> M{Would create cycle?}
B1 --> M
M -->|Yes| R[One move wins by timestamp]
M -->|No| OK[Both moves apply]Guarantees:
- No cycles ever created
- Deterministic winner (by Lamport timestamp)
- Both replicas converge to same tree
Summary
| Concept | What It Means |
|---|---|
| CRDT | Data structure designed to merge automatically |
| Replica ID | Unique identifier for each device/user |
| Operation | A single change (add, update, delete, move) |
| Event | Operation + metadata (ID, timestamp, parents) |
| Fast-path | Optimized path for sequential operations |
| Slow-path | Full merge for concurrent operations |
| Convergence | All replicas end up with same state |
Next Steps
- Use Cases — See real-world applications
- Benchmarks — Performance details
- Architecture — Implementation deep-dive