Grove Architecture

View Source

A conflict-free replicated tree (CRDT) library for Elixir, designed for collaborative editing of hierarchical data structures like ASTs, documents, and form builders.

Design Philosophy

  1. Tree-First: Purpose-built for hierarchical data, not a generic CRDT library
  2. Operation-Based: Efficient sync via operations, not full-state transfer
  3. Move-Safe: Concurrent moves never create cycles
  4. BEAM-Native: Leverage processes, ETS, :pg, and OTP patterns
  5. Phoenix-Ready: First-class LiveView integration for real-time collaboration

Core Data Model

Tree Structure

Grove represents trees as a collection of nodes with parent-child relationships:

                    
                       root   
                    
           
            
       node_1      node_2      node_3  
            
                       
       node_1a                   node_3a 
                       

Node Structure

Each node contains:

FieldTypeDescription
idString.tUnique identifier ({replica_id}_{lamport_timestamp})
typeatomNode type (e.g., :function, :param, :section)
attrsmapNode attributes (LWW-Register semantics per field)
parent_idString.t | nilParent node reference
children[String.t]Ordered list of child node IDs
deleted?booleanTombstone flag
metamapOperation metadata (user_id, timestamp, etc.)
%Grove.Node{
  id: "replica_1_42",
  type: :dropdown,
  attrs: %{label: "Country", required: true},
  parent_id: "replica_1_12",
  children: ["replica_1_43", "replica_1_44"],
  deleted?: false,
  meta: %{updated_by: "user_123", updated_at: ~U[2024-01-15 10:30:00Z]}
}

Tree State

The tree maintains:

%Grove.Tree{
  replica_id: "replica_1",
  nodes: %{id => Node.t},           # All nodes by ID
  root_id: "root",                  # Root node ID
  clock: 42,                        # Lamport clock
  pending_ops: [],                  # Operations not yet flushed
  undo_stack: [],                   # For undo support
  redo_stack: []                    # For redo support
}

Operations

Grove uses operation-based CRDTs. Each mutation generates an operation that can be broadcast to other replicas.

Operation Types

OperationDescriptionConflict Resolution
InsertAdd new nodeUnique IDs prevent conflicts
DeleteMark node as tombstoneAdd-wins (concurrent insert + delete → node exists)
UpdateModify node attributesLWW per attribute field
MoveChange node's parentLamport timestamp tiebreaker, cycle prevention

Operation Structure

%Grove.Op.Insert{
  id: "replica_1_43",
  type: :option,
  attrs: %{value: "US", label: "United States"},
  parent_id: "replica_1_42",
  position: 0,
  meta: %{user_id: "user_123"}
}

%Grove.Op.Move{
  id: "replica_1_43",
  new_parent_id: "replica_1_50",
  position: 2,
  old_parent_id: "replica_1_42",   # For undo
  old_position: 0,                  # For undo
  timestamp: 156                    # Lamport timestamp for conflict resolution
}

%Grove.Op.Update{
  id: "replica_1_42",
  attrs: %{label: "Select Country"},
  old_attrs: %{label: "Country"},   # For undo
  timestamp: 157
}

%Grove.Op.Delete{
  id: "replica_1_43",
  timestamp: 158
}

%Grove.Op.Batch{
  id: "batch_replica_1_159",
  ops: [op1, op2, op3],             # Atomic group
  timestamp: 159
}

Move Operation Semantics

The move operation is the most complex part. Grove implements the algorithm from Kleppmann et al. (2021).

Cycle Prevention

When concurrent moves would create a cycle, one move is rejected based on Lamport timestamps:

Before:          Concurrent moves:           After sync:
    A                A: move BC               A
                    B: move CB               
    B                                          B
                                              
    C                                          C
                                         (higher timestamp wins)

Move vs Delete Conflicts

When a node is moved to a destination that's concurrently deleted:

# Replica A: move X under Y
# Replica B: delete Y (concurrent)

# Result: X is moved to Y's former parent (orphan rescue)

Implementation

defmodule Grove.Op.Move do
  def apply(tree, %__MODULE__{} = op) do
    cond do
      # Target deleted? Rescue to its former parent
      deleted?(tree, op.new_parent_id) ->
        rescue_to_ancestor(tree, op)

      # Would create cycle?
      creates_cycle?(tree, op.id, op.new_parent_id) ->
        # Compare timestamps, lower timestamp wins (its move is undone)
        resolve_cycle_conflict(tree, op)

      true ->
        do_move(tree, op)
    end
  end

  defp creates_cycle?(tree, node_id, new_parent_id) do
    # Check if new_parent_id is a descendant of node_id
    ancestors = get_ancestors(tree, new_parent_id)
    node_id in ancestors
  end
end

Plain Data Conversion

Grove separates CRDT state from application data. You store plain JSON/maps and only use Grove for real-time sync.

Converting To/From Plain Data

# Load plain data, convert to CRDT tree
plain_ast = load_from_database(doc_id)
tree = Grove.from_data(plain_ast, replica_id: socket.id)

# After editing, convert back to plain data for storage
plain_ast = Grove.to_data(tree)
save_to_database(doc_id, plain_ast)

# Round-trip guarantee
assert Grove.to_data(Grove.from_data(data, replica_id: "a")) == data

What Gets Stripped

PreservedStripped
Node IDsVector clocks
Tree structureTombstones
Attribute valuesPending operations
Node typesUndo/redo stacks
Child orderingReplica metadata

Batch Operations

Composite operations that should be treated as a single unit for undo and broadcast.

# Adding an address field (4 nodes) as one atomic operation
tree = Grove.batch(tree, fn t ->
  t
  |> Grove.insert_node("addr", :group, %{label: "Address"}, parent: "form")
  |> Grove.insert_node("street", :text, %{label: "Street"}, parent: "addr")
  |> Grove.insert_node("city", :text, %{label: "City"}, parent: "addr")
  |> Grove.insert_node("zip", :text, %{label: "ZIP"}, parent: "addr")
end)

# Results in single compound operation
%Grove.Op.Batch{
  id: "batch_replica_1_100",
  ops: [insert_addr, insert_street, insert_city, insert_zip],
  timestamp: 100
}

# Single undo reverts all 4 inserts
{tree, ops} = Grove.undo(tree)

Attribute-Level CRDT Types

Different fields can use different conflict resolution strategies:

defmodule MyApp.FormSchema do
  use Grove.Schema

  node :dropdown do
    field :label, :lww_register       # Last-writer-wins (default)
    field :options, :rga_list         # Ordered list, concurrent inserts merge
    field :tags, :or_set              # Unordered set, add-wins
    field :config, :mv_register       # Multi-value, expose conflicts
  end
end

# Merge behavior per field type:
# - :lww_register → Higher timestamp wins
# - :rga_list → Interleaved merge preserving all items
# - :or_set → Union of all added items
# - :mv_register → Returns list of concurrent values

Module Structure

grove/
 lib/
    grove.ex                      # Main API facade
    grove/
        tree.ex                   # Tree state and core operations
        node.ex                   # Node struct
       
        op/                       # Operations
           insert.ex
           delete.ex
           update.ex
           move.ex
           batch.ex
       
        conflict/                 # Conflict resolution
           move_resolver.ex      # Cycle prevention
           lww.ex                # Last-write-wins
           mv.ex                 # Multi-value
       
        query/                    # Tree queries
           find.ex               # find_nodes, get_node
           traverse.ex           # path_to, ancestors, descendants
           filter.ex             # Query DSL
       
        undo/                     # Undo/redo support
           stack.ex
           inverse.ex            # Generate inverse operations
       
        sync/                     # Synchronization
           operations.ex         # Operation encoding/decoding
           merge.ex              # Apply remote operations
       
        data/                     # Plain data conversion
           from_data.ex
           to_data.ex
       
        schema/                   # Schema DSL
           dsl.ex
           types.ex              # Field type registry
       
        session/                  # Document sessions
           manager.ex            # GenServer per document
           registry.ex           # Session lookup
           presence.ex           # Cursor/selection tracking
       
        live_view/                # Phoenix integration
           hooks.ex              # on_mount hooks
           helpers.ex            # mount_tree, apply_and_broadcast
       
        gc/                       # Garbage collection
           tombstone.ex
       
        testing/                  # Test utilities
            generators.ex         # StreamData generators
            helpers.ex            # sync, assert_convergent

 test/
     properties/
         convergence_test.exs
         move_test.exs
         cycle_test.exs

Session Management

Each document gets a managed session that handles replication and lifecycle.

defmodule Grove.Session.Manager do
  use GenServer

  defstruct [
    :doc_id,
    :tree,
    :replicas,            # Connected LiveView PIDs
    :last_activity,
    :persist_timer,
    :grace_timer
  ]

  # Lifecycle
  def join(doc_id, replica_pid)   # Register replica, return current tree
  def leave(doc_id, replica_pid)  # Unregister replica
  def info(doc_id)                # Get session info

  # Operations
  def apply_op(doc_id, op)        # Apply and broadcast
  def get_state(doc_id)           # Get current tree

  # Automatic behaviors:
  # - Grace period before shutdown when no replicas
  # - Periodic persistence
  # - Operation buffering for debouncing
end

LiveView Integration

Mount Hooks

defmodule Grove.LiveView do
  def on_mount(:subscribe, _params, _session, socket) do
    if connected?(socket) do
      topic = "grove:doc:#{socket.assigns.doc_id}"
      Phoenix.PubSub.subscribe(Grove.PubSub, topic)
      Grove.Session.Manager.join(socket.assigns.doc_id, self())
    end
    {:cont, socket}
  end
end

Usage

defmodule MyAppWeb.EditorLive do
  use Phoenix.LiveView
  on_mount {Grove.LiveView, :subscribe}

  def mount(%{"id" => id}, _session, socket) do
    tree = Grove.Session.Manager.get_state(id)
    {:ok, assign(socket, doc_id: id, tree: tree)}
  end

  def handle_event("insert_node", params, socket) do
    tree = Grove.insert_node(socket.assigns.tree, ...)
    {tree, ops} = Grove.flush_operations(tree)

    Grove.Session.Manager.apply_op(socket.assigns.doc_id, ops)
    {:noreply, assign(socket, tree: tree)}
  end

  def handle_info({:grove_ops, ops}, socket) do
    tree = Grove.apply_operations(socket.assigns.tree, ops)
    {:noreply, assign(socket, tree: tree)}
  end
end

Presence Tracking

Built-in cursor and selection tracking for collaborative UIs:

# Track focus
tree = Grove.set_focus(tree, "node_123")

# Track selection
tree = Grove.set_selection(tree, ["node_123", "node_124"])

# Get all replica presence
Grove.get_presence(tree)
# => %{
#   "replica_a" => %{focus: "node_123", selection: [], color: "#4CAF50"},
#   "replica_b" => %{focus: "node_456", selection: ["node_456"], color: "#2196F3"}
# }

Query API

Find and traverse nodes in the tree:

# Find by ID
Grove.get_node(tree, "node_123")

# Find by type
Grove.find_nodes(tree, type: :dropdown)

# Find by attribute
Grove.find_nodes(tree, where: [required: true])

# Traversal
Grove.parent(tree, "node_123")
Grove.children(tree, "node_123")
Grove.siblings(tree, "node_123")
Grove.ancestors(tree, "node_123")
Grove.descendants(tree, "node_123")

# Path (for breadcrumbs)
Grove.path_to(tree, "node_123")
# => ["root", "section_1", "field_group", "node_123"]

Garbage Collection

Tombstones are collected when all replicas have observed the deletion:

# Configure GC
tree = Grove.new(
  replica_id: "node_1",
  gc: [
    min_tombstone_age: :timer.hours(24),
    gc_interval: 1000  # Run every N operations
  ]
)

# Manual GC with known replica states
tree = Grove.gc(tree, observed_by: ["node_1", "node_2", "node_3"])

GC Strategy

  1. Track observation: Each replica tracks its vector clock
  2. Exchange clocks: Periodically gossip vector clocks via :pg
  3. Compute stability: Find operations observed by ALL replicas
  4. Prune: Remove tombstones in stable causal past

Testing Utilities

defmodule MyApp.EditorTest do
  use ExUnit.Case
  import Grove.Testing

  test "concurrent edits converge" do
    # Create isolated replicas
    [tree_a, tree_b] = create_replicas(initial_data, count: 2)

    # Simulate concurrent edits
    tree_a = Grove.insert_node(tree_a, "n1", :text, %{v: 1}, parent: "root")
    tree_b = Grove.insert_node(tree_b, "n2", :text, %{v: 2}, parent: "root")

    # Sync and verify
    [tree_a, tree_b] = sync_all([tree_a, tree_b])
    assert_convergent([tree_a, tree_b])
    assert_no_cycles(tree_a)
  end

  test "move conflicts resolve deterministically" do
    [tree_a, tree_b] = create_replicas(data, count: 2)

    # Concurrent moves that would create cycle
    tree_a = Grove.move_node(tree_a, "X", new_parent: "Y")
    tree_b = Grove.move_node(tree_b, "Y", new_parent: "X")

    [tree_a, tree_b] = sync_all([tree_a, tree_b])
    assert_convergent([tree_a, tree_b])
    assert_no_cycles(tree_a)
  end
end

Performance

OperationTime ComplexitySpace Complexity
insert_nodeO(1) amortizedO(1)
delete_nodeO(1)O(1) tombstone
move_nodeO(log n)O(1)
update_nodeO(1)O(1)
get_nodeO(1)
find_nodesO(n)O(matches)
apply_operationsO(k)O(k)
path_toO(depth)O(depth)

Research Background

Grove implements algorithms from: