Lockstep.CausalSlice (Lockstep v0.1.0)

Copy Markdown View Source

Backward-slice a trace to the events causally relevant to a given failing step.

Lockstep traces tend to be 100s–1000s of events. Most are scaffolding — spawn, monitor, register, hello, ticks. The user wants to see the small subset that actually caused the bug.

Algorithm (Lanese & Vidal 2021, "Causal-Consistent Replay

Debugging for Message Passing Programs")

Given a trace and a failing step:

  1. Build a happens-before graph over trace events.
  2. From the failing event, walk backwards via happens-before edges, collecting every reachable step.
  3. Return the events in step order.

The slice is a conservative over-approximation: any event that could have caused the failing one is included. False positives (extra included events) are acceptable; false negatives (missing causes) are not.

Happens-before edges

  • Process order — events on the same actor (the pid that took the action) are ordered by step.
  • Spawn → Hello{:spawn, parent, child} precedes {:hello, child}.
  • Send → Recv{:send, A, B, m} precedes the matching {:recv, B, m} (FIFO between same A and B; we match the next available recv with the same message).
  • Send_after → Timer_fire — matching by timer ref.
  • Timer_fire → Recv — the timer message is delivered, and a subsequent {:recv, target, msg} consumes it.
  • Monitor → Down_delivered — matching by monitor ref.
  • Exit → Down_delivered — when the monitored pid dies.
  • Exit → Exit_signal — when a linked pid dies.
  • NIF same-resource — prior writes on {:ets, table}, {:atomics, slot}, or {:persistent_term, key} precede subsequent reads/writes on the same resource.

Example

events = Lockstep.Trace.events(...)
sliced = Lockstep.CausalSlice.slice(events, fail_step)
IO.puts(Lockstep.CausalSlice.format(events, sliced, fail_at: fail_step))

Typical reduction: a 200-event trace slices to 5–30 events.

Summary

Functions

Render a sliced trace as a human-readable schedule. The output looks like Lockstep.Trace.format/2 but with [N events elided] markers between non-consecutive included steps.

Slice the trace to events causally reachable backwards from fail_step. Returns a sub-list of trace, ordered by step.

Types

trace()

@type trace() :: [%{step: non_neg_integer(), event: tuple()}]

Functions

format(full_trace, sliced_trace, opts \\ [])

@spec format(trace(), trace(), keyword()) :: String.t()

Render a sliced trace as a human-readable schedule. The output looks like Lockstep.Trace.format/2 but with [N events elided] markers between non-consecutive included steps.

slice(trace, fail_step)

@spec slice(trace(), non_neg_integer()) :: trace()

Slice the trace to events causally reachable backwards from fail_step. Returns a sub-list of trace, ordered by step.