Lockstep.LLMExplainer (Lockstep v0.1.0)


Optional Anthropic Claude integration: explains a Lockstep concurrency bug in plain English and suggests a fix.

Lockstep itself never depends on this module — it activates only when (a) Jason is loaded, (b) ANTHROPIC_API_KEY is set in the environment, and (c) the user explicitly calls Lockstep.LLMExplainer.explain/5 or explain_bug/3. With any of those missing, every entry point returns :skipped and the rest of the test suite is unaffected.

Why optional

Calling Claude takes 5–30 seconds and costs money. Lockstep's exception message (Lockstep.BugFound) must render synchronously for ExUnit, so we don't auto-trigger an LLM call there — that would block test output on every bug found. Instead, callers ask for an explanation when they want one, typically right after assert_raise Lockstep.BugFound.

Usage

bug =
  assert_raise Lockstep.BugFound, fn ->
    Lockstep.Runner.run(...)
  end

sources = [
  "/tmp/hammer/lib/hammer/atomic.ex",
  "/tmp/hammer/lib/hammer/atomic/fix_window.ex"
]

case Lockstep.LLMExplainer.explain_bug(bug, sources) do
  {:ok, explanation} -> IO.puts("\n" <> explanation)
  :skipped          -> :ok           # no key / no Jason / disabled
  {:error, reason}  -> IO.warn(inspect(reason))
end

Configuration

  • ANTHROPIC_API_KEY — required to activate. Set to your Anthropic API key.
  • LOCKSTEP_LLM_OFF=1 — disable the explainer even when an API key is set. Useful in CI where you don't want test failures to make API calls.
  • LOCKSTEP_LLM_MODEL — model ID override. Default claude-opus-4-7.
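
The activation rules above can be sketched as a single gate. This is an illustration only, not Lockstep's actual implementation: the module name `ExplainerGateSketch` and the function `check/2` are hypothetical, and treating only the exact value `"1"` of LOCKSTEP_LLM_OFF as "off" (and an empty API key as "unset") is an assumption.

```elixir
defmodule ExplainerGateSketch do
  # Hypothetical helper, not part of Lockstep: decides whether an
  # explanation attempt may proceed or should return :skipped.

  @spec check(%{optional(String.t()) => String.t()}, boolean()) ::
          {:ok, String.t()} | :skipped
  def check(env \\ System.get_env(), jason_loaded? \\ Code.ensure_loaded?(Jason)) do
    api_key = Map.get(env, "ANTHROPIC_API_KEY", "")

    cond do
      # Jason is an optional dependency; without it nothing is sent.
      not jason_loaded? -> :skipped
      # Assumption: only the literal value "1" disables the explainer.
      Map.get(env, "LOCKSTEP_LLM_OFF") == "1" -> :skipped
      # Assumption: an empty key counts as no key.
      api_key == "" -> :skipped
      true -> {:ok, api_key}
    end
  end
end
```

Passing the environment and the Jason check as arguments keeps the gate testable without touching the real process environment.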

What gets sent

Every call sends:

  • A short system prompt instructing the model to produce a 200–400 word explanation in three markdown sections.
  • The full content of every source file the caller provides.
  • The failure reason and the causal slice of the trace (from Lockstep.CausalSlice.slice/2) — typically 5–30 events instead of the raw 100–1000.
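
As a rough picture of how the last two items might be turned into prompt text, the sketch below renders each source file under a heading taken from its basename and places the failure details in a separate string. The module `PromptTextSketch`, the `## ` heading style, and the label wording are all assumptions; the real prompt layout may differ.

```elixir
defmodule PromptTextSketch do
  # Hypothetical formatting helpers; not Lockstep's actual prompt builder.

  @type source_file :: {Path.t(), String.t()}

  # One section per file, headed by the file's basename.
  @spec render_sources([source_file]) :: String.t()
  def render_sources(sources) do
    Enum.map_join(sources, "\n\n", fn {path, content} ->
      "## " <> Path.basename(path) <> "\n\n" <> content
    end)
  end

  # The per-call part: failure reason, failing step, and the sliced trace.
  @spec render_failure(String.t(), String.t(), non_neg_integer()) :: String.t()
  def render_failure(reason, sliced_trace_text, fail_step) do
    """
    Failure reason: #{reason}
    Failing step: #{fail_step}

    Causal slice of the trace:
    #{sliced_trace_text}
    """
  end
end
```

Keeping the source text and the failure text as separate strings matters for caching: only the part that stays identical across calls can be reused.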

Prompt caching

The system prompt and the source files are marked cache_control: ephemeral. Across multiple calls for the same set of source files (e.g., re-running the same test multiple times during a debugging session, or explaining several bugs in the same library), the source files are served from cache at ~10% of the input price. The trace varies per call and is not cached.

Note: Anthropic's minimum cacheable prefix is 4096 tokens on Opus 4.7. Smaller source sets won't actually cache; the markers are harmless and the prompt continues to work.
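
A minimal sketch of where the cache markers would sit in a Messages API request body, assuming the system prompt and the source files are sent as text content blocks. The module `CachedRequestSketch` and the `max_tokens` value are hypothetical, and the body Lockstep actually builds may be shaped differently.

```elixir
defmodule CachedRequestSketch do
  # Hypothetical request builder; returns a plain map that a JSON
  # encoder such as Jason could serialize. No HTTP call is made here.

  @ephemeral %{type: "ephemeral"}

  def build_body(model, system_prompt, sources_text, trace_text) do
    %{
      model: model,
      # Assumption: any limit large enough for a 200-400 word answer.
      max_tokens: 1024,
      # Stable across calls, so marked cacheable.
      system: [%{type: "text", text: system_prompt, cache_control: @ephemeral}],
      messages: [
        %{
          role: "user",
          content: [
            # Stable for a given set of source files, so marked cacheable.
            %{type: "text", text: sources_text, cache_control: @ephemeral},
            # Varies per call, so it carries no marker and comes last.
            %{type: "text", text: trace_text}
          ]
        }
      ]
    }
  end
end
```

The trace block is placed after the marked blocks because the cache covers a prefix of the request; anything that changes per call has to follow the cached portion.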

Types

result()

@type result() :: {:ok, String.t()} | :skipped | {:error, term()}

Result of an explanation attempt.

source_file()

@type source_file() :: {Path.t(), String.t()}

A source file presented to the model: {path, content}. The basename of path is used as the file's heading in the prompt.

Functions

explain(reason, sliced_text, fail_step, sources, opts \\ [])

@spec explain(
  failing_reason :: String.t(),
  sliced_trace_text :: String.t(),
  fail_step :: non_neg_integer(),
  source_files :: [source_file()],
  opts :: keyword()
) :: result()

Explain a bug given the failure reason, the (already-formatted) causal slice of the trace, the failing step, and the source files of the system under test.

Use explain_bug/3 for the common case where you have a Lockstep.BugFound exception in hand.

Options

  • :model — override the Claude model (default claude-opus-4-7, also overridable via LOCKSTEP_LLM_MODEL).
  • :timeout — HTTP timeout in milliseconds (default 60_000).
  • :effort — :low | :medium | :high | :max (default :medium). See Claude's effort docs for the cost/quality tradeoff.
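
One way the options and their defaults could be resolved is sketched below. The module `ExplainOptsSketch` is hypothetical, and two behaviors are assumptions not stated above: an explicit :model option taking precedence over LOCKSTEP_LLM_MODEL, and an unknown :effort value being rejected with an error tuple.

```elixir
defmodule ExplainOptsSketch do
  # Hypothetical option resolver; not part of Lockstep.

  @default_model "claude-opus-4-7"
  @efforts [:low, :medium, :high, :max]

  def resolve(opts, env \\ System.get_env()) do
    # Assumed precedence: explicit option, then environment, then default.
    model =
      Keyword.get(opts, :model) ||
        Map.get(env, "LOCKSTEP_LLM_MODEL") ||
        @default_model

    timeout = Keyword.get(opts, :timeout, 60_000)
    effort = Keyword.get(opts, :effort, :medium)

    if effort in @efforts do
      {:ok, %{model: model, timeout: timeout, effort: effort}}
    else
      # Assumption: invalid effort levels are reported rather than ignored.
      {:error, {:invalid_effort, effort}}
    end
  end
end
```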

explain_bug(bug, source_paths, opts \\ [])

@spec explain_bug(Lockstep.BugFound.t(), [Path.t()], keyword()) :: result()

Convenience wrapper: take a Lockstep.BugFound exception, slice the trace, read the listed source files, and call explain/5.

source_paths is a list of file paths to read from disk; missing files are silently dropped.
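
The "silently dropped" behavior could look roughly like the following. This is a sketch under assumptions: `SourceReaderSketch.read_sources/1` is a hypothetical name, and it drops any path that `File.read/1` cannot read (missing, unreadable, or a directory), which may be broader than what Lockstep does.

```elixir
defmodule SourceReaderSketch do
  # Hypothetical reader; turns paths into {path, content} tuples,
  # matching the shape of source_file().

  @spec read_sources([Path.t()]) :: [{Path.t(), String.t()}]
  def read_sources(paths) do
    Enum.flat_map(paths, fn path ->
      case File.read(path) do
        {:ok, content} -> [{path, content}]
        # Any read error drops the entry without raising or logging.
        {:error, _reason} -> []
      end
    end)
  end
end
```

A consequence worth keeping in mind: a typo in a path produces no error, only an explanation written with less context, so it can help to compare the length of the returned list against the input list while debugging.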