Jido Composer provides two test doubles for LLM-dependent code: ReqCassette for HTTP replay and LLMStub for deterministic strategy tests. Use cassettes for protocol realism, stubs for fast control-flow testing.

When to Use What

| Approach | When to use | Speed |
| --- | --- | --- |
| No double | Pure data structures (Machine transitions, context merge, AgentTool conversion) | Fastest |
| LLMStub | Deterministic strategy tests, directive-loop tests, unit tests | Fast |
| ReqCassette | HTTP boundaries, provider response shapes, integration/e2e paths | Fast (replay) |
| Live recording | Capturing new cassettes from real APIs | Slow |

Test Layers

| Layer | Scope | Test doubles |
| --- | --- | --- |
| Unit | Single module in isolation | None or LLMStub |
| Integration | Multi-module composition | LLMStub and/or cassette |
| End-to-End | Full stack through DSL | Cassette |

ReqCassette

ReqCassette records real HTTP responses and replays them in tests. It works as a Req plug, so there is no global mocking and tests stay async-safe.

Recording Workflow

  1. Delete existing cassette files (if re-recording)
  2. Run with RECORD_CASSETTES=true mix test
  3. Cassettes are saved to test/cassettes/
  4. Subsequent runs replay from saved cassettes

Usage

import ReqCassette

test "orchestrator handles query" do
  with_cassette("my_test", CassetteHelper.default_cassette_opts(), fn plug ->
    agent = MyOrchestrator.new()
    {:ok, _agent, answer} = MyOrchestrator.query_sync(
      agent,
      "test query",
      %{},
      req_options: [plug: plug]
    )
    assert answer =~ "expected"
  end)
end

Cassette Modes

| Mode | Behavior |
| --- | --- |
| :record | Record if missing, replay if present (development) |
| :replay | Replay only, error if missing (CI) |
| :bypass | Ignore cassettes, always hit network (debugging) |
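
How a mode gets chosen is not spelled out here. The sketch below shows one plausible mapping from environment variables to the three modes. Only RECORD_CASSETTES appears in this guide; the BYPASS_CASSETTES variable, the module name, and the fallback to :replay are assumptions made for illustration, not the behavior of CassetteHelper.

```elixir
defmodule CassetteModeSketch do
  # Hypothetical helper, not part of Jido Composer or ReqCassette.
  # Accepts the environment as a map so the mapping can be tested
  # without touching the real process environment.
  def mode(env \\ System.get_env()) do
    cond do
      # BYPASS_CASSETTES is an invented variable name (assumption).
      env["BYPASS_CASSETTES"] == "true" -> :bypass
      # RECORD_CASSETTES is the variable named in the recording workflow.
      env["RECORD_CASSETTES"] == "true" -> :record
      # Assumed default: replay only, which suits CI.
      true -> :replay
    end
  end
end
```

Defaulting to :replay means a missing cassette fails loudly in CI instead of silently hitting the network.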

Sensitive Data Filtering

Cassettes automatically filter secrets via CassetteHelper.default_cassette_opts/0:

  • Headers: authorization, x-api-key, cookie
  • Patterns: Anthropic keys (sk-ant-*), OpenAI keys (sk-*), Bearer tokens, JSON-embedded keys

Configuration is centralized in test/support/cassette_helper.ex.
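
To make the filtering concrete, here is a minimal sketch of header and pattern redaction in plain Elixir. It is not the code in cassette_helper.ex: the module name, the placeholder string, and the exact regexes are assumptions chosen to mirror the list above.

```elixir
defmodule RedactionSketch do
  # Header names taken from the list above; compared case-insensitively.
  @sensitive_headers ~w(authorization x-api-key cookie)
  @placeholder "<REDACTED>"

  # Illustrative patterns only; the real helper's regexes may differ.
  # The more specific sk-ant- pattern runs before the generic sk- one.
  defp patterns do
    [
      ~r/sk-ant-[A-Za-z0-9_-]+/,
      ~r/sk-[A-Za-z0-9_-]+/,
      ~r/Bearer\s+[A-Za-z0-9._~+-]+/
    ]
  end

  # Replaces the value of every sensitive header, keeping the name.
  def redact_headers(headers) do
    Enum.map(headers, fn {name, value} ->
      if String.downcase(name) in @sensitive_headers do
        {name, @placeholder}
      else
        {name, value}
      end
    end)
  end

  # Applies each pattern in turn to a request or response body.
  def redact_body(body) when is_binary(body) do
    Enum.reduce(patterns(), body, fn pattern, acc ->
      Regex.replace(pattern, acc, @placeholder)
    end)
  end
end
```

Running the patterns over the whole body is what catches keys embedded in JSON payloads, not just those in headers.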

Req Options Propagation

The plug flows through the full stack:

Test (with_cassette)
  -> Orchestrator DSL (req_options: [plug: plug])
  -> Strategy state
  -> LLMAction.run(req_options: ...)
  -> ReqLLM (req_http_options: ...)
  -> Req (plug: ...)
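
The chain above can be mimicked with plain functions to show how the option is renamed on the way down. Every function here is a stand-in with an invented name; only the option keys (req_options, req_http_options, plug) come from the diagram, and the real modules do far more than pass options along.

```elixir
defmodule PropagationSketch do
  # Stand-in for the DSL entry point: stores :req_options in strategy state.
  def query(prompt, opts) do
    state = %{prompt: prompt, req_options: Keyword.get(opts, :req_options, [])}
    llm_action(state)
  end

  # Stand-in for LLMAction: forwards the options under the ReqLLM key.
  defp llm_action(state) do
    req_llm(state.prompt, req_http_options: state.req_options)
  end

  # Stand-in for ReqLLM: hands the options to the HTTP client unchanged.
  defp req_llm(_prompt, opts) do
    http_client(Keyword.get(opts, :req_http_options, []))
  end

  # Stand-in for Req: reports which plug, if any, would serve the request.
  defp http_client(req_opts), do: Keyword.get(req_opts, :plug, :real_network)
end
```

If any layer drops the option, the request falls through to the real network, which is the usual symptom of a cassette that never replays.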

LLMStub

Queue predetermined LLM responses for tests that don't need HTTP.

Direct Mode

Direct mode stores the response queue in the process dictionary. Use it for strategy tests that manually drive directive loops:

alias Jido.Composer.TestSupport.LLMStub

LLMStub.setup([
  {:tool_calls, [%{id: "1", name: "add", arguments: %{"value" => 5, "amount" => 3}}]},
  {:final_answer, "The answer is 8"}
])

result = LLMStub.execute(params)  # pops from queue
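
For readers unfamiliar with the pattern, a process-dictionary queue can be sketched in a few lines. This is an illustration of the technique, not the LLMStub source: the module name, the dictionary key, and the empty-queue return value are assumptions.

```elixir
defmodule ProcessDictQueueSketch do
  # Key under which the queue lives in the calling process (assumed name).
  @key :llm_stub_queue_sketch

  # Stores the responses for the current process only.
  def setup(responses) when is_list(responses) do
    Process.put(@key, responses)
    :ok
  end

  # Returns the next queued response and removes it from the queue.
  def pop do
    case Process.get(@key, []) do
      [next | rest] ->
        Process.put(@key, rest)
        next

      [] ->
        {:error, :queue_empty}
    end
  end
end
```

Because the process dictionary is private to each process, concurrent tests cannot see each other's queues. The same property means the queue is invisible to any process other than the one that called setup.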

Plug Mode

Plug mode keeps the queue in an Agent and serves Anthropic-format JSON via Req.Test.stub. Use it for DSL query_sync tests that run through the full ReqLLM stack:

{Req.Test, stub_name} = LLMStub.setup_req_stub(:my_stub, [
  LLMStub.anthropic_tool_calls_response([
    %{id: "1", name: "add", input: %{"value" => 5, "amount" => 3}}
  ]),
  LLMStub.anthropic_text_response("The answer is 8")
])
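
The Agent-backed queue can be pictured as follows. This is a sketch of the general technique using only the standard library; the module name and the :empty sentinel are assumptions, and the real LLMStub additionally wires the queue into a Req.Test stub.

```elixir
defmodule AgentQueueSketch do
  # Starts an Agent whose state is the list of remaining responses.
  def start(responses) when is_list(responses) do
    Agent.start_link(fn -> responses end)
  end

  # Atomically takes the head of the queue; returns :empty when drained.
  def next(agent) do
    Agent.get_and_update(agent, fn
      [head | tail] -> {head, tail}
      [] -> {:empty, []}
    end)
  end
end
```

Unlike a process-dictionary queue, an Agent can be reached from any process that holds its pid, so the responses are still served in order even if the request is issued from a process other than the test.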

When to Use Each Mode

| Mode | Use when |
| --- | --- |
| Direct | Manually calling strategy functions, testing directive emission |
| Plug | Testing through query_sync/run_sync, need full Req stack |

Key Notes

Retry Handling

LLMAction retries once by default. When stubbing an error path, queue at least two error responses so that the retry fails as well:

LLMStub.setup([
  {:error, "rate limited"},   # first attempt fails
  {:error, "rate limited"},   # retry also fails
  {:final_answer, "done"}     # won't reach this
])
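
The interaction between a single retry and a response queue can be traced with a small stand-alone sketch. The retry function below is a generic retry-once loop written for illustration, not LLMAction's implementation, and the queue is a plain Agent standing in for the stub.

```elixir
defmodule RetrySketch do
  # Calls call_fun, retrying on {:error, _} while retries remain.
  def with_retry(call_fun, retries \\ 1) do
    case call_fun.() do
      {:error, _reason} when retries > 0 -> with_retry(call_fun, retries - 1)
      result -> result
    end
  end

  # Runs the three-response scenario and returns {result, unconsumed_queue}.
  def demo do
    {:ok, queue} =
      Agent.start_link(fn ->
        [{:error, "rate limited"}, {:error, "rate limited"}, {:ok, "done"}]
      end)

    # Each call consumes one queued response, as a stubbed LLM call would.
    pop = fn -> Agent.get_and_update(queue, fn [head | tail] -> {head, tail} end) end

    result = with_retry(pop)
    {result, Agent.get(queue, & &1)}
  end
end
```

With one retry, two queued errors are consumed and the third response is left untouched. Queueing only one error would let the retry pop the success response and mask the failure the test meant to exercise.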

Streaming Constraint

Streaming uses Finch directly and bypasses Req plugs, so it is incompatible with plug-based test doubles. When using cassettes or stubs, keep stream: false (the default).

Test Directory Structure

test/
  cassettes/              # Recorded HTTP responses
  support/
    test_actions.ex       # Shared test action modules
    test_agents.ex        # Shared test agent modules
    llm_stub.ex           # LLMStub module
    cassette_helper.ex    # Cassette configuration
  jido/composer/
    node_test.exs
    node/                 # Node-specific tests
    workflow/             # Workflow tests
    orchestrator/         # Orchestrator tests
  integration/            # Multi-module tests
  e2e/                    # Full-stack tests