Deterministic, scripted adapter for testing. Implements both ALLM.Adapter
and ALLM.StreamAdapter. See spec §31, §7.1, §7.2, §8.
Layer B — runtime. Fake is the canonical testing adapter that every
orchestration phase (5, 6, 7, 8) tests against. It carries serializable
plain data in adapter_opts and passes both the Phase 3 adapter- and
stream-adapter conformance harnesses.
What Fake is (and isn't)
Fake ignores the %ALLM.Request{} passed to generate/2 and stream/2.
It never inspects the request's :messages, :tools, :tool_choice,
:temperature, :max_tokens, or any other field. The scripted response is
produced irrespective of the request. This is intentional: Fake is for
testing orchestration, not provider-wire fidelity. Tool orchestration is
tested at Phase 6 (ToolRunner) and Phase 7 (Chat). Fake's scripts emit
{:tool_call, _} entries directly to simulate what the model would return:
the caller sets up an engine with tools: [...] and a {:tool_call, ...}
script, and the orchestrator dispatches to the tool executor exactly as it
would with a real provider.
Script shapes
Fake accepts two disjoint script shapes on adapter_opts. See
ALLM.Providers.Fake.Script for the full tag-to-shape table.
Spec §31 (user-facing)
adapter_opts: [
  script: [
    {:text, "Hello "},
    {:text, "world"},
    {:finish, :stop}
  ]
]
Entry tags: :text, :tool_call, :tool_call_delta, :usage,
:raw_chunk, :finish, :error (2-tuple), :delay, :sleep
(deprecated alias of :delay).
Phase 3 harness
adapter_opts: [
  script: [{:ok, %{output_text: "hi"}}],
  stream_script: [[{:text_delta, "hel"}, {:text_delta, "lo"}, {:finish, :stop}]]
]
Entry tags (non-streaming, on :script): {:ok, map}, {:error, reason_atom, keyword}.
Entry tags (streaming, on :stream_script): :text_delta, :finish,
:preflight_error, :error_event, :stream_error, shared-semantics
:tool_call / :finish.
Per-entry-point key precedence
| Entry point | Key precedence |
|---|---|
| generate/2 | :scripts > :script (wrapped as [script]). :stream_script is not consulted; if only :stream_script is set, generate/2 returns the script-exhausted error (reason :no_scripted_response). |
| stream/2 | :stream_script > :scripts > :script (wrapped as [script]). |
Multi-call scripting uses :scripts / :stream_script as a list of lists —
each call consumes one inner list and advances the cursor.
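The precedence table can be sketched as a small lookup. This is an illustration only: PrecedenceSketch and resolve/2 are hypothetical names, not part of ALLM, and the bare :no_scripted_response atom stands in for the real error struct.

```elixir
defmodule PrecedenceSketch do
  # Hypothetical helper, not part of ALLM: returns the list of per-call
  # scripts an entry point would see, following the precedence table.
  def resolve(:generate, adapter_opts),
    do: first_present(adapter_opts, [:scripts, :script])

  def resolve(:stream, adapter_opts),
    do: first_present(adapter_opts, [:stream_script, :scripts, :script])

  # Walks the keys in priority order and returns the first one that is set.
  defp first_present(adapter_opts, keys) do
    Enum.find_value(keys, :no_scripted_response, fn key ->
      case Keyword.fetch(adapter_opts, key) do
        # A single :script is wrapped so every key yields a list of lists.
        {:ok, script} when key == :script -> [script]
        {:ok, scripts} -> scripts
        :error -> nil
      end
    end)
  end
end

opts = [script: [{:text, "a"}], stream_script: [[{:text_delta, "b"}]]]

IO.inspect(PrecedenceSketch.resolve(:generate, opts))
# [[text: "a"]]
IO.inspect(PrecedenceSketch.resolve(:stream, opts))
# [[text_delta: "b"]]
```

Normalising every key to a list of lists keeps the cursor logic uniform: one inner list per call, whichever key supplied it.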
Disambiguation
For generate/2, both shapes share the :script key and are disambiguated
by leading entry tag. The :error tag disambiguates by tuple_size/1
(2 → §31, 3 → harness). For stream/2, distinct keys disambiguate.
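A minimal sketch of the generate/2 disambiguation rule, assuming only the leading entry is inspected. ShapeSketch, its return atoms, and the :timeout reason are hypothetical; the real validation lives in ALLM.Providers.Fake.Script and may be stricter.

```elixir
defmodule ShapeSketch do
  # Hypothetical classifier, not ALLM's implementation. It inspects only the
  # leading entry of a :script list.

  # :error is the one tag both shapes use, so its tuple size decides.
  def shape([entry | _]) when is_tuple(entry) and elem(entry, 0) == :error do
    case tuple_size(entry) do
      2 -> :spec_31
      3 -> :harness
      _ -> :invalid
    end
  end

  # {:ok, map} only exists in the harness shape.
  def shape([{:ok, _response} | _]), do: :harness

  # Assumption: any other leading tag (:text, :finish, and so on) is §31.
  def shape([_entry | _]), do: :spec_31
end

IO.inspect(ShapeSketch.shape([{:text, "hi"}, {:finish, :stop}]))
# :spec_31
IO.inspect(ShapeSketch.shape([{:error, :timeout}]))
# :spec_31
IO.inspect(ShapeSketch.shape([{:error, :timeout, []}]))
# :harness
```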
Cursor behaviour
Multi-call scripts (:scripts / :stream_script) advance a per-process
cursor on every call. By default the cursor lives in the process dictionary
at {:allm_fake_cursor, :erlang.phash2(scripts)} — isolated per ExUnit test
process (async: true), GC'd on pid-down, zero-setup for the common case.
Two engines built with content-equal :scripts values in the same
process share a cursor. This is a documented footgun: the cursor key is
:erlang.phash2(scripts), so identical script contents collide. A test
that constructs two Fake engines simulating two distinct providers with the
same fixture script finds the second engine's first call already at index 1.
Workaround: pass distinct adapter_opts[:script_cursor] Agent pids,
obtained from start_script_cursor/0.
The explicit-Agent override also supports cross-process cursor sharing
(Task.async/1 over the adapter call) and mitigates the rare hash-collision
case (:erlang.phash2/1 is a 27-bit hash).
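The shared-cursor footgun is easy to reproduce outside ALLM. The sketch below assumes a process-dictionary cursor keyed by :erlang.phash2/1 of the script contents, as described above; PdictCursorSketch and the :sketch_cursor key are hypothetical stand-ins for Fake's internals.

```elixir
defmodule PdictCursorSketch do
  # Hypothetical stand-in for the default cursor: the key is derived from
  # the script contents only, so equal contents mean an equal key.
  def next_index(scripts) do
    key = {:sketch_cursor, :erlang.phash2(scripts)}
    index = Process.get(key, 0)
    Process.put(key, index + 1)
    index
  end
end

# Two "engines" built independently from the same fixture contents.
engine_a_scripts = [[{:text, "a"}, {:finish, :stop}], [{:text, "b"}, {:finish, :stop}]]
engine_b_scripts = [[{:text, "a"}, {:finish, :stop}], [{:text, "b"}, {:finish, :stop}]]

first_a = PdictCursorSketch.next_index(engine_a_scripts)
first_b = PdictCursorSketch.next_index(engine_b_scripts)

IO.inspect({first_a, first_b})
# {0, 1}
```

The second engine's first call lands on index 1 because nothing in the key distinguishes the two engines. An explicit cursor per engine removes that dependency on content.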
Testing patterns
Use start_script_cursor/0 for multi-call tests. Reach for the explicit
Agent cursor when either of the following holds: (a) the test runs multiple
Fake calls against a :scripts or :stream_script list and another test in the
same async: true module could use content-equal script entries, or (b) the
test dispatches the adapter call across processes (Task.async/1). The default
process-dict cursor is fine for one-shot :script calls and for a lone
multi-call test whose script content is unique in the module; for
everything else, the explicit cursor is load-bearing:
cursor = ALLM.Providers.Fake.start_script_cursor()
opts = [adapter_opts: [scripts: [...], script_cursor: cursor]]
Worked examples: test/allm/providers/fake_test.exs:143 and
test/allm/providers/fake/fixtures_test.exs:66.
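To show why an Agent-backed cursor works across processes, here is a minimal sketch under the assumption that the cursor is just an integer held in an Agent. AgentCursorSketch is hypothetical; start_script_cursor/0 may be implemented differently.

```elixir
defmodule AgentCursorSketch do
  # Hypothetical Agent-backed cursor. The state is the index of the next
  # script to consume.
  def start do
    {:ok, pid} = Agent.start_link(fn -> 0 end)
    pid
  end

  # Returns the current index and advances it in one Agent call.
  def next_index(pid), do: Agent.get_and_update(pid, fn i -> {i, i + 1} end)

  def index(pid), do: Agent.get(pid, & &1)
end

cursor_a = AgentCursorSketch.start()
cursor_b = AgentCursorSketch.start()

# The call made inside the task advances the same cursor the caller reads.
from_task = Task.await(Task.async(fn -> AgentCursorSketch.next_index(cursor_a) end))
from_caller = AgentCursorSketch.next_index(cursor_a)

IO.inspect({from_task, from_caller, AgentCursorSketch.index(cursor_b)})
# {0, 1, 0}
```

Because the state lives in the Agent rather than in the calling process, the task and the caller see one sequence, while cursor_b stays untouched even if both engines use identical scripts.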
Cleanup observation
When adapter_opts[:cleanup_observer] is a :counters ref (as created by
:counters.new(1, [:atomics])), the Stream.resource/3 after_fun
increments index 1 on every normal termination path: consumer
Enum.take/2, Stream.take_while/2 returning false, Stream.run/1 scope
exit, throws from the reducer, and consumer process exit with a trappable
reason. The counter increments at most once per stream, not once per event.
Halt-safety test shape:
ref = :counters.new(1, [:atomics])
{:ok, stream} = ALLM.Providers.Fake.stream(req,
  adapter_opts: [script: [...], cleanup_observer: ref])
_ = Enum.take(stream, 2)
# :counters.get(ref, 1) == 1 within 500 ms
Brutal-kill caveat. Stream.resource/3's after_fun does NOT run when
the consumer is killed with Process.exit(pid, :kill) — brutal exits skip
all cleanup by OTP design. Real provider adapters (Phase 10–11) address this
via Finch's own monitor-based connection cleanup; Fake has no HTTP ref to
leak so the caveat is purely documentary. Tests assert cleanup on normal
halts only; no test should simulate :kill.
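The halt-safety mechanism can be tried without ALLM. The sketch below assumes only what is described above: a Stream.resource/3 whose after_fun bumps index 1 of a :counters ref. CleanupSketch and its event list are hypothetical.

```elixir
defmodule CleanupSketch do
  # Hypothetical scripted stream: emits the given events one at a time and
  # bumps counter index 1 from after_fun when the stream terminates.
  def stream(events, counter_ref) do
    Stream.resource(
      fn -> events end,
      fn
        [] -> {:halt, []}
        [event | rest] -> {[event], rest}
      end,
      fn _remaining -> :counters.add(counter_ref, 1, 1) end
    )
  end
end

ref = :counters.new(1, [:atomics])
events = [:message_started, {:text_delta, "a"}, {:text_delta, "b"}, :message_completed]

# Early halt: the consumer stops after two events, and after_fun still runs.
taken = CleanupSketch.stream(events, ref) |> Enum.take(2)

IO.inspect({taken, :counters.get(ref, 1)})
# {[:message_started, {:text_delta, "a"}], 1}
```

Here the after_fun runs in the consumer's process as part of the halt, so the counter is already at 1 when Enum.take/2 returns.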
Adapter event vocabulary (streaming)
Emitted (spec §8 subset that belongs to an adapter):
:message_started, :text_delta, :text_completed, :tool_call_started,
:tool_call_delta, :tool_call_completed, :message_completed,
:raw_chunk, :error.
Not emitted (orchestrator-owned — Phase 6/7):
:tool_execution_started, :tool_execution_completed, :tool_result_encoded,
:ask_user_requested, :tool_halt, :step_completed, :chat_completed.
Backpressure and delays
{:delay, ms} entries call Process.sleep/1 inside the next_fun of the
Stream.resource/3 — the delay blocks the consumer's reducing process, not
a simulated provider. {:delay, _} is front-loaded: the interpreter
sleeps before consuming the NEXT entry, so placing {:delay, ms} as the
FIRST entry delays :message_started. Timing tests measure the wall-clock
interval between the emit preceding the {:delay, _} entry and the emit
following it.
{:sleep, ms} is a deprecated alias for {:delay, ms} — one Logger.warning/1
fires per BEAM lifetime when {:sleep, _} is used. Deletion target: v0.3.
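A minimal sketch of the delay behaviour, assuming a {:delay, ms} entry sleeps inside next_fun and yields no event. DelaySketch is hypothetical and omits the framing events that Fake emits.

```elixir
defmodule DelaySketch do
  # Hypothetical interpreter: {:delay, ms} sleeps inside next_fun and yields
  # no event; every other entry is emitted unchanged.
  def stream(script) do
    Stream.resource(
      fn -> script end,
      fn
        [] ->
          {:halt, []}

        [{:delay, ms} | rest] ->
          # Runs in the consumer's process, so the consumer itself blocks.
          Process.sleep(ms)
          {[], rest}

        [entry | rest] ->
          {[entry], rest}
      end,
      fn _remaining -> :ok end
    )
  end
end

started = System.monotonic_time(:millisecond)

# Stamp each event with the elapsed milliseconds at the moment it is reduced.
stamps =
  DelaySketch.stream([{:text, "a"}, {:delay, 50}, {:text, "b"}])
  |> Enum.map(fn event -> {event, System.monotonic_time(:millisecond) - started} end)

IO.inspect(stamps)
```

The second stamp trails the first by roughly the delay, which is the interval a timing test would measure.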
Examples
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "hi"}])
iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
iex> {resp.output_text, resp.finish_reason}
{"hi", :stop}
Functions
@spec cursor_index(pid()) :: non_neg_integer()
Read the current cursor index for an Agent-backed cursor. Used in tests to assert how many calls have been consumed.
@spec generate(ALLM.Request.t(), keyword()) :: {:ok, ALLM.Response.t()} | {:error, ALLM.Error.AdapterError.t()}
Execute a scripted non-streaming request. See spec §7.1.
Reads the script from opts[:adapter_opts], validates via
ALLM.Providers.Fake.Script.validate!/1, resolves the cursor, folds the
current call's entries into a %ALLM.Response{} via
ALLM.Providers.Fake.Script.fold_to_response/1, and advances the cursor.
Ignores the %Request{} entirely — the scripted response is produced
irrespective of the request's messages, tools, or params (see "What Fake is
(and isn't)" in the moduledoc).
Returns {:ok, %ALLM.Response{}} on a scripted success (including
harness-shape {:ok, _} entries) and {:error, %AdapterError{}} on a scripted
failure or a script-exhausted cursor. adapter_opts[:request_id] is
propagated onto response.request_id verbatim.
Examples
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> opts = [adapter_opts: [script: [{:text, "hello"}, {:finish, :stop}]]]
iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
iex> resp.output_text
"hello"
@spec script_exhausted_error() :: ALLM.Error.AdapterError.t()
Return the canonical %AdapterError{} for a script-exhausted call.
Spec §31 phrases this as {:error, :no_scripted_response}; the atom is
preserved as the struct's :reason field via the Phase 1 enum amendment.
Examples
iex> err = ALLM.Providers.Fake.script_exhausted_error()
iex> err.reason
:no_scripted_response
iex> err.message
"no scripted response"
@spec start_script_cursor() :: pid()
Start an Agent-backed script cursor for cross-process multi-call scripting and for disambiguating content-equal scripts in the same process (see moduledoc "Cursor behaviour").
Pass the returned pid as adapter_opts[:script_cursor]; subsequent calls
increment the cursor on the Agent rather than on the process dictionary.
Examples
iex> pid = ALLM.Providers.Fake.start_script_cursor()
iex> ALLM.Providers.Fake.cursor_index(pid)
0
@spec stream(ALLM.Request.t(), keyword()) :: {:ok, Enumerable.t()} | {:error, ALLM.Error.AdapterError.t()}
Open a scripted streaming request. See spec §7.2, §8.
Reads the script from opts[:adapter_opts], validates via
ALLM.Providers.Fake.Script.validate!/1, resolves the cursor, and returns
a lazy Enumerable.t() of ALLM.Event values. No event fires until the
consumer reduces.
Key precedence: :stream_script > :scripts > :script (wrapped as
[script]). Empty opts return {:error, script_exhausted_error()} (no
stream opened).
When the first entry of the current call is a harness-shape
{:preflight_error, reason, opts}, returns synchronously as
{:error, %AdapterError{reason: ^reason, ...opts}} — no stream is opened.
The returned stream emits :message_started on open, per-entry events via
ALLM.Providers.Fake.Script.interpret/1, and closes with
:message_completed (preceded by :text_completed if any :text/:text_delta
was emitted). {:delay, ms} / {:sleep, ms} entries call Process.sleep/1
and yield no events. The after_fun increments adapter_opts[:cleanup_observer]
(a :counters ref) when one is present.
Ignores the %Request{} (see "What Fake is (and isn't)" in the moduledoc).
Examples
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
iex> {:ok, stream} = ALLM.Providers.Fake.stream(req, opts)
iex> events = Enum.to_list(stream)
iex> Enum.any?(events, &match?({:text_delta, %{delta: "hi"}}, &1))
true