ALLM.Providers.Fake (allm v0.3.0)

Deterministic, scripted adapter for testing. Implements both ALLM.Adapter and ALLM.StreamAdapter. See spec §31, §7.1, §7.2, §8.

Layer B — runtime. Fake is the canonical testing adapter that every orchestration phase (5, 6, 7, 8) tests against. It carries serializable plain data in adapter_opts and passes both the Phase 3 adapter- and stream-adapter conformance harnesses.

What Fake is (and isn't)

Fake ignores the %ALLM.Request{} passed to generate/2 and stream/2. It never inspects the request's :messages, :tools, :tool_choice, :temperature, :max_tokens, or any other field; the scripted response is produced regardless of the request. This is intentional: Fake is for testing orchestration, not provider-wire fidelity.

Tool orchestration is tested at Phase 6 (ToolRunner) and Phase 7 (Chat). Fake's scripts emit {:tool_call, _} entries directly to simulate what the model would return: the caller sets up an engine with tools: [...] and a {:tool_call, ...} script, and the orchestrator dispatches to the tool executor exactly as it would with a real provider.

Script shapes

Fake accepts two disjoint script shapes on adapter_opts. See ALLM.Providers.Fake.Script for the full tag-to-shape table.

Spec §31 (user-facing)

adapter_opts: [
  script: [
    {:text, "Hello "},
    {:text, "world"},
    {:finish, :stop}
  ]
]

Entry tags: :text, :tool_call, :tool_call_delta, :usage, :raw_chunk, :finish, :error (2-tuple), :delay, :sleep (deprecated alias of :delay).

Phase 3 harness

adapter_opts: [
  script: [{:ok, %{output_text: "hi"}}],
  stream_script: [[{:text_delta, "hel"}, {:text_delta, "lo"}, {:finish, :stop}]]
]

Entry tags (non-streaming, on :script): {:ok, map}, {:error, reason_atom, keyword}.

Entry tags (streaming, on :stream_script): :text_delta, :preflight_error, :error_event, :stream_error, plus :tool_call and :finish, which share their semantics with the §31 shape.

Per-entry-point key precedence

Entry point	Key precedence
`generate/2`	`:scripts` > `:script` (wrapped as `[script]`). `:stream_script` is not consulted; if only `:stream_script` is set, `generate/2` returns `{:error, %AdapterError{reason: :no_scripted_response}}`.
`stream/2`	`:stream_script` > `:scripts` > `:script` (wrapped as `[script]`).

Multi-call scripting uses :scripts / :stream_script as a list of lists — each call consumes one inner list and advances the cursor.
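The precedence rules above can be modelled as a small lookup. This is an illustrative sketch only, not Fake's actual resolver: the module name PrecedenceSketch and its resolve/2 function are hypothetical, and the bare :no_scripted_response atom stands in for the error the adapter returns.

```elixir
defmodule PrecedenceSketch do
  # Hypothetical helper: returns {winning_key, list_of_per_call_scripts},
  # or :no_scripted_response when none of the consulted keys is set.
  def resolve(:generate, adapter_opts),
    do: first_present(adapter_opts, [:scripts, :script])

  def resolve(:stream, adapter_opts),
    do: first_present(adapter_opts, [:stream_script, :scripts, :script])

  defp first_present(adapter_opts, keys) do
    Enum.find_value(keys, :no_scripted_response, fn key ->
      case Keyword.fetch(adapter_opts, key) do
        # A single :script is wrapped so every shape is a list of lists.
        {:ok, script} when key == :script -> {key, [script]}
        {:ok, scripts} -> {key, scripts}
        :error -> nil
      end
    end)
  end
end

# generate/2 prefers :scripts and never consults :stream_script.
{:scripts, [[:a]]} = PrecedenceSketch.resolve(:generate, scripts: [[:a]], script: [:b])
:no_scripted_response = PrecedenceSketch.resolve(:generate, stream_script: [[:x]])

# stream/2 prefers :stream_script, then falls back.
{:stream_script, [[:x]]} = PrecedenceSketch.resolve(:stream, stream_script: [[:x]], scripts: [[:a]])
{:script, [[:b]]} = PrecedenceSketch.resolve(:stream, script: [:b])
```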

Disambiguation

For generate/2, both shapes share the :script key and are disambiguated by leading entry tag. The :error tag disambiguates by tuple_size/1 (2 → §31, 3 → harness). For stream/2, distinct keys disambiguate.
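A minimal sketch of leading-tag disambiguation, assuming the tag lists given under "Script shapes". The module ScriptShapeSketch, its classify/1 function, and the :timeout reason are hypothetical names used for illustration; the real logic lives in ALLM.Providers.Fake.Script and may differ.

```elixir
defmodule ScriptShapeSketch do
  # §31 tags other than :error, which needs the tuple_size/1 check below.
  @spec_31_tags [:text, :tool_call, :tool_call_delta, :usage, :raw_chunk, :finish, :delay, :sleep]

  # Classify a :script value by its first entry only.
  def classify([]), do: :empty
  def classify([first | _rest]), do: classify_entry(first)

  defp classify_entry({:ok, map}) when is_map(map), do: :harness

  defp classify_entry(entry)
       when is_tuple(entry) and tuple_size(entry) > 0 and elem(entry, 0) == :error do
    # Same tag in both shapes, so the arity of the tuple decides.
    case tuple_size(entry) do
      2 -> :spec_31
      3 -> :harness
      _ -> :unknown
    end
  end

  defp classify_entry({tag, _value}) when tag in @spec_31_tags, do: :spec_31
  defp classify_entry(_other), do: :unknown
end

:spec_31 = ScriptShapeSketch.classify([{:text, "hi"}, {:finish, :stop}])
:harness = ScriptShapeSketch.classify([{:ok, %{output_text: "hi"}}])
:spec_31 = ScriptShapeSketch.classify([{:error, :timeout}])
:harness = ScriptShapeSketch.classify([{:error, :timeout, [message: "slow"]}])
```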

Cursor behaviour

Multi-call scripts (:scripts / :stream_script) advance a per-process cursor on every call. By default the cursor lives in the process dictionary at {:allm_fake_cursor, :erlang.phash2(scripts)} — isolated per ExUnit test process (async: true), GC'd on pid-down, zero-setup for the common case.

Two engines built with content-equal :scripts values in the same process share a cursor. This is a documented footgun: the cursor key is :erlang.phash2(scripts), so identical script contents collide. A test that constructs two Fake engines simulating two distinct providers with the same fixture script finds the second engine's first call already at index 1. Workaround: pass distinct adapter_opts[:script_cursor] Agent pids, obtained from start_script_cursor/0.

The explicit-Agent override also supports cross-process cursor sharing (Task.async/1 over the adapter call) and mitigates the rare hash-collision case (:erlang.phash2/1 is a 27-bit hash).
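The shared-cursor footgun can be reproduced with the standard library alone. The sketch below assumes only the key format quoted above; CursorSketch and next_index/1 are hypothetical stand-ins, not Fake's internals.

```elixir
defmodule CursorSketch do
  # Read-then-bump a counter stored in the calling process's dictionary,
  # keyed by the hash of the script contents.
  def next_index(scripts) do
    key = {:allm_fake_cursor, :erlang.phash2(scripts)}
    index = Process.get(key, 0)
    Process.put(key, index + 1)
    index
  end
end

# Two separately built but content-equal script lists.
scripts_a = [[{:text, "a"}, {:finish, :stop}], [{:text, "b"}, {:finish, :stop}]]
scripts_b = [[{:text, "a"}, {:finish, :stop}], [{:text, "b"}, {:finish, :stop}]]

0 = CursorSketch.next_index(scripts_a)
# Same contents, same hash, same key: the "second engine" starts at index 1.
1 = CursorSketch.next_index(scripts_b)

# A different process has its own dictionary, so it starts from 0 again.
task = Task.async(fn -> CursorSketch.next_index(scripts_a) end)
0 = Task.await(task)
```

The last two lines also show why a process-dictionary cursor cannot be shared across Task.async/1 boundaries.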

Testing patterns

Use start_script_cursor/0 for multi-call tests. Reach for the explicit Agent cursor whenever (a) a test runs multiple Fake calls with a :scripts or :stream_script list and (b) another test in the same async: true module could use content-equal script entries, or whenever (c) the test dispatches the adapter call across processes (Task.async/1). The default process-dictionary cursor is fine for one-shot :script calls and for a lone multi-call test whose script content is unique in the module; for everything else, the explicit cursor is load-bearing:

cursor = ALLM.Providers.Fake.start_script_cursor()
opts = [adapter_opts: [scripts: [...], script_cursor: cursor]]

Worked examples: test/allm/providers/fake_test.exs:143 and test/allm/providers/fake/fixtures_test.exs:66.
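To see why an Agent-backed cursor fixes both problems, here is a standard-library sketch of the idea. AgentCursorSketch is a hypothetical module; it is not the implementation behind start_script_cursor/0 or cursor_index/1, only a model of a counter that lives in its own process.

```elixir
defmodule AgentCursorSketch do
  # The counter lives in a separate Agent process, so its identity is the
  # pid, not the hash of the script contents.
  def start, do: Agent.start_link(fn -> 0 end)

  # Atomically return the current index and advance it by one.
  def next_index(pid), do: Agent.get_and_update(pid, fn index -> {index, index + 1} end)

  def index(pid), do: Agent.get(pid, fn index -> index end)
end

{:ok, cursor_a} = AgentCursorSketch.start()
{:ok, cursor_b} = AgentCursorSketch.start()

# Distinct pids never collide, whatever scripts they are paired with.
0 = AgentCursorSketch.next_index(cursor_a)
0 = AgentCursorSketch.next_index(cursor_b)

# Another process can advance the same cursor.
task = Task.async(fn -> AgentCursorSketch.next_index(cursor_a) end)
1 = Task.await(task)
2 = AgentCursorSketch.index(cursor_a)
```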

Cleanup observation

When adapter_opts[:cleanup_observer] is a :counters ref (as created by :counters.new(1, [:atomics])), the Stream.resource/3 after_fun increments index 1 on every normal termination path: a consumer Enum.take/2, Stream.take_while/2 returning false, Stream.run/1 scope exit, a throw from the reducer, or consumer process exit with a trappable reason. The counter increments at most once per stream, not once per event.

Halt-safety test shape:

ref = :counters.new(1, [:atomics])
{:ok, stream} = ALLM.Providers.Fake.stream(req,
  adapter_opts: [script: [...], cleanup_observer: ref])
_ = Enum.take(stream, 2)
# :counters.get(ref, 1) == 1 within 500 ms

Brutal-kill caveat. Stream.resource/3's after_fun does NOT run when the consumer is killed with Process.exit(pid, :kill) — brutal exits skip all cleanup by OTP design. Real provider adapters (Phase 10–11) address this via Finch's own monitor-based connection cleanup; Fake has no HTTP ref to leak so the caveat is purely documentary. Tests assert cleanup on normal halts only; no test should simulate :kill.
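The after_fun mechanism this section relies on can be observed without Fake at all. The following sketch uses only Stream.resource/3 and :counters; the list of atoms is a placeholder for scripted events.

```elixir
ref = :counters.new(1, [:atomics])

stream =
  Stream.resource(
    # start_fun: the remaining "events" are the accumulator.
    fn -> [:a, :b, :c, :d] end,
    # next_fun: emit one element per step, halt when none remain.
    fn
      [] -> {:halt, []}
      [head | tail] -> {[head], tail}
    end,
    # after_fun: runs once when the enumeration ends or is halted early.
    fn _acc -> :counters.add(ref, 1, 1) end
  )

# The consumer stops after two elements; after_fun still runs.
[:a, :b] = Enum.take(stream, 2)
1 = :counters.get(ref, 1)
```

Because Enum.take/2 returns only after the stream has been halted, the counter can be read immediately here; the 500 ms allowance in the test shape above is a margin, not something this sketch needs.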

Adapter event vocabulary (streaming)

Emitted (spec §8 subset that belongs to an adapter): :message_started, :text_delta, :text_completed, :tool_call_started, :tool_call_delta, :tool_call_completed, :message_completed, :raw_chunk, :error.

Not emitted (orchestrator-owned — Phase 6/7): :tool_execution_started, :tool_execution_completed, :tool_result_encoded, :ask_user_requested, :tool_halt, :step_completed, :chat_completed.

Backpressure and delays

{:delay, ms} entries call Process.sleep/1 inside the next_fun of the Stream.resource/3 — the delay blocks the consumer's reducing process, not a simulated provider. {:delay, _} is front-loaded: the interpreter sleeps before consuming the NEXT entry, so placing {:delay, ms} as the FIRST entry delays :message_started. Timing tests measure the wall-clock interval between the emit preceding the {:delay, _} entry and the emit following it.

{:sleep, ms} is a deprecated alias for {:delay, ms} — one Logger.warning/1 fires per BEAM lifetime when {:sleep, _} is used. Deletion target: v0.3.
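A minimal sketch of a delay entry handled inside next_fun, assuming a simplified interpreter. DelaySketch is hypothetical and handles only :text and :delay entries; it omits :message_started and the other events the real stream emits.

```elixir
defmodule DelaySketch do
  def stream(script) do
    Stream.resource(fn -> script end, &next/1, fn _acc -> :ok end)
  end

  defp next([]), do: {:halt, []}

  # The sleep happens in whichever process is reducing the stream,
  # and the entry itself yields no events.
  defp next([{:delay, ms} | rest]) do
    Process.sleep(ms)
    {[], rest}
  end

  defp next([{:text, text} | rest]), do: {[{:text_delta, %{delta: text}}], rest}
end

script = [{:text, "a"}, {:delay, 50}, {:text, "b"}]

started = System.monotonic_time(:millisecond)
events = Enum.to_list(DelaySketch.stream(script))
elapsed = System.monotonic_time(:millisecond) - started

# The delay produced no event of its own, but the consumer waited for it.
[{:text_delta, %{delta: "a"}}, {:text_delta, %{delta: "b"}}] = events
true = elapsed >= 50
```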

Examples

iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "hi"}])
iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
iex> {resp.output_text, resp.finish_reason}
{"hi", :stop}

Functions

cursor_index(pid)

@spec cursor_index(pid()) :: non_neg_integer()

Read the current cursor index for an Agent-backed cursor. Used in tests to assert how many calls have been consumed.

generate(request, opts)

@spec generate(
  ALLM.Request.t(),
  keyword()
) :: {:ok, ALLM.Response.t()} | {:error, ALLM.Error.AdapterError.t()}

Execute a scripted non-streaming request. See spec §7.1.

Reads the script from opts[:adapter_opts], validates via ALLM.Providers.Fake.Script.validate!/1, resolves the cursor, folds the current call's entries into a %ALLM.Response{} via ALLM.Providers.Fake.Script.fold_to_response/1, and advances the cursor.

Ignores the %Request{} entirely — the scripted response is produced irrespective of the request's messages, tools, or params (see "What Fake is (and isn't)" in the moduledoc).

Returns {:ok, %ALLM.Response{}} on a scripted success (including harness-shape {:ok, _} entries), and {:error, %AdapterError{}} on a scripted failure or an exhausted script cursor. adapter_opts[:request_id] is propagated onto response.request_id verbatim.

Examples

iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> opts = [adapter_opts: [script: [{:text, "hello"}, {:finish, :stop}]]]
iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
iex> resp.output_text
"hello"

script_exhausted_error()

@spec script_exhausted_error() :: ALLM.Error.AdapterError.t()

Return the canonical %AdapterError{} for a script-exhausted call.

Spec §31 phrases this as {:error, :no_scripted_response}; the atom is preserved as the struct's :reason field via the Phase 1 enum amendment.

Examples

iex> err = ALLM.Providers.Fake.script_exhausted_error()
iex> err.reason
:no_scripted_response
iex> err.message
"no scripted response"

start_script_cursor()

@spec start_script_cursor() :: pid()

Start an Agent-backed script cursor for cross-process multi-call scripting and for disambiguating content-equal scripts in the same process (see moduledoc "Cursor behaviour").

Pass the returned pid as adapter_opts[:script_cursor]; subsequent calls increment the cursor on the Agent rather than on the process dictionary.

Examples

iex> pid = ALLM.Providers.Fake.start_script_cursor()
iex> ALLM.Providers.Fake.cursor_index(pid)
0

stream(request, opts)

@spec stream(
  ALLM.Request.t(),
  keyword()
) :: {:ok, Enumerable.t()} | {:error, ALLM.Error.AdapterError.t()}

Open a scripted streaming request. See spec §7.2, §8.

Reads the script from opts[:adapter_opts], validates via ALLM.Providers.Fake.Script.validate!/1, resolves the cursor, and returns a lazy Enumerable.t() of ALLM.Event values. No event fires until the consumer reduces.

Key precedence: :stream_script > :scripts > :script (wrapped as [script]). Empty opts return {:error, script_exhausted_error()} (no stream opened).

When the first entry of the current call is a harness-shape {:preflight_error, reason, opts}, returns synchronously as {:error, %AdapterError{reason: ^reason, ...opts}} — no stream is opened.

The returned stream emits :message_started on open, then per-entry events via ALLM.Providers.Fake.Script.interpret/1, and closes with :message_completed (preceded by :text_completed if any :text/:text_delta was emitted). {:delay, ms} / {:sleep, ms} entries call Process.sleep/1 and yield no events. When adapter_opts[:cleanup_observer] is present (a :counters ref), after_fun increments it.

Ignores the %Request{} (see "What Fake is (and isn't)" in the moduledoc).

Examples

iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
iex> {:ok, stream} = ALLM.Providers.Fake.stream(req, opts)
iex> events = Enum.to_list(stream)
iex> Enum.any?(events, &match?({:text_delta, %{delta: "hi"}}, &1))
true
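Tests often want the concatenated text rather than individual deltas. The sketch below folds a hand-written event list; only the {:text_delta, %{delta: _}} shape is taken from the example above, and the empty-map payloads on the other events are placeholders, not the real ALLM.Event structure.

```elixir
# Hand-written stand-in for Enum.to_list(stream); non-text payloads are assumed.
events = [
  {:message_started, %{}},
  {:text_delta, %{delta: "hel"}},
  {:text_delta, %{delta: "lo"}},
  {:text_completed, %{}},
  {:message_completed, %{}}
]

# Keep only text deltas and join them in arrival order.
text = for {:text_delta, %{delta: delta}} <- events, into: "", do: delta

"hello" = text
```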