# `ALLM.Providers.Fake`
[🔗](https://github.com/cykod/ALLM/blob/v0.3.0/lib/allm/providers/fake.ex#L1)

Deterministic, scripted adapter for testing. Implements both `ALLM.Adapter`
and `ALLM.StreamAdapter`. See spec §31, §7.1, §7.2, §8.

Layer B — runtime. Fake is the canonical testing adapter that every
orchestration phase (5, 6, 7, 8) tests against. It carries serializable
plain data in `adapter_opts` and passes both the Phase 3 adapter- and
stream-adapter conformance harnesses.

## What Fake is (and isn't)

Fake **ignores the `%ALLM.Request{}`** passed to `generate/2` and `stream/2`.
It never inspects the request's `:messages`, `:tools`, `:tool_choice`,
`:temperature`, `:max_tokens`, or any other field; the scripted response is
produced regardless of the request. This is intentional: Fake is for
testing orchestration, not provider-wire fidelity. Tool orchestration is
tested at Phase 6 (ToolRunner) and Phase 7 (Chat). There, Fake's scripts
emit `{:tool_call, _}` entries directly to simulate what the model would
return: the caller sets up an engine with `tools: [...]` and a
`{:tool_call, ...}` script, and the orchestrator dispatches to the tool
executor exactly as it would with a real provider.

## Script shapes

Fake accepts two disjoint script shapes on `adapter_opts`. See
`ALLM.Providers.Fake.Script` for the full tag-to-shape table.

### Spec §31 (user-facing)

    adapter_opts: [
      script: [
        {:text, "Hello "},
        {:text, "world"},
        {:finish, :stop}
      ]
    ]

Entry tags: `:text`, `:tool_call`, `:tool_call_delta`, `:usage`,
`:raw_chunk`, `:finish`, `:error` (2-tuple), `:delay`, `:sleep`
(deprecated alias of `:delay`).

### Phase 3 harness

    adapter_opts: [
      script: [{:ok, %{output_text: "hi"}}],
      stream_script: [[{:text_delta, "hel"}, {:text_delta, "lo"}, {:finish, :stop}]]
    ]

Entry tags (non-streaming, on `:script`): `{:ok, map}`, `{:error, reason_atom, keyword}`.
Entry tags (streaming, on `:stream_script`): `:text_delta`,
`:preflight_error`, `:error_event`, `:stream_error`, plus `:tool_call` and
`:finish`, which share their semantics with the §31 shape.

### Per-entry-point key precedence

| Entry point | Key precedence |
|-------------|----------------|
| `generate/2` | `:scripts` > `:script` (wrapped as `[script]`). `:stream_script` is not consulted; if only `:stream_script` is set, `generate/2` returns `{:error, %AdapterError{}}` with reason `:no_scripted_response`. |
| `stream/2`   | `:stream_script` > `:scripts` > `:script` (wrapped as `[script]`). |

Multi-call scripting uses `:scripts` / `:stream_script` as a list of lists —
each call consumes one inner list and advances the cursor.
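
The precedence rules can be sketched with plain keyword-list lookups. This is an illustrative sketch only: `PrecedenceSketch` and its `resolve/2` are hypothetical names, not part of ALLM, and the bare `:no_scripted_response` atom stands in for the `%AdapterError{}` the real adapter returns.

```elixir
defmodule PrecedenceSketch do
  # Hypothetical helper, not the library's code: returns the list of
  # per-call scripts chosen by the precedence table above.
  def resolve(:generate, adapter_opts) do
    pick(adapter_opts, [:scripts, :script])
  end

  def resolve(:stream, adapter_opts) do
    pick(adapter_opts, [:stream_script, :scripts, :script])
  end

  # Walk the keys in precedence order and take the first one that is set.
  defp pick(adapter_opts, keys) do
    Enum.find_value(keys, :no_scripted_response, fn key ->
      case {key, Keyword.get(adapter_opts, key)} do
        {_key, nil} -> nil
        # A single :script is wrapped so it looks like a one-call :scripts.
        {:script, script} -> [script]
        {_key, scripts} -> scripts
      end
    end)
  end
end

only_stream = [stream_script: [[{:text_delta, "hi"}, {:finish, :stop}]]]
both = [script: [{:text, "a"}, {:finish, :stop}], scripts: [[{:text, "b"}, {:finish, :stop}]]]

generate_only_stream = PrecedenceSketch.resolve(:generate, only_stream)
stream_only_stream = PrecedenceSketch.resolve(:stream, only_stream)
generate_both = PrecedenceSketch.resolve(:generate, both)
```

In this sketch `generate_only_stream` is `:no_scripted_response` because `:stream_script` is never consulted for `generate/2`, while `generate_both` picks the `:scripts` value over `:script`.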

### Disambiguation

For `generate/2`, both shapes share the `:script` key and are disambiguated
by leading entry tag. The `:error` tag disambiguates by `tuple_size/1`
(2 → §31, 3 → harness). For `stream/2`, distinct keys disambiguate.
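
As a rough illustration of the rule above, a classifier over the leading entry might look like the following. `ShapeSketch` is a hypothetical module and `:timeout` is an arbitrary example reason; the real logic lives in `ALLM.Providers.Fake.Script` and may differ.

```elixir
defmodule ShapeSketch do
  # Hypothetical classifier, not the library's code: decides which script
  # shape a :script value uses by looking at its leading entry.
  def classify([{:ok, _map} | _rest]), do: :harness

  # The :error tag appears in both shapes, so tuple size decides.
  def classify([entry | _rest])
      when is_tuple(entry) and elem(entry, 0) == :error and tuple_size(entry) == 2,
      do: :spec_31

  def classify([entry | _rest])
      when is_tuple(entry) and elem(entry, 0) == :error and tuple_size(entry) == 3,
      do: :harness

  # Any other leading tag (:text, :tool_call, :finish, ...) is treated as §31.
  def classify([entry | _rest]) when is_tuple(entry), do: :spec_31
end

spec_text = ShapeSketch.classify([{:text, "hi"}, {:finish, :stop}])
harness_ok = ShapeSketch.classify([{:ok, %{output_text: "hi"}}])
spec_error = ShapeSketch.classify([{:error, :timeout}])
harness_error = ShapeSketch.classify([{:error, :timeout, []}])
```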

## Cursor behaviour

Multi-call scripts (`:scripts` / `:stream_script`) advance a per-process
cursor on every call. By default the cursor lives in the process dictionary
at `{:allm_fake_cursor, :erlang.phash2(scripts)}` — isolated per ExUnit test
process (`async: true`), GC'd on pid-down, zero-setup for the common case.

**Two engines built with content-equal `:scripts` values in the same
process share a cursor.** This is a documented footgun: the cursor key is
`:erlang.phash2(scripts)`, so identical script contents collide. A test
that constructs two Fake engines simulating two distinct providers with the
same fixture script finds the second engine's first call already at index 1.
Workaround: pass distinct `adapter_opts[:script_cursor]` Agent pids,
obtained from `start_script_cursor/0`.

The explicit-Agent override also supports cross-process cursor sharing
(`Task.async/1` over the adapter call) and mitigates the rare hash-collision
case (`:erlang.phash2/1` is a 27-bit hash).
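
The shared-cursor footgun can be reproduced with the standard library alone. The sketch below assumes the cursor is a plain integer stored under the key described above; `CursorSketch` is a hypothetical stand-in, not Fake's implementation.

```elixir
defmodule CursorSketch do
  # Hypothetical stand-in for the default cursor: a counter in the process
  # dictionary, keyed by the hash of the script contents.
  def next_index(scripts) do
    key = {:allm_fake_cursor, :erlang.phash2(scripts)}
    index = Process.get(key, 0)
    Process.put(key, index + 1)
    index
  end
end

# Two "engines" with content-equal scripts, built in the same process.
scripts_a = [[{:text, "one"}, {:finish, :stop}], [{:text, "two"}, {:finish, :stop}]]
scripts_b = [[{:text, "one"}, {:finish, :stop}], [{:text, "two"}, {:finish, :stop}]]

first_a = CursorSketch.next_index(scripts_a)
first_b = CursorSketch.next_index(scripts_b)

# A different process has its own dictionary, so its cursor starts fresh.
in_task = Task.async(fn -> CursorSketch.next_index(scripts_a) end) |> Task.await()
```

Here `first_a` is `0` but `first_b` is already `1`, because both script values hash to the same key. `in_task` is `0`, which is the isolation that makes the default safe across test processes and also the reason it cannot be shared across them.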

## Testing patterns

**Use `start_script_cursor/0` for multi-call tests.** Reach for the explicit
Agent cursor in either of two cases: (a) the test runs multiple Fake calls
with a `:scripts` or `:stream_script` list and another test in the same
`async: true` module could share content-equal script entries, or (b) the
test dispatches the adapter call across processes (`Task.async/1`). The
default process-dict cursor is fine for one-shot `:script` calls and for
tests with a single multi-call script whose content is unique in the module;
for everything else, the explicit cursor is load-bearing:

    cursor = ALLM.Providers.Fake.start_script_cursor()
    opts = [adapter_opts: [scripts: [...], script_cursor: cursor]]

Worked examples: `test/allm/providers/fake_test.exs:143` and
`test/allm/providers/fake/fixtures_test.exs:66`.
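
To see why an Agent-backed cursor covers both cases, here is a minimal sketch built on `Agent` from the standard library. `AgentCursorSketch` and its functions are hypothetical; they only mimic what `start_script_cursor/0` and `cursor_index/1` are described as doing.

```elixir
defmodule AgentCursorSketch do
  # Hypothetical cursor: an Agent holding the next call index.
  def start do
    {:ok, pid} = Agent.start_link(fn -> 0 end)
    pid
  end

  # Return the index for this call and advance the cursor in one step.
  def advance(pid), do: Agent.get_and_update(pid, fn index -> {index, index + 1} end)

  def index(pid), do: Agent.get(pid, & &1)
end

cursor_a = AgentCursorSketch.start()
cursor_b = AgentCursorSketch.start()

# The cursor is addressed by pid, so a call made from another process
# advances the same counter.
from_task = Task.async(fn -> AgentCursorSketch.advance(cursor_a) end) |> Task.await()
from_caller = AgentCursorSketch.advance(cursor_a)

# A second cursor is independent even if the scripts are content-equal.
from_other = AgentCursorSketch.advance(cursor_b)
consumed_a = AgentCursorSketch.index(cursor_a)
```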

## Cleanup observation

When `adapter_opts[:cleanup_observer]` is a `:counters` ref (as created by
`:counters.new(1, [:atomics])`), the `Stream.resource/3` `after_fun`
increments index 1 on every normal termination path: a consumer
`Enum.take/2`, `Stream.take_while/2` returning false, `Stream.run/1` scope
exit, a throw from the reducer, or consumer process exit with a trappable
reason. The counter increments at most once per stream, not once per event.

Halt-safety test shape:

    ref = :counters.new(1, [:atomics])
    {:ok, stream} = ALLM.Providers.Fake.stream(req,
      adapter_opts: [script: [...], cleanup_observer: ref])
    _ = Enum.take(stream, 2)
    # :counters.get(ref, 1) == 1 within 500 ms

**Brutal-kill caveat.** `Stream.resource/3`'s `after_fun` does NOT run when
the consumer is killed with `Process.exit(pid, :kill)` — brutal exits skip
all cleanup by OTP design. Real provider adapters (Phase 10–11) address this
via Finch's own monitor-based connection cleanup; Fake has no HTTP ref to
leak so the caveat is purely documentary. Tests assert cleanup on normal
halts only; no test should simulate `:kill`.
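
The mechanism can be checked in isolation with `Stream.resource/3` and `:counters`, without Fake. The sketch below uses a hypothetical fixed event list in place of a script; only the `after_fun` behaviour mirrors the description above.

```elixir
# Hypothetical stand-in for Fake's stream: a fixed event list plus an
# after_fun that bumps index 1 of the observer counter.
build = fn ref ->
  Stream.resource(
    fn -> [:e1, :e2, :e3, :e4] end,
    fn
      [] -> {:halt, []}
      [event | rest] -> {[event], rest}
    end,
    fn _state -> :counters.add(ref, 1, 1) end
  )
end

# Early halt: the consumer takes two events and stops.
early_ref = :counters.new(1, [:atomics])
taken = Enum.take(build.(early_ref), 2)
early_count = :counters.get(early_ref, 1)

# Throw from the reducer: after_fun still runs before the throw propagates.
throw_ref = :counters.new(1, [:atomics])

caught =
  try do
    Enum.each(build.(throw_ref), fn _event -> throw(:boom) end)
  catch
    :boom -> :caught
  end

throw_count = :counters.get(throw_ref, 1)
```

Both counters end at `1`: cleanup runs exactly once per enumeration whether the consumer halts early or the reducer throws.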

## Adapter event vocabulary (streaming)

**Emitted** (spec §8 subset that belongs to an adapter):
`:message_started`, `:text_delta`, `:text_completed`, `:tool_call_started`,
`:tool_call_delta`, `:tool_call_completed`, `:message_completed`,
`:raw_chunk`, `:error`.

**Not emitted** (orchestrator-owned — Phase 6/7):
`:tool_execution_started`, `:tool_execution_completed`, `:tool_result_encoded`,
`:ask_user_requested`, `:tool_halt`, `:step_completed`, `:chat_completed`.

## Backpressure and delays

`{:delay, ms}` entries call `Process.sleep/1` inside the `next_fun` of the
`Stream.resource/3` — the delay blocks the consumer's reducing process, not
a simulated provider. `{:delay, _}` is **front-loaded**: the interpreter
sleeps before consuming the NEXT entry, so placing `{:delay, ms}` as the
FIRST entry delays `:message_started`. Timing tests measure the wall-clock
interval between the emit preceding the `{:delay, _}` entry and the emit
following it.
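
A timing test of this shape can be sketched with a hand-rolled interpreter. The code below is an assumption-laden illustration, not Fake's interpreter: it handles only hypothetical `{:text, _}` and `{:delay, _}` entries and stamps each emit with `System.monotonic_time/1`.

```elixir
script = [{:text, "a"}, {:delay, 50}, {:text, "b"}]

stream =
  Stream.resource(
    fn -> script end,
    fn
      [] ->
        {:halt, []}

      # The sleep runs in the consumer's process and yields no event.
      [{:delay, ms} | rest] ->
        Process.sleep(ms)
        {[], rest}

      # Stamp each emit so the gap across the delay can be measured.
      [{:text, text} | rest] ->
        {[{text, System.monotonic_time(:millisecond)}], rest}
    end,
    fn _state -> :ok end
  )

[{"a", t_before}, {"b", t_after}] = Enum.to_list(stream)
gap_ms = t_after - t_before
```

`gap_ms` should be roughly the scripted delay or more; assert against a lower bound with some slack rather than an exact value, since scheduler timing varies.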

`{:sleep, ms}` is a deprecated alias for `{:delay, ms}` — one `Logger.warning/1`
fires per BEAM lifetime when `{:sleep, _}` is used. Deletion target: v0.3.

## Examples

    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "hi"}])
    iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
    iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
    iex> {resp.output_text, resp.finish_reason}
    {"hi", :stop}

# `cursor_index`

```elixir
@spec cursor_index(pid()) :: non_neg_integer()
```

Read the current cursor index for an Agent-backed cursor. Used in tests to
assert how many calls have been consumed.

# `generate`

```elixir
@spec generate(
  ALLM.Request.t(),
  keyword()
) :: {:ok, ALLM.Response.t()} | {:error, ALLM.Error.AdapterError.t()}
```

Execute a scripted non-streaming request. See spec §7.1.

Reads the script from `opts[:adapter_opts]`, validates via
`ALLM.Providers.Fake.Script.validate!/1`, resolves the cursor, folds the
current call's entries into a `%ALLM.Response{}` via
`ALLM.Providers.Fake.Script.fold_to_response/1`, and advances the cursor.

Ignores the `%Request{}` entirely — the scripted response is produced
irrespective of the request's messages, tools, or params (see "What Fake is
(and isn't)" in the moduledoc).

Returns `{:ok, %ALLM.Response{}}` on a scripted success (including
harness-shape `{:ok, _}` entries), `{:error, %AdapterError{}}` on a scripted
failure or a script-exhausted cursor. `adapter_opts[:request_id]` is
propagated onto `response.request_id` verbatim.
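
The folding step can be pictured as a reduce over the current call's entries. This sketch is hypothetical: `FoldSketch` builds a plain map with only the two fields shown in the examples, whereas `fold_to_response/1` produces a full `%ALLM.Response{}` and handles more entry tags.

```elixir
defmodule FoldSketch do
  # Hypothetical fold, not the library's code: concatenates :text chunks
  # and records the :finish reason; other entries are skipped.
  def fold(entries) do
    Enum.reduce(entries, %{output_text: "", finish_reason: nil}, fn
      {:text, chunk}, acc -> %{acc | output_text: acc.output_text <> chunk}
      {:finish, reason}, acc -> %{acc | finish_reason: reason}
      _other, acc -> acc
    end)
  end
end

folded = FoldSketch.fold([{:text, "Hello "}, {:text, "world"}, {:finish, :stop}])
```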

## Examples

    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
    iex> opts = [adapter_opts: [script: [{:text, "hello"}, {:finish, :stop}]]]
    iex> {:ok, resp} = ALLM.Providers.Fake.generate(req, opts)
    iex> resp.output_text
    "hello"

# `script_exhausted_error`

```elixir
@spec script_exhausted_error() :: ALLM.Error.AdapterError.t()
```

Return the canonical `%AdapterError{}` for a script-exhausted call.

Spec §31 phrases this as `{:error, :no_scripted_response}`; the atom is
preserved as the struct's `:reason` field via the Phase 1 enum amendment.

## Examples

    iex> err = ALLM.Providers.Fake.script_exhausted_error()
    iex> err.reason
    :no_scripted_response
    iex> err.message
    "no scripted response"

# `start_script_cursor`

```elixir
@spec start_script_cursor() :: pid()
```

Start an Agent-backed script cursor for cross-process multi-call scripting
and for disambiguating content-equal scripts in the same process (see
moduledoc "Cursor behaviour").

Pass the returned pid as `adapter_opts[:script_cursor]`; subsequent calls
increment the cursor on the Agent rather than on the process dictionary.

## Examples

    iex> pid = ALLM.Providers.Fake.start_script_cursor()
    iex> ALLM.Providers.Fake.cursor_index(pid)
    0

# `stream`

```elixir
@spec stream(
  ALLM.Request.t(),
  keyword()
) :: {:ok, Enumerable.t()} | {:error, ALLM.Error.AdapterError.t()}
```

Open a scripted streaming request. See spec §7.2, §8.

Reads the script from `opts[:adapter_opts]`, validates via
`ALLM.Providers.Fake.Script.validate!/1`, resolves the cursor, and returns
a lazy `Enumerable.t()` of `ALLM.Event` values. No event fires until the
consumer reduces.

Key precedence: `:stream_script` > `:scripts` > `:script` (wrapped as
`[script]`). Empty opts return `{:error, script_exhausted_error()}` (no
stream opened).

When the first entry of the current call is a harness-shape
`{:preflight_error, reason, opts}`, `stream/2` returns synchronously with
`{:error, %AdapterError{}}` whose `:reason` is `reason` and whose remaining
fields come from `opts`; no stream is opened.

The returned stream emits `:message_started` on open, per-entry events via
`ALLM.Providers.Fake.Script.interpret/1`, and closes with
`:message_completed` (preceded by `:text_completed` if any `:text`/`:text_delta`
was emitted). `{:delay, ms}` / `{:sleep, ms}` entries call `Process.sleep/1`
and yield no events. `after_fun` increments `adapter_opts[:cleanup_observer]`
(a `:counters` ref, when present).

Ignores the `%Request{}` (see "What Fake is (and isn't)" in the moduledoc).

## Examples

    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
    iex> opts = [adapter_opts: [script: [{:text, "hi"}, {:finish, :stop}]]]
    iex> {:ok, stream} = ALLM.Providers.Fake.stream(req, opts)
    iex> events = Enum.to_list(stream)
    iex> Enum.any?(events, &match?({:text_delta, %{delta: "hi"}}, &1))
    true
