ALLM.Adapter behaviour (allm v0.3.0)


Non-streaming provider adapter contract. See spec §7.1.

Implementations take an ALLM.Request plus a keyword opts list (resolved via ALLM.Engine.resolve_params/2 and ALLM.Engine.resolve_tools/2 at the call site) and return either {:ok, %ALLM.Response{}} or {:error, %ALLM.Error.AdapterError{}}.

HTTP transport guidance

Use Req for non-streaming calls. Streaming belongs in ALLM.StreamAdapter and must be implemented directly on top of Finch over HTTP/1, not HTTP/2: Req's SSE path does not cover every provider's chunking quirks, and HTTP/2 flow control breaks for request bodies larger than 64 KB.

Invariants

  1. generate/2 is synchronous: it returns only after the HTTP response has been read in full.
  2. generate/2 never raises for HTTP-shaped failures. A 4xx/5xx response is converted to {:error, %AdapterError{status: status, reason: <atom>}}. Network failures (ECONNREFUSED, DNS, TLS) are converted to {:error, %AdapterError{reason: :network_error}}. Only programmer errors (invalid request shape reaching the adapter, which the validator should have caught) may raise.
  3. generate/2 must honor opts[:request_timeout] if provided. Exceeding the timeout produces {:error, %AdapterError{reason: :timeout}}.
  4. prepare_request/2 (optional) returns an unfired Req.Request configured exactly as generate/2 would fire it. Callers may mutate the returned request before firing.
  5. translate_options/2 (optional) takes the resolved opts keyword and the request, and returns a possibly-renamed keyword; providers use it to rename :max_tokens to :max_completion_tokens, etc. The default (when unimplemented) is identity: the caller uses function_exported?(adapter, :translate_options, 2) to decide whether to shim.
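
Invariants 2 and 3 can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `InvariantSketch`, its nested `Error` struct (a stand-in for `%ALLM.Error.AdapterError{}`), and the zero-arity `transport` function that stands in for the real HTTP call. A real adapter would pick the reason atom for 4xx/5xx responses from the reason table under generate/2; this sketch uses `:unknown` as a placeholder.

```elixir
defmodule InvariantSketch do
  # Stand-in for %ALLM.Error.AdapterError{}; only the two fields named in the
  # invariants are modelled here.
  defmodule Error do
    defstruct [:status, :reason]
  end

  # `transport` is a hypothetical zero-arity function standing in for the HTTP
  # call. It is assumed to return {:ok, %{status: integer, body: term}} or
  # {:error, term}, where :timeout marks an exceeded request timeout.
  def run(transport) when is_function(transport, 0) do
    case transport.() do
      {:ok, %{status: status, body: body}} when status in 200..299 ->
        {:ok, body}

      {:ok, %{status: status}} ->
        # HTTP-shaped failures become an error tuple, never an exception.
        {:error, %Error{status: status, reason: :unknown}}

      {:error, :timeout} ->
        {:error, %Error{reason: :timeout}}

      {:error, _other} ->
        # Connection refused, DNS and TLS failures all collapse to one reason.
        {:error, %Error{reason: :network_error}}
    end
  end
end
```

For example, `InvariantSketch.run(fn -> {:error, :econnrefused} end)` returns an error tuple whose struct has `reason: :network_error` and no status.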

Summary

Callbacks

generate(t, keyword)

Execute a request against the provider synchronously.

prepare_request(t, keyword)

Escape hatch: return a configured but unfired Req.Request that the caller can further customize (headers, retries, middleware) before firing.

translate_options(keyword, t)

Rename or reshape engine-level params into the provider's API dialect (e.g., OpenAI's newer endpoints require :max_completion_tokens instead of :max_tokens; Anthropic uses :system as a top-level field rather than a system-role message).

Callbacks

generate(t, keyword)

@callback generate(
  ALLM.Request.t(),
  keyword()
) :: {:ok, ALLM.Response.t()} | {:error, ALLM.Error.AdapterError.t()}

Execute a request against the provider synchronously.

Returns {:ok, %ALLM.Response{}} on success, or {:error, %ALLM.Error.AdapterError{}} on every failure shape.

Error reasons

| Reason | HTTP status | Fires when |
|---|---|---|
| :authentication_failed | 401 | API key missing or invalid. |
| :rate_limited | 429 | Provider quota exceeded; :retry_after_ms populated when Retry-After header is present. |
| :invalid_request | 400 | Request shape rejected by provider (unsupported param, schema violation). |
| :content_filter | 400 (provider-specific) | Provider's content filter rejected the prompt or response. |
| :context_length_exceeded | 400 | Request exceeded the model's context window. |
| :provider_unavailable | 500, 502, 503, 504, 529 | Provider server-side failure, retryable. |
| :timeout | | Request exceeded opts[:request_timeout]. |
| :network_error | | TCP/TLS/DNS failure. |
| :malformed_response | | Provider returned a 200 with an unparseable body. |
| :unsupported_feature | | Request combined features the adapter cannot express. |
| :unknown | any | Catch-all for shapes the adapter cannot classify; callers should treat as non-retryable. |
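
As a rough illustration of how an adapter might apply the status column above, the sketch below maps a status code to a reason atom. `ReasonSketch` and the string error codes used to tell the three 400 cases apart are hypothetical; each provider reports these distinctions differently, and the actual adapters may classify them another way.

```elixir
defmodule ReasonSketch do
  # Maps an HTTP status, plus an optional provider error code, to a reason
  # atom from the table. The string codes are invented for this sketch.
  def classify(status, provider_code \\ nil)

  def classify(401, _code), do: :authentication_failed
  def classify(429, _code), do: :rate_limited

  # Three reasons share status 400, so the body has to break the tie.
  def classify(400, "content_filter"), do: :content_filter
  def classify(400, "context_length_exceeded"), do: :context_length_exceeded
  def classify(400, _code), do: :invalid_request

  def classify(status, _code) when status in [500, 502, 503, 504, 529],
    do: :provider_unavailable

  # Anything else is the catch-all, which callers treat as non-retryable.
  def classify(_status, _code), do: :unknown
end
```

With these clauses, `ReasonSketch.classify(529)` gives `:provider_unavailable`, and `ReasonSketch.classify(400, "context_length_exceeded")` gives `:context_length_exceeded`.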

prepare_request(t, keyword)

(optional)
@callback prepare_request(
  ALLM.Request.t(),
  keyword()
) :: {:ok, Req.Request.t()} | {:error, ALLM.Error.AdapterError.t()}

Escape hatch: return a configured but unfired Req.Request that the caller can further customize (headers, retries, middleware) before firing.

Optional. When unimplemented, callers must dispatch to generate/2 directly.

The error branch mirrors generate/2; see that callback's reason table.

translate_options(keyword, t)

(optional)
@callback translate_options(
  keyword(),
  ALLM.Request.t()
) :: keyword()

Rename or reshape engine-level params into the provider's API dialect (e.g., OpenAI's newer endpoints require :max_completion_tokens instead of :max_tokens; Anthropic uses :system as a top-level field rather than a system-role message).

Optional. The default is identity — callers use function_exported?(adapter, :translate_options, 2) to decide whether to shim.
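
The caller-side shim described above might look like the following sketch. `TranslateSketch`, `RenamingAdapter`, and `PlainAdapter` are hypothetical names, and a plain map stands in for the `ALLM.Request` struct; only the `function_exported?/3` check and the `:max_tokens` rename come from the text.

```elixir
defmodule TranslateSketch do
  # Hypothetical adapter that implements the optional callback.
  defmodule RenamingAdapter do
    # Renames :max_tokens to :max_completion_tokens; other keys are untouched.
    def translate_options(opts, _request) do
      case Keyword.pop(opts, :max_tokens) do
        {nil, rest} -> rest
        {value, rest} -> Keyword.put(rest, :max_completion_tokens, value)
      end
    end
  end

  # Hypothetical adapter that leaves the callback unimplemented.
  defmodule PlainAdapter do
  end

  # Caller-side shim: identity unless the adapter exports translate_options/2.
  def translate(adapter, opts, request) do
    # function_exported?/3 does not load modules itself, so load first.
    Code.ensure_loaded(adapter)

    if function_exported?(adapter, :translate_options, 2) do
      adapter.translate_options(opts, request)
    else
      opts
    end
  end
end
```

Calling `TranslateSketch.translate(TranslateSketch.RenamingAdapter, [max_tokens: 256, temperature: 0.2], %{})` returns `[max_completion_tokens: 256, temperature: 0.2]`, while the same call with `TranslateSketch.PlainAdapter` returns the opts unchanged.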