ALLM.Providers.Anthropic (allm v0.3.0)


Anthropic provider adapter — Layer B. See spec §6.4, §7.1, §20, §32.1.

Phase 11.1 ships the non-streaming ALLM.Adapter callback set; Phase 11.2 adds the ALLM.StreamAdapter callbacks; Phase 11.3 adds structured-output tool-forcing for both arms. This module implements:

  • generate/2 — fires POST https://api.anthropic.com/v1/messages via Req, wrapped in ALLM.Retry.run/3 with the Anthropic-specific 529 Overloaded retryable status (Decision #2).
  • prepare_request/2 — returns an unfired %Req.Request{} with the API key injected as x-api-key and the API version pinned via anthropic-version: 2023-06-01 (Decision #9).
  • translate_options/2 — identity (Decision #7). Anthropic accepts :max_tokens natively across all model generations.
  • requires_structured_finalize?/1 — capability declaration consumed by ALLM.Capability.preflight/2. Always false because Anthropic uses tool-forcing (single-pass) for structured output rather than the OpenAI-style two-pass dance (Decision #13).

System-message extraction (Decision #1)

Anthropic's Messages API rejects {role: "system", ...} items inside messages:; the system prompt is a top-level system: parameter. extract_system/1 partitions system-role messages out of the thread and concatenates their content strings with \n\n. The non-system messages flow through unchanged.
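The partition-and-join rule can be restated as a minimal plain-Elixir sketch (plain maps stand in for %ALLM.Message{} structs; this illustrates the rule above, not the module's implementation):

```elixir
# Partition system-role messages out and join their contents with "\n\n".
extract = fn messages ->
  {system, rest} = Enum.split_with(messages, &(&1.role == :system))

  text =
    case system do
      [] -> nil
      _ -> system |> Enum.map(& &1.content) |> Enum.join("\n\n")
    end

  {text, rest}
end

extract.([
  %{role: :system, content: "be brief"},
  %{role: :system, content: "no lists"},
  %{role: :user, content: "hi"}
])
# => {"be brief\n\nno lists", [%{role: :user, content: "hi"}]}
```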

Tool-choice translation (Decision #3)

to_anthropic_tool_choice/1 returns a sentinel-tagged result so the request-builder can decide whether to emit the field at all:

| ALLM canonical | Returns | Wire effect |
|---|---|---|
| nil / :auto | {:omit} | field omitted (Anthropic defaults to "auto" when tools present) |
| :none | {:set, %{type: "none"}} | tool_choice: %{type: "none"} |
| :required | {:set, %{type: "any"}} | tool_choice: %{type: "any"} (Anthropic's wording) |
| "<name>" | {:set, %{type: "tool", name: "<name>"}} | passthrough |
| %{type: t, ...} (t in ~w(auto any none tool)) | {:set, m} | passthrough verbatim |

Note the rename :required → "any" — Anthropic uses different wording than OpenAI for the same semantic.

Stop-reason normalization (total per spec §5.5)

| Anthropic string | ALLM atom | Notes |
|---|---|---|
| "end_turn" | :stop | Natural completion. |
| "max_tokens" | :length | max_tokens reached. |
| "tool_use" | :tool_calls | Tool-use content blocks emitted. |
| "stop_sequence" | :stop | A stop_sequences: element matched. |
| "refusal" | :content_filter | Anthropic policy block. |
| "pause_turn" | :other | Long-running pause; raw preserved. |
| anything else | :other | raw_finish_reason carries the raw string. |
| nil | nil | Mid-stream message_delta pre-finish. |
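The table above, restated as a minimal plain-Elixir sketch (an illustration of the mapping, not the module's code):

```elixir
# Total mapping from Anthropic stop_reason values to ALLM atoms.
normalize_stop = fn
  nil             -> nil
  "end_turn"      -> :stop
  "max_tokens"    -> :length
  "tool_use"      -> :tool_calls
  "stop_sequence" -> :stop
  "refusal"       -> :content_filter
  "pause_turn"    -> :other
  _other          -> :other
end

normalize_stop.("stop_sequence")
# => :stop
```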

Retry contract (Decision #2)

generate/2 wraps the HTTP call in ALLM.Retry.run(opts[:retry] || :default, …). The closure adds 529 Overloaded (Anthropic-specific) to the retryable set on top of the spec §6.1 defaults [429, 500, 502, 503, 504, :timeout]. The Retry-After header is honored when present. Streaming never retries (spec §6.1).
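The effective retryable set, written out explicitly (a restatement of the sets named above, not the module's code):

```elixir
# Spec §6.1 defaults, plus Anthropic's 529 Overloaded (Decision #2).
default_retryable = [429, 500, 502, 503, 504, :timeout]
anthropic_retryable = [529 | default_retryable]

529 in anthropic_retryable
# => true
```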

Key resolution

Keys never appear on the engine. prepare_request/2 and generate/2 call ALLM.Keys.fetch!(:anthropic, opts) at request-build time per spec §6.4. Per Decision #9, prepare_request/2 raises %ALLM.Error.EngineError{reason: :missing_key} when no key resolver yields a value.

Structured output via tool-forcing (§5.4 + Decision #4)

When request.response_format == %{type: :json_schema, name: n, schema: s, strict: b}, to_anthropic_request_body/1 injects a synthetic tool %{"name" => "respond_with_json_<n>", "description" => "...", "input_schema" => s} into the wire body's tools: array (appending to any user tools) AND sets tool_choice: %{type: "tool", name: "respond_with_json_<n>"} to force the model to call it. The response decoder (from_anthropic_response/2) calls lift_structured_output/1, which detects the synthetic call by name prefix ("respond_with_json_"), replaces Response.output_text with Jason.encode!(tool_call.arguments), sets finish_reason: :stop, clears tool_calls: [], and stamps metadata.structured_output_tool == true for observability. The streaming arm (stream/2) wraps its inner enumerable in Stream.transform/3 so the same lift_structured_output/1 helper runs on the accumulated state at completion — both arms produce byte-identical %Response{} shapes (Decision #5b, invariant 14).

Streamed structured output — event shape

When response_format: %{type: :json_schema, ...} is set, the streaming wrapper emits :text_delta events for partial JSON and a final :text_completed event before the terminal :message_completed. The synthetic tool_use round-trip is hidden from the consumer; this matches OpenAI's native :json_schema streaming behavior so consumers can write provider-neutral structured-output streaming code (pattern-match :text_delta events to display JSON character-by-character).

Per Decision #5b: :tool_call_* events DO NOT fire on this path. The shared lift_structured_output/1 ensures the collected %Response{} is byte-identical with the non-streaming arm: output_text carries the JSON, finish_reason is :stop, tool_calls is empty, and metadata carries structured_output_tool: true (invariant 14).

requires_structured_finalize?/1 is false because tool-forcing is single-pass — the OpenAI-style two-pass structured_finalize dance is unnecessary (Decision #13).

Cross-provider byte-shape carve-out

output_text from Anthropic's structured-output path is Jason.encode!/1 of the parsed map — the bytes are re-encoded from a parsed map, so whitespace, key order, number formatting, and Unicode escape style may differ from OpenAI's :json_schema path (which preserves the model's literal output string). The semantic content is identical — Jason.decode!/1 of either yields the same Elixir map. Consumers that hash, diff, or store output_text as a canonical "the model said exactly this" record across providers should canonicalize via Jason.encode!/1 themselves.
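A consumer-side canonicalization sketch (assumes Jason, which this library already uses; the literal JSON strings are illustrative):

```elixir
# Byte-different but semantically identical output_text values.
anthropic_text = ~s({"name":"Alice","age":30})
openai_text    = ~s({ "age": 30, "name": "Alice" })

# Decode then re-encode to erase whitespace and key-order differences.
canonicalize = fn text -> text |> Jason.decode!() |> Jason.encode!() end

canonicalize.(anthropic_text) == canonicalize.(openai_text)
# => true
```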

Synthetic-tool-name collision

The synthetic tool's name is "respond_with_json_<schema_name>" — the schema name embeds the namespace marker. A collision is possible only when a user-defined tool carries exactly that name (e.g., a user-defined respond_with_json_person tool alongside response_format: %{type: :json_schema, name: "person", ...}). In that pathological case the body's tools: array contains both entries, and the response decoder's lift_structured_output/1 fires only when there is exactly one tool call whose name starts with the prefix; ambiguous multi-call responses surface unchanged (finish_reason: :tool_calls). Avoid the collision by renaming the user-defined tool.
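The lift guard described above, sketched in plain Elixir (plain maps stand in for %ALLM.ToolCall{} structs; illustration only):

```elixir
prefix = "respond_with_json_"

# Lift only when there is exactly one tool call and its name carries the prefix.
liftable? = fn
  [%{name: name}] -> String.starts_with?(name, prefix)
  _zero_or_many   -> false
end

liftable?.([%{name: "respond_with_json_person"}])
# => true
liftable?.([%{name: "respond_with_json_person"}, %{name: "get_weather"}])
# => false
```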

Multi-turn synthetic-tool de-injection

After the first turn where the synthetic tool fires, the assistant message carries the synthetic tool_use call and the next turn's thread carries a :tool message with tool_call_id matching the synthetic id. inject_structured_output_tool/2 detects this by scanning request.messages for a :tool message whose tool_call_id matches the synthetic prefix; when found, the synthetic injection is SKIPPED so user-defined tools remain callable on subsequent turns.

Vision input (Phase 17.2)

[%ALLM.TextPart{}, %ALLM.ImagePart{}] content lists translate to Anthropic's Messages-API content-block shape automatically. URL-source images flow through source: %{type: "url", url: u}; binary, base64, and file sources resolve to source: %{type: "base64", media_type: mime, data: ...}. ImagePart.detail is NOT supported by Anthropic and is dropped silently with a one-time Logger.debug/1 per process (Decision #3). System messages remain text-only — an %ImagePart{} in a system role is hard-rejected as %ValidationError{reason: :invalid_message} before any HTTP call. Per-image MIME / 20 MB size validation runs in pre-flight via ALLM.Providers.Support.ImageMime.
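The image-source translation can be sketched as follows (plain maps stand in for resolved image sources; the input field names are illustrative, while the wire shapes on the right of each clause come from the text above):

```elixir
# URL sources pass through; binary/base64/file sources become base64 blocks.
to_source = fn
  %{type: :url, url: u} ->
    %{"type" => "url", "url" => u}

  %{type: :base64, media_type: mime, data: data} ->
    %{"type" => "base64", "media_type" => mime, "data" => data}
end

to_source.(%{type: :url, url: "https://example.com/x.png"})
# => %{"type" => "url", "url" => "https://example.com/x.png"}
```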

Summary

Functions

Partition system-role messages out of messages. Returns {system_text_or_nil, non_system_messages} where system_text is the concatenation of all system-message contents joined with "\n\n" (Decision #1).

Decode an Anthropic Messages-API response body to canonical %Response{}.

Execute a non-streaming Messages-API request synchronously.

Inject the synthetic structured-output tool when request.response_format is %{type: :json_schema, ...} (Phase 11 design Decision #4).

Lift a synthetic structured-output tool call back to Response.output_text (Phase 11 design Decision #4).

Build an unfired %Req.Request{} with the resolved API key injected as x-api-key: <key> AND the API version pinned via anthropic-version: 2023-06-01 (Decision #9).

Capability declaration consumed by ALLM.Capability.preflight/2 (Decision #13).

Open a streaming Messages-API request against the Anthropic provider.

Map a list of canonical %Message{}s to Anthropic's wire shape.

Compose the JSON request body from a canonical %Request{}.

Translate an ALLM canonical tool_choice to Anthropic's sentinel-tagged wire shape per Decision #3.

Map a list of canonical %ALLM.Tool{}s to Anthropic's wire shape.

Identity translator — Anthropic accepts :max_tokens natively across all model generations (Decision #7). Reshape of system messages, tool_choice, and tools happens in the request-build helpers, not here.

Functions

extract_system(messages)

@spec extract_system([ALLM.Message.t()]) :: {String.t() | nil, [ALLM.Message.t()]}

Partition system-role messages out of messages. Returns {system_text_or_nil, non_system_messages} where system_text is the concatenation of all system-message contents joined with "\n\n" (Decision #1).

Examples

iex> ALLM.Providers.Anthropic.extract_system([%ALLM.Message{role: :user, content: "hi"}])
{nil, [%ALLM.Message{role: :user, content: "hi"}]}

iex> {sys, rest} = ALLM.Providers.Anthropic.extract_system([
...>   %ALLM.Message{role: :system, content: "be brief"},
...>   %ALLM.Message{role: :user, content: "hi"}
...> ])
iex> {sys, length(rest)}
{"be brief", 1}

from_anthropic_response(body, opts)

@spec from_anthropic_response(
  map(),
  keyword()
) :: ALLM.Response.t()

Decode an Anthropic Messages-API response body to canonical %Response{}.

Maps stop_reason per the table in the moduledoc; preserves the raw string on Response.raw_finish_reason for non-canonical values. Decodes tool_use content blocks to %ToolCall{} per Decision #6 — the input map maps to arguments, and raw_arguments is computed via Jason.encode!/1 for OpenAI parity.

Examples

iex> body = %{
...>   "id" => "msg_test",
...>   "model" => "claude-sonnet-4-6",
...>   "content" => [%{"type" => "text", "text" => "hi"}],
...>   "stop_reason" => "end_turn",
...>   "usage" => %{"input_tokens" => 5, "output_tokens" => 1}
...> }
iex> resp = ALLM.Providers.Anthropic.from_anthropic_response(body, [])
iex> {resp.output_text, resp.finish_reason, resp.usage.input_tokens}
{"hi", :stop, 5}

generate(request, opts)

Execute a non-streaming Messages-API request synchronously.

Wraps the HTTP call in ALLM.Retry.run/3. The closure adds 529 Overloaded (Anthropic-specific — Decision #2) to the spec §6.1 default retryable set [429, 500, 502, 503, 504, :timeout]. Returns {:ok, %Response{}} on 2xx success or {:error, %AdapterError{}} on every failure shape.

Vision input (Phase 17.2)

[%ALLM.TextPart{}, %ALLM.ImagePart{}] content lists translate to Anthropic's content-block wire shape automatically. URL-source images use source: %{type: "url", url: u}; base64/binary/file sources resolve to source: %{type: "base64", media_type: mime, data: ...}.

Note: ImagePart.detail is dropped

Anthropic's Messages API has no detail field. The translator drops the value silently and emits a single Logger.debug/1 per process the first time an ImagePart with detail: :auto | :low | :high flows through. The wire shape never carries detail (Decision #3).

System messages remain text-only — an %ImagePart{} in a system role is hard-rejected as %ValidationError{reason: :invalid_message} before any HTTP call. Per-image MIME / 20 MB size validation runs in pre-flight via ALLM.Providers.Support.ImageMime.

Examples

iex> ALLM.Keys.put(:anthropic, "sk-ant-doctest-gen")
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], model: "claude-sonnet-4-6")
iex> {:error, %ALLM.Error.AdapterError{reason: :authentication_failed}} =
...>   ALLM.Providers.Anthropic.generate(req,
...>     retry: false,
...>     adapter_opts: [plug: fn conn ->
...>       conn
...>       |> Plug.Conn.put_resp_content_type("application/json")
...>       |> Plug.Conn.resp(401, ~s({"type":"error","error":{"type":"authentication_error","message":"bad"}}))
...>     end]
...>   )
iex> ALLM.Keys.delete(:anthropic)
:ok

iex> # Vision pre-flight rejects an ImagePart in a system message.
iex> img = ALLM.Image.from_url("https://example.com/x.png")
iex> sys = %ALLM.Message{role: :system, content: [%ALLM.ImagePart{image: img}]}
iex> req = ALLM.Request.new([sys, %ALLM.Message{role: :user, content: "hi"}], model: "claude-sonnet-4-6")
iex> {:error, %ALLM.Error.ValidationError{reason: :invalid_message}} =
...>   ALLM.Providers.Anthropic.generate(req, api_key: "sk-x")
iex> :ok
:ok

inject_structured_output_tool(request, body)

@spec inject_structured_output_tool(ALLM.Request.t(), map()) :: map()

Inject the synthetic structured-output tool when request.response_format is %{type: :json_schema, ...} (Phase 11 design Decision #4).

Branches:

  • nil or %{type: :json_object} (or anything other than :json_schema) → returns body unchanged.
  • %{type: :json_schema, name: n, schema: s, strict: _} AND the request has NOT already produced a synthetic-tool result in a prior turn → injects a synthetic tool entry into body["tools"] (preserving any user tools — APPEND, not replace) AND sets body["tool_choice"] = %{type: "tool", name: "respond_with_json_<n>"} to force the model to call it.
  • %{type: :json_schema, ...} BUT a prior assistant turn already produced the synthetic tool's output (the request's messages contains a :tool message whose tool_call_id starts with the synthetic prefix "respond_with_json_") → returns body unchanged so user-defined tools remain callable on subsequent turns. See moduledoc "Multi-turn synthetic-tool de-injection".
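The first-turn injection branch, sketched in plain Elixir (the description string and the map key style are illustrative; this restates the branch above, not the module's code):

```elixir
prefix = "respond_with_json_"

inject = fn body, response_format ->
  case response_format do
    %{type: :json_schema, name: n, schema: s} ->
      tool = %{
        "name" => prefix <> n,
        # Illustrative description; the real one is adapter-defined.
        "description" => "Respond by calling this tool with JSON.",
        "input_schema" => s
      }

      body
      # Append to any user tools; never replace them.
      |> Map.update("tools", [tool], &(&1 ++ [tool]))
      |> Map.put("tool_choice", %{"type" => "tool", "name" => prefix <> n})

    _other ->
      body
  end
end

body = inject.(%{"model" => "claude-sonnet-4-6"},
               %{type: :json_schema, name: "person", schema: %{"type" => "object"}})
body["tool_choice"]
# => %{"type" => "tool", "name" => "respond_with_json_person"}
```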

Examples

iex> body = %{"model" => "claude-sonnet-4-6"}
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> ALLM.Providers.Anthropic.inject_structured_output_tool(req, body)
%{"model" => "claude-sonnet-4-6"}

iex> rf = %{type: :json_schema, name: "person", schema: %{"type" => "object"}, strict: true}
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], response_format: rf)
iex> body = ALLM.Providers.Anthropic.inject_structured_output_tool(req, %{"tools" => []})
iex> body["tool_choice"]
%{type: "tool", name: "respond_with_json_person"}

lift_structured_output(response)

@spec lift_structured_output(ALLM.Response.t()) :: ALLM.Response.t()

Lift a synthetic structured-output tool call back to Response.output_text (Phase 11 design Decision #4).

When the response's tool_calls list has exactly one entry whose name starts with @structured_output_tool_prefix ("respond_with_json_"):

  • output_text becomes Jason.encode!(tool_call.arguments) (the parsed input map; per Decision #6, arguments already carries the parsed map and raw_arguments carries the JSON string).
  • finish_reason is set to :stop (NOT :tool_calls).
  • tool_calls is cleared to [] — the synthetic call is consumed.
  • metadata.structured_output_tool is set to true for observability.
  • The assistant message is rewritten so its content carries the JSON-encoded text and its metadata.tool_calls is dropped.

In every other shape (zero tool calls, multiple tool calls, single non-synthetic tool call) the response is returned unchanged.

Examples

iex> resp = %ALLM.Response{output_text: "hi", finish_reason: :stop}
iex> ALLM.Providers.Anthropic.lift_structured_output(resp).output_text
"hi"

iex> tc = %ALLM.ToolCall{id: "toolu_x", name: "respond_with_json_person",
...>                     arguments: %{"name" => "Alice"}, raw_arguments: ~s({"name":"Alice"})}
iex> resp = %ALLM.Response{tool_calls: [tc], finish_reason: :tool_calls,
...>                       message: %ALLM.Message{role: :assistant, content: ""}}
iex> lifted = ALLM.Providers.Anthropic.lift_structured_output(resp)
iex> {Jason.decode!(lifted.output_text), lifted.finish_reason, lifted.tool_calls}
{%{"name" => "Alice"}, :stop, []}

prepare_request(request, opts)

@spec prepare_request(
  ALLM.Request.t(),
  keyword()
) :: {:ok, Req.Request.t()} | {:error, ALLM.Error.AdapterError.t()}

Build an unfired %Req.Request{} with the resolved API key injected as x-api-key: <key> AND the API version pinned via anthropic-version: 2023-06-01 (Decision #9).

Per Decision #9: this function raises %ALLM.Error.EngineError{reason: :missing_key} when no key resolver yields a value (via ALLM.Keys.fetch!/2).

Examples

iex> ALLM.Keys.put(:anthropic, "sk-ant-doctest-prep")
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "hi"}], model: "claude-sonnet-4-6")
iex> {:ok, %Req.Request{} = http} = ALLM.Providers.Anthropic.prepare_request(req, [])
iex> {Req.Request.get_header(http, "x-api-key"), Req.Request.get_header(http, "anthropic-version"), http.url.path}
{["sk-ant-doctest-prep"], ["2023-06-01"], "/v1/messages"}
iex> ALLM.Keys.delete(:anthropic)
:ok

requires_structured_finalize?(request)

@spec requires_structured_finalize?(ALLM.Request.t()) :: false

Capability declaration consumed by ALLM.Capability.preflight/2 (Decision #13).

Always returns false. Anthropic's tool-forcing pattern (Phase 11.3) is single-pass — the OpenAI-style two-pass structured_finalize dance is unnecessary.

Examples

iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}])
iex> ALLM.Providers.Anthropic.requires_structured_finalize?(req)
false

stream(request, opts)

Open a streaming Messages-API request against the Anthropic provider.

Returns {:ok, lazy_stream} on success — no HTTP call fires until the consumer reduces over the stream. Pre-flight failures (missing key, invalid request shape, request-build raises) surface as {:error, %AdapterError{}} synchronously.

Per CLAUDE.md and spec §10.1, mid-stream failures (HTTP 4xx/5xx after Finch successfully retrieves headers, transport drops, malformed events) emit a terminal {:error, _} event INSIDE the stream — the call-site tuple stays {:ok, stream} and ALLM.StreamCollector folds the error into Response.finish_reason: :error. Streaming never retries (spec §6.1 + Phase 11 design Decision #14).

Anthropic SSE event mapping (Decision #14)

Anthropic uses NAMED SSE events (event: message_start\ndata: {...}). The ALLM.Providers.Support.SSE decoder carries the event: field through verbatim so this adapter switches on sse_msg.event:

| Anthropic event | ALLM events emitted |
|---|---|
| message_start | :message_started |
| content_block_start (text) | none — wait for text_delta |
| content_block_start (tool_use) | :tool_call_started |
| content_block_start (thinking) | {:raw_chunk, {:thinking_start, _}} (Decision #8) |
| content_block_delta (text_delta) | :text_delta |
| content_block_delta (input_json_delta) | :tool_call_delta |
| content_block_delta (thinking_delta) | {:raw_chunk, {:thinking_delta, _}} (Decision #8) |
| content_block_stop (text) | :text_completed |
| content_block_stop (tool_use) | :tool_call_completed (parsed args) |
| message_delta | {:raw_chunk, {:usage, _}} if usage present; stores stop_reason |
| message_stop | synthetic :message_completed |
| ping | dropped silently |
| unknown | {:raw_chunk, {:unknown_event, name, data}} (forward-compat) |

Options

  • :stream_timeout — milliseconds to wait between consecutive Finch messages (default 60000). Exceeding emits a terminal {:error, %AdapterError{reason: :timeout}} event.
  • :finch_module — overrides Finch (test seam — see ALLM.Test.FinchStub).
  • :finch_name — name of the Finch supervisor child (default ALLM.Finch, started by ALLM.Application with protocol: :http1).

Examples

iex> ALLM.Keys.put(:anthropic, "sk-ant-doctest-stream")
iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], model: "claude-sonnet-4-6")
iex> {:ok, stream} = ALLM.Providers.Anthropic.stream(req, [])
iex> Enumerable.impl_for(stream) != nil
true
iex> ALLM.Keys.delete(:anthropic)
:ok

to_anthropic_messages(messages)

@spec to_anthropic_messages([ALLM.Message.t()]) :: [map()]

Map a list of canonical %Message{}s to Anthropic's wire shape.

System messages must already be filtered out by extract_system/1; passing a system-role message here is a programmer error and is silently coerced to a "user" role for safety. Tool-result messages encode as {role: "user", content: [{type: "tool_result", tool_use_id, content}]} per Anthropic's documented round-trip shape.

Examples

iex> ALLM.Providers.Anthropic.to_anthropic_messages([
...>   %ALLM.Message{role: :user, content: "hi"}
...> ])
[%{"role" => "user", "content" => "hi"}]

to_anthropic_request_body(request)

@spec to_anthropic_request_body(ALLM.Request.t()) :: map()

Compose the JSON request body from a canonical %Request{}.

Performs system-message extraction (Decision #1), message/tool/tool_choice translation, and structured-output synthetic-tool injection (Decision #4 — see inject_structured_output_tool/2).

Examples

iex> req = ALLM.Request.new(
...>   [%ALLM.Message{role: :system, content: "Be concise."},
...>    %ALLM.Message{role: :user, content: "Hi"}],
...>   model: "claude-sonnet-4-6", max_tokens: 256
...> )
iex> body = ALLM.Providers.Anthropic.to_anthropic_request_body(req)
iex> {body["model"], body["system"], length(body["messages"])}
{"claude-sonnet-4-6", "Be concise.", 1}

to_anthropic_tool_choice(name)

@spec to_anthropic_tool_choice(ALLM.Request.tool_choice()) :: {:omit} | {:set, map()}

Translate an ALLM canonical tool_choice to Anthropic's sentinel-tagged wire shape per Decision #3.

Returns {:omit} to skip the field entirely, or {:set, map} to inject it.

| ALLM canonical | Returns |
|---|---|
| nil / :auto | {:omit} |
| :none | {:set, %{type: "none"}} |
| :required | {:set, %{type: "any"}} (Anthropic's wording) |
| "<name>" (string) | {:set, %{type: "tool", name: "<name>"}} |
| %{type: t, ...} where t in ~w(auto any none tool) | {:set, m} (passthrough) |

Raises ArgumentError on any other shape.

Examples

iex> ALLM.Providers.Anthropic.to_anthropic_tool_choice(nil)
{:omit}

iex> ALLM.Providers.Anthropic.to_anthropic_tool_choice(:auto)
{:omit}

iex> ALLM.Providers.Anthropic.to_anthropic_tool_choice(:none)
{:set, %{"type" => "none"}}

iex> ALLM.Providers.Anthropic.to_anthropic_tool_choice(:required)
{:set, %{"type" => "any"}}

iex> ALLM.Providers.Anthropic.to_anthropic_tool_choice("get_weather")
{:set, %{"type" => "tool", "name" => "get_weather"}}

to_anthropic_tools(tools)

@spec to_anthropic_tools([ALLM.Tool.t()]) :: [map()]

Map a list of canonical %ALLM.Tool{}s to Anthropic's wire shape.

Anthropic uses input_schema (not parameters) as the JSON-Schema field name.

Examples

iex> tool = ALLM.Tool.new(name: "get_weather", description: "weather", schema: %{"type" => "object"})
iex> ALLM.Providers.Anthropic.to_anthropic_tools([tool])
[%{"name" => "get_weather", "description" => "weather", "input_schema" => %{"type" => "object"}}]

translate_options(opts, request)

@spec translate_options(
  keyword(),
  ALLM.Request.t()
) :: keyword()

Identity translator — Anthropic accepts :max_tokens natively across all model generations (Decision #7). Reshape of system messages, tool_choice, and tools happens in the request-build helpers, not here.

Examples

iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], model: "claude-sonnet-4-6")
iex> ALLM.Providers.Anthropic.translate_options([max_tokens: 100, temperature: 0.7], req)
[max_tokens: 100, temperature: 0.7]