OpenAI Images provider adapter — implements ALLM.ImageAdapter against
OpenAI's /v1/images/generations, /v1/images/edits, and
/v1/images/variations endpoints. See spec §35.7.
Layer B — runtime. Constructed via
ALLM.Engine.new(image_adapter: ALLM.Providers.OpenAI.Images, model: "dall-e-2")
and consumed through the ALLM.generate_image/3 · edit_image/4 · image_variations/3 façade (Phase 14.2). Keys resolve via
ALLM.Keys.fetch!(:openai, opts) at request-build time per spec §6.4 —
no key ever lives on the engine.
Status
The JSON :generate HTTP path is wired for dall-e-2, dall-e-3, and
gpt-image-1. The gpt-image-1 path applies forced-base64 normalization
(gpt-image-1 ignores response_format at the wire), token-based usage
(input_tokens / output_tokens), and output_format → :mime_type
mapping per Decision #19. The multipart :edit HTTP path is wired for
dall-e-2 and gpt-image-1, including URL-source eager-download per
Decision #8. The :variation path is wired for dall-e-2 via the same
multipart machinery as :edit — variation drops prompt and mask
fields and otherwise mirrors :edit's wire shape.
Model × Operation matrix
| Model | :generate | :edit | :variation | Wire format | Usage shape |
|---|---|---|---|---|---|
| dall-e-2 | yes | yes | yes | url or b64_json per caller | images = length(data) |
| dall-e-3 | yes | no | no | url or b64_json per caller | images = length(data) |
| gpt-image-1 | yes | yes | no | b64_json ALWAYS (forced) | images + input_tokens + output_tokens |
Cells marked "no" produce
{:error, %ImageAdapterError{reason: :unsupported_operation, metadata: %{operation: op, model: model}}} BEFORE any HTTP I/O. Unknown
models (any string not in the matrix) fall through to the provider —
see Decision #3 of steering/PHASE_15_image_layer_6.md.
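The matrix lookup can be pictured with a small standalone sketch. The module name MatrixGateSketch, the gate/2 function, and the plain-map error shape are illustrative assumptions, not the adapter's actual internals; only the matrix contents and the fall-through rule come from the text above.

```elixir
defmodule MatrixGateSketch do
  # Hypothetical stand-in for the adapter's model x operation matrix.
  @matrix %{
    "dall-e-2" => [:generate, :edit, :variation],
    "dall-e-3" => [:generate],
    "gpt-image-1" => [:generate, :edit]
  }

  # Returns :ok when the pair is allowed or the model is unknown,
  # otherwise an error carrying the operation and model as metadata.
  def gate(model, operation) do
    case Map.fetch(@matrix, model) do
      {:ok, allowed} ->
        if operation in allowed do
          :ok
        else
          {:error,
           %{reason: :unsupported_operation, metadata: %{operation: operation, model: model}}}
        end

      :error ->
        # Unknown model: no local verdict, the provider decides.
        :ok
    end
  end
end
```

In this sketch an unknown model string returns :ok so that the request still reaches the provider, which mirrors the fall-through behaviour described above.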
gpt-image-1 specifics
- Body fields. When request.model == "gpt-image-1", to_json_body/2 OMITS response_format, includes quality / background per the wire-field map, and includes output_format from request.options[:output_format] ("png" | "jpeg" | "webp"). When :output_format is absent the adapter OMITS the field and the OpenAI API applies its server-side default of "png". Per CLAUDE.md, "Adapters MUST document any default they inject for a Layer-A nil field that the wire requires"; gpt-image-1's output_format does NOT need an adapter-side default because the provider default and the project's response :mime_type default both resolve to PNG.
- Response decode. gpt-image-1 always returns b64_json per image. For :binary callers the adapter Base64-decodes the payload itself; for :base64 callers the b64 is forwarded verbatim. :url callers are rejected pre-flight (Decision #6).
- Response :mime_type. Driven by request.options[:output_format] via mime_type_for_output_format/1: "png" | :png → "image/png", "jpeg" | :jpeg | :jpg → "image/jpeg", "webp" | :webp → "image/webp", absent → "image/png".
- Token usage. ImageUsage.input_tokens / output_tokens from body.usage; ImageUsage.images = length(data) as elsewhere. body.usage.input_tokens_details (when present) lands on response.metadata[:usage_details] without overwriting caller keys.
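A minimal sketch of the output_format handling described in the bullets above. The module OutputFormatSketch and put_output_format/2 are hypothetical; mime_type_for_output_format/1 reuses the name from the text, but this body is a guess at its shape rather than the adapter's source. Raising on unrecognised values is a choice made for the sketch only.

```elixir
defmodule OutputFormatSketch do
  # nil stands for "caller did not set :output_format".
  def mime_type_for_output_format(fmt) when fmt in ["png", :png, nil], do: "image/png"
  def mime_type_for_output_format(fmt) when fmt in ["jpeg", :jpeg, :jpg], do: "image/jpeg"
  def mime_type_for_output_format(fmt) when fmt in ["webp", :webp], do: "image/webp"

  # Hypothetical helper: adds "output_format" to the wire body only when
  # the caller supplied one, so the provider default applies otherwise.
  def put_output_format(body, options) do
    case options[:output_format] do
      nil -> body
      fmt -> Map.put(body, "output_format", to_string(fmt))
    end
  end
end
```

Because the absent case maps to "image/png" on the response side and to an omitted field on the request side, the two defaults stay aligned without the adapter injecting a value.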
Multipart vs JSON dispatch (Decision #7)
The :generate operation uses an application/json body via
Req.new(json: ...) and OpenAIHeaders.json_headers/2. The :edit
and :variation operations require an actual image upload, so they
use multipart/form-data via Req.new(form_multipart: ...) and
OpenAIHeaders.multipart_headers/2 (which elides content-type so
Req's :form_multipart step stamps it with the boundary).
Both paths flow through the same Retry.run/3 integration and share
the same decode_response/4 and to_image_adapter_error/4 helpers —
the response shape is identical to :generate (a data: [...] array
of url/b64_json items, optional usage on gpt-image-1).
URL-source resolution (Decision #8)
:edit / :variation requests carrying Image.from_url/1 images are
eagerly fetched at request-build time. The Req.get/2 call honors a 30 s
default receive_timeout (override via opts[:request_timeout]), a
5-redirect cap, and a 25 MB body-size cap. Failure modes (closed):
- Non-2xx HTTP status → :invalid_request with metadata: %{url: u, status: status}.
- Non-image content-type (must match ~r{^image/(png|jpeg|jpg|webp|gif)$}) → :invalid_request with metadata: %{url: u, content_type: ct}.
- Body > 25 MB → :invalid_request with metadata: %{url: u, size: bytes}.
- More than 5 redirects (Req.TooManyRedirectsError) → :invalid_request with metadata: %{url: u}.
- Timeout / network error (Req.TransportError, etc.) → :network_error with metadata: %{url: u} and the underlying exception on :cause.
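The first three failure modes can be sketched as a pure classification step over an already-fetched response. Everything here is illustrative: UrlFetchSketch and classify/4 are hypothetical names, the check order is assumed, 25 MB is taken as 25 * 1024 * 1024 bytes, and stripping content-type parameters before matching is an assumption. Redirect and transport failures are left out because they surface as exceptions from the HTTP client rather than as response fields.

```elixir
defmodule UrlFetchSketch do
  # Assumption: "25 MB" means 25 MiB.
  @max_bytes 25 * 1024 * 1024

  def classify(url, status, content_type, body) when is_binary(body) do
    # Assumption: parameters such as "; charset=binary" are dropped first.
    media_type =
      content_type |> to_string() |> String.split(";") |> hd() |> String.trim()

    cond do
      status not in 200..299 ->
        {:error, :invalid_request, %{url: url, status: status}}

      not Regex.match?(~r{^image/(png|jpeg|jpg|webp|gif)$}, media_type) ->
        {:error, :invalid_request, %{url: url, content_type: content_type}}

      byte_size(body) > @max_bytes ->
        {:error, :invalid_request, %{url: url, size: byte_size(body)}}

      true ->
        {:ok, body, media_type}
    end
  end
end
```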
URL fetches are stubbable in tests via Req.Test.stub/2; pass the same
adapter_opts: [plug: {Req.Test, stub}] used for the API stub and the
fetch will route through the stub.
Test-injection escape hatch
Per Decision #20, generate/2 honors
opts[:adapter_opts][:image_script] as a documented test-only short-
circuit: when present, the call delegates to
ALLM.Providers.FakeImages.generate/2 BEFORE any pre-flight gate runs
and returns its result verbatim. This is the same pattern
ALLM.Test.ImageAdapterConformance uses to script adapters under test.
Production callers do not populate this key.
Retry integration
HTTP error closures return {:retry, delay_ms, error} for 429 +
Retry-After, 5xx, and timeouts; ALLM.Retry.run/3 is wired against
the engine's policy. Phase 14.3 augmented the retry vocabulary with the
four image-error atoms (:rate_limited, :provider_unavailable,
:timeout, :network_error) at the façade call site.
URL-mode expiry warning
Per Decision #5, OpenAI documents that image URLs returned via
response_format: :url expire ~60 minutes after creation. Callers
persisting Image{source: {:url, _}} beyond that window should
download the bytes themselves before persisting, or request :base64 /
:binary upfront. The adapter does NOT proactively materialize
URL-mode responses to bytes.
Closed-enum mapping table caveat
:context_length_exceeded is reserved in ImageAdapterError's closed
enum but is NOT actively mapped — long-prompt rejections from OpenAI
surface as :invalid_request per Decision #21.
:unsupported_feature is not produced by this adapter (Decision #22).
Functions
@spec endpoint_for(ALLM.ImageRequest.operation()) :: String.t()
Return the OpenAI endpoint path (relative to the API base URL) for an image operation.
Examples
iex> ALLM.Providers.OpenAI.Images.endpoint_for(:generate)
"/images/generations"
iex> ALLM.Providers.OpenAI.Images.endpoint_for(:edit)
"/images/edits"
iex> ALLM.Providers.OpenAI.Images.endpoint_for(:variation)
"/images/variations"
@spec generate(ALLM.ImageRequest.t(), keyword()) :: {:ok, ALLM.ImageResponse.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Execute an image-generation request synchronously against OpenAI.
Pre-flight gates
Before any HTTP I/O, generate/2 checks four gates in order (per
Invariant 1 of the design):
- Operation gate. request.operation in supported_operations(). Failure → :unsupported_operation.
- Model gate. When request.model is in the known matrix (dall-e-2, dall-e-3, gpt-image-1), the operation must be allowed for that model. Failure → :unsupported_operation with metadata: %{operation: op, model: model}. Unknown models fall through.
- gpt-image-1 + :url rejection. When request.model == "gpt-image-1" and request.response_format == :url, the request is rejected with :invalid_request because gpt-image-1 only returns base64 (Decision #6).
- URL-source resolution. :edit / :variation requests with {:url, _} source images are eagerly fetched per Decision #8; the failure modes are listed under "URL-source resolution" in the module overview.
Key resolution (ALLM.Keys.fetch!/2) runs AFTER the gates per
Invariant 2 — a request that's going to be rejected pre-flight does
not require a valid API key.
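The ordering rule (gates first, key resolution afterwards) can be illustrated with a `with` chain. This is a sketch under assumptions: PreflightSketch, run/2, and the zero-arity fetch_key callback are hypothetical, requests are plain maps rather than %ImageRequest{} structs, and the model gate and URL-source resolution are left out for brevity.

```elixir
defmodule PreflightSketch do
  @supported [:generate, :edit, :variation]

  # fetch_key is a hypothetical zero-arity function standing in for
  # key resolution; it is invoked only after every gate has passed.
  def run(request, fetch_key) when is_function(fetch_key, 0) do
    with :ok <- operation_gate(request),
         :ok <- url_format_gate(request) do
      {:ok, fetch_key.()}
    end
  end

  defp operation_gate(%{operation: op}) when op in @supported, do: :ok
  defp operation_gate(_request), do: {:error, :unsupported_operation}

  # gpt-image-1 only returns base64, so a :url request is rejected early.
  defp url_format_gate(%{model: "gpt-image-1", response_format: :url}),
    do: {:error, :invalid_request}

  defp url_format_gate(_request), do: :ok
end
```

Passing a key callback that raises is a simple way to confirm, in a test, that a rejected request never reaches key resolution.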
Adapter-injected defaults
When request.size is nil, the adapter OMITS the size field from
the wire body and lets OpenAI apply its server-side default
("1024x1024" for dall-e-3 / gpt-image-1; "1024x1024" for dall-e-2).
Per the wire-field map row, nil → omit. Other size shapes encode as:
{w, h} → "<w>x<h>"; :auto → "auto"; binary → passthrough.
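The size encoding rules above fit in a few clauses. SizeSketch and put_size/2 are hypothetical names used only for illustration; the "size" wire key as a string-keyed map entry is likewise an assumption about the body shape.

```elixir
defmodule SizeSketch do
  # nil: leave the field out so the provider default applies.
  def put_size(body, nil), do: body
  def put_size(body, size), do: Map.put(body, "size", encode(size))

  defp encode({w, h}) when is_integer(w) and is_integer(h), do: "#{w}x#{h}"
  defp encode(:auto), do: "auto"
  # Binaries are passed through untouched.
  defp encode(size) when is_binary(size), do: size
end
```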
Test-injection short-circuit (Decision #20 / Invariant 0)
When opts[:adapter_opts][:image_script] is non-nil, generate/2
delegates to ALLM.Providers.FakeImages.generate/2 with the same opts
BEFORE any pre-flight gate runs. This is the documented test-only
escape hatch the conformance suite uses; production callers do not
populate this key.
Response-format normalization (Decision #5)
The provider response carries either url: or b64_json: per image.
The adapter materializes the caller's requested form:
- caller asked :url → response carries {:url, url} source.
- caller asked :base64 → response carries {:base64, b64} source.
- caller asked :binary → the adapter Base64-decodes the b64_json field itself and produces {:binary, bytes} source.
For dall-e-2 / dall-e-3 the response :mime_type defaults to
"image/png". For gpt-image-1 the MIME type is driven by
request.options[:output_format] per Decision #19:
"png"|:png → "image/png", "jpeg"|:jpeg|:jpg → "image/jpeg",
"webp"|:webp → "image/webp". When :output_format is absent the
default is "image/png" (matching OpenAI's server-side default). The
adapter OMITS the output_format field from the wire body when
:output_format is absent and lets the provider default apply.
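As a rough illustration of the normalization step, the sketch below maps one decoded item from the provider's data array to a source tuple. NormalizeSketch, to_source/2, and the :invalid_base64 atom are hypothetical; the real adapter returns a typed %ImageAdapterError{} and may structure this differently.

```elixir
defmodule NormalizeSketch do
  # item is one element of the provider's "data" array, JSON-decoded
  # into a string-keyed map.
  def to_source(:url, %{"url" => url}), do: {:ok, {:url, url}}
  def to_source(:base64, %{"b64_json" => b64}), do: {:ok, {:base64, b64}}

  def to_source(:binary, %{"b64_json" => b64}) do
    case Base.decode64(b64) do
      {:ok, bytes} -> {:ok, {:binary, bytes}}
      # Hypothetical error atom, used only in this sketch.
      :error -> {:error, :invalid_base64}
    end
  end
end
```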
Request-id preservation (Invariant 3)
opts[:request_id] is reflected onto response.request_id
unchanged. The OpenAI response's x-request-id header is preserved
separately on response.metadata[:openai_request_id] (Decision #18).
Retry contract
Wraps the HTTP call in ALLM.Retry.run(opts[:retry] || :default, ...).
The closure parses Retry-After (seconds form), returns
{:retry, delay_ms, error} for 429/5xx/:timeout/:network_error,
{:ok, response} for 2xx, and {:error, error} for everything else.
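The closure's decision table might look roughly like the sketch below. RetrySketch, the 1000 ms fallback delay, and the specific status-to-atom pairing (429 as :rate_limited, 5xx as :provider_unavailable) are assumptions made for illustration; the text above only states which outcomes are retryable.

```elixir
defmodule RetrySketch do
  # Hypothetical fallback when Retry-After is missing or unparseable.
  @default_delay_ms 1_000

  # Parses the seconds form of Retry-After, e.g. "7" becomes 7000 ms.
  def retry_after_ms(nil), do: @default_delay_ms

  def retry_after_ms(value) when is_binary(value) do
    case Integer.parse(String.trim(value)) do
      {seconds, ""} when seconds >= 0 -> seconds * 1_000
      _other -> @default_delay_ms
    end
  end

  def classify(status, retry_after, body) do
    cond do
      status in 200..299 -> {:ok, body}
      status == 429 -> {:retry, retry_after_ms(retry_after), {:rate_limited, status}}
      status in 500..599 -> {:retry, @default_delay_ms, {:provider_unavailable, status}}
      true -> {:error, {:invalid_request, status}}
    end
  end
end
```

The HTTP-date form of Retry-After is not handled here and falls back to the default delay.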
@spec prepare_request(ALLM.ImageRequest.t(), keyword()) :: {:ok, Req.Request.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Return an unfired Req.Request configured exactly as generate/2
would fire it.
Mirrors the chat-adapter prepare_request/2 shape at
lib/allm/providers/openai.ex:411-435. The :generate, :edit, and
:variation operations are all supported; :variation shares the
multipart machinery with :edit (variation drops prompt / mask).
When opts[:adapter_opts][:image_script] is set, prepare_request/2
returns the same stub error rather than delegating to FakeImages —
the script path has no Req.Request analogue, so prepare_request/2
intentionally diverges from generate/2 (which DOES delegate to
FakeImages.generate/2 under the script key per Invariant 0).
@spec resolve_image_bytes(ALLM.Image.t(), keyword()) :: {:ok, binary(), String.t(), String.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Resolve an %Image{} source to raw bytes for multipart upload.
Returns {:ok, bytes, mime_type, filename} on success or a typed
%ImageAdapterError{} for URL-source failures (non-2xx, non-image
content-type, oversized body, too many redirects, timeout / network
error) and base64 / file decode failures.
Filename is always "image.png" for {:binary, _} / {:base64, _} /
{:url, _} sources (OpenAI ignores the filename for content-type
resolution); {:file, path} sources use Path.basename(path) so the
uploaded filename matches the local file for human readability.
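The filename rule reduces to two clauses. FilenameSketch and filename_for/1 are hypothetical names, and the source is shown as a bare tagged tuple rather than an %Image{} struct.

```elixir
defmodule FilenameSketch do
  # File sources keep their local basename for readability.
  def filename_for({:file, path}) when is_binary(path), do: Path.basename(path)

  # Every other source kind uploads under a fixed placeholder name.
  def filename_for({kind, _data}) when kind in [:binary, :base64, :url], do: "image.png"
end
```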
@spec supported_operations() :: [:generate | :edit | :variation]
Return the per-module union of operations the adapter can ever perform.
Per Phase 14.1 Decision #3, this is a per-MODULE list, not a per-call
function of model. Per-model gating lives in gate_model_op/2.
Examples
iex> ALLM.Providers.OpenAI.Images.supported_operations()
[:generate, :edit, :variation]
@spec to_multipart_body(ALLM.ImageRequest.t(), keyword()) :: {:ok, [{String.t(), term()}]} | {:error, ALLM.Error.ImageAdapterError.t()}
Build a multipart/form-data field list for :edit / :variation.
Returns {:ok, [{name, content}, ...]} ready to hand to Req.new(..., form_multipart: form). Plain fields are {name, value} 2-tuples;
file fields (:image, :mask) use Req's {body, opts} shape:
{name, {bytes, filename: "image.png", content_type: <mime>}}. See
deps/req/lib/req/steps.ex:446-468 for the encoding contract.
All fields are emitted as strings (multipart fields are always strings on the wire); integer / atom values are stringified.
URL-source images on :edit / :variation are eagerly fetched per
Decision #8. Failure modes (closed): non-2xx, non-image content-type,
body > 25 MB, > 5 redirects, timeout / network error. Each maps to a
typed %ImageAdapterError{} with metadata describing the URL and the
failure detail. The Req.get/2 call honors opts[:adapter_opts][:plug]
so URL fetches are stubbable in tests via Req.Test.stub/2.
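To make the field-list shape concrete, here is a small sketch that builds one file field plus stringified plain fields. MultipartSketch and fields/3 are hypothetical; dropping nil-valued plain fields is an assumption of the sketch, and the image bytes are taken as already resolved.

```elixir
defmodule MultipartSketch do
  # plain is a keyword list of non-file fields, e.g. [model: "dall-e-2", n: 2].
  def fields(plain, image_bytes, mime) when is_binary(image_bytes) do
    # File field in the {body, opts} shape described above.
    file = {"image", {image_bytes, filename: "image.png", content_type: mime}}

    # Assumption: nil values are skipped; everything else is stringified.
    plain_fields =
      for {name, value} <- plain, not is_nil(value) do
        {to_string(name), to_string(value)}
      end

    [file | plain_fields]
  end
end
```

For example, `MultipartSketch.fields([model: "dall-e-2", n: 2], bytes, "image/png")` yields the file field followed by `{"model", "dall-e-2"}` and `{"n", "2"}`.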