Google Gemini native image-out adapter — implements ALLM.ImageAdapter
against generateContent with responseModalities: ["TEXT", "IMAGE"]
on the Gemini-native image preview models (gemini-3.1-flash-image-preview
/ "Nano Banana 2", gemini-3-pro-image-preview / "Nano Banana Pro").
See spec §35.3, §35.7 and steering/GEMINI_DESIGN.md Phase 16.5.
Layer B — runtime. Consumed through the ALLM.generate_image/3 façade.
Keys resolve via ALLM.Keys.fetch!(:gemini, opts) at request-build
time per spec §6.4 — no key ever lives on the engine.
Single translator (Decision #7)
Image generation is generateContent with responseModalities
toggled to ["TEXT", "IMAGE"]. The request body is built by
ALLM.Providers.Gemini.to_gemini_request_body/2 (the same translator
the chat adapter uses). The image adapter then overrides
generationConfig.responseModalities and adds
generationConfig.imageConfig.aspectRatio from the Decision #19
size-mapping table. The :edit operation reuses Phase 16.4's
part_to_block/1 for source-image translation by synthesizing a
user-role message with [%TextPart{}, %ImagePart{}, ...] content.
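The override step can be sketched on a plain map. This is a minimal illustration, assuming the shared translator returns a string-keyed map; `BodyOverrideSketch` and `apply_image_config/2` are hypothetical names and are not part of ALLM.

```elixir
defmodule BodyOverrideSketch do
  # Hypothetical helper: `body` stands in for whatever the shared
  # translator returns; the string-keyed map shape is an assumption.
  def apply_image_config(body, aspect) do
    config =
      body
      |> Map.get("generationConfig", %{})
      |> Map.put("responseModalities", ["TEXT", "IMAGE"])
      |> put_aspect(aspect)

    Map.put(body, "generationConfig", config)
  end

  # :omit leaves imageConfig out so Gemini applies its default.
  defp put_aspect(config, :omit), do: config

  defp put_aspect(config, {:ok, ratio}),
    do: Map.put(config, "imageConfig", %{"aspectRatio" => ratio})
end
```

Any keys the translator already placed under generationConfig are kept; only the two image-specific entries are written.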
Aspect-ratio mapping (Decision #19)
| ALLM ImageRequest.size | Gemini imageConfig.aspectRatio |
|---|---|
| "1024x1024", "512x512", "256x256", any square | "1:1" |
| "1792x1024", any 16:9 | "16:9" |
| "1024x1792", any 9:16 | "9:16" |
| "1024x768", any 4:3 | "4:3" |
| "768x1024", any 3:4 | "3:4" |
| nil | omit imageConfig (Gemini default) |
| anything else | {:error, %ImageAdapterError{reason: :invalid_request}} |
Pixel sizing (imageSize: "1K"|"2K"|"4K") is not exposed in v0.2's
ImageRequest.size field; deferred. Aspect-ratio is the only knob.
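The table above could be implemented roughly as follows. This is a sketch under stated assumptions, not ALLM's code: `AspectRatioSketch` is a hypothetical module, and because 1792x1024 and 1024x1792 reduce to 7:4 and 4:7 rather than exactly 16:9 and 9:16, the sketch lists those two sizes explicitly before falling back to reduced-ratio comparison.

```elixir
defmodule AspectRatioSketch do
  # Hypothetical stand-in for the documented mapping; not ALLM's code.
  # Sizes that only approximate a ratio are listed by name (assumption).
  @named %{{1792, 1024} => "16:9", {1024, 1792} => "9:16"}
  @ratios %{
    {1, 1} => "1:1",
    {16, 9} => "16:9",
    {9, 16} => "9:16",
    {4, 3} => "4:3",
    {3, 4} => "3:4"
  }

  def to_aspect_ratio(nil), do: :omit

  def to_aspect_ratio(size) when is_binary(size) do
    # Parse "WxH"; anything that is not two clean integers is invalid.
    with [ws, hs] <- String.split(size, "x"),
         {w, ""} <- Integer.parse(ws),
         {h, ""} <- Integer.parse(hs) do
      to_aspect_ratio({w, h})
    else
      _ -> {:error, :invalid_size}
    end
  end

  def to_aspect_ratio({w, h})
      when is_integer(w) and is_integer(h) and w > 0 and h > 0 do
    # Reduce by the greatest common divisor and compare exactly.
    g = Integer.gcd(w, h)

    case Map.get(@named, {w, h}) || Map.get(@ratios, {div(w, g), div(h, g)}) do
      nil -> {:error, :invalid_size}
      ratio -> {:ok, ratio}
    end
  end

  def to_aspect_ratio(_other), do: {:error, :invalid_size}
end
```

Comparing reduced integer pairs avoids floating-point tolerance questions, at the cost of needing the explicit entries for sizes that are only nominally 16:9 or 9:16.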
Operation gate (Decision #6)
supported_operations/0 returns [:generate, :edit]. :variation is
rejected with :unsupported_operation BEFORE any HTTP I/O per
ImageAdapter invariant 4.
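A gate of this kind can be a pure function, which makes the "before any HTTP I/O" property easy to see. The sketch below is illustrative only: `OperationGateSketch` is a hypothetical module and the plain-map error is a simplified stand-in for %ImageAdapterError{}.

```elixir
defmodule OperationGateSketch do
  # Hypothetical illustration of a pure pre-flight check; the error
  # map here is simplified and is not ALLM's error struct.
  @supported [:generate, :edit]

  def supported_operations, do: @supported

  # Pure function: no network call can happen before it returns.
  def gate_operation(op) when op in @supported, do: :ok

  def gate_operation(op),
    do: {:error, %{reason: :unsupported_operation, metadata: %{operation: op}}}
end
```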
Test-injection escape hatch
opts[:adapter_opts][:image_script], when present, delegates to
ALLM.Providers.FakeImages.generate/2 BEFORE any pre-flight gate
runs. Mirrors the OpenAI.Images precedent at
lib/allm/providers/openai/images.ex:251 (Phase 14.3 Decision #20).
Shared response decoder (Cross-function invariant)
Response bodies are decoded via ALLM.Providers.Gemini.Decode.candidate_parts/1
— the same helper Gemini.generate/2 calls (see
lib/allm/providers/gemini.ex:991 post-Phase-16.5 refactor). The image
adapter consumes the image_parts element of the returned tuple while
the chat adapter consumes text + tool_calls; both walk the parts
list once, per the cross-function invariants in
steering/GEMINI_DESIGN.md (lines 217-219).
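A single-pass walk over the parts list might look like the sketch below. The part keys ("text", "inlineData", "functionCall") follow Gemini's JSON response format; the module name `CandidatePartsSketch` and the three-element return tuple are assumptions made for illustration, since the real helper's return shape is not shown here.

```elixir
defmodule CandidatePartsSketch do
  # Hypothetical single-pass splitter. The returned
  # {text, image_parts, tool_calls} tuple shape is an assumption.
  def candidate_parts(parts) when is_list(parts) do
    {texts, images, calls} =
      Enum.reduce(parts, {[], [], []}, fn
        %{"text" => t}, {ts, imgs, cs} -> {[t | ts], imgs, cs}
        %{"inlineData" => d}, {ts, imgs, cs} -> {ts, [d | imgs], cs}
        %{"functionCall" => c}, {ts, imgs, cs} -> {ts, imgs, [c | cs]}
        # Unknown part kinds are skipped rather than raising.
        _other, acc -> acc
      end)

    # Accumulators were built in reverse; restore document order.
    {texts |> Enum.reverse() |> Enum.join(), Enum.reverse(images), Enum.reverse(calls)}
  end
end
```

With one reducer feeding all three accumulators, an image caller and a chat caller can each pick the elements they need without traversing the list a second time.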
Functions
Return the Gemini endpoint path (relative to the API base URL) for the image-generation operation.
Both :generate and :edit route through generateContent (the
request body shape differs, the URL path does not). :variation is
rejected pre-flight by gate_operation/2.
Examples
iex> ALLM.Providers.Gemini.Images.endpoint_for("gemini-3.1-flash-image-preview")
"/models/gemini-3.1-flash-image-preview:generateContent"
@spec generate(ALLM.ImageRequest.t(), keyword()) ::
  {:ok, ALLM.ImageResponse.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Execute an image-generation or edit request synchronously.
Pre-flight gates (per ImageAdapter invariant 4)
Before any HTTP I/O, generate/2 checks (in order):
- Test-injection escape hatch. When opts[:adapter_opts][:image_script]
  is non-nil, the call delegates to ALLM.Providers.FakeImages.generate/2.
- Operation gate. request.operation in supported_operations().
  Failure → :unsupported_operation with metadata: %{operation: op}.
- Aspect-ratio gate. request.size, when non-nil, must map to one of
  "1:1" | "16:9" | "9:16" | "4:3" | "3:4". Failure → :invalid_request.
Key resolution (ALLM.Keys.fetch!/2) runs AFTER the gates — a request
rejected pre-flight does not require a valid key.
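The ordering described above (escape hatch, then gates, then key resolution) can be sketched with a `with` chain. Everything here is hypothetical: `GateOrderSketch`, the injected `fetch_key` and `fake` functions, the abbreviated size table, and the simplified error tuples are stand-ins chosen for illustration, not ALLM's implementation.

```elixir
defmodule GateOrderSketch do
  # Hypothetical pipeline, not ALLM's code. `fetch_key` and `fake` are
  # injected functions standing in for key resolution and the scripted
  # fake; the size table is abbreviated and errors are simplified.
  @supported [:generate, :edit]
  @sizes %{
    "1024x1024" => "1:1",
    "1792x1024" => "16:9",
    "1024x1792" => "9:16",
    "1024x768" => "4:3",
    "768x1024" => "3:4"
  }

  def run(request, opts, fetch_key, fake) do
    case get_in(opts, [:adapter_opts, :image_script]) do
      nil ->
        with :ok <- gate_operation(request.operation),
             :ok <- gate_size(request.size) do
          # Key resolution happens only after every gate has passed.
          {:ok, {:would_send, fetch_key.()}}
        end

      script ->
        # The escape hatch short-circuits before any gate runs.
        fake.(script)
    end
  end

  defp gate_operation(op) when op in @supported, do: :ok
  defp gate_operation(_op), do: {:error, :unsupported_operation}

  defp gate_size(nil), do: :ok

  defp gate_size(size) do
    if Map.has_key?(@sizes, size), do: :ok, else: {:error, :invalid_request}
  end
end
```

Because the key fetch sits inside the `with` body, a request that fails either gate returns its error without the fetch function ever being called.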
Request-id / metadata round-trip (invariants 5 + 6)
opts[:request_id] is reflected onto response.request_id.
request.metadata round-trips onto response.metadata unchanged.
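The two round-trip invariants amount to copying values through untouched. A minimal sketch, assuming plain maps in place of the request and response structs (`RoundTripSketch` and `stamp/3` are hypothetical names):

```elixir
defmodule RoundTripSketch do
  # Hypothetical: plain maps stand in for the request/response structs.
  def stamp(response, request, opts) do
    response
    |> Map.put(:request_id, Keyword.get(opts, :request_id))
    |> Map.put(:metadata, request.metadata)
  end
end
```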
@spec prepare_request(ALLM.ImageRequest.t(), keyword()) ::
  {:ok, Req.Request.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Return an unfired Req.Request configured exactly as generate/2
would fire it.
Same gate ordering as generate/2. Returns {:error, %ImageAdapterError{}}
for any pre-flight failure.
@spec resolve_image_bytes(ALLM.Image.t(), keyword()) ::
  {:ok, binary(), String.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Resolve an %Image{} source to raw bytes. Mirrors the OpenAI seam at
lib/allm/providers/openai/images.ex:858.
For Gemini, this helper exists for parity with the OpenAI image-adapter
testing surface. The actual :edit request build delegates source
translation to Gemini.part_to_block/1 (Phase 16.4) via the chat
translator, which handles :binary, :base64, and :file sources;
:url is rejected by Gemini.reject_unsupported_image_sources/1.
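Resolving the three supported source kinds to bytes could look like the following sketch. The tagged tuples are hypothetical stand-ins for %Image{} (whose fields are not shown here), the MIME type is passed in rather than detected, and returning :invalid_request for a :url source is an assumption about this helper, mirroring the rejection described for the chat translator.

```elixir
defmodule ResolveBytesSketch do
  # Hypothetical: tagged tuples stand in for %Image{} sources, and the
  # error atom is a simplified stand-in for %ImageAdapterError{}.
  def resolve({:binary, bytes, mime}) when is_binary(bytes), do: {:ok, bytes, mime}

  def resolve({:base64, encoded, mime}) do
    case Base.decode64(encoded) do
      {:ok, bytes} -> {:ok, bytes, mime}
      :error -> {:error, :invalid_request}
    end
  end

  def resolve({:file, path, mime}) do
    case File.read(path) do
      {:ok, bytes} -> {:ok, bytes, mime}
      {:error, _posix} -> {:error, :invalid_request}
    end
  end

  # URL sources are rejected rather than fetched (assumption).
  def resolve({:url, _url, _mime}), do: {:error, :invalid_request}
end
```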
@spec supported_operations() :: [:generate | :edit]
Return the closed list of operations Gemini's image adapter supports.
Per Decision #6 — [:generate, :edit]. :variation is not supported
by the Gemini-native image models and is rejected pre-flight.
Examples
iex> ALLM.Providers.Gemini.Images.supported_operations()
[:generate, :edit]
@spec to_aspect_ratio(ALLM.ImageRequest.size() | nil) :: {:ok, String.t()} | :omit | {:error, :invalid_size}
Map ImageRequest.size to Gemini's imageConfig.aspectRatio per
Decision #19. Returns {:ok, ratio} with the aspect-ratio string,
:omit for nil, or {:error, :invalid_size} for an unmappable size.
Square sizes ("NxN" or {n, n}) collapse to "1:1". Non-square
sizes use exact ratio comparison rather than substring matching so
"768x1024" (3:4) and "1024x1792" (~9:16) are disambiguated.
Examples
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio("1024x1024")
{:ok, "1:1"}
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio({1792, 1024})
{:ok, "16:9"}
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio(nil)
:omit
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio("999x111")
{:error, :invalid_size}
@spec to_image_request_body(ALLM.ImageRequest.t(), keyword()) ::
  {:ok, map()} | {:error, ALLM.Error.ImageAdapterError.t()}
Build the JSON request body for an image request.
Synthesizes a chat-equivalent %Request{} (single user message
whose content is the prompt for :generate, or
[%TextPart{}, %ImagePart{}, ...] for :edit) and delegates to
Gemini.to_gemini_request_body/2 per Decision #7. Then overrides
generationConfig.responseModalities = ["TEXT", "IMAGE"] and (when the
size maps to a known aspect ratio) adds
generationConfig.imageConfig.aspectRatio. :n > 1 adds
generationConfig.candidateCount: n.
Returns {:error, %ImageAdapterError{reason: :invalid_request}} for
unmappable sizes per Decision #19's closed table.
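For orientation, the body that results for a :generate or :edit request might resemble what this sketch produces. It uses Gemini's JSON field names (contents, parts, inlineData, generationConfig) but bypasses the shared translator entirely; `ImageBodySketch`, `build/3`, and the `{bytes, mime}` tuples are hypothetical stand-ins for ALLM's structs.

```elixir
defmodule ImageBodySketch do
  # Hypothetical body builder using Gemini's JSON field names; plain
  # tuples stand in for %ImagePart{} sources.
  def build(prompt, images, n) when is_binary(prompt) and is_list(images) do
    # An empty `images` list corresponds to :generate; a non-empty one
    # corresponds to :edit, with each source sent inline as base64.
    image_parts =
      for {bytes, mime} <- images do
        %{"inlineData" => %{"mimeType" => mime, "data" => Base.encode64(bytes)}}
      end

    config = %{"responseModalities" => ["TEXT", "IMAGE"]}
    # candidateCount is only added when more than one image is requested.
    config = if n > 1, do: Map.put(config, "candidateCount", n), else: config

    %{
      "contents" => [%{"role" => "user", "parts" => [%{"text" => prompt} | image_parts]}],
      "generationConfig" => config
    }
  end
end
```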