# `ALLM.Providers.Gemini`
[🔗](https://github.com/cykod/ALLM/blob/v0.3.0/lib/allm/providers/gemini.ex#L1)

Google Gemini provider adapter — Layer B. See spec §6.4, §7.1, §20,
§32.1 (bundled adapters).

Phase 16.1 ships the non-streaming `ALLM.Adapter` callback set against
the Generative Language API at
`https://generativelanguage.googleapis.com/v1beta`. Streaming
(`ALLM.StreamAdapter`) lands in Phase 16.2; tools / vision / image-out
in Phases 16.3/16.4/16.5.

This module implements:

  * `generate/2` — fires `POST /v1beta/models/{model}:generateContent`
    via `Req`, wrapped in `ALLM.Retry.run/3` with the **default** retry
    policy (Decision #16 — Gemini's 429 / 500 / 503 / 504 are already
    covered by spec §6.1's default retryable set; no Gemini-specific
    wrapper is needed).
  * `prepare_request/2` — returns an unfired `%Req.Request{}` with the
    API key injected as `x-goog-api-key` (Decision #2).
  * `translate_options/2` — identity (Decision #18). Gemini's
    camelCase rename and `generationConfig` nesting happens inside
    `to_generation_config/1` at request-build time.

## Single translator (Decision #1)

Gemini exposes one chat endpoint, `generateContent`, that covers both
text and image generation — image generation is selected by toggling
`generationConfig.responseModalities`. The request-builder
(`to_gemini_request_body/2`) is therefore a single function shared
across the chat adapter and (in Phase 16.5) the image adapter,
eliminating the dual-translator drift class identified in PHASE_10.

## Auth header (Decision #2/#3)

The API key flows on the `x-goog-api-key` request header, not the
documented `?key=...` query parameter. Both forms are equivalent
server-side; the header form keeps the API key out of HTTP access
logs and metrics. The same header is reused for the streaming
endpoint (Decision #3).

## Wire field map (per spec §35.7 + GEMINI_DESIGN.md)

| Concern | Gemini wire field |
|---------|------------------|
| Endpoint host | `https://generativelanguage.googleapis.com/v1beta` |
| Method (chat non-streaming) | `POST /models/{model}:generateContent` |
| Auth header | `x-goog-api-key: $key` |
| Roles | `user`, `model` (`:assistant → "model"`) |
| System prompt | top-level `systemInstruction.parts[].text` |
| Generation params | nested under `generationConfig.{maxOutputTokens, temperature, topP, topK, stopSequences, responseMimeType, responseSchema}` |
| `finish_reason` | `candidates[0].finishReason` (UPPER_SNAKE_CASE; mapping table below) |
| Prompt-blocked path | `promptFeedback.blockReason` (top-level, no candidates) |
| Usage location | `usageMetadata.{promptTokenCount, candidatesTokenCount, totalTokenCount}` |
| Error envelope | `{"error": {"code", "status", "message"}}` |
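The table above can be condensed into an illustrative request-body shape. This is a hand-written sketch of the wire fields, not the literal output of `to_gemini_request_body/2`:

```elixir
# Illustrative request body per the wire field map above. Values are
# placeholders; the adapter builds the real body at request time.
request_body = %{
  "systemInstruction" => %{"parts" => [%{"text" => "Be concise."}]},
  "contents" => [
    %{"role" => "user", "parts" => [%{"text" => "Hi"}]}
  ],
  "generationConfig" => %{
    "maxOutputTokens" => 256,
    "temperature" => 0.7
  }
}

# Response fields of interest land at:
#   candidates[0].finishReason  — UPPER_SNAKE_CASE string
#   usageMetadata.{promptTokenCount, candidatesTokenCount, totalTokenCount}
```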

## Finish-reason mapping (Decision #14)

Gemini's enum has 19 documented values. ALLM's
`Response.finish_reason` is a closed 6-atom union; the raw string is
preserved at `Response.raw_finish_reason` for non-canonical rows.

| Gemini `finishReason` | ALLM `Response.finish_reason` |
|-----------------------|-------------------------------|
| `STOP` | `:stop` |
| `MAX_TOKENS` | `:length` |
| `SAFETY` | `:content_filter` |
| `RECITATION` | `:content_filter` |
| `LANGUAGE` | `:content_filter` |
| `BLOCKLIST` | `:content_filter` |
| `PROHIBITED_CONTENT` | `:content_filter` |
| `SPII` | `:content_filter` |
| `IMAGE_SAFETY` | `:content_filter` |
| `IMAGE_PROHIBITED_CONTENT` | `:content_filter` |
| `IMAGE_RECITATION` | `:content_filter` |
| `IMAGE_OTHER` | `:other` |
| `NO_IMAGE` | `:other` |
| `MALFORMED_FUNCTION_CALL` | `:error` |
| `UNEXPECTED_TOOL_CALL` | `:error` |
| `TOO_MANY_TOOL_CALLS` | `:error` |
| `MISSING_THOUGHT_SIGNATURE` | `:error` |
| `MALFORMED_RESPONSE` | `:error` |
| `OTHER` / `FINISH_REASON_UNSPECIFIED` / unknown | `:other` |

## Empty-candidates branches (Decisions #9 + #10)

  * `promptFeedback.blockReason` with empty candidates →
    `{:ok, %Response{finish_reason: :content_filter, content: ""}}`.
    The block reason is preserved at
    `metadata.error.reason = "blocked:<BLOCK_REASON>"`.
  * empty candidates with no `promptFeedback.blockReason` →
    `{:error, %AdapterError{reason: :malformed_response}}`.
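A minimal sketch of this branch order over a decoded JSON body (the module name, return shapes, and `classify/1` are illustrative; the real logic lives in the adapter's response decoder):

```elixir
defmodule EmptyCandidatesSketch do
  # Non-empty candidates: decode normally.
  def classify(%{"candidates" => [_ | _]}), do: :normal

  # Empty/absent candidates with a block reason: a *successful* call whose
  # finish reason is :content_filter; the reason string is preserved.
  def classify(%{"promptFeedback" => %{"blockReason" => reason}}),
    do: {:ok, :content_filter, "blocked:" <> reason}

  # Empty candidates, no block reason: a malformed response.
  def classify(_body), do: {:error, :malformed_response}
end
```

Note the clause order: the block-reason clause only fires once the candidates clause has failed to match, mirroring Decisions #9 and #10.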

## Usage decoding (Decision #11)

`usageMetadata.candidatesTokenCount` is canonical;
`usageMetadata.responseTokenCount` is read as a defensive fallback
when `candidatesTokenCount` is absent. If both are missing,
`Usage.output_tokens` is left at `nil` and a single
`Logger.warning/1` fires for that call.
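The fallback chain can be sketched as three pattern-matched clauses (module and function names are illustrative, not the adapter's internals):

```elixir
defmodule UsageSketch do
  # candidatesTokenCount is canonical (Decision #11).
  def output_tokens(%{"candidatesTokenCount" => n}), do: n

  # responseTokenCount is a defensive fallback.
  def output_tokens(%{"responseTokenCount" => n}), do: n

  # Both absent: nil (the real adapter also logs a warning here).
  def output_tokens(_usage_metadata), do: nil
end
```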

## Error envelope mapping (Decision #15)

Maps Google's `{error: {code, status, message}}` envelope onto
`%AdapterError{reason: ...}`:

| HTTP | Google `status` | `AdapterError.reason` |
|------|-----------------|----------------------|
| 400 | `INVALID_ARGUMENT` (no marker) | `:invalid_request` |
| 400 | `INVALID_ARGUMENT` (`exceeds the maximum number of tokens` substring) | `:context_length_exceeded` |
| 401 | `UNAUTHENTICATED` | `:authentication_failed` |
| 403 | `PERMISSION_DENIED` | `:authentication_failed` |
| 404 | `NOT_FOUND` | `:invalid_request` |
| 429 | `RESOURCE_EXHAUSTED` | `:rate_limited` |
| 500 | `INTERNAL` | `:provider_unavailable` |
| 503 | `UNAVAILABLE` | `:provider_unavailable` |
| 504 | `DEADLINE_EXCEEDED` | `:provider_unavailable` |
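The table reduces to a small dispatch on HTTP status, with the one substring check distinguishing the two 400 rows (sketch only; the marker string is quoted from the table above, and `reason/3` is a hypothetical name):

```elixir
defmodule ErrorMapSketch do
  @ctx_marker "exceeds the maximum number of tokens"

  # 400 splits on the context-window marker substring.
  def reason(400, _google_status, message) do
    if String.contains?(message, @ctx_marker),
      do: :context_length_exceeded,
      else: :invalid_request
  end

  def reason(status, _g, _m) when status in [401, 403], do: :authentication_failed
  def reason(404, _g, _m), do: :invalid_request
  def reason(429, _g, _m), do: :rate_limited
  def reason(status, _g, _m) when status in [500, 503, 504], do: :provider_unavailable
end
```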

## Retry policy (Decision #16)

No Gemini-specific retry-policy wrapper. The default policy at
`lib/allm/retry.ex` already retries HTTP 429, 500, 502, 503, 504,
and `:timeout` / `:network_error`. Streaming never retries
(spec §6.1).
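Why no wrapper is needed: every Gemini failure mode the adapter can see already falls inside the default retryable set. A sketch of that predicate (illustrative; the real policy lives in `lib/allm/retry.ex`):

```elixir
defmodule RetrySketch do
  # Default retryable set per spec §6.1 — covers Gemini's 429/500/503/504
  # without any provider-specific additions (Decision #16).
  def retryable?(status) when status in [429, 500, 502, 503, 504], do: true
  def retryable?(reason) when reason in [:timeout, :network_error], do: true
  def retryable?(_), do: false
end
```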

## Key resolution

Keys never appear on the engine. `prepare_request/2` and `generate/2`
call `ALLM.Keys.fetch!(:gemini, opts)` at request-build time. The
`:gemini` provider atom is **not** in `ALLM.Keys`'s `@env_var_table`;
the unknown-provider fallback at
`lib/allm/keys.ex:189-194` returns `"GEMINI_API_KEY"`.
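The unknown-provider fallback amounts to deriving the env-var name from the provider atom. A sketch of that derivation (assumption: this mirrors the fallback's naming rule; the real lookup is in `ALLM.Keys`):

```elixir
# Hypothetical reconstruction of the unknown-provider fallback:
# upcase the atom name and append "_API_KEY".
env_var = fn provider ->
  provider |> Atom.to_string() |> String.upcase() |> Kernel.<>("_API_KEY")
end
```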

# `generate`

```elixir
@spec generate(
  ALLM.Request.t(),
  keyword()
) ::
  {:ok, ALLM.Response.t()}
  | {:error, ALLM.Error.AdapterError.t() | ALLM.Error.ValidationError.t()}
```

Execute a non-streaming `generateContent` request synchronously.

Wraps the HTTP call in `ALLM.Retry.run/3` with the spec §6.1 default
policy (Decision #16). Returns `{:ok, %Response{}}` on 2xx success or
`{:error, %AdapterError{}}` on every failure shape.

## Empty-candidates handling (Decisions #9 + #10)

  * `promptFeedback.blockReason` with empty candidates →
    `{:ok, %Response{finish_reason: :content_filter, content: ""}}`
    (a successful HTTP response is a successful call from the
    adapter's perspective; the content filter is a finish reason).
  * Empty candidates with no `promptFeedback.blockReason` →
    `{:error, %AdapterError{reason: :malformed_response}}`.

## Error reasons (Decision #15)

| HTTP | `AdapterError.reason` |
|------|----------------------|
| 400 generic | `:invalid_request` |
| 400 ctx-window | `:context_length_exceeded` |
| 401 / 403 | `:authentication_failed` |
| 404 | `:invalid_request` |
| 429 | `:rate_limited` |
| 500 / 503 / 504 | `:provider_unavailable` |
| network drop | `:network_error` |
| malformed body | `:malformed_response` |

## Examples

    iex> ALLM.Keys.put(:gemini, "AIza-doctest-gen")
    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], model: "gemini-2.5-flash")
    iex> {:error, %ALLM.Error.AdapterError{reason: :authentication_failed}} =
    ...>   ALLM.Providers.Gemini.generate(req,
    ...>     retry: false,
    ...>     adapter_opts: [plug: fn conn ->
    ...>       conn
    ...>       |> Plug.Conn.put_resp_content_type("application/json")
    ...>       |> Plug.Conn.resp(401, ~s({"error":{"code":401,"status":"UNAUTHENTICATED","message":"bad"}}))
    ...>     end]
    ...>   )
    iex> ALLM.Keys.delete(:gemini)
    :ok

# `parse_finish_reason`

```elixir
@spec parse_finish_reason(String.t() | nil) ::
  {ALLM.Response.finish_reason() | nil, String.t() | nil}
```

Map a Gemini `finishReason` string to ALLM's closed
`Response.finish_reason` enum, returning `{atom, raw_string_or_nil}`
per Decision #14.

`STOP` collapses to `{:stop, nil}` (the canonical "natural completion"
row); every other row preserves the raw string at index 1 so callers
can recover provider fidelity from `Response.raw_finish_reason`.

## Examples

    iex> ALLM.Providers.Gemini.parse_finish_reason("STOP")
    {:stop, nil}

    iex> ALLM.Providers.Gemini.parse_finish_reason("MAX_TOKENS")
    {:length, "MAX_TOKENS"}

    iex> ALLM.Providers.Gemini.parse_finish_reason("SAFETY")
    {:content_filter, "SAFETY"}

    iex> ALLM.Providers.Gemini.parse_finish_reason("OTHER")
    {:other, "OTHER"}

    iex> ALLM.Providers.Gemini.parse_finish_reason(nil)
    {nil, nil}

# `prepare_request`

```elixir
@spec prepare_request(
  ALLM.Request.t(),
  keyword()
) :: {:ok, Req.Request.t()} | {:error, ALLM.Error.AdapterError.t()}
```

Build an unfired `%Req.Request{}` with the resolved API key injected
as `x-goog-api-key: <key>` (Decision #2).

Per `ALLM.Keys.fetch!/2`, this function **raises**
`%ALLM.Error.EngineError{reason: :missing_key}` when no key resolver
yields a value.

Honors `opts[:request_timeout]` (forwarded as `Req`'s
`:receive_timeout`) and `opts[:adapter_opts][:endpoint]` (URL host
override, primarily for testing).

## Examples

    iex> ALLM.Keys.put(:gemini, "AIza-doctest-prep")
    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "hi"}], model: "gemini-2.5-flash")
    iex> {:ok, %Req.Request{} = http} = ALLM.Providers.Gemini.prepare_request(req, [])
    iex> Req.Request.get_header(http, "x-goog-api-key")
    ["AIza-doctest-prep"]
    iex> ALLM.Keys.delete(:gemini)
    :ok

# `stream`

```elixir
@spec stream(
  ALLM.Request.t(),
  keyword()
) ::
  {:ok, Enumerable.t()}
  | {:error, ALLM.Error.AdapterError.t() | ALLM.Error.ValidationError.t()}
```

Open an SSE stream against `streamGenerateContent?alt=sse`.

Returns `{:ok, enumerable}` on success — the enumerable is lazy; the
HTTP request fires on the first reduce. Returns `{:error, %AdapterError{}}`
only for synchronous pre-flight failures (key-resolution failure raises
`%EngineError{}` directly per the `Keys.fetch!/2` contract; that is
surfaced through the existing `with`-chain at the call site).

Per the CLAUDE.md mid-stream-error invariant, **HTTP-shaped errors observed
AFTER the consumer starts reducing are folded into a terminating
`{:error, _}` event** — the call-site tuple stays `{:ok, stream}`. This
includes 4xx status codes received before the first SSE event (the
`{:status, code}` Finch frame folds via `handle_finch_payload/2`).

## Decision references

  * **Decision #1** — request body byte-equal to `generate/2`'s. Only
    the URL path differs (`:streamGenerateContent?alt=sse` vs
    `:generateContent`).
  * **Decision #3** — `?alt=sse` is the ONLY required query parameter;
    auth still flows via `x-goog-api-key`.
  * **Decision #12** — `usageMetadata` may appear on intermediate
    chunks; the chunk-mapper emits `{:raw_chunk, {:usage, _}}` on
    every appearance and `StreamCollector.apply_event/2` overwrites.
  * **Decision #13** — stream terminates on Finch's `:done` payload,
    not a `data: [DONE]` lookahead. The synthetic `:message_completed`
    event is built from accumulated state.
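Decisions #12 and #13 together imply a simple fold over chunk events: text accumulates, usage overwrites on every appearance, and `:done` finalizes from accumulated state. A standalone sketch (event shapes and the module name are illustrative, not the adapter's actual chunk-mapper types):

```elixir
defmodule StreamFoldSketch do
  # Fold illustrative chunk events into accumulated state.
  def fold(events) do
    Enum.reduce(events, %{content: "", usage: nil}, fn
      # Text deltas accumulate in arrival order.
      {:text, t}, acc -> %{acc | content: acc.content <> t}
      # Usage overwrites on every appearance (Decision #12) —
      # the last intermediate chunk wins.
      {:usage, u}, acc -> %{acc | usage: u}
      # :done finalizes from accumulated state (Decision #13);
      # no `data: [DONE]` lookahead is involved.
      :done, acc -> acc
    end)
  end
end
```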

## Options

  * `:stream_timeout` (default 60_000 ms) — the receive loop's
    `after`-clause timeout between chunks.
  * `:finch_module` (default `Finch`) — test injection seam.
  * `:finch_name` (default `ALLM.Finch`).
  * `:finch_stub_ref` — opaque ref forwarded to the Finch shim
    (used only by `ALLM.Test.FinchStub`).
  * `:adapter_opts[:endpoint]` — endpoint override (testing).

# `to_gemini_request_body`

```elixir
@spec to_gemini_request_body(
  ALLM.Request.t(),
  keyword()
) :: map()
```

Compose the JSON request body for `generateContent` from a canonical
`%Request{}`. Pure function; no I/O.

Performs system-message extraction (hoist into top-level
`systemInstruction`), role mapping (`:assistant → "model"`), and
`generationConfig` composition.

Phase 16.1 surface only — tools (16.3) and image-out (16.5) extend
this builder via `opts` flags without changing the text-only path.

## Examples

    iex> req = ALLM.Request.new(
    ...>   [%ALLM.Message{role: :system, content: "Be concise."},
    ...>    %ALLM.Message{role: :user, content: "Hi"}],
    ...>   model: "gemini-2.5-flash", max_tokens: 256
    ...> )
    iex> body = ALLM.Providers.Gemini.to_gemini_request_body(req, [])
    iex> {body["systemInstruction"], length(body["contents"]), body["generationConfig"]["maxOutputTokens"]}
    {%{"parts" => [%{"text" => "Be concise."}]}, 1, 256}

# `to_gemini_tool_config`

```elixir
@spec to_gemini_tool_config(ALLM.Request.tool_choice() | {:tool, String.t()}) :: map()
```

Translate an ALLM canonical `tool_choice` to Gemini's
`functionCallingConfig` map.

| ALLM canonical | Gemini wire |
|----------------|-------------|
| `:auto` | `%{"mode" => "AUTO"}` |
| `:required` | `%{"mode" => "ANY"}` |
| `:none` | `%{"mode" => "NONE"}` |
| `{:tool, "name"}` | `%{"mode" => "ANY", "allowedFunctionNames" => ["name"]}` |
| `"name"` (string) | shorthand for `{:tool, "name"}` |

Map shapes (`%{"mode" => "AUTO"}`, etc.) are passed through verbatim
so callers can hand-craft Gemini-specific extensions.

## Examples

    iex> ALLM.Providers.Gemini.to_gemini_tool_config(:auto)
    %{"mode" => "AUTO"}

    iex> ALLM.Providers.Gemini.to_gemini_tool_config({:tool, "set_color"})
    %{"mode" => "ANY", "allowedFunctionNames" => ["set_color"]}

# `to_gemini_tools`

```elixir
@spec to_gemini_tools([ALLM.Tool.t()]) :: [map()]
```

Translate a list of canonical `%ALLM.Tool{}`s to Gemini's
`functionDeclarations` shape.

Gemini's `tools` is an array of `%{functionDeclarations: [...]}`
objects, not a flat array of declarations. Each declaration carries
`:name`, `:description`, and `:parameters` (the JSON-Schema field;
Gemini nests it directly on the declaration, whereas OpenAI puts
`parameters` on the tool's `function` sub-map and Anthropic names it
`input_schema`).

## Examples

    iex> tool = ALLM.Tool.new(name: "get_weather", description: "weather", schema: %{"type" => "object"})
    iex> ALLM.Providers.Gemini.to_gemini_tools([tool])
    [%{"name" => "get_weather", "description" => "weather", "parameters" => %{"type" => "object"}}]

# `translate_options`

```elixir
@spec translate_options(
  keyword(),
  ALLM.Request.t()
) :: keyword()
```

Identity translator (Decision #18). Gemini accepts ALLM's canonical
`:max_tokens`, `:temperature`, `:top_p`, etc. — the camelCase rename
and `generationConfig` nesting happens in `to_generation_config/1` at
request-build time, not here.

## Examples

    iex> req = ALLM.Request.new([%ALLM.Message{role: :user, content: "x"}], model: "gemini-2.5-flash")
    iex> ALLM.Providers.Gemini.translate_options([max_tokens: 100, temperature: 0.7], req)
    [max_tokens: 100, temperature: 0.7]

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
