Represents a chat model hosted by AWS Bedrock's Mantle endpoint — the OpenAI-compatible gateway AWS introduced for third-party models such as Moonshot AI's Kimi K2 family and OpenAI's gpt-oss series.
Mantle accepts standard OpenAI Chat Completions requests, so much of the
wire format mirrors `LangChain.ChatModels.ChatOpenAI`. This module exists as
a separate chat model because Mantle has several differences that warrant
dedicated handling:
- Region-aware URL building —
  `https://bedrock-mantle.{region}.api.aws/v1/chat/completions`
- Two auth modes — Bedrock API key (Bearer) or AWS IAM (SigV4)
- Reasoning extraction — Mantle returns model reasoning at
  `message.reasoning` (or `delta.reasoning` when streaming), which
  `ChatOpenAI` silently drops
- Higher default `:receive_timeout` — Mantle exhibits intermittent slow
  starts of 60s+, so the default is 120s here vs OpenAI's 60s
- Bounded default `max_tokens: 4096` — Kimi occasionally falls into
  token-repetition loops; streaming keeps the HTTP layer alive as chunks
  arrive, so an uncapped request can run indefinitely. Override as needed
  when reasoning budgets require more
- Per-model quirks — Kimi prepends a leading space to text content; uses
  the `functions.NAME:N` shape for `call_id`; narrates before tool calls
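Since streaming keeps a runaway generation alive, the bounded `max_tokens` default matters in practice. A minimal sketch of raising it deliberately for a reasoning-heavy task (the `16_384` value is an illustrative assumption, not a recommendation):

```elixir
# Sketch: override the bounded max_tokens default (4096) when a long
# chain of thought needs more room. The 16_384 value is illustrative.
model =
  ChatAwsMantle.new!(%{
    model: "moonshotai.kimi-k2-thinking",
    region: "us-east-1",
    api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
    # raise the cap consciously; an uncapped streaming request can run
    # indefinitely if the model falls into a repetition loop
    max_tokens: 16_384
  })
```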
## Tested Models (as of writing)

| Model ID | Vendor | Notes |
|---|---|---|
| `moonshotai.kimi-k2-thinking` | Moonshot | Reasoning by default, 128K ctx |
| `moonshotai.kimi-k2.5` | Moonshot | Multimodal, hybrid thinking via `:reasoning_effort` |
| `openai.gpt-oss-120b` | OpenAI | Open-source GPT, hosted by AWS |

Refer to the published list of supported models.
## Authentication

Two mutually exclusive auth modes:

### Bearer (Bedrock API key) — simplest

Generate a long-term Bedrock API key in the AWS console
(Bedrock API keys) and set it as the `:api_key`:

```elixir
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK")
})
```

### AWS SigV4 (IAM credentials) — production-friendly
Pass a zero-arity function returning IAM credentials. Useful when the host
already has IAM-based credentials available (e.g. ExAws):

```elixir
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  credentials: fn ->
    ExAws.Config.new(:s3)
    |> Map.take([:access_key_id, :secret_access_key])
    |> Map.to_list()
  end
})
```

## Reasoning / Thinking
K2.5 is a hybrid thinking model — pass OpenAI's standard `:reasoning_effort`
to enable structured reasoning:

```elixir
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
  reasoning_effort: "high"
})
```

When reasoning is active, the response message will include a `ContentPart`
of `type: :thinking` containing the model's chain of thought, alongside the
normal `:text` content parts.

K2 Thinking always reasons (it's the model's default mode); the field is
populated regardless of `:reasoning_effort`.
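Given that a reasoning response carries both `:thinking` and `:text` content parts, separating them is a small `Enum` exercise. A sketch, assuming `message.content` is a list of `%ContentPart{}` structs as described above (the prompt text is illustrative):

```elixir
# Sketch: split the :thinking ContentPart from the :text parts of a
# reasoning response. Assumes message.content is a list of ContentPart
# structs with :type and :content fields, per the docs above.
{:ok, [message]} =
  ChatAwsMantle.call(model, [Message.new_user!("Plan a 3-step test strategy.")])

thinking = Enum.find(message.content, &(&1.type == :thinking))

answer =
  message.content
  |> Enum.filter(&(&1.type == :text))
  |> Enum.map_join(& &1.content)
```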
## Sampling controls

Standard OpenAI sampling parameters are supported and passed through to
Mantle unchanged:

- `:temperature` — 0.0 to 2.0 (default `1.0`)
- `:top_p` — 0.0 to 1.0 nucleus sampling cutoff. OpenAI recommends tuning
  this or temperature, not both
- `:frequency_penalty` — -2.0 to 2.0. Positive values discourage reuse of
  tokens in proportion to how often they've already appeared. Kimi K2.5 on
  Mantle has been observed to occasionally lock into single-token
  repetition loops (e.g. streams of "!"); `frequency_penalty: 0.5` is a
  reasonable starting defense
- `:presence_penalty` — -2.0 to 2.0. Binary variant of `:frequency_penalty`
  (penalizes any token that has appeared at all)
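Putting those knobs together, a conservative configuration for Kimi K2.5 might look like the sketch below. The specific values are illustrative starting points, not tuned recommendations:

```elixir
# Sketch: conservative sampling for Kimi K2.5 on Mantle.
# Values here are illustrative assumptions, not benchmarked settings.
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
  # lower temperature for more deterministic output
  temperature: 0.6,
  # guards against the single-token repetition loops noted above
  frequency_penalty: 0.5
})
```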
## Streaming

Set `stream: true` to receive incremental `MessageDelta` updates via the
`on_llm_new_delta` callback. Mantle emits standard OpenAI SSE chunks for
content and tool calls, and adds a sibling `delta.reasoning` field when
reasoning is active. `ChatAwsMantle` extracts those into `:thinking`
`ContentPart`s so the merged final message carries `[thinking_part, text_part]`
in order.
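A minimal streaming setup might look like the sketch below. The callback-map shape follows LangChain's general handler conventions; the exact `MessageDelta` fields and callback arity can differ between LangChain versions, so treat this as an assumption to verify against your installed version:

```elixir
# Sketch: print text deltas to stdout as they arrive.
# Assumption: on_llm_new_delta receives the model and one or more
# %LangChain.MessageDelta{} structs with a binary :content field.
handler = %{
  on_llm_new_delta: fn _model, deltas ->
    for %LangChain.MessageDelta{content: content} <- List.wrap(deltas),
        is_binary(content) do
      IO.write(content)
    end
  end
}

model =
  ChatAwsMantle.new!(%{
    model: "moonshotai.kimi-k2.5",
    region: "us-east-1",
    api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
    stream: true,
    callbacks: [handler]
  })
```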
## Multimodal (K2.5 vision)

Kimi K2.5 is natively multimodal. Send images via standard LangChain
`ContentPart` structs — `ChatAwsMantle` delegates serialization to
`ChatOpenAI.content_part_for_api/2`, which emits Mantle's expected
`{"type": "image_url", "image_url": {"url": "data:<media>;base64,..."}}`
shape:

```elixir
{:ok, bytes} = File.read("photo.jpg")

Message.new_user!([
  ContentPart.text!("What's in this image?"),
  ContentPart.image!(Base.encode64(bytes), media: :jpeg)
])
|> then(&ChatAwsMantle.call(model, [&1]))
```

Mantle runs images through an upstream sanitizer that rejects degenerate
inputs (tiny or unusual images may return a 400 with
"Failed to sanitize image"). Use real photographs or reasonably-sized
source images. Vision tokens add meaningfully to `prompt_tokens` — a
1200×675 JPG consumes roughly 1100 prompt tokens.
## Open Notes

Streaming and tool-calling support follow the same wire format as
`ChatOpenAI` — see the smoke tests for verified behavior.
## Types

```elixir
@type t() :: %LangChain.ChatModels.ChatAwsMantle{
        api_key: term(),
        callbacks: term(),
        credentials: term(),
        endpoint: term(),
        frequency_penalty: term(),
        json_response: term(),
        json_schema: term(),
        max_tokens: term(),
        model: term(),
        presence_penalty: term(),
        reasoning_effort: term(),
        receive_timeout: term(),
        region: term(),
        req_config: term(),
        stream: term(),
        stream_options: term(),
        temperature: term(),
        tool_choice: term(),
        top_p: term(),
        verbose_api: term()
      }
```
## Functions

### call

Make a call to the Mantle API. Returns `{:ok, [%Message{}]}` on success or
`{:error, %LangChainError{}}` on failure.

### for_api

```elixir
@spec for_api(t(), [LangChain.Message.t()], [LangChain.Function.t()]) ::
        %{required(atom()) => any()}
```

Format the request body for the Mantle API. Reuses ChatOpenAI's per-message
formatting (since the wire format is OpenAI-shaped), but assembles the
top-level body with Mantle-relevant fields only.

### new

```elixir
@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}
```

Build a new `ChatAwsMantle` instance from attributes.

### new!

Build a new `ChatAwsMantle` instance, raising on validation failure.