LangChain.ChatModels.ChatAwsMantle (LangChain v0.8.4)


Represents a chat model hosted by AWS Bedrock's Mantle endpoint — the OpenAI-compatible gateway AWS introduced for third-party models such as Moonshot AI's Kimi K2 family and OpenAI's gpt-oss series.

Mantle accepts standard OpenAI Chat Completions requests, so much of the wire format mirrors LangChain.ChatModels.ChatOpenAI. This module exists as a separate chat model because Mantle has several differences that warrant dedicated handling:

  • Region-aware URL building — https://bedrock-mantle.{region}.api.aws/v1/chat/completions
  • Two auth modes — Bedrock API key (Bearer) or AWS IAM (SigV4)
  • Reasoning extraction — Mantle returns model reasoning at message.reasoning (or delta.reasoning when streaming), which ChatOpenAI silently drops
  • Higher default receive_timeout — Mantle exhibits intermittent slow starts of 60s+, so the default is 120s here vs OpenAI's 60s
  • Bounded default max_tokens: 4096 — Kimi occasionally falls into token-repetition loops; streaming keeps the HTTP layer alive as chunks arrive, so an uncapped request can run indefinitely. Override as needed when reasoning budgets require more
  • Per-model quirks — Kimi prepends a leading space to text content; uses functions.NAME:N for call_id shape; narrates before tool calls
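
The timeout and token defaults above can be raised per-instance. A sketch, using field names from this module's struct; the millisecond unit for :receive_timeout is an assumption carried over from ChatOpenAI's convention:

```elixir
# Sketch: overriding the Mantle-specific defaults for a long reasoning run.
# :receive_timeout in milliseconds is assumed (matches ChatOpenAI).
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2-thinking",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
  receive_timeout: 300_000,
  max_tokens: 16_384
})
```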

Tested Models (as of writing)

Model ID                      | Vendor   | Notes
moonshotai.kimi-k2-thinking   | Moonshot | Reasoning by default, 128K ctx
moonshotai.kimi-k2.5          | Moonshot | Multimodal, hybrid thinking via :reasoning_effort
openai.gpt-oss-120b           | OpenAI   | Open-source GPT, hosted by AWS

Refer to the published list of supported models.

Authentication

Two mutually exclusive auth modes:

Bearer (Bedrock API key) — simplest

Generate a long-term Bedrock API key in the AWS console (Bedrock API keys) and set it as the :api_key:

ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK")
})

AWS SigV4 (IAM credentials) — production-friendly

Pass a zero-arity function returning IAM credentials. Useful when the host already has IAM-based credentials available (e.g. ExAws):

ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  credentials: fn ->
    ExAws.Config.new(:s3)
    |> Map.take([:access_key_id, :secret_access_key])
    |> Map.to_list()
  end
})

Reasoning / Thinking

K2.5 is a hybrid thinking model — pass OpenAI's standard :reasoning_effort to enable structured reasoning:

ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
  reasoning_effort: "high"
})

When reasoning is active, the response message will include a ContentPart of type: :thinking containing the model's chain of thought, alongside the normal :text content parts.
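
A sketch of separating the reasoning from the answer in a completed response, assuming the standard LangChain ContentPart shape (a :type field of :thinking or :text on each part):

```elixir
# Ask a question, then partition the content parts by type.
{:ok, [message]} =
  ChatAwsMantle.call(model, [Message.new_user!("Plan a three-step test strategy.")])

{thinking_parts, text_parts} =
  Enum.split_with(message.content, fn part -> part.type == :thinking end)
```

Relative ordering within message.content is preserved, so thinking parts precede the text they led to.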

K2 Thinking always reasons (it's the model's default mode); the field is populated regardless of :reasoning_effort.

Sampling controls

Standard OpenAI sampling parameters are supported and passed through to Mantle unchanged:

  • :temperature — 0.0 to 2.0 (default 1.0)
  • :top_p — 0.0 to 1.0 nucleus sampling cutoff. OpenAI recommends tuning this or temperature, not both
  • :frequency_penalty — -2.0 to 2.0. Positive values discourage reuse of tokens proportional to how often they've already appeared. Kimi K2.5 on Mantle has been observed to occasionally lock into single-token repetition loops (e.g. streams of "!"); frequency_penalty: 0.5 is a reasonable starting defense.
  • :presence_penalty — -2.0 to 2.0. Binary variant of frequency_penalty (penalizes any token that has appeared at all)
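
Putting the guidance above together, a hedged starting configuration for Kimi K2.5 (the values are illustrative starting points, not vendor recommendations):

```elixir
ChatAwsMantle.new!(%{
  model: "moonshotai.kimi-k2.5",
  region: "us-east-1",
  api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
  temperature: 0.6,         # tune this OR :top_p, not both
  frequency_penalty: 0.5    # guards against single-token repetition loops
})
```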

Streaming

Set stream: true to receive incremental MessageDelta updates via the on_llm_new_delta callback. Mantle emits standard OpenAI SSE chunks for content and tool calls, and adds a sibling delta.reasoning field when reasoning is active. ChatAwsMantle extracts those into :thinking ContentParts so the merged final message carries [thinking_part, text_part] in order.
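
A sketch of wiring up streaming, assuming LangChain's usual callback-map shape; the exact on_llm_new_delta arity and delta fields follow the installed LangChain version:

```elixir
handler = %{
  on_llm_new_delta: fn _model, deltas ->
    # deltas is a list of MessageDelta structs; inspect as they arrive.
    IO.inspect(deltas, label: "delta")
  end
}

model =
  ChatAwsMantle.new!(%{
    model: "moonshotai.kimi-k2.5",
    region: "us-east-1",
    api_key: System.fetch_env!("AWS_BEARER_TOKEN_BEDROCK"),
    stream: true,
    callbacks: [handler]
  })
```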

Multimodal (K2.5 vision)

Kimi K2.5 is natively multimodal. Send images via standard LangChain ContentPart structs — ChatAwsMantle delegates serialization to ChatOpenAI.content_part_for_api/2, which emits Mantle's expected {"type": "image_url", "image_url": {"url": "data:<media>;base64,..."}} shape:

{:ok, bytes} = File.read("photo.jpg")

Message.new_user!([
  ContentPart.text!("What's in this image?"),
  ContentPart.image!(Base.encode64(bytes), media: :jpeg)
])
|> then(&ChatAwsMantle.call(model, [&1]))

Mantle runs images through an upstream sanitizer that rejects degenerate inputs (tiny or unusual images may return a 400 with "Failed to sanitize image"). Use real photographs or reasonably-sized source images. Vision tokens add meaningfully to prompt_tokens — a 1200×675 JPG consumes roughly 1100 prompt tokens.

Open Notes

Streaming and tool-calling support follow the same wire format as ChatOpenAI — see the smoke tests for verified behavior.

Summary

Functions

call(model, prompt, tools \\ [])
Make a call to the Mantle API. Returns {:ok, [%Message{}]} on success or {:error, %LangChainError{}} on failure.

for_api(model, messages, tools)
Format the request body for the Mantle API. Reuses ChatOpenAI's per-message formatting (since the wire format is OpenAI-shaped), but assembles the top-level body with Mantle-relevant fields only.

new(attrs \\ %{})
Build a new ChatAwsMantle instance from attributes.

new!(attrs \\ %{})
Build a new ChatAwsMantle instance, raising on validation failure.

Types

t()

@type t() :: %LangChain.ChatModels.ChatAwsMantle{
  api_key: term(),
  callbacks: term(),
  credentials: term(),
  endpoint: term(),
  frequency_penalty: term(),
  json_response: term(),
  json_schema: term(),
  max_tokens: term(),
  model: term(),
  presence_penalty: term(),
  reasoning_effort: term(),
  receive_timeout: term(),
  region: term(),
  req_config: term(),
  stream: term(),
  stream_options: term(),
  temperature: term(),
  tool_choice: term(),
  top_p: term(),
  verbose_api: term()
}

Functions

call(model, prompt, tools \\ [])

Make a call to the Mantle API. Returns {:ok, [%Message{}]} on success or {:error, %LangChainError{}} on failure.

for_api(model, messages, tools)

@spec for_api(t(), [LangChain.Message.t()], [LangChain.Function.t()]) :: %{
  required(atom()) => any()
}

Format the request body for the Mantle API. Reuses ChatOpenAI's per-message formatting (since the wire format is OpenAI-shaped), but assembles the top-level body with Mantle-relevant fields only.
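
Illustrative only — a rough sketch of the kind of OpenAI-shaped body for_api/3 assembles. The exact keys depend on which fields are set on the model struct:

```elixir
%{
  model: "moonshotai.kimi-k2.5",
  messages: [%{role: "user", content: "Hello"}],
  stream: false,
  max_tokens: 4096
}
```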

new(attrs \\ %{})

@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}

Build a new ChatAwsMantle instance from attributes.

new!(attrs \\ %{})

@spec new!(attrs :: map()) :: t() | no_return()

Build a new ChatAwsMantle instance, raising on validation failure.