LangChain.ChatModels.ChatOpenAIResponses (LangChain v0.8.4)

Represents the OpenAI Responses API.

Parses and validates inputs for making requests to the OpenAI Responses API.

Converts responses into more specialized LangChain data structures.

ContentPart Types

OpenAI's Responses API supports several types of content parts that can be combined in a single message:

Text Content

Basic text content is the default and most common type:

Message.new_user!("Hello, how are you?")

Image Content

OpenAI supports base64-encoded images, image URLs, and uploaded file IDs:

# Using a base64 encoded image
Message.new_user!([
  ContentPart.text!("What's in this image?"),
  ContentPart.image!("base64_encoded_image_data", media: :jpg)
])

# Using an image URL
Message.new_user!([
  ContentPart.text!("Describe this image:"),
  ContentPart.image_url!("https://example.com/image.jpg")
])

# Using a file ID (after uploading to OpenAI)
Message.new_user!([
  ContentPart.text!("Describe this image:"),
  ContentPart.image!("file-1234", type: :file_id)
])

For images, you can specify the detail level, which affects token usage:

  • detail: "low" - Lower resolution, fewer tokens
  • detail: "high" - Higher resolution, more tokens
  • detail: "auto" - Let the model decide
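
For example, to request the low-detail variant (a sketch; it assumes the detail option is passed alongside the other image options):

Message.new_user!([
  ContentPart.text!("What's in this image?"),
  # "low" reduces resolution and token usage
  ContentPart.image!("base64_encoded_image_data", media: :jpg, detail: "low")
])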

File Content

OpenAI supports both base64-encoded files and file IDs:

# Using a base64 encoded file
Message.new_user!([
  ContentPart.text!("Process this file:"),
  ContentPart.file!("base64_encoded_file_data",
    type: :base64,
    filename: "document.pdf"
  )
])

# Using a file ID (after uploading to OpenAI)
Message.new_user!([
  ContentPart.text!("Process this file:"),
  ContentPart.file!("file-1234", type: :file_id)
])

Callbacks

See the set of available callbacks: LangChain.Chains.ChainCallbacks

Rate Limit API Response Headers

OpenAI returns rate limit information in the response headers. Those can be accessed using the LLM callback on_llm_ratelimit_info like this:

handlers = %{
  on_llm_ratelimit_info: fn _model, headers ->
    IO.inspect(headers)
  end
}

{:ok, chat} = ChatOpenAIResponses.new(%{callbacks: [handlers]})

When a response is received, something similar to the following will be output to the console.

%{
  "x-ratelimit-limit-requests" => ["5000"],
  "x-ratelimit-limit-tokens" => ["160000"],
  "x-ratelimit-remaining-requests" => ["4999"],
  "x-ratelimit-remaining-tokens" => ["159973"],
  "x-ratelimit-reset-requests" => ["12ms"],
  "x-ratelimit-reset-tokens" => ["10ms"],
  "x-request-id" => ["req_1234"]
}

Token Usage

OpenAI returns token usage information as part of the response body. The LangChain.TokenUsage is added to the metadata of the processed LangChain.Message and LangChain.MessageDelta structs under the :usage key.

When streaming, the OpenAI documentation instructs you to provide stream_options with include_usage: true for the usage information to be included.

The TokenUsage data is accumulated across MessageDelta structs, and the final usage information is available on the resulting LangChain.Message.

NOTE: The TokenUsage information is returned once for all "choices" in the response. The LangChain.TokenUsage data is added to each message, so if your request asks for multiple choices, each choice carries the same duplicated usage information and only one copy is meaningful.
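
For example, after a chain run completes, the accumulated usage can be read from the final message's metadata (a sketch; updated_chain is assumed to be the chain returned by a successful LLMChain.run):

# reads the TokenUsage struct stored under the :usage metadata key
usage = updated_chain.last_message.metadata[:usage]
IO.inspect(usage)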

OpenAI's Responses API also supports built-in tools. Of those, Web Search is currently supported.

Example

To optionally permit the model to use web search:

native_web_tool = NativeTool.new!(%{name: "web_search_preview", configuration: %{}})

%{llm: ChatOpenAIResponses.new!(%{model: "gpt-4o"})}
|> LLMChain.new!()
|> LLMChain.add_message(Message.new_user!("Can you tell me something that happened today in Texas?"))
|> LLMChain.add_tools(native_web_tool)
|> LLMChain.run()

You may provide additional configuration per the OpenAI documentation:

web_config = %{
  search_context_size: "medium",
  user_location: %{
    type: "approximate",
    city: "Humble",
    country: "US",
    region: "Texas",
    timezone: "America/Chicago"
  }
}
native_web_tool = NativeTool.new!(%{name: "web_search_preview", configuration: web_config})

You may reference a prior web_search_call in subsequent runs as:

Message.new_assistant!([
  ContentPart.new!(%{
    type: :unsupported,
    options: %{
      id: "ws_123456789", # ID as provided from OpenAI
      status: "completed",
      type: "web_search_call"
    }
  }),
  ContentPart.text!("The Astros won today 5-4...")
])

Note: Not all OpenAI models support web_search_preview. OpenAI will return an error if you request web_search_preview when using a model that doesn't support it.

Tool Choice

OpenAI's API supports forcing a specific tool to be used.

This is supported through the tool_choice option, which takes a plain Elixir map (or a string, for native tools) to provide the configuration.

By default, the LLM will choose a tool call if a tool is available and it determines it is needed. That's the "auto" mode.

Example

To force the LLM's response to make a tool call to the "get_weather" function:

ChatOpenAIResponses.new(%{
  model: "...",
  tool_choice: %{"type" => "function", "function" => %{"name" => "get_weather"}}
})

...or to force a native tool (such as web search):

ChatOpenAIResponses.new(%{
  model: "...",
  tool_choice: "web_search_preview"
})

Verbosity

The verbosity option controls the length of the model's response. Accepted values are "low", "medium", and "high". When omitted, the API uses its default behavior.

This is sent as part of the text parameter in the Responses API and can be combined with JSON response formats.

Only supported for gpt-5 or newer models.

Example

ChatOpenAIResponses.new!(%{model: "gpt-5", verbosity: "low"})
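
To combine verbosity with a JSON response (a sketch using the struct's :json_response, :json_schema, and :json_schema_name fields; the schema map here is a hypothetical placeholder):

ChatOpenAIResponses.new!(%{
  model: "gpt-5",
  verbosity: "low",
  json_response: true,
  # hypothetical schema for illustration only
  json_schema: %{"type" => "object", "properties" => %{"answer" => %{"type" => "string"}}},
  json_schema_name: "answer_result"
})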

WebSocket Transport

Instead of HTTP, requests can be sent over a persistent WebSocket connection for lower latency. Use connect_websocket!/1 to open a connection and disconnect_websocket!/1 to close it:

model =
  ChatOpenAIResponses.new!(%{model: "gpt-4o"})
  |> ChatOpenAIResponses.connect_websocket!()

{:ok, chain} =
  %{llm: model}
  |> LLMChain.new!()
  |> LLMChain.add_message(Message.new_user!("Hello"))
  |> LLMChain.run()

ChatOpenAIResponses.disconnect_websocket!(model)

The WebSocket connection is reused across multiple LLM calls within the same chain run (e.g. multi-turn tool calling with :while_needs_response).

Lifecycle Management

The application is responsible for managing the WebSocket lifecycle. connect_websocket!/1 starts a LangChain.WebSocket GenServer via start_link/1, linking it to the calling process. The PID is stored in the model struct's :websocket field. There is no supervisor, automatic reconnection, or health monitoring built in.

This means:

  • If the calling process exits, the WebSocket is terminated (process link).
  • The WebSocket PID cannot be serialized. If the model struct is persisted to a database and restored later, the :websocket field will be stale.
  • The server may close idle connections at any time. There is no automatic reconnection.
  • There is no retry logic for WebSocket failures (unlike the HTTP transport).

The WebSocket transport is best suited for short-lived, synchronous sessions where you control the full lifecycle. It is not currently safe for long-lived agent processes, human-in-the-loop workflows with interruptions, or any scenario where the model struct is serialized and restored across process boundaries.

For long-running or interruptible workloads, use the default HTTP transport.
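
One way to keep that lifecycle contained is to pair the connect and disconnect calls with try/after, so the socket is closed even if the run raises (a sketch built from the functions documented below):

model =
  ChatOpenAIResponses.new!(%{model: "gpt-4o"})
  |> ChatOpenAIResponses.connect_websocket!()

try do
  %{llm: model}
  |> LLMChain.new!()
  |> LLMChain.add_message(Message.new_user!("Hello"))
  |> LLMChain.run()
after
  # always close the socket, even on error
  ChatOpenAIResponses.disconnect_websocket!(model)
end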

Known Limitation: temperature and top_p

The :temperature and :top_p parameters are currently excluded from WebSocket payloads due to an OpenAI bug that silently closes the connection when these are sent as decimals. A Logger.warning is emitted when these values are dropped. This workaround will be removed once OpenAI resolves the issue.

Connection Retry Behavior

The retry_count option controls how many times a request is retried when a pooled HTTP connection turns out to be stale (server closed it between requests). This is a transport-level issue where retrying with a fresh connection is the correct response.

Only closed-connection errors are retried. Timeouts, rate limits (429), overloaded (529), authentication errors, and invalid requests all return immediately -- they are not problems that a simple retry will fix.

retry_count    Total HTTP requests
0              1 (no retries)
1              2 (1 initial + 1 retry)
2 (default)    3 (1 initial + 2 retries)

Req's built-in HTTP retry is disabled to prevent the two retry layers from compounding. See GitHub issue #503.

When running LLM calls from a background job queue (e.g., Oban) that has its own retry logic, set retry_count: 0 so there are no hidden retries:

ChatOpenAIResponses.new!(%{model: "...", retry_count: 0})

Summary

Functions

connect_websocket(model)
Open a LangChain.WebSocket connection using the model's endpoint and API key.

connect_websocket!(model)
Like connect_websocket/1 but raises on failure.

content_part_for_api(model, part)
Convert a ContentPart to the expected map of data for the OpenAI API.

content_parts_for_api(model, content_parts)
Convert a list of ContentParts to the expected map of data for the OpenAI API.

disconnect_websocket!(model)
Close the WebSocket connection associated with this model.

for_api(openai, messages, tools)
Return the params formatted for an API request.

new(attrs \\ %{})
Setup a ChatOpenAIResponses client configuration.

new!(attrs \\ %{})
Setup a ChatOpenAIResponses client configuration and return it or raise an error if invalid.

restore_from_map(data)
Restores the model from the config.

retry_on_fallback?(arg1)
Determine if an error should be retried with a fallback model. Aligns with other providers.

serialize_config(model)
Generate a config map that can later restore the model's configuration.

Types

t()

@type t() :: %LangChain.ChatModels.ChatOpenAIResponses{
  api_key: term(),
  callbacks: term(),
  endpoint: term(),
  include: term(),
  json_response: term(),
  json_schema: term(),
  json_schema_name: term(),
  max_output_tokens: term(),
  model: term(),
  previous_response_id: term(),
  reasoning: term(),
  receive_timeout: term(),
  req_config: term(),
  retry_count: term(),
  store: term(),
  stream: term(),
  temperature: term(),
  tool_choice: term(),
  top_p: term(),
  truncation: term(),
  user: term(),
  verbose_api: term(),
  verbosity: term(),
  websocket: term()
}

Functions

connect_websocket(model)

@spec connect_websocket(t()) :: {:ok, t()} | {:error, String.t()}

Open a LangChain.WebSocket connection using the model's endpoint and API key.

Returns {:ok, model} with the :websocket field set to the WebSocket PID, or {:error, reason} on failure.

Requires the optional mint_web_socket dependency.

Example

{:ok, model} = ChatOpenAIResponses.connect_websocket(model)

connect_websocket!(model)

@spec connect_websocket!(t()) :: t() | no_return()

Like connect_websocket/1 but raises on failure.

Example

model =
  ChatOpenAIResponses.new!(%{model: "gpt-4o"})
  |> ChatOpenAIResponses.connect_websocket!()

# ... use model in chains ...

ChatOpenAIResponses.disconnect_websocket!(model)

content_part_for_api(model, part)

Convert a ContentPart to the expected map of data for the OpenAI API.
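
A hypothetical call (the exact shape of the returned map depends on the part type and the fields the Responses API expects):

part = ContentPart.text!("Hello")
# returns a plain map for the API payload, e.g. a text input part
ChatOpenAIResponses.content_part_for_api(model, part)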

content_parts_for_api(model, content_parts)

Convert a list of ContentParts to the expected map of data for the OpenAI API.

decode_stream(arg, done \\ [])

disconnect_websocket!(model)

@spec disconnect_websocket!(t()) :: t()

Close the WebSocket connection associated with this model.

Returns the model with :websocket set to nil. Safe to call even if the WebSocket is already closed.

do_api_request(openai, messages, tools, retry_count \\ nil)

@spec do_api_request(
  t(),
  [LangChain.Message.t()],
  LangChain.ChatModels.ChatModel.tools(),
  integer()
) ::
  list() | struct() | {:error, LangChain.LangChainError.t()}

for_api(model, fun)

for_api(openai, messages, tools)

@spec for_api(
  t() | LangChain.Message.t() | LangChain.Function.t(),
  message :: [map()],
  LangChain.ChatModels.ChatModel.tools()
) :: %{required(atom()) => any()}

Return the params formatted for an API request.
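
For example, to inspect the request body that would be sent (a sketch; the exact keys in the returned map are determined by the model's settings):

model = ChatOpenAIResponses.new!(%{model: "gpt-4o"})
messages = [Message.new_user!("Hello")]

# the plain map that would be JSON-encoded for the API request
params = ChatOpenAIResponses.for_api(model, messages, [])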

native_tool_call_for_api(model, arg2)

@spec native_tool_call_for_api(any(), any()) ::
  nil | %{id: any(), status: any(), type: <<_::120>>}

native_tool_calls_for_api(model, content_parts)

new(attrs \\ %{})

@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}

Setup a ChatOpenAIResponses client configuration.

new!(attrs \\ %{})

@spec new!(attrs :: map()) :: t() | no_return()

Setup a ChatOpenAIResponses client configuration and return it or raise an error if invalid.

restore_from_map(data)

Restores the model from the config.

retry_on_fallback?(arg1)

@spec retry_on_fallback?(LangChain.LangChainError.t()) :: boolean()

Determine if an error should be retried with a fallback model. Aligns with other providers.
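
A hypothetical usage (which error types qualify for a fallback retry is determined by the implementation):

# `error` is assumed to be a LangChainError from a failed request
if ChatOpenAIResponses.retry_on_fallback?(error) do
  # try the next model in the fallback list
end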

serialize_config(model)

@spec serialize_config(t()) :: %{required(String.t()) => any()}

Generate a config map that can later restore the model's configuration.
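
A round trip through serialization (a sketch; it assumes restore_from_map/1 returns an :ok tuple on success):

config = ChatOpenAIResponses.serialize_config(model)
# persist `config` (a plain map) wherever needed, then later:
{:ok, restored} = ChatOpenAIResponses.restore_from_map(config)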