Represents the Ollama AI Chat model.
Parses and validates inputs for making requests to the Ollama Chat API.
Converts responses into more specialized LangChain data structures.
The module's functionalities include:
- Initializing a new ChatOllamaAI struct with defaults or specific attributes.
- Validating and casting input data to fit the expected schema.
- Preparing and sending requests to the Ollama AI service API.
- Managing both streaming and non-streaming API responses.
- Processing API responses to convert them into suitable message formats.
The ChatOllamaAI struct has fields to configure the AI, including but not limited to:
- endpoint: URL of the Ollama AI service.
- model: The AI model used, e.g., "llama2:latest".
- receive_timeout: Max wait time for AI service responses.
- temperature: Influences the AI's response creativity.
For detailed info on all other parameters, see the documentation here: https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
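For example, a minimal configuration might look like the following sketch (the values shown are illustrative, not recommendations):
{:ok, chat} =
  ChatOllamaAI.new(%{
    model: "llama2:latest",
    endpoint: "http://localhost:11434/api/chat",
    receive_timeout: 60_000,
    temperature: 0.8
  })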
This module is for use within LangChain and implements the ChatModel behaviour,
which defines the callbacks all AI chat models must implement.
Usage examples and more details are in the LangChain documentation or the module's function docs.
Callbacks
See the set of available callbacks: LangChain.Chains.ChainCallbacks
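As a minimal sketch, a handler map can be registered on an LLMChain. The handler keys below come from LangChain.Chains.ChainCallbacks, but the exact callback arities may vary by library version:
alias LangChain.Chains.LLMChain

handler = %{
  # Fires for each streamed MessageDelta.
  on_llm_new_delta: fn _chain, delta -> IO.inspect(delta, label: "DELTA") end,
  # Fires when a complete message has been processed.
  on_message_processed: fn _chain, message -> IO.inspect(message, label: "MESSAGE") end
}

chain =
  %{llm: ChatOllamaAI.new!(%{model: "llama3.1:latest", stream: true})}
  |> LLMChain.new!()
  |> LLMChain.add_callback(handler)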
Tool Support
ChatOllamaAI supports tool calls in both streaming and non-streaming modes.
Tools are defined using LangChain.Function and passed to the chain or call.
Not all Ollama models support tool calling. Models that support tools include
llama3.1, mistral, qwen2.5, and others. Check the
Ollama model library for the latest supported models.
Example: Non-streaming with tools
{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest", stream: false})
weather_tool = LangChain.Function.new!(%{
name: "get_weather",
description: "Get the current weather for a location",
parameters: [
LangChain.FunctionParam.new!(%{
name: "location",
type: :string,
description: "City name"
})
],
function: fn %{"location" => location}, _context ->
{:ok, "72F and sunny in #{location}"}
end
})
{:ok, result} = ChatOllamaAI.call(chat,
[LangChain.Message.new_user!("What's the weather in Portland?")],
[weather_tool]
)

Example: Streaming with tools
{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest", stream: true})
# Streaming returns a list of MessageDelta structs followed by a final Message.
# When tool calls are present, they appear in the delta's tool_calls field.
{:ok, deltas} = ChatOllamaAI.call(chat,
[LangChain.Message.new_user!("What's the weather in Portland?")],
[weather_tool]
)
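When not using LLMChain (which merges deltas automatically), the final message can be assembled by hand. A minimal sketch, continuing from the deltas above and assuming LangChain.MessageDelta.merge_delta/2 and to_message/1:
alias LangChain.MessageDelta

combined =
  deltas
  |> List.flatten()
  |> Enum.reduce(nil, fn
    delta, nil -> delta
    delta, acc -> MessageDelta.merge_delta(acc, delta)
  end)

# Converts a complete delta into a full LangChain.Message.
{:ok, message} = MessageDelta.to_message(combined)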
Configuration

The Ollama endpoint defaults to http://localhost:11434/api/chat. Override
it using the :endpoint option:
ChatOllamaAI.new(%{
model: "llama3.1:latest",
endpoint: "http://my-ollama-host:11434/api/chat"
})

Structured Outputs (:format)
Ollama supports server-side structured output via the request's top-level
format field. Set the :format option to either "json" for plain JSON
mode, or a JSON Schema map for schema-enforced generation:
{:ok, chat} = ChatOllamaAI.new(%{
model: "llama3.1:latest",
format: %{
"type" => "object",
"required" => ["name", "city"],
"properties" => %{
"name" => %{"type" => "string"},
"city" => %{"type" => "string"}
}
}
})

See https://github.com/ollama/ollama/blob/main/docs/api.md#request-structured-outputs.
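A minimal sketch of consuming the schema-constrained reply, assuming Jason is available (LangChain itself depends on it) and hedging on the return shape of call/3, which has varied across versions:
alias LangChain.Message

message =
  case ChatOllamaAI.call(chat, [Message.new_user!("Tell me about Canada.")], []) do
    {:ok, [msg | _]} -> msg
    {:ok, msg} -> msg
  end

# Depending on version, content is a raw JSON string or a list of ContentParts.
text =
  case message.content do
    text when is_binary(text) -> text
    [%LangChain.Message.ContentPart{content: text} | _] -> text
  end

{:ok, data} = Jason.decode(text)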
Multimodal (images)
Ollama accepts images on user messages via a top-level images array of
base64-encoded strings. Provide image content as :image ContentParts
on a user message; ChatOllamaAI strips them out of the message content
and re-attaches them to the images field on the wire:
{:ok, bytes} = File.read("photo.jpg")
user_msg = Message.new_user!([
ContentPart.text!("What's in this picture?"),
ContentPart.image!(Base.encode64(bytes), media: :jpg)
])
ChatOllamaAI.call(chat, [user_msg], [])

Note: :image_url content parts are not supported because the Ollama
server has no URL fetcher — they will raise. Fetch the bytes yourself and
pass them as :image parts.
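For remote images, a minimal sketch of doing the fetch yourself, using Req (already a dependency of this library) and a placeholder URL:
alias LangChain.Message
alias LangChain.Message.ContentPart

# Placeholder URL: fetch the bytes, then base64-encode them.
bytes = Req.get!("https://example.com/photo.png").body

user_msg =
  Message.new_user!([
    ContentPart.text!("What's in this picture?"),
    ContentPart.image!(Base.encode64(bytes), media: :png)
  ])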
Connection Retry Behavior
The retry_count option controls how many times a request is retried when
a pooled HTTP connection turns out to be stale (server closed it between
requests). This is a transport-level issue where retrying with a fresh
connection is the correct response.
Only closed-connection errors are retried. Timeouts, rate limits (429), overloaded responses (529), authentication errors, and invalid requests all return immediately; they are not problems that a simple retry will fix.
| retry_count | Total HTTP requests |
|---|---|
| 0 | 1 (no retries) |
| 1 | 2 (1 initial + 1 retry) |
| 2 (default) | 3 (1 initial + 2 retries) |
Req's built-in HTTP retry is disabled to prevent the two retry layers from compounding. See GitHub issue #503.
When running LLM calls from a background job queue (e.g., Oban) that has its
own retry logic, set retry_count: 0 so there are no hidden retries:
ChatOllamaAI.new!(%{model: "...", retry_count: 0})
Summary
Functions
Calls the Ollama Chat Completion API using the configured struct, plus either a simple message or a list of messages to act as the prompt.
Return the params formatted for an API request.
Creates a new ChatOllamaAI struct with the given attributes.
Creates a new ChatOllamaAI struct with the given attributes. Will raise an error if the changeset is invalid.
Restores the model from the config.
Determine if an error should be retried. If true, a fallback LLM may be
used. If false, the error is understood to be a more fundamental problem with
the request rather than a service issue, and it should neither be retried nor
fall back to another service.
Generate a config map that can later restore the model's configuration.
Types
@type t() :: %LangChain.ChatModels.ChatOllamaAI{
  callbacks: term(),
  endpoint: term(),
  format: term(),
  keep_alive: term(),
  mirostat: term(),
  mirostat_eta: term(),
  mirostat_tau: term(),
  model: term(),
  num_ctx: term(),
  num_gpu: term(),
  num_gqa: term(),
  num_predict: term(),
  num_thread: term(),
  receive_timeout: term(),
  repeat_last_n: term(),
  repeat_penalty: term(),
  retry_count: term(),
  seed: term(),
  stop: term(),
  stream: term(),
  temperature: term(),
  tfs_z: term(),
  top_k: term(),
  top_p: term(),
  verbose_api: term()
}
Functions
Calls the Ollama Chat Completion API using the configured struct, plus either a simple message or a list of messages to act as the prompt.
NOTE: This function can be used directly, but the primary interface
should be through LangChain.Chains.LLMChain. The ChatOllamaAI module is more focused on
translating the LangChain data structures to and from the Ollama API.
Another benefit of using LangChain.Chains.LLMChain is that it combines the
storage of messages, adding functions, adding custom context that should be
passed to functions, and automatically applying LangChain.MessageDelta
structs as they are received, then converting those to the full
LangChain.Message once fully complete.
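A minimal sketch of the LLMChain route (function names follow the documented LLMChain API; run/1 returning {:ok, chain} reflects recent versions and may differ in older ones):
alias LangChain.Chains.LLMChain
alias LangChain.Message

{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest"})

{:ok, chain} =
  %{llm: chat}
  |> LLMChain.new!()
  |> LLMChain.add_message(Message.new_user!("Hello!"))
  |> LLMChain.run()

chain.last_message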
Return the params formatted for an API request.
@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}
Creates a new ChatOllamaAI struct with the given attributes.
Creates a new ChatOllamaAI struct with the given attributes. Will raise an error if the changeset is invalid.
Restores the model from the config.
@spec retry_on_fallback?(LangChain.LangChainError.t()) :: boolean()
Determine if an error should be retried. If true, a fallback LLM may be
used. If false, the error is understood to be a more fundamental problem with
the request rather than a service issue, and it should neither be retried nor
fall back to another service.
Generate a config map that can later restore the model's configuration.
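A minimal round-trip sketch, assuming the ChatModel behaviour's serialize_config/1 and restore_from_map/1 callbacks described above:
chat = ChatOllamaAI.new!(%{model: "llama3.1:latest", temperature: 0.2})

# A plain map, safe to persist (e.g., as JSON) and restore later.
config = ChatOllamaAI.serialize_config(chat)

{:ok, restored} = ChatOllamaAI.restore_from_map(config)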