LangChain.ChatModels.ChatOllamaAI (LangChain v0.8.4)


Represents the Ollama AI Chat model

Parses and validates inputs for making requests to the Ollama Chat API.

Converts responses into more specialized LangChain data structures.

The module's functionalities include:

  • Initializing a new ChatOllamaAI struct with defaults or specific attributes.
  • Validating and casting input data to fit the expected schema.
  • Preparing and sending requests to the Ollama AI service API.
  • Managing both streaming and non-streaming API responses.
  • Processing API responses to convert them into suitable message formats.

The ChatOllamaAI struct has fields to configure the AI, including but not limited to:

  • endpoint: URL of the Ollama AI service.
  • model: The AI model used, e.g., "llama2:latest".
  • receive_timeout: Max wait time for AI service responses.
  • temperature: Influences the AI's response creativity.
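For instance, the fields above can be set together when building the struct (the values here are illustrative, not recommendations):

```elixir
{:ok, chat} = LangChain.ChatModels.ChatOllamaAI.new(%{
  # URL of the Ollama AI service (this is also the default)
  endpoint: "http://localhost:11434/api/chat",
  # the model to run
  model: "llama2:latest",
  # max wait time for a response, in milliseconds
  receive_timeout: 60_000,
  # lower values make responses more deterministic
  temperature: 0.5
})
```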

For details on all other parameters, see the Ollama documentation: https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values

This module is for use within LangChain and follows the ChatModel behavior, outlining callbacks AI chat models must implement.

Usage examples and more details are in the LangChain documentation or the module's function docs.

Callbacks

See the set of available callbacks: LangChain.Chains.ChainCallbacks

Tool Support

ChatOllamaAI supports tool calls in both streaming and non-streaming modes. Tools are defined using LangChain.Function and passed to the chain or call.

Not all Ollama models support tool calling. Models that support tools include llama3.1, mistral, qwen2.5, and others. Check the Ollama model library for the latest supported models.

Example: Non-streaming with tools

{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest", stream: false})

weather_tool = LangChain.Function.new!(%{
  name: "get_weather",
  description: "Get the current weather for a location",
  parameters: [
    LangChain.FunctionParam.new!(%{
      name: "location",
      type: :string,
      description: "City name"
    })
  ],
  function: fn %{"location" => location}, _context ->
    {:ok, "72F and sunny in #{location}"}
  end
})

{:ok, result} = ChatOllamaAI.call(chat,
  [LangChain.Message.new_user!("What's the weather in Portland?")],
  [weather_tool]
)

Example: Streaming with tools

{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest", stream: true})

# Streaming returns a list of MessageDelta structs followed by a final Message.
# When tool calls are present, they appear in the delta's tool_calls field.
{:ok, deltas} = ChatOllamaAI.call(chat,
  [LangChain.Message.new_user!("What's the weather in Portland?")],
  [weather_tool]
)

Configuration

The Ollama endpoint defaults to http://localhost:11434/api/chat. Override it using the :endpoint option:

ChatOllamaAI.new(%{
  model: "llama3.1:latest",
  endpoint: "http://my-ollama-host:11434/api/chat"
})

Structured Outputs (:format)

Ollama supports server-side structured output via the request's top-level format field. Set the :format option to either "json" for plain JSON mode, or a JSON Schema map for schema-enforced generation:

{:ok, chat} = ChatOllamaAI.new(%{
  model: "llama3.1:latest",
  format: %{
    "type" => "object",
    "required" => ["name", "city"],
    "properties" => %{
      "name" => %{"type" => "string"},
      "city" => %{"type" => "string"}
    }
  }
})

See https://github.com/ollama/ollama/blob/main/docs/api.md#request-structured-outputs.

Multimodal (images)

Ollama accepts images on user messages via a top-level images array of base64-encoded strings. Provide image content as :image ContentParts on a user message; ChatOllamaAI strips them out of the message content and re-attaches them to the images field on the wire:

alias LangChain.Message
alias LangChain.Message.ContentPart

{:ok, bytes} = File.read("photo.jpg")

user_msg = Message.new_user!([
  ContentPart.text!("What's in this picture?"),
  ContentPart.image!(Base.encode64(bytes), media: :jpg)
])

ChatOllamaAI.call(chat, [user_msg], [])

Note: :image_url content parts are not supported because the Ollama server has no URL fetcher — they will raise. Fetch the bytes yourself and pass them as :image parts.

Connection Retry Behavior

The retry_count option controls how many times a request is retried when a pooled HTTP connection turns out to be stale (server closed it between requests). This is a transport-level issue where retrying with a fresh connection is the correct response.

Only closed-connection errors are retried. Timeouts, rate limits (429), overloaded responses (529), authentication errors, and invalid requests all return immediately; these are not problems that a simple retry will fix.

retry_count    Total HTTP requests
0              1 (no retries)
1              2 (1 initial + 1 retry)
2 (default)    3 (1 initial + 2 retries)

Req's built-in HTTP retry is disabled to prevent the two retry layers from compounding. See GitHub issue #503.

When running LLM calls from a background job queue (e.g., Oban) that has its own retry logic, set retry_count: 0 so there are no hidden retries:

ChatOllamaAI.new!(%{model: "...", retry_count: 0})

Summary

Functions

Calls the Ollama chat completion API using the configured struct, passing either a single message or a list of messages to act as the prompt.

Return the params formatted for an API request.

Creates a new ChatOllamaAI struct with the given attributes.

Creates a new ChatOllamaAI struct with the given attributes. Will raise an error if the changeset is invalid.

Restores the model from the config.

Determine if an error should be retried. If true, a fallback LLM may be used. If false, the error is understood to be fundamental to the request rather than a service issue, and it should not be retried or fall back to another service.

Generate a config map that can later restore the model's configuration.

Types

t()

@type t() :: %LangChain.ChatModels.ChatOllamaAI{
  callbacks: term(),
  endpoint: term(),
  format: term(),
  keep_alive: term(),
  mirostat: term(),
  mirostat_eta: term(),
  mirostat_tau: term(),
  model: term(),
  num_ctx: term(),
  num_gpu: term(),
  num_gqa: term(),
  num_predict: term(),
  num_thread: term(),
  receive_timeout: term(),
  repeat_last_n: term(),
  repeat_penalty: term(),
  retry_count: term(),
  seed: term(),
  stop: term(),
  stream: term(),
  temperature: term(),
  tfs_z: term(),
  top_k: term(),
  top_p: term(),
  verbose_api: term()
}

Functions

call(ollama_ai, prompt, tools \\ [])

Calls the Ollama chat completion API using the configured struct, passing either a single message or a list of messages to act as the prompt.

NOTE: This function can be used directly, but the primary interface should be through LangChain.Chains.LLMChain. The ChatOllamaAI module is more focused on translating the LangChain data structures to and from the Ollama API.

Another benefit of using LangChain.Chains.LLMChain is that it combines message storage, function registration, and custom context to pass to functions, and it automatically applies LangChain.MessageDelta structs as they are received, converting them into a complete LangChain.Message once finished.
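A minimal sketch of that chain-based flow (it assumes a running Ollama server, and the exact return shape of LLMChain.run/1 may differ between LangChain versions):

```elixir
alias LangChain.Chains.LLMChain
alias LangChain.ChatModels.ChatOllamaAI
alias LangChain.Message

{:ok, chat} = ChatOllamaAI.new(%{model: "llama3.1:latest"})

{:ok, updated_chain} =
  %{llm: chat}
  |> LLMChain.new!()
  |> LLMChain.add_message(Message.new_user!("Hello!"))
  |> LLMChain.run()

# The assistant's reply is available on the updated chain
updated_chain.last_message
```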

do_process_response(model, response)

for_api(msg)

for_api(model, messages, tools)

Return the params formatted for an API request.

new(attrs \\ %{})

@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}

Creates a new ChatOllamaAI struct with the given attributes.

new!(attrs \\ %{})

@spec new!(attrs :: map()) :: t() | no_return()

Creates a new ChatOllamaAI struct with the given attributes. Will raise an error if the changeset is invalid.

restore_from_map(data)

Restores the model from the config.

retry_on_fallback?(arg1)

@spec retry_on_fallback?(LangChain.LangChainError.t()) :: boolean()

Determine if an error should be retried. If true, a fallback LLM may be used. If false, the error is understood to be fundamental to the request rather than a service issue, and it should not be retried or fall back to another service.

serialize_config(model)

@spec serialize_config(t()) :: %{required(String.t()) => any()}

Generate a config map that can later restore the model's configuration.
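A sketch of the serialize/restore round trip using serialize_config/1 together with restore_from_map/1 (the field values are illustrative):

```elixir
alias LangChain.ChatModels.ChatOllamaAI

model = ChatOllamaAI.new!(%{model: "llama3.1:latest", temperature: 0.2})

# Serialize to a plain map, e.g. for persisting to a database
config = ChatOllamaAI.serialize_config(model)

# Later, rebuild an equivalent model struct from that map
{:ok, restored} = ChatOllamaAI.restore_from_map(config)
```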