Ollama (Ollama v0.7.1)

Ollama-ex


Ollama is a powerful tool for running large language models locally or on your own infrastructure. This library provides an interface for working with Ollama in Elixir.

  • 🦙 Full implementation of the Ollama API
  • 🛜 Support for streaming requests (to an Enumerable or any Elixir process)
  • 🛠️ Tool use (Function calling) capability

Installation

The package can be installed by adding ollama to your list of dependencies in mix.exs.

def deps do
  [
    {:ollama, "0.7.1"}
  ]
end

Quickstart

Assuming you have Ollama running on localhost and have installed a model, use completion/2 or chat/2 to interact with the model.

1. Generate a completion

iex> client = Ollama.init()

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}

2. Generate the next message in a chat

iex> client = Ollama.init()
iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

Streaming

Streaming is supported on certain endpoints by setting the :stream option to true or a pid/0.

When :stream is set to true, a lazy Enumerable.t/0 is returned, which can be used with any Stream functions.

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, stream}

iex> is_function(stream, 2)
true

iex> stream
...> |> Stream.each(& Process.send(pid, &1, []))
...> |> Stream.run()
:ok
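
Because the stream is a standard Enumerable.t/0, the chunks can also be consumed eagerly. A minimal sketch, assuming a completion stream whose chunks carry a "response" key as in the examples above:

iex> {:ok, stream} = Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
iex> stream |> Enum.map(& &1["response"]) |> Enum.join()
"The sky is blue because it is the color of the sky."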

The approach above builds the Enumerable.t/0 by calling receive, which may cause issues in GenServer callbacks. As an alternative, you can set the :stream option to a pid/0. This returns a Task.t/0 that sends messages to the specified process.

The following example demonstrates a streaming request in a LiveView event, sending each streaming message back to the same LiveView process:

defmodule MyApp.ChatLive do
  use Phoenix.LiveView

  # When the client invokes the "prompt" event, create a streaming request and
  # asynchronously send messages back to self.
  def handle_event("prompt", %{"message" => prompt}, socket) do
    {:ok, task} = Ollama.completion(Ollama.init(), [
      model: "llama2",
      prompt: prompt,
      stream: self()
    ])

    {:noreply, assign(socket, current_request: task)}
  end

  # The streaming request sends messages back to the LiveView process.
  def handle_info({_request_pid, {:data, _data}} = message, socket) do
    pid = socket.assigns.current_request.pid

    socket =
      case message do
        {^pid, {:data, %{"done" => false} = _data}} ->
          # handle each streaming chunk
          socket

        {^pid, {:data, %{"done" => true} = _data}} ->
          # handle the final streaming chunk
          socket

        {_pid, _data} ->
          # this message was not expected!
          socket
      end

    {:noreply, socket}
  end

  # Tidy up when the request is finished
  def handle_info({ref, {:ok, %Req.Response{status: 200}}}, socket) do
    Process.demonitor(ref, [:flush])
    {:noreply, assign(socket, current_request: nil)}
  end
end

Regardless of the streaming approach used, each streaming message is a plain map/0. For the message schema, refer to the Ollama API docs.
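
For reference, an intermediate completion chunk looks roughly like the map below (fields abridged and values illustrative; the final chunk has "done" => true and carries additional stats):

%{
  "model" => "llama2",
  "created_at" => "2024-01-01T00:00:00.000000Z",
  "response" => "The",
  "done" => false
}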

Function calling

Ollama 0.3 and later versions support tool use and function calling on compatible models. Note that Ollama currently doesn't support tool use with streaming requests, so avoid setting :stream to true.

Using tools typically involves at least two round-trip requests to the model. Begin by defining one or more tools using a schema similar to ChatGPT's. Provide clear and concise descriptions for the tool and each argument.

iex> stock_price_tool = %{
...>   type: "function",
...>   function: %{
...>     name: "get_stock_price",
...>     description: "Fetches the live stock price for the given ticker.",
...>     parameters: %{
...>       type: "object",
...>       properties: %{
...>         ticker: %{
...>           type: "string",
...>           description: "The ticker symbol of a specific stock."
...>         }
...>       },
...>       required: ["ticker"]
...>     }
...>   }
...> }

The first round-trip involves sending a prompt in a chat with the tool definitions. The model should respond with a message containing a list of tool calls.

iex> Ollama.chat(client, [
...>   model: "mistral-nemo",
...>   messages: [
...>     %{role: "user", content: "What is the current stock price for Apple?"}
...>   ],
...>   tools: [stock_price_tool]
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "",
  "tool_calls" => [
    %{"function" => %{
      "name" => "get_stock_price",
      "arguments" => %{"ticker" => "AAPL"}
    }}
  ]
}, ...}}

Your implementation must intercept these tool calls and execute a corresponding function in your codebase with the specified arguments. The next round-trip involves passing the function's result back to the model as a message with a :role of "tool".
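
How you dispatch each tool call is up to you. A minimal sketch, assuming response holds the result of the first round-trip above and MyApp.Stocks.get_price/1 is a hypothetical function in your own codebase:

iex> Enum.map(response["message"]["tool_calls"], fn
...>   %{"function" => %{"name" => "get_stock_price", "arguments" => %{"ticker" => ticker}}} ->
...>     # call into your own codebase and return the result as a "tool" message
...>     %{role: "tool", content: MyApp.Stocks.get_price(ticker)}
...> end)
[%{role: "tool", content: "$217.96"}]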

iex> Ollama.chat(client, [
...>   model: "mistral-nemo",
...>   messages: [
...>     %{role: "user", content: "What is the current stock price for Apple?"},
...>     %{role: "assistant", content: "", tool_calls: [%{"function" => %{"name" => "get_stock_price", "arguments" => %{"ticker" => "AAPL"}}}]},
...>     %{role: "tool", content: "$217.96"},
...>   ],
...>   tools: [stock_price_tool],
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "The current stock price for Apple (AAPL) is approximately $217.96.",
}, ...}}

After receiving the tool's return value, the model responds to the user's original prompt, incorporating the function result into its response.

Summary

Types

  • client() - Client struct
  • message() - Chat message
  • response() - Client response
  • tool() - Tool definition

Functions

  • chat/2 - Generates the next message in a chat using the specified model. Optionally streamable.
  • check_blob/2 - Checks whether a blob exists in Ollama by its digest or binary data.
  • completion/2 - Generates a completion for the given prompt using the specified model. Optionally streamable.
  • copy_model/2 - Creates a model with another name from an existing model.
  • create_blob/2 - Creates a blob from its binary data.
  • create_model/2 - Creates a model using the given name and model file. Optionally streamable.
  • delete_model/2 - Deletes a model and its data.
  • embed/2 - Generate embeddings from a model for the given prompt.
  • embeddings/2 - Generate embeddings from a model for the given prompt. Deprecated; superseded by embed/2.
  • init/1 - Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
  • list_models/1 - Lists all models that Ollama has available.
  • list_running/1 - Lists currently running models, their memory footprint, and process details.
  • pull_model/2 - Downloads a model from the Ollama library. Optionally streamable.
  • push_model/2 - Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
  • show_model/2 - Shows all information for a specific model.

Types

@type client() :: %Ollama{req: Req.Request.t()}

Client struct

@type message() ::
  {:role, term()}
  | {:content, binary()}
  | {:images, [binary()]}
  | {:tool_calls, [%{optional(term()) => term()}]}

Chat message

A chat message is a map/0 with the following fields:

  • :role - Required. The role of the message, either system, user, assistant or tool.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
  • :tool_calls (list of map of term/0 keys and term/0 values) - (optional) List of tools the model wants to use.
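
For example, a multimodal user message with an attached image might look like this (the file path is illustrative):

%{
  role: "user",
  content: "Describe this image for me.",
  images: [Base.encode64(File.read!("path/to/image.png"))]
}
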
@type response() ::
  {:ok, map() | boolean() | Enumerable.t() | Task.t()} | {:error, term()}

Client response

@type tool() :: {:type, term()} | {:function, map()}

Tool definition

A tool definition is a map/0 with the following fields:

  • :type - Required. Type of tool. (Currently only "function" supported).
  • :function (map/0) - Required.
    • :name (String.t/0) - Required. The name of the function to be called.
    • :description (String.t/0) - A description of what the function does.
    • :parameters (map/0) - Required. The parameters the function accepts.

Functions

@spec chat(
  client(),
  keyword()
) :: response()

Generates the next message in a chat using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :messages (list of map/0) - Required. List of messages - used to keep a chat memory.
  • :tools (list of map/0) - Tools for the model to use if supported (requires stream to be false)
  • :format (String.t/0) - Set the expected format of the response (json).
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Message structure

Each message is a map with the following fields:

  • :role - Required. The role of the message, either system, user, assistant or tool.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
  • :tool_calls (list of map of term/0 keys and term/0 values) - (optional) List of tools the model wants to use.

Tool definitions

  • :type - Required. Type of tool. (Currently only "function" supported).
  • :function (map/0) - Required.
    • :name (String.t/0) - Required. The name of the function to be called.
    • :description (String.t/0) - A description of what the function does.
    • :parameters (map/0) - Required. The parameters the function accepts.

Examples

iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}

check_blob(client, digest)

@spec check_blob(client(), Ollama.Blob.digest() | binary()) :: response()

Checks whether a blob exists in Ollama by its digest or binary data.

Examples

iex> Ollama.check_blob(client, "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e")
{:ok, true}

iex> Ollama.check_blob(client, "this should not exist")
{:ok, false}

completion(client, params)

@spec completion(
  client(),
  keyword()
) :: response()

Generates a completion for the given prompt using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :prompt (String.t/0) - Required. Prompt to generate a response for.
  • :images (list of String.t/0) - A list of Base64 encoded images to be included with the prompt (for multimodal models only).
  • :system (String.t/0) - System prompt, overriding the model default.
  • :template (String.t/0) - Prompt template, overriding the model default.
  • :context - The context parameter returned from a previous completion/2 call (enabling short conversational memory).
  • :format (String.t/0) - Set the expected format of the response (json).
  • :raw (boolean/0) - Set true if specifying a fully templated prompt. (:template is ignored)
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Examples

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
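
# The :context value returned from one call can be passed to a follow-up
# request to retain short conversational memory. A sketch, assuming the
# first request succeeded.
iex> {:ok, first} = Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Answer again in one short sentence.",
...>   context: first["context"]
...> ])
{:ok, %{"response" => "Because of Rayleigh scattering.", ...}}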

copy_model(client, params)

@spec copy_model(
  client(),
  keyword()
) :: response()

Creates a model with another name from an existing model.

Options

  • :source (String.t/0) - Required. Name of the model to copy from.
  • :destination (String.t/0) - Required. Name of the model to copy to.

Example

iex> Ollama.copy_model(client, [
...>   source: "llama2",
...>   destination: "llama2-backup"
...> ])
{:ok, true}

create_blob(client, blob)

@spec create_blob(client(), binary()) :: response()

Creates a blob from its binary data.

Example

iex> Ollama.create_blob(client, data)
{:ok, true}

create_model(client, params)

@spec create_model(
  client(),
  keyword()
) :: response()

Creates a model using the given name and model file. Optionally streamable.

Any dependent blobs referenced in the modelfile, such as FROM and ADAPTER instructions, must exist first. See check_blob/2 and create_blob/2.

Options

  • :name (String.t/0) - Required. Name of the model to create.
  • :modelfile (String.t/0) - Required. Contents of the Modelfile.
  • :quantize (String.t/0) - Quantize f16 and f32 models when importing them.
  • :stream - See section on streaming. The default value is false.

Example

iex> modelfile = "FROM llama2\nSYSTEM \"You are mario from Super Mario Bros.\""
iex> Ollama.create_model(client, [
...>   name: "mario",
...>   modelfile: modelfile,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
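
# Since any blobs referenced in the Modelfile must already exist, you may
# want to check for them first. A sketch; the digest and file path below
# are illustrative.
iex> digest = "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e"
iex> with {:ok, false} <- Ollama.check_blob(client, digest) do
...>   Ollama.create_blob(client, File.read!("path/to/blob"))
...> end
{:ok, true}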

delete_model(client, params)

@spec delete_model(
  client(),
  keyword()
) :: response()

Deletes a model and its data.

Options

  • :name (String.t/0) - Required. Name of the model to delete.

Example

iex> Ollama.delete_model(client, name: "llama2")
{:ok, true}
@spec embed(
  client(),
  keyword()
) :: response()

Generate embeddings from a model for the given prompt.

Options

  • :model (String.t/0) - Required. The name of the model used to generate the embeddings.
  • :input - Required. Text or list of text to generate embeddings for.
  • :truncate (boolean/0) - Truncates the end of each input to fit within context length.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Example

iex> Ollama.embed(client, [
...>   model: "nomic-embed-text",
...>   input: ["Why is the sky blue?", "Why is the grass green?"]
...> ])
{:ok, %{"embedding" => [
  [ 0.009724553, 0.04449892, -0.14063916, 0.0013168337, 0.032128844,
    0.10730086, -0.008447222, 0.010106917, 5.2289694e-4, -0.03554127, ...],
  [ 0.028196355, 0.043162502, -0.18592504, 0.035034444, 0.055619627,
    0.12082449, -0.0090096295, 0.047170386, -0.032078084, 0.0047163847, ...]
]}}

embeddings(client, params)

This function is deprecated. Superseded by embed/2.
@spec embeddings(
  client(),
  keyword()
) :: response()

Generate embeddings from a model for the given prompt.

Options

  • :model (String.t/0) - Required. The name of the model used to generate the embeddings.
  • :prompt (String.t/0) - Required. The prompt used to generate the embedding.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Example

iex> Ollama.embeddings(client, [
...>   model: "llama2",
...>   prompt: "Here is an article about llamas..."
...> ])
{:ok, %{"embedding" => [
  0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
  0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
]}}
@spec init(Req.url() | keyword() | Req.Request.t()) :: client()

Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.

If no arguments are given, the client is initiated with the default options:

@default_req_opts [
  base_url: "http://localhost:11434/api",
  receive_timeout: 60_000
]

Examples

iex> client = Ollama.init("https://ollama.service.ai:11434/api")
%Ollama{}
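
# A keyword list is forwarded to Req.new/1, so any Req option may be set.
# A sketch using the same options as the defaults shown above.
iex> client = Ollama.init(base_url: "http://localhost:11434/api", receive_timeout: 120_000)
%Ollama{}
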
@spec list_models(client()) :: response()

Lists all models that Ollama has available.

Example

iex> Ollama.list_models(client)
{:ok, %{"models" => [
  %{"name" => "codellama:13b", ...},
  %{"name" => "llama2:latest", ...},
]}}
@spec list_running(client()) :: response()

Lists currently running models, their memory footprint, and process details.

Example

iex> Ollama.list_running(client)
{:ok, %{"models" => [
  %{"name" => "nomic-embed-text:latest", ...},
]}}

pull_model(client, params)

@spec pull_model(
  client(),
  keyword()
) :: response()

Downloads a model from the Ollama library. Optionally streamable.

Options

  • :name (String.t/0) - Required. Name of the model to pull.
  • :stream - See section on streaming. The default value is false.

Example

iex> Ollama.pull_model(client, name: "llama2")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.pull_model(client, name: "llama2", stream: true)
{:ok, %Ollama.Streaming{}}

push_model(client, params)

@spec push_model(
  client(),
  keyword()
) :: response()

Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.

Options

  • :name (String.t/0) - Required. Name of the model to push, e.g. "mattw/pygmalion:latest".
  • :stream - See section on streaming. The default value is false.

Example

iex> Ollama.push_model(client, name: "mattw/pygmalion:latest")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest", stream: true)
{:ok, %Ollama.Streaming{}}

show_model(client, params)

@spec show_model(
  client(),
  keyword()
) :: response()

Shows all information for a specific model.

Options

  • :name (String.t/0) - Required. Name of the model to show.

Example

iex> Ollama.show_model(client, name: "llama2")
{:ok, %{
  "details" => %{
    "families" => ["llama", "clip"],
    "family" => "llama",
    "format" => "gguf",
    "parameter_size" => "7B",
    "quantization_level" => "Q4_0"
  },
  "modelfile" => "...",
  "parameters" => "...",
  "template" => "..."
}}