Ollama provides cloud capabilities: web search, web fetch, cloud models, and a hosted API. This guide covers each of them.

API Key Setup

Cloud features require an Ollama API key:

  1. Create an account at https://ollama.com
  2. Generate a key at https://ollama.com/settings/keys
  3. Export the key:
export OLLAMA_API_KEY="your_api_key_here"

The client automatically uses OLLAMA_API_KEY when set.
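If you prefer to fail fast, you can check for the key at startup instead of discovering a 401 on the first cloud call. A minimal sketch; the `KeyCheck` module name is illustrative, not part of Ollixir:

```elixir
defmodule KeyCheck do
  @moduledoc "Sketch: verify OLLAMA_API_KEY is set before using cloud features."

  # Accepts an env map so the check is easy to test; defaults to the real env.
  def present?(env \\ System.get_env()) do
    case env["OLLAMA_API_KEY"] do
      key when is_binary(key) and key != "" -> true
      _ -> false
    end
  end
end

unless KeyCheck.present?() do
  IO.puts(:stderr, "OLLAMA_API_KEY is not set; cloud features will fail with 401")
end
```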

Web Search

Search the web and get structured results:

{:ok, response} = Ollixir.web_search(client, query: "Elixir programming language")

for result <- response.results do
  IO.puts("#{result.title}")
  IO.puts("  #{result.url}")
  IO.puts("  #{result.content}")
end

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| max_results | integer | 3 | Maximum results to return |

Response Structure

%Ollixir.Web.SearchResponse{
  results: [
    %Ollixir.Web.SearchResult{
      title: "Elixir Programming Language",
      url: "https://elixir-lang.org",
      content: "Elixir is a dynamic, functional language..."
    }
  ]
}
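The results list can be processed like any list of structs, for example collecting just the URLs. A sketch using plain maps with the same fields; field access works identically on %Ollixir.Web.SearchResult{} structs:

```elixir
# Stand-in for response.results; the real structs expose the same fields.
results = [
  %{title: "Elixir Programming Language", url: "https://elixir-lang.org", content: "..."},
  %{title: "Elixir (Wikipedia)", url: "https://en.wikipedia.org/wiki/Elixir_(programming_language)", content: "..."}
]

urls = Enum.map(results, & &1.url)
```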

Web Fetch

Fetch and extract content from a URL:

{:ok, response} = Ollixir.web_fetch(client, url: "https://elixir-lang.org")

IO.puts("Title: #{response.title}")
IO.puts("Content: #{response.content}")
IO.inspect(response.links, label: "Links")

Response Structure

%Ollixir.Web.FetchResponse{
  title: "The Elixir programming language",
  content: "Elixir is a dynamic, functional language...",
  links: ["https://...", "https://..."]
}
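The links field is a plain list of URL strings, so it composes with the standard library, e.g. keeping only links on the page's own host. A sketch using a plain map with the same fields as %Ollixir.Web.FetchResponse{}:

```elixir
# Stand-in for a fetch response; field access is the same on the real struct.
response = %{
  title: "The Elixir programming language",
  content: "...",
  links: [
    "https://elixir-lang.org/docs",
    "https://github.com/elixir-lang",
    "https://elixir-lang.org/blog"
  ]
}

# Keep only links whose host matches the site we fetched.
same_host =
  Enum.filter(response.links, fn link ->
    URI.parse(link).host == "elixir-lang.org"
  end)
```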

Web Tools for Agents

Use web capabilities as tools in agentic workflows.

See examples/mcp/mcp_server.exs for an MCP stdio server that exposes web_search and web_fetch to MCP clients (Cursor, Cline, Open WebUI, etc.).

# Get predefined web tool definitions
tools = Ollixir.Web.Tools.all()

{:ok, response} = Ollixir.chat(client,
  model: "llama3.2",
  messages: [
    %{role: "user", content: "Search for the latest Elixir news"}
  ],
  tools: tools
)

# Handle tool calls
case get_in(response, ["message", "tool_calls"]) do
  [%{"function" => %{"name" => "web_search", "arguments" => args}}] ->
    query = args["query"]
    {:ok, results} = Ollixir.web_search(client, query: query)
    # Continue conversation with results...

  _ ->
    response["message"]["content"]
end
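The elided "continue conversation" step typically means flattening the search results into text, appending them as a tool-role message, and calling chat again. A sketch of the formatting half, which is pure and testable; the exact message shape Ollixir expects for tool results is an assumption:

```elixir
defmodule ToolResults do
  @doc "Flatten search results into a plain-text block for a tool-role message."
  def to_text(results) do
    Enum.map_join(results, "\n", fn r -> "#{r.title} - #{r.url}" end)
  end
end

# Hypothetical continuation (not run here): append the tool output and re-chat.
#
#   tool_message = %{role: "tool", content: ToolResults.to_text(results.results)}
#   {:ok, final} = Ollixir.chat(client,
#     model: "llama3.2",
#     messages: original_messages ++ [response["message"], tool_message]
#   )
```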

Cloud Models

Cloud models run on Ollama's infrastructure while using your local client.

Setup

  1. Sign in (one-time):
ollama signin
  2. Pull a cloud model:
ollama pull gpt-oss:120b-cloud
  3. Use it like any local model:
client = Ollixir.init()

{:ok, response} = Ollixir.chat(client,
  model: "gpt-oss:120b-cloud",
  messages: [%{role: "user", content: "Explain quantum computing."}]
)

Available Cloud Models

| Model | Parameters | Context | Features |
|---|---|---|---|
| cogito-2.1:671b-cloud | 671B | 160K | General purpose, MIT license |
| deepseek-v3.1:671b-cloud | 671B | 160K | Thinking, tools, coding |
| deepseek-v3.2:cloud | - | 160K | Reasoning, agentic |
| devstral-2:123b-cloud | 123B | 256K | Agentic coding |
| devstral-small-2:24b-cloud | 24B | 256K | Agentic coding |
| gemini-3-flash-preview:cloud | - | 1M | Vision, thinking |
| glm-4.6:cloud | - | 198K | Coding, reasoning, agentic |
| glm-4.7:cloud | - | 198K | Coding, tool use |
| glm-5:cloud | 744B (40B active) | 198K | Reasoning, coding |
| gpt-oss:20b-cloud | 20B | 128K | Thinking levels |
| gpt-oss:120b-cloud | 120B | 128K | Thinking levels |
| kimi-k2:1t-cloud | 1T (32B active) | 256K | Agentic coding |
| kimi-k2-thinking:cloud | - | 256K | Reasoning, agentic |
| kimi-k2.5:cloud | - | 256K | Multimodal, thinking |
| minimax-m2:cloud | 230B (10B active) | 200K | Coding, agentic |
| minimax-m2.1:cloud | 10B active | 200K | Multilingual coding |
| minimax-m2.5:cloud | - | 198K | Coding, thinking, tools |
| ministral-3:3b-cloud | 3B | 256K | Edge, vision |
| ministral-3:8b-cloud | 8B | 256K | Edge, vision |
| ministral-3:14b-cloud | 14B | 256K | Edge, vision |
| nemotron-3-nano:30b-cloud | 30B (3.5B active) | 1M | Reasoning |
| qwen3-coder:480b-cloud | 480B | 256K | Code generation |
| qwen3-coder-next:cloud | 80B (3B active) | 256K | Agentic coding |
| qwen3-next:80b-cloud | 80B | 256K | General purpose |
| qwen3-vl:235b-cloud | 235B | 256K | Vision-language |
| rnj-1:8b-cloud | 8B | 32K | Code, STEM |

See https://ollama.com/search?c=cloud for the latest list.

Cloud Models with Streaming

{:ok, stream} = Ollixir.chat(client,
  model: "gpt-oss:120b-cloud",
  messages: [%{role: "user", content: "Write a haiku about Elixir."}],
  stream: true
)

stream
|> Stream.each(fn chunk ->
  IO.write(get_in(chunk, ["message", "content"]) || "")
end)
|> Stream.run()

Hosted API (ollama.com)

Use the Ollama-hosted API directly instead of a local server:

# Point client at hosted API
client = Ollixir.init("https://ollama.com")

{:ok, response} = Ollixir.chat(client,
  model: "gpt-oss:120b",  # Note: no :cloud suffix for hosted API
  messages: [%{role: "user", content: "Hello!"}]
)

List Available Models

curl -H "Authorization: Bearer $OLLAMA_API_KEY" https://ollama.com/api/tags

Custom Headers

Override the default authorization:

client = Ollixir.init("https://ollama.com",
  headers: [{"authorization", "Bearer your_api_key_here"}]
)

Cloud vs Local Comparison

| Feature | Local Ollama | Cloud Models | Hosted API |
|---|---|---|---|
| Server | Your machine | Your machine | ollama.com |
| Models | Downloaded | Streamed | Remote |
| API Key | Optional | Required (signin) | Required |
| Latency | Lowest | Medium | Highest |
| Model Size | Limited by hardware | Very large | Very large |

Error Handling

Missing API Key

case Ollixir.web_search(client, query: "test") do
  {:ok, response} ->
    response

  {:error, %Ollixir.ResponseError{status: 401}} ->
    IO.puts("Missing or invalid API key. Set OLLAMA_API_KEY.")

  {:error, %Ollixir.ResponseError{status: 403}} ->
    IO.puts("API key lacks permission for this operation.")

  {:error, error} ->
    IO.puts("Request failed: #{inspect(error)}")
end

Validate API Key

curl -H "Authorization: Bearer $OLLAMA_API_KEY" https://ollama.com/api/tags

Rate Limiting

Cloud APIs may have rate limits. Implement backoff:

defmodule CloudClient do
  def search_with_retry(client, query, retries \\ 3) do
    case Ollixir.web_search(client, query: query) do
      {:ok, response} ->
        {:ok, response}

      {:error, %Ollixir.ResponseError{status: 429}} when retries > 0 ->
        # Linear backoff: waits 1 s, 2 s, then 3 s between attempts
        Process.sleep(1000 * (4 - retries))
        search_with_retry(client, query, retries - 1)

      {:error, error} ->
        {:error, error}
    end
  end
end
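The linear delays above (1 s, 2 s, 3 s) can be swapped for exponential backoff; the delay schedule itself is pure and easy to test in isolation. A sketch, with the base delay and `Backoff` module name chosen for illustration:

```elixir
defmodule Backoff do
  @base_ms 500

  @doc "Exponential delay for a 0-based attempt number: 500, 1000, 2000, 4000, ..."
  def delay_ms(attempt) when is_integer(attempt) and attempt >= 0 do
    @base_ms * Integer.pow(2, attempt)
  end
end

# In the retry clause, replace the fixed sleep with:
#   Process.sleep(Backoff.delay_ms(3 - retries))
```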

Environment Variables

| Variable | Description |
|---|---|
| OLLAMA_HOST | Default server URL |
| OLLAMA_API_KEY | Bearer token for cloud features |

Testing Cloud Features

Cloud tests are tagged and excluded by default:

# Run cloud tests after setting API key
mix test --include cloud_api
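A cloud-gated test might look like the following sketch. It assumes the :cloud_api tag from the command above and a valid OLLAMA_API_KEY in the environment, and it is excluded unless explicitly included:

```elixir
defmodule CloudSearchTest do
  use ExUnit.Case, async: false

  # Excluded by default; run with: mix test --include cloud_api
  @moduletag :cloud_api

  test "web_search returns at least one result" do
    client = Ollixir.init()
    assert {:ok, %{results: [_ | _]}} = Ollixir.web_search(client, query: "Elixir")
  end
end
```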

See Also