Ollama provides cloud capabilities including web search, web fetch, cloud models, and a hosted API. This guide covers all cloud features.
## API Key Setup

Cloud features require an Ollama API key:

- Create an account at https://ollama.com
- Generate a key at https://ollama.com/settings/keys
- Export the key:

```bash
export OLLAMA_API_KEY="your_api_key_here"
```

The client automatically uses `OLLAMA_API_KEY` when set.
## Web Search

Search the web and get structured results:

```elixir
{:ok, response} = Ollixir.web_search(client, query: "Elixir programming language")

for result <- response.results do
  IO.puts("#{result.title}")
  IO.puts("  #{result.url}")
  IO.puts("  #{result.content}")
end
```

### Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| max_results | integer | 3 | Maximum results to return |
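For example, to request more than the default three results using the `max_results` parameter listed above:

```elixir
# Request up to five results instead of the default three
{:ok, response} =
  Ollixir.web_search(client,
    query: "Elixir programming language",
    max_results: 5
  )
```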
### Response Structure

```elixir
%Ollixir.Web.SearchResponse{
  results: [
    %Ollixir.Web.SearchResult{
      title: "Elixir Programming Language",
      url: "https://elixir-lang.org",
      content: "Elixir is a dynamic, functional language..."
    }
  ]
}
```

## Web Fetch
Fetch and extract content from a URL:

```elixir
{:ok, response} = Ollixir.web_fetch(client, url: "https://elixir-lang.org")

IO.puts("Title: #{response.title}")
IO.puts("Content: #{response.content}")
IO.inspect(response.links, label: "Links")
```

### Response Structure
```elixir
%Ollixir.Web.FetchResponse{
  title: "The Elixir programming language",
  content: "Elixir is a dynamic, functional language...",
  links: ["https://...", "https://..."]
}
```

## Web Tools for Agents
Use web capabilities as tools in agentic workflows. See
`examples/mcp/mcp_server.exs` for an MCP stdio server that exposes
`web_search` and `web_fetch` to MCP clients (Cursor, Cline, Open WebUI, etc.).
```elixir
# Get predefined web tool definitions
tools = Ollixir.Web.Tools.all()

{:ok, response} = Ollixir.chat(client,
  model: "llama3.2",
  messages: [
    %{role: "user", content: "Search for the latest Elixir news"}
  ],
  tools: tools
)

# Handle tool calls
case get_in(response, ["message", "tool_calls"]) do
  [%{"function" => %{"name" => "web_search", "arguments" => args}}] ->
    query = args["query"]
    {:ok, results} = Ollixir.web_search(client, query: query)
    # Continue conversation with results...
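    # Hypothetical continuation (the message shape is assumed, not
    # taken from the library docs): return the search results to the
    # model as a "tool" message so it can answer from them.
    {:ok, final} = Ollixir.chat(client,
      model: "llama3.2",
      messages: [
        %{role: "user", content: "Search for the latest Elixir news"},
        response["message"],
        %{role: "tool", content: inspect(results.results)}
      ]
    )
    final["message"]["content"]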
  _ ->
    response["message"]["content"]
end
```

## Cloud Models
Cloud models run on Ollama's infrastructure while using your local client.
### Setup

- Sign in (one-time):

```bash
ollama signin
```

- Pull a cloud model:

```bash
ollama pull gpt-oss:120b-cloud
```

- Use it like any local model:

```elixir
client = Ollixir.init()

{:ok, response} = Ollixir.chat(client,
  model: "gpt-oss:120b-cloud",
  messages: [%{role: "user", content: "Explain quantum computing."}]
)
```

### Available Cloud Models
| Model | Parameters | Context | Features |
|---|---|---|---|
| cogito-2.1:671b-cloud | 671B | 160K | General purpose, MIT license |
| deepseek-v3.1:671b-cloud | 671B | 160K | Thinking, tools, coding |
| deepseek-v3.2:cloud | - | 160K | Reasoning, agentic |
| devstral-2:123b-cloud | 123B | 256K | Agentic coding |
| devstral-small-2:24b-cloud | 24B | 256K | Agentic coding |
| gemini-3-flash-preview:cloud | - | 1M | Vision, thinking |
| glm-4.6:cloud | - | 198K | Coding, reasoning, agentic |
| glm-4.7:cloud | - | 198K | Coding, tool use |
| glm-5:cloud | 744B (40B active) | 198K | Reasoning, coding |
| gpt-oss:20b-cloud | 20B | 128K | Thinking levels |
| gpt-oss:120b-cloud | 120B | 128K | Thinking levels |
| kimi-k2:1t-cloud | 1T (32B active) | 256K | Agentic coding |
| kimi-k2-thinking:cloud | - | 256K | Reasoning, agentic |
| kimi-k2.5:cloud | - | 256K | Multimodal, thinking |
| minimax-m2:cloud | 230B (10B active) | 200K | Coding, agentic |
| minimax-m2.1:cloud | 10B active | 200K | Multilingual coding |
| minimax-m2.5:cloud | - | 198K | Coding, thinking, tools |
| ministral-3:3b-cloud | 3B | 256K | Edge, vision |
| ministral-3:8b-cloud | 8B | 256K | Edge, vision |
| ministral-3:14b-cloud | 14B | 256K | Edge, vision |
| nemotron-3-nano:30b-cloud | 30B (3.5B active) | 1M | Reasoning |
| qwen3-coder:480b-cloud | 480B | 256K | Code generation |
| qwen3-coder-next:cloud | 80B (3B active) | 256K | Agentic coding |
| qwen3-next:80b-cloud | 80B | 256K | General purpose |
| qwen3-vl:235b-cloud | 235B | 256K | Vision-language |
| rnj-1:8b-cloud | 8B | 32K | Code, STEM |
See https://ollama.com/search?c=cloud for the latest list.
### Cloud Models with Streaming
```elixir
{:ok, stream} = Ollixir.chat(client,
  model: "gpt-oss:120b-cloud",
  messages: [%{role: "user", content: "Write a haiku about Elixir."}],
  stream: true
)

stream
|> Stream.each(fn chunk ->
  IO.write(get_in(chunk, ["message", "content"]) || "")
end)
|> Stream.run()
```

## Hosted API (ollama.com)
Use the Ollama-hosted API directly instead of a local server:

```elixir
# Point client at hosted API
client = Ollixir.init("https://ollama.com")

{:ok, response} = Ollixir.chat(client,
  model: "gpt-oss:120b", # Note: no :cloud suffix for hosted API
  messages: [%{role: "user", content: "Hello!"}]
)
```

### List Available Models
```bash
curl -H "Authorization: Bearer $OLLAMA_API_KEY" https://ollama.com/api/tags
```
### Custom Headers

Override the default authorization:

```elixir
client = Ollixir.init("https://ollama.com",
  headers: [{"authorization", "Bearer your_api_key_here"}]
)
```

## Cloud vs Local Comparison
| Feature | Local Ollama | Cloud Models | Hosted API |
|---|---|---|---|
| Server | Your machine | Your machine | ollama.com |
| Models | Downloaded | Streamed | Remote |
| API Key | Optional | Required (signin) | Required |
| Latency | Lowest | Medium | Highest |
| Model Size | Limited by hardware | Very large | Very large |
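The three modes differ mainly in client setup; a quick sketch, assuming `Ollixir.init/1` takes a base URL as in the hosted-API example above:

```elixir
# Local server and cloud models both talk to your local Ollama;
# cloud models are selected purely by the -cloud model name suffix.
local_client = Ollixir.init()

# The hosted API talks directly to ollama.com (OLLAMA_API_KEY required)
hosted_client = Ollixir.init("https://ollama.com")
```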
## Error Handling

### Missing API Key
```elixir
case Ollixir.web_search(client, query: "test") do
  {:ok, response} ->
    response

  {:error, %Ollixir.ResponseError{status: 401}} ->
    IO.puts("Missing or invalid API key. Set OLLAMA_API_KEY.")

  {:error, %Ollixir.ResponseError{status: 403}} ->
    IO.puts("API key lacks permission for this operation.")

  {:error, error} ->
    IO.puts("Request failed: #{inspect(error)}")
end
```

### Validate API Key
```bash
curl -H "Authorization: Bearer $OLLAMA_API_KEY" https://ollama.com/api/tags
```
### Rate Limiting

Cloud APIs may enforce rate limits. Retry HTTP 429 responses with an increasing backoff:
```elixir
defmodule CloudClient do
  # Retries on HTTP 429 with an increasing delay: 1s, then 2s, then 3s
  def search_with_retry(client, query, retries \\ 3) do
    case Ollixir.web_search(client, query: query) do
      {:ok, response} ->
        {:ok, response}

      {:error, %Ollixir.ResponseError{status: 429}} when retries > 0 ->
        Process.sleep(1000 * (4 - retries))
        search_with_retry(client, query, retries - 1)

      {:error, error} ->
        {:error, error}
    end
  end
end
```

## Environment Variables
| Variable | Description |
|---|---|
| OLLAMA_HOST | Default server URL |
| OLLAMA_API_KEY | Bearer token for cloud features |
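For example, a client that honors `OLLAMA_HOST` explicitly (a sketch; the fallback URL `http://localhost:11434` is Ollama's conventional local default, not taken from this library's docs):

```elixir
# Fall back to the standard local server when OLLAMA_HOST is unset
host = System.get_env("OLLAMA_HOST", "http://localhost:11434")
client = Ollixir.init(host)
```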
## Testing Cloud Features

Cloud tests are tagged and excluded by default:

```bash
# Run cloud tests after setting API key
mix test --include cloud_api
```
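A cloud-gated test might look like this (hypothetical test module; the tag name matches the `--include cloud_api` flag above):

```elixir
defmodule CloudSearchTest do
  use ExUnit.Case, async: false

  # Excluded by default; run with: mix test --include cloud_api
  @moduletag :cloud_api

  test "web search returns at least one result" do
    client = Ollixir.init()
    assert {:ok, %{results: [_ | _]}} = Ollixir.web_search(client, query: "Elixir")
  end
end
```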
## See Also
- Getting Started - Basic setup
- Tools Guide - Using web tools in agents
- Ollama Server Setup - Local and cloud setup