Adding a New Provider
TL;DR
- Implement a provider module under lib/req_llm/providers/, use ReqLLM.Provider.DSL + ReqLLM.Provider.Defaults, and only override what the API actually deviates on.
- The default provider implementation is OpenAI-compatible.
- Non-streaming requests run through Req with attach/3 + encode_body/1 + decode_response/1; streaming runs through Finch with attach_stream/4 + decode_stream_event/2 or /3.
- Add models via priv/models_local/, run mix req_llm.model_sync, then add tests using the three-tier strategy and record fixtures with LIVE=true.
Overview and Prerequisites
What it means to add a provider
Adding a provider means implementing a single Elixir module that:
- Translates between canonical types (Model, Context, Message, ContentPart, Tool) and the provider's HTTP API
- Implements the ReqLLM.Provider behavior via the DSL and default callbacks
- Provides SSE-to-StreamChunk decoding for streaming when applicable
Required knowledge and setup
You should know:
- Provider's API paths, request/response JSON, auth, and streaming protocol
- Req basics (request/response steps) and Finch for streaming
- ReqLLM canonical types (see Data Structures) and normalization principles (Core Concepts)
Before coding
- Confirm provider supports needed capabilities (chat, tools, images, streaming)
- Gather API key/env var name and any extra headers or versions
- Start with the OpenAI-compatible defaults if at all possible
Provider Module Structure
File location
Create lib/req_llm/providers/<provider>.ex
Using the DSL
Use the DSL to register:
- id (atom) - Provider identifier
- base_url - Default API endpoint
- metadata - Path to metadata file (priv/models_dev/<provider>.json)
- default_env_key - Fallback environment variable for API key
- provider_schema - Provider-only options
Implementing the behavior
Required vs optional callbacks:
Required for non-streaming:
- prepare_request/4 - Configure operation-specific requests
- attach/3 - Set up authentication and Req pipeline steps
- encode_body/1 - Transform context to provider JSON
- decode_response/1 - Parse API responses
Streaming (recommended):
- attach_stream/4 - Build the complete Finch streaming request
- decode_stream_event/2 or /3 - Decode provider SSE events to StreamChunk structs
Optional:
- extract_usage/2 - Extract usage/cost data
- translate_options/3 - Provider-specific parameter translation
- normalize_model_id/1 - Handle model ID aliases
- parse_stream_protocol/2 - Custom streaming protocol handling
- init_stream_state/1 - Initialize stateful streaming
- flush_stream_state/2 - Flush accumulated stream state
Using Defaults
Prefer use ReqLLM.Provider.Defaults to get robust OpenAI-style defaults and override only when needed.
Core Implementation
Minimal OpenAI-compatible provider
This example shows a provider that reuses defaults and only adds custom headers:
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end
What you get for free:
- Non-streaming: Req pipeline with Bearer auth, JSON encode/decode in OpenAI shape
- Streaming: Finch request builder with OpenAI-compatible body and SSE decoding
- Usage extraction from response body
- Error handling and retry logic
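With just this override in place, callers go through the normal high-level API. A quick usage sketch (assuming ACME_API_KEY is exported and the acme-chat-mini model has been synced as described later in this guide):
{:ok, response} = ReqLLM.generate_text("acme:acme-chat-mini", "Hello", temperature: 0)
ReqLLM.Response.text(response)
{:ok, stream} = ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3")
ReqLLM.StreamResponse.tokens(stream) |> Enum.take(3)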
Non-OpenAI wire-format provider
This example shows custom encoding/decoding for a provider with different JSON schema:
defmodule ReqLLM.Providers.Zephyr do
@moduledoc "Zephyr – custom JSON schema, SSE streaming."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :zephyr,
base_url: "https://api.zephyr.ai",
metadata: "priv/models_dev/zephyr.json",
default_env_key: "ZEPHYR_API_KEY",
provider_schema: [
version: [type: :string, default: "2024-10-01"],
tenant: [type: :string]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = ReqLLM.Provider.Defaults.default_attach(__MODULE__, request, model_input, user_opts)
request
|> Req.Request.put_header("x-zephyr-version", user_opts[:version] || "2024-10-01")
|> then(fn req ->
case user_opts[:tenant] do
nil -> req
t -> Req.Request.put_header(req, "x-zephyr-tenant", t)
end
end)
end
@impl ReqLLM.Provider
def encode_body(%Req.Request{} = request) do
context = request.options[:context]
model = request.options[:model]
stream = request.options[:stream] == true
tools = request.options[:tools] || []
provider_opts = request.options[:provider_options] || []
messages =
Enum.map(context.messages, fn m ->
%{
role: Atom.to_string(m.role),
parts: Enum.map(m.content, &encode_part/1)
}
end)
body =
%{
model: model,
messages: messages,
stream: stream
}
|> maybe_put(:temperature, request.options[:temperature])
|> maybe_put(:max_output_tokens, request.options[:max_tokens])
|> maybe_put(:tools, encode_tools(tools))
|> Map.merge(Map.new(provider_opts))
encoded = Jason.encode!(body)
request
|> Req.Request.put_header("content-type", "application/json")
|> Map.put(:body, encoded)
end
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "Zephyr API error",
status: status,
response_body: resp.body
)}
end
end
@impl ReqLLM.Provider
def attach_stream(model, context, opts, _finch_name) do
api_key = ReqLLM.Keys.get!(model, opts)
url = Keyword.get(opts, :base_url, default_base_url()) <> "/chat:stream"
headers = [
{"authorization", "Bearer " <> api_key},
{"content-type", "application/json"},
{"accept", "text/event-stream"}
]
req = %Req.Request{
options: %{
model: model.model,
context: context,
stream: true,
provider_options: opts[:provider_options] || []
}
}
body = encode_body(req).body
{:ok, Finch.build(:post, url, headers, body)}
end
@impl ReqLLM.Provider
def decode_stream_event(%{data: data}, model) do
case Jason.decode(data) do
{:ok, %{"type" => "delta", "text" => text}} when is_binary(text) and text != "" ->
[ReqLLM.StreamChunk.text(text)]
{:ok, %{"type" => "reasoning", "text" => think}} when is_binary(think) and think != "" ->
[ReqLLM.StreamChunk.thinking(think)]
{:ok, %{"type" => "tool_call", "name" => name, "arguments" => args}} ->
[ReqLLM.StreamChunk.tool_call(name, Map.new(args))]
{:ok, %{"type" => "usage", "usage" => usage}} ->
[ReqLLM.StreamChunk.meta(%{usage: normalize_usage(usage), model: model.model})]
{:ok, %{"type" => "done", "finish_reason" => reason}} ->
[ReqLLM.StreamChunk.meta(%{
finish_reason: normalize_finish_reason(reason),
terminal?: true
})]
_ ->
[]
end
end
@impl ReqLLM.Provider
def extract_usage(body, _model) when is_map(body) do
case body do
%{"usage" => u} -> {:ok, normalize_usage(u)}
_ -> {:error, :no_usage}
end
end
@impl ReqLLM.Provider
def translate_options(:chat, _model, opts) do
  translated =
    case Keyword.pop(opts, :max_tokens) do
      {nil, rest} -> rest
      {max_tokens, rest} -> Keyword.put(rest, :max_output_tokens, max_tokens)
    end

  {Keyword.drop(translated, [:presence_penalty]), []}
end
# Helper functions
defp encode_part(%ReqLLM.Message.ContentPart{type: :text, text: t}),
do: %{"type" => "text", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image_url, url: url}),
do: %{"type" => "image_url", "url" => url}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image, data: bin, media_type: mt}),
do: %{"type" => "image", "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :file, data: bin, media_type: mt, name: name}),
do: %{"type" => "file", "name" => name, "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :thinking, text: t}),
do: %{"type" => "thinking", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: a}),
do: %{"type" => "tool_call", "name" => n, "arguments" => a}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: a}),
do: %{"type" => "tool_result", "name" => n, "result" => a}
defp decode_chat_response(body, req) do
with %{"message" => %{"role" => role, "content" => content}} <- body,
{:ok, message} <- to_message(role, content) do
{:ok,
%ReqLLM.Response{
id: body["id"] || "zephyr_" <> Integer.to_string(System.unique_integer([:positive])),
model: req.options[:model],
context: req.options[:context] || ReqLLM.Context.new([]),
message: message,
usage: normalize_usage(body["usage"] || %{}),
stream?: false
}}
else
_ -> {:error, :unexpected_body}
end
end
defp to_message(role, parts) do
content_parts =
Enum.flat_map(parts, fn
%{"type" => "text", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :text, text: t}]
%{"type" => "thinking", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :thinking, text: t}]
%{"type" => "tool_call", "name" => n, "arguments" => a} ->
[%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: Map.new(a)}]
%{"type" => "tool_result", "name" => n, "result" => r} ->
[%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: Map.new(r)}]
_ -> []
end)
{:ok, %ReqLLM.Message{role: String.to_existing_atom(role), content: content_parts}}
end
defp encode_tools([]), do: nil
defp encode_tools(tools) do
Enum.map(tools, &ReqLLM.Tool.to_schema(&1, :openai))
end
defp maybe_put(map, _k, nil), do: map
defp maybe_put(map, k, v), do: Map.put(map, k, v)
defp ensure_parsed_body(body) when is_binary(body), do: Jason.decode!(body)
defp ensure_parsed_body(body), do: body
defp normalize_usage(%{"prompt" => i, "completion" => o}),
do: %{input_tokens: i, output_tokens: o, total_tokens: (i || 0) + (o || 0)}
defp normalize_usage(%{"input_tokens" => i, "output_tokens" => o, "total_tokens" => t}),
do: %{input_tokens: i || 0, output_tokens: o || 0, total_tokens: t || (i || 0) + (o || 0)}
defp normalize_usage(_),
do: %{input_tokens: 0, output_tokens: 0, total_tokens: 0}
defp normalize_finish_reason("stop"), do: :stop
defp normalize_finish_reason("length"), do: :length
defp normalize_finish_reason("tool"), do: :tool_calls
defp normalize_finish_reason(_), do: :error
end
Working with Canonical Data Structures
Input: Context to Provider JSON
Always convert ReqLLM.Context (list of Messages with ContentParts) to provider JSON.
Message structure:
- role is :system | :user | :assistant | :tool
- content is a list of ContentPart
ContentPart variants to handle:
- text("...") - Plain text content
- image_url("...") - Image from URL
- image(binary, mime) - Base64-encoded image
- file(binary, name, mime) - File attachment
- thinking("...") - Reasoning tokens (for models that expose them)
- tool_call(name, map) - Function call request
- tool_result(tool_call_id_or_name, map) - Function call result
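For example, a multimodal user turn is built from these constructors before encode_body/1 maps each part to provider JSON (a sketch using the constructor names listed above; exact arities may differ):
alias ReqLLM.Message.ContentPart

context =
  ReqLLM.Context.new([
    ReqLLM.Context.user([
      ContentPart.text("What is in this image?"),
      ContentPart.image_url("https://example.com/cat.png")
    ])
  ])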
Output: Provider JSON to Response
Non-streaming:
Decode provider JSON into a single assistant ReqLLM.Message with canonical ContentParts and fill ReqLLM.Response:
- Response.message is the assistant message
- Response.usage is normalized when available
- For object generation, preserve tool_call/tool_result or JSON content so ReqLLM.Response.object/1 works consistently
Streaming (SSE):
Map each provider event into one or more ReqLLM.StreamChunk:
- :content - Text tokens
- :thinking - Reasoning tokens
- :tool_call - Function name + arguments (may arrive in fragments)
- :meta - Usage deltas, finish_reason, and terminal?: true on completion
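These kinds correspond directly to the StreamChunk constructors used in the Zephyr example above; the values here are purely illustrative:
ReqLLM.StreamChunk.text("Hello")
ReqLLM.StreamChunk.thinking("Weighing the options...")
ReqLLM.StreamChunk.tool_call("get_weather", %{"city" => "Paris"})
ReqLLM.StreamChunk.meta(%{
  usage: %{input_tokens: 12, output_tokens: 5, total_tokens: 17},
  finish_reason: :stop,
  terminal?: true
})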
Normalization principle
One conversation model, one streaming shape, one response shape: Never leak provider specifics to callers; normalize at the adapter boundary.
Model Metadata Integration
Add local patch
Create priv/models_local/<provider>.json to seed/supplement models before syncing:
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}
Sync registry
Run:
mix req_llm.model_sync
This generates priv/models_dev/acme.json and updates ValidProviders.
Benefits
The registry enables:
- Validation with mix mc
- Model lookup by "acme:acme-chat-mini"
- Capability gating in tests
Testing Strategy
ReqLLM uses a three-tier testing architecture:
1. Core package tests (no API calls)
Under test/req_llm/ for core types/helpers.
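A minimal core test touches only the canonical structs; the module name below is illustrative:
defmodule ReqLLM.ContextTest do
  use ExUnit.Case, async: true
  alias ReqLLM.Message.ContentPart

  test "wraps a user message in a context" do
    ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
    assert [%ReqLLM.Message{role: :user}] = ctx.messages
  end
end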
2. Provider-specific tests (no API calls)
Under test/providers/, unit-testing your encoding/decoding and options behavior with small bodies.
Example:
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.Model.from("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end
3. Live API coverage tests
Under test/coverage/ using the fixture system for integration against the high-level API.
Example:
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end
Recording fixtures
# Record fixtures during live test runs
LIVE=true mix test --only provider:acme
# Or use model compatibility tool
mix mc "acme:*" --record
Validate coverage
# Quick validation
mix mc
# Sample models during development
mix mc --sample
Authentication
Use ReqLLM.Keys
Always use ReqLLM.Keys for key retrieval. Never read System.get_env/1 directly.
api_key = ReqLLM.Keys.get!(model, opts)
Configuration
The DSL's default_env_key is the fallback env var name. ReqLLM.Keys also supports:
- Application config
- Per-call override via opts[:api_key]
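For example, a per-call key (handy in tests or multi-tenant setups) is expected to take precedence over the ACME_API_KEY fallback; the key value below is a placeholder:
ReqLLM.generate_text("acme:acme-chat-mini", "Hello", api_key: "sk-test-placeholder")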
Adding authentication
Attach Bearer header in attach/3 or use Defaults (already sets authorization):
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
api_key = ReqLLM.Keys.get!(model_input, user_opts)
request
|> Req.Request.put_header("authorization", "Bearer #{api_key}")
|> Req.Request.put_header("content-type", "application/json")
end
Error Handling
Use Splode error types
- ReqLLM.Error.Auth - Missing/invalid API keys
- ReqLLM.Error.API.Request - HTTP request issues
- ReqLLM.Error.API.Response - HTTP response errors
- ReqLLM.Error.Parse - JSON/body shape issues
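Callers can then match on these modules uniformly across providers. A sketch, assuming the high-level API surfaces the Splode exception inside the error tuple:
case ReqLLM.generate_text("acme:acme-chat-mini", "Hello") do
  {:ok, response} ->
    {:ok, ReqLLM.Response.text(response)}

  {:error, %ReqLLM.Error.API.Response{} = error} ->
    # Non-200 reply surfaced by decode_response/1 (see the example below)
    {:error, error}

  {:error, other} ->
    {:error, other}
end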
Example
In decode_response/1, return {req, exception} for non-200 or malformed payloads:
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "API error",
status: status,
response_body: resp.body
)}
end
end
The pipeline will propagate errors consistently to callers.
Step-by-Step Example
Let's add a fictional provider called "Acme" from start to finish.
1. Create provider module
File: lib/req_llm/providers/acme.ex
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end
2. Add model metadata
File: priv/models_local/acme.json
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}
3. Sync registry
mix req_llm.model_sync
4. Quick smoke test
export ACME_API_KEY=sk-...
mix req_llm.gen "Hello" --model acme:acme-chat-mini
5. Provider unit tests
File: test/providers/acme_test.exs
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.Model.from("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end
6. Coverage tests with fixtures
File: test/coverage/acme_chat_test.exs
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end
7. Record fixtures
# Option 1: During test run
LIVE=true mix test --only provider:acme
# Option 2: Using model compat tool
mix mc "acme:*" --record
8. Validate models
# Validate Acme models
mix req_llm.model_compat acme
# List all registered providers/models
mix mc --available
Best Practices
Simplicity-first and normalization
- Prefer using ReqLLM.Provider.Defaults. Only override what the provider truly deviates on
- Keep prepare_request/4 a thin dispatcher; centralize option prep in attach/3 and the defaults pipeline
Code style (from AGENTS.md)
- No comments inside function bodies. Use clear naming and module docs
- Prefer pattern matching to conditionals
- Use {:ok, result} | {:error, reason} tuples for fallible helpers
Options translation
- Use translate_options/3 to rename/drop provider-specific params (e.g., max_tokens → max_output_tokens)
Tools and multimodal
- Always map tools via ReqLLM.Tool.to_schema/2
- Respect ContentPart variants for images/files. Base64-encode if the provider requires it
Streaming
- Build the Finch request in attach_stream/4
- Decode events to StreamChunk in decode_stream_event/2 or /3
- Emit a terminal meta chunk with finish_reason and usage if provided
Testing incrementally
- Start with non-streaming happy path, then add streaming and tools
- Record minimal, deterministic fixtures (temperature: 0)
Advanced Topics
When to consider the advanced path
- Provider uses non-SSE streaming (binary protocol) or chunked JSON requiring stateful accumulation
- Models with unique parameter semantics that demand translate_options/3 and capability gating
- Complex multimodal tool invocation requiring custom mapping of multi-part tool args/results
Advanced implementations
- Implement parse_stream_protocol/2 for custom binary protocols (e.g., AWS Event Stream)
- Implement init_stream_state/1, decode_stream_event/3, and flush_stream_state/2 to accumulate partial tool_call args or demultiplex multi-channel events (see the sketch below)
- Implement normalize_model_id/1 for regional aliases and translate_options/3 with warning aggregation
- Provide provider-specific usage accounting that merges multi-phase usage deltas
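As an illustration only, accumulating streamed tool-call fragments with stateful callbacks could look roughly like the sketch below. The argument order and return shapes of these callbacks are assumptions; consult the ReqLLM.Provider behaviour for the authoritative contracts:
@impl ReqLLM.Provider
def init_stream_state(_model), do: %{tool_args: %{}}

@impl ReqLLM.Provider
def decode_stream_event(%{data: data}, state, _model) do
  case Jason.decode(data) do
    {:ok, %{"type" => "tool_call_delta", "name" => name, "fragment" => fragment}} ->
      {[], update_in(state.tool_args[name], &((&1 || "") <> fragment))}

    _ ->
      {[], state}
  end
end

@impl ReqLLM.Provider
def flush_stream_state(state, _model) do
  chunks =
    Enum.map(state.tool_args, fn {name, json} ->
      ReqLLM.StreamChunk.tool_call(name, Jason.decode!(json))
    end)

  {chunks, %{state | tool_args: %{}}}
end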
Callback Reference
What to implement and when
prepare_request/4
- Build Req for the operation
- Defaults cover :chat, :object, and :embedding
attach/3
- Set headers, auth, and pipeline steps
- Defaults add Bearer, retry, error, usage, fixture steps
encode_body/1
- Transform options/context to provider JSON
- Defaults are OpenAI-compatible; override for custom wire formats
decode_response/1
- Map provider body to Response or error
- Defaults map OpenAI-style bodies; override if your shape differs
attach_stream/4
- Must return {:ok, Finch.Request.t()}
- Defaults build OpenAI-compatible streaming requests; override for custom endpoints/headers
decode_stream_event/2 or /3
- Map provider events to StreamChunk
- Defaults handle OpenAI-compatible deltas
extract_usage/2
- Normalize usage tokens/cost if provider deviates from standard usage shape
translate_options/3
- Rename/drop options per model or operation
Summary
Adding a provider to ReqLLM involves:
- Creating a provider module with the DSL and behavior implementation
- Implementing encoding/decoding for the provider's wire format
- Adding model metadata and syncing the registry
- Writing tests at all three tiers (core, provider, coverage)
- Recording fixtures for validation
By following these guidelines and leveraging the defaults, you can add robust, well-tested provider support that maintains ReqLLM's normalization principles across all AI interactions.