Adding a New Provider


TL;DR

  • Implement a provider module under lib/req_llm/providers/, use ReqLLM.Provider.DSL + Defaults, and only override what the API actually deviates on.
  • The default implementation (ReqLLM.Provider.Defaults) is OpenAI-compatible.
  • Non-streaming requests run through Req with attach/3 + encode_body/1 + decode_response/1; streaming runs through Finch with attach_stream/4 + decode_stream_event/2 or /3.
  • Add models via priv/models_local/, run mix req_llm.model_sync, then add tests using the three-tier strategy and record fixtures with LIVE=true.

Overview and Prerequisites

What it means to add a provider

Adding a provider means implementing a single Elixir module that:

  • Translates between canonical types (Model, Context, Message, ContentPart, Tool) and the provider HTTP API
  • Implements the ReqLLM.Provider behavior via the DSL and default callbacks
  • Provides SSE-to-StreamChunk decoding for streaming when applicable

Required knowledge and setup

You should know:

  • The provider's API paths, request/response JSON shapes, auth scheme, and streaming protocol
  • Req basics (request/response steps) and Finch for streaming
  • ReqLLM canonical types (see Data Structures) and normalization principles (Core Concepts)

Before coding

  • Confirm provider supports needed capabilities (chat, tools, images, streaming)
  • Gather API key/env var name and any extra headers or versions
  • Start with the OpenAI-compatible defaults if at all possible

Provider Module Structure

File location

Create lib/req_llm/providers/<provider>.ex

Using the DSL

Use the DSL to register:

  • id (atom) - Provider identifier
  • base_url - Default API endpoint
  • metadata - Path to metadata file (priv/models_dev/<provider>.json)
  • default_env_key - Fallback environment variable for API key
  • provider_schema - Provider-only options

Implementing the behavior

Required vs optional callbacks:

Required for non-streaming:

  • prepare_request/4 - Configure operation-specific requests (see the sketch after these lists)
  • attach/3 - Set up authentication and Req pipeline steps
  • encode_body/1 - Transform context to provider JSON
  • decode_response/1 - Parse API responses

Streaming (recommended):

  • attach_stream/4 - Build complete Finch streaming request
  • decode_stream_event/2 or /3 - Decode provider SSE events to StreamChunk structs

Optional:

  • extract_usage/2 - Extract usage/cost data
  • translate_options/3 - Provider-specific parameter translation
  • normalize_model_id/1 - Handle model ID aliases
  • parse_stream_protocol/2 - Custom streaming protocol handling
  • init_stream_state/1 - Initialize stateful streaming
  • flush_stream_state/2 - Flush accumulated stream state
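
Most providers only need real work in encode_body/1 and decode_response/1 (plus the streaming pair); prepare_request/4 can usually stay with the defaults. As a hypothetical sketch of an override, assuming ReqLLM.Provider.Defaults marks prepare_request/4 overridable (as it does attach/3 in the examples below) and passes the operation atom first -- check the ReqLLM.Provider behaviour for the exact argument order:

# Hypothetical override: seed a provider-specific default for :chat,
# then delegate to the OpenAI-style default implementation via super/4.
@impl ReqLLM.Provider
def prepare_request(:chat, model, prompt, opts) do
  super(:chat, model, prompt, Keyword.put_new(opts, :max_tokens, 1024))
end

def prepare_request(operation, model, prompt, opts) do
  super(operation, model, prompt, opts)
end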

Using Defaults

Prefer use ReqLLM.Provider.Defaults to get robust OpenAI-style defaults and override only when needed.

Core Implementation

Minimal OpenAI-compatible provider

This example shows a provider that reuses defaults and only adds custom headers:

defmodule ReqLLM.Providers.Acme do
  @moduledoc "Acme – OpenAI-compatible chat API."

  @behaviour ReqLLM.Provider

  use ReqLLM.Provider.DSL,
    id: :acme,
    base_url: "https://api.acme.ai/v1",
    metadata: "priv/models_dev/acme.json",
    default_env_key: "ACME_API_KEY",
    provider_schema: [
      organization: [type: :string, doc: "Tenant/Org header"]
    ]

  use ReqLLM.Provider.Defaults

  @impl ReqLLM.Provider
  def attach(request, model_input, user_opts) do
    request = super(request, model_input, user_opts)
    org = user_opts[:organization]
    
    case org do
      nil -> request
      _ -> Req.Request.put_header(request, "x-acme-organization", org)
    end
  end
end

What you get for free:

  • Non-streaming: Req pipeline with Bearer auth, JSON encode/decode in OpenAI shape
  • Streaming: Finch request builder with OpenAI-compatible body and SSE decoding
  • Usage extraction from response body
  • Error handling and retry logic
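
With just the module above (plus the model metadata added later in this guide), the high-level API works end to end; a quick sanity check might look like:

{:ok, response} = ReqLLM.generate_text("acme:acme-chat-mini", "Hello!", temperature: 0)
text = ReqLLM.Response.text(response)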

Non-OpenAI wire-format provider

This example shows custom encoding/decoding for a provider whose JSON wire format differs from the OpenAI shape:

defmodule ReqLLM.Providers.Zephyr do
  @moduledoc "Zephyr – custom JSON schema, SSE streaming."

  @behaviour ReqLLM.Provider

  use ReqLLM.Provider.DSL,
    id: :zephyr,
    base_url: "https://api.zephyr.ai",
    metadata: "priv/models_dev/zephyr.json",
    default_env_key: "ZEPHYR_API_KEY",
    provider_schema: [
      version: [type: :string, default: "2024-10-01"],
      tenant: [type: :string]
    ]

  use ReqLLM.Provider.Defaults

  @impl ReqLLM.Provider
  def attach(request, model_input, user_opts) do
    request = ReqLLM.Provider.Defaults.default_attach(__MODULE__, request, model_input, user_opts)
    
    request
    |> Req.Request.put_header("x-zephyr-version", user_opts[:version] || "2024-10-01")
    |> then(fn req ->
      case user_opts[:tenant] do
        nil -> req
        t -> Req.Request.put_header(req, "x-zephyr-tenant", t)
      end
    end)
  end

  @impl ReqLLM.Provider
  def encode_body(%Req.Request{} = request) do
    context = request.options[:context]
    model = request.options[:model]
    stream = request.options[:stream] == true
    tools = request.options[:tools] || []
    provider_opts = request.options[:provider_options] || []

    messages =
      Enum.map(context.messages, fn m ->
        %{
          role: Atom.to_string(m.role),
          parts: Enum.map(m.content, &encode_part/1)
        }
      end)

    body =
      %{
        model: model,
        messages: messages,
        stream: stream
      }
      |> maybe_put(:temperature, request.options[:temperature])
      |> maybe_put(:max_output_tokens, request.options[:max_tokens])
      |> maybe_put(:tools, encode_tools(tools))
      |> Map.merge(Map.new(provider_opts))

    encoded = Jason.encode!(body)
    
    request
    |> Req.Request.put_header("content-type", "application/json")
    |> Map.put(:body, encoded)
  end

  @impl ReqLLM.Provider
  def decode_response({req, resp}) do
    case resp.status do
      200 ->
        body = ensure_parsed_body(resp.body)
        
        with {:ok, response} <- decode_chat_response(body, req) do
          {req, %{resp | body: response}}
        else
          {:error, reason} -> 
            {req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
        end

      status ->
        {req,
         ReqLLM.Error.API.Response.exception(
           reason: "Zephyr API error",
           status: status,
           response_body: resp.body
         )}
    end
  end

  @impl ReqLLM.Provider
  def attach_stream(model, context, opts, _finch_name) do
    api_key = ReqLLM.Keys.get!(model, opts)
    url = Keyword.get(opts, :base_url, default_base_url()) <> "/chat:stream"
    
    headers = [
      {"authorization", "Bearer " <> api_key},
      {"content-type", "application/json"},
      {"accept", "text/event-stream"}
    ]
    
    req = %Req.Request{
      options: %{
        model: model.model,
        context: context,
        stream: true,
        provider_options: opts[:provider_options] || []
      }
    }
    
    body = encode_body(req).body
    {:ok, Finch.build(:post, url, headers, body)}
  end

  @impl ReqLLM.Provider
  def decode_stream_event(%{data: data}, model) do
    case Jason.decode(data) do
      {:ok, %{"type" => "delta", "text" => text}} when is_binary(text) and text != "" ->
        [ReqLLM.StreamChunk.text(text)]
        
      {:ok, %{"type" => "reasoning", "text" => think}} when is_binary(think) and think != "" ->
        [ReqLLM.StreamChunk.thinking(think)]
        
      {:ok, %{"type" => "tool_call", "name" => name, "arguments" => args}} ->
        [ReqLLM.StreamChunk.tool_call(name, Map.new(args))]
        
      {:ok, %{"type" => "usage", "usage" => usage}} ->
        [ReqLLM.StreamChunk.meta(%{usage: normalize_usage(usage), model: model.model})]
        
      {:ok, %{"type" => "done", "finish_reason" => reason}} ->
        [ReqLLM.StreamChunk.meta(%{
          finish_reason: normalize_finish_reason(reason),
          terminal?: true
        })]
        
      _ ->
        []
    end
  end

  @impl ReqLLM.Provider
  def extract_usage(body, _model) when is_map(body) do
    case body do
      %{"usage" => u} -> {:ok, normalize_usage(u)}
      _ -> {:error, :no_usage}
    end
  end

  @impl ReqLLM.Provider
  def translate_options(:chat, _model, opts) do
    translated =
      case Keyword.pop(opts, :max_tokens) do
        {nil, rest} -> rest
        {value, rest} -> Keyword.put(rest, :max_output_tokens, value)
      end

    {Keyword.drop(translated, [:presence_penalty]), []}
  end

  # Helper functions

  defp encode_part(%ReqLLM.Message.ContentPart{type: :text, text: t}), 
    do: %{"type" => "text", "text" => t}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :image_url, url: url}), 
    do: %{"type" => "image_url", "url" => url}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :image, data: bin, media_type: mt}), 
    do: %{"type" => "image", "data" => Base.encode64(bin), "media_type" => mt}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :file, data: bin, media_type: mt, name: name}), 
    do: %{"type" => "file", "name" => name, "data" => Base.encode64(bin), "media_type" => mt}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :thinking, text: t}), 
    do: %{"type" => "thinking", "text" => t}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: a}), 
    do: %{"type" => "tool_call", "name" => n, "arguments" => a}
    
  defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: a}), 
    do: %{"type" => "tool_result", "name" => n, "result" => a}

  defp decode_chat_response(body, req) do
    with %{"message" => %{"role" => role, "content" => content}} <- body,
         {:ok, message} <- to_message(role, content) do
      {:ok,
       %ReqLLM.Response{
         id: body["id"] || "zephyr_" <> Integer.to_string(System.unique_integer([:positive])),
         model: req.options[:model],
         context: req.options[:context] || ReqLLM.Context.new([]),
         message: message,
         usage: normalize_usage(body["usage"] || %{}),
         stream?: false
       }}
    else
      _ -> {:error, :unexpected_body}
    end
  end

  defp to_message(role, parts) do
    content_parts =
      Enum.flat_map(parts, fn
        %{"type" => "text", "text" => t} -> 
          [%ReqLLM.Message.ContentPart{type: :text, text: t}]
          
        %{"type" => "thinking", "text" => t} -> 
          [%ReqLLM.Message.ContentPart{type: :thinking, text: t}]
          
        %{"type" => "tool_call", "name" => n, "arguments" => a} -> 
          [%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: Map.new(a)}]
          
        %{"type" => "tool_result", "name" => n, "result" => r} -> 
          [%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: Map.new(r)}]
          
        _ -> []
      end)

    {:ok, %ReqLLM.Message{role: String.to_existing_atom(role), content: content_parts}}
  end

  defp encode_tools([]), do: nil
  defp encode_tools(tools) do
    Enum.map(tools, &ReqLLM.Tool.to_schema(&1, :openai))
  end

  defp maybe_put(map, _k, nil), do: map
  defp maybe_put(map, k, v), do: Map.put(map, k, v)

  defp ensure_parsed_body(body) when is_binary(body), do: Jason.decode!(body)
  defp ensure_parsed_body(body), do: body

  defp normalize_usage(%{"prompt" => i, "completion" => o}), 
    do: %{input_tokens: i || 0, output_tokens: o || 0, total_tokens: (i || 0) + (o || 0)}
    
  defp normalize_usage(%{"input_tokens" => i, "output_tokens" => o, "total_tokens" => t}), 
    do: %{input_tokens: i || 0, output_tokens: o || 0, total_tokens: t || (i || 0) + (o || 0)}
    
  defp normalize_usage(_), 
    do: %{input_tokens: 0, output_tokens: 0, total_tokens: 0}

  defp normalize_finish_reason("stop"), do: :stop
  defp normalize_finish_reason("length"), do: :length
  defp normalize_finish_reason("tool"), do: :tool_calls
  defp normalize_finish_reason(_), do: :error
end

Working with Canonical Data Structures

Input: Context to Provider JSON

Always convert ReqLLM.Context (list of Messages with ContentParts) to provider JSON.

Message structure:

  • role is :system | :user | :assistant | :tool

  • content is a list of ContentPart

ContentPart variants to handle:

  • text("...") - Plain text content
  • image_url("...") - Image from URL
  • image(binary, mime) - Base64-encoded image
  • file(binary, name, mime) - File attachment
  • thinking("...") - Reasoning tokens (for models that expose them)
  • tool_call(name, map) - Function call request
  • tool_result(tool_call_id_or_name, map) - Function call result
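
For reference when writing encode_body/1, a multimodal user turn built from these variants might look like the following (a sketch using the constructors listed above; check the Data Structures guide for the exact helper names):

alias ReqLLM.Context
alias ReqLLM.Message.ContentPart

context =
  Context.new([
    Context.user([
      ContentPart.text("What is in this picture?"),
      ContentPart.image_url("https://example.test/cat.png")
    ])
  ])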

Output: Provider JSON to Response

Non-streaming:

Decode provider JSON into a single assistant ReqLLM.Message with canonical ContentParts and fill ReqLLM.Response:

  • Response.message is the assistant message
  • Response.usage is normalized when available
  • For object generation, preserve tool_call/tool_result or JSON content so ReqLLM.Response.object/1 works consistently

Streaming (SSE):

Map each provider event into one or more ReqLLM.StreamChunk:

  • :content — Text tokens
  • :thinking — Reasoning tokens
  • :tool_call — Function name + arguments (may arrive in fragments)
  • :meta — Usage deltas, finish_reason, terminal?: true on completion

Normalization principle

One conversation model, one streaming shape, one response shape: Never leak provider specifics to callers; normalize at the adapter boundary.

Model Metadata Integration

Add local patch

Create priv/models_local/<provider>.json to seed/supplement models before syncing:

{
  "provider": { 
    "id": "acme", 
    "name": "Acme AI" 
  },
  "models": [
    {
      "id": "acme-chat-mini",
      "name": "Acme Chat Mini",
      "type": "chat",
      "capabilities": { 
        "stream": true, 
        "tool_call": true, 
        "vision": true 
      },
      "modalities": { 
        "input": ["text","image"], 
        "output": ["text"] 
      },
      "cost": { 
        "input": 0.00015, 
        "output": 0.0006 
      }
    }
  ]
}

Sync registry

Run:

mix req_llm.model_sync

This generates priv/models_dev/acme.json and updates ValidProviders.

Benefits

The registry enables:

  • Validation with mix mc
  • Model lookup by "acme:acme-chat-mini"
  • Capability gating in tests
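
For example, after syncing, the spec string resolves through the registry (the same lookup used in the tests below):

{:ok, %ReqLLM.Model{} = model} = ReqLLM.Model.from("acme:acme-chat-mini")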

Testing Strategy

ReqLLM uses a three-tier testing architecture:

1. Core package tests (no API calls)

Under test/req_llm/ for core types/helpers.

2. Provider-specific tests (no API calls)

Under test/providers/, unit-testing your encoding/decoding and options behavior with small bodies.

Example:

defmodule Providers.AcmeTest do
  use ExUnit.Case, async: true

  alias ReqLLM.Message.ContentPart

  test "encode_body: text + tools into OpenAI shape" do
    ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
    {:ok, model} = ReqLLM.Model.from("acme:acme-chat-mini")
    
    req =
      Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
      |> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
      |> ReqLLM.Providers.Acme.encode_body()

    assert is_binary(req.body)
    body = Jason.decode!(req.body)
    assert body["model"] =~ "acme-chat-mini"
    assert body["messages"] |> is_list()
  end
end

3. Live API coverage tests

Under test/coverage/ using the fixture system for integration against the high-level API.

Example:

defmodule Coverage.AcmeChatTest do
  use ExUnit.Case, async: false
  use ReqLLM.Test.LiveFixture, provider: :acme

  test "basic text generation" do
    {:ok, response} =
      use_fixture(:provider, "acme-basic", fn ->
        ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
      end)

    assert ReqLLM.Response.text(response) =~ "hi"
  end

  test "streaming tokens" do
    {:ok, sr} =
      use_fixture(:provider, "acme-stream", fn ->
        ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
      end)

    tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
    assert length(tokens) >= 3
  end
end

Recording fixtures

# Record fixtures during live test runs
LIVE=true mix test --only provider:acme

# Or use model compatibility tool
mix mc "acme:*" --record

Validate coverage

# Quick validation
mix mc

# Sample models during development
mix mc --sample

Authentication

Use ReqLLM.Keys

Always use ReqLLM.Keys for key retrieval. Never read System.get_env/1 directly.

api_key = ReqLLM.Keys.get!(model, opts)

Configuration

The DSL's default_env_key is the fallback env var name. ReqLLM.Keys also supports:

  • Application config
  • Per-call override via opts[:api_key]
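
A per-call override might look like this (a sketch; when :api_key is absent, resolution falls back to application config and then ACME_API_KEY):

{:ok, response} =
  ReqLLM.generate_text("acme:acme-chat-mini", "Hello", api_key: "sk-acme-test-key")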

Adding authentication

Attach the Bearer header in attach/3, or rely on Defaults (which already sets the authorization header):

@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
  api_key = ReqLLM.Keys.get!(model_input, user_opts)
  
  request
  |> Req.Request.put_header("authorization", "Bearer #{api_key}")
  |> Req.Request.put_header("content-type", "application/json")
end

Error Handling

Use Splode error types

ReqLLM's errors are Splode-based exception structs such as ReqLLM.Error.Parse and ReqLLM.Error.API.Response; return these instead of ad-hoc atoms or strings so failures propagate uniformly through the pipeline.

Example

In decode_response/1, return {req, exception} for non-200 or malformed payloads:

@impl ReqLLM.Provider
def decode_response({req, resp}) do
  case resp.status do
    200 ->
      body = ensure_parsed_body(resp.body)
      
      with {:ok, response} <- decode_chat_response(body, req) do
        {req, %{resp | body: response}}
      else
        {:error, reason} -> 
          {req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
      end

    status ->
      {req,
       ReqLLM.Error.API.Response.exception(
         reason: "API error",
         status: status,
         response_body: resp.body
       )}
  end
end

The pipeline will propagate errors consistently to callers.
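
At the call site, failures typically surface as {:error, exception} tuples that can be matched on the ReqLLM.Error structs (a hedged illustration; the exact structs and fields are defined under ReqLLM.Error):

case ReqLLM.generate_text("acme:acme-chat-mini", "Hello") do
  {:ok, response} -> ReqLLM.Response.text(response)
  {:error, %ReqLLM.Error.API.Response{status: status}} -> {:error, {:acme_api, status}}
  {:error, other} -> {:error, other}
end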

Step-by-Step Example

Let's add a fictional provider called "Acme" from start to finish.

1. Create provider module

File: lib/req_llm/providers/acme.ex

defmodule ReqLLM.Providers.Acme do
  @moduledoc "Acme – OpenAI-compatible chat API."

  @behaviour ReqLLM.Provider

  use ReqLLM.Provider.DSL,
    id: :acme,
    base_url: "https://api.acme.ai/v1",
    metadata: "priv/models_dev/acme.json",
    default_env_key: "ACME_API_KEY",
    provider_schema: [
      organization: [type: :string, doc: "Tenant/Org header"]
    ]

  use ReqLLM.Provider.Defaults

  @impl ReqLLM.Provider
  def attach(request, model_input, user_opts) do
    request = super(request, model_input, user_opts)
    org = user_opts[:organization]
    
    case org do
      nil -> request
      _ -> Req.Request.put_header(request, "x-acme-organization", org)
    end
  end
end

2. Add model metadata

File: priv/models_local/acme.json

{
  "provider": { 
    "id": "acme", 
    "name": "Acme AI" 
  },
  "models": [
    {
      "id": "acme-chat-mini",
      "name": "Acme Chat Mini",
      "type": "chat",
      "capabilities": { 
        "stream": true, 
        "tool_call": true, 
        "vision": true 
      },
      "modalities": { 
        "input": ["text","image"], 
        "output": ["text"] 
      },
      "cost": { 
        "input": 0.00015, 
        "output": 0.0006 
      }
    }
  ]
}

3. Sync registry

mix req_llm.model_sync

4. Quick smoke test

export ACME_API_KEY=sk-...
mix req_llm.gen "Hello" --model acme:acme-chat-mini

5. Provider unit tests

File: test/providers/acme_test.exs

defmodule Providers.AcmeTest do
  use ExUnit.Case, async: true

  alias ReqLLM.Message.ContentPart

  test "encode_body: text + tools into OpenAI shape" do
    ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
    {:ok, model} = ReqLLM.Model.from("acme:acme-chat-mini")
    
    req =
      Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
      |> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
      |> ReqLLM.Providers.Acme.encode_body()

    assert is_binary(req.body)
    body = Jason.decode!(req.body)
    assert body["model"] =~ "acme-chat-mini"
    assert body["messages"] |> is_list()
  end
end

6. Coverage tests with fixtures

File: test/coverage/acme_chat_test.exs

defmodule Coverage.AcmeChatTest do
  use ExUnit.Case, async: false
  use ReqLLM.Test.LiveFixture, provider: :acme

  test "basic text generation" do
    {:ok, response} =
      use_fixture(:provider, "acme-basic", fn ->
        ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
      end)

    assert ReqLLM.Response.text(response) =~ "hi"
  end

  test "streaming tokens" do
    {:ok, sr} =
      use_fixture(:provider, "acme-stream", fn ->
        ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
      end)

    tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
    assert length(tokens) >= 3
  end
end

7. Record fixtures

# Option 1: During test run
LIVE=true mix test --only provider:acme

# Option 2: Using model compat tool
mix mc "acme:*" --record

8. Validate models

# Validate Acme models
mix req_llm.model_compat acme

# List all registered providers/models
mix mc --available

Best Practices

Simplicity-first and normalization

  • Prefer using ReqLLM.Provider.Defaults. Only override what the provider truly deviates on
  • Keep prepare_request/4 a thin dispatcher; centralize option prep in attach/3 and the defaults pipeline

Code style (from AGENTS.md)

  • No comments inside function bodies. Use clear naming and module docs
  • Prefer pattern matching to conditionals
  • Use {:ok, result} | {:error, reason} tuples for fallible helpers

Options translation

  • Use translate_options/3 to rename/drop provider-specific params (e.g., max_tokens → max_output_tokens)

Tools and multimodal

  • Always map tools via ReqLLM.Tool.to_schema/2
  • Respect ContentPart variants for images/files. Base64 encode if the provider requires it

Streaming

  • Build the Finch request in attach_stream/4
  • Decode events to StreamChunk in decode_stream_event/2 or /3
  • Emit terminal meta chunk with finish_reason and usage if provided

Testing incrementally

  • Start with non-streaming happy path, then add streaming and tools
  • Record minimal, deterministic fixtures (temperature: 0)

Advanced Topics

When to consider the advanced path

  • Provider uses non-SSE streaming (binary protocol) or chunked JSON requiring stateful accumulation
  • Models with unique parameter semantics that demand translate_options/3 and capability gating
  • Complex multimodal tool invocation requiring custom mapping of multi-part tool args/results

Advanced implementations

  • Implement parse_stream_protocol/2 for custom binary protocols (e.g., AWS Event Stream)
  • Implement init_stream_state/1, decode_stream_event/3, flush_stream_state/2 to accumulate partial tool_call args or demultiplex multi-channel events
  • Implement normalize_model_id/1 for regional aliases and translate_options/3 with warning aggregation
  • Provide provider-specific usage accounting that merges multi-phase usage deltas
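
The exact stateful-streaming contract is defined by the ReqLLM.Provider behaviour; as a rough, hypothetical illustration of the accumulation idea only (helper names and state shape are not part of ReqLLM), a provider might merge fragmented tool-call arguments in whatever state decode_stream_event/3 threads through, then emit complete chunks on flush:

# Hypothetical helper: merge a fragmented tool-call argument chunk into
# accumulated state keyed by call id.
defp accumulate_tool_call(state, %{"id" => id, "name" => name, "arguments" => fragment}) do
  Map.update(state, id, %{name: name, json: fragment}, fn acc ->
    %{acc | json: acc.json <> fragment}
  end)
end

# Hypothetical helper: emit complete tool_call chunks once the stream ends
# (e.g., from flush_stream_state/2).
defp flush_tool_calls(state) do
  for {_id, %{name: name, json: json}} <- state,
      {:ok, args} <- [Jason.decode(json)] do
    ReqLLM.StreamChunk.tool_call(name, args)
  end
end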

Callback Reference

What to implement and when

prepare_request/4

  • Build Req for the operation
  • Defaults cover :chat, :object, :embedding

attach/3

  • Set headers, auth, and pipeline steps
  • Defaults add Bearer, retry, error, usage, fixture steps

encode_body/1

  • Transform options/context to provider JSON
  • Defaults are OpenAI-compatible; override for custom wire formats

decode_response/1

  • Map provider body to Response or error
  • Defaults map OpenAI-style bodies; override if your shape differs

attach_stream/4

  • Must return {:ok, Finch.Request.t()}
  • Defaults build OpenAI-compatible streaming requests; override for custom endpoints/headers

decode_stream_event/2 or /3

  • Map provider events to StreamChunk
  • Defaults handle OpenAI-compatible deltas

extract_usage/2

  • Normalize usage tokens/cost if provider deviates from standard usage shape

translate_options/3

  • Rename/drop options per model or operation

Summary

Adding a provider to ReqLLM involves:

  1. Creating a provider module with the DSL and behavior implementation
  2. Implementing encoding/decoding for the provider's wire format
  3. Adding model metadata and syncing the registry
  4. Writing tests at all three tiers (core, provider, coverage)
  5. Recording fixtures for validation

By following these guidelines and leveraging the defaults, you can add robust, well-tested provider support that maintains ReqLLM's normalization principles across all AI interactions.