# Configuration
This guide covers all global configuration options for ReqLLM, including timeouts, connection pools, and runtime settings.
## Quick Reference

```elixir
# config/config.exs
config :req_llm,
  # HTTP timeouts (all values in milliseconds)
  receive_timeout: 120_000,         # Default response timeout
  stream_receive_timeout: 120_000,  # Streaming chunk timeout
  metadata_timeout: 120_000,        # Streaming metadata collection timeout
  thinking_timeout: 300_000,        # Extended timeout for reasoning models
  image_receive_timeout: 120_000,   # Image generation timeout

  # Streaming request transforms
  finch_request_adapter: MyApp.FinchAdapter,  # Module implementing ReqLLM.FinchRequestAdapter

  # Key management
  load_dotenv: true,                # Auto-load .env files at startup

  # Telemetry
  telemetry: [payloads: :none],     # Request payload policy (:none or :raw)

  # Debugging
  debug: false                      # Enable verbose logging
```

## Timeout Configuration
ReqLLM uses multiple timeout settings to handle different scenarios:
### receive_timeout (default: 30,000ms)
The standard HTTP response timeout for non-streaming requests. Increase this for slow models or large responses.
```elixir
config :req_llm, receive_timeout: 60_000
```

Per-request override:

```elixir
ReqLLM.generate_text("openai:gpt-4o", "Hello", receive_timeout: 60_000)
```

### stream_receive_timeout (default: inherits from receive_timeout)
Timeout between streaming chunks. If no data arrives within this window, the stream fails.
```elixir
config :req_llm, stream_receive_timeout: 120_000
```

### thinking_timeout (default: 300,000ms / 5 minutes)
Extended timeout for reasoning models that "think" before responding (e.g., Claude with extended thinking, OpenAI o1/o3 models, Z.AI thinking mode). These models may take several minutes to produce the first token.
```elixir
config :req_llm, thinking_timeout: 600_000  # 10 minutes
```

Automatic detection: ReqLLM applies thinking_timeout automatically (see the example after this list) when:
- Extended thinking is enabled on Anthropic models
- Using OpenAI o1/o3 reasoning models
- Z.AI or Z.AI Coder thinking mode is enabled
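For example, calling an OpenAI reasoning model picks up the extended timeout without any extra options. A minimal sketch; the model name and prompt are only illustrative:

```elixir
# o1/o3 models are detected as reasoning models, so ReqLLM uses
# thinking_timeout (300_000 ms by default) for this request instead of
# the standard receive_timeout.
{:ok, response} =
  ReqLLM.generate_text("openai:o1", "Outline a database migration plan step by step")
```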
### metadata_timeout (default: 300,000ms)
Timeout for collecting streaming metadata (usage, finish_reason) after the stream completes. Long-running streams or slow providers may need more time.
```elixir
config :req_llm, metadata_timeout: 120_000
```

Per-request override:

```elixir
ReqLLM.stream_text("anthropic:claude-haiku-4-5", "Hello", metadata_timeout: 60_000)
```

### image_receive_timeout (default: 120,000ms)
Extended timeout specifically for image generation operations, which can take longer than text generation.
```elixir
config :req_llm, image_receive_timeout: 180_000
```

## Connection Pool Configuration
ReqLLM uses Finch for HTTP connections. By default, HTTP/1-only pools are used due to a known Finch issue with HTTP/2 and large request bodies.
### Default Configuration

```elixir
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http1], size: 1, count: 8]
    }
  ]
```

### High-Concurrency Configuration
For applications making many concurrent requests:
```elixir
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http1], size: 1, count: 32]
    }
  ]
```

### HTTP/2 Configuration (Advanced)
Use with caution—HTTP/2 pools may fail with request bodies larger than 64KB:
```elixir
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http2, :http1], size: 1, count: 8]
    }
  ]
```

### Custom Finch Instance Per-Request
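The finch_name option expects an already-running Finch instance. Here is a sketch of starting a dedicated instance under your application's supervision tree, using the standard Finch child spec; the instance name and pool sizing are illustrative:

```elixir
# In your application supervisor (e.g. MyApp.Application.start/2):
# a second Finch instance dedicated to LLM traffic.
children = [
  {Finch,
   name: MyApp.CustomFinch,
   pools: %{default: [protocols: [:http1], size: 1, count: 16]}}
]

Supervisor.start_link(children, strategy: :one_for_one)
```

You can then point individual requests at it: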
```elixir
{:ok, response} = ReqLLM.stream_text(model, messages, finch_name: MyApp.CustomFinch)
```

## Streaming Request Transforms
ReqLLM provides two hooks for modifying the Finch.Request struct just before a streaming request is sent, mirroring a similar capability in Req. These hooks are useful for injecting headers, adding tracing metadata, or handling other environment-specific concerns.
### finch_request_adapter (config-level)
Set a module that implements the ReqLLM.FinchRequestAdapter behaviour. Because config files cannot hold anonymous functions, this mechanism requires a named module.
```elixir
# config/test.exs
config :req_llm, finch_request_adapter: MyApp.TestFinchAdapter
```

```elixir
defmodule MyApp.TestFinchAdapter do
  @behaviour ReqLLM.FinchRequestAdapter

  @impl true
  def call(%Finch.Request{} = request) do
    %{request | headers: request.headers ++ [{"x-test-env", "true"}]}
  end
end
```

### on_finch_request (per-request)
Pass an anonymous function (Finch.Request.t() -> Finch.Request.t()) as a per-call option:
ReqLLM.stream_text("openai:gpt-4o", "Hello",
on_finch_request: fn req ->
%{req | headers: req.headers ++ [{"x-request-id", UUID.generate()}]}
end
)Precedence
Both mechanisms can be combined. The config-level adapter is applied first, then the per-request callback. Each step receives the output of the previous one.
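As a sketch of that ordering, reusing MyApp.TestFinchAdapter from above: with the adapter configured, a per-request callback sees the request after the adapter has already run. The trace header below is only illustrative.

```elixir
# Assumes config :req_llm, finch_request_adapter: MyApp.TestFinchAdapter is set,
# so req.headers already includes {"x-test-env", "true"} when this function runs.
ReqLLM.stream_text("openai:gpt-4o", "Hello",
  on_finch_request: fn req ->
    %{req | headers: req.headers ++ [{"x-trace-id", "abc-123"}]}
  end
)
```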
## Telemetry Configuration
ReqLLM emits native :telemetry events for request lifecycle, reasoning lifecycle, and token usage. By default, those events are metadata-only:
```elixir
config :req_llm, telemetry: [payloads: :none]
```

To include sanitized request and response payloads on request lifecycle events:

```elixir
config :req_llm, telemetry: [payloads: :raw]
```

Per-request override:
ReqLLM.generate_text("anthropic:claude-haiku-4-5", "Hello", telemetry: [payloads: :raw])
ReqLLM.stream_text("openai:gpt-5-mini", "Hello", telemetry: [payloads: :raw])Notes:
- Payload capture only applies to request lifecycle events. Reasoning events are always metadata-only.
- Thinking and reasoning text is redacted from payloads.
- Tools are summarized to stable metadata and binary attachments are reduced to byte and media summaries.
- Unknown payload shapes are recursively sanitized so opaque binaries are summarized instead of passed through.
- Embedding and audio operations stay summarized rather than emitting raw vectors or audio bytes.
- Requested and effective reasoning telemetry are tracked separately, so provider translation can be observed when a reasoning setting is dropped or rewritten.
- If callers provide conflicting reasoning controls, explicit disable signals win in the normalized telemetry snapshot.
- The default is :none, which is the safer choice for multi-tenant systems.
See the Telemetry Guide for the event model and payload semantics.
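To consume these events, attach handlers with the standard :telemetry API. A minimal sketch; the event name below is a placeholder rather than one ReqLLM is documented here to emit, so check the Telemetry Guide for the actual event names and payloads:

```elixir
# Placeholder event name for illustration only; consult the Telemetry Guide
# for the events ReqLLM actually emits.
:telemetry.attach(
  "my-app-req-llm-logger",
  [:req_llm, :request, :stop],
  fn _event, measurements, metadata, _config ->
    IO.inspect({measurements, metadata}, label: "req_llm telemetry")
  end,
  nil
)
```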
## API Key Configuration
Keys are loaded with clear precedence: per-request → in-memory → app config → env vars → .env files.
### .env Files (Recommended)
```
# .env
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
```
Disable automatic .env loading:
```elixir
config :req_llm, load_dotenv: false
```

### Application Config
```elixir
config :req_llm,
  anthropic_api_key: "sk-ant-...",
  openai_api_key: "sk-..."
```

### Runtime / In-Memory
```elixir
ReqLLM.put_key(:anthropic_api_key, "sk-ant-...")
ReqLLM.put_key(:openai_api_key, "sk-...")
```

### Per-Request Override
ReqLLM.generate_text("openai:gpt-4o", "Hello", api_key: "sk-...")Provider-Specific Configuration
Configure base URLs or other provider-specific settings:
```elixir
config :req_llm, :azure,
  base_url: "https://your-resource.openai.azure.com",
  api_version: "2024-08-01-preview"
```

See individual provider guides for available options.
## Debug Mode
Enable verbose logging for troubleshooting:
```elixir
config :req_llm, debug: true
```

Or via environment variable:

```bash
REQ_LLM_DEBUG=1 mix test
```
## Example: Production Configuration
```elixir
# config/prod.exs
config :req_llm,
  receive_timeout: 120_000,
  stream_receive_timeout: 120_000,
  thinking_timeout: 300_000,
  metadata_timeout: 120_000,
  telemetry: [payloads: :none],
  load_dotenv: false,  # Use proper secrets management in production
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http1], size: 1, count: 16]
    }
  ]
```

## Example: Development Configuration
```elixir
# config/dev.exs
config :req_llm,
  receive_timeout: 60_000,
  debug: true,
  load_dotenv: true
```