Telemetry
ReqLLM emits native :telemetry events for both Req-backed requests and Finch-backed streaming. Every event for a logical request shares the same request_id, so you can correlate request lifecycle, reasoning lifecycle, and token usage without provider-specific parsing.
Use these events for billing, tenant attribution, latency tracking, reasoning observability, and low-level integrations that cannot rely on wrapping Req directly.
Event Families
- [:req_llm, :request, :start] fires once when a request begins.
- [:req_llm, :request, :stop] fires once when a request completes, including streaming completion and cancellation.
- [:req_llm, :request, :exception] fires once when a request fails.
- [:req_llm, :reasoning, :start] fires when the effective request enables provider reasoning.
- [:req_llm, :reasoning, :update] fires on reasoning milestones, not every chunk.
- [:req_llm, :reasoning, :stop] fires when a reasoning request finishes, is cancelled, or errors.
- [:req_llm, :token_usage] remains as a compatibility event for token and cost tracking.
Request lifecycle events always include a reasoning map in metadata, even for operations that do not support reasoning. In those cases, the snapshot is explicit about reasoning being disabled or unsupported.
Measurements
- request.start, reasoning.start, and reasoning.update emit %{system_time: integer}.
- request.stop, request.exception, and reasoning.stop emit %{duration: integer, system_time: integer}.
duration is in native monotonic time units and should be converted with System.convert_time_unit/3 if you want milliseconds.
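As a minimal sketch of that conversion, the helper below wraps System.convert_time_unit/3. The module name DurationExample is hypothetical and not part of ReqLLM. Note that System.convert_time_unit/3 rounds down, so sub-millisecond durations become 0.

```elixir
defmodule DurationExample do
  # Hypothetical helper: converts a native-unit telemetry duration
  # to whole milliseconds (rounded down).
  def to_ms(duration_native) when is_integer(duration_native) do
    System.convert_time_unit(duration_native, :native, :millisecond)
  end
end

# Simulate a 1500 ms duration expressed in native units, then convert it back.
native = System.convert_time_unit(1_500, :millisecond, :native)
IO.inspect(DurationExample.to_ms(native), label: "duration_ms")
```

If you need finer resolution, passing :microsecond instead of :millisecond works the same way.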
Request Metadata
Every request lifecycle event includes these core metadata fields:
- request_id
- operation
- mode
- provider
- model
- transport
- reasoning
- request_summary
- response_summary
- http_status
- finish_reason
- usage
When payload capture is enabled, request lifecycle events also include request_payload and response_payload.
Typical request metadata looks like this:
%{
  request_id: "2184",
  operation: :chat,
  mode: :stream,
  provider: :anthropic,
  model: %LLMDB.Model{},
  transport: :finch,
  reasoning: %{
    supported?: true,
    requested?: true,
    effective?: true,
    requested_mode: :enabled,
    requested_effort: :medium,
    requested_budget_tokens: 4096,
    effective_mode: :enabled,
    effective_effort: :medium,
    effective_budget_tokens: 4096,
    returned_content?: true,
    reasoning_tokens: 812,
    content_bytes: 1432,
    channel: :content_and_usage
  },
  request_summary: %{
    message_count: 1,
    text_bytes: 42,
    image_part_count: 0,
    tool_call_count: 0
  },
  response_summary: %{
    text_bytes: 318,
    thinking_bytes: 1432,
    tool_call_count: 0,
    image_count: 0,
    object?: false
  },
  http_status: 200,
  finish_reason: :stop,
  usage: %{
    input_tokens: 24,
    output_tokens: 133,
    total_tokens: 157,
    reasoning_tokens: 812
  }
}
request_summary and response_summary are compact by design. Their exact shape varies by operation:
- Chat, object, and image requests summarize message count, text bytes, image parts, and tool calls.
- Chat, object, and image responses summarize output text bytes, thinking bytes, tool calls, image count, and structured object presence.
- Embeddings summarize input count, vector count, and dimensions.
- Speech summarizes input text bytes and output audio size and format.
- Transcription summarizes input audio size plus transcript text bytes, segment count, and duration.
Standardized Reasoning Metadata
The reasoning map is the provider-neutral contract for reasoning and thinking observability:
- supported? says whether the operation and model support reasoning.
- requested? reflects the original API options passed to ReqLLM.
- effective? reflects the translated provider request after normalization.
- requested_mode, requested_effort, and requested_budget_tokens capture the caller intent.
- effective_mode, effective_effort, and effective_budget_tokens capture what the provider request actually used.
- returned_content? indicates whether reasoning content was observed.
- reasoning_tokens tracks normalized reasoning token usage when providers expose it.
- content_bytes tracks the amount of reasoning content observed without exposing the content itself.
- channel is one of :none, :usage_only, :content_only, or :content_and_usage.
Requested reasoning is normalized from the original ReqLLM options, such as:
- reasoning_effort
- thinking: %{type: "enabled", budget_tokens: ...}
- provider_options: [google_thinking_budget: ...]
- provider-specific reasoning budget and thinking toggles
Effective reasoning is normalized from the translated provider request so that OpenAI, Anthropic, Google, Vertex, and other providers can be compared through the same telemetry shape.
The normalizer currently covers these provider request shapes:
- OpenAI-style reasoning effort fields such as reasoning.effort and reasoning_effort on OpenAI, OpenRouter, Groq, and xAI
- Anthropic-style thinking fields such as thinking and additional_model_request_fields.thinking on Anthropic, Azure Claude, Bedrock Claude, and Vertex Claude
- Google-style thinking budgets such as google_thinking_budget and generationConfig.thinkingConfig.thinkingBudget on Google Gemini and Vertex Gemini
- Alibaba enable_thinking and thinking_budget
- Zenmux reasoning.enable, reasoning.depth, and reasoning_effort
- Z.AI thinking.type
Because requested is derived from the original ReqLLM call and effective is derived from the translated provider request, they can diverge when provider translation drops, disables, or rewrites a reasoning configuration.
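One way to surface that divergence is to compare the requested and effective fields pairwise. The sketch below is illustrative only: the module name ReasoningDrift and its diff/1 function are hypothetical, and it assumes the reasoning map uses the field names shown in the example metadata above.

```elixir
defmodule ReasoningDrift do
  # Requested/effective field pairs from the reasoning map.
  @pairs [
    {:requested_mode, :effective_mode},
    {:requested_effort, :effective_effort},
    {:requested_budget_tokens, :effective_budget_tokens}
  ]

  # Returns {requested_key, requested_value, effective_value} for every
  # pair whose values differ. An empty list means no divergence.
  def diff(reasoning) when is_map(reasoning) do
    Enum.flat_map(@pairs, fn {req_key, eff_key} ->
      requested = Map.get(reasoning, req_key)
      effective = Map.get(reasoning, eff_key)

      if requested == effective do
        []
      else
        [{req_key, requested, effective}]
      end
    end)
  end
end

# Example: the caller asked for a 4096-token budget, but translation dropped it.
reasoning = %{
  requested_mode: :enabled,
  effective_mode: :enabled,
  requested_effort: :medium,
  effective_effort: :medium,
  requested_budget_tokens: 4096,
  effective_budget_tokens: nil
}

IO.inspect(ReasoningDrift.diff(reasoning))
```

Calling a function like this from a [:req_llm, :request, :stop] handler would let you log or count requests whose reasoning configuration was rewritten during translation.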
When callers send conflicting reasoning controls, ReqLLM telemetry resolves them conservatively. Explicit disable signals such as thinking: %{type: "disabled"}, reasoning_effort: :none, or zero-token budgets win over enable hints in the normalized requested snapshot.
Reasoning Milestones
Reasoning events never include raw thinking text. They are metadata-only, even when payload capture is enabled.
reasoning.update is emitted only for milestone transitions:
- milestone: :content_started when the first reasoning content is observed
- milestone: :usage_updated when reasoning token usage first appears or changes
- milestone: :details_available when provider reasoning details become available
reasoning.start uses milestone: :request_started.
reasoning.stop uses the terminal outcome as its milestone, for example:
- :stop
- :length
- :tool_calls
- :cancelled
- :incomplete
- :error
- :unknown
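The milestones above can be used to measure how long a provider takes to begin emitting reasoning content. The following is a rough sketch, not part of ReqLLM: ReasoningTimeline and record/4 are hypothetical names, the state is a plain map keyed by request_id, and the returned delta is in whatever unit system_time uses (assumed here to be the same unit across events for one request).

```elixir
defmodule ReasoningTimeline do
  # State maps request_id => system_time of the reasoning.start event.
  # Each clause returns {new_state, delta_or_nil}.

  def record(state, [:req_llm, :reasoning, :start], %{system_time: t}, %{request_id: id}) do
    {Map.put(state, id, t), nil}
  end

  def record(
        state,
        [:req_llm, :reasoning, :update],
        %{system_time: t},
        %{request_id: id, milestone: :content_started}
      ) do
    case Map.fetch(state, id) do
      {:ok, started_at} -> {state, t - started_at}
      :error -> {state, nil}
    end
  end

  # Drop the entry once reasoning finishes, whatever the terminal milestone.
  def record(state, [:req_llm, :reasoning, :stop], _measurements, %{request_id: id}) do
    {Map.delete(state, id), nil}
  end

  # Ignore every other event or milestone.
  def record(state, _event, _measurements, _metadata), do: {state, nil}
end
```

In a real handler the state would need to live somewhere shared, such as an Agent or an ETS table, because telemetry handlers run in the process that emits the event.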
Token Usage Compatibility Event
[:req_llm, :token_usage] remains available for existing consumers and now fires for streaming as well as non-streaming requests.
Measurements include:
- input_tokens
- output_tokens
- total_tokens
- input_cost
- output_cost
- total_cost
- reasoning_tokens
Metadata includes:
- model
- request_id
- operation
- mode
- provider
- transport
For new integrations, prefer [:req_llm, :request, :stop] as the source of truth because it includes duration, finish reason, summaries, and normalized reasoning metadata alongside usage.
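For existing consumers that still aggregate from the compatibility event, a per-provider running total might look like the sketch below. UsageLedger is a hypothetical module, and the code assumes total_tokens and total_cost arrive as plain numbers (or nil when a provider does not report them), which this guide does not specify.

```elixir
defmodule UsageLedger do
  # Adds one token_usage event to a running per-provider total.
  # Missing or nil measurements are treated as zero.
  def add(ledger, measurements, metadata) do
    provider = Map.get(metadata, :provider, :unknown)
    tokens = Map.get(measurements, :total_tokens) || 0
    cost = Map.get(measurements, :total_cost) || 0.0

    Map.update(ledger, provider, %{tokens: tokens, cost: cost}, fn acc ->
      %{tokens: acc.tokens + tokens, cost: acc.cost + cost}
    end)
  end
end

ledger =
  %{}
  |> UsageLedger.add(%{total_tokens: 157, total_cost: 0.25}, %{provider: :anthropic})
  |> UsageLedger.add(%{total_tokens: 43, total_cost: 0.5}, %{provider: :anthropic})

IO.inspect(ledger)
```

The same reducer works unchanged if you feed it the usage map from [:req_llm, :request, :stop] metadata, provided you supply a cost value from elsewhere.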
Attaching Telemetry Handlers
defmodule MyApp.ReqLLMObserver do
  require Logger

  @events [
    [:req_llm, :request, :start],
    [:req_llm, :request, :stop],
    [:req_llm, :request, :exception],
    [:req_llm, :reasoning, :start],
    [:req_llm, :reasoning, :update],
    [:req_llm, :reasoning, :stop],
    [:req_llm, :token_usage]
  ]

  def attach do
    :telemetry.attach_many("my-app-req-llm", @events, &__MODULE__.handle_event/4, nil)
  end

  def handle_event([:req_llm, :request, :stop], %{duration: duration}, metadata, _config) do
    duration_ms = System.convert_time_unit(duration, :native, :millisecond)

    Logger.info(
      "req_llm request=#{metadata.request_id} model=#{metadata.model.provider}:#{metadata.model.id} " <>
        "duration_ms=#{duration_ms} finish_reason=#{inspect(metadata.finish_reason)} " <>
        "total_tokens=#{metadata.usage && metadata.usage.total_tokens}"
    )
  end

  def handle_event([:req_llm, :reasoning, :update], _measurements, metadata, _config) do
    Logger.debug(
      "req_llm reasoning request=#{metadata.request_id} milestone=#{inspect(metadata.milestone)} " <>
        "channel=#{inspect(metadata.reasoning.channel)} tokens=#{metadata.reasoning.reasoning_tokens}"
    )
  end

  def handle_event([:req_llm, :token_usage], measurements, metadata, _config) do
    Logger.info(
      "req_llm usage request=#{metadata.request_id} total_tokens=#{measurements.total_tokens} " <>
        "total_cost=#{measurements.total_cost}"
    )
  end

  def handle_event(_event, _measurements, _metadata, _config), do: :ok
end
Payload Capture
By default, ReqLLM telemetry is metadata-only:
config :req_llm, telemetry: [payloads: :none]
You can opt into payload capture globally:
config :req_llm, telemetry: [payloads: :raw]
Or per request:
ReqLLM.generate_text("anthropic:claude-haiku-4-5", "Hello", telemetry: [payloads: :raw])
ReqLLM.stream_text("openai:gpt-5-mini", "Hello", telemetry: [payloads: :raw])
Payload mode only affects request lifecycle events. Reasoning events stay metadata-only.
Raw payload mode is still sanitized:
- reasoning and thinking text is redacted from payloads
- tools are emitted as stable metadata only (name, description, strict, parameter_schema)
- binary message parts such as images and files are summarized by byte size, media type, and filename instead of emitting raw bytes
- unknown payload shapes are recursively sanitized so opaque binaries are summarized instead of passed through
- speech telemetry reports audio size and format, not raw audio bytes
- embedding telemetry reports vector counts and dimensions, not the vectors themselves
- transcription telemetry stays structured and avoids opaque binary payloads
Use raw payload capture carefully in multi-tenant systems because request and response payloads may still contain user content, tool call arguments, and structured outputs.
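If raw payloads are enabled for debugging but some downstream sinks should never see them, one option is to drop the payload keys before forwarding metadata. This is only a sketch: PayloadGuard is a hypothetical module, and it relies solely on the request_payload and response_payload keys named earlier in this guide.

```elixir
defmodule PayloadGuard do
  # Keys that request lifecycle events carry when payload capture is enabled.
  @payload_keys [:request_payload, :response_payload]

  # Returns the metadata without captured payloads. Metadata that has no
  # payload keys is returned unchanged.
  def strip(metadata) when is_map(metadata) do
    Map.drop(metadata, @payload_keys)
  end
end

metadata = %{
  request_id: "2184",
  provider: :anthropic,
  request_payload: %{messages: ["..."]},
  response_payload: %{text: "..."}
}

IO.inspect(PayloadGuard.strip(metadata))
```

A handler could call strip/1 before shipping metadata to a shared log pipeline while still passing the full map to a tenant-scoped audit store.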
OpenTelemetry Bridge
ReqLLM also includes a small OpenTelemetry bridge in ReqLLM.OpenTelemetry.
It turns the normalized request lifecycle telemetry above into GenAI client spans
without adding provider-specific instrumentation paths.
Attach it once during application startup:
case ReqLLM.OpenTelemetry.attach() do
  :ok -> :ok
  {:error, :opentelemetry_unavailable} -> :ok
end
The bridge uses:
- gen_ai.provider.name
- gen_ai.operation.name
- gen_ai.request.model
- gen_ai.output.type
- gen_ai.response.finish_reasons
- gen_ai.usage.input_tokens
- gen_ai.usage.output_tokens
- cache read and cache creation token attributes when available
- error.type for failed requests
ReqLLM does not configure an SDK or exporter for you. To export traces, your host
application still needs normal OpenTelemetry setup, such as :opentelemetry
and an exporter dependency.
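For orientation, a typical host-application setup might look like the fragment below. It is a sketch under assumptions, not something ReqLLM prescribes: the version requirements and the local OTLP endpoint are illustrative, so check the OpenTelemetry Erlang/Elixir documentation for the values that fit your deployment.

```elixir
# mix.exs (host application): add the SDK and an exporter.
# Version requirements here are assumptions; pin whatever your app supports.
defp deps do
  [
    {:opentelemetry_api, "~> 1.4"},
    {:opentelemetry, "~> 1.5"},
    {:opentelemetry_exporter, "~> 1.8"}
  ]
end

# config/runtime.exs: export spans over OTLP/HTTP to a local collector.
# The endpoint is a placeholder for your own collector address.
import Config

config :opentelemetry,
  span_processor: :batch,
  traces_exporter: :otlp

config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  otlp_endpoint: "http://localhost:4318"
```

With something like this in place, the spans produced by the bridge are exported by the host application's SDK like any other span.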
For advanced integrations, ReqLLM also exposes a dependency-free mapper in
ReqLLM.Telemetry.OpenTelemetry. It builds span stubs from ReqLLM telemetry
metadata without attaching handlers or depending on an OpenTelemetry SDK.
defmodule MyApp.ReqLLMOpenTelemetry do
  alias ReqLLM.Telemetry.OpenTelemetry

  @events [
    [:req_llm, :request, :start],
    [:req_llm, :request, :stop],
    [:req_llm, :request, :exception]
  ]

  def attach do
    :telemetry.attach_many("my-app-req-llm-otel", @events, &__MODULE__.handle_event/4, %{})
  end

  def handle_event([:req_llm, :request, :start], _measurements, metadata, _config) do
    stub = OpenTelemetry.request_start(metadata, content: :attributes)
    MyApp.Tracing.start_gen_ai_span(metadata.request_id, stub)
  end

  def handle_event([:req_llm, :request, :stop], _measurements, metadata, _config) do
    stub = OpenTelemetry.request_stop(metadata, content: :attributes)
    MyApp.Tracing.finish_gen_ai_span(metadata.request_id, stub)
  end

  def handle_event([:req_llm, :request, :exception], _measurements, metadata, _config) do
    stub = OpenTelemetry.request_exception(metadata, content: :attributes)
    MyApp.Tracing.finish_gen_ai_span(metadata.request_id, stub)
  end
end
The low-level mapper includes richer normalized GenAI metadata such as:
- gen_ai.response.id
- gen_ai.response.model
- gen_ai.input.messages and gen_ai.output.messages when content capture is enabled
- tool call and tool result payloads in message parts
- exception event payloads for manual span finishing
Coverage Across APIs
These event families are emitted for:
- high-level sync APIs like ReqLLM.generate_text/3, ReqLLM.generate_object/4, ReqLLM.generate_image/3, ReqLLM.embed/3, ReqLLM.transcribe/3, and ReqLLM.speak/3
- high-level streaming APIs like ReqLLM.stream_text/3 and ReqLLM.stream_object/4
- low-level Req-backed flows using provider_module.prepare_request/4 followed by Req.request/1
- low-level streaming flows using ReqLLM.Streaming.start_stream/4
If you need observability that covers both sync and streaming, attach to ReqLLM telemetry rather than Req middleware alone.