ReqLLM.Providers.Groq (ReqLLM v1.0.0)


Groq provider – fully compatible with the OpenAI Chat Completions API, served on Groq's high-performance inference hardware.

Implementation

Uses built-in OpenAI-style encoding/decoding defaults. No custom request/response handling needed – leverages the standard OpenAI wire format.

Groq-Specific Extensions

Beyond standard OpenAI parameters, Groq supports:

  • service_tier - Performance tier (auto, on_demand, flex, performance)
  • reasoning_effort - Reasoning level (none, default, low, medium, high)
  • reasoning_format - Format for reasoning output
  • search_settings - Web search configuration
  • compound_custom - Custom Compound systems configuration
  • logit_bias - Token bias adjustments

See provider_schema/0 for the complete Groq-specific schema and ReqLLM.Provider.Options for inherited OpenAI parameters.

Configuration

# Add to .env file (automatically loaded)
GROQ_API_KEY=gsk_...
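With the key in place, a call might look like the following. This is a hedged sketch: the model id, the `generate_text/3` arity, the `provider_options` key, and `ReqLLM.Response.text/1` follow ReqLLM's general conventions but are not guaranteed to match every version.

```elixir
# Assumes GROQ_API_KEY is set in the environment or .env file.
{:ok, response} =
  ReqLLM.generate_text(
    "groq:llama-3.3-70b-versatile",   # illustrative model id
    "Explain Groq's hardware in one sentence.",
    provider_options: [
      service_tier: :flex,            # Groq-specific extension
      reasoning_effort: :low
    ]
  )

IO.puts(ReqLLM.Response.text(response))
```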

Summary

Functions

  • attach/3 - Default implementation of attach/3.
  • attach_stream/4 - Custom attach_stream that ensures translate_options is called for streaming requests.
  • decode_response/1 - Custom response decoding that normalizes <think> tags into reasoning content parts.
  • decode_stream_event/2 - Default implementation of decode_stream_event/2.
  • decode_stream_event/3 - Stateful SSE event decoding that normalizes <think> tags.
  • encode_body/1 - Custom body encoding that adds Groq-specific extensions to the default OpenAI-compatible format.
  • extract_usage/2 - Default implementation of extract_usage/2.
  • flush_stream_state/2 - Flushes any remaining buffered content when the stream ends.
  • init_stream_state/1 - Initializes streaming state for <think> tag normalization.
  • prepare_request/4 - Custom prepare_request for :object operations to maintain Groq-specific max_tokens handling.
  • translate_options/3 - Default implementation of translate_options/3.

Functions

attach(request, model_input, user_opts)

Default implementation of attach/3.

Sets up Bearer token authentication and standard pipeline steps.

attach_stream(model, context, opts, finch_name)

Custom attach_stream that ensures translate_options is called for streaming requests.

This is necessary because the default streaming path doesn't call translate_options, which means model-specific option normalization (like omitting reasoning_effort for qwen models) wouldn't be applied to streaming requests.

decode_response(request_response)

Custom response decoding that normalizes <think> tags into reasoning content parts.

Some Groq models (qwen/qwen3-32b, deepseek-r1-distill-llama-70b) embed thinking content within <think>...</think> tags in the message content field. This override normalizes those responses to extract thinking content as :thinking content parts, matching the behavior of models that use separate delta.reasoning fields.

For non-streaming: splits <think> blocks from message content into :thinking and :text parts. For streaming: wraps the stream to convert embedded <think> sequences in chunks into separate chunks.
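For the non-streaming case, the split can be pictured with a small self-contained sketch (illustrative only, not the provider's actual implementation):

```elixir
defmodule ThinkSplit do
  # Splits "<think>...</think>" blocks out of a message body, returning a
  # list of {:thinking, text} and {:text, text} parts in document order.
  def split(content) do
    ~r/<think>(.*?)<\/think>/s
    |> Regex.split(content, include_captures: true, trim: true)
    |> Enum.map(fn
      "<think>" <> rest -> {:thinking, String.trim_trailing(rest, "</think>")}
      text -> {:text, text}
    end)
  end
end

ThinkSplit.split("<think>plan the answer</think>The answer is 42.")
# => [thinking: "plan the answer", text: "The answer is 42."]
```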

decode_stream_event(event, model)

Default implementation of decode_stream_event/2.

Decodes SSE events using OpenAI-compatible format.

decode_stream_event(event, model, provider_state)

Stateful SSE event decoding that normalizes <think> tags.

Maintains state across events to handle tags split across chunks. Returns updated chunks and new state.
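The state machine such an override maintains can be sketched as follows (an illustrative stand-in, not the library's actual code): it buffers any suffix that could be the start of a tag split across chunk boundaries and toggles between :text and :thinking modes, mirroring the roles of init_stream_state/1 and flush_stream_state/2.

```elixir
defmodule ThinkStream do
  @open "<think>"
  @close "</think>"

  # Initial state: emitting plain text, nothing buffered.
  def init, do: %{mode: :text, buffer: ""}

  # Feed one streamed text delta; returns {parts, new_state} where
  # parts is a list of {:text, str} | {:thinking, str} tuples.
  def feed(state, delta), do: scan(%{state | buffer: state.buffer <> delta}, [])

  # Emit whatever is still buffered when the stream ends.
  def flush(%{buffer: ""}), do: []
  def flush(%{mode: mode, buffer: buf}), do: [{mode, buf}]

  defp scan(%{mode: :text, buffer: buf} = state, acc) do
    case String.split(buf, @open, parts: 2) do
      [before, rest] ->
        scan(%{state | mode: :thinking, buffer: rest}, emit(acc, :text, before))

      [_no_tag] ->
        {safe, keep} = hold_partial(buf, @open)
        {Enum.reverse(emit(acc, :text, safe)), %{state | buffer: keep}}
    end
  end

  defp scan(%{mode: :thinking, buffer: buf} = state, acc) do
    case String.split(buf, @close, parts: 2) do
      [thought, rest] ->
        scan(%{state | mode: :text, buffer: rest}, emit(acc, :thinking, thought))

      [_no_tag] ->
        {safe, keep} = hold_partial(buf, @close)
        {Enum.reverse(emit(acc, :thinking, safe)), %{state | buffer: keep}}
    end
  end

  defp emit(acc, _kind, ""), do: acc
  defp emit(acc, kind, text), do: [{kind, text} | acc]

  # Hold back any buffer suffix that could be the start of a tag
  # split across chunk boundaries (e.g. "...<thi" + "nk>...").
  defp hold_partial(buf, tag) do
    limit = min(String.length(tag) - 1, String.length(buf))

    keep =
      Enum.find(limit..1//-1, 0, fn n ->
        String.slice(buf, -n, n) == String.slice(tag, 0, n)
      end)

    {String.slice(buf, 0, String.length(buf) - keep), String.slice(buf, -keep, keep)}
  end
end
```

A tag arriving in two chunks is handled by holding the partial prefix: feeding `"<thi"` emits nothing, and the following `"nk>plan</think>Answer"` then yields `[thinking: "plan", text: "Answer"]`.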

default_base_url()

default_env_key()

Callback implementation for ReqLLM.Provider.default_env_key/0.

default_provider_opts()

encode_body(request)

Custom body encoding that adds Groq-specific extensions to the default OpenAI-compatible format.

Adds support for:

  • service_tier (auto, on_demand, flex, performance)
  • reasoning_effort (none, default, low, medium, high)
  • reasoning_format
  • search_settings
  • compound_custom
  • logit_bias (in addition to standard options)
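The merged request body might look roughly like this before JSON serialization (all values are illustrative examples, not defaults; field names follow the OpenAI-compatible wire format):

```elixir
# Illustrative encoded body with Groq extensions merged into the
# standard OpenAI-compatible fields. Values are examples only.
%{
  "model" => "llama-3.3-70b-versatile",
  "messages" => [%{"role" => "user", "content" => "Hello"}],
  "service_tier" => "flex",
  "reasoning_effort" => "low",
  "logit_bias" => %{"1234" => -100}
}
```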

extract_usage(body, model)

Default implementation of extract_usage/2.

Extracts usage data from standard usage field in response body.

flush_stream_state(model, state)

Flushes any remaining buffered content when the stream ends.

Emits a final thinking or text chunk if the buffer is non-empty.

init_stream_state(model)

Initializes streaming state for <think> tag normalization.

Returns the initial state with :text mode and an empty buffer.

metadata()

prepare_request(operation, model_spec, input, opts)

Custom prepare_request for :object operations to maintain Groq-specific max_tokens handling.

Ensures that structured output requests have adequate token limits while delegating other operations to the default implementation.

provider_extended_generation_schema()

provider_id()

provider_schema()

supported_provider_options()

translate_options(operation, model, opts)

Default implementation of translate_options/3.

Pass-through implementation that returns options unchanged.