Groq Provider Guide


Groq provides ultra-fast LLM inference on its custom LPU hardware, delivering exceptional performance for real-time applications.

Configuration

Set your Groq API key:

# Add to .env file (automatically loaded)
GROQ_API_KEY=gsk_...

Or use in-memory storage:

ReqLLM.put_key(:groq_api_key, "gsk_...")

Supported Models

Popular Groq models include:

  • llama-3.3-70b-versatile - Latest Llama 3.3
  • llama-3.1-8b-instant - Fast, efficient
  • mixtral-8x7b-32768 - Large context window
  • gemma2-9b-it - Google's Gemma 2

See the full list with mix req_llm.model_sync groq.

Basic Usage

# Simple text generation
{:ok, response} = ReqLLM.generate_text(
  "groq:llama-3.3-70b-versatile",
  "Explain async programming"
)

# Streaming (ultra-fast with Groq hardware)
{:ok, stream_response} = ReqLLM.stream_text(
  "groq:llama-3.1-8b-instant",
  "Write a story"
)

ReqLLM.StreamResponse.tokens(stream_response)
|> Stream.each(&IO.write/1)
|> Stream.run()

Provider-Specific Options

Service Tier

Control the performance tier used for requests:

{:ok, response} = ReqLLM.generate_text(
  "groq:llama-3.3-70b-versatile",
  "Hello",
  provider_options: [service_tier: "performance"]
)

Tiers:

  • "auto" - Automatic selection (default)
  • "on_demand" - Standard on-demand
  • "flex" - Flexible pricing
  • "performance" - Highest performance

Reasoning Effort

Control the reasoning level for compatible models:

{:ok, response} = ReqLLM.generate_text(
  "groq:deepseek-r1-distill-llama-70b",
  "Complex problem",
  provider_options: [reasoning_effort: "high"]
)

Levels: "none", "default", "low", "medium", "high"

Reasoning Format

Specify the format for reasoning output:

{:ok, response} = ReqLLM.generate_text(
  "groq:deepseek-r1-distill-llama-70b",
  "Problem to solve",
  provider_options: [reasoning_format: "detailed"]
)

Search Settings

Enable web search capabilities:

{:ok, response} = ReqLLM.generate_text(
  "groq:llama-3.3-70b-versatile",
  "Latest tech news",
  provider_options: [
    search_settings: %{
      include_domains: ["techcrunch.com", "arstechnica.com"],
      exclude_domains: ["spam.com"]
    }
  ]
)

Compound Custom

Custom configuration for Compound systems:

{:ok, response} = ReqLLM.generate_text(
  "groq:model",
  "Text",
  provider_options: [
    compound_custom: %{
      # Compound-specific settings
    }
  ]
)

Complete Example

import ReqLLM.Context

context = Context.new([
  system("You are a fast, helpful coding assistant"),
  user("Explain tail call optimization")
])

{:ok, response} = ReqLLM.generate_text(
  "groq:llama-3.3-70b-versatile",
  context,
  temperature: 0.7,
  max_tokens: 1000,
  provider_options: [
    service_tier: "performance",
    search_settings: %{
      include_domains: ["developer.mozilla.org", "stackoverflow.com"]
    }
  ]
)

text = ReqLLM.Response.text(response)
usage = response.usage

IO.puts(text)
IO.puts("Tokens: #{usage.total_tokens}, Cost: $#{usage.total_cost}")

Tool Calling

Groq supports function calling on compatible models:

weather_tool = ReqLLM.tool(
  name: "get_weather",
  description: "Get weather for a location",
  parameter_schema: [
    location: [type: :string, required: true]
  ],
  callback: {WeatherAPI, :fetch}
)

{:ok, response} = ReqLLM.generate_text(
  "groq:llama-3.3-70b-versatile",
  "What's the weather in Berlin?",
  tools: [weather_tool]
)
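
The callback: {WeatherAPI, :fetch} above refers to a module you provide. The sketch below is only an illustration of what that module could look like; the argument shape and return contract assumed here are not confirmed by this guide, so check the ReqLLM tool-calling documentation for the exact callback signature.

defmodule WeatherAPI do
  # Hypothetical callback module. Assumes the tool callback receives the
  # decoded arguments as a map and returns {:ok, result}; verify this shape
  # against the ReqLLM tool documentation before relying on it.
  def fetch(%{"location" => location}) do
    # A real implementation would call a weather service here.
    {:ok, "It is 18°C and partly cloudy in #{location}."}
  end
end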

Structured Output

Groq supports structured output generation:

schema = [
  name: [type: :string, required: true],
  age: [type: :integer, required: true],
  skills: [type: {:list, :string}]
]

{:ok, response} = ReqLLM.generate_object(
  "groq:llama-3.3-70b-versatile",
  "Generate a software engineer profile",
  schema
)

person = ReqLLM.Response.object(response)

Performance Tips

  1. Use Streaming: Groq's hardware excels at streaming - tokens start arriving almost instantly
  2. Choose the Right Model: use llama-3.1-8b-instant for speed, llama-3.3-70b-versatile for quality
  3. Service Tier: use the "performance" tier for the lowest latency
  4. Batch Requests: Groq handles concurrent requests efficiently - see the concurrency sketch after this list
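
As a rough sketch of concurrent usage, the example below fans out a few independent prompts with Task.async_stream/3. The prompts, concurrency limit, and timeout are arbitrary choices for illustration; only the ReqLLM.generate_text/2 and ReqLLM.Response.text/1 calls come from this guide.

prompts = [
  "Summarize GenServers in one sentence",
  "Summarize Supervisors in one sentence",
  "Summarize Tasks in one sentence"
]

# Fan the prompts out as concurrent requests.
results =
  prompts
  |> Task.async_stream(
    fn prompt -> ReqLLM.generate_text("groq:llama-3.1-8b-instant", prompt) end,
    max_concurrency: 3,
    timeout: 30_000
  )
  |> Enum.map(fn
    {:ok, {:ok, response}} -> ReqLLM.Response.text(response)
    {:ok, {:error, error}} -> "error: #{error.message}"
  end)

Enum.each(results, &IO.puts/1)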

Streaming Performance

Groq's custom hardware provides exceptional streaming performance:

{:ok, stream_response} = ReqLLM.stream_text(
  "groq:llama-3.1-8b-instant",
  "Count from 1 to 100"
)

# You'll see tokens appearing almost instantly
stream_response
|> ReqLLM.StreamResponse.tokens()
|> Stream.each(&IO.write/1)
|> Stream.run()
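
If you want to quantify this, a small sketch like the one below (using only the streaming API shown above plus System.monotonic_time/1 from the standard library) reports the time to the first token:

started = System.monotonic_time(:millisecond)

{:ok, stream_response} =
  ReqLLM.stream_text("groq:llama-3.1-8b-instant", "Count from 1 to 100")

# Print tokens as they arrive and report how long the first one took.
stream_response
|> ReqLLM.StreamResponse.tokens()
|> Stream.with_index()
|> Stream.each(fn {token, index} ->
  if index == 0 do
    elapsed = System.monotonic_time(:millisecond) - started
    IO.puts("first token after #{elapsed} ms")
  end

  IO.write(token)
end)
|> Stream.run()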

Error Handling

case ReqLLM.generate_text("groq:llama-3.3-70b-versatile", "Hello") do
  {:ok, response} -> 
    handle_success(response)
    
  {:error, error} -> 
    IO.puts("Error: #{error.message}")
end
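
For transient failures such as rate limits or timeouts, you can layer a simple retry on top. This is a minimal sketch that assumes only the {:ok, response} / {:error, error} shape shown above; real code should inspect the error before retrying so that non-recoverable errors (for example an invalid API key) fail fast.

defmodule GroqRetry do
  # Retry a generation a few times with a short fixed backoff.
  def generate_with_retry(model, prompt, attempts \\ 3) do
    case ReqLLM.generate_text(model, prompt) do
      {:ok, response} ->
        {:ok, response}

      {:error, _error} when attempts > 1 ->
        Process.sleep(500)
        generate_with_retry(model, prompt, attempts - 1)

      {:error, error} ->
        {:error, error}
    end
  end
end

{:ok, response} = GroqRetry.generate_with_retry("groq:llama-3.3-70b-versatile", "Hello")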

Key Advantages

  1. Speed: Custom LPU hardware for ultra-fast inference
  2. Cost: Competitive pricing for high performance
  3. Reliability: Enterprise-grade infrastructure
  4. Compatibility: OpenAI-compatible API

Resources