Smart Router
View SourceThe Router provides intelligent multi-provider orchestration with automatic failover, load balancing, and task-based routing.
Overview
The Router selects the best LLM provider for each request based on configurable strategies, handles failures gracefully, and tracks provider health.
Creating a Router
alias Rag.Router
# With specific providers
{:ok, router} = Router.new(providers: [:gemini, :claude, :codex])
# Auto-detect available providers
{:ok, router} = Router.new(auto_detect: true)
# With specific strategy
{:ok, router} = Router.new(
providers: [:gemini, :claude],
strategy: :fallback
)Auto-Strategy Selection
When strategy: :auto or not specified:
| Provider Count | Strategy | Reason |
|---|---|---|
| 3+ providers | :specialist | Task-based routing |
| 2 providers | :fallback | Reliability via retry |
| 1 provider | :fallback | Passthrough |
Core API
Execute Request
# Text generation
{:ok, response, router} = Router.execute(router, :text, "Hello", [])
# With options
{:ok, response, router} = Router.execute(router, :text, "Hello",
system_prompt: "You are helpful.",
temperature: 0.7
)
# Embeddings
{:ok, embeddings, router} = Router.execute(router, :embeddings, ["text1", "text2"], [])
# Streaming
{:ok, stream, router} = Router.execute(router, :text, "Count to 10", stream: true)
Enum.each(stream, &IO.write/1)Route Only (No Execution)
# Get selected provider without executing
{:ok, provider, router} = Router.route(router, :text, "Hello", [])
# provider is :gemini, :claude, or :codexQuery Providers
# Available providers
Router.available_providers(router) # [:gemini, :claude]
# Get provider capabilities
{:ok, caps} = Router.get_provider(router, :gemini)
# %{embeddings: true, streaming: true, max_context: 1_000_000, ...}Routing Strategies
Fallback Strategy
Tries providers in order until one succeeds.
{:ok, router} = Router.new(
providers: [:gemini, :claude, :codex],
strategy: :fallback,
fallback_order: [:claude, :gemini, :codex], # Custom order
max_failures: 3, # Skip after 3 failures
failure_decay_ms: 60_000 # Reset after 60s
)Behavior:
- Try first provider in order
- On failure, mark and try next
- Skip providers with >=
max_failuresrecent failures - Automatically recover after
failure_decay_ms - Success resets failure count
Configuration:
fallback_order- Custom provider order (default: providers list order)max_failures- Failures before skipping (default: 3)failure_decay_ms- Time to reset failures (default: 60,000ms)
Round-Robin Strategy
Distributes load evenly across providers.
{:ok, router} = Router.new(
providers: [:gemini, :claude, :codex],
strategy: :round_robin,
weights: %{gemini: 3, codex: 2, claude: 1}, # Optional weighted
max_consecutive_failures: 3,
recovery_cooldown_ms: 30_000
)Behavior:
- Rotate through providers
- With weights:
{gemini: 2, codex: 1}produces gemini, gemini, codex, gemini, gemini, codex... - Skip unavailable providers
- Mark unavailable after consecutive failures
- Recover after cooldown period
Configuration:
weights- Provider weights (default: equal)max_consecutive_failures- Before marking unavailable (default: 3)recovery_cooldown_ms- Recovery time (default: 30,000ms)
Specialist Strategy
Routes based on task type to best-suited provider.
{:ok, router} = Router.new(
providers: [:gemini, :claude, :codex],
strategy: :specialist,
task_mappings: %{
embeddings: :gemini,
code_generation: :codex,
analysis: :claude
},
fallback_provider: :gemini,
max_failures: 3
)Default Task Mappings:
| Task | Provider | Reason |
|---|---|---|
:embeddings | Gemini | Embedding support |
:long_context | Gemini | 1M token window |
:multimodal | Gemini | Image/audio |
:cost | Gemini | Cheapest |
:speed | Gemini | Fastest |
:code_generation | Codex | Code optimized |
:code_review | Codex | Code understanding |
:structured_output | Codex | JSON generation |
:analysis | Claude | Deep reasoning |
:writing | Claude | Best prose |
:agentic | Claude | Multi-step |
:reasoning | Claude | Complex logic |
:safety | Claude | Safety focus |
Task Inference:
- Explicit:
Router.execute(router, :text, prompt, task: :code_generation) - Automatic: Infers from prompt keywords
- Code keywords: "write", "implement", "function", "code", "class"
- Analysis keywords: "analyze", "explain", "review", "compare"
Configuration:
task_mappings- Task to provider mappingfallback_provider- Default if preferred unavailablemax_failures- Before marking unavailable
Error Handling
case Router.execute(router, :text, "Hello", []) do
{:ok, response, updated_router} ->
# Success - use updated_router for subsequent calls
IO.puts(response)
{:error, :all_providers_failed} ->
# All providers failed
IO.puts("No providers available")
{:error, reason} ->
# Other error
IO.puts("Error: #{inspect(reason)}")
endManual Result Reporting
# Report success/failure for custom execution
router = Router.report_result(router, :gemini, {:ok, "response"})
router = Router.report_result(router, :gemini, {:error, :timeout})
# Get next provider after failure
{:ok, next_provider, router} = Router.next_provider(router, :gemini)Execution Flow
User Request
|
v
Router.execute()
|
v
Strategy.select_provider()
|
v
Get/Create Provider Instance
|
v
Provider.generate_text() or .generate_embeddings()
|
+---> Success: return {:ok, result, router}
|
+---> Failure: report_result() -> next_provider() -> retry
|
+--> All exhausted: {:error, :all_providers_failed}Health Tracking
Each strategy tracks provider health differently:
Fallback
- Counts consecutive failures per provider
- Skips provider when count >=
max_failures - Resets count after
failure_decay_msor on success
Round-Robin
- Counts consecutive failures per provider
- Marks unavailable when count >=
max_consecutive_failures - Recovers after
recovery_cooldown_ms - Resets count on success
Specialist
- Counts total failures per provider
- Marks unavailable when count >=
max_failures - Falls back to fallback_provider
- Resets count on success
Provider Instance Caching
The router caches provider instances:
# First use creates instance
{:ok, response, router} = Router.execute(router, :text, "Hello", [])
# Subsequent calls reuse cached instance
{:ok, response, router} = Router.execute(router, :text, "World", [])Best Practices
- Use fallback for reliability - When uptime is critical
- Use round-robin for load balancing - When distributing load matters
- Use specialist for optimization - When matching task to provider matters
- Always use updated router - Router state changes after each call
- Handle all_providers_failed - Have a fallback plan
- Configure timeouts - Prevent hanging on slow providers
Example: Complete Setup
alias Rag.Router
# Configure with all strategies
{:ok, router} = Router.new(
providers: [:gemini, :claude, :codex],
strategy: :specialist,
task_mappings: %{
embeddings: :gemini,
code_generation: :codex,
analysis: :claude,
general: :gemini
},
fallback_provider: :gemini,
max_failures: 3
)
# Embeddings go to Gemini
{:ok, embeddings, router} = Router.execute(router, :embeddings, ["text"], [])
# Code tasks go to Codex
{:ok, code, router} = Router.execute(router, :text,
"Write a fibonacci function",
task: :code_generation
)
# Analysis goes to Claude
{:ok, analysis, router} = Router.execute(router, :text,
"Analyze this architecture decision",
task: :analysis
)Next Steps
- LLM Providers - Learn about each provider
- Embeddings - Embedding generation service