ReqLLM.Providers.Azure
(ReqLLM v1.6.0)
Azure AI provider implementation.
Supports Azure's AI services for accessing models from multiple families:
OpenAI Models
- GPT-4o, GPT-4, GPT-3.5 Turbo
- Reasoning models (o1, o3 series)
- Text embedding models
Anthropic Claude Models
- Claude 3 Opus, Sonnet, Haiku
- Claude 3.5 Sonnet
- Extended thinking/reasoning support
Capabilities
- Text generation (chat completions / messages)
- Streaming responses with usage tracking
- Tool calling (function calling)
- Embeddings generation (OpenAI models only)
- Multi-modal inputs (text and images)
- Structured output generation
- Extended thinking (Claude models)
Key Differences from Direct Provider APIs
- Custom endpoints: Each Azure resource has a unique base URL. Azure supports two endpoint formats, auto-detected from the domain (full URLs are illustrated below):
  - Azure OpenAI Service (.cognitiveservices.azure.com or .openai.azure.com):
    URL: /deployments/{deployment}/chat/completions?api-version={version}
    The model is determined by the deployment name in the URL path; no model field is sent in the body.
  - Azure AI Foundry (.services.ai.azure.com):
    URL: /models/chat/completions?api-version={version}
    The model is specified in the request body (the deployment name is used).
- API key authentication: Uses the api-key header for all model families.
- Bearer token authentication: Prefix api_key with "Bearer " to use the Authorization: Bearer header.
- Deployment names: The deployment name is used either in the URL path (traditional format) or in the request body (Foundry format).
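For illustration, with deployment "my-deployment" and an assumed api-version of "2024-10-21" (supported versions vary; check your resource), the two formats produce request URLs like:
# Azure OpenAI Service, base_url: https://my-resource.openai.azure.com/openai
https://my-resource.openai.azure.com/openai/deployments/my-deployment/chat/completions?api-version=2024-10-21
# Azure AI Foundry, base_url: https://my-resource.services.ai.azure.com
https://my-resource.services.ai.azure.com/models/chat/completions?api-version=2024-10-21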
Authentication
Environment variables are resolved by model family:
# For OpenAI models (GPT, o1, o3, etc.)
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_BASE_URL=https://your-openai-resource.openai.azure.com/openai
# For Anthropic models (Claude)
export AZURE_ANTHROPIC_API_KEY=your-api-key
export AZURE_ANTHROPIC_BASE_URL=https://your-anthropic-resource.openai.azure.com/openai
# Universal fallbacks (if all models share the same Azure resource)
export AZURE_API_KEY=your-api-key
export AZURE_BASE_URL=https://your-resource.openai.azure.com/openai
# Or pass directly in options (Azure OpenAI Service format)
ReqLLM.generate_text(
"azure:gpt-4o",
"Hello!",
api_key: "your-api-key",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-gpt4-deployment"
)
# Using Bearer token authentication (e.g., Entra ID / Azure AD tokens)
ReqLLM.generate_text(
"azure:gpt-4o",
"Hello!",
api_key: "Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-gpt4-deployment"
)
# Azure AI Foundry format (auto-detected from domain)
ReqLLM.generate_text(
"azure:deepseek-v3",
"Hello!",
api_key: "your-api-key",
base_url: "https://my-resource.services.ai.azure.com",
deployment: "deepseek-v3"
)
Examples
# Basic usage
{:ok, response} = ReqLLM.generate_text(
"azure:gpt-4o",
"What is Elixir?",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-gpt4-deployment"
)
# Streaming
{:ok, response} = ReqLLM.stream_text(
"azure:gpt-4o",
"Tell me a story",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-gpt4-deployment"
)
# With tools
tools = [%ReqLLM.Tool{name: "get_weather", ...}]
{:ok, response} = ReqLLM.generate_text(
"azure:gpt-4o",
"What's the weather?",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-gpt4-deployment",
tools: tools
)
# Embeddings
{:ok, embedding} = ReqLLM.generate_embedding(
"azure:text-embedding-3-small",
"Hello world",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-embedding-deployment"
)
# OpenAI reasoning models (o1, o3, o4-mini)
{:ok, response} = ReqLLM.generate_text(
"azure:o1",
"Solve this complex math problem step by step...",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-o1-deployment",
max_tokens: 8000,
provider_options: [reasoning_effort: "high"]
)
# Claude with extended thinking
{:ok, response} = ReqLLM.generate_text(
"azure:claude-3-5-sonnet-20241022",
"Analyze this complex problem...",
base_url: "https://my-resource.openai.azure.com/openai",
deployment: "my-claude-deployment",
thinking: %{type: "enabled", budget_tokens: 10000},
max_tokens: 4096
)
Deployment Configuration
Azure uses deployment names to route requests to specific model instances. If no deployment is specified, the model ID is used as a default (with a warning); see the example after the steps below.
To find your deployment name:
- Go to Azure OpenAI Studio (https://oai.azure.com/)
- Navigate to "Deployments"
- Copy the deployment name (e.g., "gpt-4o-prod", "claude-sonnet")
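For example, a minimal sketch assuming a deployment named "gpt-4o-prod" exists on the resource:
# Explicit deployment name (recommended)
{:ok, response} = ReqLLM.generate_text(
  "azure:gpt-4o",
  "Hello!",
  base_url: "https://my-resource.openai.azure.com/openai",
  deployment: "gpt-4o-prod"
)

# Without :deployment, the model ID ("gpt-4o") is used as the
# deployment name and a warning is logged.
{:ok, response} = ReqLLM.generate_text(
  "azure:gpt-4o",
  "Hello!",
  base_url: "https://my-resource.openai.azure.com/openai"
)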
Error Handling
Common error scenarios (a handling sketch follows the list):
- Missing API key: Set AZURE_API_KEY (or family-specific: AZURE_OPENAI_API_KEY, AZURE_ANTHROPIC_API_KEY)
- Missing base URL: Set AZURE_BASE_URL (or family-specific: AZURE_OPENAI_BASE_URL, AZURE_ANTHROPIC_BASE_URL)
- Invalid deployment: Ensure the deployment name matches your Azure resource
- Unsupported API version: Check Azure documentation for supported versions
Extending for New Model Families
Azure hosts multiple model families (OpenAI GPT, Anthropic Claude). To add support for a new model family:
- Create a formatter module under ReqLLM.Providers.Azure.* (see Azure.OpenAI or Azure.Anthropic as examples, and the skeleton below)
- Add the model prefix to the @model_families map in this module
- Handle any family-specific endpoint paths in get_chat_endpoint_path/3
- Add family-specific headers in get_anthropic_headers/2 if needed
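A hypothetical skeleton of those steps; the module name, family prefix, and callback body are placeholders, so mirror the functions actually exported by Azure.OpenAI / Azure.Anthropic:
# Placeholder formatter for an imagined "mistral" family. parse_response/3
# mirrors Azure.OpenAI.parse_response/3 / Azure.Anthropic.parse_response/3,
# which decode_response/1 routes to by model family.
defmodule ReqLLM.Providers.Azure.Mistral do
  def parse_response(_model, _body, _opts) do
    # Translate the family's response JSON into ReqLLM's response shape here.
    {:error, :not_implemented}
  end
end

# Then register the model prefix in this module (illustrative shape only):
#
#   @model_families %{
#     "gpt" => ReqLLM.Providers.Azure.OpenAI,
#     "claude" => ReqLLM.Providers.Azure.Anthropic,
#     "mistral" => ReqLLM.Providers.Azure.Mistral
#   }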
Summary
Functions
Attaches Azure-specific authentication and pipeline steps to a request.
Builds a Finch request for streaming responses.
Default implementation of build_body/1.
Checks if an error indicates missing Azure credentials.
Decodes Azure API responses using the appropriate model-family formatter.
Decodes Server-Sent Events for streaming responses.
Callback implementation for ReqLLM.Provider.default_env_key/0.
Pass-through encoding - body is pre-encoded by formatters in prepare_request.
Extracts usage/token information from API responses.
Pre-validates and transforms options before request building.
Prepares a request for Azure AI services.
Returns thinking constraints for extended thinking support.
Translates ReqLLM options to provider-specific format.
Functions
Attaches Azure-specific authentication and pipeline steps to a request.
Authentication is determined by the api_key format:
- If api_key starts with "Bearer ", uses the Authorization: Bearer header
- Otherwise, uses the api-key header for OpenAI models and the x-api-key header for Claude
Also adds model-family specific headers (e.g., anthropic-version for Claude models).
Builds a Finch request for streaming responses.
Constructs the appropriate endpoint URL based on model family and adds
Azure-specific headers (api-key, anthropic-version for Claude).
Default implementation of build_body/1.
Builds request body using OpenAI-compatible format for chat and embedding operations.
Checks if an error indicates missing Azure credentials.
Returns true if the error message mentions AZURE_OPENAI_API_KEY or api_key.
Decodes Azure API responses using the appropriate model-family formatter.
Routes to Azure.OpenAI.parse_response/3 or Azure.Anthropic.parse_response/3
based on the model. Handles both successful responses and error extraction.
Decodes Server-Sent Events for streaming responses.
Delegates to the appropriate model-family formatter for SSE parsing.
Callback implementation for ReqLLM.Provider.default_env_key/0.
Pass-through encoding - body is pre-encoded by formatters in prepare_request.
This follows the same pattern as Amazon Bedrock where the model-family-specific formatter handles body encoding during request preparation.
Extracts usage/token information from API responses.
Delegates to the model-family formatter for provider-specific usage extraction.
Pre-validates and transforms options before request building.
Delegates to the model-specific formatter (Azure.OpenAI or Azure.Anthropic). This handles model-specific requirements like reasoning parameter translation.
Note: This is not yet a formal Provider callback but is called by Options.process/4 if the provider exports it.
Prepares a request for Azure AI services.
Routes to the appropriate formatter (OpenAI or Anthropic) based on model family.
Operations
- :chat - Text generation via chat completions or messages endpoint
- :object - Structured output generation (uses tools for OpenAI, native for Claude); see the sketch below
- :embedding - Vector embeddings (OpenAI embedding models only)
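A hedged sketch of the :object operation, assuming ReqLLM's keyword-schema convention for generate_object; the resource, deployment, and schema values are placeholders:
schema = [
  name: [type: :string, required: true],
  age: [type: :pos_integer]
]

{:ok, response} = ReqLLM.generate_object(
  "azure:gpt-4o",
  "Generate a fictional person",
  schema,
  base_url: "https://my-resource.openai.azure.com/openai",
  deployment: "my-gpt4-deployment"
)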
Returns thinking constraints for extended thinking support.
Azure hosts both OpenAI and Anthropic models with different constraints:
- Claude models require temperature=1.0 for extended thinking (enforced in Azure.Anthropic.pre_validate_options/3)
- OpenAI reasoning models (o1, o3, o4) use the reasoning_effort parameter, not the extended thinking protocol
Returns :none since there are no universal constraints that apply to all
Azure models. Model-family-specific constraints are enforced in the respective
formatter modules during pre_validate_options.
Translates ReqLLM options to provider-specific format.
Delegates to OpenAI.translate_options/3 for GPT models or
Anthropic.translate_options/3 for Claude models to handle
model-specific parameter requirements.