Google Vertex AI
Access Claude models through Google Cloud's Vertex AI platform. Supports all Claude 4.x models, including Opus, Sonnet, and Haiku, with full tool calling and reasoning support.
Configuration
Vertex AI uses Google Cloud OAuth2 authentication with service accounts.
Service Account (Recommended)
Environment Variables:
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_CLOUD_REGION="global"
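A missing variable is easier to diagnose before the first request than from an authentication error. The following is a minimal sketch, not part of ReqLLM: the module name `VertexEnvCheck` is hypothetical, and treating the first two variables as required assumes the service account flow with no per-request overrides.

```elixir
defmodule VertexEnvCheck do
  # Hypothetical helper, not part of ReqLLM.
  # Variables the service account flow falls back to when no
  # provider options are passed (assumption: both are needed).
  @required ["GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_CLOUD_PROJECT"]

  # Returns the names of required variables that are unset or empty.
  # `env` defaults to the process environment; pass a map to check other input.
  def missing(env \\ System.get_env()) do
    Enum.filter(@required, fn name -> Map.get(env, name) in [nil, ""] end)
  end

  # Region is optional; this guide documents "global" as the default.
  def region(env \\ System.get_env()) do
    Map.get(env, "GOOGLE_CLOUD_REGION", "global")
  end
end
```

Calling `VertexEnvCheck.missing()` at application start and logging the result is one way to surface configuration gaps early.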
Provider Options:
ReqLLM.generate_text(
  "google_vertex_anthropic:claude-sonnet-4-5@20250929",
  "Hello",
  provider_options: [
    service_account_json: "/path/to/service-account.json",
    project_id: "your-project-id",
    region: "global"
  ]
)
Provider Options
Passed via :provider_options keyword:
service_account_json
- Type: String (file path)
- Purpose: Path to Google Cloud service account JSON file
- Fallback: GOOGLE_APPLICATION_CREDENTIALS env var
- Example: provider_options: [service_account_json: "/path/to/credentials.json"]
access_token
- Type: String
- Purpose: Use an existing OAuth2 access token generated outside ReqLLM (e.g., via Goth or gcloud)
- Behavior: Bypasses the service account JSON flow and internal token management
- Example: provider_options: [access_token: "your-access-token"]
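As a rough illustration of the `access_token` path, the sketch below obtains a token from the `gcloud` CLI and passes it through. The module and function names are hypothetical, and it assumes `gcloud` is installed and authenticated. Because this option bypasses internal token management, refreshing an expired token is the caller's responsibility.

```elixir
defmodule VertexTokenExample do
  # Hypothetical helper, not part of ReqLLM.

  # Fetches a short-lived token from the gcloud CLI.
  # System.cmd/2 raises if the `gcloud` executable cannot be found.
  def gcloud_token do
    case System.cmd("gcloud", ["auth", "print-access-token"]) do
      {output, 0} -> {:ok, String.trim(output)}
      {output, status} -> {:error, {status, output}}
    end
  end

  # Builds the provider options for a token obtained outside ReqLLM.
  def provider_options(token, project_id) when is_binary(token) and token != "" do
    [access_token: token, project_id: project_id]
  end

  # Sends a request using the externally generated token.
  def hello(project_id) do
    with {:ok, token} <- gcloud_token() do
      ReqLLM.generate_text(
        "google_vertex_anthropic:claude-sonnet-4-5@20250929",
        "Hello",
        provider_options: provider_options(token, project_id)
      )
    end
  end
end
```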
project_id
- Type: String
- Purpose: Google Cloud project ID
- Fallback: GOOGLE_CLOUD_PROJECT env var
- Example: provider_options: [project_id: "my-project-123"]
- Required: Yes
region
- Type: String
- Default: "global"
- Purpose: GCP region for Vertex AI endpoint
- Example: provider_options: [region: "us-central1"]
- Note: Use "global" for newest models, specific regions for regional deployment
additional_model_request_fields
- Type: Map
- Purpose: Model-specific request fields (e.g., thinking configuration)
- Example:
provider_options: [
  additional_model_request_fields: %{
    thinking: %{type: "enabled", budget_tokens: 4096}
  }
]
Claude-Specific Options
Vertex AI supports the same Claude options as native Anthropic:
anthropic_top_k
- Type: 1..40
- Purpose: Sample from top K options per token
- Example: provider_options: [anthropic_top_k: 20]
stop_sequences
- Type: List of strings
- Purpose: Custom stop sequences
- Example: provider_options: [stop_sequences: ["END", "STOP"]]
anthropic_metadata
- Type: Map
- Purpose: Request metadata for tracking
- Example: provider_options: [anthropic_metadata: %{user_id: "123"}]
thinking
- Type: Map
- Purpose: Enable extended thinking/reasoning
- Example: provider_options: [thinking: %{type: "enabled", budget_tokens: 4096}]
- Access: ReqLLM.Response.thinking(response)
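Putting the `thinking` option and its accessor together might look like the sketch below. The module name is hypothetical, and the `{:ok, response}` / `{:error, reason}` return shape of `ReqLLM.generate_text/3` is an assumption rather than something this page states.

```elixir
defmodule ThinkingExample do
  # Hypothetical helper, not part of ReqLLM.
  @model "google_vertex_anthropic:claude-sonnet-4-5@20250929"

  # Builds request options that enable extended thinking with a token budget.
  def options(budget_tokens) when is_integer(budget_tokens) and budget_tokens > 0 do
    [provider_options: [thinking: %{type: "enabled", budget_tokens: budget_tokens}]]
  end

  # Sends the prompt and returns whatever ReqLLM.Response.thinking/1 yields.
  # Assumption: generate_text/3 returns {:ok, response} or {:error, reason}.
  def run(prompt, budget_tokens \\ 4096) do
    case ReqLLM.generate_text(@model, prompt, options(budget_tokens)) do
      {:ok, response} -> {:ok, ReqLLM.Response.thinking(response)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```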
anthropic_prompt_cache
- Type: Boolean
- Purpose: Enable prompt caching
- Example: provider_options: [anthropic_prompt_cache: true]
anthropic_prompt_cache_ttl
- Type: String (e.g., "1h")
- Purpose: Cache TTL (default ~5min if omitted)
- Example: provider_options: [anthropic_prompt_cache_ttl: "1h"]
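Several of these options are often set together. The sketch below is one possible way to assemble them into a single `:provider_options` list. `ClaudeOptions` and its input keys (`:top_k`, `:cache`, `:cache_ttl`) are hypothetical names chosen for illustration, and the range check simply mirrors the `1..40` type documented for `anthropic_top_k`.

```elixir
defmodule ClaudeOptions do
  # Hypothetical helper, not part of ReqLLM.
  # Builds a :provider_options keyword list from the Claude-specific
  # options documented above, dropping any that were not supplied.
  def build(opts) do
    top_k = Keyword.get(opts, :top_k)

    # Mirror the documented 1..40 range for anthropic_top_k.
    if top_k != nil and (top_k not in 1..40) do
      raise ArgumentError, "anthropic_top_k must be in 1..40, got: #{inspect(top_k)}"
    end

    [
      anthropic_top_k: top_k,
      stop_sequences: Keyword.get(opts, :stop_sequences),
      anthropic_prompt_cache: Keyword.get(opts, :cache),
      anthropic_prompt_cache_ttl: Keyword.get(opts, :cache_ttl)
    ]
    |> Enum.reject(fn {_key, value} -> is_nil(value) end)
  end
end
```

The result can be passed directly as `provider_options: ClaudeOptions.build(top_k: 20, cache: true, cache_ttl: "1h")`.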
Supported Models
Claude 4.5 Family
Haiku 4.5: google_vertex_anthropic:claude-haiku-4-5@20251001
- Fast, cost-effective
- Full tool calling and reasoning support
Sonnet 4.5: google_vertex_anthropic:claude-sonnet-4-5@20250929
- Balanced performance and capability
- Extended thinking support
Opus 4.1: google_vertex_anthropic:claude-opus-4-1@20250805
- Highest capability
- Advanced reasoning
Claude 4.0 & Earlier
- Sonnet 4.0: google_vertex_anthropic:claude-sonnet-4@20250514
- Opus 4.0: google_vertex_anthropic:claude-opus-4@20250514
- Sonnet 3.7: google_vertex_anthropic:claude-3-7-sonnet@20250219
- Sonnet 3.5 v2: google_vertex_anthropic:claude-3-5-sonnet@20241022
- Haiku 3.5: google_vertex_anthropic:claude-3-5-haiku@20241022
Model ID Format
Vertex uses the @ symbol for versioning:
- Format: claude-{tier}-{version}@{date}
- Example: claude-sonnet-4-5@20250929
- Note: Older 3.x IDs place the version before the tier, e.g. claude-3-7-sonnet@20250219
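To make the format concrete, here is a small sketch that splits a full model spec into provider, model name, and version date. `VertexModelId` is a hypothetical helper, not a ReqLLM module, and the eight-digit date check is an assumption based on the IDs listed on this page.

```elixir
defmodule VertexModelId do
  # Hypothetical helper, not part of ReqLLM.
  # Splits "provider:name@date" into its three parts.
  def parse(spec) when is_binary(spec) do
    with [provider, model_id] <- String.split(spec, ":", parts: 2),
         [name, date] <- String.split(model_id, "@", parts: 2),
         # Assumption: the version suffix is always an 8-digit date.
         true <- String.match?(date, ~r/\A\d{8}\z/) do
      {:ok, %{provider: provider, name: name, date: date}}
    else
      _ -> {:error, :invalid_model_spec}
    end
  end
end
```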
Wire Format Notes
- Authentication: OAuth2 with service account tokens (auto-refreshed)
- Endpoint: Model-specific paths under aiplatform.googleapis.com
- API: Uses Anthropic's raw message format (compatible with native API)
- Streaming: Standard Server-Sent Events (SSE)
- Region routing: Global endpoint for newest models, regional for specific deployments
ReqLLM handles all of these differences automatically.
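For readers debugging at the HTTP level, the sketch below shows how a model-specific endpoint could be assembled. It follows Google's published URL scheme for Anthropic models on Vertex AI (`rawPredict` and `streamRawPredict` under the `anthropic` publisher). It is illustrative only: `VertexEndpoint` is a hypothetical module, and ReqLLM's internal URL construction may differ.

```elixir
defmodule VertexEndpoint do
  # Illustrative only; not part of ReqLLM, which builds its own URLs.
  def url(project_id, region, model, opts \\ []) do
    # The global location uses the bare host; regions prefix the host name.
    host =
      if region == "global",
        do: "aiplatform.googleapis.com",
        else: "#{region}-aiplatform.googleapis.com"

    # Streaming requests use a separate method that responds with SSE.
    method = if Keyword.get(opts, :stream, false), do: "streamRawPredict", else: "rawPredict"

    "https://#{host}/v1/projects/#{project_id}/locations/#{region}" <>
      "/publishers/anthropic/models/#{model}:#{method}"
  end
end
```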