Modules
Batches API for batch processing of content generation and embedding requests.
Context caching API for improved performance with long context.
Coordinates API calls across different authentication strategies and endpoints.
Documents API for RAG (Retrieval-Augmented Generation) document management.
File Search Stores API for semantic search and RAG (Retrieval-Augmented Generation).
Files API for uploading, managing, and using files with Gemini models.
API for image generation using Google's Imagen models.
Interactions API (experimental).
Complete Models API implementation following the unified architecture.
Operations API for managing long-running operations.
RAG Stores API for managing file search stores.
Token counting functionality for the Gemini API.
API module for model tuning (fine-tuning) operations.
API for video generation using Google's Veo models.
Authentication strategy behavior and implementations for Gemini and Vertex AI.
Application Default Credentials (ADC) for Google Cloud authentication.
Authentication strategy for the Google Gemini API using an API key.
JWT token generation and management for Google Cloud service accounts.
Authentication via GCP metadata server for workloads running on Google Cloud Platform.
Coordinates multiple authentication strategies for concurrent usage.
Behavior for authentication strategies.
ETS-based token caching with automatic expiration handling.
Authentication strategy for Google Vertex AI using OAuth2/Service Account.
Formalized chat session management with immutable history updates.
Main client module that delegates to the appropriate HTTP client implementation.
HTTP client for both Gemini and Vertex AI APIs using Req.
HTTP client for streaming Server-Sent Events (SSE) from the Gemini API.
WebSocket client for the Gemini Live API using :gun.
Unified configuration management for both Gemini and Vertex AI authentication.
Standardized error structure for the Gemini client.
Audio utilities for the Live API.
Creates ephemeral tokens for client-side Live API access.
Live API model selection helpers.
GenServer managing a Live API WebSocket session.
Canonical model registry with capability metadata.
Rate limiting, concurrency gating, and retry management for Gemini API requests.
Per-model concurrency gating using semaphore-like permits.
Configuration management for the rate limiter.
Central rate limiter manager that coordinates request submission.
Manages retry logic with backoff strategies.
ETS-based state management for rate limiting.
Server-Sent Events (SSE) parser for streaming responses.
GenServer responsible for managing a single, stateful, automatic tool-calling stream.
Unified streaming manager that supports multiple authentication strategies.
Top-level supervisor for the Gemini application.
Named task supervisor for Gemini background tasks.
Telemetry instrumentation helpers for the Gemini library.
High-level facade for tool registration and execution in the Gemini client.
Implements the Automatic Function Calling (AFC) loop for Gemini.
Configuration for automatic function calling.
Executes function calls from Gemini API responses against a registry of implementations.
Type definitions for batch processing jobs.
Binary data with MIME type for the Gemini API.
Metadata describing cached content usage.
Content type for the Gemini API requests and responses.
Configuration for creating a batch job.
Configuration for creating a new File Search Store.
Response type for file deletion.
Type definitions for RAG document management.
Comprehensive enumeration types for the Gemini API.
Image aspect ratios for image generation.
Reasons why content generation was blocked.
Outcome of code execution.
Dynamic retrieval configuration modes.
Supported languages for code execution.
Reasons why generation finished.
Function calling configuration modes.
Confidence levels for grounding attribution.
Threshold levels for blocking harmful content.
Categories of harmful content that can be filtered.
Probability levels of harmful content.
Output image sizes for image generation.
Task types for embedding generation.
Thinking configuration levels for Gemini 3 models.
Available voice names for text-to-speech.
Type definitions for file management operations.
URI-based file data reference used in parts and tool results.
Represents a document within a File Search Store.
Type definitions for File Search Stores (semantic search stores).
Result output of a function call.
Type definitions for image generation using Google's Imagen models.
Configuration for image editing operations.
Represents a generated image result.
Configuration for image generation requests.
Configuration for image upscaling operations.
Type definitions for video generation using Google's Veo models.
Represents a generated video result.
Configuration for video generation requests.
Reference image used to guide video generation.
Represents a video generation operation with progress tracking.
Configuration for content generation parameters.
Configuration for image generation in Gemini 3 Pro Image.
Configuration for thinking/reasoning in Gemini models.
Agent config union (DynamicAgentConfig | DeepResearchAgentConfig).
Allowed tools configuration ({mode, tools}).
Citation information for model-generated text.
An audio content block (type: "audio").
Audio mime types for Interactions content.
Cached token count for a response modality.
code_execution tool declaration.
Arguments for a code_execution_call content block.
Code execution call content block (type: "code_execution_call").
Code execution result content block (type: "code_execution_result").
computer_use tool declaration.
Union type for Interactions input/output content blocks.
Deep Research agent configuration (type: "deep-research").
Discriminated union for content.delta.delta payloads (18 variants).
Audio content delta for streaming responses.
Code execution call delta for streaming responses.
Code execution result delta for streaming responses.
Document content delta for streaming responses.
File search result delta for streaming responses.
Result type for file search result delta.
Function call delta for streaming responses.
Function result delta for streaming responses.
Result type for function result delta.
Items container for function result delta.
Item type for function result delta.
Google search call delta for streaming responses.
Google search result delta for streaming responses.
Image content delta for streaming responses.
MCP server tool call delta for streaming responses.
MCP server tool result delta for streaming responses.
Result type for MCP server tool result delta.
Items container for MCP server tool result delta.
Item type for MCP server tool result delta.
Text content delta for streaming responses.
Thought signature delta for streaming responses.
Thought summary delta for streaming responses.
Content type for thought summary delta.
URL context call delta for streaming responses.
URL context result delta for streaming responses.
Video content delta for streaming responses.
A document content block (type: "document").
Dynamic agent configuration (type: "dynamic").
Helpers for decoding Interactions SSE events.
Interactions SSE event: content.delta.
Interactions SSE event: content.start.
Interactions SSE event: content.stop.
Error payload inside an Interactions SSE error event.
Interactions SSE event: event_type: "error".
Interactions SSE event: interaction.start or interaction.complete.
Union type for Interactions SSE events (6 variants).
Interactions SSE event: interaction.status_update.
file_search tool declaration.
File Search call content block (type: "file_search_call").
An item inside file_search_result results.
File Search result content block (type: "file_search_result").
function tool declaration.
A function tool call content block (type: "function_call").
A function tool result content block (type: "function_result").
Interactions GenerationConfig (snake_case keys).
google_search tool declaration.
Arguments for a google_search_call content block.
Google Search call content block (type: "google_search_call").
A Google Search result item.
Google Search result content block (type: "google_search_result").
Configuration for image generation in Interactions.
An image content block (type: "image").
Image mime types for Interactions content.
Input union for Interactions create.
Input token count for a response modality.
Interactions Interaction resource.
mcp_server tool declaration.
MCP server tool call content block (type: "mcp_server_tool_call").
MCP server tool result content block (type: "mcp_server_tool_result").
Output token count for a response modality.
Speech config for Interactions generation (different from generateContent).
A text content block (type: "text").
Thinking level for Interactions generation ("minimal", "low", "medium", "high").
A thought content block (type: "thought").
Union type for Interactions tools.
Tool choice union (ToolChoiceType | ToolChoiceConfig).
Tool choice configuration.
Tool choice type ("auto" | "any" | "none" | "validated").
Tool-use token count for a response modality.
A conversation turn in the Interactions API.
url_context tool declaration.
Arguments for a url_context_call content block.
URL context call content block (type: "url_context_call").
URL context result item ({status, url}).
URL context result content block (type: "url_context_result").
Token usage statistics for an Interaction.
A video content block (type: "video").
Video mime types for Interactions content.
Response type for listing batch jobs.
Response type for listing documents in a RAG store.
Response type for listing file search stores.
Response type for listing files.
Response type for listing operations.
Response type for listing RAG stores.
Audio transcription configuration for Live API sessions.
Automatic activity detection configuration for Live API sessions.
Client content message for Live API sessions.
Context window compression configuration for Live API sessions.
Enumeration types for the Live API (WebSocket).
The different ways of handling user activity.
Determines how end of speech is detected.
Determines how start of speech is detected.
Options about which input is included in the user's turn.
Voice Activity Detection signal types.
Notice from the server that the connection will soon be terminated.
Grounding metadata for Live API responses.
Proactivity configuration for Live API sessions.
Realtime input for Live API sessions.
Realtime input configuration for Live API sessions.
Server content message for Live API sessions.
Server message wrapper for Live API responses.
Session resumption configuration for Live API sessions.
Session resumption state update from the server.
Session setup configuration for the Live API.
Setup complete message from the server.
Sliding window context compression configuration.
Tool call request from the server in Live API sessions.
Tool call cancellation notification from the server.
Tool response from the client in Live API sessions.
Transcription of audio (input or output) in Live API sessions.
Usage metadata for Live API responses.
Voice activity signal for Live API sessions.
Media resolution enum for controlling token allocation on media inputs.
Response modality types for multimodal generation.
Configuration for Model Armor integrations.
Configuration for multi-speaker voice synthesis.
Type definitions for long-running operations.
Part type for content in the Gemini API.
Media resolution settings for Gemini 3 vision processing.
Configuration for a prebuilt voice.
Type definitions for RAG stores (FileSearchStores).
Configuration for the register_files method.
Response from the register_files method.
Request structure for batch embedding multiple content items.
Request structure for counting tokens.
Async batch embedding job request.
Request structure for embedding content using Gemini embedding models.
Request structure for content generation.
Request structure for getting a specific model.
A single embedding request within an async batch, with optional metadata.
Container for multiple inlined embedding requests in a batch.
Input configuration for async batch embedding.
Request structure for listing models with pagination support.
Response types for the Gemini API.
Response structure for batch embedding requests.
Represents the state of an async batch embedding job.
Content candidate in response.
Citation metadata for generated content.
Citation source information.
A list of floats representing an embedding.
Response from counting tokens.
Complete async batch embedding job status and results.
Output of an async batch embedding job.
Statistics about an async embedding batch job.
Response structure for embedding content requests.
Response from content generation.
Grounding attribution information.
Grounding attribution source ID.
Grounding passage ID.
Response for a single request within an async batch.
Container for all responses in an inline batch.
Response structure for listing models.
Token counting information for a single modality.
Model information response structure.
Prompt feedback information.
Safety rating for content.
Semantic retriever chunk information.
Traffic type for API requests (billing classification).
Usage metadata for API calls.
Safety settings for content generation.
JSON Schema type for defining function parameters in Gemini tool calling.
Configuration for a single speaker in multi-speaker voice synthesis.
Speech generation configuration.
Pure data transformation utilities to serialize ALTAR ADM tool structures into the exact JSON maps expected by the Gemini API.
Types for the Tunings API (fine-tuning/model tuning).
Configuration for creating a new tuning job.
Hyperparameters for supervised tuning.
Response from listing tuning jobs with pagination support.
Specification for supervised tuning configuration.
Represents a tuning job with full status and configuration.
Error information for failed tuning jobs.
Configuration options for file upload.
Voice configuration for speech synthesis.
Shared helper functions for building maps with optional values.
Shared helper functions for polling operations.
Utilities for normalizing Google Cloud resource names for Gemini/Vertex AI.
Validation for thinking configuration parameters based on model capabilities.
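
To illustrate what the Server-Sent Events (SSE) parser entry in the list above involves, here is a minimal, language-neutral sketch of incremental SSE parsing, written in Python. It is not this library's implementation: the class name SSEParser, the feed method, and the shape of the returned dictionaries are hypothetical and chosen only for this example. The sketch assumes text chunks that may split an event at any point, and it deliberately ignores the retry field and bare carriage-return line endings.

```python
class SSEParser:
    """Hypothetical incremental SSE parser: feed text chunks, get completed events."""

    def __init__(self):
        # Holds text that has arrived but does not yet form a complete event.
        self._buffer = ""

    def feed(self, chunk):
        """Append a chunk and return a list of events completed by it."""
        # Normalize CRLF on the joined buffer so a "\r\n" split across
        # two chunks is still handled correctly.
        self._buffer = (self._buffer + chunk).replace("\r\n", "\n")
        events = []
        # A blank line (two consecutive newlines) terminates an event.
        while "\n\n" in self._buffer:
            raw, self._buffer = self._buffer.split("\n\n", 1)
            event = self._parse_event(raw)
            if event is not None:
                events.append(event)
        return events

    @staticmethod
    def _parse_event(raw):
        event_type = "message"  # default event type when no "event:" field is sent
        event_id = None
        data_lines = []
        for line in raw.split("\n"):
            # Empty lines carry nothing; lines starting with ":" are comments.
            if not line or line.startswith(":"):
                continue
            name, _, value = line.partition(":")
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            if name == "data":
                data_lines.append(value)
            elif name == "event":
                event_type = value
            elif name == "id":
                event_id = value
        # Events without any data field are not dispatched.
        if not data_lines:
            return None
        return {"event": event_type, "data": "\n".join(data_lines), "id": event_id}
```

Buffering until a blank line arrives is the key design choice: network chunks rarely align with event boundaries, so a parser that handles each chunk in isolation would emit truncated payloads. Multiple data lines within one event are joined with a newline, and comment-only blocks (often used as keep-alives) produce no event.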