Gemini.APIs.Images (GeminiEx v0.8.4)

API for image generation using Google's Imagen models.

Imagen is Google's family of image models. It can generate high-quality images from text descriptions, and edit or upscale existing images. This module provides a unified interface for all three operations.

Note: Image generation is currently only available through Vertex AI, not the Gemini API. You must configure Vertex AI credentials to use these functions.

Supported Models

  • imagegeneration@006 - Latest stable Imagen model (recommended)
  • imagen-3.0-generate-001 - Imagen 3.0 generation model

Capabilities

  • Text-to-Image: Generate images from text descriptions
  • Image Editing: Modify existing images with inpainting/outpainting
  • Image Upscaling: Enhance image resolution (2x or 4x)

Examples

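# The config structs must be aliased before use. The path below assumes they
# are nested under Gemini.Types.Generation.Image; adjust it if your version differs.
alias Gemini.Types.Generation.Image.{ImageGenerationConfig, EditImageConfig, UpscaleImageConfig}

# Generate an image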
{:ok, images} = Gemini.APIs.Images.generate(
  "A serene mountain landscape at sunset",
  %ImageGenerationConfig{
    number_of_images: 2,
    aspect_ratio: "16:9"
  }
)

# Edit an image
{:ok, edited} = Gemini.APIs.Images.edit(
  "Replace the sky with a starry night",
  image_base64,
  mask_base64,
  %EditImageConfig{edit_mode: :inpainting}
)

# Upscale an image (the result is a list, so match on its single element)
{:ok, [upscaled]} = Gemini.APIs.Images.upscale(
  image_base64,
  %UpscaleImageConfig{upscale_factor: :x2}
)
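
The examples above assume that `image_base64` and `mask_base64` already hold Base64-encoded image data. As a minimal sketch of how such a value might be produced from a file on disk, the helper below uses only the standard library (`File.read!/1`, `Base.encode64/1`, `Base.decode64!/1`). The `ImageEncoding` module and its function names are hypothetical and are not part of GeminiEx.

```elixir
defmodule ImageEncoding do
  @moduledoc "Hypothetical helper for illustration; not part of GeminiEx."

  # Reads the file at `path` and returns its bytes as a Base64 string.
  def encode_file!(path) do
    path
    |> File.read!()
    |> Base.encode64()
  end

  # Decodes a Base64 string and writes the raw bytes to `path`.
  # Returns the path that was written.
  def decode_to_file!(base64, path) do
    File.write!(path, Base.decode64!(base64))
    path
  end
end
```

With this in place, `image_base64 = ImageEncoding.encode_file!("photo.png")` would yield a string in the form the `edit` and `upscale` examples expect, where `"photo.png"` is a placeholder path.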

Configuration Options

See Gemini.Types.Generation.Image for all available configuration options.

Safety and Responsible AI

All generated images are subject to Google's safety filters and Responsible AI policies. You can adjust the filter level through the safety_filter_level field of ImageGenerationConfig, but some content is always blocked regardless of that setting.

Types

api_result(t)

@type api_result(t) :: {:ok, t} | {:error, term()}
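
Because every function in this module returns an `api_result(t)`, callers can branch on the two tuple shapes with ordinary pattern matching. The sketch below shows one way to do that; the `ResultHandling` module is hypothetical, and the shape of the error term is not specified by the type, so it is only passed through `inspect/1`.

```elixir
defmodule ResultHandling do
  @moduledoc "Hypothetical helper for illustration; not part of GeminiEx."

  # Returns the wrapped value on success, or `default` on error.
  def unwrap_or({:ok, value}, _default), do: value
  def unwrap_or({:error, _reason}, default), do: default

  # Builds a short human-readable summary of a result that wraps a list.
  def describe({:ok, images}) when is_list(images) do
    "received #{length(images)} image(s)"
  end

  def describe({:error, reason}) do
    "request failed: #{inspect(reason)}"
  end
end
```

For instance, `ResultHandling.unwrap_or(result, [])` lets downstream code treat a failed request as an empty list of images when that is an acceptable fallback.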

generation_opts()

@type generation_opts() :: [
  model: String.t(),
  project_id: String.t(),
  location: String.t()
]
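
`generation_opts()` is a plain keyword list, so callers can build it up with the `Keyword` module. The sketch below illustrates how caller-supplied options could be layered over the defaults documented for `generate/3` (`"imagegeneration@006"` and `"us-central1"`). It is an assumption about usage, not the library's actual resolution logic: the `OptsSketch` module is hypothetical, and the configuration-based default for `:project_id` is not modeled.

```elixir
defmodule OptsSketch do
  @moduledoc "Hypothetical helper for illustration; not part of GeminiEx."

  # Keys allowed by the generation_opts() type.
  @allowed [:model, :project_id, :location]

  # Defaults as documented for generate/3. :project_id is left out because
  # the documentation says it comes from application config.
  @defaults [model: "imagegeneration@006", location: "us-central1"]

  # Rejects unknown keys, then merges caller options over the defaults.
  def resolve(opts) when is_list(opts) do
    unknown = Keyword.keys(opts) -- @allowed

    if unknown != [] do
      raise ArgumentError, "unknown options: #{inspect(unknown)}"
    end

    Keyword.merge(@defaults, opts)
  end
end
```

Calling `OptsSketch.resolve(location: "europe-west4")` keeps the default model and overrides only the location.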

Functions

edit(prompt, image_data, mask_data \\ nil, config \\ %EditImageConfig{}, opts \\ [])

Edit an existing image using text prompts.

Supports inpainting (editing specific regions) and outpainting (extending the image).

Parameters

  • prompt - Text description of the desired edits
  • image_data - Base64-encoded source image
  • mask_data - Base64-encoded mask image (nil for auto-masking)
  • config - EditImageConfig struct (default: %EditImageConfig{})
  • opts - Additional options (same as generate/3)

Returns

  • {:ok, [GeneratedImage.t()]} - List of edited images
  • {:error, term()} - Error if editing fails

Examples

# Inpainting - edit specific region
{:ok, edited} = Gemini.APIs.Images.edit(
  "Replace the background with a beach scene",
  image_base64,
  mask_base64,
  %EditImageConfig{edit_mode: :inpainting}
)

# Outpainting - extend image
{:ok, extended} = Gemini.APIs.Images.edit(
  "Continue the landscape to the right",
  image_base64,
  mask_base64,
  %EditImageConfig{edit_mode: :outpainting}
)

generate(prompt, config \\ %ImageGenerationConfig{}, opts \\ [])

Generate images from a text prompt.

Parameters

  • prompt - Text description of the image to generate
  • config - ImageGenerationConfig struct with generation parameters (default: %ImageGenerationConfig{})
  • opts - Additional options:
    • :model - Model to use (default: "imagegeneration@006")
    • :project_id - Vertex AI project ID (default: from config)
    • :location - Vertex AI location (default: "us-central1")

Returns

  • {:ok, [GeneratedImage.t()]} - List of generated images
  • {:error, term()} - Error if generation fails

Examples

# Simple generation
{:ok, images} = Gemini.APIs.Images.generate(
  "A cat playing piano"
)

# With configuration
config = %ImageGenerationConfig{
  number_of_images: 4,
  aspect_ratio: "1:1",
  safety_filter_level: :block_some,
  person_generation: :allow_adult
}
{:ok, images} = Gemini.APIs.Images.generate(
  "Professional headshot photo",
  config
)

# Custom model and location (reuses `config` from the previous example)
{:ok, images} = Gemini.APIs.Images.generate(
  "Futuristic cityscape",
  config,
  model: "imagen-3.0-generate-001",
  location: "europe-west4"
)

upscale(image_data, config \\ %UpscaleImageConfig{}, opts \\ [])

Upscale an image to higher resolution.

Parameters

  • image_data - Base64-encoded source image
  • config - UpscaleImageConfig struct (default: %UpscaleImageConfig{})
  • opts - Additional options (same as generate/3)

Returns

  • {:ok, [GeneratedImage.t()]} - List containing the upscaled image
  • {:error, term()} - Error if upscaling fails

Examples

# 2x upscale
{:ok, [upscaled]} = Gemini.APIs.Images.upscale(
  image_base64,
  %UpscaleImageConfig{upscale_factor: :x2}
)

# 4x upscale with JPEG output
{:ok, [upscaled]} = Gemini.APIs.Images.upscale(
  image_base64,
  %UpscaleImageConfig{
    upscale_factor: :x4,
    output_mime_type: "image/jpeg",
    output_compression_quality: 90
  }
)
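
To estimate the output size before upscaling, the factor can be applied to the source dimensions. The sketch below assumes that `:x2` and `:x4` scale each side of the image by 2 and 4 respectively; that interpretation, and the `UpscaleMath` module itself, are assumptions and are not confirmed by this documentation.

```elixir
defmodule UpscaleMath do
  @moduledoc "Hypothetical helper for illustration; not part of GeminiEx."

  # Maps the documented factor atoms to integer multipliers.
  def multiplier(:x2), do: 2
  def multiplier(:x4), do: 4

  # Returns {width, height} after upscaling, assuming the factor is
  # applied to each side independently.
  def target_size({width, height}, factor) do
    m = multiplier(factor)
    {width * m, height * m}
  end
end
```

Under that assumption, `UpscaleMath.target_size({1024, 768}, :x4)` returns `{4096, 3072}`.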