WeaviateEx

A modern, idiomatic Elixir client for the Weaviate vector database (v1.28+) with full Python client feature parity.

Features

Core Capabilities

  • Complete API Coverage - Collections, objects, batch operations, queries, aggregations, cross-references, tenants
  • RBAC & User Management - Full role-based access control, user lifecycle management, OIDC groups
  • Hybrid Protocol Architecture - gRPC for high-performance data operations, HTTP for schema management
  • Type-Safe - Protocol-based architecture with comprehensive typespecs
  • Test-First Design - 2600+ tests with Mox-based mocking for fast, isolated testing
  • Production-Ready - gRPC persistent channels, Finch HTTP pooling, proper error handling, health checks
  • Easy Setup - First-class Mix tasks for managing local Weaviate stacks

Generative AI (RAG) - 20+ Providers

  • OpenAI (GPT-4, GPT-3.5, O1/O3 reasoning models)
  • Anthropic (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)
  • Cohere, Google Vertex/Gemini, AWS Bedrock/SageMaker
  • Mistral, Ollama, XAI (Grok), ContextualAI
  • NEW in v0.3: NVIDIA NIM, Databricks, FriendliAI
  • Typed provider configurations with full parameter support
  • Multimodal generation with image support

Search

  • Semantic Search - near_text, near_vector, near_object
  • Multimodal Search - near_image (images), near_media (audio, video, thermal, depth, IMU)
  • Hybrid Search - Combined keyword + vector with configurable alpha
  • BM25 Keyword Search - Full-text search with AND/OR operators
  • Reranking - gRPC-based result reranking with Cohere, Transformers, VoyageAI, and more
  • Multi-Vector Support - ColBERT-style embeddings with Muvera encoding
  • Named Vectors - Multiple vectors per object with targeting strategies

Advanced Features

  • Cross-References - Full CRUD for object relationships
  • Multi-Tenancy - HOT, COLD, FROZEN, OFFLOADED states
  • Batch Operations - Error tracking, retry logic, rate limit handling
  • Embedded Mode - Run Weaviate without Docker
  • 20+ Vectorizers - OpenAI, Cohere, VoyageAI, Jina, Transformers, Ollama, and more
  • gRPC Batch Streaming - High-performance bidirectional streaming (Weaviate 1.34+)

Quick Start

1. Start Weaviate locally

🧰 Prerequisite: Docker Desktop (macOS/Windows) or Docker Engine (Linux)

We ship Docker Compose profiles from the Python client under ci/. Use our Mix tasks to bring everything up:

# Start Weaviate containers (default version: 1.35.0)
mix weaviate.start

# Or specify a version
mix weaviate.start --version 1.35.0

# Inspect running services and health status
mix weaviate.status

The first run downloads the Weaviate Docker image and waits for the /v1/.well-known/ready endpoint to return 200.

When you're done:

mix weaviate.stop

Prefer direct scripts? Use ./ci/start_weaviate.sh 1.35.0 and ./ci/stop_weaviate.sh.

2. Add to Your Project

Add weaviate_ex to your mix.exs dependencies:

def deps do
  [
    {:weaviate_ex, "~> 0.7.4"}
  ]
end

Then fetch dependencies:

mix deps.get

3. Configure

The library automatically reads from environment variables (loaded from .env):

# .env file (created by install.sh)
WEAVIATE_URL=http://localhost:8080
WEAVIATE_API_KEY=  # Optional, for authenticated instances

Or configure in your Elixir config files:

# config/config.exs
config :weaviate_ex,
  url: "http://localhost:8080",
  api_key: nil,    # Optional
  strict: true     # Default: true - fails fast if Weaviate is unreachable

Strict Mode: By default, WeaviateEx validates connectivity on startup. If Weaviate is unreachable, your application won't start. Set strict: false to allow startup anyway (useful for development when Weaviate might not always be running).

4. Verify Connection

The library automatically performs a health check on startup:

[WeaviateEx] Successfully connected to Weaviate
  URL: http://localhost:8080
  Version: 1.34.0-rc.0

You can also run mix weaviate.status to see every profile that’s currently online and the ports they expose.

If configuration is missing, you'll get helpful error messages:

╔══════════════════════════════════════════════════════════════════════╗
║                  WeaviateEx Configuration Error                      ║
╠══════════════════════════════════════════════════════════════════════╣
║  Missing required configuration: WEAVIATE_URL                        ║
║                                                                      ║
║  Please set the Weaviate URL using one of these methods:             ║
║  1. Environment variable: export WEAVIATE_URL=http://localhost:8080  ║
║  2. Application configuration (config/config.exs)                    ║
║  3. Runtime configuration (config/runtime.exs)                       ║
╚══════════════════════════════════════════════════════════════════════╝

5. Shape a Tenant-Aware Collection and Load Data

alias WeaviateEx.{Collections, Objects, Batch}

# Define the collection and toggle multi-tenancy when ready
{:ok, _collection} =
  Collections.create("Article", %{
    description: "Articles by tenant",
    properties: [
      %{name: "title", dataType: ["text"]},
      %{name: "content", dataType: ["text"]}
    ]
  })

{:ok, %{"enabled" => true}} = Collections.set_multi_tenancy("Article", true)
{:ok, true} = Collections.exists?("Article")

# Create & read tenant-scoped objects with _additional metadata
{:ok, created} =
  Objects.create("Article", %{properties: %{title: "Tenant scoped", content: "Hello!"}},
    tenant: "tenant-a"
  )

{:ok, fetched} =
  Objects.get("Article", created["id"],
    tenant: "tenant-a",
    include: ["_additional", "vector"]
  )

# Batch ingest with a summary that separates successes from errors
objects =
  Enum.map(1..3, fn idx ->
    %{class: "Article", properties: %{title: "Story #{idx}"}, tenant: "tenant-a"}
  end)

{:ok, summary} = Batch.create_objects(objects, return_summary: true, tenant: "tenant-a")
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}

Installation

See INSTALL.md for detailed installation instructions covering:

  • Docker installation on various platforms
  • Manual Weaviate setup
  • Configuration options
  • Troubleshooting

Configuration

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| WEAVIATE_URL | Yes | - | Full URL to Weaviate (e.g., http://localhost:8080) |
| WEAVIATE_API_KEY | No | - | API key for authentication (for cloud/production) |

Application Configuration

# config/config.exs
config :weaviate_ex,
  url: System.get_env("WEAVIATE_URL", "http://localhost:8080"),
  api_key: System.get_env("WEAVIATE_API_KEY"),
  strict: true,      # Fail on startup if unreachable
  timeout: 30_000    # Request timeout in milliseconds

gRPC Configuration

WeaviateEx v0.4.0+ uses a hybrid protocol architecture: gRPC for data operations (queries, batch, aggregations) and HTTP for schema management. gRPC provides significantly better performance for high-throughput operations.

# config/config.exs
config :weaviate_ex,
  url: "http://localhost:8080",       # HTTP endpoint for schema operations
  grpc_host: "localhost",             # gRPC host (default: derived from url)
  grpc_port: 50051,                   # gRPC port (default: 50051)
  grpc_max_message_size: 104_857_600, # Max message size in bytes (default: 100MB)
  api_key: nil                        # Used for both HTTP and gRPC auth

| Option | Required | Default | Description |
|--------|----------|---------|-------------|
| grpc_host | No | Derived from url | gRPC endpoint hostname |
| grpc_port | No | 50051 | gRPC port |
| grpc_max_message_size | No | 104857600 | Maximum gRPC message size (100MB) |

The gRPC connection is automatically established when you create a client:

# Connect with gRPC (automatic)
{:ok, client} = WeaviateEx.Client.connect(
  url: "http://localhost:8080",
  grpc_port: 50051
)

# Client now has both HTTP and gRPC channels
client.grpc_channel  # => gRPC channel for data operations
client.config        # => Configuration for HTTP operations

Custom Headers (v0.7.1+)

Add custom headers to all HTTP and gRPC requests for authentication, tracing, or other purposes:

# Configure additional headers in client config
{:ok, client} = WeaviateEx.Client.connect(
  url: "http://localhost:8080",
  additional_headers: %{
    "X-Custom-Header" => "custom-value",
    "X-Request-ID" => "trace-123",
    "Authorization" => "Bearer custom-token"
  }
)

# Headers are automatically included in:
# - All HTTP requests (schema operations, health checks)
# - All gRPC requests as metadata (lowercased keys)

Headers are validated on client creation - nil values will raise an ArgumentError.
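
The validation and key lowercasing described above can be sketched in plain Elixir. This is a minimal illustration, not the library's implementation: the HeaderSketch module and both function names are hypothetical.

```elixir
defmodule HeaderSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # Raises ArgumentError if any header value is nil, mirroring the
  # validation behaviour described in the text.
  def validate!(headers) when is_map(headers) do
    Enum.each(headers, fn
      {key, nil} -> raise ArgumentError, "header #{inspect(key)} has a nil value"
      {_key, _value} -> :ok
    end)

    headers
  end

  # gRPC metadata keys are lowercase, so downcase every key after validating.
  def to_grpc_metadata(headers) do
    headers
    |> validate!()
    |> Map.new(fn {key, value} -> {String.downcase(key), to_string(value)} end)
  end
end

HeaderSketch.to_grpc_metadata(%{"X-Request-ID" => "trace-123"})
# => %{"x-request-id" => "trace-123"}
```

Validating once at client creation means a bad header surfaces immediately instead of on the first request.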

gRPC Retry with Exponential Backoff (v0.7.1+)

All gRPC operations automatically retry on transient errors with exponential backoff:

# Retryable gRPC status codes:
# - UNAVAILABLE (14)     - Service temporarily unavailable
# - RESOURCE_EXHAUSTED (8) - Rate limiting
# - ABORTED (10)         - Transaction aborted
# - DEADLINE_EXCEEDED (4) - Timeout

# Default: 4 retries with exponential backoff
# Attempt 0: 1 second delay
# Attempt 1: 2 seconds
# Attempt 2: 4 seconds
# Attempt 3: 8 seconds
# Maximum delay capped at 32 seconds

# Configure retry behavior (optional)
alias WeaviateEx.GRPC.Retry

# Custom retry with options
result = Retry.with_retry(
  fn -> some_grpc_operation() end,
  max_retries: 3,
  base_delay_ms: 500
)

# Check if error is retryable
Retry.retryable?(%GRPC.RPCError{status: 14})  # => true (UNAVAILABLE)
Retry.retryable?(%GRPC.RPCError{status: 3})   # => false (INVALID_ARGUMENT)

# Calculate backoff delay
Retry.calculate_backoff(0)  # => 1000ms
Retry.calculate_backoff(2)  # => 4000ms
Retry.calculate_backoff(5)  # => 32000ms (capped)

All gRPC services (Search, Batch, Aggregate, Tenants, Health) automatically use retry logic.
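
The delay schedule listed above (1 s base, doubling per attempt, capped at 32 s) can be reproduced with a few lines of plain Elixir. This is a sketch of the arithmetic only; BackoffSketch and its function names are hypothetical, and the library may add jitter or other details not shown here.

```elixir
defmodule BackoffSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  @base_delay_ms 1_000
  @max_delay_ms 32_000

  # Delay before retrying after the given zero-based attempt:
  # base * 2^attempt, never exceeding the cap.
  def delay_ms(attempt, base \\ @base_delay_ms, max \\ @max_delay_ms)
      when is_integer(attempt) and attempt >= 0 do
    min(base * Integer.pow(2, attempt), max)
  end

  # Full schedule of delays for a given number of retries.
  def schedule(max_retries) when max_retries > 0 do
    Enum.map(0..(max_retries - 1), &delay_ms/1)
  end
end

BackoffSketch.schedule(4)   # => [1000, 2000, 4000, 8000]
BackoffSketch.delay_ms(5)   # => 32000
BackoffSketch.delay_ms(10)  # => 32000 (capped)
```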

Proxy Configuration (v0.5.0+)

WeaviateEx supports HTTP, HTTPS, and gRPC proxy configuration:

alias WeaviateEx.Config.Proxy

# Read from environment variables (HTTP_PROXY, HTTPS_PROXY, GRPC_PROXY)
proxy = Proxy.from_env()

# Or configure explicitly
proxy = Proxy.new(
  http: "http://proxy.example.com:8080",
  https: "https://proxy.example.com:8443",
  grpc: "http://grpc-proxy.example.com:8080"
)

# Check if proxy is configured
Proxy.configured?(proxy)  # => true

# Get Finch HTTP client options
Proxy.to_finch_opts(proxy)  # => [proxy: {:https, "proxy.example.com", 8443, []}]

# Get gRPC channel options
Proxy.to_grpc_opts(proxy)   # => [http_proxy: "http://grpc-proxy.example.com:8080"]

Both uppercase and lowercase variable names are checked; when both are set, the uppercase variant takes precedence:

  • HTTP_PROXY / http_proxy - HTTP proxy URL
  • HTTPS_PROXY / https_proxy - HTTPS proxy URL
  • GRPC_PROXY / grpc_proxy - gRPC proxy URL

For settings resolved at boot time (for example in releases), use config/runtime.exs:

# config/runtime.exs
config :weaviate_ex,
  url: System.fetch_env!("WEAVIATE_URL"),
  api_key: System.get_env("WEAVIATE_API_KEY")
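
The uppercase-first lookup of proxy variables described earlier in this section can be sketched as follows. This is an assumption about the lookup order based on the text, not the library's code; ProxyEnvSketch and lookup/1 are hypothetical names.

```elixir
defmodule ProxyEnvSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # Returns the uppercase variable if it is set, otherwise the lowercase one,
  # otherwise nil.
  def lookup(name) when is_binary(name) do
    System.get_env(String.upcase(name)) || System.get_env(String.downcase(name))
  end
end

System.put_env("grpc_proxy", "http://lower.example.com:8080")
ProxyEnvSketch.lookup("GRPC_PROXY")
```

On platforms where environment variable names are case-insensitive (such as Windows), the two lookups refer to the same variable, so the precedence rule has no visible effect there.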

Usage

Embedded Mode

Need an ephemeral instance without Docker? WeaviateEx can download and manage the official embedded binary:

# Downloads (once) into ~/.cache/weaviate-embedded and starts the process
{:ok, embedded} =
  WeaviateEx.start_embedded(
    version: "1.34.0",
    port: 8099,
    grpc_port: 50155,
    persistence_data_path: Path.expand("tmp/weaviate-data"),
    environment_variables: %{"DISABLE_TELEMETRY" => "true"}
  )

# Talk to it just like any other instance
System.put_env("WEAVIATE_URL", "http://localhost:8099")
{:ok, meta} = WeaviateEx.health_check()

# Always stop the handle when finished
:ok = WeaviateEx.stop_embedded(embedded)

Passing version: "latest" fetches the most recent GitHub release. Binaries are cached, so subsequent calls reuse the download. You can override binary_path/persistence_data_path to control where the executable and data live.

Health Checks

Check if Weaviate is accessible and get version information:

# Get metadata (version, modules)
{:ok, meta} = WeaviateEx.health_check()
# => %{"version" => "1.34.0-rc.0", "modules" => %{}}

# Check readiness (can handle requests) - K8s readiness probe
{:ok, true} = WeaviateEx.ready?()

# Check liveness (service is up) - K8s liveness probe
{:ok, true} = WeaviateEx.alive?()

# With explicit client
{:ok, client} = WeaviateEx.Client.connect(base_url: "http://localhost:8080")
{:ok, true} = WeaviateEx.Health.alive?(client)
{:ok, true} = WeaviateEx.Health.ready?(client)

# Wait for Weaviate to become ready (useful for startup scripts)
:ok = WeaviateEx.Health.wait_until_ready(timeout: 30_000, check_interval: 1000)

# gRPC health ping (v0.7.0+)
alias WeaviateEx.GRPC.Services.Health, as: GRPCHealth
:ok = GRPCHealth.ping(client.grpc_channel)
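
A wait-until-ready helper with timeout and check_interval options is essentially a polling loop with a deadline. The sketch below shows that pattern in plain Elixir with an arbitrary check function; ReadinessSketch and wait_until/3 are hypothetical names, and the {:error, :timeout} return value is an assumption rather than the documented behaviour of WeaviateEx.Health.wait_until_ready.

```elixir
defmodule ReadinessSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # Calls `check` every `interval_ms` until it returns a truthy value
  # or `timeout_ms` has elapsed.
  def wait_until(check, timeout_ms, interval_ms) when is_function(check, 0) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(check, deadline, interval_ms)
  end

  defp poll(check, deadline, interval_ms) do
    cond do
      check.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(interval_ms)
        poll(check, deadline, interval_ms)
    end
  end
end

# A check that never succeeds times out after roughly 50 ms.
ReadinessSketch.wait_until(fn -> false end, 50, 10)
# => {:error, :timeout}
```

Using a monotonic clock for the deadline keeps the loop correct even if the system clock is adjusted while waiting.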

Kubernetes Integration

The alive? and ready? functions use the standard Kubernetes probe endpoints:

  • Liveness: /v1/.well-known/live - Is the process running?
  • Readiness: /v1/.well-known/ready - Can the service handle traffic?

# Example K8s deployment liveness/readiness probes
livenessProbe:
  httpGet:
    path: /v1/.well-known/live
    port: 8080
readinessProbe:
  httpGet:
    path: /v1/.well-known/ready
    port: 8080

Server Version Detection

Parse and validate Weaviate server versions (v0.7.0+):

alias WeaviateEx.Version

# Parse version strings
{:ok, {1, 28, 0}} = Version.parse("1.28.0")
{:ok, {1, 28, 0}} = Version.parse("v1.28.0-rc1")  # Handles v prefix and prerelease

# Check if version meets minimum requirement
true = Version.meets_minimum?({1, 28, 0}, {1, 27, 0})
false = Version.meets_minimum?({1, 26, 0}, {1, 27, 0})

# Validate server version (minimum: 1.27.0)
:ok = Version.validate_server({1, 28, 0})
{:error, {:unsupported_version, {1, 20, 0}, {1, 27, 0}}} = Version.validate_server({1, 20, 0})

# Extract version from meta endpoint response
{:ok, meta} = WeaviateEx.health_check()
{:ok, {1, 28, 0}} = Version.get_server_version(meta)

# Get minimum supported version
Version.minimum_version()        # => {1, 27, 0}
Version.minimum_version_string() # => "1.27.0"

# Format version tuple to string
"1.28.0" = Version.format_version({1, 28, 0})

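The parsing and comparison rules shown above (optional v prefix, prerelease suffix ignored, tuple comparison against a minimum) can be sketched in plain Elixir. VersionSketch is a hypothetical module written for illustration; the :invalid_version error atom is an assumption and may differ from what WeaviateEx.Version returns.

```elixir
defmodule VersionSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # Accepts "1.28.0", "v1.28.0", and "v1.28.0-rc1"; anything after the
  # patch number is ignored.
  def parse(string) when is_binary(string) do
    case Regex.run(~r/^v?(\d+)\.(\d+)\.(\d+)/, string) do
      [_full, major, minor, patch] ->
        {:ok, {String.to_integer(major), String.to_integer(minor), String.to_integer(patch)}}

      nil ->
        {:error, :invalid_version}
    end
  end

  # Same-size tuples compare element by element, so >= orders
  # {major, minor, patch} triples correctly.
  def meets_minimum?(version, minimum), do: version >= minimum
end

VersionSketch.parse("v1.28.0-rc1")                    # => {:ok, {1, 28, 0}}
VersionSketch.meets_minimum?({1, 28, 0}, {1, 27, 0})  # => true
VersionSketch.meets_minimum?({1, 26, 0}, {1, 27, 0})  # => false
```
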
Collections (Schema Management)

Collections define the structure of your data:

# Create a collection with properties
{:ok, collection} = WeaviateEx.Collections.create("Article", %{
  description: "News articles",
  properties: [
    %{name: "title", dataType: ["text"]},
    %{name: "content", dataType: ["text"]},
    %{name: "publishedAt", dataType: ["date"]},
    %{name: "views", dataType: ["int"]}
  ],
  vectorizer: "none"  # Use "text2vec-openai" for auto-vectorization
})

# List all collections
{:ok, schema} = WeaviateEx.Collections.list()

# Get a specific collection
{:ok, collection} = WeaviateEx.Collections.get("Article")

# Add a property to existing collection
{:ok, property} = WeaviateEx.Collections.add_property("Article", %{
  name: "author",
  dataType: ["text"]
})

# Check if collection exists
{:ok, true} = WeaviateEx.Collections.exists?("Article")

# Delete a collection
{:ok, _} = WeaviateEx.Collections.delete("Article")

Object TTL (Time-To-Live)

Automatically expire and delete objects after a specified duration:

alias WeaviateEx.Config.ObjectTTL

# Create collection with 24-hour TTL using human-readable duration
{:ok, _} = WeaviateEx.Collections.create("Events", %{
  properties: [%{name: "title", dataType: ["text"]}],
  object_ttl: ObjectTTL.from_duration(hours: 24)
})

# Or specify exact seconds with creation time deletion
{:ok, _} = WeaviateEx.Collections.create("Sessions", %{
  properties: [%{name: "user_id", dataType: ["text"]}],
  object_ttl: ObjectTTL.delete_by_creation_time(3600)  # 1 hour
})

# Delete objects based on last update time
{:ok, _} = WeaviateEx.Collections.create("Cache", %{
  properties: [%{name: "data", dataType: ["text"]}],
  object_ttl: ObjectTTL.delete_by_update_time(86_400, true)  # 24h, filter expired
})

# Delete objects based on a custom date property
{:ok, _} = WeaviateEx.Collections.create("Subscriptions", %{
  properties: [
    %{name: "plan", dataType: ["text"]},
    %{name: "expires_at", dataType: ["date"]}
  ],
  object_ttl: ObjectTTL.delete_by_date_property("expires_at")
})

# Update TTL on existing collection
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
  ObjectTTL.from_duration(days: 7)
)

# Disable TTL
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
  ObjectTTL.disable()
)

Note: Objects are deleted asynchronously in the background. The filter_expired_objects option (second parameter in delete_by_* functions) controls whether expired but not yet deleted objects are excluded from search results.
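
The TTL examples mix human-readable durations (hours: 24, days: 7) with raw seconds (3600, 86_400). The conversion between the two is simple arithmetic, sketched below. DurationSketch is a hypothetical module; the set of units it accepts is an assumption and is not taken from ObjectTTL.from_duration.

```elixir
defmodule DurationSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  @seconds_per %{seconds: 1, minutes: 60, hours: 3_600, days: 86_400}

  # Sums a keyword list such as [days: 1, hours: 12] into seconds.
  # Raises KeyError for a unit that is not listed in @seconds_per.
  def to_seconds(parts) when is_list(parts) do
    Enum.reduce(parts, 0, fn {unit, amount}, total ->
      total + amount * Map.fetch!(@seconds_per, unit)
    end)
  end
end

DurationSketch.to_seconds(hours: 24)           # => 86400
DurationSketch.to_seconds(days: 7)             # => 604800
DurationSketch.to_seconds(days: 1, hours: 12)  # => 129600
```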

Schema helpers for range filters and auto-tenant configuration:

alias WeaviateEx.Config.{AutoTenant, ObjectTTL}
alias WeaviateEx.Schema.MultiTenancyConfig
alias WeaviateEx.Property

ttl = ObjectTTL.delete_by_update_time(86_400, true)

{:ok, _} = WeaviateEx.Collections.create("Session", %{
  properties: [
    Property.number("expires_in", index_range_filters: true)
  ],
  object_ttl: ttl,
  multi_tenancy_config: MultiTenancyConfig.new(enabled: true, auto_tenant_creation: true),
  auto_tenant: AutoTenant.enable(auto_delete_timeout: 3_600)
})

Nested Properties

Define complex object structures with nested properties:

alias WeaviateEx.Property
alias WeaviateEx.Property.Nested

# Create a collection with nested object properties
{:ok, _} = WeaviateEx.Collections.create("Product", %{
  description: "Products with specifications",
  properties: [
    %{name: "name", dataType: ["text"]},
    %{name: "price", dataType: ["number"]},
    # Nested object property
    Property.object("specs", [
      Nested.new(name: "weight", data_type: :number),
      Nested.new(name: "dimensions", data_type: :text),
      Nested.new(name: "material", data_type: :text)
    ]),
    # Array of nested objects
    Property.object_array("variants", [
      Nested.new(name: "color", data_type: :text),
      Nested.new(name: "size", data_type: :text),
      Nested.new(name: "sku", data_type: :text),
      Nested.new(name: "stock", data_type: :int)
    ])
  ]
})

# Insert object with nested data
{:ok, product} = WeaviateEx.Objects.create("Product", %{
  properties: %{
    name: "Laptop Stand",
    price: 79.99,
    specs: %{
      weight: 2.5,
      dimensions: "30x25x15cm",
      material: "aluminum"
    },
    variants: [
      %{color: "silver", size: "standard", sku: "LS-001", stock: 50},
      %{color: "black", size: "large", sku: "LS-002", stock: 30}
    ]
  }
})

# Deeply nested properties (object within object)
{:ok, _} = WeaviateEx.Collections.create("Company", %{
  properties: [
    %{name: "name", dataType: ["text"]},
    Property.object("headquarters", [
      Nested.new(name: "city", data_type: :text),
      Nested.new(name: "country", data_type: :text),
      Nested.new(
        name: "address",
        data_type: :object,
        nested_properties: [
          Nested.new(name: "street", data_type: :text),
          Nested.new(name: "zip", data_type: :text)
        ]
      )
    ])
  ]
})

# Parse nested properties from API response
api_data = %{
  "name" => "specs",
  "dataType" => ["object"],
  "nestedProperties" => [
    %{"name" => "weight", "dataType" => ["number"]}
  ]
}
nested = Nested.from_api(api_data)

Data Operations (CRUD)

Simple CRUD operations with automatic UUID generation:

alias WeaviateEx.API.Data

# Create (insert) a new object
data = %{
  properties: %{
    "title" => "Hello Weaviate",
    "content" => "This is a test article",
    "views" => 0
  },
  vector: [0.1, 0.2, 0.3, 0.4, 0.5]  # Optional if using auto-vectorization
}

{:ok, object} = Data.insert(client, "Article", data)
uuid = object["id"]  # ID of the Article object, used in the calls below

# Named vectors (v0.7.1+) - for collections with multiple vector spaces
data_with_named_vectors = %{
  properties: %{"title" => "Multi-vector article"},
  vectors: %{
    "title_vector" => [0.1, 0.2, 0.3],
    "content_vector" => [0.4, 0.5, 0.6, 0.7]
  }
}

{:ok, _multi_vector_object} =
  Data.insert(client, "MultiVectorCollection", data_with_named_vectors)

# Read - get object by ID
{:ok, retrieved} = Data.get_by_id(client, "Article", uuid)

# Update - partial update (PATCH)
{:ok, updated} = Data.patch(client, "Article", uuid, %{
  properties: %{"views" => 42},
  vector: [0.1, 0.2, 0.3, 0.4, 0.5]
})

# Check if object exists
{:ok, true} = Data.exists?(client, "Article", uuid)

# Delete
{:ok, _} = Data.delete_by_id(client, "Article", uuid)

Collection handles with default tenant/consistency:

collection =
  WeaviateEx.Collection.new(client, "Article",
    tenant: "tenant-a",
    consistency_level: "QUORUM"
  )

{:ok, _} = WeaviateEx.Collection.insert(collection, %{properties: %{title: "Tenant scoped"}})

Inline References During Insert (v0.7.1+)

Create objects with references in a single operation:

# Insert object with inline references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
  properties: %{
    title: "My Article",
    content: "Article content..."
  },
  # Single reference
  references: %{
    "hasAuthor" => "author-uuid-here"
  }
})

# Multiple references to same property
{:ok, article} = WeaviateEx.Objects.create("Article", %{
  properties: %{title: "Collaborative Article"},
  references: %{
    "hasAuthors" => ["author-uuid-1", "author-uuid-2", "author-uuid-3"]
  }
})

# Multi-target references (pointing to specific collection)
{:ok, article} = WeaviateEx.Objects.create("Article", %{
  properties: %{title: "Related Content"},
  references: %{
    "relatedTo" => %{
      target_collection: "Category",
      uuids: "category-uuid"
    }
  }
})

# Multiple multi-target references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
  properties: %{title: "Multi-related"},
  references: %{
    "mentions" => %{
      target_collection: "Person",
      uuids: ["person-1", "person-2"]
    }
  }
})

References are automatically converted to Weaviate beacon format.

Reference Operations API (v0.7.3+)

For managing references after object creation, use the References API with full multi-target support:

alias WeaviateEx.API.References
alias WeaviateEx.Data.ReferenceToMulti
alias WeaviateEx.Types.Beacon

# Add a single reference
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", author_uuid)

# Add a multi-target reference using ReferenceToMulti
ref = ReferenceToMulti.new("Person", person_uuid)
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", ref)

# Add multiple references at once
ref = ReferenceToMulti.new("Person", [person1_uuid, person2_uuid])
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthors", ref)

# Replace all references on a property
{:ok, _} = References.replace(client, "Article", article_uuid, "hasAuthors",
  [author1_uuid, author2_uuid, author3_uuid]
)

# Replace with multi-target references pointing to different collections
{:ok, _} = References.replace(client, "Article", article_uuid, "relatedTo", [
  ReferenceToMulti.new("Person", person_uuid),
  ReferenceToMulti.new("Organization", org_uuid)
])

# Delete a reference
{:ok, _} = References.delete(client, "Article", article_uuid, "hasAuthor", author_uuid)

# Batch add references
refs = [
  %{from_uuid: "article-1", from_property: "hasAuthor", to_uuid: "author-1"},
  %{from_uuid: "article-2", from_property: "hasAuthor", to_uuid: "author-2",
    target_collection: "Person"}  # For multi-target properties
]
{:ok, _} = References.add_many(client, "Article", refs)

# Parse beacon URLs
parsed = Beacon.parse("weaviate://localhost/Person/uuid-123")
# => %{collection: "Person", uuid: "uuid-123"}

# Build beacon URLs
beacon = Beacon.build("uuid-123", "Person")
# => "weaviate://localhost/Person/uuid-123"
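
The beacon format shown above, weaviate://localhost/Collection/uuid, is plain string structure, so building and parsing it can be sketched without the library. BeaconSketch is a hypothetical module; unlike the Beacon.parse example above, it returns tagged tuples so that malformed input can be reported.

```elixir
defmodule BeaconSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # Builds "weaviate://localhost/Collection/uuid".
  def build(uuid, collection) when is_binary(uuid) and is_binary(collection) do
    "weaviate://localhost/#{collection}/#{uuid}"
  end

  # Splits on "/" and matches the expected segments. A beacon without a
  # collection segment is accepted and yields collection: nil.
  def parse(beacon) when is_binary(beacon) do
    case String.split(beacon, "/") do
      ["weaviate:", "", "localhost", collection, uuid] when uuid != "" ->
        {:ok, %{collection: collection, uuid: uuid}}

      ["weaviate:", "", "localhost", uuid] when uuid != "" ->
        {:ok, %{collection: nil, uuid: uuid}}

      _other ->
        {:error, :invalid_beacon}
    end
  end
end

BeaconSketch.build("uuid-123", "Person")
# => "weaviate://localhost/Person/uuid-123"

BeaconSketch.parse("weaviate://localhost/Person/uuid-123")
# => {:ok, %{collection: "Person", uuid: "uuid-123"}}
```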

Objects API

Full CRUD operations with explicit UUID control:

# Create with custom UUID
{:ok, object} = WeaviateEx.Objects.create("Article", %{
  id: "custom-uuid-here",  # Optional
  properties: %{
    title: "Hello Weaviate",
    content: "This is a test article",
    publishedAt: "2025-01-15T10:00:00Z"
  },
  vector: [0.1, 0.2, 0.3]  # Optional
})

# Get an object with additional fields
{:ok, object} = WeaviateEx.Objects.get("Article", uuid,
  include: "vector,classification"
)

# List objects with pagination
{:ok, result} = WeaviateEx.Objects.list("Article",
  limit: 10,
  offset: 0,
  include: "vector"
)

# Update (full replacement)
{:ok, updated} = WeaviateEx.Objects.update("Article", uuid, %{
  properties: %{
    title: "Updated Title",
    content: "Updated content"
  }
})

# Patch (partial update)
{:ok, patched} = WeaviateEx.Objects.patch("Article", uuid, %{
  properties: %{title: "New Title"}
})

# Delete
{:ok, _} = WeaviateEx.Objects.delete("Article", uuid)

# Check existence
{:ok, true} = WeaviateEx.Objects.exists?("Article", uuid)

Payload validation happens client-side: properties is required for inserts and updates, and the property names id and vector are reserved. Violations raise an ArgumentError.

Complex Data Types

WeaviateEx automatically serializes complex Elixir types when creating or updating objects:

alias WeaviateEx.Types.{GeoCoordinate, PhoneNumber, Blob}

# DateTime - serialized to RFC3339/ISO8601
%{created_at: ~U[2024-01-01 00:00:00Z]}
# -> {"created_at": "2024-01-01T00:00:00Z"}

# Date - serialized as midnight UTC
%{published_date: ~D[2024-06-15]}
# -> {"published_date": "2024-06-15T00:00:00Z"}

# GeoCoordinate - serialized to lat/lon map
{:ok, geo} = GeoCoordinate.new(40.71, -74.00)
%{location: geo}
# -> {"location": {"latitude": 40.71, "longitude": -74.00}}

# PhoneNumber - serialized with input and country
phone = PhoneNumber.new("555-1234", default_country: "US")
%{contact: phone}
# -> {"contact": {"input": "555-1234", "defaultCountry": "US"}}

# Blob (binary data) - base64 encoded
blob = Blob.new(binary_image_data)  # binary_image_data is a binary, e.g. from File.read!/1
%{image: blob}
# -> {"image": "<base64 encoded string>"}

# Nested objects with complex types
{:ok, geo} = GeoCoordinate.new(40.7128, -74.0060)
{:ok, place} = WeaviateEx.Objects.create("Place", %{
  properties: %{
    name: "Central Park",
    location: geo,
    created_at: ~U[2024-01-01 00:00:00Z],
    metadata: %{
      last_visited: ~D[2024-12-25]
    }
  }
})
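
The date and blob conversions listed above map directly onto Elixir standard library calls. The sketch below shows one way to produce the same wire values; SerializeSketch is a hypothetical module and is not how the library is necessarily implemented.

```elixir
defmodule SerializeSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  # DateTime values become RFC 3339 / ISO 8601 strings.
  def encode_date(%DateTime{} = datetime), do: DateTime.to_iso8601(datetime)

  # Date values are promoted to midnight UTC before formatting.
  def encode_date(%Date{} = date) do
    date
    |> DateTime.new!(~T[00:00:00], "Etc/UTC")
    |> DateTime.to_iso8601()
  end

  # Raw bytes are base64 encoded for blob properties.
  def encode_blob(bytes) when is_binary(bytes), do: Base.encode64(bytes)
end

SerializeSketch.encode_date(~U[2024-01-01 00:00:00Z])  # => "2024-01-01T00:00:00Z"
SerializeSketch.encode_date(~D[2024-06-15])            # => "2024-06-15T00:00:00Z"
SerializeSketch.encode_blob(<<1, 2, 3>>)               # => "AQID"
```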

Deserializing Responses

Convert Weaviate response data back to rich Elixir types:

alias WeaviateEx.Types.Deserialize

# Parse individual values
{:ok, dt} = Deserialize.deserialize("2024-01-01T00:00:00Z", :date)
# => {:ok, ~U[2024-01-01 00:00:00Z]}

{:ok, geo} = Deserialize.deserialize(
  %{"latitude" => 52.37, "longitude" => 4.90},
  :geo_coordinates
)
# => {:ok, %GeoCoordinate{latitude: 52.37, longitude: 4.90}}

# Deserialize properties with schema hints
schema = %{"created_at" => :date, "location" => :geo_coordinates}
{:ok, props} = Deserialize.deserialize_properties(raw_props, schema)

# Auto-detect types based on value structure
{:ok, props} = Deserialize.auto_deserialize(response["properties"])

Batch Operations

Efficient bulk operations for importing large datasets:

# Batch create multiple objects
objects = [
  %{class: "Article", properties: %{title: "Article 1", content: "Content 1"}},
  %{class: "Article", properties: %{title: "Article 2", content: "Content 2"}},
  %{class: "Article", properties: %{title: "Article 3", content: "Content 3"}}
]

{:ok, summary} = WeaviateEx.Batch.create_objects(objects, return_summary: true)

# Check rolled-up stats and per-object errors
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}

require Logger

Enum.each(summary.errors, fn error ->
  # Logger.warning/1 replaces the deprecated Logger.warn/1
  Logger.warning("[Batch error] #{error.id} => #{Enum.join(error.messages, "; ")}")
end)

If every object in the batch fails, Batch.create_objects/2 returns {:error, %WeaviateEx.Error{type: :batch_all_failed}}.

# Batch delete with criteria (WHERE filter)
{:ok, result} = WeaviateEx.Batch.delete_objects(%{
  class: "Article",
  where: %{
    path: ["status"],
    operator: "Equal",
    valueText: "draft"
  }
})

Concurrent Batch Operations

High-throughput parallel batch processing with failure tracking:

alias WeaviateEx.Batch.Concurrent
alias WeaviateEx.Batch.Queue

# Concurrent batch insertion with parallel processing
objects = Enum.map(1..10_000, fn i ->
  %{class: "Article", properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)

{:ok, result} = Concurrent.insert_many(client, "Article", objects,
  max_concurrency: 8,    # Parallel batch requests
  batch_size: 200,       # Objects per request
  ordered: false,        # Don't maintain order (faster)
  timeout: 60_000        # Timeout per batch
)

# Check results
IO.puts(Concurrent.Result.summary(result))
# => "Inserted 10000/10000 objects in 50 batches (1234ms). Failures: 0, Batch errors: 0"

if Concurrent.Result.all_successful?(result) do
  IO.puts("All objects inserted successfully!")
else
  IO.puts("Some failures occurred")
  for failed <- result.failed do
    IO.puts("Failed: #{failed.id} - #{failed.error}")
  end
end

# Batch Queue for failure tracking and re-queuing
queue = Queue.new()

# Add objects to queue
queue = Enum.reduce(objects, queue, fn obj, q ->
  Queue.enqueue(q, obj)
end)

# Dequeue a batch for processing
{batch, queue} = Queue.dequeue_batch(queue, 100)

# Process batch and mark failures
queue = Enum.reduce(failed_objects, queue, fn {obj, reason}, q ->
  Queue.mark_failed(q, obj, reason)
end)

# Re-queue failed objects for retry (with max retry limit)
queue = Queue.requeue_failed(queue, max_retries: 3)

# Get queue statistics
IO.puts("Pending: #{Queue.pending_count(queue)}")
IO.puts("Failed: #{Queue.failed_count(queue)}")
IO.puts("Empty: #{Queue.empty?(queue)}")

# Rate limit detection
alias WeaviateEx.Batch.RateLimit

response = %{status: 429, headers: [{"retry-after", "5"}]}
case RateLimit.detect(response) do
  :ok -> IO.puts("No rate limit")
  {:rate_limited, wait_ms} ->
    IO.puts("Rate limited, wait #{wait_ms}ms")
    Process.sleep(wait_ms)
end

# Server queue monitoring for dynamic batch sizing
alias WeaviateEx.API.Cluster

{:ok, stats} = Cluster.batch_stats(client)
IO.puts("Queue length: #{stats.queue_length}")
IO.puts("Rate: #{stats.rate_per_second}/s")
IO.puts("Failed: #{stats.failed_count}")
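
The rate-limit detection shown above turns a 429 response with a retry-after header into a wait time in milliseconds. A minimal sketch of that conversion follows. RateLimitSketch is a hypothetical module, and the one-second fallback for a missing or non-numeric header is an assumption, not documented library behaviour.

```elixir
defmodule RateLimitSketch do
  @moduledoc "Illustrative only; hypothetical names, not part of WeaviateEx."

  @default_wait_seconds 1

  # 429 responses yield {:rate_limited, wait_ms}; everything else is :ok.
  def detect(%{status: 429, headers: headers}) do
    seconds =
      Enum.find_value(headers, @default_wait_seconds, fn {name, value} ->
        if String.downcase(name) == "retry-after", do: parse_seconds(value)
      end)

    {:rate_limited, seconds * 1_000}
  end

  def detect(_response), do: :ok

  # Only the delay-in-seconds form of Retry-After is handled here;
  # the HTTP-date form falls through to the default.
  defp parse_seconds(value) do
    case Integer.parse(value) do
      {seconds, _rest} when seconds >= 0 -> seconds
      _other -> nil
    end
  end
end

RateLimitSketch.detect(%{status: 429, headers: [{"retry-after", "5"}]})
# => {:rate_limited, 5000}

RateLimitSketch.detect(%{status: 200, headers: []})
# => :ok
```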

gRPC Batch Streaming (v0.6.0+)

Bidirectional gRPC streaming for high-throughput batch operations (requires Weaviate 1.34+):

alias WeaviateEx.Batch.Stream

# Create a streaming batch session
{:ok, stream} = Stream.new(client, "Article",
  buffer_size: 200,           # Objects per batch
  flush_interval_ms: 1000,    # Auto-flush interval
  auto_flush: true            # Enable automatic flushing
)

# Add objects to the stream buffer
{:ok, stream} = Stream.add(stream, %{
  properties: %{title: "Article 1", content: "Content 1"}
})

{:ok, stream} = Stream.add(stream, %{
  properties: %{title: "Article 2", content: "Content 2"}
})

# Manually flush when buffer reaches threshold
{:ok, stream} = Stream.flush(stream)

# Add many objects efficiently
objects = Enum.map(1..1000, fn i ->
  %{properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)

{:ok, stream} = Enum.reduce(objects, {:ok, stream}, fn obj, {:ok, s} ->
  Stream.add(s, obj)
end)

# Close stream and get final results
{:ok, results} = Stream.close(stream)

# Results include success/failure for each object
Enum.each(results, fn result ->
  case result do
    %{status: :success, uuid: uuid} ->
      IO.puts("Created: #{uuid}")
    %{status: :failed, error: error} ->
      IO.puts("Failed: #{error}")
  end
end)

When the server sends backoff messages, the stream automatically updates its buffer size to the server-provided batch size for subsequent flushes.

Low-Level gRPC Streaming

For advanced use cases, access the underlying gRPC stream directly:

alias WeaviateEx.GRPC.Services.BatchStream

# Open a bidirectional stream
{:ok, stream_handle} = BatchStream.open(client.grpc_channel)

# Send objects
:ok = BatchStream.send_objects(stream_handle, [
  %{collection: "Article", properties: %{title: "Test"}, uuid: nil, vector: nil}
])

# Send cross-references
:ok = BatchStream.send_references(stream_handle, [
  %{from_collection: "Article", from_uuid: "...", to_collection: "Author", to_uuid: "..."}
])

# Receive results
{:ok, results} = BatchStream.receive_results(stream_handle, timeout: 5000)

# Close the stream
:ok = BatchStream.close(stream_handle)

Background Batch Processing (v0.7.0+)

For high-throughput scenarios, use the background batcher for continuous async processing:

alias WeaviateEx.Batch.Background

# Start a background batch processor
{:ok, batcher} = WeaviateEx.Batch.background(client, "Article",
  batch_size: 100,
  concurrent_requests: 2,
  flush_interval: 1000
)

# Add objects asynchronously (non-blocking)
for article <- articles do
  :ok = Background.add_object(batcher, %{
    title: article.title,
    content: article.content
  })
end

# Add objects with explicit UUID and vector
:ok = Background.add_object(batcher, %{title: "Test"},
  uuid: "550e8400-e29b-41d4-a716-446655440000",
  vector: [0.1, 0.2, 0.3]
)

# Add references (automatically ordered after related objects)
:ok = Background.add_reference(batcher, article_uuid, "hasAuthor", author_uuid)

# Force immediate flush
:ok = Background.flush(batcher)

# Get current results
results = Background.get_results(batcher)
IO.puts("Imported #{map_size(results.successful_uuids)} objects")

# Stop and get final results (with flush)
results = Background.stop(batcher, flush: true)

Batch Safety Features (v0.7.4+)

WeaviateEx implements production-grade batch safety for reliable large-scale operations:

Memory Management

# MAX_STORED_RESULTS limit (100,000) prevents memory exhaustion
# Automatic eviction of oldest entries when limit exceeded
alias WeaviateEx.Batch.ErrorTracking.Results

# Check the limit
Results.max_stored_results()
#=> 100_000

# Results automatically evict oldest entries when limit is exceeded
# This prevents unbounded memory growth during large batch operations
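As a rough illustration of oldest-first eviction, the sketch below keeps a bounded FIFO of results using Erlang's `:queue`. The `BoundedResultsSketch` module is hypothetical and says nothing about how `Results` is implemented internally; it only shows the idea of capping stored entries.

```elixir
defmodule BoundedResultsSketch do
  # Keeps at most `max` entries; the oldest entry is evicted first (FIFO).
  def new(max) when is_integer(max) and max > 0 do
    %{max: max, size: 0, queue: :queue.new()}
  end

  def put(%{max: max, size: size, queue: queue} = store, entry) do
    queue = :queue.in(entry, queue)

    if size >= max do
      # Already full: drop the oldest entry so the size stays at `max`.
      {{:value, _evicted}, queue} = :queue.out(queue)
      %{store | queue: queue}
    else
      %{store | queue: queue, size: size + 1}
    end
  end

  def to_list(%{queue: queue}), do: :queue.to_list(queue)
end

store = Enum.reduce(1..5, BoundedResultsSketch.new(3), &BoundedResultsSketch.put(&2, &1))
IO.inspect(BoundedResultsSketch.to_list(store))
# prints [3, 4, 5]
```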

Auto-Retry for Failed Objects

alias WeaviateEx.Batch.Dynamic
require Logger

# Dynamic batcher with auto-retry enabled (default)
{:ok, batcher} = Dynamic.start(
  client: client,
  auto_retry: true,           # Enable automatic retry (default: true)
  max_retries: 5,             # Maximum retry attempts (default: 3)
  retry_delay_ms: 2000,       # Base delay for backoff (default: 1000ms)
  on_permanent_failure: fn objects ->
    Logger.error("Permanent failures: #{length(objects)}")
    # Handle objects that exceeded max_retries
  end
)

# Add objects - failed objects are automatically re-queued
Dynamic.add_object(batcher, "Article", %{title: "Test"})

# Retryable errors include:
# - Rate limit errors (429, "rate limit exceeded", etc.)
# - Transient gRPC errors (UNAVAILABLE, RESOURCE_EXHAUSTED, ABORTED, DEADLINE_EXCEEDED)
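To make the retry options more concrete, here is a minimal sketch of a retry decision with a doubling delay. It assumes exponential backoff starting from the base delay; the actual schedule used by the batcher is not documented here, and the `RetryBackoffSketch` module and its `{:http, code}` / `{:grpc, status}` error shapes are hypothetical.

```elixir
defmodule RetryBackoffSketch do
  # Assumed set of transient gRPC statuses, mirroring the list above.
  @transient_grpc [:unavailable, :resource_exhausted, :aborted, :deadline_exceeded]

  # Delay before the given retry attempt (1-based), doubling each time.
  # Doubling is an assumption made for this sketch.
  def delay_ms(base_delay_ms, attempt) when attempt >= 1 do
    base_delay_ms * Integer.pow(2, attempt - 1)
  end

  # Retry only while attempts remain and the error looks transient.
  def decide(error, attempts_made, max_retries) do
    cond do
      not retryable?(error) -> :permanent_failure
      attempts_made >= max_retries -> :permanent_failure
      true -> {:retry, attempts_made + 1}
    end
  end

  def retryable?({:http, 429}), do: true
  def retryable?({:grpc, status}) when status in @transient_grpc, do: true
  def retryable?(_other), do: false
end

# With retry_delay_ms: 2000, the first three delays under this assumption:
IO.inspect(Enum.map(1..3, &RetryBackoffSketch.delay_ms(2000, &1)))
# prints [2000, 4000, 8000]
```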

RetryQueue for Manual Control

alias WeaviateEx.Batch.RetryQueue
require Logger

# Start a retry queue for manual control
{:ok, retry_queue} = RetryQueue.start_link(
  client: client,
  max_retries: 3,
  base_delay_ms: 1000,
  on_permanent_failure: fn objects ->
    Logger.error("Failed after max retries: #{length(objects)}")
  end
)

# Enqueue failed objects for retry
:ok = RetryQueue.enqueue_failed(retry_queue, failed_objects)

# Check retry count for a specific object
count = RetryQueue.get_retry_count(retry_queue, "uuid-123")

# Drain all queued objects for manual processing
{:ok, objects} = RetryQueue.drain(retry_queue)

# Clear the queue
:ok = RetryQueue.clear(retry_queue)

Configurable Batch Options

alias WeaviateEx.Batch.Config
require Logger

# Create a batch configuration
config = Config.new(
  max_stored_results: 50_000,    # Custom limit
  auto_retry: true,
  max_retries: 5,
  retry_delay_ms: 2000,
  on_permanent_failure: fn objects ->
    Logger.error("Failed: #{length(objects)}")
  end
)

# Access configuration values
Config.auto_retry_enabled?(config)  #=> true
Config.default_max_retries()        #=> 3

Queries and Vector Search

Powerful query capabilities with semantic search:

alias WeaviateEx.Query

# Simple query with field selection
query = Query.get("Article")
  |> Query.fields(["title", "content", "publishedAt"])
  |> Query.limit(10)

{:ok, results} = Query.execute(query)

# Semantic search with near_text (requires vectorizer)
query = Query.get("Article")
  |> Query.near_text("artificial intelligence", certainty: 0.7)
  |> Query.fields(["title", "content"])
  |> Query.additional(["certainty", "distance"])
  |> Query.limit(5)

{:ok, results} = Query.execute(query)

# Vector search with custom vectors
query = Query.get("Article")
  |> Query.near_vector([0.1, 0.2, 0.3], certainty: 0.8)
  |> Query.fields(["title"])

{:ok, results} = Query.execute(query)

# Hybrid search (combines keyword + vector)
query = Query.get("Article")
  |> Query.hybrid("machine learning", alpha: 0.5)  # alpha: 0=keyword, 1=vector
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query)

# BM25 keyword search
query = Query.get("Article")
  |> Query.bm25("elixir programming")
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query)

# Semantic direction with Move (v0.5.0+)
query = Query.get("Article")
  |> Query.near_text("technology",
       move_to: [concepts: ["artificial intelligence", "machine learning"], force: 0.8],
       move_away: [concepts: ["politics", "sports"], force: 0.5]
     )
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query)

# Queries with filters (WHERE clause)
query = Query.get("Article")
  |> Query.where(%{
    path: ["publishedAt"],
    operator: "GreaterThan",
    valueDate: "2025-01-01T00:00:00Z"
  })
  |> Query.fields(["title", "publishedAt"])
  |> Query.sort([%{path: ["publishedAt"], order: "desc"}])

{:ok, results} = Query.execute(query)
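The `alpha` option in the hybrid example weights the two result sets against each other. The sketch below shows one simple way to think about it: a linear blend of already normalized scores. This is an assumption for illustration, not necessarily the fusion Weaviate applies, and `HybridAlphaSketch` is a hypothetical module.

```elixir
defmodule HybridAlphaSketch do
  # Blend two scores in [0, 1]; alpha = 0 is keyword only, alpha = 1 is vector only.
  def blend(keyword_score, vector_score, alpha) when alpha >= 0 and alpha <= 1 do
    alpha * vector_score + (1 - alpha) * keyword_score
  end

  # Rank documents given as {id, keyword_score, vector_score} tuples, best first.
  def rank(docs, alpha) do
    docs
    |> Enum.map(fn {id, kw, vec} -> {id, blend(kw, vec, alpha)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end
end

docs = [{"a", 0.9, 0.1}, {"b", 0.2, 0.8}]
# Keyword-heavy weighting favours "a"; vector-heavy weighting favours "b".
IO.inspect(HybridAlphaSketch.rank(docs, 0.0))
IO.inspect(HybridAlphaSketch.rank(docs, 1.0))
```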

Fetch Objects by IDs

alias WeaviateEx.API.Data

ids = [
  "550e8400-e29b-41d4-a716-446655440001",
  "550e8400-e29b-41d4-a716-446655440002"
]

{:ok, objects} = Data.fetch_objects_by_ids(client, "Article", ids,
  return_properties: ["title", "content"]
)

# Results preserve the input ID order.

# Using the Objects module (no client needed)
{:ok, objects} = WeaviateEx.Objects.fetch_objects_by_ids("Article", ids,
  return_properties: ["title", "content"]
)
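Order preservation matters because a server may return matches in any order. A minimal sketch of restoring the input order on the client side follows. It assumes each object is a map with an `"id"` key, which is a guess about the result shape, and `OrderByIdsSketch` is a hypothetical helper rather than part of WeaviateEx.

```elixir
defmodule OrderByIdsSketch do
  # Reorder fetched objects to match `ids`; IDs with no match are skipped.
  def reorder(objects, ids) do
    by_id = Map.new(objects, fn obj -> {obj["id"], obj} end)

    ids
    |> Enum.map(&Map.get(by_id, &1))
    |> Enum.reject(&is_nil/1)
  end
end

fetched = [%{"id" => "b", "title" => "Second"}, %{"id" => "a", "title" => "First"}]
IO.inspect(OrderByIdsSketch.reorder(fetched, ["a", "b"]))
```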

gRPC vs GraphQL

When you pass a WeaviateEx.Client, Query.execute/2 uses gRPC, which supports filters, group_by, target vectors, near_image/near_media, references, vector metadata, reranking, and generative search (RAG). If a query includes options that gRPC does not yet support (for example, sorting or cursor pagination), it automatically falls back to GraphQL.
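The fallback rule can be sketched as a simple check over the options a query carries. Everything here is hypothetical: the `TransportChoiceSketch` module, the keyword-list representation, and the option names `:sort` and `:after` are stand-ins for whatever the library inspects internally.

```elixir
defmodule TransportChoiceSketch do
  # Options assumed, for illustration only, to lack gRPC support.
  @graphql_only [:sort, :after]

  # Pick :graphql if any GraphQL-only option is present, otherwise :grpc.
  def choose(query_opts) when is_list(query_opts) do
    if Enum.any?(@graphql_only, &Keyword.has_key?(query_opts, &1)) do
      :graphql
    else
      :grpc
    end
  end
end

IO.inspect(TransportChoiceSketch.choose(limit: 10, fields: ["title"]))
IO.inspect(TransportChoiceSketch.choose(limit: 10, sort: [%{path: ["publishedAt"], order: "desc"}]))
```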

Reranking

Improve search result relevance using reranker models:

alias WeaviateEx.Query
alias WeaviateEx.Query.Rerank

# Basic reranking - re-scores results using the "content" property
rerank = Rerank.new("content")

{:ok, results} = Query.get("Article")
  |> Query.near_text("machine learning")
  |> Query.fields(["title", "content"])
  |> Query.limit(10)
  |> Query.rerank(rerank)
  |> Query.execute(client)

# With custom rerank query (different from search query)
rerank = Rerank.new("content", query: "latest AI applications in healthcare")

{:ok, results} = Query.get("Article")
  |> Query.hybrid("AI trends", alpha: 0.5)
  |> Query.fields(["title", "content"])
  |> Query.rerank(rerank)
  |> Query.execute(client)

# Access rerank scores in results
for result <- results do
  score = result["_additional"]["rerankScore"]
  IO.puts("Rerank score: #{score}")
end

Note: Requires a reranker module configured on the collection. See WeaviateEx.API.RerankerConfig for available rerankers: cohere, transformers, voyageai, jinaai, nvidia, contextualai.

gRPC Generative Search (v0.7.4+)

Generative queries now use gRPC for improved performance (~2-3x lower latency):

alias WeaviateEx.GRPC.Services.Search
alias WeaviateEx.Query.GenerativeResult

# Build a search request with generative config
request = Search.build_near_text_request("Article", "machine learning",
  limit: 5,
  return_properties: ["title", "content"],
  generative: %{
    single_prompt: "Summarize this article: {content}",
    provider: :openai,
    model: "gpt-4",
    temperature: 0.7
  }
)

# Execute the search over the client's gRPC channel
channel = client.grpc_channel
{:ok, reply} = Search.execute(channel, request)

# Parse the generative results
result = GenerativeResult.from_grpc_response(reply)

# Access per-object generations
for gen <- result.generated_per_object do
  IO.puts("Generated: #{gen}")
end

# Grouped generation
request = Search.build_near_text_request("Article", "AI trends",
  generative: %{
    grouped_task: "Synthesize the key themes from these articles",
    grouped_properties: ["title", "content"],
    provider: :anthropic,
    model: "claude-3-5-sonnet-20241022"
  }
)

{:ok, reply} = Search.execute(channel, request)
result = GenerativeResult.from_grpc_response(reply)
IO.puts("Grouped summary: #{result.generated}")

Supported providers: :openai, :anthropic, :cohere, :mistral, :ollama, :google, :aws, :databricks, :friendliai, :nvidia, :xai, :contextualai, :anyscale.

Multi-Vector Collections (v0.7.0+)

Query collections with multiple named vectors:

alias WeaviateEx.Query
alias WeaviateEx.Query.TargetVectors

# Single target vector
query = Query.get("MultiVectorCollection")
  |> Query.near_text("search term", target_vectors: "content_vector")
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query, client)

# Combined vectors with average method
target = TargetVectors.combine(["title_vector", "content_vector"], method: :average)

query = Query.get("MultiVectorCollection")
  |> Query.near_vector(embedding, target_vectors: target)
  |> Query.fields(["title"])

{:ok, results} = Query.execute(query, client)

# Weighted combination
target = TargetVectors.weighted(%{
  "title_vector" => 0.7,
  "content_vector" => 0.3
})

query = Query.get("MultiVectorCollection")
  |> Query.near_text("search", target_vectors: target)
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query, client)
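To see what the average and weighted strategies mean numerically, the sketch below combines per-vector distances for a single object. The formulas (arithmetic mean and weighted sum) are assumptions made for illustration, and `TargetVectorJoinSketch` is a hypothetical module; the server-side computation may differ.

```elixir
defmodule TargetVectorJoinSketch do
  # distances: %{"vector_name" => distance of one object to the query}
  def average(distances) when map_size(distances) > 0 do
    Enum.sum(Map.values(distances)) / map_size(distances)
  end

  # weights: %{"vector_name" => weight}; every weighted name must have a distance.
  def weighted(distances, weights) do
    Enum.reduce(weights, 0.0, fn {name, weight}, acc ->
      acc + weight * Map.fetch!(distances, name)
    end)
  end
end

distances = %{"title_vector" => 0.2, "content_vector" => 0.4}
weights = %{"title_vector" => 0.7, "content_vector" => 0.3}

IO.inspect(TargetVectorJoinSketch.average(distances))
IO.inspect(TargetVectorJoinSketch.weighted(distances, weights))
```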

Updating Named Vector Configuration (v0.7.0+)

Update existing named vector index settings and quantization:

alias WeaviateEx.API.NamedVectors

# Update vector index parameters
update = NamedVectors.update_config("title_vector",
  vector_index: [
    ef: 200,
    dynamic_ef_min: 100,
    dynamic_ef_max: 500,
    dynamic_ef_factor: 8,
    flat_search_cutoff: 40000
  ]
)

# Update with quantization settings
update = NamedVectors.update_config("content_vector",
  vector_index: [ef: 150],
  quantizer: [
    type: :pq,
    segments: 128,
    centroids: 256,
    training_limit: 100000
  ]
)

# Build update config for multiple vectors at once
updates = NamedVectors.build_update_config([
  {"title_vector", [vector_index: [ef: 200]]},
  {"content_vector", [quantizer: [type: :sq, rescore_limit: 200]]}
])

# Convert to API format
api_config = NamedVectors.update_to_api(update)
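The effect of the `segments` and `centroids` settings on storage can be estimated with a little arithmetic. The sketch assumes standard product quantization, where each segment is stored as one centroid index, and ignores codebook and index overhead. The 1536-dimension vector is a made-up example and `PQSizeSketch` is a hypothetical module.

```elixir
defmodule PQSizeSketch do
  # Bytes for an uncompressed float32 vector.
  def raw_bytes(dimensions), do: dimensions * 4

  # Bytes for a PQ code: one centroid index per segment,
  # each index needing log2(centroids) bits.
  def pq_bytes(segments, centroids) do
    bits_per_segment = ceil(:math.log2(centroids))
    ceil(segments * bits_per_segment / 8)
  end
end

raw = PQSizeSketch.raw_bytes(1536)
pq = PQSizeSketch.pq_bytes(128, 256)
IO.puts("raw: #{raw} bytes, pq: #{pq} bytes, ratio: #{raw / pq}")
# prints raw: 6144 bytes, pq: 128 bytes, ratio: 48.0
```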

Advanced Hybrid Search (v0.7.0+)

Use HybridVector for sophisticated hybrid queries with Move operations:

alias WeaviateEx.Query
alias WeaviateEx.Query.{HybridVector, Move}

# Text sub-search with Move operations
hv = HybridVector.near_text("machine learning",
  move_to: Move.to(0.5, concepts: ["AI", "neural networks"]),
  move_away_from: Move.to(0.3, concepts: ["biology"])
)

query = Query.get("Article")
  |> Query.hybrid("search term", vector: hv, alpha: 0.7)
  |> Query.fields(["title", "content"])

{:ok, results} = Query.execute(query, client)

# Vector sub-search with target vectors
hv = HybridVector.near_vector(embedding, target_vectors: "content_vector")

query = Query.get("Article")
  |> Query.hybrid("search", vector: hv, fusion_type: :relative_score)
  |> Query.fields(["title"])

{:ok, results} = Query.execute(query, client)

Multimodal Search

Search using images, audio, video, and other media types (v0.7.0+):

Image Search (near_image)

Search collections using image data with multi2vec-clip, multi2vec-bind, or other image vectorizers:

alias WeaviateEx.Query
alias WeaviateEx.Query.NearImage

# Search by base64 encoded image
query = Query.get("ImageCollection")
  |> Query.near_image(image: base64_image_data, certainty: 0.8)
  |> Query.fields(["name", "description"])
  |> Query.limit(10)

{:ok, results} = Query.execute(query, client)

# Search by image file path
query = Query.get("ImageCollection")
  |> Query.near_image(image_file: "/path/to/image.png", distance: 0.3)
  |> Query.fields(["name"])

{:ok, results} = Query.execute(query, client)

# With named vectors (for collections with multiple vector spaces)
query = Query.get("MultiVectorCollection")
  |> Query.near_image(
       image: base64_data,
       certainty: 0.7,
       target_vectors: ["image_vector", "clip_vector"]
     )
  |> Query.fields(["title"])

{:ok, results} = Query.execute(query, client)

# Using NearImage directly
near_image = NearImage.new(image: base64_data, certainty: 0.8)
NearImage.to_graphql(near_image)  # => %{"image" => "...", "certainty" => 0.8}
NearImage.to_grpc(near_image)     # => %{image: "...", certainty: 0.8}

# Encode image file to base64
base64_data = NearImage.encode_image_file("/path/to/image.jpg")
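For readers who want to see what base64 encoding of a file involves, the standard library alone is enough. The sketch below is not the body of `NearImage.encode_image_file/1`, which may validate or transform its input differently; `ImageEncodingSketch` and the temporary file name are hypothetical.

```elixir
defmodule ImageEncodingSketch do
  # Read a file and return its contents as a base64 string.
  def encode_file(path) do
    path |> File.read!() |> Base.encode64()
  end
end

# Write the first four bytes of a PNG signature to a temp file and encode it.
path = Path.join(System.tmp_dir!(), "weaviate_ex_encoding_sketch.bin")
File.write!(path, <<137, 80, 78, 71>>)
IO.puts(ImageEncodingSketch.encode_file(path))
# prints iVBORw==
File.rm!(path)
```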

Media Search (near_media)

Search using audio, video, thermal, depth, or IMU data with multi2vec-bind:

alias WeaviateEx.Query
alias WeaviateEx.Query.NearMedia

# Search by audio
query = Query.get("MediaCollection")
  |> Query.near_media(:audio, media: base64_audio, certainty: 0.7)
  |> Query.fields(["name", "transcript"])
  |> Query.limit(5)

{:ok, results} = Query.execute(query, client)

# Search by video file
query = Query.get("MediaCollection")
  |> Query.near_media(:video, media_file: "/path/to/video.mp4", distance: 0.3)
  |> Query.fields(["title", "duration"])

{:ok, results} = Query.execute(query, client)

# Search by thermal imaging data
query = Query.get("SensorData")
  |> Query.near_media(:thermal, media: base64_thermal, certainty: 0.8)
  |> Query.fields(["timestamp", "location"])

{:ok, results} = Query.execute(query, client)

# Supported media types
NearMedia.media_types()  # => [:audio, :video, :thermal, :depth, :imu]

# Using NearMedia directly
near_media = NearMedia.new(:audio, media: base64_audio, certainty: 0.7)
NearMedia.to_graphql(near_media)  # => %{"media" => "...", "type" => "audio", "certainty" => 0.7}
NearMedia.to_grpc(near_media)     # => %{media: "...", type: :MEDIA_TYPE_AUDIO, certainty: 0.7}

# With target vectors for named vectors
near_media = NearMedia.new(:depth,
  media: base64_depth_data,
  target_vectors: ["depth_vector"]
)

Convenience Methods (v0.8.0+)

For a simpler Python-like API, use the convenience methods that automatically handle file paths, base64 data, and raw binary input:

alias WeaviateEx.Query

# Search by image - accepts file path, base64, or binary
{:ok, results} = Query.get("Products")
  |> Query.with_near_image("/path/to/image.jpg")
  |> Query.limit(10)
  |> Query.execute(client)

# Search by base64 image data
{:ok, results} = Query.get("Products")
  |> Query.with_near_image(base64_image_data, certainty: 0.8)
  |> Query.execute(client)

# Search by audio
{:ok, results} = Query.get("Podcasts")
  |> Query.with_near_audio("/path/to/clip.mp3")
  |> Query.execute(client)

# Search by video
{:ok, results} = Query.get("Videos")
  |> Query.with_near_video("/path/to/clip.mp4")
  |> Query.execute(client)

# Search by other media types
{:ok, results} = Query.get("SensorData")
  |> Query.with_near_thermal(thermal_data)
  |> Query.execute(client)

{:ok, results} = Query.get("DepthMaps")
  |> Query.with_near_depth(depth_data, distance: 0.3)
  |> Query.execute(client)

{:ok, results} = Query.get("MotionData")
  |> Query.with_near_imu(imu_data)
  |> Query.execute(client)

# Generic method for any media type
{:ok, results} = Query.get("Products")
  |> Query.with_near_media(:image, "/path/to/image.jpg", certainty: 0.8)
  |> Query.execute(client)

Convenience method options:

  • :certainty - Minimum certainty threshold (0.0 to 1.0)
  • :distance - Maximum distance threshold
  • :target_vectors - Target vectors for multi-vector collections

Supported modalities: image, audio, video, thermal, depth, imu

Note: Requires a multi-modal vectorizer (e.g., multi2vec-clip for images, multi2vec-bind for audio/video).
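When choosing between `:certainty` and `:distance`, it can help to convert one into the other. The sketch below assumes the collection uses cosine distance, where Weaviate defines certainty as `1 - distance / 2`; for other distance metrics the conversion does not apply. `CertaintySketch` is a hypothetical helper.

```elixir
defmodule CertaintySketch do
  # Cosine distance lies in [0, 2]; certainty lies in [0, 1].
  def from_cosine_distance(distance) when distance >= 0 and distance <= 2 do
    1 - distance / 2
  end

  def to_cosine_distance(certainty) when certainty >= 0 and certainty <= 1 do
    2 * (1 - certainty)
  end
end

# distance: 0.3 corresponds to a certainty of roughly 0.85 under cosine distance.
IO.inspect(CertaintySketch.from_cosine_distance(0.3))
# certainty: 0.8 corresponds to a distance of roughly 0.4.
IO.inspect(CertaintySketch.to_cosine_distance(0.8))
```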

Media Type Reference

| Type | Description | Use Case |
|------|-------------|----------|
| :audio | Audio files (wav, mp3, etc.) | Voice search, audio similarity |
| :video | Video files (mp4, avi, etc.) | Video content matching |
| :thermal | Thermal imaging data | Industrial inspection, security |
| :depth | Depth sensor data | 3D object recognition |
| :imu | Inertial measurement unit data | Motion/gesture recognition |

Generative Search (RAG)

Combine search with AI generation for retrieval-augmented generation:

alias WeaviateEx.Query
alias WeaviateEx.Query.Generate

# Single-object generation - generate for each result
query = Generate.new("Article")
  |> Generate.near_text("artificial intelligence")
  |> Generate.single("Summarize this article in one sentence: {title}")
  |> Generate.return_properties(["title", "content"])
  |> Generate.limit(5)

{:ok, result} = Generate.execute(query, client)

# Access generated content per object
for obj <- result.objects do
  IO.puts("Title: #{obj["title"]}")
  IO.puts("Generated: #{obj["_additional"]["generate"]["singleResult"]}")
end

# Grouped generation - generate once for all results combined
query = Generate.new("Article")
  |> Generate.bm25("machine learning")
  |> Generate.grouped("Based on these articles, what are the main trends?",
       properties: ["title", "content"])
  |> Generate.return_properties(["title"])
  |> Generate.limit(10)

{:ok, result} = Generate.execute(query, client)
IO.puts("Combined insight: #{result.generated}")

# Hybrid search with generation
query = Generate.new("Article")
  |> Generate.hybrid("neural networks", alpha: 0.7)
  |> Generate.single("Extract key points from: {content}")
  |> Generate.return_properties(["title", "content"])

{:ok, result} = Generate.execute(query, client)

# Convert existing Query to generative query
query = Query.get("Article")
  |> Query.near_text("climate change")
  |> Query.fields(["title", "content"])
  |> Query.limit(5)

gen_query = Query.generate(query, :single, "Summarize: {content}")
{:ok, result} = Generate.execute(gen_query, client)
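Single prompts use `{property}` placeholders that are filled in per object. The substitution happens on the Weaviate side, but a local sketch makes the behaviour easy to picture. `PromptTemplateSketch` is hypothetical, and leaving unknown placeholders untouched is a choice made for this sketch, not documented server behaviour.

```elixir
defmodule PromptTemplateSketch do
  # Replace each {property} placeholder with the object's value for that key.
  # Placeholders without a matching property are left as they are.
  def render(template, properties) do
    Regex.replace(~r/\{(\w+)\}/, template, fn whole, key ->
      case Map.fetch(properties, key) do
        {:ok, value} -> to_string(value)
        :error -> whole
      end
    end)
  end
end

article = %{"title" => "Elixir in Production", "content" => "..."}
IO.puts(PromptTemplateSketch.render("Summarize this article in one sentence: {title}", article))
```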

Query References (v0.7.0+)

Query cross-references with multi-target support and metadata:

alias WeaviateEx.Query
alias WeaviateEx.Query.QueryReference

# Basic reference query
ref = QueryReference.new("hasAuthor", return_properties: ["name", "email"])

# Multi-target reference query (for references pointing to multiple collections)
ref = QueryReference.multi_target("relatedTo", "Article",
  return_properties: ["title", "publishedAt"]
)

# Check if reference is multi-target
QueryReference.multi_target?(ref)  # => true

# Request metadata in referenced objects
ref = QueryReference.new("hasAuthor",
  return_properties: ["name"],
  return_metadata: [:uuid, :distance, :certainty]
)

# Use metadata presets
ref = QueryReference.new("hasAuthor",
  return_properties: ["name"],
  return_metadata: :full    # All available metadata
)

ref = QueryReference.new("hasAuthor",
  return_properties: ["name"],
  return_metadata: :common  # uuid, distance, certainty, creation_time
)

# Use in queries
query = Query.get("Article")
  |> Query.fields(["title", "content"])
  |> Query.reference(ref)

Aggregations

Statistical analysis over your data:

alias WeaviateEx.API.Aggregate
alias WeaviateEx.Aggregate.Metrics

# Count all objects
{:ok, result} = Aggregate.over_all(client, "Product", metrics: [:count])

# Numeric aggregations (mean, sum, min, max)
{:ok, stats} = Aggregate.over_all(client, "Product",
  properties: [{:price, [:mean, :sum, :maximum, :minimum, :count]}]
)

# Top occurrences for text fields
{:ok, categories} = Aggregate.over_all(client, "Product",
  properties: [{:category, [:topOccurrences], limit: 10}]
)

# Group by with aggregations
{:ok, grouped} = Aggregate.group_by(client, "Product", "category",
  metrics: [:count],
  properties: [{:price, [:mean, :maximum, :minimum]}]
)
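To clarify what the metric names compute, the following sketch derives the same kinds of figures from an in-memory list. It runs locally with `Enum` and is unrelated to how Weaviate evaluates aggregations; `AggregateSketch` and the sample data are hypothetical.

```elixir
defmodule AggregateSketch do
  # Numeric metrics over a non-empty list of numbers.
  def numeric(values) when values != [] do
    %{
      count: length(values),
      sum: Enum.sum(values),
      mean: Enum.sum(values) / length(values),
      minimum: Enum.min(values),
      maximum: Enum.max(values)
    }
  end

  # Most frequent values first; ties are broken by value for stable output.
  def top_occurrences(values, limit) do
    values
    |> Enum.frequencies()
    |> Enum.sort_by(fn {value, count} -> {-count, value} end)
    |> Enum.take(limit)
  end
end

IO.inspect(AggregateSketch.numeric([10, 20, 30]))
IO.inspect(AggregateSketch.top_occurrences(["books", "toys", "books", "games", "toys", "books"], 2))
```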

Near Object Aggregation

Aggregate objects similar to a reference object:

# Aggregate objects near a reference UUID
{:ok, result} = Aggregate.with_near_object(client, "Articles", reference_uuid,
  distance: 0.5,
  metrics: [:count],
  properties: [
    {:views, [:mean, :sum]},
    {:category, [:topOccurrences], limit: 5}
  ]
)

IO.inspect(result)  # %{"meta" => %{"count" => 42}, "views" => %{"mean" => 1250.5, "sum" => 52521}}

Hybrid Aggregation

Aggregate with combined keyword and vector search:

# Hybrid search aggregation (balanced keyword + vector)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "electronics",
  alpha: 0.5,  # 50% vector, 50% keyword (default)
  metrics: [:count],
  properties: [
    {:price, [:sum, :mean, :minimum, :maximum]}
  ]
)

# Pure keyword search aggregation (alpha = 0)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "laptop",
  alpha: 0.0,
  fusion_type: :ranked,
  metrics: [:count]
)

# Vector-weighted search aggregation (alpha = 0.8)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "portable computer",
  alpha: 0.8,
  fusion_type: :relative_score,
  properties: [{:category, [:topOccurrences], limit: 3}]
)

Using the Metrics Helper

Build metrics specifications with the helper module:

alias WeaviateEx.API.Aggregate
alias WeaviateEx.Aggregate.Metrics

# Number metrics with all options
{:ok, result} = Aggregate.over_all(client, "Products",
  metrics: [Metrics.count()],
  properties: [
    Metrics.number("price", sum: true, mean: true, minimum: true, maximum: true),
    Metrics.text("category", top_occurrences: 5),
    Metrics.boolean("inStock")
  ]
)

Advanced Filtering

Build complex filters with a type-safe DSL:

alias WeaviateEx.Filter
alias WeaviateEx.Query

# Simple equality
filter = Filter.equal("status", "published")

# Numeric comparisons
filter = Filter.greater_than("views", 100)
filter = Filter.less_than_equal("price", 50.0)

# Text pattern matching
filter = Filter.like("title", "*AI*")

# Array operations
filter = Filter.contains_any("tags", ["elixir", "phoenix"])
filter = Filter.contains_all("tags", ["elixir", "tutorial"])

# Geospatial queries
filter = Filter.within_geo_range("location", {40.7128, -74.0060}, 5000.0)

# Date comparisons
filter = Filter.greater_than("publishedAt", "2025-01-01T00:00:00Z")

# Null checks
filter = Filter.is_null("deletedAt")

# Property length filtering (v0.7.0+)
filter = Filter.by_property_length("title", :greater_than, 10)
filter = Filter.by_property_length("tags", :greater_or_equal, 3)

# Combine filters with AND
combined = Filter.all_of([
  Filter.equal("status", "published"),
  Filter.greater_than("views", 100),
  Filter.like("title", "*Elixir*")
])

# Combine filters with OR
or_filter = Filter.any_of([
  Filter.equal("category", "technology"),
  Filter.equal("category", "science")
])

# Negate filters
not_filter = Filter.none_of([
  Filter.equal("status", "draft")
])

# Use in queries
query = Query.get("Article")
  |> Query.where(Filter.to_graphql(combined))
  |> Query.fields(["title", "views"])

Deep Reference Filtering (v0.7.0+)

Filter through chains of references to reach nested properties:

alias WeaviateEx.Filter
alias WeaviateEx.Filter.RefPath

# Filter articles where the author's company is in technology
filter = RefPath.through("hasAuthor", "Author")
  |> RefPath.through("worksAt", "Company")
  |> RefPath.property("industry", :equal, "Technology")

# Filter by author name directly
filter = RefPath.through("hasAuthor", "Author")
  |> RefPath.property("name", :like, "John*")

# Combine with other filters
combined = Filter.all_of([
  RefPath.through("hasAuthor", "Author")
  |> RefPath.property("verified", :equal, true),
  Filter.equal("status", "published")
])

# Get path depth
path = RefPath.through("hasAuthor", "Author")
  |> RefPath.through("worksAt", "Company")
RefPath.depth(path)  # => 2

# Use convenience function
filter = Filter.by_ref_path(
  RefPath.through("hasAuthor", "Author"),
  "name",
  :equal,
  "Jane"
)
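A reference chain can be thought of as a flat path that alternates reference property and target collection, ending in the property to compare. The sketch below builds such a path with plain lists. `RefPathSketch` is hypothetical and does not mirror the internals of `RefPath`; the flattened layout is an assumption based on how Weaviate where-filters address referenced properties.

```elixir
defmodule RefPathSketch do
  # A chain of hops, each hop being {reference_property, target_collection}.
  def start(reference, collection), do: [{reference, collection}]

  def through(hops, reference, collection), do: hops ++ [{reference, collection}]

  def depth(hops), do: length(hops)

  # Flatten hops into a filter path that ends in the target property.
  def to_path(hops, property) do
    Enum.flat_map(hops, fn {reference, collection} -> [reference, collection] end) ++ [property]
  end
end

hops =
  RefPathSketch.start("hasAuthor", "Author")
  |> RefPathSketch.through("worksAt", "Company")

IO.inspect(RefPathSketch.depth(hops))
IO.inspect(RefPathSketch.to_path(hops, "industry"))
# prints ["hasAuthor", "Author", "worksAt", "Company", "industry"]
```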

Multi-Target Reference Filtering (v0.7.0+)

Filter on multi-target reference properties that can point to different collections:

alias WeaviateEx.Filter
alias WeaviateEx.Filter.{MultiTargetRef, RefPath}

# Filter where "relatedTo" points to an Article with specific title
filter = MultiTargetRef.new("relatedTo", "Article")
  |> MultiTargetRef.where("title", :equal, "My Article")

# Filter where "mentions" points to a verified Person
filter = MultiTargetRef.new("mentions", "Person")
  |> MultiTargetRef.where("verified", :equal, true)

# Deep path filtering through multi-target reference
filter = MultiTargetRef.new("mentions", "Person")
  |> MultiTargetRef.deep_where(fn path ->
    path
    |> RefPath.through("worksAt", "Company")
    |> RefPath.property("industry", :equal, "Tech")
  end)

# Convert to RefPath for chaining
ref_path = MultiTargetRef.new("mentions", "Person")
  |> MultiTargetRef.as_ref_path()
  |> RefPath.through("worksAt", "Company")
  |> RefPath.property("name", :equal, "Acme")

# Combine with other filters
combined = Filter.all_of([
  MultiTargetRef.new("relatedTo", "Article")
  |> MultiTargetRef.where("status", :equal, "published"),
  Filter.equal("featured", true)
])

# Use convenience function
filter = Filter.by_ref_multi_target(
  "relatedTo",
  "Article",
  "status",
  :equal,
  "published"
)

Vector Configuration

Configure vectorizers and index types:

alias WeaviateEx.API.VectorConfig

# Custom vectors with HNSW index
config = VectorConfig.new("AIArticle")
  |> VectorConfig.with_vectorizer(:none)  # Bring your own vectors
  |> VectorConfig.with_hnsw_index(
    distance: :cosine,
    ef: 100,
    max_connections: 64
  )
  |> VectorConfig.with_properties([
    %{"name" => "title", "dataType" => ["text"]},
    %{"name" => "content", "dataType" => ["text"]}
  ])

{:ok, _} = Collections.create(client, config)

# HNSW with Product Quantization (compression)
config = VectorConfig.new("CompressedData")
  |> VectorConfig.with_vectorizer(:none)
  |> VectorConfig.with_hnsw_index(distance: :dot)
  |> VectorConfig.with_product_quantization(
    enabled: true,
    segments: 96,
    centroids: 256
  )

# Flat index for exact search (no approximation)
config = VectorConfig.new("ExactSearch")
  |> VectorConfig.with_vectorizer(:none)
  |> VectorConfig.with_flat_index(distance: :dot)
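The `:cosine` and `:dot` options select how vector closeness is measured. A small sketch of both follows, using the convention that a smaller distance means a closer match, so the dot-product distance is the negated dot product. `DistanceSketch` is a hypothetical module and does no input validation (for example, it does not guard against zero vectors).

```elixir
defmodule DistanceSketch do
  # Dot product of two equal-length lists of numbers.
  def dot(a, b) do
    a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  def norm(a), do: :math.sqrt(dot(a, a))

  # 0.0 for identical direction, 1.0 for orthogonal, 2.0 for opposite.
  def cosine_distance(a, b), do: 1.0 - dot(a, b) / (norm(a) * norm(b))

  # Negated so that larger dot products give smaller distances.
  def dot_distance(a, b), do: -dot(a, b)
end

IO.inspect(DistanceSketch.cosine_distance([1.0, 0.0], [0.0, 1.0]))
IO.inspect(DistanceSketch.dot_distance([1.0, 2.0], [3.0, 4.0]))
```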

Inverted Index Configuration (v0.5.0+)

Configure BM25 and stopwords for full-text search:

alias WeaviateEx.API.InvertedIndexConfig

# Configure BM25 algorithm parameters
bm25_config = InvertedIndexConfig.bm25(b: 0.75, k1: 1.2)

# Configure stopwords with English preset and customizations
stopwords = InvertedIndexConfig.stopwords(
  preset: :en,
  additions: ["foo", "bar"],
  removals: ["the"]
)

# Build complete inverted index configuration
config = InvertedIndexConfig.build(
  bm25: [b: 0.8, k1: 1.5],
  stopwords: [preset: :en],
  index_timestamps: true,
  index_property_length: true,
  index_null_state: false,
  cleanup_interval_seconds: 60
)

# Validate configuration
{:ok, validated} = InvertedIndexConfig.validate(config)

# Merge configurations
merged = InvertedIndexConfig.merge(base_config, override_config)
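The `k1` and `b` parameters come from the standard BM25 formula: `k1` controls how quickly repeated terms saturate, and `b` controls how strongly document length is normalized. The sketch below computes the score of a single term under that textbook formula. It is illustrative only, `BM25Sketch` is hypothetical, and Weaviate's scoring may differ in details such as how IDF is computed.

```elixir
defmodule BM25Sketch do
  # Score contribution of one term in one document.
  # tf: term frequency, idf: inverse document frequency (supplied by the caller).
  def term_score(tf, doc_len, avg_doc_len, idf, opts \\ []) do
    k1 = Keyword.get(opts, :k1, 1.2)
    b = Keyword.get(opts, :b, 0.75)

    # Length normalization: 1.0 for an average-length document.
    norm = 1 - b + b * doc_len / avg_doc_len
    idf * (tf * (k1 + 1)) / (tf + k1 * norm)
  end
end

# Same term frequency, but the longer document scores lower when b > 0.
IO.inspect(BM25Sketch.term_score(2, 100, 100, 1.0))
IO.inspect(BM25Sketch.term_score(2, 400, 100, 1.0))
# With b: 0.0, document length no longer matters.
IO.inspect(BM25Sketch.term_score(2, 400, 100, 1.0, b: 0.0))
```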

Reranker Configuration (v0.7.0+)

Configure reranking models to improve search result relevance:

alias WeaviateEx.API.RerankerConfig

# Cohere reranker (default or specific model)
config = RerankerConfig.cohere()
config = RerankerConfig.cohere("rerank-english-v3.0")
config = RerankerConfig.cohere("rerank-multilingual-v3.0", base_url: "https://api.cohere.ai")

# Local transformers reranker
config = RerankerConfig.transformers()
config = RerankerConfig.transformers(inference_url: "http://localhost:8080")

# Voyage AI reranker
config = RerankerConfig.voyageai("rerank-1")
config = RerankerConfig.voyageai("rerank-lite-1", base_url: "https://api.voyageai.com")

# Jina AI reranker
config = RerankerConfig.jinaai("jina-reranker-v1-base-en")
config = RerankerConfig.jinaai("jina-reranker-v1-turbo-en")

# Custom/unlisted reranker provider
config = RerankerConfig.custom("my-reranker",
  api_endpoint: "https://reranker.example.com",
  model: "rerank-v1",
  max_tokens: 512
)

# Disable reranking
config = RerankerConfig.none()

# Use in collection creation
{:ok, _} = Collections.create("Article", %{
  properties: [...],
  reranker_config: config
})

Custom Generative Provider Configuration (v0.7.0+)

Configure unlisted generative AI providers with custom settings:

alias WeaviateEx.API.GenerativeConfig

# Custom generative provider for unlisted LLMs
config = GenerativeConfig.custom("my-llm",
  api_endpoint: "https://llm.example.com",
  model: "custom-gpt",
  temperature: 0.7,
  max_tokens: 2048
)

# Custom provider with authentication options
config = GenerativeConfig.custom("enterprise-llm",
  api_endpoint: "https://llm.internal.corp",
  model: "llm-v2",
  api_key_header: "X-API-Key",
  temperature: 0.5
)

# Use with collection
{:ok, _} = Collections.create("Article", %{
  properties: [...],
  generative_config: config
})

Backup & Restore

Complete backup and restore operations with multiple storage backends:

alias WeaviateEx.Backup.{Config, Location}

# Create a backup to filesystem
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :filesystem)

# Create backup to S3 with specific collections and wait for completion
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :s3,
  include_collections: ["Article", "Author"],
  wait_for_completion: true,
  config: Config.create(compression: :best_compression)
)

# Check backup status
{:ok, status} = WeaviateEx.get_backup_status(client, "daily-backup", :filesystem)
IO.puts("Status: #{status.status}")  # :started, :transferring, :success, etc.

# List all backups
{:ok, backups} = WeaviateEx.list_backups(client, :filesystem)

# Restore a backup
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :filesystem,
  wait_for_completion: true
)

# Restore specific collections only
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :s3,
  include_collections: ["Article"]
)

# Cancel an in-progress backup
:ok = WeaviateEx.cancel_backup(client, "daily-backup", :filesystem)
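When `wait_for_completion` is not used, a caller can poll the status on its own. The sketch below shows a generic polling loop that takes any zero-arity status function, so it runs without a Weaviate server. `BackupPollSketch`, its option names, and the `:failed` status are assumptions; only `:started`, `:transferring`, and `:success` appear in the examples above.

```elixir
defmodule BackupPollSketch do
  # fetch_status: zero-arity function returning {:ok, %{status: atom}} or {:error, reason}.
  def wait(fetch_status, opts \\ []) do
    interval = Keyword.get(opts, :interval_ms, 1_000)
    attempts = Keyword.get(opts, :max_attempts, 60)
    poll(fetch_status, interval, attempts)
  end

  defp poll(_fetch_status, _interval, 0), do: {:error, :timeout}

  defp poll(fetch_status, interval, attempts_left) do
    case fetch_status.() do
      {:ok, %{status: :success} = status} ->
        {:ok, status}

      # :failed is an assumed terminal status for this sketch.
      {:ok, %{status: :failed} = status} ->
        {:error, status}

      {:ok, _in_progress} ->
        Process.sleep(interval)
        poll(fetch_status, interval, attempts_left - 1)

      {:error, _reason} = error ->
        error
    end
  end
end

# Stubbed status source that walks through three states.
{:ok, agent} = Agent.start_link(fn -> [:started, :transferring, :success] end)

fetch = fn ->
  Agent.get_and_update(agent, fn [status | rest] -> {{:ok, %{status: status}}, rest} end)
end

IO.inspect(BackupPollSketch.wait(fetch, interval_ms: 10))
```

With a real client, the status function could wrap the call shown earlier, for example `fn -> WeaviateEx.get_backup_status(client, "daily-backup", :filesystem) end`.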

Storage Backends

| Backend | Description | Configuration |
|---------|-------------|---------------|
| :filesystem | Local filesystem | BACKUP_FILESYSTEM_PATH on server |
| :s3 | Amazon S3 / S3-compatible | Bucket, region, credentials |
| :gcs | Google Cloud Storage | Bucket, project ID, credentials |
| :azure | Azure Blob Storage | Container, connection string |

Compression Options (v0.5.0+)

alias WeaviateEx.Backup.{Config, Compression}

# GZIP compression (default)
Config.create(compression: :default)          # Balanced GZIP
Config.create(compression: :best_speed)       # Fast GZIP
Config.create(compression: :best_compression) # Max GZIP

# ZSTD compression (faster, better ratios)
Config.create(compression: :zstd_default)          # Balanced ZSTD
Config.create(compression: :zstd_best_speed)       # Fast ZSTD
Config.create(compression: :zstd_best_compression) # Max ZSTD

# No compression
Config.create(compression: :no_compression)

# Check compression type
Compression.gzip?(:default)  # => true
Compression.zstd?(:zstd_default)  # => true

RBAC Restore Options (v0.6.0+)

Restore backups with fine-grained control over RBAC data:

alias WeaviateEx.Backup

# Restore with RBAC options
{:ok, status} = Backup.restore(client, "daily-backup", :s3,
  roles_restore: true,          # Restore role definitions
  users_restore: true,          # Restore user assignments
  overwrite_alias: true,        # Overwrite existing aliases
  wait_for_completion: true
)

# Selective RBAC restore - roles only
{:ok, status} = Backup.restore(client, "daily-backup", :filesystem,
  roles_restore: true,
  users_restore: false
)

Location Configuration (Advanced)

Use typed location structs for cloud backend configuration:

alias WeaviateEx.Backup.{Location, Config}

# Filesystem location
fs_loc = Location.filesystem("/var/backups/weaviate")

# S3 location with full configuration
s3_loc = Location.s3("my-bucket", "/backups",
  endpoint: "s3.us-west-2.amazonaws.com",
  region: "us-west-2",
  access_key_id: "...",
  secret_access_key: "...",
  use_ssl: true
)

# GCS location
gcs_loc = Location.gcs("my-bucket", "/backups",
  project_id: "my-project",
  credentials: %{...}
)

# Azure location
azure_loc = Location.azure("my-container", "/backups",
  connection_string: "..."
)

# Use location structs directly in backup operations
{:ok, status} = Backup.create(client, "backup-001", s3_loc,
  include_collections: ["Article"],
  config: Config.create(chunk_size: 128, compression: :zstd_default)
)

# Restore from location struct
{:ok, status} = Backup.restore(client, "backup-001", s3_loc,
  roles_restore: true
)

Collection Aliases (v0.5.0+)

Aliases allow zero-downtime collection updates by providing alternative names:

alias WeaviateEx.API.Aliases

# Create an alias (requires Weaviate v1.32.0+)
{:ok, _} = Aliases.create(client, "articles", "Article_v1")

# List all aliases
{:ok, aliases} = Aliases.list(client)
# => [%Alias{alias: "articles", collection: "Article_v1"}]

# Update alias to point to new collection (blue-green deployment)
{:ok, _} = Aliases.update(client, "articles", "Article_v2")

# Get alias details
{:ok, alias_info} = Aliases.get(client, "articles")
# => %Alias{alias: "articles", collection: "Article_v2"}

# Check if alias exists
{:ok, true} = Aliases.exists?(client, "articles")

# Delete alias (underlying collection remains)
{:ok, true} = Aliases.delete(client, "articles")

Multi-Tenancy

Isolate data per tenant with automatic partitioning:

alias WeaviateEx.API.{Collections, Data, Tenants, VectorConfig}

# Create multi-tenant collection
config = VectorConfig.new("TenantArticle")
  |> VectorConfig.with_multi_tenancy(enabled: true)
  |> VectorConfig.with_properties([
    %{"name" => "title", "dataType" => ["text"]}
  ])

Collections.create(client, config)

# Create tenants
{:ok, created} = Tenants.create(client, "TenantArticle",
  ["CompanyA", "CompanyB", "CompanyC"]
)

# List all tenants
{:ok, tenants} = Tenants.list(client, "TenantArticle")

# Get specific tenant
{:ok, tenant} = Tenants.get(client, "TenantArticle", "CompanyA")

# Check existence
{:ok, true} = Tenants.exists?(client, "TenantArticle", "CompanyA")

# Deactivate tenant (set to COLD storage)
{:ok, _} = Tenants.deactivate(client, "TenantArticle", "CompanyB")

# List only active tenants
{:ok, active} = Tenants.list_active(client, "TenantArticle")

# Activate tenant (set to HOT)
{:ok, _} = Tenants.activate(client, "TenantArticle", "CompanyB")

# Count tenants
{:ok, count} = Tenants.count(client, "TenantArticle")

# Delete tenant
{:ok, _} = Tenants.delete(client, "TenantArticle", "CompanyC")

# Pass the tenant option to scope data operations to a single tenant
{:ok, object} = Data.insert(client, "TenantArticle", data, tenant: "CompanyA")

Fluent with_tenant API (v0.7.4+)

Get a tenant-scoped collection reference for cleaner multi-tenant code:

alias WeaviateEx.{Collections, TenantCollection, Query}

# Get tenant-scoped collection (matches Python client pattern)
tenant_col = Collections.with_tenant(client, "Articles", "tenant_A")

# All operations automatically scoped to tenant_A
{:ok, _} = TenantCollection.insert(tenant_col, %{
  title: "My Article",
  content: "Article content"
})

# Query within tenant
{:ok, results} = tenant_col
  |> TenantCollection.query()
  |> Query.bm25("search term")
  |> Query.execute(client)

# Batch insert within tenant
{:ok, _} = TenantCollection.insert_many(tenant_col, [
  %{title: "Article 1"},
  %{title: "Article 2"}
])

# Get, update, delete operations
{:ok, obj} = TenantCollection.get(tenant_col, uuid)
{:ok, _} = TenantCollection.update(tenant_col, uuid, %{title: "Updated"})
{:ok, _} = TenantCollection.delete(tenant_col, uuid)

Traditional API (still supported)

# Pass tenant as option to each operation
{:ok, _} = Objects.create("Articles", object, tenant: "tenant_A")
{:ok, _} = Query.get("Articles") |> Query.tenant("tenant_A") |> Query.execute(client)

RBAC (Role-Based Access Control)

WeaviateEx provides full RBAC support for managing roles, permissions, users, and groups.

Creating Roles with Permissions

alias WeaviateEx.API.RBAC
alias WeaviateEx.RBAC.Permissions

# Define permissions using the builder API
permissions = [
  Permissions.collections("Article", [:read, :create]),
  Permissions.data("Article", [:read, :create, :update]),
  Permissions.tenants("Article", [:read])
]

# Create a role
{:ok, role} = RBAC.create_role(client, "article-editor", permissions)

# List all roles
{:ok, roles} = RBAC.list_roles(client)

# Check if role has specific permissions
{:ok, true} = RBAC.has_permissions?(client, "article-editor",
  [Permissions.data("Article", :read)]
)

# Add more permissions to a role
:ok = RBAC.add_permissions(client, "article-editor",
  [Permissions.nodes(:verbose)]
)

# Delete a role
:ok = RBAC.delete_role(client, "article-editor")

Role Scope Permissions (v0.6.0+)

Fine-grained permissions with collection/tenant/shard scopes:

alias WeaviateEx.API.RBAC.{Scope, Permission}

# Create scopes for fine-grained access
scope = Scope.collection("Article")
  |> Scope.with_tenants(["tenant-a", "tenant-b"])

# Or use wildcard access
all_scope = Scope.all_collections()

# Build permissions with scopes
permissions = [
  Permission.read_collection("Article"),
  Permission.manage_data("Article"),
  Permission.new(:data, :read, scope: Scope.collection("*")),
  Permission.new(:tenants, :create, scope: scope)
]

# Convenience functions for common patterns
admin_permissions = Permission.admin()  # Full access
viewer_permissions = Permission.viewer()  # Read-only access

Permission Types

| Type | Actions | Description |
|------|---------|-------------|
| collections | create, read, update, delete, manage | Collection schema operations |
| data | create, read, update, delete, manage | Object CRUD operations |
| tenants | create, read, update, delete | Multi-tenancy management |
| roles | create, read, update, delete | Role management |
| users | create, read, update, delete, assign_and_revoke | User management |
| groups | read, assign_and_revoke | OIDC group management |
| cluster | read | Cluster information |
| nodes | read (minimal/verbose) | Node information |
| backups | manage | Backup operations |
| replicate | create, read, update, delete | Replication management |
| alias | create, read, update, delete | Collection aliases |

User Management

alias WeaviateEx.API.Users

# Create a new DB user (returns API key)
{:ok, user} = Users.create(client, "john.doe")
IO.puts("API Key: #{user.api_key}")

# Get user info
{:ok, user} = Users.get(client, "john.doe")

# Get current authenticated user
{:ok, me} = Users.get_my_user(client)

# Assign roles to user
:ok = Users.assign_roles(client, "john.doe", ["article-editor", "viewer"])

# Revoke roles from user
:ok = Users.revoke_roles(client, "john.doe", ["viewer"])

# Get user's assigned roles
{:ok, roles} = Users.get_assigned_roles(client, "john.doe")

# Rotate API key
{:ok, new_key} = Users.rotate_key(client, "john.doe")

# Deactivate/activate user
:ok = Users.deactivate(client, "john.doe")
:ok = Users.activate(client, "john.doe")

# Delete user
:ok = Users.delete(client, "john.doe")

Separate DB and OIDC User Management (v0.6.0+)

For fine-grained control, use the specialized modules:

alias WeaviateEx.API.Users.{DB, OIDC}

# Database-backed users (full lifecycle management)
{:ok, user} = DB.create(client, "db-user")
{:ok, new_key} = DB.rotate_api_key(client, "db-user")
{:ok, _} = DB.delete(client, "db-user")

# OIDC users (managed externally, role assignment only)
{:ok, users} = OIDC.list(client)
{:ok, user} = OIDC.get(client, "oidc-user@example.com")
:ok = OIDC.assign_roles(client, "oidc-user@example.com", ["viewer"])
:ok = OIDC.revoke_roles(client, "oidc-user@example.com", ["admin"])

Group Management

OIDC group management for role assignments:

alias WeaviateEx.API.Groups

# List known OIDC groups
{:ok, groups} = Groups.list_known(client)

# Assign roles to a group
:ok = Groups.assign_roles(client, "engineering", ["developer", "viewer"])

# Get roles assigned to a group
{:ok, roles} = Groups.get_assigned_roles(client, "engineering")

# Revoke roles from a group
:ok = Groups.revoke_roles(client, "engineering", ["admin"])

Examples

WeaviateEx includes 8 runnable examples that demonstrate all major features:

| Example | Description | What You'll Learn |
|---------|-------------|-------------------|
| 01_collections.exs | Collection management | Create, list, get, add properties, delete collections |
| 02_data.exs | CRUD operations | Insert, get, patch, check existence, delete objects |
| 03_filter.exs | Advanced filtering | Equality, comparison, pattern matching, geo, array filters |
| 04_aggregate.exs | Aggregations | Count, statistics, top occurrences, group by |
| 05_vector_config.exs | Vector configuration | HNSW, PQ compression, flat index, distance metrics |
| 06_tenants.exs | Multi-tenancy | Create tenants, activate/deactivate, list, delete |
| 07_batch.exs | Batch API | Bulk create/delete with summaries, query remaining data |
| 08_query.exs | Query builder | BM25 search, filters, near-vector similarity |

Prerequisites

Follow these steps once before running any example:

  1. Start the local stack (full profile with all compose files):

    # from the project root
    mix weaviate.start --version latest
    # or use the helper script
    ./scripts/weaviate-stack.sh start --version latest
    

    To shut everything down afterwards, run mix weaviate.stop --version latest (or ./scripts/weaviate-stack.sh stop).

  2. Confirm the services are healthy (optional but recommended):

    mix weaviate.status

  3. Point the client at the running cluster (avoids repeated configuration warnings):

    export WEAVIATE_URL=http://localhost:8080
    # set WEAVIATE_API_KEY=... as well if your instance requires auth
    

Running Examples

All examples are self-contained and include clean visual output:

# With WEAVIATE_URL exported, run any example
mix run examples/01_collections.exs
mix run examples/02_data.exs
mix run examples/03_filter.exs
# ... etc

# Or run all examples
for example in examples/*.exs; do
  echo "Running $example..."
  mix run "$example"
done

Each example:

  • ✅ Checks Weaviate connectivity before running
  • ✅ Shows the code being executed
  • ✅ Displays formatted results
  • ✅ Cleans up after itself (deletes test data)
  • ✅ Provides clear success/error messages

Supported Weaviate Versions

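| Weaviate Version | Status | Notes |
|------------------|--------|-------|
| 1.35.x | Fully Supported | Latest |
| 1.34.x | Fully Supported | gRPC streaming |
| 1.33.x | Fully Supported | |
| 1.32.x | Fully Supported | |
| 1.31.x | Fully Supported | |
| 1.30.x | Fully Supported | |
| 1.29.x | Fully Supported | |
| 1.28.x | Fully Supported | |
| 1.27.x | Fully Supported | Minimum |
| < 1.27 | Not Tested | |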

Testing is performed against all supported versions in CI.

Testing

WeaviateEx has comprehensive test coverage with two testing modes:

Test Modes

Mock Mode (Default) - Fast, isolated unit tests:

  • ✅ Uses Mox to mock HTTP/Protocol and gRPC responses
  • ✅ No Weaviate instance required
  • ✅ Fast execution (~0.2 seconds)
  • ✅ 2248+ unit tests
  • ✅ Perfect for TDD and CI/CD

Integration Mode - Real Weaviate testing:

  • ✅ Tests against live Weaviate instance
  • ✅ Validates actual API behavior
  • ✅ Requires Weaviate running locally
  • ✅ Run with --include integration flag
  • ✅ 10 integration test suites (collections, objects, batch, query, health, search, filter, aggregate, auth/RBAC, backup)

Running Tests

# Run all unit tests with mocks (default - no Weaviate needed)
mix test

# EASIEST: Run integration tests with automatic Weaviate management
mix weaviate.test              # Starts Weaviate, runs tests, stops Weaviate
mix weaviate.test --keep       # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5    # Test against specific Weaviate version

# MANUAL: Run integration tests with separate Weaviate management
mix weaviate.start             # Start Weaviate containers
mix test --include integration # Run integration tests
mix weaviate.stop              # Stop Weaviate containers

# Or use environment variable
WEAVIATE_INTEGRATION=true mix test --include integration

# Run specific test file
mix test test/weaviate_ex/api/collections_test.exs

# Run specific test by line number
mix test test/weaviate_ex/objects_test.exs:95

# Run with coverage report (basic)
mix test --cover

# Run with coverage report (detailed HTML via excoveralls)
mix coveralls.html
open cover/excoveralls.html

# Run only integration tests
mix test --only integration

# Run specific integration test suites
mix test --only integration test/integration/search_integration_test.exs
mix test --only rbac    # RBAC tests (requires port 8092)
mix test --only backup  # Backup tests (requires port 8093)

Test Structure

test/
├── test_helper.exs           # Test setup, Mox configuration
├── support/
│   ├── factory.ex            # Test data factories
│   ├── mocks.ex              # Mox mock definitions
│   └── integration_case.ex   # Shared integration test module
├── weaviate_ex_test.exs      # Top-level API tests
├── weaviate_ex/
│   ├── api/                  # API module tests (mocked)
│   │   ├── collections_test.exs
│   │   ├── data_test.exs
│   │   ├── aggregate_test.exs
│   │   ├── tenants_test.exs
│   │   └── ...
│   ├── filter_test.exs       # Filter system tests
│   ├── objects_test.exs      # Objects API tests
│   ├── batch_test.exs        # Batch operations tests
│   └── query_test.exs        # Query builder tests
├── integration/              # Integration tests (live Weaviate)
│   ├── collections_integration_test.exs  # Collection CRUD
│   ├── objects_integration_test.exs      # Object CRUD
│   ├── batch_integration_test.exs        # Batch operations
│   ├── query_integration_test.exs        # Query execution
│   ├── health_integration_test.exs       # Health checks
│   ├── search_integration_test.exs       # BM25, near_vector, pagination
│   ├── filter_integration_test.exs       # Filter operators, AND/OR
│   ├── aggregate_integration_test.exs    # Aggregations, group by
│   ├── auth_integration_test.exs         # RBAC, API key auth (port 8092)
│   └── backup_integration_test.exs       # Backup/restore (port 8093)
└── journey/                  # Web framework journey tests
    ├── scenarios.ex          # Shared journey test scenarios
    ├── scenarios_test.exs    # Direct scenario tests
    ├── phoenix_test.exs      # Phoenix endpoint integration
    └── plug_test.exs         # Plug router integration

Integration Test Helper

Use WeaviateEx.IntegrationCase for consistent test setup:

defmodule MyIntegrationTest do
  use WeaviateEx.IntegrationCase  # Auto-configures HTTP client, cleanup

  test "my integration test" do
    # Unique collection names with automatic cleanup
    {name, {:ok, _}} = create_test_collection("MyTest", properties: [...])

    # Or use scoped collections
    with_collection([prefix: "Scoped"], fn name ->
      # Collection exists only within this block
    end)
  end
end

Journey Tests

Journey tests validate WeaviateEx integration with Phoenix and Plug web frameworks. These tests ensure the SDK works correctly when:

  • Initialized at application startup and closed at shutdown
  • Used from both synchronous and asynchronous contexts (different processes)
  • Handling concurrent web requests
  • Managing connection lifecycle within web framework patterns

# Start Weaviate
mix weaviate.start

# Run journey tests
WEAVIATE_INTEGRATION=true mix test --include journey

# Or run all integration tests including journey
WEAVIATE_INTEGRATION=true mix test --include integration --include journey

# Stop Weaviate
mix weaviate.stop

See test/journey/ for Phoenix and Plug integration examples:

  • test/journey/scenarios.ex - Shared journey test scenarios
  • test/journey/scenarios_test.exs - Direct scenario tests
  • test/journey/phoenix_test.exs - Phoenix endpoint integration
  • test/journey/plug_test.exs - Plug router integration

Test Coverage

Current test coverage by module:

  • ✅ Collections API: 17 tests - Create, list, get, exists, delete, add property
  • ✅ Filter System: 80+ tests - All operators, combinators, RefPath, MultiTargetRef, property length
  • ✅ Data Operations: 17 tests - Insert, get, patch, exists, delete with vectors
  • ✅ Objects API: 15+ tests - Full CRUD with pagination
  • ✅ Batch Operations: 35+ tests - Bulk create, delete, error tracking, retry logic
  • ✅ Query System: 60+ tests - GraphQL, near_text, hybrid, BM25, move, rerank, groupBy
  • ✅ Aggregations: 15+ tests - Count, statistics, group by
  • ✅ Tenants: 20+ tests - Multi-tenancy with freeze/offload states
  • ✅ References: 30+ tests - Cross-reference CRUD, multi-target references, QueryReference metadata
  • ✅ Generative AI: 62 tests - All providers, typed configs, result parsing
  • ✅ Vector Config: 15+ tests - HNSW, PQ, flat index, multi-vector
  • ✅ Multi-Vector: 10+ tests - ColBERT, Muvera encoding, Jina vectorizers
  • ✅ gRPC Services: 50+ tests - Channel management, search, batch, aggregate, tenants, health
  • ✅ gRPC Error Handling: 30+ tests - Status code mapping, retryable errors
  • ✅ Generative Search: 25+ tests - Query.Generate, all search types, GraphQL generation
  • ✅ Nested Properties: 25+ tests - Property.Nested struct, serialization, validation
  • ✅ Concurrent Batch: 20+ tests - Parallel insertion, result aggregation
  • ✅ Batch Queue: 25+ tests - Queue operations, failure tracking, re-queue
  • ✅ Rate Limit Detection: 20+ tests - Provider patterns, backoff calculation
  • ✅ Custom Providers: 20+ tests - Custom generative configs, reranker configurations

Total: 2362 tests passing

Mix Tasks

WeaviateEx provides Mix tasks for managing local Weaviate Docker containers:

| Task | Description |
|------|-------------|
| mix weaviate.start | Start Weaviate Docker containers |
| mix weaviate.stop | Stop Weaviate Docker containers |
| mix weaviate.status | Show container status and health check |
| mix weaviate.test | Start Weaviate, run integration tests, stop Weaviate |
| mix weaviate.logs | Show Docker container logs |

# Start Weaviate containers (default version: 1.28.14)
mix weaviate.start
mix weaviate.start --version 1.30.5    # Specific version
mix weaviate.start -v latest           # Latest version

# Check container status and health
mix weaviate.status

# Stop all Weaviate containers
mix weaviate.stop
mix weaviate.stop --keep-data          # Preserve data directory

# Run integration tests (full lifecycle management)
mix weaviate.test                      # Start, test, stop
mix weaviate.test --keep               # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5            # Test against specific version

# View container logs
mix weaviate.logs                      # Show last 100 lines
mix weaviate.logs --tail 50            # Show last 50 lines
mix weaviate.logs --file docker-compose-backup.yml  # Specific compose file
mix weaviate.logs -f --file docker-compose.yml      # Follow logs

The tasks shell out to scripts in ci/ which manage multiple Docker Compose profiles (single node, RBAC, backup, cluster, async, etc.).

Development Tools

Benchmarks

Run performance benchmarks with Benchee:

# Start Weaviate first
mix weaviate.start

# Run all benchmarks
mix weaviate.bench

# Run specific benchmark
mix weaviate.bench batch    # Batch insert performance
mix weaviate.bench query    # Query performance (near_vector, BM25, hybrid)

Results are saved to bench/output/ as HTML files with detailed statistics and charts.

Pre-commit Hooks

Install pre-commit hooks for automatic code quality checks:

# Install pre-commit (Python package)
pip install pre-commit

# Or with Homebrew
brew install pre-commit

# Install hooks
pre-commit install

# Run on all files
pre-commit run --all-files

Hooks automatically run mix format, mix compile --warnings-as-errors, and mix credo --strict before each commit.

Profiling

See guides/profiling.md for profiling techniques using Elixir's built-in tools (fprof, eprof, cprof).

Docker Management

Using the bundled scripts

All Compose profiles live under ci/ (ported from the Python client). The shell scripts manage multiple configurations:

# Start all profiles (single node, modules, RBAC, cluster, async, proxy, backup)
./ci/start_weaviate.sh 1.28.14

# Async-only sandbox for journey tests
./ci/start_weaviate_jt.sh 1.28.14

# Stop all containers
./ci/stop_weaviate.sh

Edit ci/compose.sh to add/remove compose files from the managed set.

Available Docker Compose Profiles

| File | Port(s) | Description |
|------|---------|-------------|
| docker-compose.yml | 8080, 50051 | Primary single-node instance |
| docker-compose-rbac.yml | 8092 | RBAC-enabled instance |
| docker-compose-backup.yml | 8093 | Backup-enabled instance |
| docker-compose-cluster.yml | 8087-8089 | 3-node cluster |
| docker-compose-async.yml | 8090 | Async/journey test instance |
| docker-compose-modules.yml | 8091 | Module-enabled instance |
| docker-compose-proxy.yml | 8094 | Proxy configuration |

Direct Docker Compose commands

# Spawn just the baseline stack
docker compose -f ci/docker-compose.yml up -d

# Inspect the cluster nodes
docker compose -f ci/docker-compose-cluster.yml ps

# Tail logs for the RBAC profile
docker compose -f ci/docker-compose-rbac.yml logs -f

# Remove everything (data included)
docker compose -f ci/docker-compose.yml down -v

Troubleshooting tips

# Confirm Docker is running
docker info

# See which services are up for a given profile
docker compose -f ci/docker-compose-backup.yml ps -a

# Check the ready endpoint of the primary instance
curl http://localhost:8080/v1/.well-known/ready

# Query metadata
curl http://localhost:8080/v1/meta

Authentication

For production or cloud Weaviate instances with authentication:

# Add to .env file (NOT committed to git)
WEAVIATE_URL=https://your-cluster.weaviate.network
WEAVIATE_API_KEY=your-secret-api-key-here

# Or add to ~/.bash_secrets (sourced by ~/.bashrc)
export WEAVIATE_URL=https://your-cluster.weaviate.network
export WEAVIATE_API_KEY=your-secret-api-key-here

Runtime Configuration (Production)

# config/runtime.exs
config :weaviate_ex,
  url: System.fetch_env!("WEAVIATE_URL"),
  api_key: System.fetch_env!("WEAVIATE_API_KEY"),
  strict: true  # Fail fast if unreachable

Development Configuration

# config/dev.exs (NEVER commit production keys!)
config :weaviate_ex,
  url: "http://localhost:8080",
  api_key: nil  # No auth for local development

Client Auth Helpers (API Key / OIDC)

Configure auth directly in the client for per-connection credentials and automatic OIDC refresh:

alias WeaviateEx.Auth

# API key
{:ok, client} =
  WeaviateEx.Client.connect(
    base_url: "https://your-cluster.weaviate.network",
    auth: Auth.api_key("your-secret-api-key")
  )

# OIDC client credentials (auto-refresh)
auth = Auth.client_credentials("client-id", "client-secret", scopes: ["openid", "profile"])

{:ok, client} =
  WeaviateEx.Client.connect(
    base_url: "https://your-cluster.weaviate.network",
    auth: auth
  )

# Skip init checks if needed
{:ok, client} =
  WeaviateEx.Client.connect(
    base_url: "https://your-cluster.weaviate.network",
    auth: auth,
    skip_init_checks: true
  )

OIDC access tokens are refreshed automatically and applied to HTTP headers and gRPC metadata.

Security Best Practices:

  • ✅ Never commit API keys to version control
  • ✅ Use environment variables for production
  • ✅ Add .env to .gitignore (already done)
  • ✅ Use System.fetch_env!/1 to fail fast on missing keys
  • ✅ Store production secrets in secure vaults (e.g., AWS Secrets Manager)
  • ✅ Use different keys for dev/staging/production

Connection Management

Connecting to Weaviate Cloud (v0.7.4+)

WeaviateEx provides full support for Weaviate Cloud Service (WCS) with automatic configuration:

alias WeaviateEx.Connect

# Connect to Weaviate Cloud with API key
config = Connect.to_weaviate_cloud(
  cluster_url: "my-cluster.weaviate.network",
  api_key: "your-wcs-api-key"
)

{:ok, client} = WeaviateEx.Client.connect(
  base_url: config.base_url,
  grpc_host: config.grpc_host,
  grpc_port: config.grpc_port,
  api_key: config.api_key,
  additional_headers: Map.new(config.headers)
)

Automatic WCS Features:

  • gRPC Host Detection: .weaviate.network clusters use {ident}.grpc.{domain} pattern
  • X-Weaviate-Cluster-URL Header: Automatically added for embedding service integration
  • TLS/Port 443: HTTPS and gRPC-TLS enforced for cloud clusters

# Different WCS domains are handled correctly:
Connect.to_weaviate_cloud(cluster_url: "my-cluster.weaviate.network")
# gRPC host: my-cluster.grpc.weaviate.network

Connect.to_weaviate_cloud(cluster_url: "my-cluster.aws.weaviate.cloud")
# gRPC host: grpc-my-cluster.aws.weaviate.cloud
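
The two host patterns above can be illustrated with a small standalone sketch. This is not WeaviateEx's implementation: the module name `GrpcHostSketch` is hypothetical, and treating every domain other than `.weaviate.network` with the `grpc-` prefix is an assumption drawn only from the two examples shown.

```elixir
defmodule GrpcHostSketch do
  @moduledoc """
  Illustrative only: derives a gRPC host from a Weaviate Cloud cluster URL
  using the two patterns described in this section.
  """

  @doc "Returns the gRPC host for a cluster URL given with or without a scheme."
  def grpc_host(cluster_url) do
    host = hostname(cluster_url)

    if String.ends_with?(host, ".weaviate.network") do
      # Pattern 1: {ident}.grpc.{domain}
      [ident, domain] = String.split(host, ".", parts: 2)
      ident <> ".grpc." <> domain
    else
      # Pattern 2 (assumed for all other domains): grpc-{hostname}
      "grpc-" <> host
    end
  end

  # Strip an optional scheme, path, and port, keeping only the hostname.
  defp hostname(cluster_url) do
    cluster_url
    |> String.replace(~r{^https?://}, "")
    |> String.split("/", parts: 2)
    |> hd()
    |> String.split(":", parts: 2)
    |> hd()
  end
end

IO.puts(GrpcHostSketch.grpc_host("my-cluster.weaviate.network"))
# prints my-cluster.grpc.weaviate.network

IO.puts(GrpcHostSketch.grpc_host("https://my-cluster.aws.weaviate.cloud"))
# prints grpc-my-cluster.aws.weaviate.cloud
```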

Server Version Requirements

WeaviateEx requires Weaviate server version 1.27.0 or higher. The client validates the server version on connection.

# Version check happens automatically during connect
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080"
)

# To bypass version checks (not recommended)
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080",
  skip_init_checks: true
)

When connecting to an unsupported version, you'll receive a clear error:

Weaviate server version 1.20.0 is below minimum required 1.27.0
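
A minimum-version check of this kind can be sketched with Elixir's built-in `Version` module. The module name `VersionCheckSketch` and the message for unparseable versions are hypothetical; only the "below minimum" message mirrors the error shown above, and the real client may perform the check differently.

```elixir
defmodule VersionCheckSketch do
  @moduledoc "Illustrative only: compares a reported server version to a minimum."

  @minimum "1.27.0"

  @doc "Returns :ok when the server version meets the minimum, else {:error, message}."
  def check(server_version) do
    case Version.parse(server_version) do
      {:ok, version} ->
        if Version.compare(version, @minimum) == :lt do
          {:error,
           "Weaviate server version #{server_version} is below minimum required #{@minimum}"}
        else
          :ok
        end

      :error ->
        # Hypothetical handling for strings that are not valid semantic versions
        {:error, "Could not parse server version #{inspect(server_version)}"}
    end
  end
end

IO.inspect(VersionCheckSketch.check("1.28.14"))
# => :ok

IO.inspect(VersionCheckSketch.check("1.20.0"))
# => {:error, "Weaviate server version 1.20.0 is below minimum required 1.27.0"}
```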

Connection Pool Configuration (v0.6.0+)

Configure HTTP and gRPC connection pools for optimal performance:

alias WeaviateEx.Client.Pool

# Create custom pool configuration
http_pool = Pool.new(
  size: 20,              # Number of connections in pool
  overflow: 10,          # Maximum overflow connections
  strategy: :lifo,       # Connection selection (:fifo or :lifo)
  timeout: 5000,         # Checkout timeout in ms
  idle_timeout: 60_000,  # Idle connection timeout in ms
  max_age: nil           # Max connection age (nil = no limit)
)

# Use preset configurations
http_pool = Pool.default_http()   # Optimized for HTTP/Finch
grpc_pool = Pool.default_grpc()   # Optimized for gRPC (fewer connections)

# Convert to client options
finch_opts = Pool.to_finch_opts(http_pool)
grpc_opts = Pool.to_grpc_opts(grpc_pool)

Simplified Connection Config (v0.7.0+)

For high-load scenarios, use the new Connection config:

alias WeaviateEx.Config.Connection

# Create connection config with custom settings
config = Connection.new(
  pool_size: 20,           # Connections per pool
  max_connections: 200,    # Maximum total connections
  pool_timeout: 10_000,    # Pool checkout timeout (ms)
  max_idle_time: 60_000    # Max idle time before close (ms)
)

# Use in client creation
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080",
  connection: config
)

# Or pass options directly
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080",
  connection: [pool_size: 20, max_connections: 200]
)

Proxy Configuration (v0.7.3+)

Use proxy settings for HTTP, HTTPS, and gRPC connections:

alias WeaviateEx.Config.Proxy

{:ok, client} =
  WeaviateEx.Client.connect(
    base_url: "https://your-cluster.weaviate.network",
    proxy: Proxy.new(
      http: "http://proxy.example.com:8080",
      https: "https://proxy.example.com:8443",
      grpc: "http://grpc-proxy.example.com:8080"
    )
  )

# Or read from HTTP_PROXY / HTTPS_PROXY / GRPC_PROXY
{:ok, client} =
  WeaviateEx.Client.connect(
    base_url: "https://your-cluster.weaviate.network",
    proxy: :env
  )

HTTP Retry Configuration (v0.7.4+)

WeaviateEx automatically retries failed HTTP requests with exponential backoff and jitter. Retries are triggered for both transport errors (network issues) and transient HTTP status codes.

Retryable errors:

  • Transport: connection refused, reset, timeout, closed, DNS failure
  • HTTP status codes: 408, 429, 500, 502, 503, 504

# Configure retry options when creating a client
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080",
  retry: [
    max_retries: 3,        # Maximum retry attempts (default: 3)
    base_delay_ms: 100,    # Base delay for exponential backoff (default: 100)
    max_delay_ms: 5000     # Maximum delay cap (default: 5000)
  ]
)

# Or override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
  max_retries: 5,
  base_delay_ms: 200,
  max_delay_ms: 10000
)

Backoff strategy:

  • Uses exponential backoff: delay = base_delay_ms × 2^attempt
  • Adds ±10% random jitter to prevent thundering herd
  • Capped at max_delay_ms

Example delays with defaults (base=100ms, max=5000ms):

  • Attempt 0: ~100ms
  • Attempt 1: ~200ms
  • Attempt 2: ~400ms
  • Attempt 3: ~800ms
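
The delay schedule above can be reproduced with a short standalone sketch. The module `BackoffSketch` is hypothetical and not part of WeaviateEx; applying the jitter before the cap is an assumption, since the section lists the three steps without fixing their order.

```elixir
defmodule BackoffSketch do
  @moduledoc "Illustrative only: exponential backoff with jitter and a cap."

  @retryable_statuses [408, 429, 500, 502, 503, 504]

  @doc "True for the HTTP status codes listed as retryable in this section."
  def retryable_status?(status), do: status in @retryable_statuses

  @doc "Un-jittered, uncapped delay: base_delay_ms * 2^attempt."
  def exponential(attempt, base_delay_ms \\ 100) do
    base_delay_ms * Integer.pow(2, attempt)
  end

  @doc "Delay with a random factor in [0.9, 1.1), then capped at max_delay_ms."
  def delay(attempt, base_delay_ms \\ 100, max_delay_ms \\ 5000) do
    factor = 0.9 + :rand.uniform() * 0.2
    jittered = round(exponential(attempt, base_delay_ms) * factor)
    # Assumption: the cap is applied after the jitter
    min(jittered, max_delay_ms)
  end
end

IO.inspect(Enum.map(0..3, &BackoffSketch.exponential/1))
# => [100, 200, 400, 800]

# Jittered values vary from run to run, so no expected output is shown here
IO.inspect(Enum.map(0..3, &BackoffSketch.delay/1))
```

With the defaults, the cap only takes effect from attempt 6 onward (100 × 2^6 = 6400 ms), which is beyond the default of three retries.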

Per-Operation Timeouts (v0.7.4+)

WeaviateEx uses different timeouts based on operation type:

| Operation Type | Default Timeout | Description |
|----------------|-----------------|-------------|
| Query/GET | 30 seconds | Search, read operations |
| Insert/POST | 90 seconds | Write, update operations |
| Batch | 900 seconds | Batch operations (insert × 10) |
| Init | 2 seconds | Connection initialization |

# Configure timeouts in client
{:ok, client} = WeaviateEx.Client.connect(
  base_url: "http://localhost:8080",
  timeout_config: WeaviateEx.Config.Timeout.new(
    query: 60_000,    # 60 seconds for queries
    insert: 180_000,  # 180 seconds for inserts
    init: 5_000       # 5 seconds for init
  )
)

# Override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
  timeout: 60_000  # Explicit timeout override
)

# Specify operation type for automatic timeout selection
{:ok, result} = WeaviateEx.API.Batch.create_objects(client, objects,
  operation: :batch  # Uses extended batch timeout
)
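
As a rough model of how a timeout might be chosen, the sketch below uses the defaults from the table and lets an explicit `:timeout` option win over the per-operation value. `TimeoutSketch` is a hypothetical module, not `WeaviateEx.Config.Timeout`, and deriving the batch timeout from a customized insert timeout is an assumption based on the "insert × 10" note.

```elixir
defmodule TimeoutSketch do
  @moduledoc "Illustrative only: per-operation timeout selection in milliseconds."

  # Defaults taken from the table above
  defstruct query: 30_000, insert: 90_000, init: 2_000

  @doc "Builds a config, overriding defaults with the given keyword list."
  def new(overrides \\ []), do: struct!(__MODULE__, overrides)

  @doc "Returns the explicit :timeout option if given, else the operation default."
  def for_operation(%__MODULE__{} = config, operation, opts \\ []) do
    case Keyword.fetch(opts, :timeout) do
      {:ok, explicit} -> explicit
      :error -> default_for(config, operation)
    end
  end

  defp default_for(config, :query), do: config.query
  defp default_for(config, :insert), do: config.insert
  # Assumption: batch always scales with the configured insert timeout
  defp default_for(config, :batch), do: config.insert * 10
  defp default_for(config, :init), do: config.init
end

config = TimeoutSketch.new(insert: 180_000)

IO.inspect(TimeoutSketch.for_operation(config, :query))
# => 30000

IO.inspect(TimeoutSketch.for_operation(config, :batch))
# => 1800000

IO.inspect(TimeoutSketch.for_operation(config, :query, timeout: 60_000))
# => 60000
```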

Client Lifecycle Management (v0.6.0+)

Manage client connections with explicit lifecycle control:

alias WeaviateEx.Client

# Create and use a client
{:ok, client} = Client.new(base_url: "http://localhost:8080")

# Check client status
Client.status(client)   # => one of :connected | :initializing | :disconnected | :closed

# Check if client is closed
Client.closed?(client)  # => false

# Get client statistics
stats = Client.stats(client)
IO.puts("Requests: #{stats.request_count}")
IO.puts("Errors: #{stats.error_count}")
IO.puts("Created: #{stats.created_at}")

# Close the client when done
:ok = Client.close(client)
Client.closed?(client)  # => true

Resource Management with with_client/2

Automatic client lifecycle management with guaranteed cleanup:

alias WeaviateEx.Client

# with_client ensures cleanup even on errors
result = Client.with_client([base_url: "http://localhost:8080"], fn client ->
  # Use client for operations
  {:ok, meta} = WeaviateEx.health_check(client)
  {:ok, collections} = WeaviateEx.Collections.list(client)

  # Return your result
  {:ok, %{version: meta["version"], collections: length(collections)}}
end)

# Client is automatically closed after the function returns
case result do
  {:ok, data} -> IO.puts("Version: #{data.version}")
  {:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end

# Even if the function raises, the client is closed
try do
  Client.with_client([base_url: "http://localhost:8080"], fn _client ->
    raise "Something went wrong"
  end)
rescue
  e ->
    # The client was already closed before the exception reached this clause
    IO.puts("Caught: #{Exception.message(e)}")
end
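
The guaranteed-cleanup behaviour described here is the usual `try ... after` pattern. The sketch below shows the idea with an `Agent` standing in for a client process; `ScopedResourceSketch` is a hypothetical module and says nothing about how `Client.with_client/2` is actually implemented.

```elixir
defmodule ScopedResourceSketch do
  @moduledoc "Illustrative only: run a function with a resource, always closing it."

  @doc "Starts a stand-in resource, passes it to fun, and stops it afterwards."
  def with_resource(fun) when is_function(fun, 1) do
    # An unlinked Agent plays the role of the client process
    {:ok, pid} = Agent.start(fn -> :open end)

    try do
      fun.(pid)
    after
      # Runs on normal return and when fun raises, throws, or exits
      Agent.stop(pid)
    end
  end
end

# Normal return: the result is passed through and the resource is stopped
pid = ScopedResourceSketch.with_resource(fn pid -> pid end)
IO.inspect(Process.alive?(pid))
# => false

# Raising inside the function still stops the resource before the rescue runs
try do
  ScopedResourceSketch.with_resource(fn pid ->
    send(self(), {:resource, pid})
    raise "Something went wrong"
  end)
rescue
  e in RuntimeError -> IO.puts("Caught: #{Exception.message(e)}")
end

receive do
  {:resource, raised_pid} -> IO.inspect(Process.alive?(raised_pid))
end
# => false
```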

Debug & Troubleshooting

Debug Module (v0.6.0+)

Compare REST and gRPC protocol responses for debugging:

alias WeaviateEx.Debug

# Get an object via REST (HTTP)
{:ok, rest_obj} = Debug.get_object_rest(client, "Article", uuid)

# Optionally target a specific node and consistency level
{:ok, rest_obj} =
  Debug.get_object_rest(client, "Article", uuid,
    node_name: "node-1",
    consistency_level: "ALL"
  )

# Get the same object via gRPC
{:ok, grpc_obj} = Debug.get_object_grpc(client, "Article", uuid)

# Compare both protocols and get a detailed diff
{:ok, comparison} = Debug.compare_protocols(client, "Article", uuid)

# Check comparison results
comparison.match?           # => true or false
comparison.rest_object      # => %{...}
comparison.grpc_object      # => %{...}
comparison.differences      # => [] or list of differences

# Get connection diagnostics
{:ok, info} = Debug.connection_info(client)
IO.puts("HTTP Base URL: #{info.http_base_url}")
IO.puts("gRPC Connected: #{info.grpc_connected}")
IO.puts("gRPC Host: #{info.grpc_host}:#{info.grpc_port}")

Object Comparison

Deep comparison of objects from different sources:

alias WeaviateEx.Debug.ObjectCompare

# Compare two objects
result = ObjectCompare.compare(rest_object, grpc_object)

result.match?        # => true if objects are equivalent
result.differences   # => list of differences found

# Get a formatted diff report
diff_list = ObjectCompare.diff(rest_object, grpc_object)
report = ObjectCompare.format_diff(diff_list)
IO.puts(report)
# Output:
# - properties.title: "REST Title" vs "gRPC Title"
# - _additional.vector: [0.1, 0.2, ...] vs [0.1, 0.2, ...]
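
To make the idea of a path-based deep comparison concrete, here is a small self-contained sketch that walks two nested maps and reports each differing leaf with its dotted path. It is an illustration only: DiffSketch and its functions are hypothetical and are not the ObjectCompare implementation, and this sketch treats a missing key the same as a key set to nil.

```elixir
defmodule DiffSketch do
  @moduledoc "Hypothetical path-based deep diff; not the ObjectCompare implementation."

  # Returns a list of {path, left, right} tuples, one per differing leaf.
  def diff(left, right), do: do_diff(left, right, [])

  # Identical values (including identical sub-maps) produce no differences.
  defp do_diff(same, same, _path), do: []

  # Two maps: recurse into the union of their keys, in sorted order.
  defp do_diff(left, right, path) when is_map(left) and is_map(right) do
    (Map.keys(left) ++ Map.keys(right))
    |> Enum.uniq()
    |> Enum.sort()
    |> Enum.flat_map(fn key ->
      do_diff(Map.get(left, key), Map.get(right, key), path ++ [key])
    end)
  end

  # Anything else that differs is reported as a leaf difference.
  defp do_diff(left, right, path), do: [{Enum.join(path, "."), left, right}]

  # Renders one "- path: left vs right" line per difference.
  def format(diffs) do
    Enum.map_join(diffs, "\n", fn {path, l, r} ->
      "- #{path}: #{inspect(l)} vs #{inspect(r)}"
    end)
  end
end

rest = %{"id" => "abc", "properties" => %{"title" => "REST Title", "views" => 3}}
grpc = %{"id" => "abc", "properties" => %{"title" => "gRPC Title", "views" => 3}}

diffs = DiffSketch.diff(rest, grpc)
IO.puts(DiffSketch.format(diffs))
# Prints:
# - properties.title: "REST Title" vs "gRPC Title"
```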

Request Logging

Log and analyze HTTP/gRPC requests for debugging:

alias WeaviateEx.Debug.RequestLogger

# Start the request logger
{:ok, logger} = RequestLogger.start_link(name: :my_logger)

# Enable logging
RequestLogger.enable(logger)

# Log requests manually or via middleware
RequestLogger.log_request(logger, %{
  method: :get,
  path: "/v1/schema",
  protocol: :http,
  duration_ms: 45,
  status: 200
})

# Get recent logs
logs = RequestLogger.get_logs(logger)
for log <- logs do
  IO.puts("#{log.protocol} #{log.method} #{log.path} - #{log.status} (#{log.duration_ms}ms)")
end

# Filter logs
http_logs = RequestLogger.get_logs(logger, protocol: :http)
slow_logs = RequestLogger.get_logs(logger, min_duration_ms: 100)

# Export logs for analysis
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.json", :json)
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.txt", :text)

# Clear logs
RequestLogger.clear_logs(logger)

# Disable when done
RequestLogger.disable(logger)
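
If you log requests manually, the duration has to be measured somewhere. The sketch below shows one way to time a call with Erlang's :timer.tc/1 and build an entry with the same fields as the map above, plus a plain Enum-based filter equivalent in spirit to the protocol and min_duration_ms options. LogSketch, timed/4, and filter/2 are hypothetical helpers, the callback is assumed to return a {status, body} tuple, and the sample entries are made-up values.

```elixir
defmodule LogSketch do
  @moduledoc "Hypothetical helpers; entry fields mirror the log maps shown above."

  # Runs `fun` (assumed to return {status, body}), measures wall-clock time,
  # and returns {result, log_entry}.
  def timed(method, path, protocol, fun) when is_function(fun, 0) do
    {micros, {status, _body} = result} = :timer.tc(fun)

    entry = %{
      method: method,
      path: path,
      protocol: protocol,
      duration_ms: div(micros, 1000),
      status: status
    }

    {result, entry}
  end

  # Keeps entries that satisfy every given filter.
  def filter(entries, opts) do
    Enum.filter(entries, fn entry ->
      Enum.all?(opts, fn
        {:protocol, protocol} -> entry.protocol == protocol
        {:min_duration_ms, ms} -> entry.duration_ms >= ms
      end)
    end)
  end
end

# A stubbed request that returns immediately with status 200.
{_result, entry} = LogSketch.timed(:get, "/v1/schema", :http, fn -> {200, "{}"} end)

# Illustrative entries only; these are not real measurements.
entries = [
  entry,
  %{method: :post, path: "/v1/batch/objects", protocol: :http, duration_ms: 250, status: 200},
  %{method: :search, path: "sample-grpc-call", protocol: :grpc, duration_ms: 12, status: 0}
]

slow = LogSketch.filter(entries, min_duration_ms: 100)
grpc_only = LogSketch.filter(entries, protocol: :grpc)
IO.inspect({length(slow), length(grpc_only)})
```

An entry built this way could then be passed to RequestLogger.log_request/2 as in the example above.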

Main Module Debug Helpers

Quick access to debug functions from the main module:

# Get object via REST
{:ok, obj} = WeaviateEx.debug_get_rest(client, "Article", uuid)

# Compare protocols
{:ok, comparison} = WeaviateEx.debug_compare(client, "Article", uuid)

Documentation

Building Documentation Locally

# Generate docs
mix docs

# Open in browser (macOS)
open doc/index.html

# Open in browser (Linux)
xdg-open doc/index.html

Development

# Clone the repository
git clone https://github.com/yourusername/weaviate_ex.git
cd weaviate_ex

# Install dependencies
mix deps.get

# Compile
mix compile

# Run unit tests (mocked - fast)
mix test

# Run integration tests (requires live Weaviate)
mix weaviate.start
mix test --include integration

# Generate documentation
mix docs

# Run code analysis
mix credo

# Run type checking (if dialyzer is set up)
mix dialyzer

# Format code
mix format

Project Structure

weaviate_ex/
├── ci/
│   └── weaviate/                   # Docker assets mirrored from Python client
│       ├── compose.sh
│       ├── start_weaviate.sh
│       ├── docker-compose.yml
│       └── docker-compose-*.yml
├── priv/
│   └── protos/v1/                  # Weaviate gRPC proto definitions
│       ├── weaviate.proto
│       ├── batch.proto
│       ├── search_get.proto
│       └── ...
├── lib/
│   ├── weaviate_ex.ex              # Top-level API
│   ├── weaviate_ex/
│   │   ├── embedded.ex             # Embedded binary lifecycle manager
│   │   ├── dev_support/            # Internal tooling (compose helper)
│   │   ├── application.ex          # OTP application
│   │   ├── client.ex               # Client struct & config
│   │   ├── config.ex               # Configuration management
│   │   ├── error.ex                # Error types (HTTP + gRPC)
│   │   ├── filter.ex               # Filter DSL
│   │   ├── api/                    # API modules
│   │   │   ├── collections.ex
│   │   │   ├── data.ex
│   │   │   ├── aggregate.ex
│   │   │   ├── tenants.ex
│   │   │   └── vector_config.ex
│   │   ├── grpc/                   # gRPC infrastructure
│   │   │   ├── channel.ex          # Channel management
│   │   │   ├── services/           # gRPC service clients
│   │   │   │   ├── search.ex
│   │   │   │   ├── batch.ex
│   │   │   │   ├── aggregate.ex
│   │   │   │   ├── tenants.ex
│   │   │   │   └── health.ex
│   │   │   └── generated/v1/       # Proto-generated modules
│   │   └── ...
│   └── mix/
│       └── tasks/
│           ├── weaviate.start.ex
│           ├── weaviate.stop.ex
│           ├── weaviate.status.ex
│           └── weaviate.logs.ex
├── test/                           # Test suite
├── examples/                       # Runnable examples (in source repo)
├── install.sh                      # Legacy single-profile bootstrap
└── mix.exs                         # Project configuration

Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Write tests: All new features should include tests
  4. Run tests: mix test (should pass)
  5. Run integration tests: mix test --include integration, with Weaviate running via mix weaviate.start (optional but recommended)
  6. Run Credo: mix credo (should pass)
  7. Commit changes: git commit -m 'Add amazing feature'
  8. Push to branch: git push origin feature/amazing-feature
  9. Open a Pull Request

CI/CD Pipeline

Pull requests automatically run the following GitHub Actions jobs:

Job                 Description
format-and-lint     Code formatting and Credo linting
unit-tests          2300+ unit tests with Mox mocking + Dialyzer
integration-tests   Integration tests against Weaviate 1.28.14
integration-matrix  Tests against Weaviate 1.27, 1.28, 1.29, 1.30 (master/tags only)

Development Guidelines

  • Write tests first (TDD approach)
  • Maintain test coverage above 90%
  • Follow Elixir style guide
  • Add typespecs for public functions
  • Update documentation for API changes
  • Add examples for new features
  • For API changes, add integration tests in test/integration/
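
For the last guideline, one common ExUnit arrangement is to tag integration tests and exclude that tag by default, so that plain mix test stays fast and mix test --include integration opts in. The sketch below assumes that arrangement; how this project actually configures its test helper is not shown here, and the module names and test bodies are placeholders.

```elixir
# Assumption: the :integration tag is excluded by default and enabled
# with `mix test --include integration`. In a Mix project this call
# would normally live in test/test_helper.exs.
ExUnit.start(exclude: [:integration])

defmodule SampleIntegrationTest do
  use ExUnit.Case, async: false

  # Tags every test in this module as :integration.
  @moduletag :integration

  test "talks to a live Weaviate instance" do
    # A real test would call the client here; this placeholder only
    # demonstrates the tagging mechanism.
    assert true
  end
end

defmodule SampleUnitTest do
  use ExUnit.Case, async: true

  test "runs without a live server" do
    assert 1 + 1 == 2
  end
end
```

Run as a standalone script, this executes only the unit test and reports the tagged test as excluded.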

License

MIT License. See LICENSE for details.

Acknowledgments

  • Built for Weaviate vector database
  • Inspired by official Python and TypeScript clients
  • Uses grpc-elixir for high-performance gRPC operations
  • Uses Finch for HTTP/2 connection pooling (schema operations)
  • Powered by Elixir and the BEAM VM

Questions or Issues? Open an issue on GitHub