README
WeaviateEx
A modern, idiomatic Elixir client for the Weaviate vector database (v1.28+) with full Python client feature parity.
Features
Core Capabilities
- Complete API Coverage - Collections, objects, batch operations, queries, aggregations, cross-references, tenants
- RBAC & User Management - Full role-based access control, user lifecycle management, OIDC groups
- Hybrid Protocol Architecture - gRPC for high-performance data operations, HTTP for schema management
- Type-Safe - Protocol-based architecture with comprehensive typespecs
- Test-First Design - 2600+ tests with Mox-based mocking for fast, isolated testing
- Production-Ready - gRPC persistent channels, Finch HTTP pooling, proper error handling, health checks
- Easy Setup - First-class Mix tasks for managing local Weaviate stacks
Generative AI (RAG) - 20+ Providers
- OpenAI (GPT-4, GPT-3.5, O1/O3 reasoning models)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)
- Cohere, Google Vertex/Gemini, AWS Bedrock/SageMaker
- Mistral, Ollama, XAI (Grok), ContextualAI
- NEW in v0.3: NVIDIA NIM, Databricks, FriendliAI
- Typed provider configurations with full parameter support
- Multimodal generation with image support
Vector Search
- Semantic Search - near_text, near_vector, near_object
- Multimodal Search - near_image (images), near_media (audio, video, thermal, depth, IMU)
- Hybrid Search - Combined keyword + vector with configurable alpha
- BM25 Keyword Search - Full-text search with AND/OR operators
- Reranking - gRPC-based result reranking with Cohere, Transformers, VoyageAI, and more
- Multi-Vector Support - ColBERT-style embeddings with Muvera encoding
- Named Vectors - Multiple vectors per object with targeting strategies
Advanced Features
- Cross-References - Full CRUD for object relationships
- Multi-Tenancy - HOT, COLD, FROZEN, OFFLOADED states
- Batch Operations - Error tracking, retry logic, rate limit handling
- Embedded Mode - Run Weaviate without Docker
- 20+ Vectorizers - OpenAI, Cohere, VoyageAI, Jina, Transformers, Ollama, and more
- gRPC Batch Streaming - High-performance bidirectional streaming (Weaviate 1.34+)
Table of Contents
- Quick Start
- Installation
- Configuration
- Usage
- Embedded Mode
- Health Checks
- Server Version Detection
- Collections (Schema Management)
- Data Operations (CRUD)
- Objects API
- Batch Operations
- Queries & Vector Search
- Multimodal Search
- Aggregations
- Advanced Filtering
- Vector Configuration
- Backup & Restore
- Multi-Tenancy
- RBAC (Role-Based Access Control)
- User Management
- Group Management
- Examples
- Testing
- Mix Tasks
- Docker Management
- Authentication
- Connection Management
- Debug & Troubleshooting
- Documentation
- Contributing
- License
Quick Start
1. Start Weaviate locally
🧰 Prerequisite: Docker Desktop (macOS/Windows) or Docker Engine (Linux)
We ship Docker Compose profiles from the Python client under ci/. Use our Mix tasks to bring everything up:
# Start Weaviate containers (default version: 1.35.0)
mix weaviate.start
# Or specify a version
mix weaviate.start --version 1.35.0
# Inspect running services and health status
mix weaviate.status
The first run downloads the Weaviate Docker image and waits for the /v1/.well-known/ready endpoint to return 200.
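The wait-for-ready behavior described above can be sketched as a simple polling loop. This is a minimal illustration, not the Mix task's actual implementation: `ReadinessSketch` and its `check` function are hypothetical, with `check` standing in for an HTTP GET against /v1/.well-known/ready that returns true on a 200 response.

```elixir
defmodule ReadinessSketch do
  @moduledoc "Hypothetical helper: polls a check function until it succeeds or a deadline passes."

  # `check` is a zero-arity function returning true once the service is ready.
  def wait_until(check, timeout_ms \\ 30_000, interval_ms \\ 1_000)
      when is_function(check, 0) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(check, deadline, interval_ms)
  end

  defp poll(check, deadline, interval_ms) do
    cond do
      check.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        # Not ready yet: sleep, then try again.
        Process.sleep(interval_ms)
        poll(check, deadline, interval_ms)
    end
  end
end
```

Using a monotonic clock for the deadline keeps the loop immune to wall-clock adjustments while it waits.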
When you're done:
mix weaviate.stop
Prefer direct scripts? Use ./ci/start_weaviate.sh 1.35.0 and ./ci/stop_weaviate.sh.
2. Add to Your Project
Add weaviate_ex to your mix.exs dependencies:
def deps do
  [
    {:weaviate_ex, "~> 0.7.4"}
  ]
end
Then fetch dependencies:
mix deps.get
3. Configure
The library automatically reads from environment variables (loaded from .env):
# .env file (created by install.sh)
WEAVIATE_URL=http://localhost:8080
WEAVIATE_API_KEY= # Optional, for authenticated instances
Or configure in your Elixir config files:
# config/config.exs
config :weaviate_ex,
  url: "http://localhost:8080",
  api_key: nil, # Optional
  strict: true # Default: true - fails fast if Weaviate is unreachable
Strict Mode: By default, WeaviateEx validates connectivity on startup. If Weaviate is unreachable, your application won't start. Set strict: false to allow startup anyway (useful for development when Weaviate might not always be running).
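One way to apply this is to relax strict mode only in development. The fragment below is a sketch under that assumption; the choice of environments is illustrative, and only the url and strict keys come from the documentation above.

```elixir
# config/runtime.exs
import Config

# Assumption: fail fast everywhere except local development,
# where Weaviate may not be running yet.
config :weaviate_ex,
  url: System.get_env("WEAVIATE_URL", "http://localhost:8080"),
  strict: config_env() != :dev
```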
4. Verify Connection
The library automatically performs a health check on startup:
[WeaviateEx] Successfully connected to Weaviate
URL: http://localhost:8080
Version: 1.34.0-rc.0
You can also run mix weaviate.status to see every profile that's currently online and the ports they expose.
If configuration is missing, you'll get helpful error messages:
╔══════════════════════════════════════════════════════════════════════╗
║ WeaviateEx Configuration Error                                       ║
╠══════════════════════════════════════════════════════════════════════╣
║ Missing required configuration: WEAVIATE_URL                         ║
║                                                                      ║
║ Please set the Weaviate URL using one of these methods:              ║
║ 1. Environment variable: export WEAVIATE_URL=http://localhost:8080   ║
║ 2. Application configuration (config/config.exs)                     ║
║ 3. Runtime configuration (config/runtime.exs)                        ║
╚══════════════════════════════════════════════════════════════════════╝
5. Shape a Tenant-Aware Collection and Load Data
alias WeaviateEx.{Collections, Objects, Batch}
# Define the collection and toggle multi-tenancy when ready
{:ok, _collection} =
Collections.create("Article", %{
description: "Articles by tenant",
properties: [
%{name: "title", dataType: ["text"]},
%{name: "content", dataType: ["text"]}
]
})
{:ok, %{"enabled" => true}} = Collections.set_multi_tenancy("Article", true)
{:ok, true} = Collections.exists?("Article")
# Create & read tenant-scoped objects with _additional metadata
{:ok, created} =
Objects.create("Article", %{properties: %{title: "Tenant scoped", content: "Hello!"}},
tenant: "tenant-a"
)
{:ok, fetched} =
Objects.get("Article", created["id"],
tenant: "tenant-a",
include: ["_additional", "vector"]
)
# Batch ingest with a summary that separates successes from errors
objects =
Enum.map(1..3, fn idx ->
%{class: "Article", properties: %{title: "Story #{idx}"}, tenant: "tenant-a"}
end)
{:ok, summary} = Batch.create_objects(objects, return_summary: true, tenant: "tenant-a")
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}
Installation
See INSTALL.md for detailed installation instructions covering:
- Docker installation on various platforms
- Manual Weaviate setup
- Configuration options
- Troubleshooting
Configuration
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| WEAVIATE_URL | Yes | - | Full URL to Weaviate (e.g., http://localhost:8080) |
| WEAVIATE_API_KEY | No | - | API key for authentication (for cloud/production) |
Application Configuration
# config/config.exs
config :weaviate_ex,
  url: System.get_env("WEAVIATE_URL", "http://localhost:8080"),
  api_key: System.get_env("WEAVIATE_API_KEY"),
  strict: true, # Fail on startup if unreachable
  timeout: 30_000 # Request timeout in milliseconds
gRPC Configuration
WeaviateEx v0.4.0+ uses a hybrid protocol architecture: gRPC for data operations (queries, batch, aggregations) and HTTP for schema management. gRPC provides significantly better performance for high-throughput operations.
# config/config.exs
config :weaviate_ex,
  url: "http://localhost:8080", # HTTP endpoint for schema operations
  grpc_host: "localhost", # gRPC host (default: derived from url)
  grpc_port: 50051, # gRPC port (default: 50051)
  grpc_max_message_size: 104_857_600, # Max message size in bytes (default: 100MB)
  api_key: nil # Used for both HTTP and gRPC auth
| Variable | Required | Default | Description |
|---|---|---|---|
| grpc_host | No | Derived from url | gRPC endpoint hostname |
| grpc_port | No | 50051 | gRPC port |
| grpc_max_message_size | No | 104857600 | Maximum gRPC message size (100MB) |
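To make "derived from url" concrete, here is a minimal sketch of how a gRPC endpoint could be resolved from the HTTP URL plus optional overrides. `GrpcEndpointSketch` is a hypothetical helper, not part of WeaviateEx, and the library's own derivation may differ.

```elixir
defmodule GrpcEndpointSketch do
  # Hypothetical helper: illustrates the defaults listed in the table above.
  @default_grpc_port 50_051

  # Returns {host, port}. Explicit options win; otherwise the host is taken
  # from the HTTP URL and the port falls back to 50051.
  def derive(url, opts \\ []) when is_binary(url) do
    %URI{host: host} = URI.parse(url)

    {Keyword.get(opts, :grpc_host, host),
     Keyword.get(opts, :grpc_port, @default_grpc_port)}
  end
end
```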
The gRPC connection is automatically established when you create a client:
# Connect with gRPC (automatic)
{:ok, client} = WeaviateEx.Client.connect(
url: "http://localhost:8080",
grpc_port: 50051
)
# Client now has both HTTP and gRPC channels
client.grpc_channel # => gRPC channel for data operations
client.config # => Configuration for HTTP operations
Custom Headers (v0.7.1+)
Add custom headers to all HTTP and gRPC requests for authentication, tracing, or other purposes:
# Configure additional headers in client config
{:ok, client} = WeaviateEx.Client.connect(
url: "http://localhost:8080",
additional_headers: %{
"X-Custom-Header" => "custom-value",
"X-Request-ID" => "trace-123",
"Authorization" => "Bearer custom-token"
}
)
# Headers are automatically included in:
# - All HTTP requests (schema operations, health checks)
# - All gRPC requests as metadata (lowercased keys)
Headers are validated on client creation - nil values will raise an ArgumentError.
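The two behaviors described here (rejecting nil values and lowercasing keys for gRPC metadata) can be sketched as follows. `HeaderSketch` is a hypothetical module written for illustration; it is not the library's internal code, and the error message is an assumption.

```elixir
defmodule HeaderSketch do
  # Hypothetical helper illustrating header validation and gRPC key lowercasing.

  # Raises ArgumentError if any header value is nil; returns the map otherwise.
  def validate!(headers) when is_map(headers) do
    Enum.each(headers, fn
      {key, nil} -> raise ArgumentError, "header #{inspect(key)} has a nil value"
      {_key, _value} -> :ok
    end)

    headers
  end

  # gRPC metadata keys are conventionally lowercase, so downcase each key.
  def to_grpc_metadata(headers) do
    headers
    |> validate!()
    |> Map.new(fn {key, value} -> {String.downcase(key), value} end)
  end
end
```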
gRPC Retry with Exponential Backoff (v0.7.1+)
All gRPC operations automatically retry on transient errors with exponential backoff:
# Retryable gRPC status codes:
# - UNAVAILABLE (14) - Service temporarily unavailable
# - RESOURCE_EXHAUSTED (8) - Rate limiting
# - ABORTED (10) - Transaction aborted
# - DEADLINE_EXCEEDED (4) - Timeout
# Default: 4 retries with exponential backoff
# Attempt 0: 1 second delay
# Attempt 1: 2 seconds
# Attempt 2: 4 seconds
# Attempt 3: 8 seconds
# Maximum delay capped at 32 seconds
# Configure retry behavior (optional)
alias WeaviateEx.GRPC.Retry
# Custom retry with options
result = Retry.with_retry(
fn -> some_grpc_operation() end,
max_retries: 3,
base_delay_ms: 500
)
# Check if error is retryable
Retry.retryable?(%GRPC.RPCError{status: 14}) # => true (UNAVAILABLE)
Retry.retryable?(%GRPC.RPCError{status: 3}) # => false (INVALID_ARGUMENT)
# Calculate backoff delay
Retry.calculate_backoff(0) # => 1000ms
Retry.calculate_backoff(2) # => 4000ms
Retry.calculate_backoff(5) # => 32000ms (capped)
All gRPC services (Search, Batch, Aggregate, Tenants, Health) automatically use retry logic.
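The delay schedule above follows the usual capped exponential formula: delay = min(base * 2^attempt, max). The sketch below reproduces the documented numbers in plain Elixir; `BackoffSketch` is a hypothetical module and makes no claim about how Retry.calculate_backoff is implemented (for instance, whether it adds jitter).

```elixir
defmodule BackoffSketch do
  # Illustrative only: mirrors the documented defaults, not WeaviateEx internals.
  @base_delay_ms 1_000
  @max_delay_ms 32_000

  # Doubles the delay on every attempt and caps it at @max_delay_ms.
  def delay_ms(attempt, base_delay_ms \\ @base_delay_ms)
      when is_integer(attempt) and attempt >= 0 do
    min(base_delay_ms * Integer.pow(2, attempt), @max_delay_ms)
  end
end
```

With the default base, attempts 0 through 3 wait 1, 2, 4, and 8 seconds, and any attempt from 5 onward waits the 32-second maximum.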
Proxy Configuration (v0.5.0+)
WeaviateEx supports HTTP, HTTPS, and gRPC proxy configuration:
alias WeaviateEx.Config.Proxy
# Read from environment variables (HTTP_PROXY, HTTPS_PROXY, GRPC_PROXY)
proxy = Proxy.from_env()
# Or configure explicitly
proxy = Proxy.new(
http: "http://proxy.example.com:8080",
https: "https://proxy.example.com:8443",
grpc: "http://grpc-proxy.example.com:8080"
)
# Check if proxy is configured
Proxy.configured?(proxy) # => true
# Get Finch HTTP client options
Proxy.to_finch_opts(proxy) # => [proxy: {:https, "proxy.example.com", 8443, []}]
# Get gRPC channel options
Proxy.to_grpc_opts(proxy) # => [http_proxy: "http://grpc-proxy.example.com:8080"]
Both the uppercase and lowercase forms of each environment variable are checked (uppercase takes precedence):
- HTTP_PROXY / http_proxy - HTTP proxy URL
- HTTPS_PROXY / https_proxy - HTTPS proxy URL
- GRPC_PROXY / grpc_proxy - gRPC proxy URL
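The precedence rule can be illustrated with a small lookup helper. This is a sketch assuming a case-sensitive environment (as on Linux and macOS); `ProxyEnvSketch` is hypothetical and not how Proxy.from_env is necessarily written.

```elixir
defmodule ProxyEnvSketch do
  # Hypothetical helper: the uppercase name wins, the lowercase name is the fallback.
  def fetch(name) when is_binary(name) do
    System.get_env(String.upcase(name)) || System.get_env(String.downcase(name))
  end
end
```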
Runtime Configuration (Recommended for Production)
# config/runtime.exs
config :weaviate_ex,
  url: System.fetch_env!("WEAVIATE_URL"),
  api_key: System.get_env("WEAVIATE_API_KEY")
Usage
Embedded Mode
Need an ephemeral instance without Docker? WeaviateEx can download and manage the official embedded binary:
# Downloads (once) into ~/.cache/weaviate-embedded and starts the process
{:ok, embedded} =
WeaviateEx.start_embedded(
version: "1.34.0",
port: 8099,
grpc_port: 50155,
persistence_data_path: Path.expand("tmp/weaviate-data"),
environment_variables: %{"DISABLE_TELEMETRY" => "true"}
)
# Talk to it just like any other instance
System.put_env("WEAVIATE_URL", "http://localhost:8099")
{:ok, meta} = WeaviateEx.health_check()
# Always stop the handle when finished
:ok = WeaviateEx.stop_embedded(embedded)
Passing version: "latest" fetches the most recent GitHub release. Binaries are cached, so subsequent calls reuse the download. You can override binary_path/persistence_data_path to control where the executable and data live.
Health Checks
Check if Weaviate is accessible and get version information:
# Get metadata (version, modules)
{:ok, meta} = WeaviateEx.health_check()
# => %{"version" => "1.34.0-rc.0", "modules" => %{}}
# Check readiness (can handle requests) - K8s readiness probe
{:ok, true} = WeaviateEx.ready?()
# Check liveness (service is up) - K8s liveness probe
{:ok, true} = WeaviateEx.alive?()
# With explicit client
{:ok, client} = WeaviateEx.Client.connect(base_url: "http://localhost:8080")
{:ok, true} = WeaviateEx.Health.alive?(client)
{:ok, true} = WeaviateEx.Health.ready?(client)
# Wait for Weaviate to become ready (useful for startup scripts)
:ok = WeaviateEx.Health.wait_until_ready(timeout: 30_000, check_interval: 1000)
# gRPC health ping (v0.7.0+)
alias WeaviateEx.GRPC.Services.Health, as: GRPCHealth
:ok = GRPCHealth.ping(client.grpc_channel)
Kubernetes Integration
The alive? and ready? functions use Weaviate's standard probe endpoints, which map directly onto Kubernetes probes:
- Liveness: /v1/.well-known/live - Is the process running?
- Readiness: /v1/.well-known/ready - Can the service handle traffic?
# Example K8s deployment liveness/readiness probes
livenessProbe:
  httpGet:
    path: /v1/.well-known/live
    port: 8080
readinessProbe:
  httpGet:
    path: /v1/.well-known/ready
    port: 8080
Server Version Detection
Parse and validate Weaviate server versions (v0.7.0+):
alias WeaviateEx.Version
# Parse version strings
{:ok, {1, 28, 0}} = Version.parse("1.28.0")
{:ok, {1, 28, 0}} = Version.parse("v1.28.0-rc1") # Handles v prefix and prerelease
# Check if version meets minimum requirement
true = Version.meets_minimum?({1, 28, 0}, {1, 27, 0})
false = Version.meets_minimum?({1, 26, 0}, {1, 27, 0})
# Validate server version (minimum: 1.27.0)
:ok = Version.validate_server({1, 28, 0})
{:error, {:unsupported_version, {1, 20, 0}, {1, 27, 0}}} = Version.validate_server({1, 20, 0})
# Extract version from meta endpoint response
{:ok, meta} = WeaviateEx.health_check()
{:ok, {1, 28, 0}} = Version.get_server_version(meta)
# Get minimum supported version
Version.minimum_version() # => {1, 27, 0}
Version.minimum_version_string() # => "1.27.0"
# Format version tuple to string
"1.28.0" = Version.format_version({1, 28, 0})
Collections (Schema Management)
Collections define the structure of your data:
# Create a collection with properties
{:ok, collection} = WeaviateEx.Collections.create("Article", %{
description: "News articles",
properties: [
%{name: "title", dataType: ["text"]},
%{name: "content", dataType: ["text"]},
%{name: "publishedAt", dataType: ["date"]},
%{name: "views", dataType: ["int"]}
],
vectorizer: "none" # Use "text2vec-openai" for auto-vectorization
})
# List all collections
{:ok, schema} = WeaviateEx.Collections.list()
# Get a specific collection
{:ok, collection} = WeaviateEx.Collections.get("Article")
# Add a property to existing collection
{:ok, property} = WeaviateEx.Collections.add_property("Article", %{
name: "author",
dataType: ["text"]
})
# Check if collection exists
{:ok, true} = WeaviateEx.Collections.exists?("Article")
# Delete a collection
{:ok, _} = WeaviateEx.Collections.delete("Article")
Object TTL (Time-To-Live)
Automatically expire and delete objects after a specified duration:
alias WeaviateEx.Config.ObjectTTL
# Create collection with 24-hour TTL using human-readable duration
{:ok, _} = WeaviateEx.Collections.create("Events", %{
properties: [%{name: "title", dataType: ["text"]}],
object_ttl: ObjectTTL.from_duration(hours: 24)
})
# Or specify exact seconds with creation time deletion
{:ok, _} = WeaviateEx.Collections.create("Sessions", %{
properties: [%{name: "user_id", dataType: ["text"]}],
object_ttl: ObjectTTL.delete_by_creation_time(3600) # 1 hour
})
# Delete objects based on last update time
{:ok, _} = WeaviateEx.Collections.create("Cache", %{
properties: [%{name: "data", dataType: ["text"]}],
object_ttl: ObjectTTL.delete_by_update_time(86_400, true) # 24h, filter expired
})
# Delete objects based on a custom date property
{:ok, _} = WeaviateEx.Collections.create("Subscriptions", %{
properties: [
%{name: "plan", dataType: ["text"]},
%{name: "expires_at", dataType: ["date"]}
],
object_ttl: ObjectTTL.delete_by_date_property("expires_at")
})
# Update TTL on existing collection
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
ObjectTTL.from_duration(days: 7)
)
# Disable TTL
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
ObjectTTL.disable()
)
Note: Objects are deleted asynchronously in the background. The filter_expired_objects option (second parameter in delete_by_* functions) controls whether expired but not yet deleted objects are excluded from search results.
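The TTL examples above mix human-readable durations (hours: 24, days: 7) with raw second counts (3600, 86_400). As a rough sketch of how the two relate, the hypothetical `DurationSketch` below converts a keyword duration to seconds; the set of units accepted by ObjectTTL.from_duration beyond hours and days is an assumption.

```elixir
defmodule DurationSketch do
  # Hypothetical helper: unit names other than :hours and :days are assumptions.
  @seconds_per_unit %{seconds: 1, minutes: 60, hours: 3_600, days: 86_400}

  # Sums every {unit, amount} pair into a total number of seconds.
  # Raises KeyError for an unknown unit.
  def to_seconds(parts) when is_list(parts) do
    Enum.reduce(parts, 0, fn {unit, amount}, total ->
      total + amount * Map.fetch!(@seconds_per_unit, unit)
    end)
  end
end
```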
Schema helpers for range filters and auto-tenant configuration:
alias WeaviateEx.Config.{AutoTenant, ObjectTTL}
alias WeaviateEx.Schema.MultiTenancyConfig
alias WeaviateEx.Property
ttl = ObjectTTL.delete_by_update_time(86_400, true)
{:ok, _} = WeaviateEx.Collections.create("Session", %{
properties: [
Property.number("expires_in", index_range_filters: true)
],
object_ttl: ttl,
multi_tenancy_config: MultiTenancyConfig.new(enabled: true, auto_tenant_creation: true),
auto_tenant: AutoTenant.enable(auto_delete_timeout: 3_600)
})
Nested Properties
Define complex object structures with nested properties:
alias WeaviateEx.Property
alias WeaviateEx.Property.Nested
# Create a collection with nested object properties
{:ok, _} = WeaviateEx.Collections.create("Product", %{
description: "Products with specifications",
properties: [
%{name: "name", dataType: ["text"]},
%{name: "price", dataType: ["number"]},
# Nested object property
Property.object("specs", [
Nested.new(name: "weight", data_type: :number),
Nested.new(name: "dimensions", data_type: :text),
Nested.new(name: "material", data_type: :text)
]),
# Array of nested objects
Property.object_array("variants", [
Nested.new(name: "color", data_type: :text),
Nested.new(name: "size", data_type: :text),
Nested.new(name: "sku", data_type: :text),
Nested.new(name: "stock", data_type: :int)
])
]
})
# Insert object with nested data
{:ok, product} = WeaviateEx.Objects.create("Product", %{
properties: %{
name: "Laptop Stand",
price: 79.99,
specs: %{
weight: 2.5,
dimensions: "30x25x15cm",
material: "aluminum"
},
variants: [
%{color: "silver", size: "standard", sku: "LS-001", stock: 50},
%{color: "black", size: "large", sku: "LS-002", stock: 30}
]
}
})
# Deeply nested properties (object within object)
{:ok, _} = WeaviateEx.Collections.create("Company", %{
properties: [
%{name: "name", dataType: ["text"]},
Property.object("headquarters", [
Nested.new(name: "city", data_type: :text),
Nested.new(name: "country", data_type: :text),
Nested.new(
name: "address",
data_type: :object,
nested_properties: [
Nested.new(name: "street", data_type: :text),
Nested.new(name: "zip", data_type: :text)
]
)
])
]
})
# Parse nested properties from API response
api_data = %{
"name" => "specs",
"dataType" => ["object"],
"nestedProperties" => [
%{"name" => "weight", "dataType" => ["number"]}
]
}
nested = Nested.from_api(api_data)
Data Operations (CRUD)
Simple CRUD operations with automatic UUID generation:
alias WeaviateEx.API.Data
# Create (insert) a new object
data = %{
properties: %{
"title" => "Hello Weaviate",
"content" => "This is a test article",
"views" => 0
},
vector: [0.1, 0.2, 0.3, 0.4, 0.5] # Optional if using auto-vectorization
}
{:ok, object} = Data.insert(client, "Article", data)
# Named vectors (v0.7.1+) - for collections with multiple vector spaces
data_with_named_vectors = %{
properties: %{"title" => "Multi-vector article"},
vectors: %{
"title_vector" => [0.1, 0.2, 0.3],
"content_vector" => [0.4, 0.5, 0.6, 0.7]
}
}
{:ok, object} = Data.insert(client, "MultiVectorCollection", data_with_named_vectors)
uuid = object["id"]
# Read - get object by ID
{:ok, retrieved} = Data.get_by_id(client, "Article", uuid)
# Update - partial update (PATCH)
{:ok, updated} = Data.patch(client, "Article", uuid, %{
properties: %{"views" => 42},
vector: [0.1, 0.2, 0.3, 0.4, 0.5]
})
# Check if object exists
{:ok, true} = Data.exists?(client, "Article", uuid)
# Delete
{:ok, _} = Data.delete_by_id(client, "Article", uuid)
Collection handles with default tenant/consistency:
collection =
WeaviateEx.Collection.new(client, "Article",
tenant: "tenant-a",
consistency_level: "QUORUM"
)
{:ok, _} = WeaviateEx.Collection.insert(collection, %{properties: %{title: "Tenant scoped"}})
Inline References During Insert (v0.7.1+)
Create objects with references in a single operation:
# Insert object with inline references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{
title: "My Article",
content: "Article content..."
},
# Single reference
references: %{
"hasAuthor" => "author-uuid-here"
}
})
# Multiple references to same property
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Collaborative Article"},
references: %{
"hasAuthors" => ["author-uuid-1", "author-uuid-2", "author-uuid-3"]
}
})
# Multi-target references (pointing to specific collection)
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Related Content"},
references: %{
"relatedTo" => %{
target_collection: "Category",
uuids: "category-uuid"
}
}
})
# Multiple multi-target references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Multi-related"},
references: %{
"mentions" => %{
target_collection: "Person",
uuids: ["person-1", "person-2"]
}
}
})
References are automatically converted to Weaviate beacon format.
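A Weaviate beacon is a URL of the form weaviate://localhost/<Collection>/<uuid>. The sketch below shows roughly what the conversion from a multi-target reference map to beacon entries could look like; `BeaconSketch` is a hypothetical module, and the library's actual conversion may handle more cases.

```elixir
defmodule BeaconSketch do
  # Hypothetical helper: builds beacon URLs and the list-of-beacons payload shape.
  @prefix "weaviate://localhost/"

  def build(uuid, collection) when is_binary(uuid) and is_binary(collection) do
    @prefix <> collection <> "/" <> uuid
  end

  # Accepts a single UUID or a list of UUIDs, as in the inline examples above.
  def to_reference_payload(%{target_collection: collection, uuids: uuids}) do
    uuids
    |> List.wrap()
    |> Enum.map(fn uuid -> %{"beacon" => build(uuid, collection)} end)
  end
end
```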
Reference Operations API (v0.7.3+)
For managing references after object creation, use the References API with full multi-target support:
alias WeaviateEx.API.References
alias WeaviateEx.Data.ReferenceToMulti
alias WeaviateEx.Types.Beacon
# Add a single reference
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", author_uuid)
# Add a multi-target reference using ReferenceToMulti
ref = ReferenceToMulti.new("Person", person_uuid)
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", ref)
# Add multiple references at once
ref = ReferenceToMulti.new("Person", [person1_uuid, person2_uuid])
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthors", ref)
# Replace all references on a property
{:ok, _} = References.replace(client, "Article", article_uuid, "hasAuthors",
[author1_uuid, author2_uuid, author3_uuid]
)
# Replace with multi-target references pointing to different collections
{:ok, _} = References.replace(client, "Article", article_uuid, "relatedTo", [
ReferenceToMulti.new("Person", person_uuid),
ReferenceToMulti.new("Organization", org_uuid)
])
# Delete a reference
{:ok, _} = References.delete(client, "Article", article_uuid, "hasAuthor", author_uuid)
# Batch add references
refs = [
%{from_uuid: "article-1", from_property: "hasAuthor", to_uuid: "author-1"},
%{from_uuid: "article-2", from_property: "hasAuthor", to_uuid: "author-2",
target_collection: "Person"} # For multi-target properties
]
{:ok, _} = References.add_many(client, "Article", refs)
# Parse beacon URLs
parsed = Beacon.parse("weaviate://localhost/Person/uuid-123")
# => %{collection: "Person", uuid: "uuid-123"}
# Build beacon URLs
beacon = Beacon.build("uuid-123", "Person")
# => "weaviate://localhost/Person/uuid-123"
Objects API
Full CRUD operations with explicit UUID control:
# Create with custom UUID
{:ok, object} = WeaviateEx.Objects.create("Article", %{
id: "custom-uuid-here", # Optional
properties: %{
title: "Hello Weaviate",
content: "This is a test article",
publishedAt: "2025-01-15T10:00:00Z"
},
vector: [0.1, 0.2, 0.3] # Optional
})
# Get an object with additional fields
{:ok, object} = WeaviateEx.Objects.get("Article", uuid,
include: "vector,classification"
)
# List objects with pagination
{:ok, result} = WeaviateEx.Objects.list("Article",
limit: 10,
offset: 0,
include: "vector"
)
# Update (full replacement)
{:ok, updated} = WeaviateEx.Objects.update("Article", uuid, %{
properties: %{
title: "Updated Title",
content: "Updated content"
}
})
# Patch (partial update)
{:ok, patched} = WeaviateEx.Objects.patch("Article", uuid, %{
properties: %{title: "New Title"}
})
# Delete
{:ok, _} = WeaviateEx.Objects.delete("Article", uuid)
# Check existence
{:ok, true} = WeaviateEx.Objects.exists?("Article", uuid)
Payload validation happens client-side: properties is required for inserts/updates, and property names id and vector are reserved (raises ArgumentError).
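A minimal sketch of such a client-side check is shown below. `PayloadCheckSketch` is hypothetical; in particular, raising (rather than returning an error tuple) when properties is missing is an assumption, since the text above only states that reserved names raise.

```elixir
defmodule PayloadCheckSketch do
  # Hypothetical validator illustrating the rules described above.
  @reserved ~w(id vector)

  # Accepts atom or string property keys and rejects the reserved names.
  def validate!(%{properties: props} = payload) when is_map(props) do
    Enum.each(Map.keys(props), fn key ->
      if to_string(key) in @reserved do
        raise ArgumentError, "property name #{inspect(key)} is reserved"
      end
    end)

    payload
  end

  # Assumption: a payload without a :properties map is rejected outright.
  def validate!(_payload) do
    raise ArgumentError, "payload must include a :properties map"
  end
end
```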
Complex Data Types
WeaviateEx automatically serializes complex Elixir types when creating or updating objects:
alias WeaviateEx.Types.{GeoCoordinate, PhoneNumber, Blob}
# DateTime - serialized to RFC3339/ISO8601
%{created_at: ~U[2024-01-01 00:00:00Z]}
# -> {"created_at": "2024-01-01T00:00:00Z"}
# Date - serialized as midnight UTC
%{published_date: ~D[2024-06-15]}
# -> {"published_date": "2024-06-15T00:00:00Z"}
# GeoCoordinate - serialized to lat/lon map
{:ok, geo} = GeoCoordinate.new(40.71, -74.00)
%{location: geo}
# -> {"location": {"latitude": 40.71, "longitude": -74.00}}
# PhoneNumber - serialized with input and country
phone = PhoneNumber.new("555-1234", default_country: "US")
%{contact: phone}
# -> {"contact": {"input": "555-1234", "defaultCountry": "US"}}
# Blob (binary data) - base64 encoded
# binary_image_data is an Elixir binary, e.g. the result of File.read!/1
blob = Blob.new(binary_image_data)
%{image: blob}
# -> {"image": "<base64 encoded string>"}
# Nested objects with complex types
{:ok, geo} = GeoCoordinate.new(40.7128, -74.0060)
{:ok, article} = WeaviateEx.Objects.create("Place", %{
properties: %{
name: "Central Park",
location: geo,
created_at: ~U[2024-01-01 00:00:00Z],
metadata: %{
last_visited: ~D[2024-12-25]
}
}
})
Deserializing Responses
Convert Weaviate response data back to rich Elixir types:
alias WeaviateEx.Types.Deserialize
# Parse individual values
{:ok, dt} = Deserialize.deserialize("2024-01-01T00:00:00Z", :date)
# => {:ok, ~U[2024-01-01 00:00:00Z]}
{:ok, geo} = Deserialize.deserialize(
%{"latitude" => 52.37, "longitude" => 4.90},
:geo_coordinates
)
# => {:ok, %GeoCoordinate{latitude: 52.37, longitude: 4.90}}
# Deserialize properties with schema hints
schema = %{"created_at" => :date, "location" => :geo_coordinates}
{:ok, props} = Deserialize.deserialize_properties(raw_props, schema)
# Auto-detect types based on value structure
{:ok, props} = Deserialize.auto_deserialize(response["properties"])
Batch Operations
Efficient bulk operations for importing large datasets:
# Batch create multiple objects
objects = [
%{class: "Article", properties: %{title: "Article 1", content: "Content 1"}},
%{class: "Article", properties: %{title: "Article 2", content: "Content 2"}},
%{class: "Article", properties: %{title: "Article 3", content: "Content 3"}}
]
{:ok, summary} = WeaviateEx.Batch.create_objects(objects, return_summary: true)
# Check rolled-up stats and per-object errors
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}
require Logger
Enum.each(summary.errors, fn error ->
  # Logger.warning/1 replaces the deprecated Logger.warn/1
  Logger.warning("[Batch error] #{error.id} => #{Enum.join(error.messages, "; ")}")
end)
If every object in the batch fails, `Batch.create_objects/2` returns
`{:error, %WeaviateEx.Error{type: :batch_all_failed}}`.
# Batch delete with criteria (WHERE filter)
{:ok, result} = WeaviateEx.Batch.delete_objects(%{
class: "Article",
where: %{
path: ["status"],
operator: "Equal",
valueText: "draft"
}
})
Concurrent Batch Operations
High-throughput parallel batch processing with failure tracking:
alias WeaviateEx.Batch.Concurrent
alias WeaviateEx.Batch.Queue
# Concurrent batch insertion with parallel processing
objects = Enum.map(1..10_000, fn i ->
%{class: "Article", properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)
{:ok, result} = Concurrent.insert_many(client, "Article", objects,
max_concurrency: 8, # Parallel batch requests
batch_size: 200, # Objects per request
ordered: false, # Don't maintain order (faster)
timeout: 60_000 # Timeout per batch
)
# Check results
IO.puts(Concurrent.Result.summary(result))
# => "Inserted 10000/10000 objects in 50 batches (1234ms). Failures: 0, Batch errors: 0"
if Concurrent.Result.all_successful?(result) do
IO.puts("All objects inserted successfully!")
else
IO.puts("Some failures occurred")
for failed <- result.failed do
IO.puts("Failed: #{failed.id} - #{failed.error}")
end
end
# Batch Queue for failure tracking and re-queuing
queue = Queue.new()
# Add objects to queue
queue = Enum.reduce(objects, queue, fn obj, q ->
Queue.enqueue(q, obj)
end)
# Dequeue a batch for processing
{batch, queue} = Queue.dequeue_batch(queue, 100)
# Process batch and mark failures
queue = Enum.reduce(failed_objects, queue, fn {obj, reason}, q ->
Queue.mark_failed(q, obj, reason)
end)
# Re-queue failed objects for retry (with max retry limit)
queue = Queue.requeue_failed(queue, max_retries: 3)
# Get queue statistics
IO.puts("Pending: #{Queue.pending_count(queue)}")
IO.puts("Failed: #{Queue.failed_count(queue)}")
IO.puts("Empty: #{Queue.empty?(queue)}")
# Rate limit detection
alias WeaviateEx.Batch.RateLimit
response = %{status: 429, headers: [{"retry-after", "5"}]}
case RateLimit.detect(response) do
:ok -> IO.puts("No rate limit")
{:rate_limited, wait_ms} ->
IO.puts("Rate limited, wait #{wait_ms}ms")
Process.sleep(wait_ms)
end
# Server queue monitoring for dynamic batch sizing
alias WeaviateEx.API.Cluster
{:ok, stats} = Cluster.batch_stats(client)
IO.puts("Queue length: #{stats.queue_length}")
IO.puts("Rate: #{stats.rate_per_second}/s")
IO.puts("Failed: #{stats.failed_count}")
gRPC Batch Streaming (v0.6.0+)
Bidirectional gRPC streaming for high-throughput batch operations (requires Weaviate 1.34+):
alias WeaviateEx.Batch.Stream
# Create a streaming batch session
{:ok, stream} = Stream.new(client, "Article",
buffer_size: 200, # Objects per batch
flush_interval_ms: 1000, # Auto-flush interval
auto_flush: true # Enable automatic flushing
)
# Add objects to the stream buffer
{:ok, stream} = Stream.add(stream, %{
properties: %{title: "Article 1", content: "Content 1"}
})
{:ok, stream} = Stream.add(stream, %{
properties: %{title: "Article 2", content: "Content 2"}
})
# Manually flush when buffer reaches threshold
{:ok, stream} = Stream.flush(stream)
# Add many objects efficiently
objects = Enum.map(1..1000, fn i ->
%{properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)
{:ok, stream} = Enum.reduce(objects, {:ok, stream}, fn obj, {:ok, s} ->
Stream.add(s, obj)
end)
# Close stream and get final results
{:ok, results} = Stream.close(stream)
# Results include success/failure for each object
Enum.each(results, fn result ->
case result do
%{status: :success, uuid: uuid} ->
IO.puts("Created: #{uuid}")
%{status: :failed, error: error} ->
IO.puts("Failed: #{error}")
end
end)
When the server sends backoff messages, the stream automatically updates its buffer size to the server-provided batch size for subsequent flushes.
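The buffering behavior can be pictured as a size-triggered flush whose threshold the server may lower or raise. The sketch below models that with a plain struct; `BufferSketch` is hypothetical, keeps flushed batches in memory instead of sending them over gRPC, and is not how WeaviateEx.Batch.Stream is implemented.

```elixir
defmodule BufferSketch do
  # Hypothetical model of a size-triggered buffer with server-driven resizing.
  defstruct items: [], size: 200, flushed: []

  def new(size) when is_integer(size) and size > 0, do: %__MODULE__{size: size}

  # Adds an item and flushes as soon as the buffer reaches its size threshold.
  def add(%__MODULE__{} = buf, item) do
    buf = %{buf | items: [item | buf.items]}
    if length(buf.items) >= buf.size, do: flush(buf), else: buf
  end

  # Flushing an empty buffer is a no-op; otherwise record the batch in insert order.
  def flush(%__MODULE__{items: []} = buf), do: buf

  def flush(%__MODULE__{} = buf) do
    %{buf | items: [], flushed: [Enum.reverse(buf.items) | buf.flushed]}
  end

  # Server backoff: adopt the server-suggested batch size for later flushes.
  def apply_backoff(%__MODULE__{} = buf, batch_size)
      when is_integer(batch_size) and batch_size > 0 do
    %{buf | size: batch_size}
  end
end
```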
Low-Level gRPC Streaming
For advanced use cases, access the underlying gRPC stream directly:
alias WeaviateEx.GRPC.Services.BatchStream
# Open a bidirectional stream
{:ok, stream_handle} = BatchStream.open(client.grpc_channel)
# Send objects
:ok = BatchStream.send_objects(stream_handle, [
%{collection: "Article", properties: %{title: "Test"}, uuid: nil, vector: nil}
])
# Send cross-references
:ok = BatchStream.send_references(stream_handle, [
%{from_collection: "Article", from_uuid: "...", to_collection: "Author", to_uuid: "..."}
])
# Receive results
{:ok, results} = BatchStream.receive_results(stream_handle, timeout: 5000)
# Close the stream
:ok = BatchStream.close(stream_handle)
Background Batch Processing (v0.7.0+)
For high-throughput scenarios, use the background batcher for continuous async processing:
alias WeaviateEx.Batch.Background
# Start a background batch processor
{:ok, batcher} = WeaviateEx.Batch.background(client, "Article",
batch_size: 100,
concurrent_requests: 2,
flush_interval: 1000
)
# Add objects asynchronously (non-blocking)
for article <- articles do
:ok = Background.add_object(batcher, %{
title: article.title,
content: article.content
})
end
# Add objects with explicit UUID and vector
:ok = Background.add_object(batcher, %{title: "Test"},
uuid: "550e8400-e29b-41d4-a716-446655440000",
vector: [0.1, 0.2, 0.3]
)
# Add references (automatically ordered after related objects)
:ok = Background.add_reference(batcher, article_uuid, "hasAuthor", author_uuid)
# Force immediate flush
:ok = Background.flush(batcher)
# Get current results
results = Background.get_results(batcher)
IO.puts("Imported #{map_size(results.successful_uuids)} objects")
# Stop and get final results (with flush)
results = Background.stop(batcher, flush: true)
Batch Safety Features (v0.7.4+)
WeaviateEx implements production-grade batch safety for reliable large-scale operations:
Memory Management
# MAX_STORED_RESULTS limit (100,000) prevents memory exhaustion
# Automatic eviction of oldest entries when limit exceeded
alias WeaviateEx.Batch.ErrorTracking.Results
# Check the limit
Results.max_stored_results()
#=> 100_000
# Results automatically evict oldest entries when limit is exceeded
# This prevents unbounded memory growth during large batch operations
Auto-Retry for Failed Objects
alias WeaviateEx.Batch.Dynamic
# Dynamic batcher with auto-retry enabled (default)
{:ok, batcher} = Dynamic.start(
client: client,
auto_retry: true, # Enable automatic retry (default: true)
max_retries: 5, # Maximum retry attempts (default: 3)
retry_delay_ms: 2000, # Base delay for backoff (default: 1000ms)
on_permanent_failure: fn objects ->
Logger.error("Permanent failures: #{length(objects)}")
# Handle objects that exceeded max_retries
end
)
# Add objects - failed objects are automatically re-queued
Dynamic.add_object(batcher, "Article", %{title: "Test"})
# Retryable errors include:
# - Rate limit errors (429, "rate limit exceeded", etc.)
# - Transient gRPC errors (UNAVAILABLE, RESOURCE_EXHAUSTED, ABORTED, DEADLINE_EXCEEDED)
RetryQueue for Manual Control
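The `max_retries` and delay options used for retries imply a schedule of waits between attempts. The exact policy is not stated here, so the following is only a sketch that assumes the delay doubles on each attempt up to a cap. `BackoffSketch`, `delay_ms/3`, `schedule/2`, and the 30-second cap are hypothetical and not part of WeaviateEx.

```elixir
defmodule BackoffSketch do
  # Delay before retry number `attempt` (1-based): the base delay doubles
  # on every attempt and never exceeds `max_ms` (assumed cap).
  def delay_ms(attempt, base_ms, max_ms \\ 30_000)
      when is_integer(attempt) and attempt >= 1 do
    min(base_ms * Integer.pow(2, attempt - 1), max_ms)
  end

  # Full list of delays for `max_retries` attempts (empty when it is 0).
  def schedule(max_retries, base_ms) do
    Enum.map(1..max_retries//1, &delay_ms(&1, base_ms))
  end
end

IO.inspect(BackoffSketch.schedule(3, 1000))
# prints [1000, 2000, 4000]
```

Under these assumptions, three retries with a 1000 ms base delay wait roughly seven seconds in total before an object would be reported as a permanent failure.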
alias WeaviateEx.Batch.RetryQueue
# Start a retry queue for manual control
{:ok, retry_queue} = RetryQueue.start_link(
client: client,
max_retries: 3,
base_delay_ms: 1000,
on_permanent_failure: fn objects ->
Logger.error("Failed after max retries: #{length(objects)}")
end
)
# Enqueue failed objects for retry
:ok = RetryQueue.enqueue_failed(retry_queue, failed_objects)
# Check retry count for a specific object
count = RetryQueue.get_retry_count(retry_queue, "uuid-123")
# Drain all queued objects for manual processing
{:ok, objects} = RetryQueue.drain(retry_queue)
# Clear the queue
:ok = RetryQueue.clear(retry_queue)
Configurable Batch Options
alias WeaviateEx.Batch.Config
# Create a batch configuration
config = Config.new(
max_stored_results: 50_000, # Custom limit
auto_retry: true,
max_retries: 5,
retry_delay_ms: 2000,
on_permanent_failure: fn objects ->
Logger.error("Failed: #{length(objects)}")
end
)
# Access configuration values
Config.auto_retry_enabled?(config) #=> true
Config.default_max_retries() #=> 3
Queries & Vector Search
Powerful query capabilities with semantic search:
alias WeaviateEx.Query
# Simple query with field selection
query = Query.get("Article")
|> Query.fields(["title", "content", "publishedAt"])
|> Query.limit(10)
{:ok, results} = Query.execute(query)
# Semantic search with near_text (requires vectorizer)
query = Query.get("Article")
|> Query.near_text("artificial intelligence", certainty: 0.7)
|> Query.fields(["title", "content"])
|> Query.additional(["certainty", "distance"])
|> Query.limit(5)
{:ok, results} = Query.execute(query)
# Vector search with custom vectors
query = Query.get("Article")
|> Query.near_vector([0.1, 0.2, 0.3], certainty: 0.8)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query)
# Hybrid search (combines keyword + vector)
query = Query.get("Article")
|> Query.hybrid("machine learning", alpha: 0.5) # alpha: 0=keyword, 1=vector
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# BM25 keyword search
query = Query.get("Article")
|> Query.bm25("elixir programming")
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# Semantic direction with Move (v0.5.0+)
query = Query.get("Article")
|> Query.near_text("technology",
move_to: [concepts: ["artificial intelligence", "machine learning"], force: 0.8],
move_away: [concepts: ["politics", "sports"], force: 0.5]
)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# Queries with filters (WHERE clause)
query = Query.get("Article")
|> Query.where(%{
path: ["publishedAt"],
operator: "GreaterThan",
valueDate: "2025-01-01T00:00:00Z"
})
|> Query.fields(["title", "publishedAt"])
|> Query.sort([%{path: ["publishedAt"], order: "desc"}])
{:ok, results} = Query.execute(query)
Fetch Objects by IDs
alias WeaviateEx.API.Data
ids = [
"550e8400-e29b-41d4-a716-446655440001",
"550e8400-e29b-41d4-a716-446655440002"
]
{:ok, objects} = Data.fetch_objects_by_ids(client, "Article", ids,
return_properties: ["title", "content"]
)
# Results preserve the input ID order.
# Using the Objects module (no client needed)
{:ok, objects} = WeaviateEx.Objects.fetch_objects_by_ids("Article", ids,
return_properties: ["title", "content"]
)
gRPC vs GraphQL
When you pass a WeaviateEx.Client, Query.execute/2 uses gRPC and now supports
filters, group_by, target vectors, near_image/near_media, references, vector metadata,
reranking, and generative search (RAG). If a query includes options not yet supported in gRPC
(for example sorting or cursor pagination), it automatically falls back to GraphQL.
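The fallback rule in this paragraph can be read as a simple decision over the options a query carries. The sketch below only illustrates that rule and is not how `Query.execute/2` is implemented. `TransportSketch`, `choose/1`, and the option keys `:sort` and `:after` are assumptions made for the example.

```elixir
defmodule TransportSketch do
  # Option keys assumed to stand for the features named above as
  # GraphQL-only (sorting and cursor pagination).
  @graphql_only [:sort, :after]

  # Pick :graphql as soon as any GraphQL-only option is present.
  def choose(query_opts) when is_list(query_opts) do
    if Enum.any?(@graphql_only, &Keyword.has_key?(query_opts, &1)) do
      :graphql
    else
      :grpc
    end
  end
end

IO.inspect(TransportSketch.choose(limit: 10, where: %{}))
# prints :grpc
IO.inspect(TransportSketch.choose(limit: 10, sort: [%{path: ["publishedAt"], order: "desc"}]))
# prints :graphql
```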
Reranking
Improve search result relevance using reranker models:
alias WeaviateEx.Query
alias WeaviateEx.Query.Rerank
# Basic reranking - re-scores results using the "content" property
rerank = Rerank.new("content")
{:ok, results} = Query.get("Article")
|> Query.near_text("machine learning")
|> Query.fields(["title", "content"])
|> Query.limit(10)
|> Query.rerank(rerank)
|> Query.execute(client)
# With custom rerank query (different from search query)
rerank = Rerank.new("content", query: "latest AI applications in healthcare")
{:ok, results} = Query.get("Article")
|> Query.hybrid("AI trends", alpha: 0.5)
|> Query.fields(["title", "content"])
|> Query.rerank(rerank)
|> Query.execute(client)
# Access rerank scores in results
for result <- results do
score = result["_additional"]["rerankScore"]
IO.puts("Rerank score: #{score}")
end
Note: Requires a reranker module configured on the collection. See
WeaviateEx.API.RerankerConfig for available rerankers: cohere, transformers,
voyageai, jinaai, nvidia, contextualai.
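Once rerank scores are present, results can be ordered by them on the client side. The sketch below works on plain maps shaped like the `result["_additional"]["rerankScore"]` access shown earlier. The sample data and `RerankSketch` are hypothetical, and dropping entries without a numeric score is a choice made for this example, not documented library behavior.

```elixir
defmodule RerankSketch do
  @score_path ["_additional", "rerankScore"]

  # Keep only results that carry a numeric rerank score, highest first.
  def by_score(results) do
    results
    |> Enum.filter(&is_number(get_in(&1, @score_path)))
    |> Enum.sort_by(&get_in(&1, @score_path), :desc)
  end
end

# Hypothetical result maps for illustration only.
results = [
  %{"title" => "A", "_additional" => %{"rerankScore" => 0.42}},
  %{"title" => "B", "_additional" => %{"rerankScore" => 0.91}},
  %{"title" => "C", "_additional" => %{}}
]

results
|> RerankSketch.by_score()
|> Enum.map(& &1["title"])
|> IO.inspect()
# prints ["B", "A"]
```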
gRPC Generative Search (v0.7.4+)
Generative queries now use gRPC for improved performance (~2-3x lower latency):
alias WeaviateEx.GRPC.Services.Search
alias WeaviateEx.Query.GenerativeResult
# Build a search request with generative config
request = Search.build_near_text_request("Article", "machine learning",
limit: 5,
return_properties: ["title", "content"],
generative: %{
single_prompt: "Summarize this article: {content}",
provider: :openai,
model: "gpt-4",
temperature: 0.7
}
)
# Execute the search
{:ok, reply} = Search.execute(channel, request)
# Parse the generative results
result = GenerativeResult.from_grpc_response(reply)
# Access per-object generations
for gen <- result.generated_per_object do
IO.puts("Generated: #{gen}")
end
# Grouped generation
request = Search.build_near_text_request("Article", "AI trends",
generative: %{
grouped_task: "Synthesize the key themes from these articles",
grouped_properties: ["title", "content"],
provider: :anthropic,
model: "claude-3-5-sonnet-20241022"
}
)
{:ok, reply} = Search.execute(channel, request)
result = GenerativeResult.from_grpc_response(reply)
IO.puts("Grouped summary: #{result.generated}")
Supported providers: :openai, :anthropic, :cohere, :mistral, :ollama,
:google, :aws, :databricks, :friendliai, :nvidia, :xai, :contextualai, :anyscale.
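A generative config map can be checked against this provider list before a request is built. The sketch below is a standalone guard and not a WeaviateEx function. `ProviderSketch` and `validate/1` are hypothetical, and the provider atoms are copied from the list above.

```elixir
defmodule ProviderSketch do
  # Provider atoms taken from the list of supported providers.
  @supported ~w(openai anthropic cohere mistral ollama google aws
                databricks friendliai nvidia xai contextualai anyscale)a

  # Accept a config map only when it names a supported provider.
  def validate(%{provider: provider} = config) when provider in @supported,
    do: {:ok, config}

  def validate(%{provider: provider}),
    do: {:error, {:unsupported_provider, provider}}

  def validate(_config), do: {:error, :missing_provider}
end

IO.inspect(ProviderSketch.validate(%{provider: :openai, model: "gpt-4"}))
# prints {:ok, %{provider: :openai, model: "gpt-4"}} (key order may vary)
IO.inspect(ProviderSketch.validate(%{provider: :unknown}))
# prints {:error, {:unsupported_provider, :unknown}}
```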
Multi-Vector Collections (v0.7.0+)
Query collections with multiple named vectors:
alias WeaviateEx.Query
alias WeaviateEx.Query.TargetVectors
# Single target vector
query = Query.get("MultiVectorCollection")
|> Query.near_text("search term", target_vectors: "content_vector")
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
# Combined vectors with average method
target = TargetVectors.combine(["title_vector", "content_vector"], method: :average)
query = Query.get("MultiVectorCollection")
|> Query.near_vector(embedding, target_vectors: target)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
# Weighted combination
target = TargetVectors.weighted(%{
"title_vector" => 0.7,
"content_vector" => 0.3
})
query = Query.get("MultiVectorCollection")
|> Query.near_text("search", target_vectors: target)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
Updating Named Vector Configuration (v0.7.0+)
Update existing named vector index settings and quantization:
alias WeaviateEx.API.NamedVectors
# Update vector index parameters
update = NamedVectors.update_config("title_vector",
vector_index: [
ef: 200,
dynamic_ef_min: 100,
dynamic_ef_max: 500,
dynamic_ef_factor: 8,
flat_search_cutoff: 40000
]
)
# Update with quantization settings
update = NamedVectors.update_config("content_vector",
vector_index: [ef: 150],
quantizer: [
type: :pq,
segments: 128,
centroids: 256,
training_limit: 100000
]
)
# Build update config for multiple vectors at once
updates = NamedVectors.build_update_config([
{"title_vector", [vector_index: [ef: 200]]},
{"content_vector", [quantizer: [type: :sq, rescore_limit: 200]]}
])
# Convert to API format
api_config = NamedVectors.update_to_api(update)
Advanced Hybrid Search (v0.7.0+)
Use HybridVector for sophisticated hybrid queries with Move operations:
alias WeaviateEx.Query
alias WeaviateEx.Query.{HybridVector, Move}
# Text sub-search with Move operations
hv = HybridVector.near_text("machine learning",
move_to: Move.to(0.5, concepts: ["AI", "neural networks"]),
move_away_from: Move.to(0.3, concepts: ["biology"])
)
query = Query.get("Article")
|> Query.hybrid("search term", vector: hv, alpha: 0.7)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
# Vector sub-search with target vectors
hv = HybridVector.near_vector(embedding, target_vectors: "content_vector")
query = Query.get("Article")
|> Query.hybrid("search", vector: hv, fusion_type: :relative_score)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
Multimodal Search
Search using images, audio, video, and other media types (v0.7.0+):
Image Search (near_image)
Search collections using image data with multi2vec-clip, multi2vec-bind, or other image vectorizers:
alias WeaviateEx.Query
alias WeaviateEx.Query.NearImage
# Search by base64 encoded image
query = Query.get("ImageCollection")
|> Query.near_image(image: base64_image_data, certainty: 0.8)
|> Query.fields(["name", "description"])
|> Query.limit(10)
{:ok, results} = Query.execute(query, client)
# Search by image file path
query = Query.get("ImageCollection")
|> Query.near_image(image_file: "/path/to/image.png", distance: 0.3)
|> Query.fields(["name"])
{:ok, results} = Query.execute(query, client)
# With named vectors (for collections with multiple vector spaces)
query = Query.get("MultiVectorCollection")
|> Query.near_image(
image: base64_data,
certainty: 0.7,
target_vectors: ["image_vector", "clip_vector"]
)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
# Using NearImage directly
near_image = NearImage.new(image: base64_data, certainty: 0.8)
NearImage.to_graphql(near_image) # => %{"image" => "...", "certainty" => 0.8}
NearImage.to_grpc(near_image) # => %{image: "...", certainty: 0.8}
# Encode image file to base64
base64_data = NearImage.encode_image_file("/path/to/image.jpg")
Media Search (near_media)
Search using audio, video, thermal, depth, or IMU data with multi2vec-bind:
alias WeaviateEx.Query
alias WeaviateEx.Query.NearMedia
# Search by audio
query = Query.get("MediaCollection")
|> Query.near_media(:audio, media: base64_audio, certainty: 0.7)
|> Query.fields(["name", "transcript"])
|> Query.limit(5)
{:ok, results} = Query.execute(query, client)
# Search by video file
query = Query.get("MediaCollection")
|> Query.near_media(:video, media_file: "/path/to/video.mp4", distance: 0.3)
|> Query.fields(["title", "duration"])
{:ok, results} = Query.execute(query, client)
# Search by thermal imaging data
query = Query.get("SensorData")
|> Query.near_media(:thermal, media: base64_thermal, certainty: 0.8)
|> Query.fields(["timestamp", "location"])
{:ok, results} = Query.execute(query, client)
# Supported media types
NearMedia.media_types() # => [:audio, :video, :thermal, :depth, :imu]
# Using NearMedia directly
near_media = NearMedia.new(:audio, media: base64_audio, certainty: 0.7)
NearMedia.to_graphql(near_media) # => %{"media" => "...", "type" => "audio", "certainty" => 0.7}
NearMedia.to_grpc(near_media) # => %{media: "...", type: :MEDIA_TYPE_AUDIO, certainty: 0.7}
# With target vectors for named vectors
near_media = NearMedia.new(:depth,
media: base64_depth_data,
target_vectors: ["depth_vector"]
)
Convenience Methods (v0.8.0+)
For a simpler Python-like API, use the convenience methods that automatically handle file paths, base64 data, and raw binary input:
alias WeaviateEx.Query
# Search by image - accepts file path, base64, or binary
{:ok, results} = Query.get("Products")
|> Query.with_near_image("/path/to/image.jpg")
|> Query.limit(10)
|> Query.execute(client)
# Search by base64 image data
{:ok, results} = Query.get("Products")
|> Query.with_near_image(base64_image_data, certainty: 0.8)
|> Query.execute(client)
# Search by audio
{:ok, results} = Query.get("Podcasts")
|> Query.with_near_audio("/path/to/clip.mp3")
|> Query.execute(client)
# Search by video
{:ok, results} = Query.get("Videos")
|> Query.with_near_video("/path/to/clip.mp4")
|> Query.execute(client)
# Search by other media types
{:ok, results} = Query.get("SensorData")
|> Query.with_near_thermal(thermal_data)
|> Query.execute(client)
{:ok, results} = Query.get("DepthMaps")
|> Query.with_near_depth(depth_data, distance: 0.3)
|> Query.execute(client)
{:ok, results} = Query.get("MotionData")
|> Query.with_near_imu(imu_data)
|> Query.execute(client)
# Generic method for any media type
{:ok, results} = Query.get("Products")
|> Query.with_near_media(:image, "/path/to/image.jpg", certainty: 0.8)
|> Query.execute(client)
Convenience method options:
- :certainty - Minimum certainty threshold (0.0 to 1.0)
- :distance - Maximum distance threshold
- :target_vectors - Target vectors for multi-vector collections
Supported modalities: image, audio, video, thermal, depth, imu
Note: Requires a multi-modal vectorizer (e.g., multi2vec-clip for images,
multi2vec-bind for audio/video).
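The convenience methods are described as accepting a file path, base64 data, or raw binary. One way such input could be normalized to base64 is sketched below. This is a heuristic written for illustration, not the library's detection logic: `MediaInputSketch` and `to_base64/1` are hypothetical, and a short plain string that happens to be valid base64 would be passed through unchanged.

```elixir
defmodule MediaInputSketch do
  # Normalize input to base64:
  #   1. an existing regular file -> read it and encode its bytes
  #   2. a string that already decodes as base64 -> return it unchanged
  #   3. anything else -> treat as raw bytes and encode
  def to_base64(input) when is_binary(input) do
    cond do
      String.valid?(input) and File.regular?(input) ->
        with {:ok, bytes} <- File.read(input), do: {:ok, Base.encode64(bytes)}

      match?({:ok, _}, Base.decode64(input)) ->
        {:ok, input}

      true ->
        {:ok, Base.encode64(input)}
    end
  end
end

# Raw bytes (the first three bytes of a JPEG header) are encoded.
IO.inspect(MediaInputSketch.to_base64(<<255, 216, 255>>))
# prints {:ok, "/9j/"}

# Data that is already base64 is returned as-is.
IO.inspect(MediaInputSketch.to_base64("aGVsbG8="))
# prints {:ok, "aGVsbG8="}
```

Checking for a file first keeps paths from being mistaken for data. The base64 check is ambiguous by nature, so callers with raw text payloads may prefer an explicit option over auto-detection.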
Media Type Reference
| Type | Description | Use Case |
|---|---|---|
| :audio | Audio files (wav, mp3, etc.) | Voice search, audio similarity |
| :video | Video files (mp4, avi, etc.) | Video content matching |
| :thermal | Thermal imaging data | Industrial inspection, security |
| :depth | Depth sensor data | 3D object recognition |
| :imu | Inertial measurement unit data | Motion/gesture recognition |
Generative Search (RAG)
Combine search with AI generation for retrieval-augmented generation:
alias WeaviateEx.Query.Generate
# Single-object generation - generate for each result
query = Generate.new("Article")
|> Generate.near_text("artificial intelligence")
|> Generate.single("Summarize this article in one sentence: {title}")
|> Generate.return_properties(["title", "content"])
|> Generate.limit(5)
{:ok, result} = Generate.execute(query, client)
# Access generated content per object
for obj <- result.objects do
IO.puts("Title: #{obj["title"]}")
IO.puts("Generated: #{obj["_additional"]["generate"]["singleResult"]}")
end
# Grouped generation - generate once for all results combined
query = Generate.new("Article")
|> Generate.bm25("machine learning")
|> Generate.grouped("Based on these articles, what are the main trends?",
properties: ["title", "content"])
|> Generate.return_properties(["title"])
|> Generate.limit(10)
{:ok, result} = Generate.execute(query, client)
IO.puts("Combined insight: #{result.generated}")
# Hybrid search with generation
query = Generate.new("Article")
|> Generate.hybrid("neural networks", alpha: 0.7)
|> Generate.single("Extract key points from: {content}")
|> Generate.return_properties(["title", "content"])
{:ok, result} = Generate.execute(query, client)
# Convert existing Query to generative query
query = Query.get("Article")
|> Query.near_text("climate change")
|> Query.fields(["title", "content"])
|> Query.limit(5)
gen_query = Query.generate(query, :single, "Summarize: {content}")
{:ok, result} = Generate.execute(gen_query, client)
Query References (v0.7.0+)
Query cross-references with multi-target support and metadata:
alias WeaviateEx.Query.QueryReference
# Basic reference query
ref = QueryReference.new("hasAuthor", return_properties: ["name", "email"])
# Multi-target reference query (for references pointing to multiple collections)
ref = QueryReference.multi_target("relatedTo", "Article",
return_properties: ["title", "publishedAt"]
)
# Check if reference is multi-target
QueryReference.multi_target?(ref) # => true
# Request metadata in referenced objects
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: [:uuid, :distance, :certainty]
)
# Use metadata presets
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: :full # All available metadata
)
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: :common # uuid, distance, certainty, creation_time
)
# Use in queries
query = Query.get("Article")
|> Query.fields(["title", "content"])
|> Query.reference(ref)
Aggregations
Statistical analysis over your data:
alias WeaviateEx.API.Aggregate
alias WeaviateEx.Aggregate.Metrics
# Count all objects
{:ok, result} = Aggregate.over_all(client, "Product", metrics: [:count])
# Numeric aggregations (mean, sum, min, max)
{:ok, stats} = Aggregate.over_all(client, "Product",
properties: [{:price, [:mean, :sum, :maximum, :minimum, :count]}]
)
# Top occurrences for text fields
{:ok, categories} = Aggregate.over_all(client, "Product",
properties: [{:category, [:topOccurrences], limit: 10}]
)
# Group by with aggregations
{:ok, grouped} = Aggregate.group_by(client, "Product", "category",
metrics: [:count],
properties: [{:price, [:mean, :maximum, :minimum]}]
)
Near Object Aggregation
Aggregate objects similar to a reference object:
# Aggregate objects near a reference UUID
{:ok, result} = Aggregate.with_near_object(client, "Articles", reference_uuid,
distance: 0.5,
metrics: [:count],
properties: [
{:views, [:mean, :sum]},
{:category, [:topOccurrences], limit: 5}
]
)
IO.inspect(result) # %{"meta" => %{"count" => 42}, "views" => %{"mean" => 1250.5, "sum" => 52521}}
Hybrid Aggregation
Aggregate with combined keyword and vector search:
# Hybrid search aggregation (balanced keyword + vector)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "electronics",
alpha: 0.5, # 50% vector, 50% keyword (default)
metrics: [:count],
properties: [
{:price, [:sum, :mean, :minimum, :maximum]}
]
)
# Pure keyword search aggregation (alpha = 0)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "laptop",
alpha: 0.0,
fusion_type: :ranked,
metrics: [:count]
)
# Vector-weighted search aggregation (alpha = 0.8)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "portable computer",
alpha: 0.8,
fusion_type: :relative_score,
properties: [{:category, [:topOccurrences], limit: 3}]
)
Using the Metrics Helper
Build metrics specifications with the helper module:
alias WeaviateEx.Aggregate.Metrics
# Number metrics with all options
{:ok, result} = Aggregate.over_all(client, "Products",
metrics: [Metrics.count()],
properties: [
Metrics.number("price", sum: true, mean: true, minimum: true, maximum: true),
Metrics.text("category", top_occurrences: 5),
Metrics.boolean("inStock")
]
)
Advanced Filtering
Build complex filters with a type-safe DSL:
alias WeaviateEx.Filter
# Simple equality
filter = Filter.equal("status", "published")
# Numeric comparisons
filter = Filter.greater_than("views", 100)
filter = Filter.less_than_equal("price", 50.0)
# Text pattern matching
filter = Filter.like("title", "*AI*")
# Array operations
filter = Filter.contains_any("tags", ["elixir", "phoenix"])
filter = Filter.contains_all("tags", ["elixir", "tutorial"])
# Geospatial queries
filter = Filter.within_geo_range("location", {40.7128, -74.0060}, 5000.0)
# Date comparisons
filter = Filter.greater_than("publishedAt", "2025-01-01T00:00:00Z")
# Null checks
filter = Filter.is_null("deletedAt")
# Property length filtering (v0.7.0+)
filter = Filter.by_property_length("title", :greater_than, 10)
filter = Filter.by_property_length("tags", :greater_or_equal, 3)
# Combine filters with AND
combined = Filter.all_of([
Filter.equal("status", "published"),
Filter.greater_than("views", 100),
Filter.like("title", "*Elixir*")
])
# Combine filters with OR
or_filter = Filter.any_of([
Filter.equal("category", "technology"),
Filter.equal("category", "science")
])
# Negate filters
not_filter = Filter.none_of([
Filter.equal("status", "draft")
])
# Use in queries
query = Query.get("Article")
|> Query.where(Filter.to_graphql(combined))
|> Query.fields(["title", "views"])
Deep Reference Filtering (v0.7.0+)
Filter through chains of references to reach nested properties:
alias WeaviateEx.Filter
alias WeaviateEx.Filter.RefPath
# Filter articles where the author's company is in technology
filter = RefPath.through("hasAuthor", "Author")
|> RefPath.through("worksAt", "Company")
|> RefPath.property("industry", :equal, "Technology")
# Filter by author name directly
filter = RefPath.through("hasAuthor", "Author")
|> RefPath.property("name", :like, "John*")
# Combine with other filters
combined = Filter.all_of([
RefPath.through("hasAuthor", "Author")
|> RefPath.property("verified", :equal, true),
Filter.equal("status", "published")
])
# Get path depth
path = RefPath.through("hasAuthor", "Author")
|> RefPath.through("worksAt", "Company")
RefPath.depth(path) # => 2
# Use convenience function
filter = Filter.by_ref_path(
RefPath.through("hasAuthor", "Author"),
"name",
:equal,
"Jane"
)
Multi-Target Reference Filtering (v0.7.0+)
Filter on multi-target reference properties that can point to different collections:
alias WeaviateEx.Filter
alias WeaviateEx.Filter.{MultiTargetRef, RefPath}
# Filter where "relatedTo" points to an Article with specific title
filter = MultiTargetRef.new("relatedTo", "Article")
|> MultiTargetRef.where("title", :equal, "My Article")
# Filter where "mentions" points to a verified Person
filter = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.where("verified", :equal, true)
# Deep path filtering through multi-target reference
filter = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.deep_where(fn path ->
path
|> RefPath.through("worksAt", "Company")
|> RefPath.property("industry", :equal, "Tech")
end)
# Convert to RefPath for chaining
ref_path = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.as_ref_path()
|> RefPath.through("worksAt", "Company")
|> RefPath.property("name", :equal, "Acme")
# Combine with other filters
combined = Filter.all_of([
MultiTargetRef.new("relatedTo", "Article")
|> MultiTargetRef.where("status", :equal, "published"),
Filter.equal("featured", true)
])
# Use convenience function
filter = Filter.by_ref_multi_target(
"relatedTo",
"Article",
"status",
:equal,
"published"
)
Vector Configuration
Configure vectorizers and index types:
alias WeaviateEx.API.VectorConfig
# Custom vectors with HNSW index
config = VectorConfig.new("AIArticle")
|> VectorConfig.with_vectorizer(:none) # Bring your own vectors
|> VectorConfig.with_hnsw_index(
distance: :cosine,
ef: 100,
max_connections: 64
)
|> VectorConfig.with_properties([
%{"name" => "title", "dataType" => ["text"]},
%{"name" => "content", "dataType" => ["text"]}
])
{:ok, _} = Collections.create(client, config)
# HNSW with Product Quantization (compression)
config = VectorConfig.new("CompressedData")
|> VectorConfig.with_vectorizer(:none)
|> VectorConfig.with_hnsw_index(distance: :dot)
|> VectorConfig.with_product_quantization(
enabled: true,
segments: 96,
centroids: 256
)
# Flat index for exact search (no approximation)
config = VectorConfig.new("ExactSearch")
|> VectorConfig.with_vectorizer(:none)
|> VectorConfig.with_flat_index(distance: :dot)
Inverted Index Configuration (v0.5.0+)
Configure BM25 and stopwords for full-text search:
alias WeaviateEx.API.InvertedIndexConfig
# Configure BM25 algorithm parameters
bm25_config = InvertedIndexConfig.bm25(b: 0.75, k1: 1.2)
# Configure stopwords with English preset and customizations
stopwords = InvertedIndexConfig.stopwords(
preset: :en,
additions: ["foo", "bar"],
removals: ["the"]
)
# Build complete inverted index configuration
config = InvertedIndexConfig.build(
bm25: [b: 0.8, k1: 1.5],
stopwords: [preset: :en],
index_timestamps: true,
index_property_length: true,
index_null_state: false,
cleanup_interval_seconds: 60
)
# Validate configuration
{:ok, validated} = InvertedIndexConfig.validate(config)
# Merge configurations
merged = InvertedIndexConfig.merge(base_config, override_config)
Reranker Configuration (v0.7.0+)
Configure reranking models to improve search result relevance:
alias WeaviateEx.API.RerankerConfig
# Cohere reranker (default or specific model)
config = RerankerConfig.cohere()
config = RerankerConfig.cohere("rerank-english-v3.0")
config = RerankerConfig.cohere("rerank-multilingual-v3.0", base_url: "https://api.cohere.ai")
# Local transformers reranker
config = RerankerConfig.transformers()
config = RerankerConfig.transformers(inference_url: "http://localhost:8080")
# Voyage AI reranker
config = RerankerConfig.voyageai("rerank-1")
config = RerankerConfig.voyageai("rerank-lite-1", base_url: "https://api.voyageai.com")
# Jina AI reranker
config = RerankerConfig.jinaai("jina-reranker-v1-base-en")
config = RerankerConfig.jinaai("jina-reranker-v1-turbo-en")
# Custom/unlisted reranker provider
config = RerankerConfig.custom("my-reranker",
api_endpoint: "https://reranker.example.com",
model: "rerank-v1",
max_tokens: 512
)
# Disable reranking
config = RerankerConfig.none()
# Use in collection creation
{:ok, _} = Collections.create("Article", %{
properties: [...],
reranker_config: config
})
Custom Generative Provider Configuration (v0.7.0+)
Configure unlisted generative AI providers with custom settings:
alias WeaviateEx.API.GenerativeConfig
# Custom generative provider for unlisted LLMs
config = GenerativeConfig.custom("my-llm",
api_endpoint: "https://llm.example.com",
model: "custom-gpt",
temperature: 0.7,
max_tokens: 2048
)
# Custom provider with authentication options
config = GenerativeConfig.custom("enterprise-llm",
api_endpoint: "https://llm.internal.corp",
model: "llm-v2",
api_key_header: "X-API-Key",
temperature: 0.5
)
# Use with collection
{:ok, _} = Collections.create("Article", %{
properties: [...],
generative_config: config
})
Backup & Restore
Complete backup and restore operations with multiple storage backends:
alias WeaviateEx.Backup.{Config, Location}
# Create a backup to filesystem
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :filesystem)
# Create backup to S3 with specific collections and wait for completion
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :s3,
include_collections: ["Article", "Author"],
wait_for_completion: true,
config: Config.create(compression: :best_compression)
)
# Check backup status
{:ok, status} = WeaviateEx.get_backup_status(client, "daily-backup", :filesystem)
IO.puts("Status: #{status.status}") # :started, :transferring, :success, etc.
# List all backups
{:ok, backups} = WeaviateEx.list_backups(client, :filesystem)
# Restore a backup
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :filesystem,
wait_for_completion: true
)
# Restore specific collections only
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :s3,
include_collections: ["Article"]
)
# Cancel an in-progress backup
:ok = WeaviateEx.cancel_backup(client, "daily-backup", :filesystem)
Storage Backends
| Backend | Description | Configuration |
|---|---|---|
| :filesystem | Local filesystem | BACKUP_FILESYSTEM_PATH on server |
| :s3 | Amazon S3 / S3-compatible | Bucket, region, credentials |
| :gcs | Google Cloud Storage | Bucket, project ID, credentials |
| :azure | Azure Blob Storage | Container, connection string |
Compression Options (v0.5.0+)
alias WeaviateEx.Backup.{Config, Compression}
# GZIP compression (default)
Config.create(compression: :default) # Balanced GZIP
Config.create(compression: :best_speed) # Fast GZIP
Config.create(compression: :best_compression) # Max GZIP
# ZSTD compression (faster, better ratios)
Config.create(compression: :zstd_default) # Balanced ZSTD
Config.create(compression: :zstd_best_speed) # Fast ZSTD
Config.create(compression: :zstd_best_compression) # Max ZSTD
# No compression
Config.create(compression: :no_compression)
# Check compression type
Compression.gzip?(:default) # => true
Compression.zstd?(:zstd_default) # => true
RBAC Restore Options (v0.6.0+)
Restore backups with fine-grained control over RBAC data:
alias WeaviateEx.Backup
# Restore with RBAC options
{:ok, status} = Backup.restore(client, "daily-backup", :s3,
roles_restore: true, # Restore role definitions
users_restore: true, # Restore user assignments
overwrite_alias: true, # Overwrite existing aliases
wait_for_completion: true
)
# Selective RBAC restore - roles only
{:ok, status} = Backup.restore(client, "daily-backup", :filesystem,
roles_restore: true,
users_restore: false
)
Location Configuration (Advanced)
Use typed location structs for cloud backend configuration:
alias WeaviateEx.Backup.{Location, Config}
# Filesystem location
fs_loc = Location.filesystem("/var/backups/weaviate")
# S3 location with full configuration
s3_loc = Location.s3("my-bucket", "/backups",
endpoint: "s3.us-west-2.amazonaws.com",
region: "us-west-2",
access_key_id: "...",
secret_access_key: "...",
use_ssl: true
)
# GCS location
gcs_loc = Location.gcs("my-bucket", "/backups",
project_id: "my-project",
credentials: %{...}
)
# Azure location
azure_loc = Location.azure("my-container", "/backups",
connection_string: "..."
)
# Use location structs directly in backup operations
{:ok, status} = Backup.create(client, "backup-001", s3_loc,
include_collections: ["Article"],
config: Config.create(chunk_size: 128, compression: :zstd_default)
)
# Restore from location struct
{:ok, status} = Backup.restore(client, "backup-001", s3_loc,
roles_restore: true
)
Collection Aliases (v0.5.0+)
Aliases allow zero-downtime collection updates by providing alternative names:
alias WeaviateEx.API.Aliases
# Create an alias (requires Weaviate v1.32.0+)
{:ok, _} = Aliases.create(client, "articles", "Article_v1")
# List all aliases
{:ok, aliases} = Aliases.list(client)
# => [%Alias{alias: "articles", collection: "Article_v1"}]
# Update alias to point to new collection (blue-green deployment)
{:ok, _} = Aliases.update(client, "articles", "Article_v2")
# Get alias details
{:ok, alias_info} = Aliases.get(client, "articles")
# => %Alias{alias: "articles", collection: "Article_v2"}
# Check if alias exists
{:ok, true} = Aliases.exists?(client, "articles")
# Delete alias (underlying collection remains)
{:ok, true} = Aliases.delete(client, "articles")
Multi-Tenancy
Isolate data per tenant with automatic partitioning:
alias WeaviateEx.API.{VectorConfig, Tenants}
# Create multi-tenant collection
config = VectorConfig.new("TenantArticle")
|> VectorConfig.with_multi_tenancy(enabled: true)
|> VectorConfig.with_properties([
%{"name" => "title", "dataType" => ["text"]}
])
Collections.create(client, config)
# Create tenants
{:ok, created} = Tenants.create(client, "TenantArticle",
["CompanyA", "CompanyB", "CompanyC"]
)
# List all tenants
{:ok, tenants} = Tenants.list(client, "TenantArticle")
# Get specific tenant
{:ok, tenant} = Tenants.get(client, "TenantArticle", "CompanyA")
# Check existence
{:ok, true} = Tenants.exists?(client, "TenantArticle", "CompanyA")
# Deactivate tenant (set to COLD storage)
{:ok, _} = Tenants.deactivate(client, "TenantArticle", "CompanyB")
# List only active tenants
{:ok, active} = Tenants.list_active(client, "TenantArticle")
# Activate tenant (set to HOT)
{:ok, _} = Tenants.activate(client, "TenantArticle", "CompanyB")
# Count tenants
{:ok, count} = Tenants.count(client, "TenantArticle")
# Delete tenant
{:ok, _} = Tenants.delete(client, "TenantArticle", "CompanyC")
# Use tenant in queries (specify tenant parameter)
{:ok, objects} = Data.insert(client, "TenantArticle", data, tenant: "CompanyA")
Fluent with_tenant API (v0.7.4+)
Get a tenant-scoped collection reference for cleaner multi-tenant code:
alias WeaviateEx.{Collections, TenantCollection, Query}
# Get tenant-scoped collection (matches Python client pattern)
tenant_col = Collections.with_tenant(client, "Articles", "tenant_A")
# All operations automatically scoped to tenant_A
{:ok, _} = TenantCollection.insert(tenant_col, %{
title: "My Article",
content: "Article content"
})
# Query within tenant
{:ok, results} = tenant_col
|> TenantCollection.query()
|> Query.bm25("search term")
|> Query.execute(client)
# Batch insert within tenant
{:ok, _} = TenantCollection.insert_many(tenant_col, [
%{title: "Article 1"},
%{title: "Article 2"}
])
# Get, update, delete operations
{:ok, obj} = TenantCollection.get(tenant_col, uuid)
{:ok, _} = TenantCollection.update(tenant_col, uuid, %{title: "Updated"})
{:ok, _} = TenantCollection.delete(tenant_col, uuid)
Traditional API (still supported)
# Pass tenant as option to each operation
{:ok, _} = Objects.create("Articles", object, tenant: "tenant_A")
{:ok, _} = Query.get("Articles") |> Query.tenant("tenant_A") |> Query.execute(client)
RBAC (Role-Based Access Control)
WeaviateEx provides full RBAC support for managing roles, permissions, users, and groups.
Creating Roles with Permissions
alias WeaviateEx.API.RBAC
alias WeaviateEx.RBAC.Permissions
# Define permissions using the builder API
permissions = [
Permissions.collections("Article", [:read, :create]),
Permissions.data("Article", [:read, :create, :update]),
Permissions.tenants("Article", [:read])
]
# Create a role
{:ok, role} = RBAC.create_role(client, "article-editor", permissions)
# List all roles
{:ok, roles} = RBAC.list_roles(client)
# Check if role has specific permissions
{:ok, true} = RBAC.has_permissions?(client, "article-editor",
[Permissions.data("Article", :read)]
)
# Add more permissions to a role
:ok = RBAC.add_permissions(client, "article-editor",
[Permissions.nodes(:verbose)]
)
# Delete a role
:ok = RBAC.delete_role(client, "article-editor")
Role Scope Permissions (v0.6.0+)
Fine-grained permissions with collection/tenant/shard scopes:
alias WeaviateEx.API.RBAC.{Scope, Permission}
# Create scopes for fine-grained access
scope = Scope.collection("Article")
|> Scope.with_tenants(["tenant-a", "tenant-b"])
# Or use wildcard access
all_scope = Scope.all_collections()
# Build permissions with scopes
permissions = [
Permission.read_collection("Article"),
Permission.manage_data("Article"),
Permission.new(:data, :read, scope: Scope.collection("*")),
Permission.new(:tenants, :create, scope: scope)
]
# Convenience methods for common patterns
admin_permissions = Permission.admin() # Full access
viewer_permissions = Permission.viewer() # Read-only access
Permission Types
| Type | Actions | Description |
|---|---|---|
| collections | create, read, update, delete, manage | Collection schema operations |
| data | create, read, update, delete, manage | Object CRUD operations |
| tenants | create, read, update, delete | Multi-tenancy management |
| roles | create, read, update, delete | Role management |
| users | create, read, update, delete, assign_and_revoke | User management |
| groups | read, assign_and_revoke | OIDC group management |
| cluster | read | Cluster information |
| nodes | read (minimal/verbose) | Node information |
| backups | manage | Backup operations |
| replicate | create, read, update, delete | Replication management |
| alias | create, read, update, delete | Collection aliases |
User Management
alias WeaviateEx.API.Users
# Create a new DB user (returns API key)
{:ok, user} = Users.create(client, "john.doe")
IO.puts("API Key: #{user.api_key}")
# Get user info
{:ok, user} = Users.get(client, "john.doe")
# Get current authenticated user
{:ok, me} = Users.get_my_user(client)
# Assign roles to user
:ok = Users.assign_roles(client, "john.doe", ["article-editor", "viewer"])
# Revoke roles from user
:ok = Users.revoke_roles(client, "john.doe", ["viewer"])
# Get user's assigned roles
{:ok, roles} = Users.get_assigned_roles(client, "john.doe")
# Rotate API key
{:ok, new_key} = Users.rotate_key(client, "john.doe")
# Deactivate/activate user
:ok = Users.deactivate(client, "john.doe")
:ok = Users.activate(client, "john.doe")
# Delete user
:ok = Users.delete(client, "john.doe")
Separate DB and OIDC User Management (v0.6.0+)
For fine-grained control, use the specialized modules:
alias WeaviateEx.API.Users.{DB, OIDC}
# Database-backed users (full lifecycle management)
{:ok, user} = DB.create(client, "db-user")
{:ok, new_key} = DB.rotate_api_key(client, "db-user")
{:ok, _} = DB.delete(client, "db-user")
# OIDC users (managed externally, role assignment only)
{:ok, users} = OIDC.list(client)
{:ok, user} = OIDC.get(client, "oidc-user@example.com")
:ok = OIDC.assign_roles(client, "oidc-user@example.com", ["viewer"])
:ok = OIDC.revoke_roles(client, "oidc-user@example.com", ["admin"])
Group Management
OIDC group management for role assignments:
alias WeaviateEx.API.Groups
# List known OIDC groups
{:ok, groups} = Groups.list_known(client)
# Assign roles to a group
:ok = Groups.assign_roles(client, "engineering", ["developer", "viewer"])
# Get roles assigned to a group
{:ok, roles} = Groups.get_assigned_roles(client, "engineering")
# Revoke roles from a group
:ok = Groups.revoke_roles(client, "engineering", ["admin"])
Examples
WeaviateEx includes 8 runnable examples that demonstrate all major features:
| Example | Description | What You'll Learn |
|---|---|---|
| 01_collections.exs | Collection management | Create, list, get, add properties, delete collections |
| 02_data.exs | CRUD operations | Insert, get, patch, check existence, delete objects |
| 03_filter.exs | Advanced filtering | Equality, comparison, pattern matching, geo, array filters |
| 04_aggregate.exs | Aggregations | Count, statistics, top occurrences, group by |
| 05_vector_config.exs | Vector configuration | HNSW, PQ compression, flat index, distance metrics |
| 06_tenants.exs | Multi-tenancy | Create tenants, activate/deactivate, list, delete |
| 07_batch.exs | Batch API | Bulk create/delete with summaries, query remaining data |
| 08_query.exs | Query builder | BM25 search, filters, near-vector similarity |
Prerequisites
Follow these steps once before running any example:
- Start the local stack (full profile with all compose files):
# from the project root
mix weaviate.start --version latest
# or use the helper script
./scripts/weaviate-stack.sh start --version latest
To shut everything down afterwards, use mix weaviate.stop --version latest (or ./scripts/weaviate-stack.sh stop).
- Confirm the services are healthy (optional but recommended):
mix weaviate.status
- Point the client at the running cluster (avoids repeated configuration warnings):
export WEAVIATE_URL=http://localhost:8080
# set WEAVIATE_API_KEY=... as well if your instance requires auth
Running Examples
All examples are self-contained and include clean visual output:
# With WEAVIATE_URL exported
# Run any example
mix run examples/01_collections.exs
mix run examples/02_data.exs
mix run examples/03_filter.exs
# ... etc
# Or run all examples
for example in examples/*.exs; do
echo "Running $example..."
mix run "$example"
done
Each example:
- Checks Weaviate connectivity before running
- Shows the code being executed
- Displays formatted results
- Cleans up after itself (deletes test data)
- Provides clear success/error messages
Supported Weaviate Versions
| Weaviate Version | Status | Notes |
|---|---|---|
| 1.35.x | Fully Supported | Latest |
| 1.34.x | Fully Supported | gRPC streaming |
| 1.33.x | Fully Supported | |
| 1.32.x | Fully Supported | |
| 1.31.x | Fully Supported | |
| 1.30.x | Fully Supported | |
| 1.29.x | Fully Supported | |
| 1.28.x | Fully Supported | |
| 1.27.x | Fully Supported | Minimum |
| < 1.27 | Not Tested | |
Testing is performed against all supported versions in CI.
Testing
WeaviateEx has comprehensive test coverage with two testing modes:
Test Modes
Mock Mode (Default) - Fast, isolated unit tests:
- Uses Mox to mock HTTP/Protocol and gRPC responses
- No Weaviate instance required
- Fast execution (~0.2 seconds)
- 2248+ unit tests
- Perfect for TDD and CI/CD
Integration Mode - Real Weaviate testing:
- Tests against live Weaviate instance
- Validates actual API behavior
- Requires Weaviate running locally
- Run with the --include integration flag
- 10 integration test suites (collections, objects, batch, query, health, search, filter, aggregate, auth/RBAC, backup)
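The two modes map naturally onto ExUnit tags. The following is a minimal sketch of how a tag-gated suite is usually wired up, not a copy of this project's test_helper.exs; the file names and the module name are hypothetical, and the excluded tag list is an assumption based on the --include integration and --include journey flags used in this README.

```elixir
# test/test_helper.exs (sketch): skip tagged suites unless explicitly included.
# Assumption: the real helper also configures Mox, which is omitted here.
ExUnit.start(exclude: [:integration, :journey])

# test/integration/example_integration_test.exs (hypothetical file)
defmodule ExampleIntegrationTest do
  use ExUnit.Case, async: false

  # Every test in this module is skipped by a plain `mix test`
  # and enabled by `mix test --include integration`.
  @moduletag :integration

  test "placeholder for a test that would talk to a live Weaviate" do
    assert true
  end
end
```

With this layout, mix test --only integration runs just the tagged modules, which matches the commands shown in the Running Tests section.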
Running Tests
# Run all unit tests with mocks (default - no Weaviate needed)
mix test
# EASIEST: Run integration tests with automatic Weaviate management
mix weaviate.test # Starts Weaviate, runs tests, stops Weaviate
mix weaviate.test --keep # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5 # Test against specific Weaviate version
# MANUAL: Run integration tests with separate Weaviate management
mix weaviate.start # Start Weaviate containers
mix test --include integration # Run integration tests
mix weaviate.stop # Stop Weaviate containers
# Or use environment variable
WEAVIATE_INTEGRATION=true mix test --include integration
# Run specific test file
mix test test/weaviate_ex/api/collections_test.exs
# Run specific test by line number
mix test test/weaviate_ex/objects_test.exs:95
# Run with coverage report (basic)
mix test --cover
# Run with coverage report (detailed HTML via excoveralls)
mix coveralls.html
open cover/excoveralls.html
# Run only integration tests
mix test --only integration
# Run specific integration test suites
mix test --only integration test/integration/search_integration_test.exs
mix test --only rbac # RBAC tests (requires port 8092)
mix test --only backup # Backup tests (requires port 8093)
Test Structure
test/
├── test_helper.exs                          # Test setup, Mox configuration
├── support/
│   ├── factory.ex                           # Test data factories
│   ├── mocks.ex                             # Mox mock definitions
│   └── integration_case.ex                  # Shared integration test module
├── weaviate_ex_test.exs                     # Top-level API tests
├── weaviate_ex/
│   ├── api/                                 # API module tests (mocked)
│   │   ├── collections_test.exs
│   │   ├── data_test.exs
│   │   ├── aggregate_test.exs
│   │   ├── tenants_test.exs
│   │   └── ...
│   ├── filter_test.exs                      # Filter system tests
│   ├── objects_test.exs                     # Objects API tests
│   ├── batch_test.exs                       # Batch operations tests
│   └── query_test.exs                       # Query builder tests
├── integration/                             # Integration tests (live Weaviate)
│   ├── collections_integration_test.exs     # Collection CRUD
│   ├── objects_integration_test.exs         # Object CRUD
│   ├── batch_integration_test.exs           # Batch operations
│   ├── query_integration_test.exs           # Query execution
│   ├── health_integration_test.exs          # Health checks
│   ├── search_integration_test.exs          # BM25, near_vector, pagination
│   ├── filter_integration_test.exs          # Filter operators, AND/OR
│   ├── aggregate_integration_test.exs       # Aggregations, group by
│   ├── auth_integration_test.exs            # RBAC, API key auth (port 8092)
│   └── backup_integration_test.exs          # Backup/restore (port 8093)
└── journey/                                 # Web framework journey tests
    ├── scenarios.ex                         # Shared journey test scenarios
    ├── scenarios_test.exs                   # Direct scenario tests
    ├── phoenix_test.exs                     # Phoenix endpoint integration
    └── plug_test.exs                        # Plug router integration
Integration Test Helper
Use WeaviateEx.IntegrationCase for consistent test setup:
defmodule MyIntegrationTest do
use WeaviateEx.IntegrationCase # Auto-configures HTTP client, cleanup
test "my integration test" do
# Unique collection names with automatic cleanup
{name, {:ok, _}} = create_test_collection("MyTest", properties: [...])
# Or use scoped collections
with_collection([prefix: "Scoped"], fn name ->
# Collection exists only within this block
end)
end
end
Journey Tests
Journey tests validate WeaviateEx integration with Phoenix and Plug web frameworks. These tests ensure the SDK works correctly when:
- Initialized at application startup and closed at shutdown
- Used from both synchronous and asynchronous contexts (different processes)
- Handling concurrent requests from multiple web requests
- Managing connection lifecycle within web framework patterns
# Start Weaviate
mix weaviate.start
# Run journey tests
WEAVIATE_INTEGRATION=true mix test --include journey
# Or run all integration tests including journey
WEAVIATE_INTEGRATION=true mix test --include integration --include journey
# Stop Weaviate
mix weaviate.stop
See test/journey/ for Phoenix and Plug integration examples:
- test/journey/scenarios.ex - Shared journey test scenarios
- test/journey/scenarios_test.exs - Direct scenario tests
- test/journey/phoenix_test.exs - Phoenix endpoint integration
- test/journey/plug_test.exs - Plug router integration
Test Coverage
Current test coverage by module:
- Collections API: 17 tests - Create, list, get, exists, delete, add property
- Filter System: 80+ tests - All operators, combinators, RefPath, MultiTargetRef, property length
- Data Operations: 17 tests - Insert, get, patch, exists, delete with vectors
- Objects API: 15+ tests - Full CRUD with pagination
- Batch Operations: 35+ tests - Bulk create, delete, error tracking, retry logic
- Query System: 60+ tests - GraphQL, near_text, hybrid, BM25, move, rerank, groupBy
- Aggregations: 15+ tests - Count, statistics, group by
- Tenants: 20+ tests - Multi-tenancy with freeze/offload states
- References: 30+ tests - Cross-reference CRUD, multi-target references, QueryReference metadata
- Generative AI: 62 tests - All providers, typed configs, result parsing
- Vector Config: 15+ tests - HNSW, PQ, flat index, multi-vector
- Multi-Vector: 10+ tests - ColBERT, Muvera encoding, Jina vectorizers
- gRPC Services: 50+ tests - Channel management, search, batch, aggregate, tenants, health
- gRPC Error Handling: 30+ tests - Status code mapping, retryable errors
- Generative Search: 25+ tests - Query.Generate, all search types, GraphQL generation
- Nested Properties: 25+ tests - Property.Nested struct, serialization, validation
- Concurrent Batch: 20+ tests - Parallel insertion, result aggregation
- Batch Queue: 25+ tests - Queue operations, failure tracking, re-queue
- Rate Limit Detection: 20+ tests - Provider patterns, backoff calculation
- Custom Providers: 20+ tests - Custom generative configs, reranker configurations
Total: 2362 tests passing
Mix Tasks
WeaviateEx provides Mix tasks for managing local Weaviate Docker containers:
| Task | Description |
|---|---|
| mix weaviate.start | Start Weaviate Docker containers |
| mix weaviate.stop | Stop Weaviate Docker containers |
| mix weaviate.status | Show container status and health check |
| mix weaviate.test | Start Weaviate, run integration tests, stop Weaviate |
| mix weaviate.logs | Show Docker container logs |
# Start Weaviate containers (default version: 1.28.14)
mix weaviate.start
mix weaviate.start --version 1.30.5 # Specific version
mix weaviate.start -v latest # Latest version
# Check container status and health
mix weaviate.status
# Stop all Weaviate containers
mix weaviate.stop
mix weaviate.stop --keep-data # Preserve data directory
# Run integration tests (full lifecycle management)
mix weaviate.test # Start, test, stop
mix weaviate.test --keep # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5 # Test against specific version
# View container logs
mix weaviate.logs # Show last 100 lines
mix weaviate.logs --tail 50 # Show last 50 lines
mix weaviate.logs --file docker-compose-backup.yml # Specific compose file
mix weaviate.logs -f --file docker-compose.yml # Follow logs
The tasks shell out to scripts in ci/ which manage multiple Docker Compose profiles (single node, RBAC, backup, cluster, async, etc.).
Development Tools
Benchmarks
Run performance benchmarks with Benchee:
# Start Weaviate first
mix weaviate.start
# Run all benchmarks
mix weaviate.bench
# Run specific benchmark
mix weaviate.bench batch # Batch insert performance
mix weaviate.bench query # Query performance (near_vector, BM25, hybrid)
Results are saved to bench/output/ as HTML files with detailed statistics and charts.
Pre-commit Hooks
Install pre-commit hooks for automatic code quality checks:
# Install pre-commit (Python package)
pip install pre-commit
# Or with Homebrew
brew install pre-commit
# Install hooks
pre-commit install
# Run on all files
pre-commit run --all-files
Hooks automatically run mix format, mix compile --warnings-as-errors, and mix credo --strict before each commit.
Profiling
See guides/profiling.md for profiling techniques using Elixir's built-in tools (fprof, eprof, cprof).
Docker Management
Using the bundled scripts
All Compose profiles live under ci/ (ported from the Python client). The shell scripts manage multiple configurations:
# Start all profiles (single node, modules, RBAC, cluster, async, proxy, backup)
./ci/start_weaviate.sh 1.28.14
# Async-only sandbox for journey tests
./ci/start_weaviate_jt.sh 1.28.14
# Stop all containers
./ci/stop_weaviate.sh
Edit ci/compose.sh to add/remove compose files from the managed set.
Available Docker Compose Profiles
| File | Port(s) | Description |
|---|---|---|
| docker-compose.yml | 8080, 50051 | Primary single-node instance |
| docker-compose-rbac.yml | 8092 | RBAC-enabled instance |
| docker-compose-backup.yml | 8093 | Backup-enabled instance |
| docker-compose-cluster.yml | 8087-8089 | 3-node cluster |
| docker-compose-async.yml | 8090 | Async/journey test instance |
| docker-compose-modules.yml | 8091 | Module-enabled instance |
| docker-compose-proxy.yml | 8094 | Proxy configuration |
Direct Docker Compose commands
# Spawn just the baseline stack
docker compose -f ci/docker-compose.yml up -d
# Inspect the cluster nodes
docker compose -f ci/docker-compose-cluster.yml ps
# Tail logs for the RBAC profile
docker compose -f ci/docker-compose-rbac.yml logs -f
# Remove everything (data included)
docker compose -f ci/docker-compose.yml down -v
Troubleshooting tips
# Confirm Docker is running
docker info
# See which services are up for a given profile
docker compose -f ci/docker-compose-backup.yml ps -a
# Check the ready endpoint of the primary instance
curl http://localhost:8080/v1/.well-known/ready
# Query metadata
curl http://localhost:8080/v1/meta
Authentication
For production or cloud Weaviate instances with authentication:
Environment Variables (Recommended)
# Add to .env file (NOT committed to git)
WEAVIATE_URL=https://your-cluster.weaviate.network
WEAVIATE_API_KEY=your-secret-api-key-here
# Or add to ~/.bash_secrets (sourced by ~/.bashrc)
export WEAVIATE_URL=https://your-cluster.weaviate.network
export WEAVIATE_API_KEY=your-secret-api-key-here
Runtime Configuration (Production)
# config/runtime.exs
config :weaviate_ex,
url: System.fetch_env!("WEAVIATE_URL"),
api_key: System.fetch_env!("WEAVIATE_API_KEY"),
strict: true # Fail fast if unreachable
Development Configuration
# config/dev.exs (NEVER commit production keys!)
config :weaviate_ex,
url: "http://localhost:8080",
api_key: nil # No auth for local development
Client Auth Helpers (API Key / OIDC)
Configure auth directly in the client for per-connection credentials and automatic OIDC refresh:
alias WeaviateEx.Auth
# API key
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: Auth.api_key("your-secret-api-key")
)
# OIDC client credentials (auto-refresh)
auth = Auth.client_credentials("client-id", "client-secret", scopes: ["openid", "profile"])
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: auth
)
# Skip init checks if needed
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: auth,
skip_init_checks: true
)
OIDC access tokens are refreshed automatically and applied to HTTP headers and gRPC metadata.
Security Best Practices:
- Never commit API keys to version control
- Use environment variables for production
- Add .env to .gitignore (already done)
- Use System.fetch_env!/1 to fail fast on missing keys
- Store production secrets in secure vaults (e.g., AWS Secrets Manager)
- Use different keys for dev/staging/production
Connection Management
Connecting to Weaviate Cloud (v0.7.4+)
WeaviateEx provides full support for Weaviate Cloud Service (WCS) with automatic configuration:
alias WeaviateEx.Connect
# Connect to Weaviate Cloud with API key
config = Connect.to_weaviate_cloud(
cluster_url: "my-cluster.weaviate.network",
api_key: "your-wcs-api-key"
)
{:ok, client} = WeaviateEx.Client.connect(
base_url: config.base_url,
grpc_host: config.grpc_host,
grpc_port: config.grpc_port,
api_key: config.api_key,
additional_headers: Map.new(config.headers)
)
Automatic WCS Features:
- gRPC Host Detection: .weaviate.network clusters use the {ident}.grpc.{domain} pattern
- X-Weaviate-Cluster-URL Header: Automatically added for embedding service integration
- TLS/Port 443: HTTPS and gRPC-TLS enforced for cloud clusters
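The host detection rule can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the Connect module's implementation: GrpcHostSketch and grpc_host/1 are hypothetical names, and treating every non-.weaviate.network host with the grpc- prefix is inferred only from the two examples in this section.

```elixir
defmodule GrpcHostSketch do
  @moduledoc "Illustrative only; not part of WeaviateEx."

  # Derives a gRPC host from a cluster URL or bare host name.
  def grpc_host(cluster_url) do
    host =
      cluster_url
      |> String.replace_prefix("https://", "")
      |> String.replace_prefix("http://", "")
      |> String.trim_trailing("/")

    if String.ends_with?(host, ".weaviate.network") do
      # {ident}.grpc.{domain}: insert "grpc" after the first label.
      [ident, domain] = String.split(host, ".", parts: 2)
      "#{ident}.grpc.#{domain}"
    else
      # Assumption: other cloud domains take a "grpc-" prefix.
      "grpc-" <> host
    end
  end
end

IO.puts(GrpcHostSketch.grpc_host("my-cluster.weaviate.network"))
IO.puts(GrpcHostSketch.grpc_host("my-cluster.aws.weaviate.cloud"))
```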
# Different WCS domains are handled correctly:
Connect.to_weaviate_cloud(cluster_url: "my-cluster.weaviate.network")
# gRPC host: my-cluster.grpc.weaviate.network
Connect.to_weaviate_cloud(cluster_url: "my-cluster.aws.weaviate.cloud")
# gRPC host: grpc-my-cluster.aws.weaviate.cloud
Server Version Requirements
WeaviateEx requires Weaviate server version 1.27.0 or higher. The client validates the server version on connection.
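A version gate of this kind needs a semantic comparison rather than a string comparison, since "1.9.0" sorts after "1.27.0" as plain text. The sketch below uses Elixir's built-in Version module; VersionCheckSketch and check/1 are hypothetical names, and the client's actual check may be structured differently.

```elixir
defmodule VersionCheckSketch do
  @moduledoc "Illustrative only; not part of WeaviateEx."

  @minimum "1.27.0"

  # Returns :ok when the server version meets the minimum,
  # otherwise an error tuple with a descriptive message.
  def check(server_version) do
    case Version.compare(server_version, @minimum) do
      :lt ->
        {:error,
         "Weaviate server version #{server_version} is below minimum required #{@minimum}"}

      _eq_or_gt ->
        :ok
    end
  end
end

IO.inspect(VersionCheckSketch.check("1.28.14"))
IO.inspect(VersionCheckSketch.check("1.20.0"))
```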
# Version check happens automatically during connect
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080"
)
# To bypass version checks (not recommended)
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
skip_init_checks: true
)
When connecting to an unsupported version, you'll receive a clear error:
Weaviate server version 1.20.0 is below minimum required 1.27.0
Connection Pool Configuration (v0.6.0+)
Configure HTTP and gRPC connection pools for optimal performance:
alias WeaviateEx.Client.Pool
# Create custom pool configuration
http_pool = Pool.new(
size: 20, # Number of connections in pool
overflow: 10, # Maximum overflow connections
strategy: :lifo, # Connection selection (:fifo or :lifo)
timeout: 5000, # Checkout timeout in ms
idle_timeout: 60_000, # Idle connection timeout in ms
max_age: nil # Max connection age (nil = no limit)
)
# Use preset configurations
http_pool = Pool.default_http() # Optimized for HTTP/Finch
grpc_pool = Pool.default_grpc() # Optimized for gRPC (fewer connections)
# Convert to client options
finch_opts = Pool.to_finch_opts(http_pool)
grpc_opts = Pool.to_grpc_opts(grpc_pool)
Simplified Connection Config (v0.7.0+)
For high-load scenarios, use the new Connection config:
alias WeaviateEx.Config.Connection
# Create connection config with custom settings
config = Connection.new(
pool_size: 20, # Connections per pool
max_connections: 200, # Maximum total connections
pool_timeout: 10_000, # Pool checkout timeout (ms)
max_idle_time: 60_000 # Max idle time before close (ms)
)
# Use in client creation
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
connection: config
)
# Or pass options directly
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
connection: [pool_size: 20, max_connections: 200]
)
Proxy Configuration (v0.7.3+)
Use proxy settings for HTTP, HTTPS, and gRPC connections:
alias WeaviateEx.Config.Proxy
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
proxy: Proxy.new(
http: "http://proxy.example.com:8080",
https: "https://proxy.example.com:8443",
grpc: "http://grpc-proxy.example.com:8080"
)
)
# Or read from HTTP_PROXY / HTTPS_PROXY / GRPC_PROXY
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
proxy: :env
)
HTTP Retry Configuration (v0.7.4+)
WeaviateEx automatically retries failed HTTP requests with exponential backoff and jitter. Retries are triggered for both transport errors (network issues) and transient HTTP status codes.
Retryable errors:
- Transport: connection refused, reset, timeout, closed, DNS failure
- HTTP status codes: 408, 429, 500, 502, 503, 504
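The classification above amounts to a membership test. In the sketch below, the tuple shapes ({:http_status, code} and {:transport, reason}) and the module name are hypothetical; the reason atoms are the standard Erlang socket and DNS error atoms that correspond to the transport failures listed, and the set WeaviateEx actually matches on may differ.

```elixir
defmodule RetrySketch do
  @moduledoc "Illustrative only; not part of WeaviateEx."

  @retryable_statuses [408, 429, 500, 502, 503, 504]

  # Assumption: refused, reset, timeout, closed, and DNS failure
  # surface as these standard Erlang error atoms.
  @retryable_transport [:econnrefused, :econnreset, :timeout, :closed, :nxdomain]

  def retryable?({:http_status, status}), do: status in @retryable_statuses
  def retryable?({:transport, reason}), do: reason in @retryable_transport
  def retryable?(_other), do: false
end

IO.inspect(RetrySketch.retryable?({:http_status, 503}))
IO.inspect(RetrySketch.retryable?({:http_status, 404}))
```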
# Configure retry options when creating a client
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
retry: [
max_retries: 3, # Maximum retry attempts (default: 3)
base_delay_ms: 100, # Base delay for exponential backoff (default: 100)
max_delay_ms: 5000 # Maximum delay cap (default: 5000)
]
)
# Or override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
max_retries: 5,
base_delay_ms: 200,
max_delay_ms: 10000
)
Backoff strategy:
- Uses exponential backoff: delay = base_delay_ms × 2^attempt
- Adds ±10% random jitter to prevent thundering herd
- Capped at max_delay_ms
Example delays with defaults (base=100ms, max=5000ms):
- Attempt 0: ~100ms
- Attempt 1: ~200ms
- Attempt 2: ~400ms
- Attempt 3: ~800ms
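These figures follow directly from the formula. A minimal sketch, assuming the cap is applied both before and after jitter (the exact order inside WeaviateEx is not stated here); BackoffSketch and its functions are hypothetical names:

```elixir
defmodule BackoffSketch do
  @moduledoc "Illustrative only; not WeaviateEx's implementation."

  # Deterministic part: base_delay_ms * 2^attempt, capped at max_delay_ms.
  def base_delay(attempt, base_delay_ms \\ 100, max_delay_ms \\ 5000) do
    min(base_delay_ms * Integer.pow(2, attempt), max_delay_ms)
  end

  # Adds up to +/-10% random jitter, then re-applies the cap.
  def delay(attempt, base_delay_ms \\ 100, max_delay_ms \\ 5000) do
    base = base_delay(attempt, base_delay_ms, max_delay_ms)
    jitter = (:rand.uniform() * 0.2 - 0.1) * base
    min(round(base + jitter), max_delay_ms)
  end
end

for attempt <- 0..3 do
  IO.puts("Attempt #{attempt}: ~#{BackoffSketch.base_delay(attempt)}ms")
end
```

With the defaults, the cap only takes effect from attempt 6 onward (100 × 2^6 = 6400 ms exceeds 5000 ms), so it never applies within the default three retries.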
Per-Operation Timeouts (v0.7.4+)
WeaviateEx uses different timeouts based on operation type:
| Operation Type | Default Timeout | Description |
|---|---|---|
| Query/GET | 30 seconds | Search, read operations |
| Insert/POST | 90 seconds | Write, update operations |
| Batch | 900 seconds | Batch operations (insert × 10) |
| Init | 2 seconds | Connection initialization |
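The selection logic implied by the table can be sketched as a lookup with an explicit override. TimeoutSketch and resolve/2 are hypothetical names used for illustration; the defaults are the values from the table, in milliseconds.

```elixir
defmodule TimeoutSketch do
  @moduledoc "Illustrative only; not part of WeaviateEx."

  # Defaults from the table above, in milliseconds.
  @defaults %{query: 30_000, insert: 90_000, batch: 900_000, init: 2_000}

  # An explicit :timeout option wins; otherwise the operation type decides.
  def resolve(operation, opts \\ []) do
    case Keyword.fetch(opts, :timeout) do
      {:ok, ms} -> ms
      :error -> Map.fetch!(@defaults, operation)
    end
  end
end

IO.inspect(TimeoutSketch.resolve(:query))
IO.inspect(TimeoutSketch.resolve(:query, timeout: 60_000))
IO.inspect(TimeoutSketch.resolve(:batch))
```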
# Configure timeouts in client
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
timeout_config: WeaviateEx.Config.Timeout.new(
query: 60_000, # 60 seconds for queries
insert: 180_000, # 180 seconds for inserts
init: 5_000 # 5 seconds for init
)
)
# Override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
timeout: 60_000 # Explicit timeout override
)
# Specify operation type for automatic timeout selection
{:ok, result} = WeaviateEx.API.Batch.create_objects(client, objects,
operation: :batch # Uses extended batch timeout
)
Client Lifecycle Management (v0.6.0+)
Manage client connections with explicit lifecycle control:
alias WeaviateEx.Client
# Create and use a client
{:ok, client} = Client.new(base_url: "http://localhost:8080")
# Check client status
Client.status(client) # => one of :connected, :initializing, :disconnected, :closed
# Check if client is closed
Client.closed?(client) # => false
# Get client statistics
stats = Client.stats(client)
IO.puts("Requests: #{stats.request_count}")
IO.puts("Errors: #{stats.error_count}")
IO.puts("Created: #{stats.created_at}")
# Close the client when done
:ok = Client.close(client)
Client.closed?(client) # => true
Resource Management with with_client/2
Automatic client lifecycle management with guaranteed cleanup:
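The cleanup guarantee is the standard try/after pattern. The sketch below shows that pattern with an Agent standing in for the client, so it runs without a Weaviate instance; WithResourceSketch is a hypothetical module and does not reflect how with_client/2 is implemented internally.

```elixir
defmodule WithResourceSketch do
  @moduledoc "Illustrative only; an Agent stands in for a client."

  # Opens the stand-in resource, runs fun, and always stops the resource,
  # whether fun returns normally or raises.
  def with_resource(fun) when is_function(fun, 1) do
    {:ok, pid} = Agent.start(fn -> :open end)

    try do
      fun.(pid)
    after
      Agent.stop(pid)
    end
  end
end

result = WithResourceSketch.with_resource(fn pid -> Agent.get(pid, & &1) end)
IO.inspect(result)
```

Because the after block runs on both paths, an exception still propagates to the caller, but only once the resource has been released.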
alias WeaviateEx.Client
# with_client ensures cleanup even on errors
result = Client.with_client([base_url: "http://localhost:8080"], fn client ->
# Use client for operations
{:ok, meta} = WeaviateEx.health_check(client)
{:ok, collections} = WeaviateEx.Collections.list(client)
# Return your result
{:ok, %{version: meta["version"], collections: length(collections)}}
end)
# Client is automatically closed after the function returns
case result do
{:ok, data} -> IO.puts("Version: #{data.version}")
{:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end
# Even if the function raises, client is closed
try do
Client.with_client([base_url: url], fn client ->
raise "Something went wrong"
end)
rescue
e -> IO.puts("Caught: #{e.message}")
# Client was still properly closed
end
Debug & Troubleshooting
Debug Module (v0.6.0+)
Compare REST and gRPC protocol responses for debugging:
alias WeaviateEx.Debug
# Get an object via REST (HTTP)
{:ok, rest_obj} = Debug.get_object_rest(client, "Article", uuid)
{:ok, rest_obj} =
Debug.get_object_rest(client, "Article", uuid,
node_name: "node-1",
consistency_level: "ALL"
)
# Get the same object via gRPC
{:ok, grpc_obj} = Debug.get_object_grpc(client, "Article", uuid)
# Compare both protocols and get a detailed diff
{:ok, comparison} = Debug.compare_protocols(client, "Article", uuid)
# Check comparison results
comparison.match? # => true or false
comparison.rest_object # => %{...}
comparison.grpc_object # => %{...}
comparison.differences # => [] or list of differences
# Get connection diagnostics
{:ok, info} = Debug.connection_info(client)
IO.puts("HTTP Base URL: #{info.http_base_url}")
IO.puts("gRPC Connected: #{info.grpc_connected}")
IO.puts("gRPC Host: #{info.grpc_host}:#{info.grpc_port}")
Object Comparison
Deep comparison of objects from different sources:
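To make "deep comparison" concrete, here is a minimal recursive diff over nested maps that reports dotted paths. It is a sketch of the general technique only: DiffSketch is a hypothetical module, and ObjectCompare may normalize values (for example vectors or timestamps) in ways this does not.

```elixir
defmodule DiffSketch do
  @moduledoc "Illustrative only; not part of WeaviateEx."

  # Returns a list of {dotted_path, left_value, right_value} tuples.
  def diff(a, b, path \\ [])

  # Two maps: recurse into the union of their keys.
  def diff(a, b, path) when is_map(a) and is_map(b) do
    keys = (Map.keys(a) ++ Map.keys(b)) |> Enum.uniq() |> Enum.sort()
    Enum.flat_map(keys, fn key -> diff(Map.get(a, key), Map.get(b, key), path ++ [key]) end)
  end

  # Identical leaves produce no difference.
  def diff(same, same, _path), do: []

  # Anything else is reported at the current path.
  def diff(a, b, path), do: [{Enum.join(path, "."), a, b}]
end

rest = %{"properties" => %{"title" => "REST Title", "views" => 3}}
grpc = %{"properties" => %{"title" => "gRPC Title", "views" => 3}}

for {path, left, right} <- DiffSketch.diff(rest, grpc) do
  IO.puts("- #{path}: #{inspect(left)} vs #{inspect(right)}")
end
```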
alias WeaviateEx.Debug.ObjectCompare
# Compare two objects
result = ObjectCompare.compare(rest_object, grpc_object)
result.match? # => true if objects are equivalent
result.differences # => list of differences found
# Get a formatted diff report
diff_list = ObjectCompare.diff(rest_object, grpc_object)
report = ObjectCompare.format_diff(diff_list)
IO.puts(report)
# Output:
# - properties.title: "REST Title" vs "gRPC Title"
# - _additional.vector: [0.1, 0.2, ...] vs [0.1, 0.2, ...]
Request Logging
Log and analyze HTTP/gRPC requests for debugging:
alias WeaviateEx.Debug.RequestLogger
# Start the request logger
{:ok, logger} = RequestLogger.start_link(name: :my_logger)
# Enable logging
RequestLogger.enable(logger)
# Log requests manually or via middleware
RequestLogger.log_request(logger, %{
method: :get,
path: "/v1/schema",
protocol: :http,
duration_ms: 45,
status: 200
})
# Get recent logs
logs = RequestLogger.get_logs(logger)
for log <- logs do
IO.puts("#{log.protocol} #{log.method} #{log.path} - #{log.status} (#{log.duration_ms}ms)")
end
# Filter logs
http_logs = RequestLogger.get_logs(logger, protocol: :http)
slow_logs = RequestLogger.get_logs(logger, min_duration_ms: 100)
# Export logs for analysis
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.json", :json)
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.txt", :text)
# Clear logs
RequestLogger.clear_logs(logger)
# Disable when done
RequestLogger.disable(logger)
Main Module Debug Helpers
Quick access to debug functions from the main module:
# Get object via REST
{:ok, obj} = WeaviateEx.debug_get_rest(client, "Article", uuid)
# Compare protocols
{:ok, comparison} = WeaviateEx.debug_compare(client, "Article", uuid)
Documentation
- INSTALL.md - Detailed installation guide for all platforms
- CHANGELOG.md - Version history and release notes
- API Documentation - Full API reference on HexDocs
- Weaviate Docs - Official Weaviate documentation
- Examples - 8 runnable examples in the GitHub repository (see Examples section)
Building Documentation Locally
# Generate docs
mix docs
# Open in browser (macOS)
open doc/index.html
# Open in browser (Linux)
xdg-open doc/index.html
Development
# Clone the repository
git clone https://github.com/yourusername/weaviate_ex.git
cd weaviate_ex
# Install dependencies
mix deps.get
# Compile
mix compile
# Run unit tests (mocked - fast)
mix test
# Run integration tests (requires live Weaviate)
mix weaviate.start
mix test --include integration
# Generate documentation
mix docs
# Run code analysis
mix credo
# Run type checking (if dialyzer is set up)
mix dialyzer
# Format code
mix format
Project Structure
weaviate_ex/
├── ci/
│   └── weaviate/                  # Docker assets mirrored from Python client
│       ├── compose.sh
│       ├── start_weaviate.sh
│       ├── docker-compose.yml
│       └── docker-compose-*.yml
├── priv/
│   └── protos/v1/                 # Weaviate gRPC proto definitions
│       ├── weaviate.proto
│       ├── batch.proto
│       ├── search_get.proto
│       └── ...
├── lib/
│   ├── weaviate_ex.ex             # Top-level API
│   ├── weaviate_ex/
│   │   ├── embedded.ex            # Embedded binary lifecycle manager
│   │   ├── dev_support/           # Internal tooling (compose helper)
│   │   ├── application.ex         # OTP application
│   │   ├── client.ex              # Client struct & config
│   │   ├── config.ex              # Configuration management
│   │   ├── error.ex               # Error types (HTTP + gRPC)
│   │   ├── filter.ex              # Filter DSL
│   │   ├── api/                   # API modules
│   │   │   ├── collections.ex
│   │   │   ├── data.ex
│   │   │   ├── aggregate.ex
│   │   │   ├── tenants.ex
│   │   │   └── vector_config.ex
│   │   ├── grpc/                  # gRPC infrastructure
│   │   │   ├── channel.ex         # Channel management
│   │   │   ├── services/          # gRPC service clients
│   │   │   │   ├── search.ex
│   │   │   │   ├── batch.ex
│   │   │   │   ├── aggregate.ex
│   │   │   │   ├── tenants.ex
│   │   │   │   └── health.ex
│   │   │   └── generated/v1/      # Proto-generated modules
│   │   └── ...
│   └── mix/
│       └── tasks/
│           ├── weaviate.start.ex
│           ├── weaviate.stop.ex
│           ├── weaviate.status.ex
│           └── weaviate.logs.ex
├── test/                          # Test suite
├── examples/                      # Runnable examples (in source repo)
├── install.sh                     # Legacy single-profile bootstrap
└── mix.exs                        # Project configuration
Contributing
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Write tests: All new features should include tests
- Run tests: mix test (should pass)
- Run integration tests: mix weaviate.test (optional but recommended)
- Run Credo: mix credo (should pass)
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
CI/CD Pipeline
Pull requests automatically run the following GitHub Actions jobs:
| Job | Description |
|---|---|
| format-and-lint | Code formatting and Credo linting |
| unit-tests | 2300+ unit tests with Mox mocking + Dialyzer |
| integration-tests | Integration tests against Weaviate 1.28.14 |
| integration-matrix | Tests against Weaviate 1.27, 1.28, 1.29, 1.30 (master/tags only) |
Development Guidelines
- Write tests first (TDD approach)
- Maintain test coverage above 90%
- Follow Elixir style guide
- Add typespecs for public functions
- Update documentation for API changes
- Add examples for new features
- For API changes, add integration tests in test/integration/
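As an illustration of the typespec guideline, here is a minimal sketch of a documented public function with a `@spec` that Dialyzer can check. The module, the type, and the validation rule are all hypothetical and are not part of WeaviateEx.

```elixir
defmodule HypotheticalCollectionName do
  @moduledoc """
  Illustrative only: not part of WeaviateEx.
  """

  @typedoc "A collection name such as \"Article\"."
  @type name :: String.t()

  @doc """
  Accepts names that start with an uppercase ASCII letter.

  The rule is made up for this example.
  """
  @spec validate(name()) :: {:ok, name()} | {:error, :invalid_name}
  def validate(<<first, _rest::binary>> = name) when first in ?A..?Z, do: {:ok, name}
  # Anything else, including the empty string, is rejected.
  def validate(_other), do: {:error, :invalid_name}
end
```

Returning tagged tuples matches the `{:ok, value}` style used by the debug helpers shown earlier, and the `@spec` documents both outcomes for callers.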
License
MIT License. See LICENSE for details.
Acknowledgments
- Built for Weaviate vector database
- Inspired by official Python and TypeScript clients
- Uses grpc-elixir for high-performance gRPC operations
- Uses Finch for HTTP/2 connection pooling (schema operations)
- Powered by Elixir and the BEAM VM
Questions or Issues? Open an issue on GitHub