WeaviateEx.API.VectorConfig (WeaviateEx v0.7.4)

Vector configuration builders for Phase 2.4.

Provides builder functions for:

  • 25+ vectorizer configurations
  • 3 index types (HNSW, FLAT, DYNAMIC)
  • 4 quantization methods (PQ, BQ, SQ, RQ)
  • Complete collection configurations

Usage

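# Alias the module so the short name VectorConfig resolves.
alias WeaviateEx.API.VectorConfig

# Build the collection configuration step by step; each call returns the updated config.
config =
  VectorConfig.new("Article")
  |> VectorConfig.with_vectorizer(:text2vec_openai, model: "text-embedding-ada-002")
  |> VectorConfig.with_hnsw_index(ef: 100, max_connections: 64)
  |> VectorConfig.with_product_quantization(enabled: true)
  |> VectorConfig.with_properties([
    %{name: "title", dataType: ["text"]}
  ])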

Summary

Functions

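  • binary_quantization(opts \\ []) - Configure Binary Quantization (BQ)
  • custom(module_name, opts \\ []) - Configure a custom/unlisted vectorizer module.
  • distance_metrics() - List all distance metrics
  • dynamic_index(opts \\ []) - Configure DYNAMIC index
  • flat_index(opts \\ []) - Configure FLAT index
  • hnsw_index(opts \\ []) - Configure HNSW index.
  • img2vec_neural(opts \\ []) - Configure img2vec-neural vectorizer for images.
  • multi2multivec_jinaai(opts \\ []) - Configure multi2multivec-jinaai vectorizer.
  • multi2vec_aws(opts) - Configure multi2vec-aws for multimodal embeddings.
  • multi2vec_bind(opts \\ []) - Configure multi2vec-bind vectorizer
  • multi2vec_clip(opts \\ []) - Configure multi2vec-clip vectorizer
  • multi2vec_cohere(opts \\ []) - Configure multi2vec-cohere for multimodal embeddings.
  • multi2vec_google(opts) - Configure multi2vec-google (Palm) for multimodal embeddings.
  • multi2vec_jinaai(opts \\ []) - Configure multi2vec-jinaai for multimodal embeddings.
  • multi2vec_nvidia(opts \\ []) - Configure multi2vec-nvidia for multimodal embeddings.
  • multi2vec_voyageai(opts \\ []) - Configure multi2vec-voyageai for multimodal embeddings.
  • new(class_name) - Create new collection configuration
  • none() - Configure no vectorizer (custom vectors)
  • product_quantization(opts \\ []) - Configure Product Quantization (PQ)
  • ref2vec_centroid(opts \\ []) - Configure ref2vec-centroid vectorizer.
  • reranker_cohere(opts \\ []) - Configure reranker-cohere module.
  • reranker_contextualai(opts \\ []) - Configure reranker-contextualai module.
  • reranker_jinaai(opts \\ []) - Configure reranker-jinaai module.
  • reranker_nvidia(opts \\ []) - Configure reranker-nvidia module.
  • reranker_transformers(opts \\ []) - Configure reranker-transformers module (local model).
  • reranker_voyageai(opts \\ []) - Configure reranker-voyageai module.
  • rotational_quantization(opts \\ []) - Configure Rotational Quantization (RQ).
  • rq(opts \\ []) - Alias for rotational_quantization/1.
  • scalar_quantization(opts \\ []) - Configure Scalar Quantization (SQ).
  • sq(opts \\ []) - Alias for scalar_quantization/1 with enabled defaulting to true.
  • supported_vectorizers() - List all supported vectorizers
  • text2colbert_jinaai(opts \\ []) - Configure text2colbert-jinaai vectorizer (multi-vector).
  • text2vec_aws(opts \\ []) - Configure text2vec-aws vectorizer (deprecated, use text2vec_aws_bedrock or text2vec_aws_sagemaker).
  • text2vec_aws_bedrock(opts) - Configure text2vec-aws with AWS Bedrock service.
  • text2vec_aws_sagemaker(opts) - Configure text2vec-aws with AWS SageMaker service.
  • text2vec_azure_openai(opts) - Configure text2vec-azure-openai vectorizer.
  • text2vec_cohere(opts \\ []) - Configure text2vec-cohere vectorizer.
  • text2vec_contextionary(opts \\ []) - Configure text2vec-contextionary vectorizer
  • text2vec_databricks(opts) - Configure text2vec-databricks vectorizer.
  • text2vec_google_gemini(opts \\ []) - Configure text2vec-google with Google AI Studio (Gemini).
  • text2vec_google_vertex(opts) - Configure text2vec-google with Google Vertex AI.
  • text2vec_gpt4all(opts \\ []) - Configure text2vec-gpt4all vectorizer
  • text2vec_huggingface(opts \\ []) - Configure text2vec-huggingface vectorizer
  • text2vec_jinaai(opts \\ []) - Configure text2vec-jinaai vectorizer.
  • text2vec_mistral(opts \\ []) - Configure text2vec-mistral vectorizer.
  • text2vec_model2vec(opts \\ []) - Configure text2vec-model2vec vectorizer.
  • text2vec_morph(opts \\ []) - Configure text2vec-morph vectorizer.
  • text2vec_nvidia(opts \\ []) - Configure text2vec-nvidia vectorizer.
  • text2vec_ollama(opts \\ []) - Configure text2vec-ollama vectorizer.
  • text2vec_openai(opts \\ []) - Configure text2vec-openai vectorizer
  • text2vec_palm(opts \\ []) - Configure text2vec-palm vectorizer (deprecated, use text2vec_google_vertex or text2vec_google_gemini).
  • text2vec_transformers(opts \\ []) - Configure text2vec-transformers vectorizer
  • text2vec_voyageai(opts \\ []) - Configure text2vec-voyageai vectorizer.
  • text2vec_weaviate(opts \\ []) - Configure text2vec-weaviate vectorizer (Weaviate-hosted embeddings).
  • valid_distance?(metric) - Check if distance metric is valid
  • valid_vectorizer?(vectorizer) - Check if vectorizer is valid
  • with_binary_quantization(config, opts \\ []) - Add Binary Quantization to configuration
  • with_custom_vectorizer(config, module_name, opts \\ []) - Add a custom vectorizer to the collection configuration.
  • with_dynamic_index(config, opts \\ []) - Add DYNAMIC index to configuration
  • with_flat_index(config, opts \\ []) - Add FLAT index to configuration
  • with_hnsw_index(config, opts \\ []) - Add HNSW index to configuration
  • with_multi_tenancy(config, opts \\ []) - Add multi-tenancy configuration
  • with_named_vectors(config, vectors) - Add named vectors configuration
  • with_product_quantization(config, opts \\ []) - Add Product Quantization to configuration
  • with_properties(config, properties) - Add properties to configuration
  • with_replication_config(config, opts \\ []) - Add replication configuration.
  • with_reranker(config, reranker, opts \\ []) - Add reranker configuration to a collection.
  • with_rotational_quantization(config, opts \\ []) - Add Rotational Quantization to configuration
  • with_scalar_quantization(config, opts \\ []) - Add Scalar Quantization to configuration
  • with_sharding_config(config, opts \\ []) - Add sharding configuration
  • with_vectorizer(config, vectorizer, opts \\ []) - Add vectorizer to configuration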

Types

config()

@type config() :: map()

opts()

@type opts() :: keyword()

vectorizer()

@type vectorizer() ::
  :text2vec_openai
  | :text2vec_cohere
  | :text2vec_huggingface
  | :text2vec_transformers
  | :text2vec_contextionary
  | :text2vec_gpt4all
  | :text2vec_palm
  | :text2vec_aws
  | :multi2vec_clip
  | :multi2vec_bind
  | :none

Functions

binary_quantization(opts \\ [])

Configure Binary Quantization (BQ)

custom(module_name, opts \\ [])

@spec custom(
  String.t(),
  keyword()
) :: map()

Configure a custom/unlisted vectorizer module.

Use this for vectorizers not explicitly supported or for user-provided modules.

Options

All options are passed directly to the module configuration with automatic snake_case to camelCase conversion.

Examples

# Custom vectorizer with arbitrary configuration
VectorConfig.custom("my-custom-vectorizer",
  model: "custom-model",
  api_endpoint: "https://custom-api.example.com",
  dimensions: 768
)

# Use in collection creation
VectorConfig.new("Article")
|> VectorConfig.with_custom_vectorizer("text2vec-custom", model: "my-model")
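
The snake_case to camelCase conversion mentioned under Options can be pictured with a small standalone sketch. This is not the library's implementation: the module name CamelCaseSketch, its function names, and the choice of string keys in the result are all assumptions made for illustration.

```elixir
defmodule CamelCaseSketch do
  @moduledoc false

  # Hypothetical helper: convert a snake_case atom or string key
  # into a camelCase string, e.g. :api_endpoint -> "apiEndpoint".
  def camelize(key) when is_atom(key), do: key |> Atom.to_string() |> camelize()

  def camelize(key) when is_binary(key) do
    [first | rest] = String.split(key, "_")
    Enum.join([first | Enum.map(rest, &String.capitalize/1)])
  end

  # Hypothetical helper: turn a keyword list of options into a map
  # whose keys are camelCase strings (assumed output shape).
  def to_module_config(opts) when is_list(opts) do
    Map.new(opts, fn {key, value} -> {camelize(key), value} end)
  end
end

CamelCaseSketch.to_module_config(
  model: "custom-model",
  api_endpoint: "https://custom-api.example.com",
  dimensions: 768
)
```

Under these assumptions, api_endpoint becomes "apiEndpoint" while single-word keys such as model and dimensions are left unchanged.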

distance_metrics()

List all distance metrics

dynamic_index(opts \\ [])

Configure DYNAMIC index

flat_index(opts \\ [])

Configure FLAT index

hnsw_index(opts \\ [])

@spec hnsw_index(keyword()) :: map()

Configure HNSW index.

Options

  • :distance - Distance metric (:cosine, :dot, :l2_squared, :hamming, :manhattan)
  • :ef - Query time ef parameter (default: -1)
  • :ef_construction - Index build ef parameter (default: 128)
  • :max_connections - Maximum connections per node (default: 32)
  • :filter_strategy - Filter strategy (:sweeping or :acorn)
  • :quantizer - Quantization config (PQ, BQ, SQ, or RQ)
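
Example

A minimal sketch that combines several of the options above. The option names come from the list above; the specific values are illustrative, and the alias line assumes the module is available as WeaviateEx.API.VectorConfig.

```elixir
alias WeaviateEx.API.VectorConfig

# Build an HNSW index configuration with an explicit distance metric,
# the documented default build parameters, and Scalar Quantization.
index =
  VectorConfig.hnsw_index(
    distance: :cosine,
    ef_construction: 128,
    max_connections: 32,
    filter_strategy: :acorn,
    quantizer: VectorConfig.sq(training_limit: 50_000)
  )
```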

img2vec_neural(opts \\ [])

Configure img2vec-neural vectorizer for images.

Options

  • :image_fields - Image property names

multi2multivec_jinaai(opts \\ [])

Configure multi2multivec-jinaai vectorizer.

Options

  • :model - Model to use
  • :base_url - Base URL for API (optional)
  • :image_fields - List of image property names (optional)
  • :text_fields - List of text property names (optional)

multi2vec_aws(opts)

Configure multi2vec-aws for multimodal embeddings.

Options

  • :model - AWS model (required)
  • :region - AWS region (required)
  • :service - AWS service (bedrock or sagemaker)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)

multi2vec_bind(opts \\ [])

Configure multi2vec-bind vectorizer

multi2vec_clip(opts \\ [])

Configure multi2vec-clip vectorizer

multi2vec_cohere(opts \\ [])

Configure multi2vec-cohere for multimodal embeddings.

Options

  • :model - Cohere model (optional)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)
  • :truncate - Truncation mode (optional)

multi2vec_google(opts)

Configure multi2vec-google (Palm) for multimodal embeddings.

Options

  • :project_id - Google Cloud project ID (required)
  • :location - Model location (required)
  • :model - Model ID (optional)
  • :dimensions - Output dimensions (optional)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)
  • :video_fields - Video property fields (optional)
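
Example

A hedged sketch of a call that supplies both required options. The project ID, location, and property names are placeholders, and passing the field options as lists of property-name strings is an assumption; check the module source if the fields take another shape.

```elixir
alias WeaviateEx.API.VectorConfig

# :project_id and :location are required; the field lists are optional.
VectorConfig.multi2vec_google(
  project_id: "my-project",
  location: "us-central1",
  image_fields: ["image"],
  text_fields: ["title"]
)
```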

multi2vec_jinaai(opts \\ [])

Configure multi2vec-jinaai for multimodal embeddings.

Options

  • :model - Jina model (optional)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)
  • :base_url - API base URL (optional)

multi2vec_nvidia(opts \\ [])

Configure multi2vec-nvidia for multimodal embeddings.

Options

  • :model - NVIDIA model (optional)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)
  • :base_url - API base URL (optional)

multi2vec_voyageai(opts \\ [])

Configure multi2vec-voyageai for multimodal embeddings.

Options

  • :model - VoyageAI model (optional)
  • :image_fields - Image property fields (optional)
  • :text_fields - Text property fields (optional)
  • :base_url - API base URL (optional)

new(class_name)

Create new collection configuration

none()

Configure no vectorizer (custom vectors)

product_quantization(opts \\ [])

Configure Product Quantization (PQ)

ref2vec_centroid(opts \\ [])

Configure ref2vec-centroid vectorizer.

Creates vectors from referenced objects using centroid calculation.

Options

  • :reference_properties - List of reference property names
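
Example

A minimal sketch, assuming the collection has a reference property named "hasParagraphs" (a placeholder name, not taken from this documentation).

```elixir
alias WeaviateEx.API.VectorConfig

# Derive each object's vector from the objects it references.
VectorConfig.ref2vec_centroid(reference_properties: ["hasParagraphs"])
```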

reranker_cohere(opts \\ [])

Configure reranker-cohere module.

Options

  • :model - Model to use (optional)
  • :base_url - Base URL for API (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:cohere, model: "rerank-multilingual-v3.0")

reranker_contextualai(opts \\ [])

Configure reranker-contextualai module.

Contextual AI reranker with retrieval augmentation capabilities.

Options

  • :model - Model to use
  • :base_url - Base URL for API (optional)
  • :context_source - Context source (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:contextualai, model: "contextual-rerank-v1")

reranker_jinaai(opts \\ [])

Configure reranker-jinaai module.

Options

  • :model - Model to use (e.g., "jina-reranker-v2-base-multilingual")
  • :base_url - Base URL for API (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:jinaai, model: "jina-reranker-v2-base-multilingual")

reranker_nvidia(opts \\ [])

Configure reranker-nvidia module.

Options

  • :model - Model to use (e.g., "nvidia/nv-rerankqa-mistral-4b-v3")
  • :base_url - Base URL for API (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:nvidia, model: "nvidia/nv-rerankqa-mistral-4b-v3")

reranker_transformers(opts \\ [])

Configure reranker-transformers module (local model).

Runs locally using a transformer model.

Options

  • :model - Model to use (optional)
  • :inference_url - Inference URL (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:transformers)

reranker_voyageai(opts \\ [])

Configure reranker-voyageai module.

Options

  • :model - Model to use (e.g., "rerank-2", "rerank-lite-1")
  • :base_url - Base URL for API (optional)
  • :truncate - Truncation mode (optional)

Example

VectorConfig.new("Article")
|> VectorConfig.with_reranker(:voyageai, model: "rerank-2")

rotational_quantization(opts \\ [])

@spec rotational_quantization(keyword()) :: map()

Configure Rotational Quantization (RQ).

RQ rotates each vector before quantizing it, storing every dimension in a small number of bits (see :bits).

Options

  • :enabled - Enable RQ (default: true)
  • :cache - Enable cache (optional)
  • :bits - Number of bits for quantization (default: 8)
  • :rescore_limit - Number of candidates to rescore (optional)
  • :training_limit - Number of vectors to train on (optional)

Example

VectorConfig.hnsw_index(
  quantizer: VectorConfig.rotational_quantization(bits: 8, cache: true)
)

rq(opts \\ [])

@spec rq(keyword()) :: map()

Alias for rotational_quantization/1.

Example

VectorConfig.hnsw_index(
  quantizer: VectorConfig.rq(bits: 8, cache: true)
)

scalar_quantization(opts \\ [])

@spec scalar_quantization(keyword()) :: map()

Configure Scalar Quantization (SQ).

SQ reduces memory use by compressing each vector dimension from a 32-bit float to an 8-bit integer.

Options

  • :enabled - Enable SQ (default: false)
  • :cache - Enable cache (optional)
  • :rescore_limit - Number of candidates to rescore (optional)
  • :training_limit - Number of vectors to train on (optional)

Example

VectorConfig.hnsw_index(
  quantizer: VectorConfig.scalar_quantization(enabled: true, cache: true)
)

sq(opts \\ [])

@spec sq(keyword()) :: map()

Alias for scalar_quantization/1 with enabled defaulting to true.

Example

VectorConfig.hnsw_index(
  quantizer: VectorConfig.sq(training_limit: 50_000)
)

supported_vectorizers()

List all supported vectorizers

text2colbert_jinaai(opts \\ [])

Configure text2colbert-jinaai vectorizer (multi-vector).

Options

  • :model - Model to use
  • :dimensions - Output dimensions (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_aws(opts \\ [])

Configure text2vec-aws vectorizer (deprecated, use text2vec_aws_bedrock or text2vec_aws_sagemaker).

text2vec_aws_bedrock(opts)

Configure text2vec-aws with AWS Bedrock service.

Options

  • :model - The model to use (required, e.g., "amazon.titan-embed-text-v1")
  • :region - AWS region (required)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

Example

VectorConfig.text2vec_aws_bedrock(
  model: "amazon.titan-embed-text-v1",
  region: "us-east-1"
)

text2vec_aws_sagemaker(opts)

Configure text2vec-aws with AWS SageMaker service.

Options

  • :endpoint - The SageMaker endpoint (required)
  • :region - AWS region (required)
  • :target_model - Target model (optional)
  • :target_variant - Target variant (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

Example

VectorConfig.text2vec_aws_sagemaker(
  endpoint: "my-endpoint",
  region: "us-east-1"
)

text2vec_azure_openai(opts)

Configure text2vec-azure-openai vectorizer.

Options

  • :resource_name - Azure resource name (required)
  • :deployment_id - Azure deployment ID (required)
  • :base_url - Custom base URL (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)
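
Example

A minimal sketch with both required options set. The resource name and deployment ID are placeholders for values from your own Azure account.

```elixir
alias WeaviateEx.API.VectorConfig

# :resource_name and :deployment_id are required.
VectorConfig.text2vec_azure_openai(
  resource_name: "my-resource",
  deployment_id: "my-deployment"
)
```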

text2vec_cohere(opts \\ [])

Configure text2vec-cohere vectorizer.

Options

  • :model - Model to use (optional)
  • :dimensions - Output dimensions (optional)
  • :truncate - Truncation mode (optional)
  • :base_url - Base URL for API (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_contextionary(opts \\ [])

Configure text2vec-contextionary vectorizer

text2vec_databricks(opts)

Configure text2vec-databricks vectorizer.

Options

  • :endpoint - Databricks serving endpoint (required)
  • :instruction - Instruction prefix (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_google_gemini(opts \\ [])

Configure text2vec-google with Google AI Studio (Gemini).

Options

  • :model - Model to use (optional)
  • :dimensions - Output dimensions (optional)
  • :title_property - Property to use as title (optional)
  • :task_type - Task type for embeddings (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

Example

VectorConfig.text2vec_google_gemini(model: "text-embedding-004")

text2vec_google_vertex(opts)

Configure text2vec-google with Google Vertex AI.

Options

  • :project_id - Google Cloud project ID (required)
  • :api_endpoint - API endpoint (optional)
  • :model - Model to use (optional)
  • :dimensions - Output dimensions (optional)
  • :title_property - Property to use as title (optional)
  • :task_type - Task type for embeddings (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

Example

VectorConfig.text2vec_google_vertex(
  project_id: "my-project",
  model: "textembedding-gecko@001"
)

text2vec_gpt4all(opts \\ [])

Configure text2vec-gpt4all vectorizer

text2vec_huggingface(opts \\ [])

Configure text2vec-huggingface vectorizer

text2vec_jinaai(opts \\ [])

Configure text2vec-jinaai vectorizer.

Options

  • :model - Jina model (e.g., "jina-embeddings-v3", "jina-embeddings-v4")
  • :base_url - API base URL (optional)
  • :dimensions - Output dimensions (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_mistral(opts \\ [])

Configure text2vec-mistral vectorizer.

Options

  • :model - Mistral model name
  • :base_url - Base URL for API (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_model2vec(opts \\ [])

Configure text2vec-model2vec vectorizer.

Options

  • :inference_url - URL for inference service (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_morph(opts \\ [])

Configure text2vec-morph vectorizer.

Options

  • :model - Model to use
  • :base_url - Base URL for API (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_nvidia(opts \\ [])

Configure text2vec-nvidia vectorizer.

Options

  • :model - NVIDIA model name
  • :base_url - Base URL for API (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

text2vec_ollama(opts \\ [])

Configure text2vec-ollama vectorizer.

Options

  • :model - Ollama model name
  • :api_endpoint - Ollama API endpoint (default: http://localhost:11434)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)
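
Example

A minimal sketch for a local Ollama server. The model name is illustrative; the endpoint shown is the documented default, written out for clarity.

```elixir
alias WeaviateEx.API.VectorConfig

# Point the vectorizer at a locally running Ollama instance.
VectorConfig.text2vec_ollama(
  model: "nomic-embed-text",
  api_endpoint: "http://localhost:11434"
)
```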

text2vec_openai(opts \\ [])

Configure text2vec-openai vectorizer

text2vec_palm(opts \\ [])

Configure text2vec-palm vectorizer (deprecated, use text2vec_google_vertex or text2vec_google_gemini).

text2vec_transformers(opts \\ [])

Configure text2vec-transformers vectorizer

text2vec_voyageai(opts \\ [])

Configure text2vec-voyageai vectorizer.

Options

  • :model - Model to use (e.g., "voyage-3.5", "voyage-3-large", "voyage-context-3")
  • :base_url - Base URL for API (optional)
  • :truncation - Whether to truncate (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)
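
Example

A minimal sketch using one of the model names listed above. Treating :truncation as a boolean is an assumption based on its description.

```elixir
alias WeaviateEx.API.VectorConfig

# Select a VoyageAI model and allow over-long inputs to be truncated.
VectorConfig.text2vec_voyageai(
  model: "voyage-3.5",
  truncation: true
)
```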

text2vec_weaviate(opts \\ [])

Configure text2vec-weaviate vectorizer (Weaviate-hosted embeddings).

Options

  • :model - Model name
  • :base_url - Base URL for API (optional)
  • :vectorize_collection_name - Whether to vectorize the collection name (default: true)

valid_distance?(metric)

Check if distance metric is valid

valid_vectorizer?(vectorizer)

Check if vectorizer is valid
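
Example

A hedged sketch of guarding a user-supplied vectorizer before building a configuration. It assumes valid_vectorizer?/1 returns a boolean, as the trailing ? suggests; the return type is not stated in this documentation.

```elixir
alias WeaviateEx.API.VectorConfig

vectorizer = :text2vec_openai

# Only build the configuration when the vectorizer is recognized.
config =
  if VectorConfig.valid_vectorizer?(vectorizer) do
    VectorConfig.new("Article")
    |> VectorConfig.with_vectorizer(vectorizer)
  else
    raise ArgumentError, "unsupported vectorizer: #{inspect(vectorizer)}"
  end
```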

with_binary_quantization(config, opts \\ [])

Add Binary Quantization to configuration

with_custom_vectorizer(config, module_name, opts \\ [])

@spec with_custom_vectorizer(map(), String.t(), keyword()) :: map()

Add a custom vectorizer to the collection configuration.

Examples

VectorConfig.new("Article")
|> VectorConfig.with_custom_vectorizer("text2vec-custom", model: "my-model")

with_dynamic_index(config, opts \\ [])

Add DYNAMIC index to configuration

with_flat_index(config, opts \\ [])

Add FLAT index to configuration

with_hnsw_index(config, opts \\ [])

Add HNSW index to configuration

with_multi_tenancy(config, opts \\ [])

Add multi-tenancy configuration

with_named_vectors(config, vectors)

Add named vectors configuration

with_product_quantization(config, opts \\ [])

Add Product Quantization to configuration

with_properties(config, properties)

Add properties to configuration

with_replication_config(config, opts \\ [])

@spec with_replication_config(
  map(),
  keyword()
) :: map()

Add replication configuration.

Options

  • :factor - Number of replicas (default: 1)
  • :async_enabled - Enable async replication (v1.26.0+, optional)
  • :deletion_strategy - Conflict resolution strategy (optional)
    • :delete_on_conflict - Delete object on conflict
    • :no_automated_resolution - No automated conflict resolution
    • :time_based_resolution - Use timestamp for resolution
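
Example

A minimal sketch that sets all three options. The replication factor of 3 is illustrative, and :async_enabled is documented above as requiring v1.26.0 or later.

```elixir
alias WeaviateEx.API.VectorConfig

# Three replicas, async replication, timestamp-based conflict resolution.
VectorConfig.new("Article")
|> VectorConfig.with_replication_config(
  factor: 3,
  async_enabled: true,
  deletion_strategy: :time_based_resolution
)
```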

with_reranker(config, reranker, opts \\ [])

@spec with_reranker(map(), atom(), keyword()) :: map()

Add reranker configuration to a collection.

Rerankers

  • :cohere - Cohere reranker
  • :transformers - Local transformers reranker
  • :voyageai - VoyageAI reranker
  • :jinaai - Jina AI reranker
  • :nvidia - NVIDIA reranker
  • :contextualai - Contextual AI reranker

Example

VectorConfig.new("Article")
|> VectorConfig.with_vectorizer(:text2vec_openai)
|> VectorConfig.with_reranker(:cohere, model: "rerank-multilingual-v3.0")

with_rotational_quantization(config, opts \\ [])

@spec with_rotational_quantization(
  map(),
  keyword()
) :: map()

Add Rotational Quantization to configuration

with_scalar_quantization(config, opts \\ [])

Add Scalar Quantization to configuration

with_sharding_config(config, opts \\ [])

Add sharding configuration

with_vectorizer(config, vectorizer, opts \\ [])

Add vectorizer to configuration