WeaviateEx.API.MultiVector (WeaviateEx v0.7.4)

Multi-vector configuration for ColBERT-style embeddings.

Multi-vectors allow storing multiple vectors per document, enabling late interaction retrieval methods like MaxSim.

Examples

alias WeaviateEx.API.MultiVector

# Self-provided multi-vectors
MultiVector.self_provided(
  name: "custom_multivec",
  encoding: MultiVector.muvera_encoding(ksim: 64)
)

# text2colbert-jinaai for ColBERT embeddings
MultiVector.text2colbert_jinaai(
  name: "colbert_vector",
  model: "jina-colbert-v2",
  source_properties: ["title", "content"],
  encoding: MultiVector.muvera_encoding(ksim: 64, dprojections: 128),
  multi_vector_config: MultiVector.multi_vector_config(aggregation: :max_sim)
)

Summary

Functions

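  • multi2multivec_jinaai(opts) - Create a multi2multivec-jinaai configuration for multimodal multi-vectors.
  • multi_vector_config(opts \\ []) - Configure multi-vector aggregation.
  • muvera_encoding(opts \\ []) - Configure Muvera encoding for multi-vectors.
  • self_provided(opts) - Create a self-provided multi-vector configuration.
  • text2colbert_jinaai(opts) - Create a text2colbert-jinaai multi-vector configuration.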

Types

aggregation()

@type aggregation() :: :max_sim

Functions

multi2multivec_jinaai(opts)

@spec multi2multivec_jinaai(keyword()) :: map()

Create a multi2multivec-jinaai configuration for multimodal multi-vectors.

Supports image and text fields for multimodal embeddings.

Options

  • :name - Vector name (required)
  • :model - Jina model (e.g., "jina-clip-v2")
  • :dimensions - Output dimensions
  • :image_fields - Image property fields
  • :text_fields - Text property fields
  • :vectorize_collection_name - Whether to vectorize collection name
  • :encoding - Encoding configuration
  • :multi_vector_config - Multi-vector config

Examples

MultiVector.multi2multivec_jinaai(
  name: "multivec",
  model: "jina-clip-v2",
  image_fields: ["image"],
  text_fields: ["caption"]
)
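
The example above leaves :encoding and :multi_vector_config unset. A minimal sketch of passing them follows; it assumes these options accept the maps returned by muvera_encoding/1 and multi_vector_config/1, as they do in the text2colbert_jinaai example, and that the module is aliased as MultiVector.

```elixir
alias WeaviateEx.API.MultiVector

# Assumption: :encoding and :multi_vector_config take the helper-built maps,
# mirroring the text2colbert_jinaai example elsewhere on this page.
MultiVector.multi2multivec_jinaai(
  name: "multivec",
  model: "jina-clip-v2",
  image_fields: ["image"],
  text_fields: ["caption"],
  encoding: MultiVector.muvera_encoding(),
  multi_vector_config: MultiVector.multi_vector_config(aggregation: :max_sim)
)
```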

multi_vector_config(opts \\ [])

@spec multi_vector_config(keyword()) :: map()

Configure multi-vector aggregation.

Options

  • :aggregation - Aggregation method (:max_sim for MaxSim)

Examples

MultiVector.multi_vector_config(aggregation: :max_sim)
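
As background for :max_sim, the MaxSim score used in late interaction can be sketched in plain Elixir: for each query vector, take its highest dot product against any document vector, then sum those maxima. MaxSimSketch and its functions are hypothetical names for illustration only. They are not part of WeaviateEx, and Weaviate computes this score server-side.

```elixir
defmodule MaxSimSketch do
  # Hypothetical illustration of MaxSim; not part of WeaviateEx.

  # Sum, over all query vectors, of the best dot product
  # against any of the document vectors.
  def score(query_vectors, doc_vectors) do
    query_vectors
    |> Enum.map(fn q ->
      doc_vectors |> Enum.map(&dot(q, &1)) |> Enum.max()
    end)
    |> Enum.sum()
  end

  # Dot product of two equal-length vectors.
  defp dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end
end

query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

# First query vector matches best at 1.0, second at 0.5.
IO.inspect(MaxSimSketch.score(query, doc))
# prints 1.5
```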

muvera_encoding(opts \\ [])

@spec muvera_encoding(keyword()) :: map()

Configure Muvera encoding for multi-vectors.

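Muvera encodes a variable-length set of vectors as a single fixed-dimensional vector, so multi-vector embeddings can be indexed and searched like ordinary single vectors.

Options

  • :ksim - Number of SimHash hyperplanes used to partition the vector space; the number of buckets is 2^ksim
  • :dprojections - Dimensionality of each bucket's sub-vector after random projection
  • :repetitions - Number of times the partitioning and projection steps are repeated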

Examples

MultiVector.muvera_encoding()
MultiVector.muvera_encoding(ksim: 64, dprojections: 128)
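
To get a feel for how these options interact, the sketch below estimates the length of the single vector Muvera produces. It assumes the relationship described in Weaviate's Muvera documentation, where the encoded dimensionality is repetitions * 2^ksim * dprojections. MuveraSketch is a hypothetical helper, not part of WeaviateEx, and the values 4, 16, and 10 are illustrative inputs.

```elixir
defmodule MuveraSketch do
  # Hypothetical helper; not part of WeaviateEx.
  # Assumes encoded length = repetitions * 2^ksim * dprojections.
  def encoded_dimensions(opts) do
    ksim = Keyword.fetch!(opts, :ksim)
    dprojections = Keyword.fetch!(opts, :dprojections)
    repetitions = Keyword.fetch!(opts, :repetitions)

    repetitions * Integer.pow(2, ksim) * dprojections
  end
end

IO.inspect(MuveraSketch.encoded_dimensions(ksim: 4, dprojections: 16, repetitions: 10))
# prints 2560
```

Because the bucket count doubles with each increment of :ksim, the encoded vector grows exponentially in that option and only linearly in the other two.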

self_provided(opts)

@spec self_provided(keyword()) :: map()

Create a self-provided multi-vector configuration.

Use this when you want to provide your own multi-vector embeddings rather than using automatic vectorization.

Options

  • :name - Vector name (required)
  • :encoding - Encoding configuration (e.g., from muvera_encoding/1)
  • :multi_vector_config - Multi-vector config (e.g., from multi_vector_config/1)
  • :hnsw_opts - HNSW index options

Examples

MultiVector.self_provided(
  name: "custom_multivec",
  encoding: MultiVector.muvera_encoding(ksim: 64)
)

text2colbert_jinaai(opts)

@spec text2colbert_jinaai(keyword()) :: map()

Create a text2colbert-jinaai multi-vector configuration.

ColBERT (Contextualized Late Interaction over BERT) produces multiple token-level embeddings for each document.

Options

  • :name - Vector name (required)
  • :model - Jina ColBERT model (e.g., "jina-colbert-v2")
  • :dimensions - Output dimensions
  • :source_properties - Properties to vectorize
  • :vectorize_collection_name - Whether to vectorize collection name
  • :encoding - Encoding configuration (e.g., from muvera_encoding/1)
  • :multi_vector_config - Multi-vector config (e.g., from multi_vector_config/1)
  • :hnsw_opts - HNSW index options

Examples

MultiVector.text2colbert_jinaai(
  name: "colbert_vector",
  model: "jina-colbert-v2",
  source_properties: ["title", "content"],
  encoding: MultiVector.muvera_encoding(ksim: 64, dprojections: 128)
)