WeaviateEx.API.MultiVector (WeaviateEx v0.7.4)
Multi-vector configuration for ColBERT-style embeddings.
Multi-vectors store several vectors per object (for example, one per token), which enables late-interaction retrieval with scoring functions such as MaxSim.
Examples
alias WeaviateEx.API.MultiVector

# Self-provided multi-vectors
MultiVector.self_provided(
  name: "custom_multivec",
  encoding: MultiVector.muvera_encoding(ksim: 64)
)

# text2colbert-jinaai for ColBERT embeddings
MultiVector.text2colbert_jinaai(
  name: "colbert_vector",
  model: "jina-colbert-v2",
  source_properties: ["title", "content"],
  encoding: MultiVector.muvera_encoding(ksim: 64, dprojections: 128),
  multi_vector_config: MultiVector.multi_vector_config(aggregation: :max_sim)
)
Summary
Functions
multi2multivec_jinaai/1
Create a multi2multivec-jinaai configuration for multimodal multi-vectors.
multi_vector_config/1
Configure multi-vector aggregation.
muvera_encoding/1
Configure Muvera encoding for multi-vectors.
self_provided/1
Create a self-provided multi-vector configuration.
text2colbert_jinaai/1
Create a text2colbert-jinaai multi-vector configuration.
Functions
multi2multivec_jinaai/1
Create a multi2multivec-jinaai configuration for multimodal multi-vectors.
Supports image and text fields for multimodal embeddings.
Options
:name - Vector name (required)
:model - Jina model (e.g., "jina-clip-v2")
:dimensions - Output dimensions
:image_fields - Image property fields
:text_fields - Text property fields
:vectorize_collection_name - Whether to vectorize collection name
:encoding - Encoding configuration
:multi_vector_config - Multi-vector config
Examples
MultiVector.multi2multivec_jinaai(
  name: "multivec",
  model: "jina-clip-v2",
  image_fields: ["image"],
  text_fields: ["caption"]
)
multi_vector_config/1
Configure multi-vector aggregation.
Options
:aggregation - Aggregation method (:max_sim for MaxSim)
Examples
MultiVector.multi_vector_config(aggregation: :max_sim)
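To make the :max_sim aggregation concrete, here is a minimal sketch of MaxSim scoring as it is usually defined for late interaction: for each query vector, take the best dot product against any document vector, then sum those maxima. The module MaxSimSketch and its score/2 function are hypothetical names used only for illustration; they are not part of WeaviateEx, and the server computes the actual score.

```elixir
defmodule MaxSimSketch do
  # Hypothetical helper, not part of WeaviateEx.

  # Dot product of two equal-length vectors.
  defp dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # MaxSim: for every query vector keep the highest similarity to any
  # document vector, then sum these per-query maxima.
  # Raises if doc_vectors is empty (Enum.max/1 on an empty list).
  def score(query_vectors, doc_vectors) do
    query_vectors
    |> Enum.map(fn q ->
      doc_vectors
      |> Enum.map(&dot(q, &1))
      |> Enum.max()
    end)
    |> Enum.sum()
  end
end

query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

# First query vector matches best with [1.0, 0.0] (1.0),
# second with [0.5, 0.5] (0.5), so the score is 1.5.
IO.inspect(MaxSimSketch.score(query, doc))
```

Because each query vector picks its own best match, a document can score well when different parts of it answer different parts of the query.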
muvera_encoding/1
Configure Muvera encoding for multi-vectors.
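Muvera encodes a multi-vector as a single fixed-dimensional vector, so that ordinary single-vector indexing can approximate multi-vector similarity.
Options
:ksim - Number of Gaussian vectors sampled for SimHash space partitioning (yields 2^ksim buckets)
:dprojections - Dimensionality of the sub-vectors after random projection
:repetitions - Number of times the partitioning and projection steps are repeated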
Examples
MultiVector.muvera_encoding()
MultiVector.muvera_encoding(ksim: 64, dprojections: 128)
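The three options interact multiplicatively. As a rough sketch, assuming the fixed-dimensional encoding described in the MUVERA paper (2^ksim SimHash buckets, each projected to dprojections dimensions, repeated repetitions times), the encoded vector length can be estimated as below. MuveraDimSketch is a hypothetical helper, and the fallback values 4, 16, and 10 are assumed from Weaviate's documented defaults; this page does not state which defaults WeaviateEx applies.

```elixir
defmodule MuveraDimSketch do
  # Hypothetical helper, not part of WeaviateEx.
  # Fallback values are assumptions taken from Weaviate's documented defaults.
  def fde_dimensions(opts \\ []) do
    ksim = Keyword.get(opts, :ksim, 4)
    dprojections = Keyword.get(opts, :dprojections, 16)
    repetitions = Keyword.get(opts, :repetitions, 10)

    # repetitions * number_of_buckets * dimensions_per_bucket
    repetitions * Integer.pow(2, ksim) * dprojections
  end
end

IO.inspect(MuveraDimSketch.fde_dimensions())
# → 2560

IO.inspect(MuveraDimSketch.fde_dimensions(ksim: 5, dprojections: 8, repetitions: 20))
# → 5120
```

Under this assumption the size grows exponentially with ksim, so it is the option to raise most cautiously.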
self_provided/1
Create a self-provided multi-vector configuration.
Use this when you want to provide your own multi-vector embeddings rather than using automatic vectorization.
Options
:name - Vector name (required)
:encoding - Encoding configuration (e.g., from muvera_encoding/1)
:multi_vector_config - Multi-vector config (e.g., from multi_vector_config/1)
:hnsw_opts - HNSW index options
Examples
MultiVector.self_provided(
  name: "custom_multivec",
  encoding: MultiVector.muvera_encoding(ksim: 64)
)
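With a self-provided configuration, the embeddings come from your own code, so it can help to check their shape before sending them. A multi-vector is a list of vectors that all share one dimensionality. The following is a minimal sketch of such a check; MultiVectorShapeSketch and valid?/1 are hypothetical names for illustration and are not part of WeaviateEx.

```elixir
defmodule MultiVectorShapeSketch do
  # Hypothetical helper, not part of WeaviateEx.

  # A multi-vector is valid here when it is a non-empty list of
  # non-empty numeric lists that all have the same length.
  def valid?([first | _] = vectors) when is_list(first) and first != [] do
    dim = length(first)

    Enum.all?(vectors, fn v ->
      is_list(v) and length(v) == dim and Enum.all?(v, &is_number/1)
    end)
  end

  # Anything else (empty list, flat vector, non-list) is rejected.
  def valid?(_), do: false
end

IO.inspect(MultiVectorShapeSketch.valid?([[0.1, 0.2], [0.3, 0.4]]))
# → true

IO.inspect(MultiVectorShapeSketch.valid?([[0.1], [0.2, 0.3]]))
# → false
```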
text2colbert_jinaai/1
Create a text2colbert-jinaai multi-vector configuration.
ColBERT (Contextualized Late Interaction over BERT) produces multiple token-level embeddings for each document.
Options
:name - Vector name (required)
:model - Jina ColBERT model (e.g., "jina-colbert-v2")
:dimensions - Output dimensions
:source_properties - Properties to vectorize
:vectorize_collection_name - Whether to vectorize collection name
:encoding - Encoding configuration (e.g., from muvera_encoding/1)
:multi_vector_config - Multi-vector config (e.g., from multi_vector_config/1)
:hnsw_opts - HNSW index options
Examples
MultiVector.text2colbert_jinaai(
  name: "colbert_vector",
  model: "jina-colbert-v2",
  source_properties: ["title", "content"],
  encoding: MultiVector.muvera_encoding(ksim: 64, dprojections: 128)
)