WeaviateEx.API.InvertedIndexConfig (WeaviateEx v0.7.4)
Configuration builders for Weaviate inverted index settings.
This module provides helpers for configuring BM25 parameters, stopwords, and other inverted index settings for collection schema definitions.
BM25 Parameters
BM25 (Best Matching 25) is the ranking algorithm used for keyword search:
- b - Length normalization (0.0-1.0), default 0.75
- k1 - Term frequency saturation, default 1.2
Stopwords
Stopwords are common words filtered from indexing:
- :en preset includes English stopwords
- :none disables stopword filtering
- Custom additions and removals can be specified
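To make the preset, additions, and removals concrete, here is a minimal sketch in plain Elixir. It is not WeaviateEx or Weaviate code: the `StopwordSketch` module, its tiny sample word list, and the order in which additions and removals are applied are all assumptions made for illustration.

```elixir
defmodule StopwordSketch do
  # Hypothetical, heavily shortened stand-in for the :en preset.
  @en_sample ~w(a an and the of)

  # Returns the effective stopword set for a preset plus custom changes.
  # Assumption: additions are applied first, then removals.
  def effective(preset, additions \\ [], removals \\ []) do
    base =
      case preset do
        :en -> MapSet.new(@en_sample)
        :none -> MapSet.new()
      end

    base
    |> MapSet.union(MapSet.new(additions))
    |> MapSet.difference(MapSet.new(removals))
  end

  # Drops stopwords from a list of lowercase tokens.
  def filter(tokens, stopwords) do
    Enum.reject(tokens, &MapSet.member?(stopwords, &1))
  end
end

stopwords = StopwordSketch.effective(:en, ["foo"], ["the"])
tokens = ~w(the price of foo and bar)

IO.inspect(StopwordSketch.filter(tokens, stopwords))
# => ["the", "price", "bar"]
```

Here "the" survives because it was removed from the preset, while "foo" is dropped because it was added.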
Examples
# Basic BM25 configuration
bm25 = InvertedIndexConfig.bm25(b: 0.8, k1: 1.5)

# Stopwords with additions
stopwords = InvertedIndexConfig.stopwords(
  preset: :en,
  additions: ["foo", "bar"]
)

# Full configuration for collection creation
config = InvertedIndexConfig.build(
  bm25: [b: 0.8, k1: 1.5],
  stopwords: [preset: :en, additions: ["foo"]],
  cleanup_interval_seconds: 60,
  index_timestamps: true,
  index_property_length: true,
  index_null_state: false
)

Collections.create("Article", %{
  properties: [...],
  invertedIndexConfig: config
})
Summary
Functions
- bm25/1 - Create BM25 ranking configuration.
- build/1 - Build a complete inverted index configuration from options.
- cleanup_interval_seconds/1 - Create configuration for cleanup interval.
- index_null_state/1 - Create configuration to enable/disable null state indexing.
- index_property_length/1 - Create configuration to enable/disable property length indexing.
- index_timestamps/1 - Create configuration to enable/disable timestamp indexing.
- merge/2 - Merge two inverted index configurations.
- stopwords/1 - Create stopwords configuration.
- validate/1 - Validate an inverted index configuration.
Types
@type inverted_index_config() :: %{
  optional(:bm25) => bm25_config(),
  optional(:stopwords) => stopwords_config(),
  optional(:cleanupIntervalSeconds) => non_neg_integer(),
  optional(:indexTimestamps) => boolean(),
  optional(:indexPropertyLength) => boolean(),
  optional(:indexNullState) => boolean()
}
@type stopwords_preset() :: :en | :none
Functions
@spec bm25(keyword()) :: bm25_config()
Create BM25 ranking configuration.
Parameters
- :b - Length normalization parameter (0.0-1.0), default: 0.75
  - 0.0 = no length normalization
  - 1.0 = full length normalization
- :k1 - Term frequency saturation parameter, default: 1.2
  - Higher values give more weight to term frequency
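To see how the two parameters act on a score, the sketch below computes the textbook Okapi BM25 contribution of a single term in plain Elixir. The `BM25Sketch` module is hypothetical and exists only for illustration; Weaviate's actual scoring may differ in detail, and the IDF value is passed in directly to keep the example short.

```elixir
defmodule BM25Sketch do
  @moduledoc "Illustrative Okapi BM25 term score; not WeaviateEx or Weaviate code."

  # Score contribution of one query term in one document.
  # tf: term frequency in the document
  # doc_len / avg_len: document length and average document length
  # idf: inverse document frequency, supplied by the caller
  def term_score(tf, doc_len, avg_len, idf, opts \\ []) do
    b = Keyword.get(opts, :b, 0.75)
    k1 = Keyword.get(opts, :k1, 1.2)

    # b scales how strongly the length ratio affects the score.
    length_norm = 1 - b + b * (doc_len / avg_len)

    # k1 controls how quickly repeated occurrences stop adding to the score.
    idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)
  end
end

# With b: 0.0, document length has no effect on the score.
short = BM25Sketch.term_score(3, 50, 100, 1.0, b: 0.0)
long = BM25Sketch.term_score(3, 200, 100, 1.0, b: 0.0)
IO.inspect(short == long)
# => true

# With b: 1.0, the longer document scores lower for the same term frequency.
IO.inspect(
  BM25Sketch.term_score(3, 200, 100, 1.0, b: 1.0) <
    BM25Sketch.term_score(3, 50, 100, 1.0, b: 1.0)
)
# => true
```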
Examples
# Default configuration
InvertedIndexConfig.bm25()
# => %{b: 0.75, k1: 1.2}
# Custom configuration
InvertedIndexConfig.bm25(b: 0.5, k1: 2.0)
# => %{b: 0.5, k1: 2.0}
@spec build(keyword()) :: inverted_index_config()
Build a complete inverted index configuration from options.
Options
- :bm25 - BM25 configuration options (see bm25/1)
- :stopwords - Stopwords configuration options (see stopwords/1)
- :cleanup_interval_seconds - Cleanup interval in seconds
- :index_timestamps - Enable timestamp indexing
- :index_property_length - Enable property length indexing
- :index_null_state - Enable null state indexing
Examples
InvertedIndexConfig.build(
  bm25: [b: 0.8, k1: 1.5],
  stopwords: [preset: :en],
  cleanup_interval_seconds: 60,
  index_timestamps: true
)
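The options use snake_case names, while the keys in inverted_index_config() are camelCase. The sketch below shows one way such a mapping could work for the four scalar options. The `KeyMappingSketch` module is hypothetical and is not the library's implementation. That options left unset stay absent from the result is an assumption based on every key in the type being optional.

```elixir
defmodule KeyMappingSketch do
  # Option name => key used in the resulting map (names taken from the
  # documented options and the inverted_index_config() type).
  @scalar_keys [
    cleanup_interval_seconds: :cleanupIntervalSeconds,
    index_timestamps: :indexTimestamps,
    index_property_length: :indexPropertyLength,
    index_null_state: :indexNullState
  ]

  # Copies only the options that were given; unset options stay absent.
  def scalar_options(opts) do
    for {opt, key} <- @scalar_keys, Keyword.has_key?(opts, opt), into: %{} do
      {key, Keyword.fetch!(opts, opt)}
    end
  end
end

IO.inspect(KeyMappingSketch.scalar_options(cleanup_interval_seconds: 60, index_timestamps: true))
# => %{cleanupIntervalSeconds: 60, indexTimestamps: true}
```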
@spec cleanup_interval_seconds(non_neg_integer()) :: map()
Create configuration for cleanup interval.
Sets the interval in seconds for cleaning up deleted entries from the inverted index.
Examples
# Cleanup every 5 minutes
InvertedIndexConfig.cleanup_interval_seconds(300)
# Immediate cleanup (not recommended for production)
InvertedIndexConfig.cleanup_interval_seconds(0)
Create configuration to enable/disable null state indexing.
When enabled, Weaviate indexes whether properties are null, allowing efficient filtering for null/non-null values.
Examples
InvertedIndexConfig.index_null_state(true)
# => %{indexNullState: true}
Create configuration to enable/disable property length indexing.
When enabled, Weaviate indexes the length of text properties, allowing efficient filtering by property length.
Examples
InvertedIndexConfig.index_property_length(true)
# => %{indexPropertyLength: true}
Create configuration to enable/disable timestamp indexing.
When enabled, Weaviate indexes creation and update timestamps,
allowing filtering by creationTimeUnix and lastUpdateTimeUnix.
Examples
InvertedIndexConfig.index_timestamps(true)
# => %{indexTimestamps: true}
@spec merge(inverted_index_config(), inverted_index_config()) :: inverted_index_config()
Merge two inverted index configurations.
The second configuration takes precedence for conflicting keys.
Examples
base = %{bm25: %{b: 0.75, k1: 1.2}}
override = %{indexTimestamps: true}
InvertedIndexConfig.merge(base, override)
# => %{bm25: %{b: 0.75, k1: 1.2}, indexTimestamps: true}
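The description does not say what happens when both configurations contain a nested map such as :bm25. The sketch below uses only the standard library's `Map.merge/2` and `Map.merge/3` to show the two possible behaviors, so the result of merge/2 can be compared against them. It makes no claim about which one the library implements.

```elixir
base = %{bm25: %{b: 0.75, k1: 1.2}, indexTimestamps: false}
override = %{bm25: %{b: 0.5}, indexTimestamps: true}

# Shallow merge: the override's :bm25 map replaces the base one entirely.
shallow = Map.merge(base, override)
IO.inspect(shallow)
# => %{bm25: %{b: 0.5}, indexTimestamps: true}

# Deep merge of one level of nested maps: k1 from the base survives.
deep =
  Map.merge(base, override, fn _key, left, right ->
    if is_map(left) and is_map(right), do: Map.merge(left, right), else: right
  end)

IO.inspect(deep)
# => %{bm25: %{b: 0.5, k1: 1.2}, indexTimestamps: true}
```

In both cases the second map wins for scalar keys such as :indexTimestamps; the difference only shows up in nested maps.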
@spec stopwords(keyword()) :: stopwords_config()
Create stopwords configuration.
Parameters
- :preset - Stopwords preset (:en or :none)
- :additions - List of words to add to stopwords
- :removals - List of words to remove from stopwords
Examples
# Use English stopwords
InvertedIndexConfig.stopwords(preset: :en)

# English stopwords with custom additions
InvertedIndexConfig.stopwords(
  preset: :en,
  additions: ["foo", "bar"]
)

# Remove specific words from English stopwords
InvertedIndexConfig.stopwords(
  preset: :en,
  removals: ["the", "a"]
)
@spec validate(inverted_index_config()) :: {:ok, inverted_index_config()} | {:error, String.t()}
Validate an inverted index configuration.
Returns
- {:ok, config} if valid
- {:error, message} if invalid
Examples
InvertedIndexConfig.validate(%{bm25: %{b: 0.5, k1: 1.2}})
# => {:ok, %{bm25: %{b: 0.5, k1: 1.2}}}
InvertedIndexConfig.validate(%{bm25: %{b: 1.5, k1: 1.2}})
# => {:error, "b must be between 0 and 1"}
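A caller usually pattern-matches on the two return shapes. The sketch below shows that pattern against a stand-in validator so it runs without the library. `ValidateSketch` and `safe_config` are hypothetical names, and the stand-in checks only the documented range for :b, which may be less than what validate/1 actually checks.

```elixir
defmodule ValidateSketch do
  # Checks only the documented range for :b; other keys pass through unchecked.
  def validate(%{bm25: %{b: b}}) when not is_number(b) or b < 0 or b > 1 do
    {:error, "b must be between 0 and 1"}
  end

  def validate(config) when is_map(config), do: {:ok, config}
end

# Hypothetical caller that falls back to an empty config when validation fails.
safe_config = fn config ->
  case ValidateSketch.validate(config) do
    {:ok, valid} ->
      valid

    {:error, message} ->
      IO.puts("invalid inverted index config: " <> message)
      %{}
  end
end

IO.inspect(safe_config.(%{bm25: %{b: 0.5, k1: 1.2}}))
# => %{bm25: %{b: 0.5, k1: 1.2}}
```

Whether to fall back to defaults or abort on an error is a design choice for the calling code; the fallback here is only one option.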