WeaviateEx.API.InvertedIndexConfig (WeaviateEx v0.7.4)

Configuration builders for Weaviate inverted index settings.

This module provides helpers for configuring BM25 parameters, stopwords, and other inverted index settings for collection schema definitions.

BM25 Parameters

BM25 (Best Matching 25) is the ranking algorithm Weaviate uses for keyword search:

  • b - Length normalization (0.0-1.0), default 0.75
  • k1 - Term frequency saturation, default 1.2

Stopwords

Stopwords are common words filtered from indexing:

  • :en preset includes English stopwords
  • :none disables stopword filtering
  • Custom additions and removals can be specified

Examples

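# Alias the module so the short name used in these examples resolves
alias WeaviateEx.API.InvertedIndexConfig

# Basic BM25 configuration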
bm25 = InvertedIndexConfig.bm25(b: 0.8, k1: 1.5)

# Stopwords with additions
stopwords = InvertedIndexConfig.stopwords(
  preset: :en,
  additions: ["foo", "bar"]
)

# Full configuration for collection creation
config = InvertedIndexConfig.build(
  bm25: [b: 0.8, k1: 1.5],
  stopwords: [preset: :en, additions: ["foo"]],
  cleanup_interval_seconds: 60,
  index_timestamps: true,
  index_property_length: true,
  index_null_state: false
)

Collections.create("Article", %{
  properties: [...],
  invertedIndexConfig: config
})

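Summary

Functions

  • bm25(opts \\ []) - Create BM25 ranking configuration.
  • build(opts \\ []) - Build a complete inverted index configuration from options.
  • cleanup_interval_seconds(seconds) - Create configuration for cleanup interval.
  • index_null_state(enabled) - Create configuration to enable/disable null state indexing.
  • index_property_length(enabled) - Create configuration to enable/disable property length indexing.
  • index_timestamps(enabled) - Create configuration to enable/disable timestamp indexing.
  • merge(base, override) - Merge two inverted index configurations.
  • stopwords(opts \\ []) - Create stopwords configuration.
  • validate(config) - Validate an inverted index configuration.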

Types

bm25_config()

@type bm25_config() :: %{b: float(), k1: float()}

inverted_index_config()

@type inverted_index_config() :: %{
  optional(:bm25) => bm25_config(),
  optional(:stopwords) => stopwords_config(),
  optional(:cleanupIntervalSeconds) => non_neg_integer(),
  optional(:indexTimestamps) => boolean(),
  optional(:indexPropertyLength) => boolean(),
  optional(:indexNullState) => boolean()
}

stopwords_config()

@type stopwords_config() :: %{
  optional(:preset) => String.t(),
  optional(:additions) => [String.t()],
  optional(:removals) => [String.t()]
}

stopwords_preset()

@type stopwords_preset() :: :en | :none

Functions

bm25(opts \\ [])

@spec bm25(keyword()) :: bm25_config()

Create BM25 ranking configuration.

Parameters

  • :b - Length normalization parameter (0.0-1.0), default: 0.75
    • 0.0 = no length normalization
    • 1.0 = full length normalization
  • :k1 - Term frequency saturation parameter, default: 1.2
    • Higher values let repeated occurrences of a term keep raising the score for longer before it saturates

Examples

# Default configuration
InvertedIndexConfig.bm25()
# => %{b: 0.75, k1: 1.2}

# Custom configuration
InvertedIndexConfig.bm25(b: 0.5, k1: 2.0)
# => %{b: 0.5, k1: 2.0}
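
To make the roles of :b and :k1 concrete, the following is a minimal sketch of the textbook Okapi BM25 per-term score. It is not taken from WeaviateEx or from Weaviate's internals: the module name BM25Sketch, its argument names, and the precomputed idf input are all assumptions made for illustration.

```elixir
defmodule BM25Sketch do
  @moduledoc "Illustrative only: textbook per-term BM25 score, not Weaviate's implementation."

  # tf: occurrences of the term in the text
  # len / avg_len: length of this text and the average length across objects
  # idf: inverse document frequency of the term, passed in precomputed
  def term_score(tf, len, avg_len, idf, opts \\ []) do
    b = Keyword.get(opts, :b, 0.75)
    k1 = Keyword.get(opts, :k1, 1.2)

    # b scales how strongly longer-than-average text is penalized
    norm = 1 - b + b * len / avg_len

    # k1 sets how quickly repeated occurrences stop adding to the score
    idf * tf * (k1 + 1) / (tf + k1 * norm)
  end
end

# With b: 0.0 the length of the text has no effect on the score
short = BM25Sketch.term_score(3, 50, 100, 1.0, b: 0.0)
long = BM25Sketch.term_score(3, 200, 100, 1.0, b: 0.0)
IO.inspect(short == long)
# => true
```

Under this formula, raising :b toward 1.0 lowers the score of the longer text relative to the shorter one, and raising :k1 increases the gap between a term that appears once and a term that appears many times.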

build(opts \\ [])

@spec build(keyword()) :: inverted_index_config()

Build a complete inverted index configuration from options.

Options

  • :bm25 - BM25 configuration options (see bm25/1)
  • :stopwords - Stopwords configuration options (see stopwords/1)
  • :cleanup_interval_seconds - Cleanup interval in seconds
  • :index_timestamps - Enable timestamp indexing
  • :index_property_length - Enable property length indexing
  • :index_null_state - Enable null state indexing

Examples

InvertedIndexConfig.build(
  bm25: [b: 0.8, k1: 1.5],
  stopwords: [preset: :en],
  cleanup_interval_seconds: 60,
  index_timestamps: true
)
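
The option names above are snake_case, while the keys in inverted_index_config() are camelCase. The sketch below shows one way such a translation could work. BuildSketch and its key table are hypothetical, inferred only from the option list and the type definition on this page; the real build/1 may apply defaults or conversions that this sketch omits.

```elixir
defmodule BuildSketch do
  # Option name => key in the resulting map (taken from inverted_index_config())
  @key_map %{
    bm25: :bm25,
    stopwords: :stopwords,
    cleanup_interval_seconds: :cleanupIntervalSeconds,
    index_timestamps: :indexTimestamps,
    index_property_length: :indexPropertyLength,
    index_null_state: :indexNullState
  }

  # Assumption: unknown options are silently dropped in this sketch
  def build(opts) do
    for {key, value} <- opts, Map.has_key?(@key_map, key), into: %{} do
      {Map.fetch!(@key_map, key), normalize(value)}
    end
  end

  # Nested keyword lists (bm25, stopwords) become maps; other values pass through
  defp normalize(value) when is_list(value), do: Map.new(value)
  defp normalize(value), do: value
end

config =
  BuildSketch.build(
    bm25: [b: 0.8, k1: 1.5],
    cleanup_interval_seconds: 60,
    index_timestamps: true
  )

IO.inspect(config.cleanupIntervalSeconds)
# => 60
```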

cleanup_interval_seconds(seconds)

@spec cleanup_interval_seconds(non_neg_integer()) :: map()

Create configuration for cleanup interval.

Sets the interval in seconds for cleaning up deleted entries from the inverted index.

Examples

# Cleanup every 5 minutes
InvertedIndexConfig.cleanup_interval_seconds(300)

# Immediate cleanup (not recommended for production)
InvertedIndexConfig.cleanup_interval_seconds(0)

index_null_state(enabled)

@spec index_null_state(boolean()) :: map()

Create configuration to enable/disable null state indexing.

When enabled, Weaviate indexes whether properties are null, allowing efficient filtering for null/non-null values.

Examples

InvertedIndexConfig.index_null_state(true)
# => %{indexNullState: true}

index_property_length(enabled)

@spec index_property_length(boolean()) :: map()

Create configuration to enable/disable property length indexing.

When enabled, Weaviate indexes the length of text properties, allowing efficient filtering by property length.

Examples

InvertedIndexConfig.index_property_length(true)
# => %{indexPropertyLength: true}

index_timestamps(enabled)

@spec index_timestamps(boolean()) :: map()

Create configuration to enable/disable timestamp indexing.

When enabled, Weaviate indexes creation and update timestamps, allowing filtering by creationTimeUnix and lastUpdateTimeUnix.

Examples

InvertedIndexConfig.index_timestamps(true)
# => %{indexTimestamps: true}

merge(base, override)

Merge two inverted index configurations.

The second configuration takes precedence for conflicting keys.

Examples

base = %{bm25: %{b: 0.75, k1: 1.2}}
override = %{indexTimestamps: true}
InvertedIndexConfig.merge(base, override)
# => %{bm25: %{b: 0.75, k1: 1.2}, indexTimestamps: true}
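
This page does not say how merge/2 treats nested maps such as :bm25 when both configurations contain them. The sketch below uses only Elixir's standard Map.merge/2 and Map.merge/3 to show the two plausible behaviours; which one merge/2 actually implements is an assumption to verify against the library source.

```elixir
base = %{bm25: %{b: 0.75, k1: 1.2}, indexTimestamps: false}
override = %{bm25: %{b: 0.5}, indexTimestamps: true}

# Map.merge/2: the right-hand map wins on conflicting top-level keys,
# so the whole nested :bm25 map is replaced and :k1 from base is lost
shallow = Map.merge(base, override)
IO.inspect(shallow.bm25)
# => %{b: 0.5}

# Map.merge/3 with a resolver keeps nested keys that the override omits
deep =
  Map.merge(base, override, fn
    _key, left, right when is_map(left) and is_map(right) -> Map.merge(left, right)
    _key, _left, right -> right
  end)

IO.inspect(deep.bm25 == %{b: 0.5, k1: 1.2})
# => true
```

When only part of a nested configuration needs to change, passing the complete nested map in the override sidesteps the question.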

stopwords(opts \\ [])

@spec stopwords(keyword()) :: stopwords_config()

Create stopwords configuration.

Parameters

  • :preset - Stopwords preset (:en or :none)
  • :additions - List of words to treat as stopwords in addition to the preset
  • :removals - List of preset words that should no longer be treated as stopwords

Examples

# Use English stopwords
InvertedIndexConfig.stopwords(preset: :en)

# English stopwords with custom additions
InvertedIndexConfig.stopwords(
  preset: :en,
  additions: ["foo", "bar"]
)

# Remove specific words from English stopwords
InvertedIndexConfig.stopwords(
  preset: :en,
  removals: ["the", "a"]
)
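
As a rough mental model of how the three options combine, the effective stopword set can be pictured as the preset plus :additions minus :removals. The sketch below is self-contained and hypothetical: StopwordsSketch, its tiny preset lists, and the :en default are stand-ins, and the actual filtering happens inside Weaviate rather than in this library.

```elixir
defmodule StopwordsSketch do
  # Tiny stand-in lists; the real :en preset is defined by Weaviate and is longer
  @presets %{en: ~w(a an and the of to), none: []}

  def effective(opts) do
    # Assumption: :en is used when no preset is given
    preset = Keyword.get(opts, :preset, :en)

    @presets
    |> Map.fetch!(preset)
    |> MapSet.new()
    |> MapSet.union(MapSet.new(Keyword.get(opts, :additions, [])))
    |> MapSet.difference(MapSet.new(Keyword.get(opts, :removals, [])))
  end

  # Drop stopwords from a list of already-tokenized words
  def filter(tokens, opts) do
    stopwords = effective(opts)
    Enum.reject(tokens, &MapSet.member?(stopwords, &1))
  end
end

tokens = ~w(the history of foo)

IO.inspect(StopwordsSketch.filter(tokens, preset: :en, additions: ["foo"]))
# => ["history"]

IO.inspect(StopwordsSketch.filter(tokens, preset: :en, removals: ["the"]))
# => ["the", "history", "foo"]
```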

validate(config)

@spec validate(inverted_index_config()) ::
  {:ok, inverted_index_config()} | {:error, String.t()}

Validate an inverted index configuration.

Returns

  • {:ok, config} if valid
  • {:error, message} if invalid

Examples

InvertedIndexConfig.validate(%{bm25: %{b: 0.5, k1: 1.2}})
# => {:ok, %{bm25: %{b: 0.5, k1: 1.2}}}

InvertedIndexConfig.validate(%{bm25: %{b: 1.5, k1: 1.2}})
# => {:error, "b must be between 0 and 1"}
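
Because validate/1 returns tagged tuples, callers would typically pattern match on the result before using the configuration. The sketch below illustrates that pattern with a stand-in validator so it can run on its own: ValidateSketch and with_valid/2 are hypothetical names, and the stand-in reproduces only the b-range check shown in the examples above, not the library's full rule set.

```elixir
defmodule ValidateSketch do
  # Stand-in that reproduces only the b-range check shown in the examples above
  def validate(%{bm25: %{b: b}}) when b < 0 or b > 1 do
    {:error, "b must be between 0 and 1"}
  end

  def validate(config), do: {:ok, config}

  # Validate first, then hand the config to a caller-supplied function
  def with_valid(config, fun) do
    case validate(config) do
      {:ok, valid} -> {:ok, fun.(valid)}
      {:error, message} -> {:error, "invalid inverted index config: " <> message}
    end
  end
end

IO.inspect(ValidateSketch.with_valid(%{bm25: %{b: 1.5, k1: 1.2}}, &Map.keys/1))
# => {:error, "invalid inverted index config: b must be between 0 and 1"}
```

In real code the function passed in would be the step that sends the schema to Weaviate, so an invalid configuration is rejected before any request is made.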