Stephen.Index.Compressed (Stephen v1.0.0)

Compressed document index using residual compression.

Combines PLAID-style centroid indexing with ColBERTv2 residual compression for memory-efficient storage while maintaining retrieval quality.

Instead of storing full float32 embeddings (~512 bytes per token at 128 dimensions), the index stores, per token:

Centroid ID (2 bytes)
Quantized residual (dim bytes at 8 bits)

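This achieves a compression ratio of roughly 4-6x. At 128 dimensions with 8-bit residuals, a token takes 130 bytes instead of 512 (about 4x).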
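The per-token arithmetic behind the storage layout above can be checked with a short sketch. The `CompressionMath` module is hypothetical and not part of Stephen. It assumes a 2-byte centroid ID plus `bits` bits per residual dimension; the rounding to whole bytes for bit widths below 8 is also an assumption.

```elixir
defmodule CompressionMath do
  @moduledoc "Hypothetical helper for per-token storage cost; not part of Stephen."

  # Bytes for one uncompressed float32 token embedding.
  def raw_bytes(dim), do: dim * 4

  # Bytes for one compressed token: a 2-byte centroid ID plus the
  # quantized residual (dim * bits, rounded up to whole bytes; assumption).
  def compressed_bytes(dim, bits \\ 8), do: 2 + div(dim * bits + 7, 8)

  # Compression ratio for a single token embedding.
  def ratio(dim, bits \\ 8), do: raw_bytes(dim) / compressed_bytes(dim, bits)
end

IO.inspect(CompressionMath.raw_bytes(128))
IO.inspect(CompressionMath.compressed_bytes(128))
IO.inspect(Float.round(CompressionMath.ratio(128), 2))
```

These numbers ignore the fixed cost of the centroid tables themselves, which is shared across all documents.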

Usage

# Create an index, then train compression on document embeddings
index = Stephen.Index.Compressed.new(embedding_dim: 128)
index = Stephen.Index.Compressed.train(index, all_doc_embeddings)

# Add documents (stores compressed)
index = Stephen.Index.Compressed.add(index, "doc1", embeddings1)
index = Stephen.Index.Compressed.add(index, "doc2", embeddings2)

# Search (decompresses on-the-fly)
results = Stephen.Index.Compressed.search(index, query_embeddings, top_k: 10)

Summary

Types

doc_id()

t()

Functions

add(index, doc_id, embeddings)

Adds a document's embeddings to the index (stores compressed).

add_all(index, documents)

Adds multiple documents to the index.

delete(index, doc_id)

Removes a document from the index.

delete_all(index, doc_ids)

Removes multiple documents from the index.

doc_ids(index)

Returns all document IDs in the index.

get_compressed(index, doc_id)

Gets the compressed representation for a document.

get_embeddings(index, doc_id)

Gets the decompressed embeddings for a document.

has_doc?(index, doc_id)

Checks if a document exists in the index.

index_documents(index, documents)

Indexes documents: trains on all embeddings, then adds each document.

load(path)

Loads a compressed index from disk.

new(opts \\ [])

Creates a new empty compressed index.

save(index, path)

Saves the compressed index to disk.

search(index, query_embeddings, opts \\ [])

Searches the compressed index for documents matching a query.

size(index)

Returns the number of documents in the index.

stats(index)

Returns compression statistics for the index.

train(index, embeddings, opts \\ [])

Trains the compression codebook and PLAID centroids on document embeddings.

update(index, doc_id, embeddings)

Updates a document in the index by replacing its embeddings.

Types

doc_id()

@type doc_id() :: term()

t()

@type t() :: %Stephen.Index.Compressed{
  centroids: Nx.Tensor.t() | nil,
  compressed_embeddings: %{
    required(term()) => Stephen.Compression.compressed_embedding()
  },
  compression: Stephen.Compression.t() | nil,
  doc_count: non_neg_integer(),
  embedding_dim: pos_integer(),
  inverted_index: %{required(non_neg_integer()) => MapSet.t()},
  num_centroids: pos_integer(),
  trained?: boolean()
}

Functions

add(index, doc_id, embeddings)

@spec add(t(), doc_id(), Nx.Tensor.t()) :: t()

Adds a document's embeddings to the index (stores compressed).

The index must be trained before adding documents.

Arguments

index - The compressed index struct
doc_id - Unique identifier for the document
embeddings - Tensor of shape {num_tokens, embedding_dim}
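Because `add/3` expects a trained index and a `{num_tokens, embedding_dim}` tensor, a caller may want to check both before adding. The following is a minimal sketch, assuming Stephen and Nx are available as dependencies; the `AddGuard` module and its error tuples are hypothetical, and it relies only on the `trained?` and `embedding_dim` fields shown in `t()`.

```elixir
defmodule AddGuard do
  @moduledoc "Hypothetical pre-flight check around add/3; not part of Stephen."

  alias Stephen.Index.Compressed

  # An untrained index cannot accept documents.
  def safe_add(%{trained?: false}, _doc_id, _embeddings), do: {:error, :not_trained}

  # Accept only tensors whose second axis matches the index dimension.
  def safe_add(%{embedding_dim: dim} = index, doc_id, embeddings) do
    case Nx.shape(embeddings) do
      {_num_tokens, ^dim} -> {:ok, Compressed.add(index, doc_id, embeddings)}
      shape -> {:error, {:bad_shape, shape}}
    end
  end
end
```

A call such as `AddGuard.safe_add(index, "doc1", embeddings1)` then returns `{:ok, index}` on success or an `{:error, reason}` tuple otherwise.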

add_all(index, documents)

@spec add_all(t(), [{doc_id(), Nx.Tensor.t()}]) :: t()

Adds multiple documents to the index.

Arguments

index - The compressed index struct
documents - List of {doc_id, embeddings} tuples

delete(index, doc_id)

@spec delete(t(), doc_id()) :: t()

Removes a document from the index.

Arguments

index - The compressed index struct
doc_id - The document ID to remove

Returns

Updated compressed index, or the original index if doc_id is not found.

delete_all(index, doc_ids)

@spec delete_all(t(), [doc_id()]) :: t()

Removes multiple documents from the index.

Arguments

index - The compressed index struct
doc_ids - List of document IDs to remove

Returns

Updated compressed index.

doc_ids(index)

@spec doc_ids(t()) :: [doc_id()]

Returns all document IDs in the index.

get_compressed(index, doc_id)

@spec get_compressed(t(), doc_id()) ::
  Stephen.Compression.compressed_embedding() | nil

Gets the compressed representation for a document.

get_embeddings(index, doc_id)

@spec get_embeddings(t(), doc_id()) :: Nx.Tensor.t() | nil

Gets the decompressed embeddings for a document.

has_doc?(index, doc_id)

@spec has_doc?(t(), doc_id()) :: boolean()

Checks if a document exists in the index.

index_documents(index, documents)

@spec index_documents(t(), [{doc_id(), Nx.Tensor.t()}]) :: t()

Indexes documents: trains on all embeddings, then adds each document.

Convenience function that combines train/3 and add/3.

Arguments

index - The compressed index struct
documents - List of {doc_id, embeddings} tuples
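Based on the description above, the convenience function presumably behaves like the following two-step pipeline. This is a sketch under that assumption, not the library's source; the `IndexingSketch` module is hypothetical and requires Stephen as a dependency.

```elixir
defmodule IndexingSketch do
  @moduledoc "Hypothetical equivalent of index_documents/2; not Stephen's source."

  alias Stephen.Index.Compressed

  # Train on every document's embeddings, then add the documents
  # to the trained index.
  def train_then_add(index, documents) do
    # train/3 accepts a list of tensors, so collect one per document.
    embeddings = Enum.map(documents, fn {_doc_id, emb} -> emb end)

    index
    |> Compressed.train(embeddings)
    |> Compressed.add_all(documents)
  end
end
```

Calling `train/3` and `add_all/2` separately is useful when the training set differs from the documents being indexed, for example when training on a sample.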

load(path)

@spec load(Path.t()) :: {:ok, t()} | {:error, term()}

Loads a compressed index from disk.

Arguments

path - File path to load from

Returns

{:ok, index} or {:error, reason}
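Since `load/1` returns a tagged tuple rather than raising, callers need to handle both outcomes. One possible pattern is sketched below; the `IndexStore` module and the fallback to a fresh index are illustrative assumptions, and the code requires Stephen as a dependency.

```elixir
defmodule IndexStore do
  @moduledoc "Hypothetical wrapper around load/1; not part of Stephen."

  alias Stephen.Index.Compressed

  # Loads the index at `path`, or falls back to a fresh, untrained
  # index built from `opts` when loading fails.
  def load_or_new(path, opts) do
    case Compressed.load(path) do
      {:ok, index} ->
        index

      {:error, reason} ->
        IO.puts(:stderr, "could not load #{path}: #{inspect(reason)}")
        Compressed.new(opts)
    end
  end
end
```

For example, `IndexStore.load_or_new("index.bin", embedding_dim: 128)`, where the file name is a placeholder. A freshly created index still has to be trained before documents can be added.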

new(opts \\ [])

@spec new(keyword()) :: t()

Creates a new empty compressed index.

Options

:embedding_dim - Dimension of embeddings (required)
:num_centroids - Number of PLAID centroids for candidate generation (default: 1024)
:compression_centroids - Number of compression centroids (default: 2048)

save(index, path)

@spec save(t(), Path.t()) :: :ok | {:error, term()}

Saves the compressed index to disk.

Arguments

index - The compressed index struct
path - File path to save to

search(index, query_embeddings, opts \\ [])

@spec search(t(), Nx.Tensor.t(), keyword()) :: [%{doc_id: doc_id(), score: float()}]

Searches the compressed index for documents matching a query.

Uses PLAID-style candidate generation followed by reranking with decompressed embeddings.

Arguments

index - The compressed index struct
query_embeddings - Query token embeddings
opts - Search options

Options

:top_k - Number of results to return (default: 10)
:nprobe - Number of centroids to probe (default: 32)

Returns

List of %{doc_id: term(), score: float()} sorted by score descending.
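To make the two stages concrete, here is a minimal pure-Elixir sketch of centroid probing over an inverted index followed by scoring. It is not Stephen's implementation: the `SearchSketch` module is hypothetical, vectors are plain lists instead of Nx tensors, and the reranking step assumes ColBERT-style MaxSim scoring (sum over query tokens of the best dot product against any document token).

```elixir
defmodule SearchSketch do
  @moduledoc "Hypothetical two-stage search sketch; not Stephen's implementation."

  # Stage 1: for each query token, pick the `nprobe` centroids with the
  # highest dot product, then union the doc IDs listed under them.
  def candidates(query_tokens, centroids, inverted_index, nprobe) do
    query_tokens
    |> Enum.flat_map(fn q ->
      centroids
      |> Enum.with_index()
      |> Enum.sort_by(fn {c, _id} -> dot(q, c) end, :desc)
      |> Enum.take(nprobe)
      |> Enum.map(fn {_c, id} -> id end)
    end)
    |> Enum.uniq()
    |> Enum.reduce(MapSet.new(), fn id, acc ->
      MapSet.union(acc, Map.get(inverted_index, id, MapSet.new()))
    end)
  end

  # Stage 2 (assumed MaxSim): each query token contributes its best
  # match among the document's tokens.
  def maxsim(query_tokens, doc_tokens) do
    query_tokens
    |> Enum.map(fn q -> doc_tokens |> Enum.map(&dot(q, &1)) |> Enum.max() end)
    |> Enum.sum()
  end

  defp dot(a, b), do: Enum.zip_with(a, b, &(&1 * &2)) |> Enum.sum()
end

centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

inverted_index = %{
  0 => MapSet.new(["doc1"]),
  1 => MapSet.new(["doc2"]),
  2 => MapSet.new(["doc3"])
}

query = [[0.9, 0.1]]
IO.inspect(SearchSketch.candidates(query, centroids, inverted_index, 2))
IO.inspect(SearchSketch.maxsim(query, [[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch a larger `nprobe` widens the candidate set at the cost of scoring more documents, which is the usual recall versus latency trade-off for this kind of option.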

size(index)

@spec size(t()) :: non_neg_integer()

Returns the number of documents in the index.

stats(index)

@spec stats(t()) :: map()

Returns compression statistics for the index.

train(index, embeddings, opts \\ [])

@spec train(t(), [Nx.Tensor.t()] | Nx.Tensor.t(), keyword()) :: t()

Trains the compression codebook and PLAID centroids on document embeddings.

Must be called before adding documents to the index.

Arguments

index - The compressed index struct
embeddings - List of embedding tensors, or a single concatenated tensor

Options

:compression_centroids - Number of compression centroids (default: 2048)
:residual_bits - Bits for residual quantization (default: 8)
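The idea behind residual quantization can be illustrated with a small pure-Elixir sketch. It is not the code in `Stephen.Compression`: the `ResidualSketch` module is hypothetical, it works on plain lists, and it uses uniform quantization over an assumed fixed range of -1.0 to 1.0, whereas the library may choose its buckets differently.

```elixir
defmodule ResidualSketch do
  @moduledoc "Hypothetical illustration of residual compression; not Stephen's code."

  # Assumed value range for residual components (illustration only).
  @lo -1.0
  @hi 1.0

  # Returns {centroid_id, codes}: the nearest centroid plus one integer
  # code in 0..(2^bits - 1) per dimension.
  def compress(vec, centroids, bits \\ 8) do
    {centroid, id} =
      centroids
      |> Enum.with_index()
      |> Enum.min_by(fn {c, _i} -> sq_dist(vec, c) end)

    levels = Integer.pow(2, bits) - 1

    codes =
      Enum.zip_with(vec, centroid, fn v, c ->
        # Clamp the residual into the assumed range, then scale to a code.
        r = min(max(v - c, @lo), @hi)
        round((r - @lo) / (@hi - @lo) * levels)
      end)

    {id, codes}
  end

  # Rebuilds an approximate vector: centroid + dequantized residual.
  def decompress({id, codes}, centroids, bits \\ 8) do
    levels = Integer.pow(2, bits) - 1
    centroid = Enum.at(centroids, id)

    Enum.zip_with(centroid, codes, fn c, q ->
      c + (@lo + q / levels * (@hi - @lo))
    end)
  end

  defp sq_dist(a, b) do
    Enum.zip_with(a, b, fn x, y -> (x - y) * (x - y) end) |> Enum.sum()
  end
end

centroids = [[0.0, 0.0], [1.0, 1.0]]
compressed = ResidualSketch.compress([0.9, 1.2], centroids)
IO.inspect(compressed)
IO.inspect(ResidualSketch.decompress(compressed, centroids))
```

Fewer bits mean fewer quantization levels, so the reconstructed vector drifts further from the original; that is the storage versus accuracy trade-off this option controls.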

update(index, doc_id, embeddings)

@spec update(t(), doc_id(), Nx.Tensor.t()) :: t()

Updates a document in the index by replacing its embeddings.

This is equivalent to deleting and re-adding the document.

Arguments

index - The compressed index struct
doc_id - The document ID to update
embeddings - New embeddings tensor

Returns

Updated compressed index.