Stephen.Index.Compressed (Stephen v1.0.0)
Compressed document index using residual compression.
Combines PLAID-style centroid indexing with ColBERTv2 residual compression for memory-efficient storage while maintaining retrieval quality.
Instead of storing full float32 embeddings (512 bytes per token at 128 dimensions), the index stores for each token:
- Centroid ID (2 bytes)
- Quantized residual (dim bytes at 8 bits)
This achieves a compression ratio of roughly 4-6x.
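As a rough check on these figures, the per-token sizes can be worked out by hand. The sketch below is not part of the Stephen API: the `CompressionEstimate` module and its functions are hypothetical, it assumes residuals are bit-packed at `residual_bits` per dimension, and it ignores the centroid codebooks and other index structures.

```elixir
defmodule CompressionEstimate do
  @centroid_id_bytes 2
  @float32_bytes 4

  # Bytes per token for uncompressed float32 embeddings.
  def raw_bytes(dim), do: dim * @float32_bytes

  # Bytes per token when stored as a centroid ID plus a packed residual
  # (assumption: residual bits are packed and rounded up to whole bytes).
  def compressed_bytes(dim, residual_bits \\ 8) do
    @centroid_id_bytes + div(dim * residual_bits + 7, 8)
  end

  # Per-token compression ratio (payload only, no codebook overhead).
  def ratio(dim, residual_bits \\ 8) do
    raw_bytes(dim) / compressed_bytes(dim, residual_bits)
  end
end

CompressionEstimate.raw_bytes(128)        # 512
CompressionEstimate.compressed_bytes(128) # 130
CompressionEstimate.ratio(128)            # about 3.9
```

Under these assumptions, 128-dimensional embeddings at 8 bits land at the low end of the quoted range; fewer residual bits would raise the ratio at some cost in reconstruction accuracy.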
Usage
# Train compression on document embeddings
index = Stephen.Index.Compressed.new(embedding_dim: 128)
index = Stephen.Index.Compressed.train(index, all_doc_embeddings)
# Add documents (stores compressed)
index = Stephen.Index.Compressed.add(index, "doc1", embeddings1)
index = Stephen.Index.Compressed.add(index, "doc2", embeddings2)
# Search (decompresses on-the-fly)
results = Stephen.Index.Compressed.search(index, query_embeddings, top_k: 10)
Types
@type doc_id() :: term()
@type t() :: %Stephen.Index.Compressed{
  centroids: Nx.Tensor.t() | nil,
  compressed_embeddings: %{required(term()) => Stephen.Compression.compressed_embedding()},
  compression: Stephen.Compression.t() | nil,
  doc_count: non_neg_integer(),
  embedding_dim: pos_integer(),
  inverted_index: %{required(non_neg_integer()) => MapSet.t()},
  num_centroids: pos_integer(),
  trained?: boolean()
}
Functions
@spec add(t(), doc_id(), Nx.Tensor.t()) :: t()
Adds a document's embeddings to the index (stores compressed).
The index must be trained before adding documents.
Arguments
- index - The compressed index struct
- doc_id - Unique identifier for the document
- embeddings - Tensor of shape {num_tokens, embedding_dim}
@spec add_all(t(), [{doc_id(), Nx.Tensor.t()}]) :: t()
Adds multiple documents to the index.
Arguments
- index - The compressed index struct
- documents - List of {doc_id, embeddings} tuples
Removes a document from the index.
Arguments
- index - The compressed index struct
- doc_id - The document ID to remove
Returns
Updated compressed index, or the original index if doc_id not found.
Removes multiple documents from the index.
Arguments
- index - The compressed index struct
- doc_ids - List of document IDs to remove
Returns
Updated compressed index.
Returns all document IDs in the index.
@spec get_compressed(t(), doc_id()) :: Stephen.Compression.compressed_embedding() | nil
Gets the compressed representation for a document.
@spec get_embeddings(t(), doc_id()) :: Nx.Tensor.t() | nil
Gets the decompressed embeddings for a document.
Checks if a document exists in the index.
@spec index_documents(t(), [{doc_id(), Nx.Tensor.t()}]) :: t()
Indexes documents: trains on all embeddings, then adds each document.
Convenience function that combines train/3 and add/3.
Arguments
- index - The compressed index struct
- documents - List of {doc_id, embeddings} tuples
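The train-then-add sequence described here can be sketched with the two documented functions. This is a reading of the description, not the library's actual implementation: the `IndexAllSketch` module is hypothetical, and the `mod` parameter exists only so the sketch can be exercised without the library loaded.

```elixir
defmodule IndexAllSketch do
  # Trains on every document's embeddings, then adds each document in turn.
  # `mod` defaults to the real module; any module exposing train/2 and add/3
  # with the same shapes can be substituted.
  def run(index, documents, mod \\ Stephen.Index.Compressed) do
    embeddings = Enum.map(documents, fn {_doc_id, emb} -> emb end)
    trained = mod.train(index, embeddings)

    Enum.reduce(documents, trained, fn {doc_id, emb}, acc ->
      mod.add(acc, doc_id, emb)
    end)
  end
end
```

Because training sees all embeddings before any document is stored, the centroids reflect the whole collection rather than only the first few documents.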
Loads a compressed index from disk.
Arguments
- path - File path to load from
Returns
{:ok, index} or {:error, reason}
Creates a new empty compressed index.
Options
- :embedding_dim - Dimension of embeddings (required)
- :num_centroids - Number of PLAID centroids for candidate generation (default: 1024)
- :compression_centroids - Number of compression centroids (default: 2048)
Saves the compressed index to disk.
Arguments
- index - The compressed index struct
- path - File path to save to
@spec search(t(), Nx.Tensor.t(), keyword()) :: [%{doc_id: doc_id(), score: float()}]
Searches the compressed index for documents matching a query.
Uses PLAID-style candidate generation followed by reranking with decompressed embeddings.
Arguments
- index - The compressed index struct
- query_embeddings - Query token embeddings
- opts - Search options
Options
- :top_k - Number of results to return (default: 10)
- :nprobe - Number of centroids to probe (default: 32)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
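The two search stages can be illustrated with plain lists in place of Nx tensors. This is a simplified sketch under stated assumptions, not the library's code: the `SearchSketch` module is hypothetical, similarity is assumed to be a dot product, reranking is assumed to use ColBERT-style MaxSim scoring, and `docs` stands in for already-decompressed embeddings. The `inverted_index` argument has the shape shown in `t()`: centroid ID to a `MapSet` of document IDs.

```elixir
defmodule SearchSketch do
  # Dot product of two equal-length lists of numbers.
  defp dot(a, b), do: Enum.zip_with(a, b, fn x, y -> x * y end) |> Enum.sum()

  # IDs of the `nprobe` centroids most similar to one query token.
  def probe(token, centroids, nprobe) do
    centroids
    |> Enum.with_index()
    |> Enum.sort_by(fn {centroid, _id} -> dot(token, centroid) end, :desc)
    |> Enum.take(nprobe)
    |> Enum.map(fn {_centroid, id} -> id end)
  end

  # Stage 1: union of the documents listed under every probed centroid.
  def candidates(query_tokens, centroids, inverted_index, nprobe) do
    for token <- query_tokens,
        id <- probe(token, centroids, nprobe),
        reduce: MapSet.new() do
      acc -> MapSet.union(acc, Map.get(inverted_index, id, MapSet.new()))
    end
  end

  # MaxSim: each query token contributes its best match among document tokens.
  def max_sim(query_tokens, doc_tokens) do
    query_tokens
    |> Enum.map(fn q -> doc_tokens |> Enum.map(&dot(q, &1)) |> Enum.max() end)
    |> Enum.sum()
  end

  # Stage 2: score only the candidates, sort descending, keep `top_k`.
  def search(query_tokens, docs, centroids, inverted_index, opts \\ []) do
    top_k = Keyword.get(opts, :top_k, 10)
    nprobe = Keyword.get(opts, :nprobe, 32)

    query_tokens
    |> candidates(centroids, inverted_index, nprobe)
    |> Enum.map(fn id -> %{doc_id: id, score: max_sim(query_tokens, Map.fetch!(docs, id))} end)
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.take(top_k)
  end
end
```

A larger `:nprobe` widens the candidate set, which tends to improve recall at the cost of scoring more documents.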
@spec size(t()) :: non_neg_integer()
Returns the number of documents in the index.
Returns compression statistics for the index.
@spec train(t(), [Nx.Tensor.t()] | Nx.Tensor.t(), keyword()) :: t()
Trains the compression codebook and PLAID centroids on document embeddings.
Must be called before adding documents to the index.
Arguments
- index - The compressed index struct
- embeddings - List of embedding tensors or single concatenated tensor
Options
- :compression_centroids - Number of compression centroids (default: 2048)
- :residual_bits - Bits for residual quantization (default: 8)
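To illustrate what `:residual_bits` controls, the sketch below compresses one vector to a centroid ID plus a quantized residual and then reconstructs it. It is an assumption-laden illustration, not Stephen's implementation: the `ResidualSketch` module is hypothetical, it uses plain lists instead of Nx tensors, and it applies uniform scalar quantization over a fixed `{lo, hi}` range, whereas the library may learn its quantization buckets during training.

```elixir
defmodule ResidualSketch do
  # Compresses one vector to {centroid_id, residual_codes}.
  def compress(vec, centroids, {lo, hi}, bits \\ 8) do
    {centroid, id} =
      centroids
      |> Enum.with_index()
      |> Enum.min_by(fn {c, _id} -> sq_dist(vec, c) end)

    codes = Enum.zip_with(vec, centroid, fn v, c -> quantize(v - c, lo, hi, bits) end)
    {id, codes}
  end

  # Reconstructs an approximation: centroid plus dequantized residual.
  def decompress({id, codes}, centroids, {lo, hi}, bits \\ 8) do
    centroid = Enum.at(centroids, id)
    Enum.zip_with(centroid, codes, fn c, q -> c + dequantize(q, lo, hi, bits) end)
  end

  # Number of quantization steps: 2^bits - 1 (255 at 8 bits).
  defp levels(bits), do: Integer.pow(2, bits) - 1

  # Clamps the residual into [lo, hi] and maps it to an integer code.
  defp quantize(r, lo, hi, bits) do
    clamped = r |> max(lo) |> min(hi)
    round((clamped - lo) / (hi - lo) * levels(bits))
  end

  defp dequantize(q, lo, hi, bits), do: lo + q / levels(bits) * (hi - lo)

  defp sq_dist(a, b) do
    Enum.zip_with(a, b, fn x, y -> (x - y) * (x - y) end) |> Enum.sum()
  end
end
```

With fewer bits each residual code covers a wider interval, so the reconstruction error per dimension grows while storage shrinks.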
@spec update(t(), doc_id(), Nx.Tensor.t()) :: t()
Updates a document in the index by replacing its embeddings.
This is equivalent to deleting and re-adding the document.
Arguments
- index - The compressed index struct
- doc_id - The document ID to update
- embeddings - New embeddings tensor
Returns
Updated compressed index.