Stephen.Retriever (Stephen v1.0.0)
High-level retrieval operations combining encoding, indexing, and scoring.
Implements the full ColBERT retrieval pipeline:
- Encode query to per-token embeddings
- Search ANN index for candidate documents
- Rerank candidates using full MaxSim scoring
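The MaxSim score used in the final step can be illustrated without the library. The sketch below is not Stephen's implementation: the module name `MaxSimSketch` is hypothetical, embeddings are plain lists of floats rather than Nx tensors, and similarity is assumed to be a dot product over non-empty token lists.

```elixir
defmodule MaxSimSketch do
  # Hypothetical helper: dot product of two equal-length lists of floats.
  def dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # MaxSim: for each query token, take its best match among the document
  # tokens, then sum those maxima over all query tokens.
  # Assumes doc_tokens is non-empty (Enum.max/1 raises on an empty list).
  def score(query_tokens, doc_tokens) do
    query_tokens
    |> Enum.map(fn q ->
      doc_tokens
      |> Enum.map(fn d -> dot(q, d) end)
      |> Enum.max()
    end)
    |> Enum.sum()
  end
end

query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
doc_b = [[0.6, 0.8]]

score_a = MaxSimSketch.score(query, doc_a)
score_b = MaxSimSketch.score(query, doc_b)
```

Here `doc_a` scores higher than `doc_b` because each query token finds an exact match in it, while `doc_b` offers only one partially aligned token for both query tokens.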
Two-Stage Retrieval
For large collections, use a two-stage retrieval approach:
- First stage: Fast candidate retrieval (BM25, dense retriever, etc.)
- Second stage: Rerank candidates with ColBERT MaxSim
# Get candidates from first stage (e.g., BM25)
candidates = MySearch.bm25_search(query, top_k: 100)
# Rerank with ColBERT
results = Stephen.Retriever.rerank(encoder, index, query, candidates)
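The shape of a two-stage pipeline can be sketched in plain Elixir. Everything below is illustrative: `TwoStageSketch`, its word-overlap first stage (a crude stand-in for BM25), and the `scorer` function with its hard-coded scores are assumptions, not part of Stephen.

```elixir
defmodule TwoStageSketch do
  # First stage: cheap lexical filter. Counts query words that occur in each
  # document and keeps the ids of the `k` best documents.
  def first_stage(docs, query, k) do
    words = query |> String.downcase() |> String.split()

    docs
    |> Enum.map(fn {id, text} ->
      doc_words = text |> String.downcase() |> String.split()
      {id, Enum.count(words, fn w -> w in doc_words end)}
    end)
    |> Enum.sort_by(fn {_id, hits} -> hits end, :desc)
    |> Enum.take(k)
    |> Enum.map(fn {id, _hits} -> id end)
  end

  # Second stage: apply an expensive scorer only to the surviving ids and
  # return results sorted by score, highest first.
  def second_stage(ids, scorer) do
    ids
    |> Enum.map(fn id -> %{doc_id: id, score: scorer.(id)} end)
    |> Enum.sort_by(fn r -> r.score end, :desc)
  end
end

docs = [
  {:a, "late night comedy show"},
  {:b, "cooking pasta at home"},
  {:c, "comedy night"}
]

ids = TwoStageSketch.first_stage(docs, "late night comedy", 2)

# Placeholder scores standing in for a real MaxSim computation.
scorer = fn id -> Map.fetch!(%{a: 0.4, c: 0.9}, id) end
results = TwoStageSketch.second_stage(ids, scorer)
```

The point of the split is cost: the first stage touches every document but does very little work per document, while the expensive scorer runs only on the shortlist.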
Functions
@spec batch_rerank(
        Stephen.Encoder.encoder(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        [{String.t(), [term()]}],
        keyword()
      ) :: [[search_result()]]
Batch reranks multiple queries against their candidate documents.
Arguments
- encoder - Loaded encoder
- index - Document index
- queries_and_candidates - List of {query, doc_ids} tuples
Options
- :top_k - Number of results per query (default: 10)
Returns
List of result lists, one per query.
@spec batch_search(
        Stephen.Encoder.encoder(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        [String.t()],
        keyword()
      ) :: [[search_result()]]
Searches for documents matching multiple queries.
Efficiently encodes all queries together, then searches the index for each query independently.
Arguments
- encoder - Loaded encoder
- index - Document index (Index, Plaid, or Index.Compressed)
- queries - List of query strings
Options
- :top_k - Number of results per query (default: 10)
- :candidates_per_token - ANN candidates per query token (default: 50)
- :rerank? - Whether to rerank with full MaxSim (default: true)
Returns
List of result lists, one per query.
Examples
results = Stephen.Retriever.batch_search(encoder, index, ["query 1", "query 2"])
# Enum.at(results, 0) holds the top_k results for "query 1"
# Enum.at(results, 1) holds the top_k results for "query 2"
@spec extract_expansion_embeddings(
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        [search_result()],
        Nx.Tensor.t(),
        pos_integer()
      ) :: Nx.Tensor.t() | nil
Extracts expansion embeddings from feedback documents.
Selects the most relevant token embeddings from feedback documents that aren't already well-represented in the query.
Arguments
- index - Document index
- feedback_results - Search results to use for feedback
- query_embeddings - Original query embeddings
- num_tokens - Number of expansion tokens to extract
Returns
Tensor of expansion embeddings with shape {num_tokens, dim}, or nil if no feedback documents are available.
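One way to pick tokens that are not already well represented in the query is to rank feedback tokens by how weakly they match any query token. The sketch below covers only that novelty criterion and is an assumption about the general idea, not the library's selection rule. `ExpansionSketch` is a hypothetical module, embeddings are lists of floats, and query tokens are assumed non-empty.

```elixir
defmodule ExpansionSketch do
  # Hypothetical helper: dot product of two equal-length lists of floats.
  def dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # Mirrors the documented contract: nil when there is nothing to select from.
  def select([], _query_tokens, _num_tokens), do: nil

  # Sorts feedback tokens by their best similarity to any query token
  # (ascending, so the least covered tokens come first) and keeps up to
  # `num_tokens` of them.
  def select(feedback_tokens, query_tokens, num_tokens) do
    feedback_tokens
    |> Enum.sort_by(fn t ->
      query_tokens
      |> Enum.map(fn q -> dot(t, q) end)
      |> Enum.max()
    end)
    |> Enum.take(num_tokens)
  end
end

query = [[1.0, 0.0]]
feedback = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]

expansion = ExpansionSketch.select(feedback, query, 2)
none = ExpansionSketch.select([], query, 2)
```

The feedback token identical to the query token is dropped, since it would add nothing the query does not already express.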
@spec index_documents(
        Stephen.Encoder.encoder(),
        Stephen.Index.t(),
        [{term(), String.t()}]
      ) :: Stephen.Index.t()
Indexes a list of documents.
Arguments
- encoder - Loaded encoder
- index - Document index
- documents - List of {doc_id, text} tuples
Returns
Updated index with all documents added.
@spec rerank(
        Stephen.Encoder.encoder(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        String.t(),
        [term()],
        keyword()
      ) :: [search_result()]
Reranks a list of documents against a query using full MaxSim scoring.
Supports multiple index types: Index, Plaid, and Index.Compressed.
Arguments
- encoder - Loaded encoder
- index - Document index (Index, Plaid, or Index.Compressed)
- query - Query string
- doc_ids - List of document IDs to rerank
Options
- :top_k - Number of results to return (default: all)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
Examples
# Rerank BM25 candidates
candidates = [:doc1, :doc2, :doc3]
results = Stephen.Retriever.rerank(encoder, index, "my query", candidates)
# Return only top 5
results = Stephen.Retriever.rerank(encoder, index, query, candidates, top_k: 5)
@spec rerank_texts(
        Stephen.Encoder.encoder(),
        String.t(),
        [{term(), String.t()}],
        keyword()
      ) :: [search_result()]
Reranks raw text documents against a query without requiring an index.
Documents are encoded on-the-fly and scored using ColBERT's MaxSim. Useful for reranking results from external sources like BM25 or Elasticsearch.
Arguments
- encoder - Loaded encoder
- query - Query string
- documents - List of {id, text} tuples to rerank
Options
- :top_k - Number of results to return (default: all)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
Examples
candidates = [
  {"colbert", "Stephen Colbert hosts The Late Show with satirical comedy"},
  {"conan", "Conan O'Brien is known for absurdist humor and remotes"}
]
results = Stephen.Retriever.rerank_texts(encoder, "political satire", candidates)
@spec rerank_texts_with_embeddings(
        Nx.Tensor.t(),
        Stephen.Encoder.encoder(),
        [{term(), String.t()}],
        keyword()
      ) :: [search_result()]
Reranks raw text documents with pre-computed query embeddings.
Useful when reranking multiple candidate sets with the same query.
Arguments
- query_embeddings - Pre-computed query embeddings tensor
- encoder - Loaded encoder (for encoding documents)
- documents - List of {id, text} tuples to rerank
Options
- :top_k - Number of results to return (default: all)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
@spec rerank_with_embeddings(
        Nx.Tensor.t(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        [term()],
        keyword()
      ) :: [search_result()]
Reranks documents with pre-computed query embeddings.
Useful when reranking multiple candidate sets with the same query, or when query embeddings are already available.
Arguments
- query_embeddings - Pre-computed query embeddings tensor
- index - Document index
- doc_ids - List of document IDs to rerank
Options
- :top_k - Number of results to return (default: all)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
@spec search(
        Stephen.Encoder.encoder(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        String.t(),
        keyword()
      ) :: [search_result()]
Searches the index for documents matching the query.
Arguments
- encoder - Loaded encoder from Encoder.load/1
- index - Document index from Index.new/1
- query - Query string
Options
- :top_k - Number of results to return (default: 10)
- :candidates_per_token - ANN candidates per query token (default: 50)
- :rerank? - Whether to rerank with full MaxSim (default: true)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
@spec search_with_embeddings(
        Nx.Tensor.t(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        keyword()
      ) :: [search_result()]
Searches for documents matching a query using pre-computed embeddings.
Arguments
- query_embeddings - Pre-computed query embeddings tensor
- index - Document index (Index, Plaid, or Index.Compressed)
Options
- :top_k - Number of results to return (default: 10)
- :candidates_per_token - ANN candidates per query token (default: 50)
- :rerank? - Whether to rerank with full MaxSim (default: true)
- :nprobe - Number of centroids to probe for Plaid/Compressed (default: 32)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
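The role of :candidates_per_token can be illustrated with a brute-force stand-in for the ANN lookup. This is a sketch under stated assumptions, not the library's search path: `CandidateSketch` is a hypothetical module, the "index" is a flat list of {doc_id, token_embedding} pairs, and an exhaustive dot-product scan replaces the approximate nearest-neighbor search.

```elixir
defmodule CandidateSketch do
  # Hypothetical helper: dot product of two equal-length lists of floats.
  def dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # For each query token, keep the `per_token` most similar document tokens,
  # then collect the distinct ids of the documents those tokens belong to.
  def candidates(query_tokens, doc_tokens, per_token) do
    query_tokens
    |> Enum.flat_map(fn q ->
      doc_tokens
      |> Enum.sort_by(fn {_id, t} -> dot(q, t) end, :desc)
      |> Enum.take(per_token)
      |> Enum.map(fn {id, _t} -> id end)
    end)
    |> Enum.uniq()
  end
end

doc_tokens = [
  {:doc1, [1.0, 0.0]},
  {:doc1, [0.9, 0.1]},
  {:doc2, [0.0, 1.0]},
  {:doc3, [-1.0, 0.0]}
]

query = [[1.0, 0.0], [0.0, 1.0]]
ids = CandidateSketch.candidates(query, doc_tokens, 1)
```

Raising `per_token` widens the candidate pool, which tends to help recall at the cost of more documents to score in the reranking step.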
@spec search_with_prf(
        Stephen.Encoder.encoder(),
        Stephen.Index.t() | Stephen.Plaid.t() | Stephen.Index.Compressed.t(),
        String.t(),
        keyword()
      ) :: [search_result()]
Searches with pseudo-relevance feedback (PRF) for query expansion.
PRF improves recall by expanding the query with information from top-ranked documents. The process:
- Run initial search with original query
- Extract representative embeddings from top-k feedback documents
- Combine original query with expansion embeddings
- Re-run search with expanded query
Arguments
- encoder - Loaded encoder
- index - Document index
- query - Query string
Options
- :top_k - Final results to return (default: 10)
- :feedback_docs - Number of docs for feedback (default: 3)
- :expansion_tokens - Tokens to add from feedback (default: 10)
- :expansion_weight - Weight for expansion vs original (default: 0.5)
Returns
List of %{doc_id: term(), score: float()} sorted by score descending.
Examples
# Basic PRF search
results = Retriever.search_with_prf(encoder, index, "late night comedy")
# Tune PRF parameters
results = Retriever.search_with_prf(encoder, index, query,
  feedback_docs: 5,
  expansion_tokens: 15,
  expansion_weight: 0.3
)
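How the original query and the expansion embeddings are combined is not spelled out here. One plausible reading of :expansion_weight, shown purely as an assumption, is to scale each expansion embedding by the weight and append it to the query tokens. `PrfSketch` is a hypothetical module and embeddings are plain lists of floats.

```elixir
defmodule PrfSketch do
  # No expansion available (for example, no feedback documents):
  # fall back to the original query tokens.
  def expand(query_tokens, nil, _weight), do: query_tokens

  # Scales every expansion embedding by `weight` and appends the scaled
  # tokens after the original query tokens, which are left unchanged.
  def expand(query_tokens, expansion_tokens, weight) do
    scaled =
      Enum.map(expansion_tokens, fn token ->
        Enum.map(token, fn x -> x * weight end)
      end)

    query_tokens ++ scaled
  end
end

query = [[1.0, 0.0]]
expansion = [[0.0, 1.0], [0.6, 0.8]]

expanded = PrfSketch.expand(query, expansion, 0.5)
unchanged = PrfSketch.expand(query, nil, 0.5)
```

Under a dot-product MaxSim, scaling a token by the weight scales its contribution to the summed score by the same factor, so a lower weight keeps the original query dominant.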