Dsxir.Retrieval.InMemory (dsxir v0.1.0)


Cosine-similarity brute-force retriever backed by a struct value.

embedder = %Dsxir.Retrieval.Embedder{}
index    = %Dsxir.Retrieval.InMemory{embedder: embedder}

{:ok, index} = Dsxir.Retrieval.InMemory.add(index, ["alpha doc", "beta doc"])
{:ok, hits}  = Dsxir.Retrieval.InMemory.search(index, "alpha-ish query", k: 1)

The index is a plain struct: no process and no ETS table. Functions that change it, such as add/3, return a new struct and leave the original untouched.
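A minimal sketch of what "returns a new struct" means in practice. `ImmutableIndexSketch` is a hypothetical stand-in that models only a `docs` field; it is not the library's implementation.

```elixir
defmodule ImmutableIndexSketch do
  # Hypothetical stand-in for the real struct; only `docs` is modelled here.
  defstruct docs: []

  # Returns a new struct with `new_docs` appended; the input is left untouched.
  def add(%__MODULE__{docs: docs} = idx, new_docs) do
    {:ok, %{idx | docs: docs ++ new_docs}}
  end
end

old = %ImmutableIndexSketch{}
{:ok, new} = ImmutableIndexSketch.add(old, ["alpha doc"])

# `old` still holds no documents; only the newly bound value sees the addition.
IO.inspect(old.docs, label: "old")
IO.inspect(new.docs, label: "new")
```

This is why the usage example rebinds `index` to the result of each call: discarding the return value discards the update.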

Persistence

:ok          = Dsxir.Retrieval.InMemory.save(index, "/tmp/idx.bin")
{:ok, index} = Dsxir.Retrieval.InMemory.load("/tmp/idx.bin")

The on-disk format is the binary produced by :erlang.term_to_binary/1. Loading decodes it with the [:safe] option.
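The effect of `[:safe]` can be sketched with the standard library alone. A plain map stands in for the index struct, and the hand-built binary (an external-term-format atom whose name is assumed not to exist in the running VM) is purely illustrative.

```elixir
# A plain map stands in for the index struct in this sketch.
index = %{docs: ["alpha doc"], vectors: [[1.0, 0.0]]}

binary = :erlang.term_to_binary(index)

# Round trip: every atom in the binary already exists, so :safe decoding succeeds.
restored = :erlang.binary_to_term(binary, [:safe])

# Hand-built external term: version byte 131, tag 119 (SMALL_ATOM_UTF8_EXT),
# a one-byte length, then the atom name. The name is assumed to be unknown to the VM.
name = "dsxir_unseen_atom_01"
crafted = <<131, 119, byte_size(name), name::binary>>

# With :safe, decoding refuses to create the new atom and raises ArgumentError.
result =
  try do
    :erlang.binary_to_term(crafted, [:safe])
    :accepted
  rescue
    ArgumentError -> :rejected
  end

IO.inspect(restored == index, label: "round trip equal")
IO.inspect(result, label: "crafted binary")
```

`[:safe]` prevents a file from creating new atoms or external function references. It does not validate that the decoded term is a well-formed index.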

Summary

Functions

add(idx, new_docs, opts \\ [])

Embed and append new_docs to the index, returning the updated struct.

load(path)

Load an index from path previously written by save/2.

save(idx, path)

Persist the index to path using :erlang.term_to_binary/1.

search(in_memory, query, opts \\ [])

Return the top :k (default 3) documents ranked by cosine similarity against the embedded query.

Types

t()

@type t() :: %Dsxir.Retrieval.InMemory{
  docs: [String.t()],
  embedder: Dsxir.Retrieval.Embedder.t(),
  vectors: [[float()]]
}

Functions

add(idx, new_docs, opts \\ [])

@spec add(t(), [String.t()], keyword()) :: {:ok, t()} | {:error, term()}

Embed and append new_docs to the index, returning the updated struct.
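One way to picture embed-and-append, as a sketch under stated assumptions: `toy_embed/1` is a hypothetical embedder invented for illustration, the index is a plain map, and keeping `docs` and `vectors` aligned by position is an assumption about the layout rather than documented behaviour.

```elixir
defmodule AddSketch do
  # Hypothetical embedder: maps a document to [character count, word count].
  # A real embedder would call a model instead.
  def toy_embed(doc) do
    [String.length(doc) * 1.0, length(String.split(doc)) * 1.0]
  end

  # Embeds each new document and appends docs and vectors in the same order,
  # so position i in `docs` corresponds to position i in `vectors`.
  def add(%{docs: docs, vectors: vectors} = idx, new_docs) do
    new_vectors = Enum.map(new_docs, &toy_embed/1)
    {:ok, %{idx | docs: docs ++ new_docs, vectors: vectors ++ new_vectors}}
  end
end

{:ok, idx} = AddSketch.add(%{docs: [], vectors: []}, ["alpha doc", "beta"])
IO.inspect(idx)
```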

load(path)

@spec load(Path.t()) :: {:ok, t()} | {:error, term()}

Load an index from path previously written by save/2.
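A hedged sketch of how a loader with this return shape could be written using only the standard library. `LoadSketch` and the `:invalid_binary` error reason are hypothetical; the real function's error terms are not documented here.

```elixir
defmodule LoadSketch do
  # Reads `path` and decodes it with the :safe option. File errors pass through
  # as {:error, posix}; an undecodable binary becomes {:error, :invalid_binary}.
  def load(path) do
    with {:ok, binary} <- File.read(path) do
      try do
        {:ok, :erlang.binary_to_term(binary, [:safe])}
      rescue
        ArgumentError -> {:error, :invalid_binary}
      end
    end
  end
end

# Write a small term to a unique temporary file, then load it back.
path = Path.join(System.tmp_dir!(), "load_sketch_#{System.unique_integer([:positive])}.bin")
File.write!(path, :erlang.term_to_binary(%{docs: ["alpha doc"]}))

loaded = LoadSketch.load(path)
missing = LoadSketch.load(path <> ".missing")
File.rm!(path)

IO.inspect(loaded, label: "loaded")
IO.inspect(missing, label: "missing file")
```

Callers can then match on `{:ok, index}` and `{:error, reason}` rather than rescuing exceptions.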

save(idx, path)

@spec save(t(), Path.t()) :: :ok | {:error, File.posix()}

Persist the index to path using :erlang.term_to_binary/1.

search(in_memory, query, opts \\ [])

@spec search(t(), String.t(), keyword()) ::
  {:ok, [%{doc: String.t(), score: float()}]} | {:error, term()}

Return the top :k (default 3) documents ranked by cosine similarity against the embedded query.
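Brute-force cosine ranking can be sketched as follows. `CosineSketch` is a hypothetical module that takes an already embedded query vector; it illustrates the technique and is not the library's code. It assumes equal-length, non-zero vectors.

```elixir
defmodule CosineSketch do
  # Cosine similarity: dot(a, b) / (|a| * |b|).
  # Assumes both vectors have the same length and neither is all zeros.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    dot / (norm(a) * norm(b))
  end

  defp norm(v), do: v |> Enum.map(&(&1 * &1)) |> Enum.sum() |> :math.sqrt()

  # Scores every stored vector against the query vector, sorts by descending
  # score, and keeps the best k results.
  def top_k(docs, vectors, query_vector, k \\ 3) do
    docs
    |> Enum.zip(vectors)
    |> Enum.map(fn {doc, vec} -> %{doc: doc, score: cosine(vec, query_vector)} end)
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.take(k)
  end
end

docs = ["alpha doc", "beta doc"]
vectors = [[1.0, 0.0], [0.0, 1.0]]

# The query vector points mostly along the first axis, so "alpha doc" ranks first.
hits = CosineSketch.top_k(docs, vectors, [0.9, 0.1], 1)
IO.inspect(hits)
```

Every query scans all stored vectors, so cost grows linearly with the number of documents. That is the trade-off of a brute-force retriever: no index structure to build or maintain, at the price of per-query work.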