Arcana.Embedder behaviour (Arcana v1.3.3)
Behaviour for embedding providers used by Arcana.
Arcana accepts any module that implements this behaviour. Built-in implementations are provided for:
Arcana.Embedder.Local - Local Bumblebee models (e.g., bge-small-en-v1.5)
Arcana.Embedder.OpenAI - OpenAI embeddings via Req.LLM
Configuration
Configure your embedding provider in config.exs:
# Default: Local Bumblebee with bge-small-en-v1.5 (384 dims)
config :arcana, embedder: :local
# Local with different HuggingFace model
config :arcana, embedder: {:local, model: "BAAI/bge-large-en-v1.5"}
# OpenAI via Req.LLM
config :arcana, embedder: :openai
config :arcana, embedder: {:openai, model: "text-embedding-3-large"}
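# Custom function: receives the text and returns {:ok, embedding}
# (MyApp.Embeddings.embed/1 is a placeholder for your own function)
config :arcana, embedder: fn text -> MyApp.Embeddings.embed(text) end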
# Custom module implementing this behaviour
config :arcana, embedder: MyApp.CohereEmbedder
config :arcana, embedder: {MyApp.CohereEmbedder, api_key: "..."}
Implementing a Custom Embedder
Create a module that implements this behaviour:
defmodule MyApp.CohereEmbedder do
  @behaviour Arcana.Embedder

  @impl true
  def embed(text, opts) do
    api_key = opts[:api_key] || System.get_env("COHERE_API_KEY")
    # Call Cohere API...
    {:ok, embedding}
  end

  @impl true
  def dimensions(_opts), do: 1024
end
Then configure:
config :arcana, embedder: {MyApp.CohereEmbedder, api_key: "..."}
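For tests, it can be convenient to have an embedder that needs no network access or model download. The following is a minimal sketch of such a module, assuming Arcana is a dependency of the project. The module name MyApp.FakeEmbedder and the :dims option are hypothetical and not part of Arcana. The vectors it produces are deterministic but carry no semantic meaning.

```elixir
defmodule MyApp.FakeEmbedder do
  # Hypothetical test double; not shipped with Arcana.
  @behaviour Arcana.Embedder

  @default_dims 8

  @impl true
  def embed(text, opts) when is_binary(text) do
    dims = dimensions(opts)

    # One reproducible float in [0.0, 1.0) per position, derived from a hash
    # of the text and the position index.
    embedding =
      for i <- 0..(dims - 1) do
        :erlang.phash2({text, i}, 1_000_000) / 1_000_000
      end

    {:ok, embedding}
  end

  @impl true
  # :dims is an assumed option name used only by this sketch.
  def dimensions(opts), do: Keyword.get(opts, :dims, @default_dims)
end
```

Following the {module, opts} form shown above, it could be configured as `config :arcana, embedder: {MyApp.FakeEmbedder, dims: 8}`.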
Callbacks
@callback dimensions(opts :: keyword()) :: pos_integer()
Returns the embedding dimensions.
@callback embed(text :: String.t(), opts :: keyword()) :: {:ok, [float()]} | {:error, term()}
Embed a single text string.
Returns {:ok, embedding} where embedding is a list of floats,
or {:error, reason} on failure.
@callback embed_batch(texts :: [String.t()], opts :: keyword()) :: {:ok, [[float()]]} | {:error, term()}
Embed multiple texts in batch.
Default implementation calls embed/2 for each text sequentially.
Override for providers that support native batch embedding.
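As an illustration of overriding embed_batch/2, here is a sketch of an embedder that sends the whole list to its provider in one call. The module name MyApp.BatchingEmbedder, the :client option (a one-argument function standing in for a real HTTP call), and the :dims option are all assumptions made for this example. No real provider API is shown.

```elixir
defmodule MyApp.BatchingEmbedder do
  # Hypothetical module; :client and :dims are assumed options for illustration.
  @behaviour Arcana.Embedder

  @impl true
  def embed(text, opts) do
    # Route single texts through the batch path so both share one code path.
    # Any {:error, reason} from embed_batch/2 falls through unchanged.
    with {:ok, [embedding]} <- embed_batch([text], opts) do
      {:ok, embedding}
    end
  end

  @impl true
  def embed_batch(texts, opts) do
    # The client receives all texts at once and is expected to return
    # {:ok, embeddings} in the same order, or {:error, reason}.
    client = Keyword.fetch!(opts, :client)

    case client.(texts) do
      {:ok, embeddings} when length(embeddings) == length(texts) ->
        {:ok, embeddings}

      {:ok, _mismatch} ->
        {:error, :unexpected_embedding_count}

      {:error, _reason} = error ->
        error
    end
  end

  @impl true
  def dimensions(opts), do: Keyword.get(opts, :dims, 1024)
end
```

Checking that the provider returned one embedding per input is a defensive choice in this sketch, since a silent length mismatch would misalign texts and vectors downstream.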
Functions
Returns the embedding dimensions for the configured embedder.
Embeds text using the configured embedder.
The embedder is a {module, opts} tuple where module implements
this behaviour.
Options
:intent - The embedding intent, either :query or :document. Used by models like E5 that require different prefixes for search queries vs document content. Defaults to :document.
Examples
# Embed a search query (uses "query: " prefix for E5 models)
Embedder.embed(embedder, "what is machine learning?", intent: :query)
# Embed document content (uses "passage: " prefix for E5 models)
Embedder.embed(embedder, "Machine learning is...", intent: :document)
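To make the prefixing concrete, the sketch below shows how a custom embedder for an E5-style model might turn the :intent option into the "query: " and "passage: " prefixes mentioned in the comments above. The IntentPrefix module and its prefix/2 function are hypothetical helpers and do not describe how Arcana's built-in embedders are implemented.

```elixir
defmodule IntentPrefix do
  # Hypothetical helper; not part of Arcana.

  # Prepends the E5-style prefix selected by the :intent option.
  # Mirrors the documented default of :document when :intent is absent.
  def prefix(text, opts \\ []) when is_binary(text) do
    case Keyword.get(opts, :intent, :document) do
      :query -> "query: " <> text
      :document -> "passage: " <> text
    end
  end
end
```

A custom embed/2 could call `IntentPrefix.prefix(text, opts)` before sending the text to the model. Any other :intent value raises a CaseClauseError in this sketch, which surfaces typos early.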
Embeds multiple texts using the configured embedder.
Falls back to sequential embedding if the module doesn't implement
embed_batch/2.
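The fallback described here can be sketched roughly as follows. This is an illustration of the documented behaviour, not Arcana's source. The BatchDispatch module, its argument order, and the choice to stop at the first error are assumptions.

```elixir
defmodule BatchDispatch do
  # Hypothetical helper; illustrates the documented fallback only.

  def embed_batch({module, opts}, texts) when is_atom(module) and is_list(texts) do
    # function_exported?/3 only reports on loaded modules, so load it first.
    Code.ensure_loaded(module)

    if function_exported?(module, :embed_batch, 2) do
      module.embed_batch(texts, opts)
    else
      sequential(module, texts, opts)
    end
  end

  # Calls embed/2 once per text, preserving order and halting on the first
  # error (an assumption; the source does not specify error handling).
  defp sequential(module, texts, opts) do
    texts
    |> Enum.reduce_while({:ok, []}, fn text, {:ok, acc} ->
      case module.embed(text, opts) do
        {:ok, embedding} -> {:cont, {:ok, [embedding | acc]}}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
    |> case do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end
end
```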