Embeddings

The Embedding Service provides managed embedding generation with batching and statistics tracking.

Overview

The Rag.Embedding.Service is a GenServer that handles:

  • Single and batch text embedding
  • Automatic batching for large requests
  • Chunk embedding with database preparation
  • Statistics tracking

Starting the Service

alias Rag.Embedding.Service

# Basic start
{:ok, pid} = Service.start_link([])

# With options
{:ok, pid} = Service.start_link(
  batch_size: 100,
  provider: :gemini,
  name: :embedding_service
)

Options

Option       Default        Description
:batch_size  100            Max texts per batch
:provider    Rag.Ai.Gemini  Embedding provider module
:name        none           Process name for registration

Embedding Text

Single Text

{:ok, embedding} = Service.embed_text(pid, "Hello world")
# embedding: [0.1, 0.2, ...]  # Dimensions follow the configured Gemini embedding model

Multiple Texts

{:ok, embeddings} = Service.embed_texts(pid, ["Hello", "World", "Elixir"])
# embeddings: [[0.1, ...], [0.2, ...], [0.3, ...]]

Large requests are automatically batched according to batch_size.
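The batching behavior can be sketched as a chunked fan-out. This is a minimal illustration of the idea, not the service's actual internals; embed_batch here is a stand-in for one provider call over a single batch:

```elixir
# Stand-in for a single provider call; returns one dummy vector per text.
embed_batch = fn batch -> Enum.map(batch, fn _text -> [0.0, 0.0, 0.0] end) end

texts = for i <- 1..250, do: "text #{i}"
batches = Enum.chunk_every(texts, 100)

length(batches)
# => 3 batches: sizes 100, 100, and 50

embeddings = Enum.flat_map(batches, embed_batch)
length(embeddings)
# => 250, one embedding per input text, order preserved
```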

Embedding Chunks

Embed and Return Chunks

alias Rag.VectorStore.Chunk

chunks = [
  %Chunk{content: "First document"},
  %Chunk{content: "Second document"}
]

{:ok, embedded_chunks} = Service.embed_chunks(pid, chunks)
# Each chunk now has its embedding field populated

Embed and Prepare for Insert

{:ok, insert_ready} = Service.embed_and_prepare(pid, chunks)
# Returns list of maps ready for Ecto insert_all

Repo.insert_all(Chunk, insert_ready)

This combines:

  1. embed_chunks/2 - Generate embeddings
  2. VectorStore.prepare_for_insert/1 - Add timestamps and format
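Given the two steps above, embed_and_prepare/2 behaves roughly like the following pipeline (a sketch of the composition, assuming pid and chunks are bound as in the earlier examples):

```elixir
# Sketch of what embed_and_prepare/2 combines, per the steps above.
{:ok, embedded_chunks} = Service.embed_chunks(pid, chunks)
insert_ready = VectorStore.prepare_for_insert(embedded_chunks)
```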

Statistics

Track service usage:

stats = Service.get_stats(pid)
# %{
#   texts_embedded: 150,
#   batches_processed: 2,
#   errors: 0
# }

Internal State

%Service{
  provider: module(),           # AI provider module
  provider_instance: struct(),  # Provider instance
  batch_size: pos_integer(),    # Max texts per batch
  stats: %{
    texts_embedded: integer(),
    batches_processed: integer(),
    errors: integer()
  }
}

Using with Router

For simpler use cases, you can use the Router directly:

alias Rag.Router

{:ok, router} = Router.new(providers: [:gemini])

# Single embedding
{:ok, [embedding], router} = Router.execute(router, :embeddings, ["text"], [])

# Multiple embeddings
{:ok, embeddings, router} = Router.execute(router, :embeddings,
  ["text1", "text2", "text3"],
  []
)

The Embedding Service is useful when you need:

  • Long-running service with state
  • Automatic batching management
  • Statistics tracking
  • Named process access
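When the service is started with a :name, calls can pass the registered name instead of a pid, per standard GenServer name registration (shown here as an assumption about the API):

```elixir
{:ok, _pid} = Service.start_link(name: :embedding_service)

# Any node in the supervision tree can reach the service by name.
{:ok, embedding} = Service.embed_text(:embedding_service, "Hello world")
```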

Complete Workflow

alias Rag.Embedding.Service
alias Rag.VectorStore
alias Rag.VectorStore.Chunk

# 1. Start service
{:ok, pid} = Service.start_link(batch_size: 50, name: :embeddings)

# 2. Prepare documents
documents = [
  %{content: "Document 1", source: "doc1.md"},
  %{content: "Document 2", source: "doc2.md"}
]
chunks = VectorStore.build_chunks(documents)

# 3. Embed and prepare for insert
{:ok, insert_ready} = Service.embed_and_prepare(pid, chunks)

# 4. Insert into database
{count, _} = Repo.insert_all(Chunk, insert_ready)

# 5. Check stats
stats = Service.get_stats(pid)
IO.puts("Embedded #{stats.texts_embedded} texts in #{stats.batches_processed} batches")

Embedding Dimensions

Provider  Model                                     Dimensions
Gemini    Gemini.Config.default_embedding_model()   Gemini.Config.default_embedding_dimensions(Gemini.Config.default_embedding_model())
OpenAI    text-embedding-3-small                    1536
OpenAI    text-embedding-3-large                    3072
Cohere    embed-english-v3.0                        1024

Ensure your database vector column matches the provider's embedding dimensions.
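With pgvector and Ecto, for example, the column's declared size must match the provider's dimensions. A hedged migration sketch; the table and column names are illustrative:

```elixir
defmodule Repo.Migrations.CreateChunks do
  use Ecto.Migration

  def change do
    create table(:chunks) do
      add :content, :text
      add :source, :string
      # Size must match the embedding provider, e.g. 1536 for
      # OpenAI text-embedding-3-small.
      add :embedding, :vector, size: 1536
      timestamps()
    end
  end
end
```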

Best Practices

  1. Batch requests - Use embed_texts/2 for multiple texts
  2. Monitor statistics - Track embedded count and errors
  3. Use named processes - Easier access in OTP applications
  4. Configure batch size - Balance throughput vs. API limits
  5. Handle errors - Service returns {:error, reason} on failure
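Point 5 can be handled with a plain case on the return value. The exact reason terms are provider-dependent; the ones below are placeholders:

```elixir
require Logger

case Service.embed_texts(pid, texts) do
  {:ok, embeddings} ->
    embeddings

  {:error, reason} ->
    # reason varies by provider (e.g. rate limiting, network failure)
    Logger.warning("embedding failed: #{inspect(reason)}")
    []
end
```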

Next Steps