# Arcana
Embeddable RAG library for Elixir/Phoenix. Add vector search, document retrieval, and AI-powered question answering to any Phoenix application. Supports both simple RAG and agentic RAG with query expansion, self-correction, and more.
> [!TIP]
> See arcana-adept for a complete Phoenix app with a Doctor Who corpus ready to embed and query.
## Features
- Simple API - `ingest/2`, `search/2`, `ask/2` for basic RAG
- Agentic RAG - Pipeline with query expansion, decomposition, re-ranking, and self-correction
- Pluggable components - Replace any pipeline step with custom implementations
- Hybrid search - Vector, full-text, or combined with Reciprocal Rank Fusion
- GraphRAG - Optional knowledge graph with entity extraction, community detection, and fusion search
- Multiple backends - Swappable vector store (pgvector, in-memory HNSWLib) and graph store (Ecto, in-memory) backends
- Configurable embeddings - Local Bumblebee, OpenAI, or custom providers
- File ingestion - Text, Markdown, and PDF support
- Evaluation - Measure retrieval quality with MRR, Recall, and Precision metrics
- Embeddable - Uses your existing Repo, no separate database
- LiveView Dashboard - Optional web UI for managing documents and searching
- Telemetry - Built-in observability for all operations
## How it works
- Ingest: Text is split into overlapping chunks (default 450 tokens, 50 overlap)
- Embed: Each chunk is embedded using `bge-small-en-v1.5` (384 dimensions)
- Store: Chunks are stored in PostgreSQL with pgvector
- Search: Query is embedded and compared using cosine similarity via HNSW index
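
In code, that flow is two calls. A minimal sketch using `ingest/2` and `search/2` as documented under Usage below (assumes installation and the local embedder are set up as described in Setup):

```elixir
# Ingest a document, then query it back (sketch).
{:ok, _document} = Arcana.ingest("Elixir runs on the Erlang VM.", repo: MyApp.Repo)

{:ok, results} = Arcana.search("Which VM does Elixir run on?", repo: MyApp.Repo)

# Results come back ordered by similarity; inspect the top hit to see its shape.
IO.inspect(hd(results))
```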
## Installation
With Igniter (recommended):

```sh
mix igniter.install arcana
mix ecto.migrate
```

This adds the dependency, creates migrations, configures your repo, and sets up the dashboard route.
Without Igniter:

Add `arcana` to your dependencies:

```elixir
def deps do
  [
    {:arcana, "~> 0.1.0"}
  ]
end
```

Then run:

```sh
mix deps.get
mix arcana.install
mix ecto.migrate
```
And follow the manual steps printed by the installer:

- Create the Postgrex types module:

  ```elixir
  # lib/my_app/postgrex_types.ex
  Postgrex.Types.define(
    MyApp.PostgrexTypes,
    [Pgvector.Extensions.Vector] ++ Ecto.Adapters.Postgres.extensions(),
    []
  )
  ```

- Add to your repo config:

  ```elixir
  # config/config.exs
  config :my_app, MyApp.Repo,
    types: MyApp.PostgrexTypes
  ```

- (Optional) Mount the dashboard:

  ```elixir
  # lib/my_app_web/router.ex
  scope "/arcana" do
    pipe_through [:browser]
    forward "/", ArcanaWeb.Router
  end
  ```

## Setup
### Start PostgreSQL with pgvector
```yaml
# docker-compose.yml
services:
  postgres:
    image: pgvector/pgvector:pg16
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: myapp_dev
```

### Add to supervision tree (for local embeddings)
If using local Bumblebee embeddings (the default), add the serving to your supervision tree:

```elixir
# lib/my_app/application.ex
def start(_type, _args) do
  children = [
    MyApp.Repo,
    Arcana.Embedder.Local # Starts the local embedding model
  ]

  opts = [strategy: :one_for_one, name: MyApp.Supervisor]
  Supervisor.start_link(children, opts)
end
```

### Configure Nx backend (required for local embeddings)
For local embeddings, you need an Nx backend. Choose one of the following:

```elixir
# config/config.exs

# Option 1: EXLA - Google's XLA compiler (Linux/macOS/Windows)
config :nx,
  default_backend: EXLA.Backend,
  default_defn_options: [compiler: EXLA]

# Option 2: EMLX - Apple's MLX framework (macOS with Apple Silicon only)
config :nx,
  default_backend: EMLX.Backend,
  default_defn_options: [compiler: EMLX]

# Option 3: Torchx - PyTorch backend (no compiler, uses eager execution)
config :nx,
  default_backend: {Torchx.Backend, device: :cpu} # or :mps for Apple Silicon
```

Add the corresponding dependency to your mix.exs:

```elixir
{:exla, "~> 0.9"}   # or
{:emlx, "~> 0.1"}   # or
{:torchx, "~> 0.9"}
```

## Embedding providers
Arcana supports multiple embedding providers:

```elixir
# config/config.exs

# Local Bumblebee (default) - no API keys needed
config :arcana, embedder: :local
config :arcana, embedder: {:local, model: "BAAI/bge-large-en-v1.5"}

# E5 models (automatically adds query:/passage: prefixes)
config :arcana, embedder: {:local, model: "intfloat/e5-small-v2"}

# OpenAI (requires OPENAI_API_KEY)
config :arcana, embedder: :openai
config :arcana, embedder: {:openai, model: "text-embedding-3-large"}

# Custom module implementing the Arcana.Embedder behaviour
config :arcana, embedder: MyApp.CohereEmbedder
```

Implement custom embedders with the Arcana.Embedder behaviour:

```elixir
defmodule MyApp.CohereEmbedder do
  @behaviour Arcana.Embedder

  @impl true
  def embed(text, opts) do
    # Call your embedding API and return the vector
    {:ok, embedding_vector}
  end

  @impl true
  def dimensions(_opts), do: 1024
end
```

See the Getting Started Guide for all embedding model options.
## Chunking providers
Arcana supports pluggable chunking strategies:

```elixir
# config/config.exs

# Default text chunker (uses the text_chunker library)
config :arcana, chunker: :default
config :arcana, chunker: {:default, chunk_size: 512, chunk_overlap: 100}

# Custom module implementing the Arcana.Chunker behaviour
config :arcana, chunker: MyApp.SemanticChunker
```

Implement custom chunkers with the Arcana.Chunker behaviour:

```elixir
defmodule MyApp.SemanticChunker do
  @behaviour Arcana.Chunker

  @impl true
  def chunk(text, opts) do
    # Custom chunking logic (e.g., semantic boundaries)
    [
      %{text: "chunk 1", chunk_index: 0, token_count: 50},
      %{text: "chunk 2", chunk_index: 1, token_count: 45}
    ]
  end
end
```

You can also pass `:chunker` directly to `ingest/2`:

```elixir
Arcana.ingest(text, repo: MyApp.Repo, chunker: MyApp.SemanticChunker)
```

## LLM configuration
Configure the LLM for `ask/2` and the Agent pipeline:

```elixir
# config/config.exs

# Model string (requires the req_llm dependency)
config :arcana, llm: "openai:gpt-4o-mini"
config :arcana, llm: "anthropic:claude-sonnet-4-20250514"

# Function that takes a prompt and returns {:ok, response}
config :arcana, llm: fn prompt ->
  {:ok, MyApp.LLM.complete(prompt)}
end

# Custom module implementing the Arcana.LLM behaviour
config :arcana, llm: MyApp.CustomLLM
```

You can also pass `:llm` directly to functions:

```elixir
Arcana.ask("What is Elixir?", repo: MyApp.Repo, llm: "openai:gpt-4o")
Agent.new(question, repo: MyApp.Repo, llm: fn prompt -> ... end)
```

See the LLM Integration Guide for detailed examples.
## Usage
### Ingest documents
```elixir
# Basic ingestion
{:ok, document} = Arcana.ingest("Your document content here", repo: MyApp.Repo)

# With metadata and collection
{:ok, document} = Arcana.ingest(content,
  repo: MyApp.Repo,
  metadata: %{"title" => "My Doc", "author" => "Jane"},
  collection: "products"
)

# Ingest from file (supports .txt, .md, .pdf)
{:ok, document} = Arcana.ingest_file("path/to/document.pdf", repo: MyApp.Repo)

# With GraphRAG (extracts entities and relationships)
{:ok, document} = Arcana.ingest(content, repo: MyApp.Repo, graph: true)
```
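
Bulk loading is plain Elixir around these calls. A sketch, where the glob and collection name are placeholders and it is assumed `ingest_file/2` accepts the same `:collection` option as `ingest/2`:

```elixir
# Ingest every Markdown file under docs/ into one collection (sketch).
"docs/**/*.md"
|> Path.wildcard()
|> Enum.each(fn path ->
  {:ok, _doc} = Arcana.ingest_file(path, repo: MyApp.Repo, collection: "docs")
end)
```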
### Search

```elixir
# Semantic search (default)
{:ok, results} = Arcana.search("your query", repo: MyApp.Repo)

# Hybrid search (combines semantic + fulltext)
{:ok, results} = Arcana.search("query", repo: MyApp.Repo, mode: :hybrid)

# Hybrid with custom weights (pgvector only)
{:ok, results} = Arcana.search("query",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.7,
  fulltext_weight: 0.3
)

# With filters
{:ok, results} = Arcana.search("query",
  repo: MyApp.Repo,
  limit: 5,
  collection: "products"
)

# With GraphRAG (combines vector + graph search with RRF)
{:ok, results} = Arcana.search("query", repo: MyApp.Repo, graph: true)
```

See the Search Algorithms Guide for details on search modes.
## GraphRAG
GraphRAG enhances retrieval by building a knowledge graph from your documents. Entities (people, organizations, technologies) and their relationships are extracted during ingestion, then used alongside vector search for more contextual results.
```sh
# Install GraphRAG tables
mix arcana.graph.install
mix ecto.migrate
```

```elixir
# Ingest with graph building
{:ok, document} = Arcana.ingest(content, repo: MyApp.Repo, graph: true)

# Search combines vector + graph traversal with Reciprocal Rank Fusion
{:ok, results} = Arcana.search("Who leads OpenAI?", repo: MyApp.Repo, graph: true)
```

Components are pluggable: swap entity extractors (NER, LLM), relationship extractors, community detectors (Leiden), and summarizers with your own implementations.
See the GraphRAG Guide for entity extraction, community detection, and fusion search.
### Ask (Simple RAG)
```elixir
{:ok, answer} = Arcana.ask("What is Elixir?",
  repo: MyApp.Repo,
  llm: "openai:gpt-4o-mini"
)
```

### Agentic RAG
For complex questions, use the Agent pipeline with query expansion, re-ranking, and self-correcting answers:

```elixir
alias Arcana.Agent

llm = fn prompt -> {:ok, "LLM response"} end

ctx =
  Agent.new("Compare Elixir and Erlang features", repo: MyApp.Repo, llm: llm)
  |> Agent.select(collections: ["elixir-docs", "erlang-docs"])
  |> Agent.expand()
  |> Agent.search()
  |> Agent.rerank()
  |> Agent.answer(self_correct: true)

ctx.answer
# => "Generated answer based on retrieved context..."
```

### Pipeline Steps
| Step | What it does |
|---|---|
| `new/2` | Initialize context with question, repo, and LLM function |
| `rewrite/2` | Clean up conversational input ("Hey, can you tell me about X?" → "about X") |
| `select/2` | Choose which collections to search (LLM picks based on collection descriptions) |
| `expand/2` | Add synonyms and related terms ("ML models" → "ML machine learning models algorithms") |
| `decompose/2` | Split complex questions ("What is X and how does Y work?" → ["What is X?", "How does Y work?"]) |
| `search/2` | Execute vector search across selected collections |
| `rerank/2` | Score each chunk's relevance (0-10) and filter below threshold |
| `answer/2` | Generate final answer; with `self_correct: true`, evaluates and refines if not grounded |
### Example: Building a Pipeline

```elixir
# Simple pipeline - just search and answer
ctx =
  Agent.new(question, repo: MyApp.Repo, llm: llm)
  |> Agent.search(collection: "docs")
  |> Agent.answer()

# Full pipeline with all steps
ctx =
  Agent.new(question, repo: MyApp.Repo, llm: llm)
  |> Agent.rewrite()                                  # Clean up conversational input
  |> Agent.select(collections: available_collections) # Pick relevant collections
  |> Agent.expand()                                   # Add synonyms
  |> Agent.decompose()                                # Split multi-part questions
  |> Agent.search()                                   # Search each sub-question
  |> Agent.rerank(threshold: 7)                       # Keep chunks scoring 7+/10
  |> Agent.answer(self_correct: true)                 # Generate and verify answer

# Access results
ctx.answer           # Final answer
ctx.chunks           # Retrieved chunks after reranking
ctx.sub_questions    # Sub-questions from decomposition
ctx.correction_count # Number of self-correction iterations
```

### Custom Components
Every pipeline step can be replaced with a custom module or function:

```elixir
# Custom reranker using a cross-encoder model
defmodule MyApp.CrossEncoderReranker do
  @behaviour Arcana.Agent.Reranker

  @impl true
  def rerank(question, chunks, _opts) do
    scored =
      chunks
      |> Enum.map(fn chunk ->
        score = MyApp.CrossEncoder.score(question, chunk.text)
        {chunk, score}
      end)
      |> Enum.filter(fn {_, score} -> score > 0.5 end)
      |> Enum.sort_by(fn {_, score} -> score end, :desc)
      |> Enum.map(fn {chunk, _} -> chunk end)

    {:ok, scored}
  end
end

ctx |> Agent.rerank(reranker: MyApp.CrossEncoderReranker)

# Or use an inline function
ctx |> Agent.rerank(reranker: fn question, chunks, _opts ->
  {:ok, Enum.filter(chunks, &relevant?(&1, question))}
end)
```

All steps support custom implementations via behaviours:
| Step | Behaviour | Option |
|---|---|---|
| `rewrite/2` | `Arcana.Agent.Rewriter` | `:rewriter` |
| `select/2` | `Arcana.Agent.Selector` | `:selector` |
| `expand/2` | `Arcana.Agent.Expander` | `:expander` |
| `decompose/2` | `Arcana.Agent.Decomposer` | `:decomposer` |
| `search/2` | `Arcana.Agent.Searcher` | `:searcher` |
| `rerank/2` | `Arcana.Agent.Reranker` | `:reranker` |
| `answer/2` | `Arcana.Agent.Answerer` | `:answerer` |
See the Agentic RAG Guide for detailed examples.
## Architecture
```
┌───────────────────────────────────────────────────────────┐
│                     Your Phoenix App                      │
├───────────────────────────────────────────────────────────┤
│                       Arcana.Agent                        │
│  (rewrite → select → expand → search → rerank → answer)   │
├───────────────────────────────────────────────────────────┤
│   Arcana.ask/2   │  Arcana.search/2   │  Arcana.ingest/2  │
├──────────────────┴────────────────────┴───────────────────┤
│                                                           │
│  ┌─────────────┐  ┌───────────────┐  ┌─────────────┐      │
│  │   Chunker   │  │  Embeddings   │  │   Search    │      │
│  │ (splitting) │  │  (Bumblebee)  │  │ (pgvector)  │      │
│  └─────────────┘  └───────────────┘  └─────────────┘      │
│                                                           │
├───────────────────────────────────────────────────────────┤
│                  Your Existing Ecto Repo                  │
│              PostgreSQL + pgvector extension              │
└───────────────────────────────────────────────────────────┘
```

## Guides
- Getting Started - Installation, embedding models, basic usage
- Agentic RAG - Build sophisticated RAG pipelines
- GraphRAG - Knowledge graphs with entity extraction and community detection
- LLM Integration - Connect to OpenAI, Anthropic, or custom LLMs
- Search Algorithms - Semantic, fulltext, and hybrid search
- Re-ranking - Improve retrieval quality
- Evaluation - Measure and improve retrieval quality
- Telemetry - Observability, metrics, and debugging
- Dashboard - Web UI setup
## Roadmap
- [x] LiveView dashboard
- [x] Hybrid search (vector + full-text with RRF)
- [x] File ingestion (text, markdown, PDF)
- [x] Telemetry events for observability
- [x] In-memory vector store (HNSWLib backend)
- [x] Query expansion (Agent.expand/2)
- [x] Re-ranking (Agent.rerank/2)
- [x] Agentic RAG
  - [x] Agent pipeline with context struct
  - [x] Self-correcting answers (evaluate + refine)
  - [x] Question decomposition (multi-step)
  - [x] Collection selection
- [x] Pluggable components (custom behaviours for all steps)
- [x] E5 embedding model prefix support (`query:`/`passage:` prefixes)
- [ ] Additional vector store backends
  - [ ] TurboPuffer (hybrid search)
  - [ ] ChromaDB
- [ ] Async ingestion with Oban
- [ ] HyDE (Hypothetical Document Embeddings)
- [x] GraphRAG (knowledge graph + community summaries)
## Development
```sh
# Start PostgreSQL
docker compose up -d

# Install deps
mix deps.get

# Create and migrate test database
MIX_ENV=test mix ecto.create -r Arcana.TestRepo
MIX_ENV=test mix ecto.migrate -r Arcana.TestRepo

# Run tests
mix test
```
## License
Copyright (c) 2025 George Guimarães
Licensed under the Apache License, Version 2.0. See LICENSE file for details.