Arcana.Graph (Arcana v1.3.3)

GraphRAG (Graph-enhanced Retrieval Augmented Generation) for Arcana.

This module provides the public API for GraphRAG functionality:

  • Building knowledge graphs from documents
  • Graph-based search and retrieval
  • Fusion search combining vector and graph results
  • Community summaries for global context

Installation

GraphRAG is optional and requires separate installation:

$ mix arcana.graph.install
$ mix ecto.migrate

Add the NER serving to your supervision tree:

children = [
  MyApp.Repo,
  Arcana.Embedder.Local,
  Arcana.Graph.NERServing  # For entity extraction
]

Configuration

GraphRAG is disabled by default. Enable it in your config:

config :arcana,
  graph: [
    enabled: true,
    community_levels: 5,
    resolution: 1.0
  ]

Or enable per-call:

Arcana.ingest(text, repo: MyApp.Repo, graph: true)
Arcana.search(query, repo: MyApp.Repo, graph: true)
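
The interaction between the global flag and the per-call option can be pictured with a small sketch. It is not Arcana's implementation: the GraphToggleSketch module and its graph_enabled?/2 function are hypothetical, and the assumed precedence (an explicit per-call :graph option wins over the global :enabled setting) is an illustration only.

```elixir
defmodule GraphToggleSketch do
  # Hypothetical helper, not part of Arcana: decides whether graph features
  # apply to a single call, given that call's options and the global config.
  def graph_enabled?(call_opts, global_config) do
    # Assumption: an explicit per-call :graph option overrides the global flag.
    case Keyword.fetch(call_opts, :graph) do
      {:ok, value} -> value == true
      :error -> Keyword.get(global_config, :enabled, false)
    end
  end
end

global = [enabled: false, community_levels: 5, resolution: 1.0]

# Per-call opt-in while the global flag is off
IO.inspect(GraphToggleSketch.graph_enabled?([graph: true], global))

# No per-call option, so the global flag decides
IO.inspect(GraphToggleSketch.graph_enabled?([], global))
```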

Usage

# Build a graph from chunks
{:ok, graph_data} = Arcana.Graph.build(chunks,
  entity_extractor: &MyApp.extract_entities/2,
  relationship_extractor: &MyApp.extract_relationships/3
)

# Convert to queryable format
graph = Arcana.Graph.to_query_graph(graph_data, chunks)

# Search the graph
results = Arcana.Graph.search(graph, entities, depth: 2)

# Fusion search combining vector and graph
results = Arcana.Graph.fusion_search(graph, entities, vector_results)

Components

GraphRAG consists of several modules:

Custom Implementations

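The entity extractor, relationship extractor, community detector, and community summarizer each follow the behaviour pattern, so any of them can be replaced with a custom module. Configure a replacement as a {Module, opts} tuple: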

# Custom entity extractor
config :arcana, :graph,
  entity_extractor: {MyApp.SpacyExtractor, endpoint: "http://localhost:5000"}

# Custom relationship extractor
config :arcana, :graph,
  relationship_extractor: {MyApp.PatternExtractor, patterns: [...]}

# Custom community detector
config :arcana, :graph,
  community_detector: {MyApp.LouvainDetector, resolution: 0.5}

# Custom community summarizer
config :arcana, :graph,
  community_summarizer: {MyApp.ExtractiveSum, max_sentences: 3}
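
As a rough illustration of what a custom entity extractor might look like, the sketch below treats runs of capitalized words as entities. The module name MyApp.CapitalizedWordExtractor, the extract/2 function name, the :default_type option, and the {:ok, entities} return shape are all assumptions made for this example; the behaviour's actual callbacks are not shown on this page, so check the behaviour module before relying on this shape.

```elixir
defmodule MyApp.CapitalizedWordExtractor do
  # Hypothetical extractor: treats runs of capitalized words as entities.
  # The extract/2 name, the :default_type option, and the return shape are
  # assumptions for illustration, not the documented behaviour callbacks.
  def extract(text, opts) do
    type = Keyword.get(opts, :default_type, :unknown)

    entities =
      ~r/\b[A-Z][a-zA-Z]+(?:\s+[A-Z][a-zA-Z]+)*/
      |> Regex.scan(text)
      |> Enum.map(fn [match] -> match end)
      |> Enum.uniq()
      |> Enum.map(fn name -> %{name: name, type: type} end)

    {:ok, entities}
  end
end

IO.inspect(MyApp.CapitalizedWordExtractor.extract("OpenAI hired Sam Altman.", []))
```

Because it takes the text and an options list, a function like this could also be passed in the function form used by build/2, for example entity_extractor: &MyApp.CapitalizedWordExtractor.extract/2.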

Summary

Functions

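build(chunks, opts)

Builds graph data from document chunks.

build_and_persist(chunk_records, collection, repo, opts)

Builds and persists graph data from chunk records during ingest.

community_summaries(graph, opts \\ [])

Gets community summaries from the graph.

config()

Returns the current GraphRAG configuration.

enabled?()

Returns whether GraphRAG is enabled globally.

find_entities(graph, name, opts \\ [])

Finds entities in the graph by name.

fusion_search(graph, entities, vector_results, opts \\ [])

Combines vector search and graph search using Reciprocal Rank Fusion.

resolve_entity_extractor(opts)

Resolves the entity extractor from options and config.

search(graph, entities, opts \\ [])

Searches the knowledge graph for relevant chunks.

to_query_graph(graph_data, chunks)

Converts builder output to queryable graph format.

traverse(graph, entity_id, opts \\ [])

Traverses the graph from a starting entity.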

Functions

build(chunks, opts)

Builds graph data from document chunks.

Delegates to Arcana.Graph.GraphBuilder.build/2.

Options

  • :entity_extractor - Two-argument function (text, opts) that extracts entities from text
  • :relationship_extractor - Three-argument function (text, entities, opts) that extracts relationships between the entities

Example

{:ok, graph_data} = Arcana.Graph.build(chunks,
  entity_extractor: fn text, _opts ->
    Arcana.Graph.EntityExtractor.NER.extract(text, [])
  end,
  relationship_extractor: fn text, entities, _opts ->
    Arcana.Graph.RelationshipExtractor.extract(text, entities, my_llm)
  end
)

build_and_persist(chunk_records, collection, repo, opts)

Builds and persists graph data from chunk records during ingest.

Processes chunks incrementally and persists after each one, so progress is saved continuously. Accepts an optional :progress callback that is called with two arguments, the current chunk number and the total number of chunks, after each chunk is processed.

Options

  • :progress - Callback function fn current, total -> ... end called after each chunk

Examples

# With progress logging
Arcana.Graph.build_and_persist(chunks, collection, repo,
  progress: fn current, total ->
    IO.puts("Processed chunk #{current}/#{total}")
  end
)
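
For large ingests, logging every chunk can be noisy. The sketch below builds a throttled callback with the same two-argument shape as the example above. The ProgressSketch module, the every interval, and the sink parameter are hypothetical names introduced for illustration.

```elixir
defmodule ProgressSketch do
  # Builds a two-argument callback matching the `fn current, total -> ... end`
  # shape of the :progress option. `every` and `sink` are hypothetical
  # parameters introduced for this sketch.
  def reporter(every, sink \\ &IO.puts/1) do
    fn current, total ->
      # Report every `every` chunks, and always on the final chunk.
      if rem(current, every) == 0 or current == total do
        percent = Float.round(current / total * 100, 1)
        sink.("Graph build: #{current}/#{total} chunks (#{percent}%)")
      end

      :ok
    end
  end
end

report = ProgressSketch.reporter(10)
report.(10, 25)
report.(25, 25)
```

The returned function could then be passed as progress: ProgressSketch.reporter(10) in the options to build_and_persist/4.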

community_summaries(graph, opts \\ [])

Gets community summaries from the graph.

Community summaries provide high-level context about clusters of related entities, useful for global queries.

Options

  • :level - Filter by hierarchy level (0 = finest)
  • :entity_id - Return only communities that contain the given entity

Example

# Get summaries at the finest level (level 0)
summaries = Arcana.Graph.community_summaries(graph, level: 0)

config()

Returns the current GraphRAG configuration.

Example

Arcana.Graph.config()
# => %{enabled: false, community_levels: 5, resolution: 1.0}

enabled?()

Returns whether GraphRAG is enabled globally.

Check this before performing graph operations:

if Arcana.Graph.enabled?() do
  # Build graph during ingest
end

find_entities(graph, name, opts \\ [])

Finds entities in the graph by name.

Options

  • :fuzzy - Enable fuzzy matching (default: false)
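
This page does not say which fuzzy matching algorithm is used. As one possible illustration, the sketch below compares names with Jaro similarity from Elixir's standard library (String.jaro_distance/2). The FuzzyNameSketch module, the :threshold option, and the case-insensitive comparison are assumptions made for this example, and it works on a plain list of entity maps rather than on an Arcana graph.

```elixir
defmodule FuzzyNameSketch do
  # Hypothetical helper: returns entities whose names match `query`.
  # Jaro similarity is a stand-in here; the matching that Arcana performs
  # for :fuzzy is not specified on this page.
  def find(entities, query, opts \\ []) do
    threshold = Keyword.get(opts, :threshold, 0.85)
    needle = String.downcase(query)

    if Keyword.get(opts, :fuzzy, false) do
      entities
      |> Enum.map(fn e -> {String.jaro_distance(String.downcase(e.name), needle), e} end)
      |> Enum.filter(fn {score, _e} -> score >= threshold end)
      |> Enum.sort_by(fn {score, _e} -> score end, :desc)
      |> Enum.map(fn {_score, e} -> e end)
    else
      # Exact match, ignoring case (an assumption for this sketch)
      Enum.filter(entities, fn e -> String.downcase(e.name) == needle end)
    end
  end
end

entities = [
  %{name: "OpenAI", type: :organization},
  %{name: "Anthropic", type: :organization}
]

# "OpenAl" is a deliberate typo for "OpenAI"
IO.inspect(FuzzyNameSketch.find(entities, "OpenAl", fuzzy: true))
```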

fusion_search(graph, entities, vector_results, opts \\ [])

Combines vector search and graph search using Reciprocal Rank Fusion.

This is the primary retrieval method for GraphRAG, merging results from both vector similarity and knowledge graph traversal.

Options

  • :depth - Graph traversal depth (default: 1)
  • :limit - Maximum results to return (default: 10)
  • :k - RRF constant (default: 60)

Example

# Run vector search separately
{:ok, vector_results} = Arcana.search(query, repo: MyApp.Repo)

# Extract entities from query
{:ok, entities} = Arcana.Graph.EntityExtractor.NER.extract(query, [])

# Combine with graph search
results = Arcana.Graph.fusion_search(graph, entities, vector_results)
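
To make the role of the :k and :limit options concrete, here is a minimal sketch of Reciprocal Rank Fusion over plain lists of chunk IDs. Each ID scores the sum of 1 / (k + rank) over the lists it appears in, with ranks starting at 1. The RRFSketch module is hypothetical and operates on bare IDs; how fusion_search/4 represents results internally is not shown on this page.

```elixir
defmodule RRFSketch do
  # Reciprocal Rank Fusion over ranked lists of IDs.
  # Each ID scores sum(1 / (k + rank)) across the lists it appears in,
  # where rank starts at 1 for the best result in a list.
  def fuse(ranked_lists, opts \\ []) do
    k = Keyword.get(opts, :k, 60)
    limit = Keyword.get(opts, :limit, 10)

    ranked_lists
    |> Enum.flat_map(fn list ->
      list
      |> Enum.with_index(1)
      |> Enum.map(fn {id, rank} -> {id, 1.0 / (k + rank)} end)
    end)
    |> Enum.reduce(%{}, fn {id, score}, acc ->
      Map.update(acc, id, score, &(&1 + score))
    end)
    # Highest fused score first; ties broken by ID for a stable order
    |> Enum.sort_by(fn {id, score} -> {-score, id} end)
    |> Enum.take(limit)
  end
end

vector_ids = ["c1", "c2", "c3"]
graph_ids = ["c3", "c1", "c4"]
IO.inspect(RRFSketch.fuse([vector_ids, graph_ids]))
```

With these inputs, "c1" and "c3" rank ahead of "c2" and "c4" because they appear in both lists. A larger k flattens the difference between top and lower ranks, so agreement between the two sources matters more than position within either one.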

resolve_entity_extractor(opts)

Resolves the entity extractor from options and config.

search(graph, entities, opts \\ [])

Searches the knowledge graph for relevant chunks.

Looks up the given entities in the graph, traverses their relationships up to :depth hops, and returns the connected chunks.

Options

  • :depth - How many hops to traverse (default: 1)

Example

entities = [%{name: "OpenAI", type: :organization}]
results = Arcana.Graph.search(graph, entities, depth: 2)

to_query_graph(graph_data, chunks)

Converts builder output to queryable graph format.

Delegates to Arcana.Graph.GraphBuilder.to_query_graph/2.

traverse(graph, entity_id, opts \\ [])

Traverses the graph from a starting entity.

Options

  • :depth - Maximum traversal depth (default: 1)
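
The effect of :depth can be illustrated with a depth-limited breadth-first walk over a plain adjacency map. This is a sketch under stated assumptions, not the library's traversal: the TraverseSketch module and the %{id => [neighbor_id]} graph shape are hypothetical, and the structure of Arcana's query graph is not shown on this page.

```elixir
defmodule TraverseSketch do
  # Breadth-first traversal over a plain adjacency map (%{id => [neighbor_id]}).
  # Returns the set of IDs reachable within `depth` hops, excluding the start.
  def traverse(adjacency, start, opts \\ []) do
    depth = Keyword.get(opts, :depth, 1)

    adjacency
    |> walk([start], MapSet.new([start]), depth)
    |> MapSet.delete(start)
  end

  # Stop when the hop budget is used up or there is nothing left to expand.
  defp walk(_adjacency, _frontier, seen, 0), do: seen
  defp walk(_adjacency, [], seen, _depth), do: seen

  defp walk(adjacency, frontier, seen, depth) do
    next =
      frontier
      |> Enum.flat_map(&Map.get(adjacency, &1, []))
      |> Enum.uniq()
      |> Enum.reject(&MapSet.member?(seen, &1))

    walk(adjacency, next, MapSet.union(seen, MapSet.new(next)), depth - 1)
  end
end

adjacency = %{"a" => ["b"], "b" => ["c"], "c" => ["a", "d"]}
IO.inspect(TraverseSketch.traverse(adjacency, "a", depth: 2))
```

Tracking visited IDs keeps cycles (such as "c" pointing back to "a") from being expanded twice, so each extra hop only adds entities not reached at a shallower depth.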