Arcana.Graph.GraphBuilder (Arcana v1.3.3)

Builds knowledge graph data from document chunks.

GraphBuilder orchestrates entity extraction, relationship extraction, and mention tracking to create a knowledge graph structure from text.

Usage

GraphBuilder is designed as an optional step in the ingest pipeline:

# During ingest (when graph: true option is passed)
chunks = Chunker.chunk(text, opts)
{:ok, graph_data} = GraphBuilder.build(chunks,
  entity_extractor: &Arcana.Graph.EntityExtractor.NER.extract/2,
  relationship_extractor: &RelationshipExtractor.extract/3
)

# Convert to queryable format
graph = GraphBuilder.to_query_graph(graph_data, chunks)

Output Structure

The builder outputs a map with:

%{
  entities: [%{id: "...", name: "...", type: :atom}],
  relationships: [%{source: "...", target: "...", type: "..."}],
  mentions: [%{entity_name: "...", chunk_id: "..."}]
}

This intermediate format can be persisted to a database or converted to the in-memory format used by GraphQuery.
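As an illustration of this shape, the sketch below builds a small hand-written graph_data value and looks up which chunks mention a given entity. The module name GraphDataExample, the sample entities, and the "WROTE_ABOUT" relationship type are invented for the example and are not part of Arcana. It also assumes that relationships refer to entities by name, which this page does not state.

```elixir
defmodule GraphDataExample do
  # Hypothetical helper module, not part of Arcana.

  # Hand-written sample in the documented output shape.
  def sample do
    %{
      entities: [
        %{id: "e1", name: "Ada Lovelace", type: :person},
        %{id: "e2", name: "Analytical Engine", type: :machine}
      ],
      relationships: [
        # Assumption: source and target hold entity names.
        %{source: "Ada Lovelace", target: "Analytical Engine", type: "WROTE_ABOUT"}
      ],
      mentions: [
        %{entity_name: "Ada Lovelace", chunk_id: "chunk-1"},
        %{entity_name: "Analytical Engine", chunk_id: "chunk-1"},
        %{entity_name: "Ada Lovelace", chunk_id: "chunk-2"}
      ]
    }
  end

  # Returns the ids of the chunks that mention the given entity name.
  def chunk_ids_for(graph_data, entity_name) do
    graph_data.mentions
    |> Enum.filter(&(&1.entity_name == entity_name))
    |> Enum.map(& &1.chunk_id)
    |> Enum.uniq()
  end
end
```

Because the data is a plain map of lists, it can be inspected or filtered with ordinary Enum functions before it is persisted or converted.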

Summary

Functions

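build(chunks, opts)

Builds graph data from a list of chunks.

build_from_text(text, opts)

Builds graph data from a single text string.

merge(graph1, graph2)

Merges two graph data structures.

to_query_graph(graph_data, chunks)

Converts builder output to the format used by GraphQuery.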

Functions

build(chunks, opts)

Builds graph data from a list of chunks.

Extracts entities and relationships from each chunk, tracking which entities appear in which chunks (mentions).

Options

  • :extractor - Combined extractor function (text, opts) -> {:ok, %{entities: [...], relationships: [...]}}. When provided, it takes priority over the separate extractors.
  • :entity_extractor - Function (text, opts) -> {:ok, entities} | {:error, reason}. Used when :extractor is not provided.
  • :relationship_extractor - Function (text, entities, opts) -> {:ok, rels} | {:error, reason}. Used when :extractor is not provided.

Returns

  • {:ok, graph_data} - Successfully built graph data
  • {:error, reason} - If all extractions fail
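The extractor signatures listed under Options can be satisfied by ordinary functions. The sketch below is a toy example, not a recommended extractor: the module MyExtractors, the capitalised-word heuristic, the :concept type, and the "RELATED_TO" label are all hypothetical. It also assumes that entity maps carry :name and :type keys, which this page shows only for the builder's output.

```elixir
defmodule MyExtractors do
  # Hypothetical extractors that follow the documented function shapes.

  # (text, opts) -> {:ok, entities} | {:error, reason}
  # Toy heuristic: treat every capitalised word as an entity.
  def extract_entities(text, _opts) do
    entities =
      ~r/\b[A-Z][a-z]+\b/
      |> Regex.scan(text)
      |> List.flatten()
      |> Enum.uniq()
      |> Enum.map(fn name -> %{name: name, type: :concept} end)

    {:ok, entities}
  end

  # (text, entities, opts) -> {:ok, rels} | {:error, reason}
  # Toy heuristic: link each entity to the next one found in the text.
  def extract_relationships(_text, entities, _opts) do
    rels =
      entities
      |> Enum.chunk_every(2, 1, :discard)
      |> Enum.map(fn [a, b] ->
        %{source: a.name, target: b.name, type: "RELATED_TO"}
      end)

    {:ok, rels}
  end

  # Combined form: (text, opts) -> {:ok, %{entities: [...], relationships: [...]}}
  def extract(text, opts) do
    with {:ok, entities} <- extract_entities(text, opts),
         {:ok, relationships} <- extract_relationships(text, entities, opts) do
      {:ok, %{entities: entities, relationships: relationships}}
    end
  end
end

# With Arcana loaded, these could be passed to the builder as either
#   GraphBuilder.build(chunks, extractor: &MyExtractors.extract/2)
# or
#   GraphBuilder.build(chunks,
#     entity_extractor: &MyExtractors.extract_entities/2,
#     relationship_extractor: &MyExtractors.extract_relationships/3
#   )
```

Since :extractor takes priority when it is given, passing all three options at once would leave the two separate extractors unused.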

build_from_text(text, opts)

Builds graph data from a single text string.

Convenience function for processing a single document without chunking it first.

merge(graph1, graph2)

Merges two graph data structures.

Combines entities (deduplicating by name), relationships, and mentions. Useful for incremental graph building across multiple documents.
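The described behaviour can be pictured with a few lines of plain Elixir. This is a sketch of the idea only, not Arcana's implementation: the module MergeSketch is hypothetical, it keeps the first entity seen for each name, and it assumes relationships and mentions are simply concatenated, since this page does not say whether they are deduplicated.

```elixir
defmodule MergeSketch do
  # Illustrative only: mirrors the documented behaviour, not Arcana's source.
  def merge(graph1, graph2) do
    %{
      # Deduplicate by name, keeping the first entity seen for each name.
      entities: Enum.uniq_by(graph1.entities ++ graph2.entities, & &1.name),
      # Assumption: relationships and mentions are concatenated as-is.
      relationships: graph1.relationships ++ graph2.relationships,
      mentions: graph1.mentions ++ graph2.mentions
    }
  end

  # Folds a list of per-document graphs into one, starting from an empty graph.
  def merge_all(graphs) do
    empty = %{entities: [], relationships: [], mentions: []}
    Enum.reduce(graphs, empty, fn graph, acc -> merge(acc, graph) end)
  end
end
```

For incremental building, the same fold could be written against the real function, for example Enum.reduce(graphs, &GraphBuilder.merge(&2, &1)), assuming each element is a graph_data map returned by build.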

to_query_graph(graph_data, chunks)

Converts builder output to the format used by GraphQuery.

Takes the graph data and original chunks to build an indexed graph structure suitable for querying.
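The format GraphQuery expects is not shown on this page, so the following only illustrates what "indexed" could mean for this data: lists turned into lookup maps. The module QueryIndexSketch, its key names, and the assumption that each chunk is a map with an :id key are all hypothetical.

```elixir
defmodule QueryIndexSketch do
  # Hypothetical index shape; the real GraphQuery format may differ.
  def index(graph_data, chunks) do
    %{
      # Entity lookup by name.
      entities_by_name: Map.new(graph_data.entities, &{&1.name, &1}),
      # For each entity name, the ids of the chunks that mention it.
      chunk_ids_by_entity:
        Enum.group_by(graph_data.mentions, & &1.entity_name, & &1.chunk_id),
      # Relationships grouped by their source.
      outgoing: Enum.group_by(graph_data.relationships, & &1.source),
      # Assumption: every chunk is a map with an :id key.
      chunks_by_id: Map.new(chunks, &{&1.id, &1})
    }
  end
end
```

Lookups such as "which chunks mention this entity" then become single map reads instead of scans over the mentions list.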