Kreuzberg.DocumentStructure (kreuzberg v4.4.2)


Structured document representation with hierarchical node tree.

The document is stored as a flat list of nodes. Each node refers to its parent and children by index into that list, so the list as a whole forms a tree. Root-level nodes have no parent (parent is nil). Nodes are stored in document (reading) order.

Fields

  • :nodes - List of DocumentNode structs in reading order

Examples

iex> structure = %Kreuzberg.DocumentStructure{
...>   nodes: [
...>     %Kreuzberg.DocumentNode{
...>       id: "node-1",
...>       node_type: "paragraph",
...>       content: %{"text" => "Hello world"},
...>       page_number: 1
...>     }
...>   ]
...> }
iex> structure.nodes
[%Kreuzberg.DocumentNode{...}]
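
The flat, index-based layout described in this module's overview can be walked without the library. The following is a minimal sketch that assumes plain maps carrying the same parent and children keys as DocumentNode; the FlatTreeSketch module and its outline function are hypothetical and are not part of Kreuzberg.

```elixir
defmodule FlatTreeSketch do
  # Hypothetical helper, not part of Kreuzberg. Nodes are plain maps with
  # :id, :parent (index or nil), and :children (list of indices).

  # Returns {depth, id} pairs in depth-first order, starting from the roots.
  def outline(nodes) do
    nodes
    |> Enum.with_index()
    |> Enum.filter(fn {entry, _index} -> is_nil(entry.parent) end)
    |> Enum.flat_map(fn {_entry, index} -> walk(nodes, index, 0) end)
  end

  # Visit one node, then recurse into each child index one level deeper.
  defp walk(nodes, index, depth) do
    current = Enum.at(nodes, index)
    [{depth, current.id} | Enum.flat_map(current.children, &walk(nodes, &1, depth + 1))]
  end
end

nodes = [
  %{id: "node-1", parent: nil, children: [1, 2]},
  %{id: "node-2", parent: 0, children: []},
  %{id: "node-3", parent: 0, children: [3]},
  %{id: "node-4", parent: 2, children: []}
]

IO.inspect(FlatTreeSketch.outline(nodes))
# [{0, "node-1"}, {1, "node-2"}, {1, "node-3"}, {2, "node-4"}]
```

Enum.at/2 is linear on lists, so for large documents a tuple or an index-keyed map may be a better fit for repeated lookups.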

Summary

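Functions

  • children(arg1, parent_index) - Get child nodes of a specific parent node.
  • count(document_structure) - Get total number of nodes in structure.
  • empty?(document_structure) - Check if document structure is empty.
  • from_map(data) - Creates a DocumentStructure struct from a map.
  • get_node(arg1, index) - Get a node by index (0-based).
  • nodes_by_type(document_structure, node_type) - Get all nodes of a specific type.
  • root_nodes(document_structure) - Get all root-level nodes (nodes with no parent).
  • to_map(structure) - Converts a DocumentStructure struct to a map.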

Types

t()

@type t() :: %Kreuzberg.DocumentStructure{nodes: [Kreuzberg.DocumentNode.t()]}

Functions

children(arg1, parent_index)

@spec children(t(), non_neg_integer()) :: [Kreuzberg.DocumentNode.t()]

Get child nodes of a specific parent node.

Parameters

  • structure - A DocumentStructure struct
  • parent_index - The zero-based index of the parent node in the nodes list

Returns

A list of child nodes.

Examples

iex> structure = %Kreuzberg.DocumentStructure{
...>   nodes: [
...>     %Kreuzberg.DocumentNode{id: "node-1", children: [1, 2]},
...>     %Kreuzberg.DocumentNode{id: "node-2", parent: 0},
...>     %Kreuzberg.DocumentNode{id: "node-3", parent: 0}
...>   ]
...> }
iex> children = Kreuzberg.DocumentStructure.children(structure, 0)
iex> length(children)
2

count(document_structure)

@spec count(t()) :: non_neg_integer()

Get total number of nodes in structure.

Parameters

  • structure - A DocumentStructure struct

Returns

The number of nodes, as a non-negative integer.

Examples

iex> structure = %Kreuzberg.DocumentStructure{nodes: [%Kreuzberg.DocumentNode{}]}
iex> Kreuzberg.DocumentStructure.count(structure)
1

empty?(document_structure)

@spec empty?(t()) :: boolean()

Check if document structure is empty.

Parameters

  • structure - A DocumentStructure struct

Returns

true if the structure has no nodes, false otherwise.

Examples

iex> structure = %Kreuzberg.DocumentStructure{nodes: []}
iex> Kreuzberg.DocumentStructure.empty?(structure)
true

from_map(data)

@spec from_map(map()) :: t()

Creates a DocumentStructure struct from a map.

Converts a plain map (typically returned by the Rust NIF) into a proper struct, converting nested node data as well.

Parameters

  • data - A map containing document structure fields

Returns

A DocumentStructure struct with properly typed fields.

Examples

iex> structure_map = %{
...>   "nodes" => [
...>     %{
...>       "id" => "node-1",
...>       "node_type" => "paragraph",
...>       "content" => %{"text" => "Hello"}
...>     }
...>   ]
...> }
iex> structure = Kreuzberg.DocumentStructure.from_map(structure_map)
iex> length(structure.nodes)
1
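
The map-to-struct conversion described above, and the reverse direction offered by to_map, can be illustrated without the library. This sketch assumes hypothetical SketchNode and SketchStructure structs with only a few fields; the real DocumentNode has more fields, and the library's conversion logic may differ.

```elixir
defmodule SketchNode do
  # Hypothetical stand-in for a node struct; not the library's DocumentNode.
  defstruct id: nil, node_type: nil, content: %{}

  # Build a struct from a string-keyed map, using defaults for missing keys.
  def from_map(data) when is_map(data) do
    %__MODULE__{
      id: Map.get(data, "id"),
      node_type: Map.get(data, "node_type"),
      content: Map.get(data, "content", %{})
    }
  end

  # Back to string keys, for example before serialization.
  def to_map(node) do
    %{"id" => node.id, "node_type" => node.node_type, "content" => node.content}
  end
end

defmodule SketchStructure do
  # Hypothetical stand-in for the structure; holds a list of SketchNode structs.
  defstruct nodes: []

  # Convert each nested node map so callers can rely on struct fields.
  def from_map(data) when is_map(data) do
    nodes = data |> Map.get("nodes", []) |> Enum.map(&SketchNode.from_map/1)
    %__MODULE__{nodes: nodes}
  end

  def to_map(structure) do
    %{"nodes" => Enum.map(structure.nodes, &SketchNode.to_map/1)}
  end
end

input = %{
  "nodes" => [
    %{"id" => "node-1", "node_type" => "paragraph", "content" => %{"text" => "Hello"}}
  ]
}

structure = SketchStructure.from_map(input)
IO.inspect(length(structure.nodes))
IO.inspect(SketchStructure.to_map(structure) == input)
```

Keeping both directions next to each other makes it easy to check that a round trip through the struct preserves the original map.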

get_node(arg1, index)

@spec get_node(t(), non_neg_integer()) :: Kreuzberg.DocumentNode.t() | nil

Get a node by index (0-based).

Parameters

  • structure - A DocumentStructure struct
  • index - Zero-based index of the node to retrieve

Returns

The node at that index, or nil if out of bounds.

Examples

iex> structure = %Kreuzberg.DocumentStructure{
...>   nodes: [%Kreuzberg.DocumentNode{id: "node-1"}]
...> }
iex> Kreuzberg.DocumentStructure.get_node(structure, 0)
%Kreuzberg.DocumentNode{id: "node-1", ...}
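
The out-of-bounds case can be illustrated with Elixir's standard Enum.at/2, which also uses zero-based indices and returns nil past the end of a list. Whether get_node is implemented this way is an assumption; the sketch below uses a plain list of maps rather than a DocumentStructure.

```elixir
# Plain list standing in for structure.nodes (illustrative data only).
nodes = [%{id: "node-1"}, %{id: "node-2"}]

# Zero-based: index 0 is the first node.
first = Enum.at(nodes, 0)

# Past the end of the list: nil rather than an error.
missing = Enum.at(nodes, 5)

IO.inspect(first)
IO.inspect(missing)
```

Callers that need to distinguish a missing node from a present one can therefore match on nil.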

nodes_by_type(document_structure, node_type)

@spec nodes_by_type(t(), String.t() | atom()) :: [Kreuzberg.DocumentNode.t()]

Get all nodes of a specific type.

Parameters

  • structure - A DocumentStructure struct
  • node_type - The node type to filter by (string or atom)

Returns

A list of nodes matching the given type.

Examples

iex> structure = %Kreuzberg.DocumentStructure{
...>   nodes: [
...>     %Kreuzberg.DocumentNode{node_type: "paragraph"},
...>     %Kreuzberg.DocumentNode{node_type: "heading"},
...>     %Kreuzberg.DocumentNode{node_type: "paragraph"}
...>   ]
...> }
iex> paragraphs = Kreuzberg.DocumentStructure.nodes_by_type(structure, "paragraph")
iex> length(paragraphs)
2
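
Because node_type may be given as a string or an atom, one plausible approach is to normalize both sides to strings before comparing. The sketch below shows that idea on plain maps; the TypeFilterSketch module is hypothetical, and the library's actual matching rules are not confirmed by this page.

```elixir
defmodule TypeFilterSketch do
  # Hypothetical helper, not part of Kreuzberg.
  # Accepts the wanted type as a string or an atom and compares as strings.
  def by_type(nodes, node_type) when is_binary(node_type) or is_atom(node_type) do
    wanted = to_string(node_type)
    Enum.filter(nodes, fn entry -> to_string(entry.node_type) == wanted end)
  end
end

nodes = [
  %{node_type: "paragraph"},
  %{node_type: "heading"},
  %{node_type: "paragraph"}
]

IO.inspect(length(TypeFilterSketch.by_type(nodes, "paragraph")))
IO.inspect(length(TypeFilterSketch.by_type(nodes, :paragraph)))
```

Under this assumption, "paragraph" and :paragraph select the same nodes.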

root_nodes(document_structure)

@spec root_nodes(t()) :: [Kreuzberg.DocumentNode.t()]

Get all root-level nodes (nodes with no parent).

Parameters

  • structure - A DocumentStructure struct

Returns

A list of root-level nodes.

Examples

iex> structure = %Kreuzberg.DocumentStructure{
...>   nodes: [
...>     %Kreuzberg.DocumentNode{id: "node-1", parent: nil},
...>     %Kreuzberg.DocumentNode{id: "node-2", parent: 0}
...>   ]
...> }
iex> roots = Kreuzberg.DocumentStructure.root_nodes(structure)
iex> length(roots)
1

to_map(structure)

@spec to_map(t()) :: map()

Converts a DocumentStructure struct to a map.

Useful for serialization and passing to external systems.

Parameters

  • structure - A DocumentStructure struct

Returns

A map with string keys representing all fields.

Examples

iex> structure = %Kreuzberg.DocumentStructure{nodes: []}
iex> Kreuzberg.DocumentStructure.to_map(structure)
%{"nodes" => []}