Document Model Architecture

Quillon uses an AST (Abstract Syntax Tree) to represent document structures. This guide documents the architecture and extension strategies for supporting rich content elements.

Current Architecture

AST Structure

Documents are represented as nested Elixir tuples following a strict grammar:

{:type, attrs, children}

Example:

{:document, %{id: "doc_123", name: "Welcome"},
  [
    {:heading, %{level: 1}, [
      {:text, %{text: "Hello ", marks: []}, []},
      {:text, %{text: "World", marks: [:bold]}, []}
    ]},
    {:paragraph, %{}, [
      {:text, %{text: "This is a ", marks: []}, []},
      {:text, %{text: "rich text", marks: [:bold, :italic]}, []},
      {:text, %{text: " editor.", marks: []}, []}
    ]}
  ]}
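
Because every node has the same three-element shape, a traversal needs only two clauses. The following is a minimal sketch and not part of Quillon's API: the module name AstSketch and the function plain_text/1 are assumptions made for illustration.

```elixir
defmodule AstSketch do
  # Hypothetical helper (not a Quillon function): collect the text of a node.

  # A text node carries its content in attrs and has no children.
  def plain_text({:text, %{text: text}, []}), do: text

  # Any other node concatenates the text of its children in order.
  def plain_text({_type, _attrs, children}) when is_list(children) do
    Enum.map_join(children, "", &plain_text/1)
  end
end

heading =
  {:heading, %{level: 1},
   [
     {:text, %{text: "Hello ", marks: []}, []},
     {:text, %{text: "World", marks: [:bold]}, []}
   ]}

AstSketch.plain_text(heading)
# => "Hello World"
```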

Element Categories

Category	Types	Purpose
Block	heading, paragraph, blockquote, callout, code_block, divider, image, video, embed, bullet_list, ordered_list, table	Vertical stacking elements
Inline	text	Text content with marks
Container	document, list_item, table_row, table_cell	Structural containers

Type Definitions

@block_types [
  :heading, :paragraph, :blockquote, :callout, :code_block, :divider,
  :image, :video, :embed,
  :bullet_list, :ordered_list,
  :table
]

@inline_types [:text]

@container_types [:document, :list_item, :table_row, :table_cell]

Key Operations

All operations are immutable - they return a new AST rather than modifying in place.

# Path-based access (indices into children)
Quillon.get(doc, [0, 2])           # Get node at path
Quillon.update(doc, [0, 2], fun)   # Update node at path
Quillon.insert(doc, [0, 3], node)  # Insert at position
Quillon.delete(doc, [0, 2])        # Delete node
Quillon.reorder(doc, [0], ids)     # Reorder children by ID list
Quillon.move(doc, [0, 1], [1, 0])  # Move between positions

# ID-based access
Quillon.find_path(doc, "node_id")        # Find path to node
Quillon.get_by_id(doc, "id")             # Get node by ID
Quillon.update_by_id(doc, "id", fun)     # Update by ID

# Factory functions
Quillon.new(:document, %{name: "My Doc"})
Quillon.new(:heading, %{level: 1}, "Title")
Quillon.new(:paragraph, "Some text content")
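
To illustrate what "immutable, path-based" means in practice, here is a minimal sketch of get and update over the tuple shape. It is a hypothetical reimplementation under the assumption that a path is a list of child indices; the module PathSketch is not part of Quillon, and the library's internals may differ.

```elixir
defmodule PathSketch do
  # Hypothetical reimplementation for illustration only.

  # An empty path addresses the node itself.
  def get(node, []), do: node

  def get({_type, _attrs, children}, [index | rest]) do
    case Enum.at(children, index) do
      nil -> nil
      child -> get(child, rest)
    end
  end

  # Only the tuples along the path are rebuilt; the input is never modified.
  def update(node, [], fun), do: fun.(node)

  def update({type, attrs, children}, [index | rest], fun) do
    {type, attrs, List.update_at(children, index, &update(&1, rest, fun))}
  end
end

doc =
  {:document, %{},
   [
     {:paragraph, %{}, [{:text, %{text: "Hi", marks: []}, []}]}
   ]}

upcase = fn {:text, attrs, []} ->
  {:text, %{attrs | text: String.upcase(attrs.text)}, []}
end

new_doc = PathSketch.update(doc, [0, 0], upcase)

PathSketch.get(new_doc, [0, 0])
# => {:text, %{text: "HI", marks: []}, []}
PathSketch.get(doc, [0, 0])
# => {:text, %{text: "Hi", marks: []}, []}  (the original is unchanged)
```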

JSON Serialization

# AST → JSON
Quillon.to_json({:paragraph, %{}, [{:text, %{text: "Hello", marks: [:bold]}, []}]})
# => %{"type" => "paragraph", "attrs" => %{}, "children" => [
#      %{"type" => "text", "attrs" => %{"text" => "Hello", "marks" => ["bold"]}, "children" => []}
#    ]}

# JSON → AST
Quillon.from_json(%{"type" => "paragraph", "attrs" => %{}, "children" => [...]})
# => {:paragraph, %{}, [...]}
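
The mapping between tuples and string-keyed maps can be sketched as below. This is an assumption-laden illustration rather than Quillon's implementation: the module JsonSketch is hypothetical, only atom marks are handled, and attribute values are passed through unchanged (so atom values such as :solid would not round-trip).

```elixir
defmodule JsonSketch do
  # Hypothetical sketch; tuple marks such as {:link, attrs} are left out.

  def to_map({:text, %{text: text, marks: marks}, []}) do
    %{
      "type" => "text",
      "attrs" => %{"text" => text, "marks" => Enum.map(marks, &Atom.to_string/1)},
      "children" => []
    }
  end

  def to_map({type, attrs, children}) do
    %{
      "type" => Atom.to_string(type),
      "attrs" => Map.new(attrs, fn {k, v} -> {Atom.to_string(k), v} end),
      "children" => Enum.map(children, &to_map/1)
    }
  end

  def from_map(%{"type" => "text", "attrs" => %{"text" => text, "marks" => marks}}) do
    # String.to_existing_atom/1 avoids creating atoms from untrusted input.
    {:text, %{text: text, marks: Enum.map(marks, &String.to_existing_atom/1)}, []}
  end

  def from_map(%{"type" => type, "attrs" => attrs, "children" => children}) do
    {String.to_existing_atom(type),
     Map.new(attrs, fn {k, v} -> {String.to_existing_atom(k), v} end),
     Enum.map(children, &from_map/1)}
  end
end

node = {:paragraph, %{}, [{:text, %{text: "Hello", marks: [:bold]}, []}]}

node |> JsonSketch.to_map() |> JsonSketch.from_map()
# => {:paragraph, %{}, [{:text, %{text: "Hello", marks: [:bold]}, []}]}
```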

Block Elements

Heading

{:heading, %{level: 2}, [
  {:text, %{text: "Section Title", marks: []}, []}
]}

Attribute	Type	Default	Description
level	integer	2	Heading level (1-6)

Children: inline content (text nodes with marks)

Paragraph

{:paragraph, %{}, [
  {:text, %{text: "Hello ", marks: []}, []},
  {:text, %{text: "world", marks: [:bold]}, []}
]}

Children: inline content (text nodes with marks)

Divider

{:divider, %{style: :solid}, []}

Attribute	Type	Default	Description
style	atom	:solid	Line style (:solid, :dashed, :dotted)

Design Goals

Pure Elixir - No JavaScript dependencies in the core library
Immutable operations - All transforms return new AST, never mutate
Structured inline content - Text nodes with marks, not markdown strings
Schema validation - Content expressions define valid structures
Framework agnostic - Core library works without Phoenix/LiveView
CRDT-ready - Structure supports collaborative editing (via separate package)

Block Types Reference

A block element occupies its own vertical space, so blocks stack one above another. Inline elements, by contrast, flow within a line of text.

Block vs Inline:
┌──────────────────────────────────────┐
│ {:heading, ...}           ← block    │
├──────────────────────────────────────┤
│ {:paragraph, %{}, [                  │
│   {:text, %{text: "Hello "}, []}     │  ← inline
│   {:text, %{text: "world",           │
│            marks: [:bold]}, []}      │  ← inline
│ ]}                        ← block    │
├──────────────────────────────────────┤
│ {:image, ...}             ← block    │
└──────────────────────────────────────┘

Block Type Categories

@block_type_categories %{
  # Text blocks
  text: [:heading, :paragraph, :blockquote, :callout, :code_block],

  # Media
  media: [:image, :video, :embed],

  # Lists
  list: [:bullet_list, :ordered_list],

  # Tables
  table: [:table],

  # Structural
  structural: [:divider]
}

Blockquote

{:blockquote, %{citation: "Shakespeare"}, [
  {:paragraph, %{}, [
    {:text, %{text: "To be or not to be", marks: []}, []}
  ]}
]}

Attribute	Type	Default	Description
citation	string	nil	Attribution

Children: paragraph blocks

Callout

{:callout, %{type: :info, title: "Note"}, [
  {:paragraph, %{}, [
    {:text, %{text: "Important information", marks: []}, []}
  ]}
]}

Attribute	Type	Default	Description
type	atom	:info	Callout type (:info, :warning, :success, :error)
title	string	nil	Optional title

Children: paragraph blocks

Code Block

{:code_block, %{code: "def hello, do: :world", language: "elixir"}, []}

Attribute	Type	Default	Description
code	string	required	Code content
language	string	nil	Syntax highlighting language

Children: none (code stored in attrs)

Image

{:image, %{src: "/uploads/photo.jpg", alt: "Photo", caption: "A nice photo"}, []}

Attribute	Type	Default	Description
src	string	required	Image URL
alt	string	""	Alt text
caption	string	nil	Optional caption
width	integer	nil	Width in pixels

Children: none

Lists

{:bullet_list, %{}, [
  {:list_item, %{}, [
    {:paragraph, %{}, [{:text, %{text: "First item", marks: []}, []}]}
  ]},
  {:list_item, %{}, [
    {:paragraph, %{}, [{:text, %{text: "Second item", marks: []}, []}]}
  ]}
]}

{:ordered_list, %{start: 1}, [
  {:list_item, %{}, [
    {:paragraph, %{}, [{:text, %{text: "Step one", marks: []}, []}]}
  ]},
  {:list_item, %{}, [
    {:paragraph, %{}, [{:text, %{text: "Step two", marks: []}, []}]}
  ]}
]}

Tables

{:table, %{}, [
  {:table_row, %{header: true}, [
    {:table_cell, %{}, [
      {:paragraph, %{}, [{:text, %{text: "Name", marks: []}, []}]}
    ]},
    {:table_cell, %{}, [
      {:paragraph, %{}, [{:text, %{text: "Email", marks: []}, []}]}
    ]}
  ]},
  {:table_row, %{}, [
    {:table_cell, %{}, [
      {:paragraph, %{}, [{:text, %{text: "John", marks: []}, []}]}
    ]},
    {:table_cell, %{}, [
      {:paragraph, %{}, [{:text, %{text: "john@example.com", marks: []}, []}]}
    ]}
  ]}
]}
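
Writing the nested table tuples by hand is verbose, so a small builder can help. The sketch below is hypothetical (TableSketch and from_rows/1 are not Quillon functions) and assumes the first row is the header and every cell holds a single plain-text paragraph.

```elixir
defmodule TableSketch do
  # Hypothetical builder: the first row becomes the header row.
  def from_rows([header | body]) do
    {:table, %{}, [row(header, %{header: true}) | Enum.map(body, &row(&1, %{}))]}
  end

  defp row(cells, attrs), do: {:table_row, attrs, Enum.map(cells, &cell/1)}

  # Each cell wraps its text in a paragraph, as the schema expects.
  defp cell(text) do
    {:table_cell, %{}, [{:paragraph, %{}, [{:text, %{text: text, marks: []}, []}]}]}
  end
end

table = TableSketch.from_rows([["Name", "Email"], ["John", "john@example.com"]])
```

The result is the same tuple structure as the table written out above.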

Inline Content (Rich Text)

Inline content uses text nodes with marks, similar to ProseMirror/Tiptap, Lexical, and Slate.

Text Node Structure

{:text, %{text: "content", marks: [mark1, mark2, ...]}, []}

All inline content uses the :text node type. Formatting (including links) is applied via marks.

Example

# "Hello world! Visit our site for more."
#        ^^^^^ bold
#                      ^^^^^^^^ link

{:paragraph, %{}, [
  {:text, %{text: "Hello ", marks: []}, []},
  {:text, %{text: "world", marks: [:bold]}, []},
  {:text, %{text: "! Visit ", marks: []}, []},
  {:text, %{text: "our site", marks: [{:link, %{href: "https://example.com"}}]}, []},
  {:text, %{text: " for more.", marks: []}, []}
]}

Mark Types

Marks can be simple atoms or tuples with attributes:

# Simple marks (no attributes needed)
:bold
:italic
:underline
:strike
:code
:subscript
:superscript

# Marks with attributes
{:link, %{href: "https://example.com", title: "Link title", target: "_blank"}}
{:highlight, %{color: "yellow"}}
{:font_color, %{color: "#FF5500"}}
{:mention, %{id: "user_123", type: :user, label: "@john"}}

Mark Reference

Mark	Type	Attributes	Description
`:bold`	atom	-	Bold text
`:italic`	atom	-	Italic text
`:underline`	atom	-	Underlined text
`:strike`	atom	-	Strikethrough
`:code`	atom	-	Inline code (monospace)
`:subscript`	atom	-	Subscript text
`:superscript`	atom	-	Superscript text
`:link`	tuple	`href`, `title`, `target`	Hyperlink
`:highlight`	tuple	`color`	Background highlight
`:font_color`	tuple	`color`	Text color
`:mention`	tuple	`id`, `type`, `label`	User/item mention

Complex Formatted Text

# "Hello world! Click here for more info."
#        ^^^^^ bold
#               ^^^^^^^^^^ bold + italic + link

{:paragraph, %{}, [
  {:text, %{text: "Hello ", marks: []}, []},
  {:text, %{text: "world", marks: [:bold]}, []},
  {:text, %{text: "! ", marks: []}, []},
  {:text, %{text: "Click here", marks: [:bold, :italic, {:link, %{href: "/info"}}]}, []},
  {:text, %{text: " for more info.", marks: []}, []}
]}
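
One benefit of this structure is that rendering becomes a fold over the marks of each text node. The renderer below is a minimal sketch, not Quillon's renderer: the module HtmlSketch, the tag choices, and the escaping routine are all assumptions made for illustration.

```elixir
defmodule HtmlSketch do
  # Hypothetical mapping from simple marks to HTML tags.
  @tags %{bold: "strong", italic: "em", underline: "u", strike: "s", code: "code"}

  def render({:paragraph, _attrs, children}) do
    "<p>" <> Enum.map_join(children, "", &render/1) <> "</p>"
  end

  def render({:text, %{text: text, marks: marks}, []}) do
    # Wrap the escaped text once per mark; the first mark ends up outermost.
    marks
    |> Enum.reverse()
    |> Enum.reduce(escape(text), &wrap/2)
  end

  defp wrap({:link, %{href: href}}, inner) do
    "<a href=\"" <> escape(href) <> "\">" <> inner <> "</a>"
  end

  defp wrap(mark, inner) do
    case Map.fetch(@tags, mark) do
      {:ok, tag} -> "<#{tag}>#{inner}</#{tag}>"
      # Marks this sketch does not know are rendered without a wrapper.
      :error -> inner
    end
  end

  defp escape(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
  end
end

HtmlSketch.render(
  {:text, %{text: "Click here", marks: [:bold, :italic, {:link, %{href: "/info"}}]}, []}
)
# => "<strong><em><a href=\"/info\">Click here</a></em></strong>"
```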

Why Structured Text (Not Markdown)

Programmatic manipulation - add/remove formatting without parsing strings
Collaborative editing - CRDT can track changes to individual text nodes
Validation - enforce allowed marks per context
Rendering flexibility - same structure renders to HTML, plain text, or other formats
Cursor positioning - track cursor position within rich text

Mark Configuration

Each mark type has configuration that controls its behavior:

@mark_config %{
  bold: %{
    inclusive: true,       # New text at mark boundary inherits the mark
    keep_on_split: true,   # Mark persists when pressing Enter
    excludes: []           # No conflicts with other marks
  },
  italic: %{
    inclusive: true,
    keep_on_split: true,
    excludes: []
  },
  code: %{
    inclusive: false,      # New text doesn't inherit code formatting
    keep_on_split: false,
    excludes: [:bold, :italic, :underline, :strike, :link]  # Code excludes other formatting
  },
  link: %{
    inclusive: false,      # New text doesn't become part of link
    keep_on_split: false,  # Link doesn't span newlines
    excludes: [],
    attrs: [:href, :title, :target]
  },
  highlight: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: [:color]
  }
}

Property	Description
`inclusive`	Whether new text typed at mark boundary gets the mark
`keep_on_split`	Whether mark persists when node is split (e.g., pressing Enter)
`excludes`	List of marks that cannot coexist with this mark
`attrs`	List of attributes for marks with data
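
As a rough illustration of how these flags might be consulted, the sketch below filters a node's marks by `inclusive` and `keep_on_split`. It uses a reduced copy of the configuration; the module MarkConfigSketch and both function names are hypothetical.

```elixir
defmodule MarkConfigSketch do
  # A reduced copy of the configuration above, limited to the fields used here.
  @config %{
    bold: %{inclusive: true, keep_on_split: true},
    code: %{inclusive: false, keep_on_split: false},
    link: %{inclusive: false, keep_on_split: false}
  }

  # Marks that text typed at the end of a node would inherit.
  def inherited_on_insert(marks), do: Enum.filter(marks, &flag?(&1, :inclusive))

  # Marks carried over when the node is split (for example by pressing Enter).
  def kept_on_split(marks), do: Enum.filter(marks, &flag?(&1, :keep_on_split))

  defp flag?(mark, key) do
    # Marks are either bare atoms or {type, attrs} tuples.
    type = if is_tuple(mark), do: elem(mark, 0), else: mark
    get_in(@config, [type, key]) == true
  end
end

MarkConfigSketch.inherited_on_insert([:bold, {:link, %{href: "/info"}}])
# => [:bold]
MarkConfigSketch.kept_on_split([:bold, :code])
# => [:bold]
```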

Text Transforms

Text Splitting Algorithm

When applying a mark to a selection, text nodes are split at the selection boundaries:

defmodule Quillon.Transforms do
  alias Quillon.Normalizer

  # find_node_at_offset/2, total_length/1 and add_mark_in_range/4 are private
  # helpers that are not shown in this excerpt.

  @doc """
  Apply a mark to text in a paragraph at the given offsets.
  Splits text nodes at boundaries, applies the mark, then normalizes.
  """
  def apply_mark({:paragraph, attrs, children}, start_offset, end_offset, mark) do
    new_children =
      children
      |> split_at_offset(end_offset)    # Split at END first (preserves start offset)
      |> split_at_offset(start_offset)  # Then split at START
      |> add_mark_in_range(start_offset, end_offset, mark)
      |> Normalizer.normalize()         # normalize/1 lives in Quillon.Normalizer

    {:paragraph, attrs, new_children}
  end

  defp split_at_offset(nodes, offset) do
    {before, at_offset, after_nodes} = find_node_at_offset(nodes, offset)

    case at_offset do
      nil ->
        nodes

      {:text, %{text: text, marks: marks}, []} ->
        split_pos = offset - total_length(before)

        if split_pos == 0 or split_pos == String.length(text) do
          nodes  # No split needed at boundary
        else
          left_text = String.slice(text, 0, split_pos)
          right_text = String.slice(text, split_pos..-1//1)

          left_node = {:text, %{text: left_text, marks: marks}, []}
          right_node = {:text, %{text: right_text, marks: marks}, []}

          before ++ [left_node, right_node] ++ after_nodes
        end
    end
  end
end
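
Since the helpers used above are not shown, here is a self-contained sketch of the splitting step alone. It is a hypothetical variant (SplitSketch is not a Quillon module) that tracks the number of characters consumed while walking the nodes, and it assumes the list contains only text nodes.

```elixir
defmodule SplitSketch do
  # Split the text node that contains `offset`, measured in characters from
  # the start of the paragraph. Offsets on a node boundary need no split.
  def split_at_offset(nodes, offset) do
    {result, _consumed} =
      Enum.flat_map_reduce(nodes, 0, fn {:text, %{text: text} = attrs, []} = node, consumed ->
        len = String.length(text)
        local = offset - consumed

        if local > 0 and local < len do
          {left, right} = String.split_at(text, local)
          # Both halves keep the marks of the original node.
          halves = [{:text, %{attrs | text: left}, []}, {:text, %{attrs | text: right}, []}]
          {halves, consumed + len}
        else
          {[node], consumed + len}
        end
      end)

    result
  end
end

nodes = [{:text, %{text: "Hello world", marks: []}, []}]

# Isolate the characters at offsets 6..8 ("wo")
nodes
|> SplitSketch.split_at_offset(8)
|> SplitSketch.split_at_offset(6)
# => [
#      {:text, %{text: "Hello ", marks: []}, []},
#      {:text, %{text: "wo", marks: []}, []},
#      {:text, %{text: "rld", marks: []}, []}
#    ]
```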

Normalization Algorithm

After every edit, normalize the paragraph to merge adjacent text nodes with identical marks:

defmodule Quillon.Normalizer do
  @doc """
  Normalize paragraph content by:
  1. Removing empty text nodes
  2. Merging adjacent text nodes with identical marks
  """
  def normalize(children) when is_list(children) do
    children
    |> Enum.reject(&empty_text_node?/1)
    |> merge_adjacent()
  end

  defp empty_text_node?({:text, %{text: ""}, []}), do: true
  defp empty_text_node?(_), do: false

  defp merge_adjacent([]), do: []
  defp merge_adjacent([node]), do: [node]
  defp merge_adjacent([{:text, a1, []}, {:text, a2, []} | rest]) do
    if marks_equal?(a1.marks, a2.marks) do
      # Merge: combine text, keep marks
      merged = {:text, %{text: a1.text <> a2.text, marks: a1.marks}, []}
      merge_adjacent([merged | rest])
    else
      [{:text, a1, []} | merge_adjacent([{:text, a2, []} | rest])]
    end
  end
  defp merge_adjacent([node | rest]), do: [node | merge_adjacent(rest)]

  @doc """
  Compare two mark lists for equality, ignoring order.
  Both lists are sorted into a canonical order before being compared,
  and nil is treated as an empty list.
  """
  def marks_equal?(marks1, marks2) do
    sort_marks(marks1 || []) == sort_marks(marks2 || [])
  end

  @mark_priority %{bold: 0, italic: 1, underline: 2, strike: 3, code: 4,
                   subscript: 5, superscript: 6, link: 7, highlight: 8}

  defp sort_marks(marks) do
    Enum.sort_by(marks, fn
      m when is_atom(m) -> {@mark_priority[m] || 99, to_string(m)}
      {type, _attrs} -> {@mark_priority[type] || 99, to_string(type)}
    end)
  end
end
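
To show the effect of normalization on concrete input, the following is a compact stand-alone variant. It is a sketch under different assumptions from the module above: MergeSketch is hypothetical, and it compares marks as lists sorted by Erlang term order rather than by the priority table.

```elixir
defmodule MergeSketch do
  def normalize(children) do
    children
    |> Enum.reject(&match?({:text, %{text: ""}, []}, &1))
    |> Enum.reduce([], &merge/2)
    |> Enum.reverse()
  end

  # The accumulator is built in reverse, so its head is the previous node.
  defp merge({:text, b, []}, [{:text, a, []} | rest]) do
    if Enum.sort(a.marks) == Enum.sort(b.marks) do
      [{:text, %{a | text: a.text <> b.text}, []} | rest]
    else
      [{:text, b, []}, {:text, a, []} | rest]
    end
  end

  defp merge(node, acc), do: [node | acc]
end

MergeSketch.normalize([
  {:text, %{text: "Hello ", marks: []}, []},
  {:text, %{text: "", marks: [:bold]}, []},
  {:text, %{text: "wor", marks: [:bold]}, []},
  {:text, %{text: "ld", marks: [:bold]}, []}
])
# => [
#      {:text, %{text: "Hello ", marks: []}, []},
#      {:text, %{text: "world", marks: [:bold]}, []}
#    ]
```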

Toggle Mark Command

A high-level command that checks whether the mark is already active on the selection and toggles it accordingly:

defmodule Quillon.Commands do
  alias Quillon.{Transforms, Normalizer}

  def toggle_mark(paragraph, {start_offset, end_offset}, mark) do
    if selection_has_mark?(paragraph, start_offset, end_offset, mark) do
      remove_mark(paragraph, start_offset, end_offset, mark)
    else
      Transforms.apply_mark(paragraph, start_offset, end_offset, mark)
    end
  end

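  defp selection_has_mark?({:paragraph, _attrs, children}, start_off, end_off, mark) do
    # Compare by mark type so tuple marks such as {:link, attrs} are matched too
    target = mark_type(mark)
    nodes = nodes_in_range(children, start_off, end_off)

    # Active only if the range is non-empty and ALL text in it has the mark
    nodes != [] and
      Enum.all?(nodes, fn {:text, %{marks: marks}, []} ->
        target in Enum.map(marks, &mark_type/1)
      end)
  end

  # Marks are either bare atoms or {type, attrs} tuples
  defp mark_type(mark) when is_atom(mark), do: mark
  defp mark_type({type, _attrs}), do: type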
end
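
The toggle decision can be illustrated on a single text node, leaving range handling aside. This is a minimal sketch with hypothetical names (ToggleSketch, toggle/2); it assumes a mark is considered present when a mark of the same type exists, regardless of its attributes.

```elixir
defmodule ToggleSketch do
  def toggle({:text, %{marks: marks} = attrs, []}, mark) do
    type = mark_type(mark)

    new_marks =
      if Enum.any?(marks, &(mark_type(&1) == type)) do
        # Already present: remove every mark of that type.
        Enum.reject(marks, &(mark_type(&1) == type))
      else
        marks ++ [mark]
      end

    {:text, %{attrs | marks: new_marks}, []}
  end

  defp mark_type({type, _attrs}), do: type
  defp mark_type(type) when is_atom(type), do: type
end

node = {:text, %{text: "site", marks: [:bold]}, []}

linked = ToggleSketch.toggle(node, {:link, %{href: "/info"}})
# => {:text, %{text: "site", marks: [:bold, {:link, %{href: "/info"}}]}, []}

ToggleSketch.toggle(linked, {:link, %{href: "/other"}})
# => {:text, %{text: "site", marks: [:bold]}, []}
```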

Schema Validation

Schema validation ensures documents conform to valid structures, similar to ProseMirror's schema system.

Quillon.validate(doc)   # Returns {:ok, doc} or {:error, errors}
Quillon.validate!(doc)  # Returns doc or raises ValidationError

Groups

Groups simplify content rules by categorizing node types:

@groups %{
  # Block-level content
  block: [
    :paragraph, :heading, :blockquote, :callout, :code_block,
    :image, :video, :bullet_list, :ordered_list, :table, :divider
  ],

  # Inline content (text with marks)
  inline: [:text],

  # List items
  list_content: [:list_item]
}

Content Expressions

ProseMirror-style content expressions for declarative rules:

Expression	Meaning
`"block+"`	One or more block nodes
`"block*"`	Zero or more block nodes
`"inline*"`	Zero or more inline nodes (text)
`"paragraph"`	Exactly one paragraph
`"(paragraph \| heading)+"`	One or more paragraphs or headings
`"paragraph block*"`	One paragraph followed by zero or more blocks
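
For the simplest expressions in the table (a single name with an optional quantifier), matching reduces to a count check plus a type check. The sketch below covers only that subset; ExprSketch and matches?/2 are hypothetical, groups are not expanded, and sequences or alternatives raise instead of being parsed.

```elixir
defmodule ExprSketch do
  # `child_types` is the list of node types of the children, in order.
  def matches?(expression, child_types) do
    case Regex.run(~r/^([a-z_]+)([+*?]?)$/, expression) do
      [_, name, quantifier] ->
        # Expressions come from the schema author, so String.to_atom/1 is acceptable here.
        type = String.to_atom(name)
        count_ok?(quantifier, length(child_types)) and Enum.all?(child_types, &(&1 == type))

      nil ->
        raise ArgumentError, "unsupported expression: #{inspect(expression)}"
    end
  end

  defp count_ok?("", n), do: n == 1
  defp count_ok?("+", n), do: n >= 1
  defp count_ok?("*", _n), do: true
  defp count_ok?("?", n), do: n <= 1
end

ExprSketch.matches?("paragraph+", [:paragraph, :paragraph])
# => true
ExprSketch.matches?("paragraph+", [])
# => false
ExprSketch.matches?("paragraph", [:paragraph, :heading])
# => false
```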

Node Schema

@node_schema %{
  # Root type
  document: %{
    content: "block*",
    attrs: [:id, :name]
  },

  # Text container blocks
  paragraph: %{
    content: "inline*",
    group: :block,
    marks: :all
  },
  heading: %{
    content: "inline*",
    group: :block,
    marks: [:bold, :italic, :underline, :strike, :link],
    attrs: [:level]
  },
  blockquote: %{
    content: "paragraph+",
    group: :block,
    attrs: [:citation]
  },
  callout: %{
    content: "paragraph+",
    group: :block,
    attrs: [:type, :title]
  },
  code_block: %{
    content: nil,
    group: :block,
    marks: [],
    attrs: [:code, :language]
  },
  divider: %{
    content: nil,
    group: :block,
    attrs: [:style]
  },

  # Lists
  bullet_list: %{
    content: "list_item+",
    group: :block
  },
  ordered_list: %{
    content: "list_item+",
    group: :block,
    attrs: [:start]
  },
  list_item: %{
    content: "paragraph (bullet_list | ordered_list)?",
    marks: :parent
  },

  # Tables
  table: %{
    content: "table_row+",
    group: :block
  },
  table_row: %{
    content: "table_cell+",
    attrs: [:header]
  },
  table_cell: %{
    content: "paragraph+",
    marks: :all,
    attrs: [:colspan, :rowspan]
  },

  # Media
  image: %{
    content: nil,
    group: :block,
    attrs: [:src, :alt, :caption, :width],
    required_attrs: [:src]
  },
  video: %{
    content: nil,
    group: :block,
    attrs: [:src, :poster],
    required_attrs: [:src]
  },

  # Inline text node
  text: %{
    content: nil,
    group: :inline,
    attrs: [:text, :marks]
  }
}

Mark Schema

Defines what marks exist and their behavior:

@mark_schema %{
  # Simple formatting marks
  bold: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: []
  },
  italic: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: []
  },
  underline: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: []
  },
  strike: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: []
  },
  code: %{
    inclusive: false,
    keep_on_split: false,
    excludes: [:bold, :italic, :underline, :strike, :link],  # code is exclusive
    attrs: []
  },
  subscript: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [:superscript],  # can't be both
    attrs: []
  },
  superscript: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [:subscript],
    attrs: []
  },

  # Marks with attributes
  link: %{
    inclusive: false,  # typing at end doesn't extend link
    keep_on_split: false,
    excludes: [],
    attrs: [:href, :title, :target],
    required_attrs: [:href]
  },
  highlight: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: [:color],
    default_attrs: %{color: "yellow"}
  },
  font_color: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [],
    attrs: [:color],
    required_attrs: [:color]
  },
  mention: %{
    inclusive: false,
    keep_on_split: false,
    excludes: [],
    attrs: [:id, :type, :label],
    required_attrs: [:id, :type]
  }
}

Mark Allowance per Node

Some nodes restrict which marks are allowed:

def allowed_marks(node_type) do
  case @node_schema[node_type][:marks] do
    :all -> Map.keys(@mark_schema)
    :parent -> :inherit  # look up to parent
    nil -> []  # no marks allowed
    list when is_list(list) -> list
  end
end

# Examples:
# allowed_marks(:paragraph)  => all marks
# allowed_marks(:heading)    => [:bold, :italic, :underline, :strike, :link]
# allowed_marks(:code_block) => []

Validation Rules

defmodule Quillon.Schema do
  @moduledoc """
  Schema-based validation for document AST.
  Uses content expressions and mark schemas like ProseMirror.
  """

  def valid_content?(parent_type, children) do
    expression = @node_schema[parent_type][:content]
    matches_expression?(children, expression)
  end

  def matches_expression?(children, expression) do
    case expression do
      nil -> children == []
      "block+" -> length(children) >= 1 and Enum.all?(children, &in_group?(&1, :block))
      "block*" -> Enum.all?(children, &in_group?(&1, :block))
      "inline*" -> Enum.all?(children, &in_group?(&1, :inline))
      "list_item+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :list_item))
      "paragraph+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :paragraph))
      "table_row+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :table_row))
      "table_cell+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :table_cell))
      _ -> true  # Sequences and alternatives are not checked here; they need an expression parser
    end
  end

  defp in_group?({type, _, _}, group), do: type in (@groups[group] || [])
  defp is_type?({type, _, _}, expected), do: type == expected

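  def valid_marks?(parent_type, marks) do
    case allowed_marks(parent_type) do
      # :inherit means the parent node decides; the caller must resolve it
      :inherit -> true
      allowed -> Enum.all?(marks, &(mark_type(&1) in allowed))
    end
  end

  def mark_allowed?(existing_marks, new_mark) do
    new_type = mark_type(new_mark)
    existing_types = Enum.map(existing_marks, &mark_type/1)

    # Exclusion is checked in both directions: :code excludes :bold,
    # so :bold must also be rejected on text that already has :code.
    excluded_by_new? = Enum.any?(existing_types, &(&1 in excludes(new_type)))
    excluded_by_existing? = Enum.any?(existing_types, &(new_type in excludes(&1)))

    not (excluded_by_new? or excluded_by_existing?)
  end

  defp excludes(type), do: @mark_schema[type][:excludes] || []

  # Marks are either bare atoms or {type, attrs} tuples
  defp mark_type(mark) when is_atom(mark), do: mark
  defp mark_type({type, _attrs}), do: type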

  def validate({type, attrs, children} = node) do
    with :ok <- validate_type_exists(type),
         :ok <- validate_attrs(type, attrs),
         :ok <- validate_content(type, children),
         :ok <- validate_children(children) do
      {:ok, node}
    end
  end
end

Validation Errors

# Invalid: list_item outside list
{:paragraph, %{}, [
  {:list_item, %{}, [...]}
]}
# => {:error, "Invalid content for paragraph: expected inline*, got list_item"}

# Invalid: code mark in heading
{:heading, %{level: 1}, [
  {:text, %{text: "Hello", marks: [:code]}, []}
]}
# => {:error, "Mark :code not allowed in heading"}

# Invalid: conflicting marks (subscript + superscript)
{:text, %{text: "x", marks: [:subscript, :superscript]}, []}
# => {:error, "Mark :superscript conflicts with :subscript"}

# Invalid: link without href
{:text, %{text: "click", marks: [{:link, %{title: "Link"}}]}, []}
# => {:error, "Mark :link requires attr :href"}
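
The last error above can be produced by a small required-attribute check. The sketch below is hypothetical (MarkCheckSketch is not a Quillon module) and uses a reduced copy of the required_attrs entries from the mark schema; it reports only the first missing attribute.

```elixir
defmodule MarkCheckSketch do
  # Reduced copy of required_attrs from the mark schema.
  @required %{link: [:href], font_color: [:color], mention: [:id, :type]}

  # A bare atom mark carries no attributes.
  def check(mark) when is_atom(mark), do: check({mark, %{}})

  def check({type, attrs}) do
    missing = Enum.reject(Map.get(@required, type, []), &Map.has_key?(attrs, &1))

    case missing do
      [] -> :ok
      [attr | _] -> {:error, "Mark #{inspect(type)} requires attr #{inspect(attr)}"}
    end
  end
end

MarkCheckSketch.check({:link, %{title: "Link"}})
# => {:error, "Mark :link requires attr :href"}
MarkCheckSketch.check({:link, %{href: "/info"}})
# => :ok
```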

JSON Serialization

{
  "type": "document",
  "attrs": { "id": "doc_123", "name": "My Document" },
  "children": [
    {
      "type": "heading",
      "attrs": { "level": 1 },
      "children": [
        { "type": "text", "attrs": { "text": "Hello", "marks": [] }, "children": [] }
      ]
    },
    {
      "type": "paragraph",
      "attrs": {},
      "children": [
        { "type": "text", "attrs": { "text": "Hello ", "marks": [] }, "children": [] },
        { "type": "text", "attrs": { "text": "world", "marks": ["bold"] }, "children": [] }
      ]
    }
  ]
}

Mark serialization:

Simple marks: "bold", "italic", "code"
Marks with attrs: { "type": "link", "attrs": { "href": "..." } }
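
The two mark shapes can be encoded and decoded as sketched below. The module MarkJsonSketch is hypothetical, and attribute values are assumed to be JSON-friendly already (strings or numbers).

```elixir
defmodule MarkJsonSketch do
  # Simple marks become strings.
  def encode(mark) when is_atom(mark), do: Atom.to_string(mark)

  # Marks with attributes become objects with "type" and "attrs".
  def encode({type, attrs}) when is_atom(type) and is_map(attrs) do
    %{
      "type" => Atom.to_string(type),
      "attrs" => Map.new(attrs, fn {k, v} -> {Atom.to_string(k), v} end)
    }
  end

  # String.to_existing_atom/1 refuses names that are not already known atoms.
  def decode(name) when is_binary(name), do: String.to_existing_atom(name)

  def decode(%{"type" => type, "attrs" => attrs}) do
    {String.to_existing_atom(type),
     Map.new(attrs, fn {k, v} -> {String.to_existing_atom(k), v} end)}
  end
end

marks = [:bold, {:link, %{href: "https://example.com"}}]

encoded = Enum.map(marks, &MarkJsonSketch.encode/1)
# => ["bold", %{"type" => "link", "attrs" => %{"href" => "https://example.com"}}]

Enum.map(encoded, &MarkJsonSketch.decode/1)
# => [:bold, {:link, %{href: "https://example.com"}}]
```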

Comparison with JS Editors

Feature	Quillon	ProseMirror/Tiptap	Lexical	Slate
Structure	Elixir tuples	JS objects	JS classes	JS objects
Immutable	Yes (language native)	Yes (persistent document)	Yes	Yes
Block types	Schema-defined	Schema-defined	Node classes	Element types
Inline formatting	Structured nodes + marks	Mark system	Format states	Leaf nodes
Collaboration	CRDT-ready (separate pkg)	Yjs plugin	Yjs plugin	Yjs plugin
Server-side	Native Elixir	N/A	N/A	N/A

Quillon advantages:

Native Elixir immutability (no separate immutability library needed)
Same structure on client and server
JSON serialization built-in
Server-side rendering without JS dependency
Framework agnostic (works without Phoenix/LiveView)
