Quillon uses an AST (Abstract Syntax Tree) to represent document structures. This guide documents the architecture and extension strategies for supporting rich content elements.
Current Architecture
AST Structure
Documents are represented as nested Elixir tuples following a strict grammar:
{:type, attrs, children}
Example:
{:document, %{id: "doc_123", name: "Welcome"},
[
{:heading, %{level: 1}, [
{:text, %{text: "Hello ", marks: []}, []},
{:text, %{text: "World", marks: [:bold]}, []}
]},
{:paragraph, %{}, [
{:text, %{text: "This is a ", marks: []}, []},
{:text, %{text: "rich text", marks: [:bold, :italic]}, []},
{:text, %{text: " editor.", marks: []}, []}
]}
]}
Element Categories
| Category | Types | Purpose |
|---|---|---|
| Block | heading, paragraph, blockquote, callout, code_block, divider, image, video, embed, bullet_list, ordered_list, table | Vertical stacking elements |
| Inline | text | Text content with marks |
| Container | document, list_item, table_row, table_cell | Structural containers |
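Because every node, whatever its category, shares the `{type, attrs, children}` shape, a single recursive function can walk a whole document. The following is a minimal sketch, not part of Quillon's API: the module name `PlainTextSketch` and the function `plain_text/1` are hypothetical and only illustrate the traversal.

```elixir
defmodule PlainTextSketch do
  @doc "Concatenates the text of every :text node, depth first."
  def plain_text({:text, %{text: text}, _children}), do: text

  # Any other node contributes only what its children contain.
  def plain_text({_type, _attrs, children}) do
    Enum.map_join(children, &plain_text/1)
  end
end

doc =
  {:document, %{id: "doc_123", name: "Welcome"},
   [
     {:heading, %{level: 1},
      [
        {:text, %{text: "Hello ", marks: []}, []},
        {:text, %{text: "World", marks: [:bold]}, []}
      ]}
   ]}

PlainTextSketch.plain_text(doc)
# => "Hello World"
```

Leaf blocks without text children, such as a divider, simply yield an empty string.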
Type Definitions
@block_types [
:heading, :paragraph, :blockquote, :callout, :code_block, :divider,
:image, :video, :embed,
:bullet_list, :ordered_list,
:table
]
@inline_types [:text]
@container_types [:document, :list_item, :table_row, :table_cell]
Key Operations
All operations are immutable - they return a new AST rather than modifying in place.
# Path-based access (indices into children)
Quillon.get(doc, [0, 2]) # Get node at path
Quillon.update(doc, [0, 2], fun) # Update node at path by applying fun
Quillon.insert(doc, [0, 3], node) # Insert at position
Quillon.delete(doc, [0, 2]) # Delete node
Quillon.reorder(doc, [0], ids) # Reorder children by ID list
Quillon.move(doc, [0, 1], [1, 0]) # Move between positions
# ID-based access
Quillon.find_path(doc, "node_id") # Find path to node
Quillon.get_by_id(doc, "id") # Get node by ID
Quillon.update_by_id(doc, "id", fun) # Update by ID by applying fun
# Factory functions
Quillon.new(:document, %{name: "My Doc"})
Quillon.new(:heading, %{level: 1}, "Title")
Quillon.new(:paragraph, "Some text content")
JSON Serialization
# AST → JSON
Quillon.to_json({:paragraph, %{}, [{:text, %{text: "Hello", marks: [:bold]}, []}]})
# => %{"type" => "paragraph", "attrs" => %{}, "children" => [
# %{"type" => "text", "attrs" => %{"text" => "Hello", "marks" => ["bold"]}, "children" => []}
# ]}
# JSON → AST
Quillon.from_json(%{"type" => "paragraph", "attrs" => %{}, "children" => [...]})
# => {:paragraph, %{}, [...]}
Block Elements
Heading
{:heading, %{level: 2}, [
{:text, %{text: "Section Title", marks: []}, []}
]}
| Attribute | Type | Default | Description |
|---|---|---|---|
| level | integer | 2 | Heading level (1-6) |
Children: inline content (text nodes with marks)
Paragraph
{:paragraph, %{}, [
{:text, %{text: "Hello ", marks: []}, []},
{:text, %{text: "world", marks: [:bold]}, []}
]}
Children: inline content (text nodes with marks)
Divider
{:divider, %{style: :solid}, []}
| Attribute | Type | Default | Description |
|---|---|---|---|
| style | atom | :solid | Line style (:solid, :dashed, :dotted) |
Design Goals
- Pure Elixir - No JavaScript dependencies in the core library
- Immutable operations - All transforms return new AST, never mutate
- Structured inline content - Text nodes with marks, not markdown strings
- Schema validation - Content expressions define valid structures
- Framework agnostic - Core library works without Phoenix/LiveView
- CRDT-ready - Structure supports collaborative editing (via separate package)
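To make the immutable-operations goal concrete, here is a minimal sketch of path-based access over the tuple AST. It assumes a path is a list of child indices counted from the node it is applied to; `PathSketch`, `get/2`, and `update/3` are hypothetical names and may not match how Quillon implements its own functions.

```elixir
defmodule PathSketch do
  @doc "Returns the node at `path`, a list of child indices."
  def get(node, []), do: node

  def get({_type, _attrs, children}, [index | rest]) do
    children |> Enum.at(index) |> get(rest)
  end

  @doc "Returns a new tree with `fun` applied to the node at `path`."
  def update(node, [], fun), do: fun.(node)

  def update({type, attrs, children}, [index | rest], fun) do
    # Only the nodes along the path are rebuilt; siblings are reused as-is.
    {type, attrs, List.update_at(children, index, &update(&1, rest, fun))}
  end
end

doc = {:document, %{}, [{:paragraph, %{}, [{:text, %{text: "hi", marks: []}, []}]}]}

updated =
  PathSketch.update(doc, [0, 0], fn {:text, attrs, []} ->
    {:text, %{attrs | marks: [:bold]}, []}
  end)

# `updated` carries the bold mark, while `doc` is unchanged.
PathSketch.get(updated, [0, 0])
PathSketch.get(doc, [0, 0])
```

The sketch does no bounds checking: an out-of-range index in the last position of a path returns `nil` from `get/2`, and one earlier in the path raises.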
Block Types Reference
Block elements occupy their own vertical space and stack one below another. Inline elements, by contrast, flow within a line of text.
Block vs Inline:
┌──────────────────────────────────────┐
│ {:heading, ...} ← block │
├──────────────────────────────────────┤
│ {:paragraph, %{}, [ │
│ {:text, %{text: "Hello "}, []} │ ← inline
│ {:text, %{text: "world", │
│ marks: [:bold]}, []} │ ← inline
│ ]} ← block │
├──────────────────────────────────────┤
│ {:image, ...} ← block │
└──────────────────────────────────────┘
Block Type Categories
@block_types %{
# Text blocks
text: [:heading, :paragraph, :blockquote, :callout, :code_block],
# Media
media: [:image, :video, :embed],
# Lists
list: [:bullet_list, :ordered_list],
# Tables
table: [:table],
# Structural
structural: [:divider]
}
Blockquote
{:blockquote, %{citation: "Shakespeare"}, [
{:paragraph, %{}, [
{:text, %{text: "To be or not to be", marks: []}, []}
]}
]}
| Attribute | Type | Default | Description |
|---|---|---|---|
| citation | string | nil | Attribution |
Children: paragraph blocks
Callout
{:callout, %{type: :info, title: "Note"}, [
{:paragraph, %{}, [
{:text, %{text: "Important information", marks: []}, []}
]}
]}
| Attribute | Type | Default | Description |
|---|---|---|---|
| type | atom | :info | Callout type (:info, :warning, :success, :error) |
| title | string | nil | Optional title |
Children: paragraph blocks
Code Block
{:code_block, %{code: "def hello, do: :world", language: "elixir"}, []}
| Attribute | Type | Default | Description |
|---|---|---|---|
| code | string | required | Code content |
| language | string | nil | Syntax highlighting language |
Children: none (code stored in attrs)
Image
{:image, %{src: "/uploads/photo.jpg", alt: "Photo", caption: "A nice photo"}, []}
| Attribute | Type | Default | Description |
|---|---|---|---|
| src | string | required | Image URL |
| alt | string | "" | Alt text |
| caption | string | nil | Optional caption |
| width | integer | nil | Width in pixels |
Children: none
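The attribute table above can be read as a set of defaults plus one required key. A minimal sketch of a constructor that applies those defaults might look like this; `ImageSketch.new/1` is a hypothetical helper for illustration, not Quillon's factory function.

```elixir
defmodule ImageSketch do
  # Defaults taken from the attribute table; :src has none because it is required.
  @defaults %{alt: "", caption: nil, width: nil}

  @doc "Builds an image node, filling in defaults for missing optional attrs."
  def new(%{src: src} = attrs) when is_binary(src) do
    {:ok, {:image, Map.merge(@defaults, attrs), []}}
  end

  def new(_attrs), do: {:error, "image requires a string :src"}
end

ImageSketch.new(%{src: "/uploads/photo.jpg", alt: "Photo"})
ImageSketch.new(%{alt: "Photo"})
# => {:error, "image requires a string :src"}
```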
Lists
{:bullet_list, %{}, [
{:list_item, %{}, [
{:paragraph, %{}, [{:text, %{text: "First item", marks: []}, []}]}
]},
{:list_item, %{}, [
{:paragraph, %{}, [{:text, %{text: "Second item", marks: []}, []}]}
]}
]}
{:ordered_list, %{start: 1}, [
{:list_item, %{}, [
{:paragraph, %{}, [{:text, %{text: "Step one", marks: []}, []}]}
]},
{:list_item, %{}, [
{:paragraph, %{}, [{:text, %{text: "Step two", marks: []}, []}]}
]}
]}
Tables
{:table, %{}, [
{:table_row, %{header: true}, [
{:table_cell, %{}, [
{:paragraph, %{}, [{:text, %{text: "Name", marks: []}, []}]}
]},
{:table_cell, %{}, [
{:paragraph, %{}, [{:text, %{text: "Email", marks: []}, []}]}
]}
]},
{:table_row, %{}, [
{:table_cell, %{}, [
{:paragraph, %{}, [{:text, %{text: "John", marks: []}, []}]}
]},
{:table_cell, %{}, [
{:paragraph, %{}, [{:text, %{text: "john@example.com", marks: []}, []}]}
]}
]}
]}
Inline Content (Rich Text)
Inline content uses text nodes with marks, similar to ProseMirror/Tiptap, Lexical, and Slate.
Text Node Structure
{:text, %{text: "content", marks: [mark1, mark2, ...]}, []}
All inline content uses the :text node type. Formatting (including links) is applied via marks.
Example
# "Hello world! Visit our site for more."
# ^^^^^ bold
# ^^^^^^^^^^ link
{:paragraph, %{}, [
{:text, %{text: "Hello ", marks: []}, []},
{:text, %{text: "world", marks: [:bold]}, []},
{:text, %{text: "! Visit ", marks: []}, []},
{:text, %{text: "our site", marks: [{:link, %{href: "https://example.com"}}]}, []},
{:text, %{text: " for more.", marks: []}, []}
]}
Mark Types
Marks can be simple atoms or tuples with attributes:
# Simple marks (no attributes needed)
:bold
:italic
:underline
:strike
:code
:subscript
:superscript
# Marks with attributes
{:link, %{href: "https://example.com", title: "Link title", target: "_blank"}}
{:highlight, %{color: "yellow"}}
{:font_color, %{color: "#FF5500"}}
{:mention, %{id: "user_123", type: :user, label: "@john"}}
Mark Reference
| Mark | Type | Attributes | Description |
|---|---|---|---|
| :bold | atom | - | Bold text |
| :italic | atom | - | Italic text |
| :underline | atom | - | Underlined text |
| :strike | atom | - | Strikethrough |
| :code | atom | - | Inline code (monospace) |
| :subscript | atom | - | Subscript text |
| :superscript | atom | - | Superscript text |
| :link | tuple | href, title, target | Hyperlink |
| :highlight | tuple | color | Background highlight |
| :font_color | tuple | color | Text color |
| :mention | tuple | id, type, label | User/item mention |
Complex Formatted Text
# "Hello world! Click here for more info."
# ^^^^^ bold
# ^^^^^^^^^^ bold + italic + link
{:paragraph, %{}, [
{:text, %{text: "Hello ", marks: []}, []},
{:text, %{text: "world", marks: [:bold]}, []},
{:text, %{text: "! ", marks: []}, []},
{:text, %{text: "Click here", marks: [:bold, :italic, {:link, %{href: "/info"}}]}, []},
{:text, %{text: " for more info.", marks: []}, []}
]}
Why Structured Text (Not Markdown)
- Programmatic manipulation - add/remove formatting without parsing strings
- Collaborative editing - CRDT can track changes to individual text nodes
- Validation - enforce allowed marks per context
- Rendering flexibility - same structure renders to HTML, plain text, or other formats
- Cursor positioning - track cursor position within rich text
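As an illustration of the first point, formatting can be added or removed with ordinary list operations and no string parsing. The sketch below works on whole paragraphs only; `MarkSketch` and its functions are hypothetical names, not Quillon's API.

```elixir
defmodule MarkSketch do
  @doc "Adds `mark` to every text node in the paragraph that lacks it."
  def add_mark({:paragraph, attrs, children}, mark) do
    {:paragraph, attrs, Enum.map(children, &put_mark(&1, mark))}
  end

  @doc "Removes every mark of the given type from the paragraph's text nodes."
  def remove_mark({:paragraph, attrs, children}, type) do
    {:paragraph, attrs, Enum.map(children, &drop_mark(&1, type))}
  end

  defp put_mark({:text, %{marks: marks} = attrs, []}, mark) do
    if mark in marks do
      {:text, attrs, []}
    else
      {:text, %{attrs | marks: marks ++ [mark]}, []}
    end
  end

  defp drop_mark({:text, %{marks: marks} = attrs, []}, type) do
    {:text, %{attrs | marks: Enum.reject(marks, &(mark_type(&1) == type))}, []}
  end

  # Marks are either bare atoms or {type, attrs} tuples.
  defp mark_type(mark) when is_atom(mark), do: mark
  defp mark_type({type, _attrs}), do: type
end

paragraph =
  {:paragraph, %{},
   [
     {:text, %{text: "Hello ", marks: []}, []},
     {:text, %{text: "world", marks: [:bold]}, []}
   ]}

MarkSketch.add_mark(paragraph, :bold)
MarkSketch.remove_mark(paragraph, :bold)
```

After either call the two text nodes carry identical marks, so a normalization pass would typically merge them into one node.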
Mark Configuration
Each mark type has configuration that controls its behavior:
@mark_config %{
bold: %{
inclusive: true, # New text at mark boundary inherits the mark
keep_on_split: true, # Mark persists when pressing Enter
excludes: [] # No conflicts with other marks
},
italic: %{
inclusive: true,
keep_on_split: true,
excludes: []
},
code: %{
inclusive: false, # New text doesn't inherit code formatting
keep_on_split: false,
excludes: [:bold, :italic, :underline] # Code excludes other formatting
},
link: %{
inclusive: false, # New text doesn't become part of link
keep_on_split: false, # Link doesn't span newlines
excludes: [],
attrs: [:href, :title, :target]
},
highlight: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: [:color]
}
}
| Property | Description |
|---|---|
| inclusive | Whether new text typed at mark boundary gets the mark |
| keep_on_split | Whether mark persists when node is split (e.g., pressing Enter) |
| excludes | List of marks that cannot coexist with this mark |
| attrs | List of attributes for marks with data |
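The `inclusive` property is easiest to see in a small example. The sketch below assumes that text typed at the end of a node inherits only that node's inclusive marks; `InclusiveSketch` and `inherited_marks/1` are hypothetical, and the configuration is a trimmed copy of the one above.

```elixir
defmodule InclusiveSketch do
  # Trimmed copy of the mark configuration; only the key this sketch reads.
  @mark_config %{
    bold: %{inclusive: true},
    italic: %{inclusive: true},
    code: %{inclusive: false},
    link: %{inclusive: false}
  }

  @doc "Marks that text typed at the end of a node carrying `marks` would inherit."
  def inherited_marks(marks) do
    Enum.filter(marks, fn mark ->
      @mark_config[mark_type(mark)][:inclusive] == true
    end)
  end

  defp mark_type(mark) when is_atom(mark), do: mark
  defp mark_type({type, _attrs}), do: type
end

InclusiveSketch.inherited_marks([:bold, {:link, %{href: "https://example.com"}}])
# => [:bold]
```

Typing at the end of bold link text therefore continues the bold formatting but not the link. Marks missing from the configuration are treated as non-inclusive in this sketch.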
Text Transforms
Text Splitting Algorithm
When applying a mark to a selection, text nodes are split at the selection boundaries:
defmodule Quillon.Transforms do
@doc """
Apply a mark to text in a paragraph at the given offsets.
Splits text nodes at boundaries, applies the mark, then normalizes.
"""
def apply_mark({:paragraph, attrs, children}, start_offset, end_offset, mark) do
new_children =
children
|> split_at_offset(end_offset) # Split at END first (preserves start offset)
|> split_at_offset(start_offset) # Then split at START
|> add_mark_in_range(start_offset, end_offset, mark)
|> Quillon.Normalizer.normalize() # normalize/1 lives in Quillon.Normalizer
{:paragraph, attrs, new_children}
end
defp split_at_offset(nodes, offset) do
{before, at_offset, after_nodes} = find_node_at_offset(nodes, offset)
case at_offset do
nil ->
nodes
{:text, %{text: text, marks: marks}, []} ->
split_pos = offset - total_length(before)
if split_pos == 0 or split_pos == String.length(text) do
nodes # No split needed at boundary
else
left_text = String.slice(text, 0, split_pos)
right_text = String.slice(text, split_pos..-1//1)
left_node = {:text, %{text: left_text, marks: marks}, []}
right_node = {:text, %{text: right_text, marks: marks}, []}
before ++ [left_node, right_node] ++ after_nodes
end
end
end
end
Normalization Algorithm
After every edit, normalize the paragraph to merge adjacent text nodes with identical marks:
defmodule Quillon.Normalizer do
@doc """
Normalize paragraph content by:
1. Removing empty text nodes
2. Merging adjacent text nodes with identical marks
"""
def normalize(children) when is_list(children) do
children
|> Enum.reject(&empty_text_node?/1)
|> merge_adjacent()
end
defp empty_text_node?({:text, %{text: ""}, []}), do: true
defp empty_text_node?(_), do: false
defp merge_adjacent([]), do: []
defp merge_adjacent([node]), do: [node]
defp merge_adjacent([{:text, a1, []}, {:text, a2, []} | rest]) do
if marks_equal?(a1.marks, a2.marks) do
# Merge: combine text, keep marks
merged = {:text, %{text: a1.text <> a2.text, marks: a1.marks}, []}
merge_adjacent([merged | rest])
else
[{:text, a1, []} | merge_adjacent([{:text, a2, []} | rest])]
end
end
defp merge_adjacent([node | rest]), do: [node | merge_adjacent(rest)]
@doc """
Compare two mark lists for equality, ignoring order and text content.
Both lists are sorted by priority first so the comparison is reliable.
"""
def marks_equal?(marks1, marks2) do
sort_marks(marks1 || []) == sort_marks(marks2 || [])
end
@mark_priority %{bold: 0, italic: 1, underline: 2, strike: 3, code: 4,
subscript: 5, superscript: 6, link: 7, highlight: 8}
defp sort_marks(marks) do
Enum.sort_by(marks, fn
m when is_atom(m) -> {@mark_priority[m] || 99, to_string(m)}
{type, _attrs} -> {@mark_priority[type] || 99, to_string(type)}
end)
end
end
Toggle Mark Command
A high-level command that checks whether the mark is active across the whole selection and toggles it accordingly:
defmodule Quillon.Commands do
alias Quillon.{Transforms, Normalizer}
def toggle_mark(paragraph, {start_offset, end_offset}, mark) do
if selection_has_mark?(paragraph, start_offset, end_offset, mark) do
remove_mark(paragraph, start_offset, end_offset, mark)
else
Transforms.apply_mark(paragraph, start_offset, end_offset, mark)
end
end
defp selection_has_mark?({:paragraph, _attrs, children}, start_off, end_off, mark) do
# Compare by mark type so attribute marks such as {:link, attrs} are detected too
[mark_type] = extract_mark_types([mark])
# Check if ALL text in range has the mark
children
|> nodes_in_range(start_off, end_off)
|> Enum.all?(fn {:text, %{marks: marks}, []} ->
mark_type in extract_mark_types(marks)
end)
end
defp extract_mark_types(marks) do
Enum.map(marks, fn
m when is_atom(m) -> m
{type, _attrs} -> type
end)
end
end
Schema Validation
Schema validation ensures documents conform to valid structures, similar to ProseMirror's schema system.
Quillon.validate(doc) # Returns {:ok, doc} or {:error, errors}
Quillon.validate!(doc) # Returns doc or raises ValidationError
Groups
Groups simplify content rules by categorizing node types:
@groups %{
# Block-level content
block: [
:paragraph, :heading, :blockquote, :callout, :code_block,
:image, :video, :bullet_list, :ordered_list, :table, :divider
],
# Inline content (text with marks)
inline: [:text],
# List items
list_content: [:list_item]
}
Content Expressions
ProseMirror-style content expressions for declarative rules:
| Expression | Meaning |
|---|---|
| "block+" | One or more block nodes |
| "block*" | Zero or more block nodes |
| "inline*" | Zero or more inline nodes (text) |
| "paragraph" | Exactly one paragraph |
| "(paragraph \| heading)+" | One or more paragraphs or headings |
| "paragraph block*" | One paragraph followed by zero or more blocks |
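One way to evaluate expressions like these is to translate them into a regular expression over the sequence of child type names. The sketch below takes that approach. It is an assumption for illustration, not how Quillon or ProseMirror actually match content: `ContentExprSketch` is a hypothetical module and its group table is deliberately small.

```elixir
defmodule ContentExprSketch do
  # Hypothetical group table for this sketch; a real schema would supply its own.
  @groups %{
    "block" => ~w(paragraph heading bullet_list ordered_list),
    "inline" => ~w(text)
  }

  @doc "Returns true when the children's types satisfy the content expression."
  def matches?(children, expression) do
    # Encode the child types as "paragraph;bullet_list;" and match the whole string.
    encoded = Enum.map_join(children, fn {type, _attrs, _children} -> "#{type};" end)
    Regex.match?(compile(expression), encoded)
  end

  defp compile(expression) do
    body =
      ~r/[a-z_]+|[()|+*?]/
      |> Regex.scan(expression)
      |> Enum.map_join(fn [token] -> translate(token) end)

    Regex.compile!("^" <> body <> "$")
  end

  # Parentheses become non-capturing groups; operators pass through unchanged.
  defp translate("("), do: "(?:"
  defp translate(token) when token in [")", "|", "+", "*", "?"], do: token

  # A name is either a group (expanded to its members) or a single node type.
  defp translate(name) do
    case Map.fetch(@groups, name) do
      {:ok, types} -> "(?:(?:" <> Enum.join(types, "|") <> ");)"
      :error -> "(?:" <> name <> ";)"
    end
  end
end

paragraph = {:paragraph, %{}, []}
list = {:bullet_list, %{}, []}

ContentExprSketch.matches?([paragraph, list], "paragraph (bullet_list | ordered_list)?")
# => true
ContentExprSketch.matches?([list], "paragraph (bullet_list | ordered_list)?")
# => false
```

Whitespace between tokens is simply skipped, which is what makes juxtaposition mean "followed by". The sketch recompiles the regex on every call; a real validator would more likely compile each expression once.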
Node Schema
@node_schema %{
# Root type
document: %{
content: "block*",
attrs: [:id, :name]
},
# Text container blocks
paragraph: %{
content: "inline*",
group: :block,
marks: :all
},
heading: %{
content: "inline*",
group: :block,
marks: [:bold, :italic, :underline, :strike, :link],
attrs: [:level]
},
blockquote: %{
content: "paragraph+",
group: :block,
attrs: [:citation]
},
callout: %{
content: "paragraph+",
group: :block,
attrs: [:type, :title]
},
code_block: %{
content: nil,
group: :block,
marks: [],
attrs: [:code, :language]
},
divider: %{
content: nil,
group: :block,
attrs: [:style]
},
# Lists
bullet_list: %{
content: "list_item+",
group: :block
},
ordered_list: %{
content: "list_item+",
group: :block,
attrs: [:start]
},
list_item: %{
content: "paragraph (bullet_list | ordered_list)?",
marks: :parent
},
# Tables
table: %{
content: "table_row+",
group: :block
},
table_row: %{
content: "table_cell+",
attrs: [:header]
},
table_cell: %{
content: "paragraph+",
marks: :all,
attrs: [:colspan, :rowspan]
},
# Media
image: %{
content: nil,
group: :block,
attrs: [:src, :alt, :caption, :width],
required_attrs: [:src]
},
video: %{
content: nil,
group: :block,
attrs: [:src, :poster],
required_attrs: [:src]
},
# Inline text node
text: %{
content: nil,
group: :inline,
attrs: [:text, :marks]
}
}
Mark Schema
Defines what marks exist and their behavior:
@mark_schema %{
# Simple formatting marks
bold: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: []
},
italic: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: []
},
underline: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: []
},
strike: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: []
},
code: %{
inclusive: false,
keep_on_split: false,
excludes: [:bold, :italic, :underline, :strike, :link], # code is exclusive
attrs: []
},
subscript: %{
inclusive: true,
keep_on_split: true,
excludes: [:superscript], # can't be both
attrs: []
},
superscript: %{
inclusive: true,
keep_on_split: true,
excludes: [:subscript],
attrs: []
},
# Marks with attributes
link: %{
inclusive: false, # typing at end doesn't extend link
keep_on_split: false,
excludes: [],
attrs: [:href, :title, :target],
required_attrs: [:href]
},
highlight: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: [:color],
default_attrs: %{color: "yellow"}
},
font_color: %{
inclusive: true,
keep_on_split: true,
excludes: [],
attrs: [:color],
required_attrs: [:color]
},
mention: %{
inclusive: false,
keep_on_split: false,
excludes: [],
attrs: [:id, :type, :label],
required_attrs: [:id, :type]
}
}
Mark Allowance per Node
Some nodes restrict which marks are allowed:
def allowed_marks(node_type) do
case @node_schema[node_type][:marks] do
:all -> Map.keys(@mark_schema)
:parent -> :inherit # look up to parent
nil -> [] # no marks allowed
list when is_list(list) -> list
end
end
# Examples:
# allowed_marks(:paragraph) => all marks
# allowed_marks(:heading) => [:bold, :italic, :underline, :strike, :link]
# allowed_marks(:code_block) => []
Validation Rules
defmodule Quillon.Schema do
@moduledoc """
Schema-based validation for document AST.
Uses content expressions and mark schemas like ProseMirror.
"""
def valid_content?(parent_type, children) do
expression = @node_schema[parent_type][:content]
matches_expression?(children, expression)
end
def matches_expression?(children, expression) do
case expression do
nil -> children == []
"block+" -> length(children) >= 1 and Enum.all?(children, &in_group?(&1, :block))
"block*" -> Enum.all?(children, &in_group?(&1, :block))
"inline*" -> Enum.all?(children, &in_group?(&1, :inline))
"list_item+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :list_item))
"paragraph+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :paragraph))
"table_row+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :table_row))
"table_cell+" -> length(children) >= 1 and Enum.all?(children, &is_type?(&1, :table_cell))
_ -> true # Complex expressions need parser
end
end
defp in_group?({type, _, _}, group), do: type in (@groups[group] || [])
defp is_type?({type, _, _}, expected), do: type == expected
def valid_marks?(parent_type, marks) do
allowed = allowed_marks(parent_type)
mark_types = Enum.map(marks, fn
m when is_atom(m) -> m
{type, _} -> type
end)
Enum.all?(mark_types, &(&1 in allowed))
end
def mark_allowed?(existing_marks, new_mark) do
new_type = case new_mark do
m when is_atom(m) -> m
{type, _} -> type
end
excludes = @mark_schema[new_type][:excludes] || []
existing_types = Enum.map(existing_marks, fn
m when is_atom(m) -> m
{type, _} -> type
end)
not Enum.any?(existing_types, &(&1 in excludes))
end
def validate({type, attrs, children} = node) do
with :ok <- validate_type_exists(type),
:ok <- validate_attrs(type, attrs),
:ok <- validate_content(type, children),
:ok <- validate_children(children) do
{:ok, node}
end
end
end
Validation Errors
# Invalid: list_item outside list
{:paragraph, %{}, [
{:list_item, %{}, [...]}
]}
# => {:error, "Invalid content for paragraph: expected inline*, got list_item"}
# Invalid: code mark in heading
{:heading, %{level: 1}, [
{:text, %{text: "Hello", marks: [:code]}, []}
]}
# => {:error, "Mark :code not allowed in heading"}
# Invalid: conflicting marks (subscript + superscript)
{:text, %{text: "x", marks: [:subscript, :superscript]}, []}
# => {:error, "Mark :superscript conflicts with :subscript"}
# Invalid: link without href
{:text, %{text: "click", marks: [{:link, %{title: "Link"}}]}, []}
# => {:error, "Mark :link requires attr :href"}
JSON Serialization
{
"type": "document",
"attrs": { "id": "doc_123", "name": "My Document" },
"children": [
{
"type": "heading",
"attrs": { "level": 1 },
"children": [
{ "type": "text", "attrs": { "text": "Hello", "marks": [] }, "children": [] }
]
},
{
"type": "paragraph",
"attrs": {},
"children": [
{ "type": "text", "attrs": { "text": "Hello ", "marks": [] }, "children": [] },
{ "type": "text", "attrs": { "text": "world", "marks": ["bold"] }, "children": [] }
]
}
]
}
Mark serialization:
- Simple marks: "bold", "italic", "code"
- Marks with attrs: { "type": "link", "attrs": { "href": "..." } }
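The two mark shapes map onto JSON-ready values in a straightforward way. The following is a minimal sketch of that mapping under stated assumptions: `MarkJsonSketch` is a hypothetical module, attribute values are passed through unchanged, and decoding uses `String.to_existing_atom/1` so that untrusted input cannot create new atoms.

```elixir
defmodule MarkJsonSketch do
  @doc "Converts a mark to a JSON-ready value (a string or a string-keyed map)."
  def to_json(mark) when is_atom(mark), do: Atom.to_string(mark)

  def to_json({type, attrs}) when is_atom(type) and is_map(attrs) do
    %{
      "type" => Atom.to_string(type),
      "attrs" => Map.new(attrs, fn {key, value} -> {to_string(key), value} end)
    }
  end

  @doc "Converts a decoded JSON value back into a mark."
  def from_json(name) when is_binary(name), do: String.to_existing_atom(name)

  def from_json(%{"type" => type, "attrs" => attrs}) do
    {String.to_existing_atom(type),
     Map.new(attrs, fn {key, value} -> {String.to_existing_atom(key), value} end)}
  end
end

MarkJsonSketch.to_json(:bold)
# => "bold"
MarkJsonSketch.to_json({:link, %{href: "/info"}})
# => %{"type" => "link", "attrs" => %{"href" => "/info"}}
MarkJsonSketch.from_json("bold")
# => :bold
```

Atom-valued attributes, such as `type: :user` on a mention, would need their own conversion rule to round-trip through JSON; the sketch leaves that out.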
Comparison with JS Editors
| Feature | Quillon | ProseMirror/Tiptap | Lexical | Slate |
|---|---|---|---|---|
| Structure | Elixir tuples | JS objects | JS classes | JS objects |
| Immutable | Yes (language native) | Yes (persistent document, updated via transactions) | Yes | Yes |
| Block types | Schema-defined | Schema-defined | Node classes | Element types |
| Inline formatting | Structured nodes + marks | Mark system | Format states | Leaf nodes |
| Collaboration | CRDT-ready (separate pkg) | Yjs plugin | Yjs plugin | Yjs plugin |
| Server-side | Native Elixir | N/A | N/A | N/A |
Quillon advantages:
- Native Elixir immutability (no extra library or freezing step needed)
- Same structure on client and server
- JSON serialization built-in
- Server-side rendering without JS dependency
- Framework agnostic (works without Phoenix/LiveView)