Understanding Schema & Validation

Schemas define the valid structure of Quillon documents. They specify which node types are allowed, what children each node can contain, which marks can be applied, and which attributes are required.

Why Schemas?

Rich text documents need structure constraints:

  • A document must contain blocks, not raw text
  • Lists must contain list items, not paragraphs directly
  • Links require an href attribute
  • Subscript and superscript can't both be applied to the same text

Without validation, it's easy to create malformed documents that break rendering or cause unexpected behavior. Schemas catch these issues early.

Schema Structure

A schema has three main components:

%Quillon.Schema{
  groups: %{...},   # Named groups of node types
  nodes: %{...},    # Node type specifications
  marks: %{...}     # Mark specifications
}

Groups

Groups bundle node types for use in content expressions:

groups: %{
  block: [:paragraph, :heading, :divider, :blockquote, ...],
  inline: [:text],
  list_content: [:list_item],
  table_content: [:table_row],
  table_row_content: [:table_cell]
}

Instead of listing every allowed type, you can reference a group: "block+" means "one or more of any block type."

Node Specifications

Each node type has a specification defining its structure:

nodes: %{
  paragraph: %{
    content: "inline*",  # What children are allowed
    group: :block,       # Which group this belongs to
    marks: :all          # Which marks can be applied
  },
  heading: %{
    content: "inline*",
    group: :block,
    marks: :all,
    attrs: %{
      level: %{required: true}  # Required attribute
    }
  },
  divider: %{
    content: nil,        # No children allowed
    group: :block,
    marks: nil,          # No marks allowed
    attrs: %{
      style: %{default: :solid}  # Optional with default
    }
  }
}

Node Spec Fields

| Field | Type | Description |
|---|---|---|
| content | String.t() or nil | Content expression defining allowed children |
| group | atom() | Group this node belongs to |
| marks | :all, [atom()], or nil | Allowed marks (:all = any, nil = none) |
| attrs | map() | Attribute specifications |

Attribute Specs

| Field | Type | Description |
|---|---|---|
| required | boolean() | Must be present (default: false) |
| default | any() | Default value if not provided |
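
To make the two fields concrete, here is a minimal sketch of how an attrs spec could be checked against a node's attributes. `AttrSketch` and `apply_specs/2` are hypothetical names used for illustration only, and whether Quillon itself fills in defaults during validation is an assumption, not something this guide states.

```elixir
defmodule AttrSketch do
  # Hypothetical helper, not part of Quillon's API.
  # `specs` is an attrs spec map such as %{level: %{required: true}};
  # `attrs` is the attribute map found on a node or mark.
  # Returns {:ok, attrs_with_defaults} or {:error, missing_required_names}.
  def apply_specs(specs, attrs) do
    missing =
      for {name, spec} <- specs,
          Map.get(spec, :required, false),
          not Map.has_key?(attrs, name),
          do: name

    if missing == [] do
      # Collect the defaults declared in the spec, then let supplied attrs win.
      defaults =
        for {name, %{default: value}} <- specs, into: %{}, do: {name, value}

      {:ok, Map.merge(defaults, attrs)}
    else
      {:error, Enum.sort(missing)}
    end
  end
end

AttrSketch.apply_specs(%{style: %{default: :solid}}, %{})
# => {:ok, %{style: :solid}}

AttrSketch.apply_specs(%{level: %{required: true}}, %{})
# => {:error, [:level]}
```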

Mark Specifications

Each mark type has a specification controlling its behavior:

marks: %{
  bold: %{
    inclusive: true,
    keep_on_split: true
  },
  link: %{
    inclusive: false,
    keep_on_split: true,
    attrs: %{
      href: %{required: true},
      title: %{},
      target: %{}
    }
  },
  subscript: %{
    inclusive: true,
    keep_on_split: true,
    excludes: [:superscript]
  }
}

Mark Spec Fields

| Field | Type | Description |
|---|---|---|
| inclusive | boolean() | Text at boundary inherits mark (default: true) |
| keep_on_split | boolean() | Mark persists after Enter/split (default: true) |
| excludes | [atom()] | Conflicting marks that can't coexist |
| attrs | map() | Attribute specifications |

See the Marks Guide for detailed explanations of inclusive, keep_on_split, and excludes.
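
As an illustration of how excludes can be read, the sketch below finds conflicting pairs in a list of mark names. `MarkConflictSketch` is a hypothetical module, it handles marks given as plain atoms only, and treating a conflict as symmetric (either mark may list the other) is an assumption.

```elixir
defmodule MarkConflictSketch do
  # Hypothetical helper, not part of Quillon's API.
  # `mark_specs` has the shape of the `marks:` map shown above;
  # `marks` is a list of mark names applied to one text node.
  def conflicts(mark_specs, marks) do
    for a <- marks,
        b <- marks,
        # Visit each unordered pair once.
        a < b,
        excludes?(mark_specs, a, b) or excludes?(mark_specs, b, a),
        do: {a, b}
  end

  defp excludes?(mark_specs, mark, other) do
    other in (mark_specs |> Map.get(mark, %{}) |> Map.get(:excludes, []))
  end
end

specs = %{
  bold: %{inclusive: true},
  subscript: %{excludes: [:superscript]},
  superscript: %{excludes: [:subscript]}
}

MarkConflictSketch.conflicts(specs, [:bold, :subscript, :superscript])
# => [{:subscript, :superscript}]
```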

Content Expressions

Content expressions define what children a node can contain using a simple grammar.

Basic Syntax

| Expression | Meaning |
|---|---|
| "paragraph" | Exactly one paragraph |
| "paragraph+" | One or more paragraphs |
| "paragraph*" | Zero or more paragraphs |
| "block+" | One or more from the block group |
| "inline*" | Zero or more from the inline group |
| nil | No children allowed |

Choice Syntax

Use parentheses and | for alternatives:

| Expression | Meaning |
|---|---|
| "(paragraph \| heading)" | Exactly one paragraph OR heading |
| "(paragraph \| heading)+" | One or more of either type |
| "(paragraph \| heading)*" | Zero or more of either type |

Sequence Syntax

Separate elements with spaces for sequences:

| Expression | Meaning |
|---|---|
| "heading paragraph+" | One heading followed by one or more paragraphs |
| "block+ divider block+" | Blocks, then divider, then more blocks |
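
The grammar is small enough to sketch a matcher for it. The example below is not Quillon's parser: `ContentExprSketch` is a hypothetical module that compiles an expression into a regular expression over the child type names, expanding group names along the way. It assumes children are {type, attrs, children} tuples, as in the validation examples in this guide.

```elixir
defmodule ContentExprSketch do
  # Hypothetical matcher, not Quillon's implementation. Each child type is
  # written as "name," and the expression becomes a regex over that string.

  # nil means the node may not have children.
  def matches?(nil, _groups, children), do: children == []

  def matches?(expr, groups, children) when is_binary(expr) do
    pattern =
      ~r/[a-z_]+|[()|+*]/
      |> Regex.scan(expr)
      |> List.flatten()
      |> Enum.map_join(fn
        "(" -> "(?:"
        op when op in [")", "|", "+", "*"] -> op
        name -> name_pattern(name, groups)
      end)

    subject = Enum.map_join(children, fn {type, _attrs, _kids} -> "#{type}," end)
    Regex.match?(Regex.compile!("\\A" <> pattern <> "\\z"), subject)
  end

  # A group name expands to an alternation of its members;
  # any other name stands for that single node type.
  defp name_pattern(name, groups) do
    types =
      Enum.find_value(groups, [name], fn {group, members} ->
        if Atom.to_string(group) == name, do: Enum.map(members, &Atom.to_string/1)
      end)

    "(?:(?:" <> Enum.join(types, "|") <> "),)"
  end
end

groups = %{block: [:paragraph, :heading], inline: [:text]}
p = {:paragraph, %{}, []}
h = {:heading, %{level: 1}, []}

ContentExprSketch.matches?("block+", groups, [p, h])             # => true
ContentExprSketch.matches?("block+", groups, [])                 # => false
ContentExprSketch.matches?("heading paragraph+", groups, [h, p]) # => true
ContentExprSketch.matches?("heading paragraph+", groups, [p, h]) # => false
```

Compiling to a regex keeps the sketch short; a real validator would more likely parse the expression once and report which child failed to match.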

Examples in Practice

# Document contains one or more blocks
document: %{content: "block+"}

# Paragraph contains zero or more inline (text) nodes
paragraph: %{content: "inline*"}

# Table contains one or more rows
table: %{content: "table_row+"}

# List item contains one or more blocks (for nesting)
list_item: %{content: "block+"}

# Code block has no children (code stored in attrs)
code_block: %{content: nil}

Using the Default Schema

Quillon provides a comprehensive default schema:

schema = Quillon.Schema.default()

# Check what's in the schema (the order of keys returned by Map.keys/1 is not guaranteed)
Map.keys(schema.nodes)
# => [:document, :paragraph, :heading, :divider, :text, :blockquote,
#     :callout, :code_block, :image, :video, :bullet_list, :ordered_list,
#     :list_item, :table, :table_row, :table_cell]

Map.keys(schema.marks)
# => [:bold, :italic, :underline, :strike, :code, :link,
#     :subscript, :superscript, :highlight, :font_color, :mention]

Validating Documents

Basic Validation

doc = Quillon.document([
  Quillon.paragraph("Hello world")
])

# Validate with ok/error tuple
case Quillon.validate(doc) do
  {:ok, doc} ->
    # Document is valid
    save_to_database(doc)

  {:error, errors} ->
    # Handle validation errors
    Enum.each(errors, &IO.inspect/1)
end

# Validate with exception
doc = Quillon.validate!(doc)  # Raises on error

Understanding Errors

Validation errors include path, type, and message:

bad_doc = {:unknown_type, %{}, []}

{:error, errors} = Quillon.validate(bad_doc)
# errors = [
#   %{
#     path: [],
#     type: :unknown_type,
#     message: "Unknown node type: unknown_type"
#   }
# ]

Error Types

| Type | Description |
|---|---|
| :unknown_type | Node type not in schema |
| :invalid_content | Children don't match content expression |
| :missing_attr | Required attribute not present |
| :mark_not_allowed | Mark can't be applied to this node |
| :mark_conflict | Two marks that exclude each other |
| :unknown_mark | Mark type not in schema |
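
Because each error is a plain map with path, type, and message, it is straightforward to turn a list of errors into log lines or a per-type summary. The following is a sketch only: `ErrorReportSketch` is a hypothetical module and the error list is made-up sample data in the shape shown under Understanding Errors.

```elixir
defmodule ErrorReportSketch do
  # Hypothetical helpers, not part of Quillon's API.

  # One readable line per error; an empty path means the root node.
  def format(%{path: path, type: type, message: message}) do
    location = if path == [], do: "root", else: Enum.join(path, ".")
    "[#{type}] at #{location}: #{message}"
  end

  # How many errors of each type were reported.
  def count_by_type(errors), do: Enum.frequencies_by(errors, & &1.type)
end

# Sample data for illustration, not real validator output.
errors = [
  %{path: [], type: :unknown_type, message: "Unknown node type: unknown_type"},
  %{path: [0, 1], type: :missing_attr, message: "Missing required attribute: level"},
  %{path: [2], type: :missing_attr, message: "Missing required attribute: level"}
]

Enum.map(errors, &ErrorReportSketch.format/1)
# => ["[unknown_type] at root: Unknown node type: unknown_type",
#     "[missing_attr] at 0.1: Missing required attribute: level",
#     "[missing_attr] at 2: Missing required attribute: level"]

ErrorReportSketch.count_by_type(errors)
# => %{missing_attr: 2, unknown_type: 1}
```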

Error Paths

The path field tells you where the error occurred:

doc = Quillon.document([
  Quillon.paragraph("First"),       # path: [0]
  {:bad_node, %{}, []}              # path: [1]
])

# A map pattern only needs the keys you care about
{:error, [%{path: [1]}]} = Quillon.validate(doc)

For nested structures:

# path: [0, 1, 0] means:
# - First child of document (the list)
# - Second child of list (second list_item)
# - First child of list_item (the paragraph)
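
A path can be followed mechanically by indexing into each node's children in turn. The sketch below assumes the {type, attrs, children} tuple shape used in this guide; `PathSketch` is a hypothetical helper, not a Quillon function.

```elixir
defmodule PathSketch do
  # Hypothetical helper, not part of Quillon's API.
  # Follows a list of child indexes down from the given node.
  def fetch(node, []), do: {:ok, node}

  def fetch({_type, _attrs, children}, [index | rest]) do
    case Enum.at(children, index) do
      nil -> :error
      child -> fetch(child, rest)
    end
  end
end

para = {:paragraph, %{}, [{:text, %{text: "B", marks: []}, []}]}

list =
  {:bullet_list, %{},
   [
     {:list_item, %{}, [{:paragraph, %{}, []}]},
     {:list_item, %{}, [para]}
   ]}

doc = {:document, %{}, [list]}

PathSketch.fetch(doc, [0, 1, 0])
# => {:ok, {:paragraph, %{}, [{:text, %{text: "B", marks: []}, []}]}}

PathSketch.fetch(doc, [0, 5])
# => :error
```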

Validation Examples

Missing Required Attribute

# Heading requires a level
bad = {:heading, %{}, [{:text, %{text: "Title", marks: []}, []}]}

{:error, [%{type: :missing_attr, message: "Missing required attribute: level"}]} =
  Quillon.validate(bad)

Invalid Content

# Document requires block+ (one or more blocks)
empty_doc = {:document, %{}, []}

{:error, [%{type: :invalid_content}]} = Quillon.validate(empty_doc)

# Paragraph allows inline*, but not blocks
para_with_block = {:paragraph, %{}, [
  {:heading, %{level: 1}, [{:text, %{text: "Wrong", marks: []}, []}]}
]}

{:error, [%{type: :invalid_content}]} = Quillon.validate(para_with_block)

Mark Conflicts

# Subscript and superscript conflict
text = {:text, %{text: "H2O", marks: [:subscript, :superscript]}, []}
doc = Quillon.document([{:paragraph, %{}, [text]}])

{:error, errors} = Quillon.validate(doc)
# Contains :mark_conflict error

Creating Custom Schemas

Extending the Default

Use Schema.merge/2 to extend the default schema:

custom = Quillon.Schema.merge(
  Quillon.Schema.default(),
  %Quillon.Schema{
    # Add a custom node type
    nodes: %{
      aside: %{
        content: "block+",
        group: :block,
        attrs: %{
          position: %{default: :right}
        }
      }
    },
    # Add a custom mark
    marks: %{
      redacted: %{
        inclusive: false,
        keep_on_split: false
      }
    },
    # Update groups to include new node
    groups: %{
      block: [:aside | Quillon.Types.block_types()]
    }
  }
)

Creating from Scratch

For complete control, build a schema from scratch:

minimal_schema = %Quillon.Schema{
  groups: %{
    block: [:paragraph],
    inline: [:text]
  },
  nodes: %{
    document: %{content: "block+"},
    paragraph: %{content: "inline*", group: :block, marks: :all},
    text: %{content: nil, group: :inline, marks: :all}
  },
  marks: %{
    bold: %{inclusive: true, keep_on_split: true},
    italic: %{inclusive: true, keep_on_split: true}
  }
}

# Validate against custom schema
{:ok, doc} = Quillon.Schema.Validator.validate(doc, minimal_schema)

Restricting Marks

Limit which marks can be applied to specific nodes:

%Quillon.Schema{
  nodes: %{
    paragraph: %{marks: :all},           # All marks allowed
    heading: %{marks: [:bold, :italic]}, # Only bold/italic
    code_block: %{marks: nil}            # No marks
  }
}
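
The three shapes of the marks field (:all, a list, or nil) can be read as a simple permission check. The sketch below shows one way to express that rule; `MarkPermissionSketch` is a hypothetical module and is not how Quillon necessarily implements mark_allowed?/3.

```elixir
defmodule MarkPermissionSketch do
  # Hypothetical helper, not part of Quillon's API.
  # The first argument is the value of a node spec's `marks` field.
  def allowed?(:all, _mark), do: true
  def allowed?(nil, _mark), do: false
  def allowed?(allowed, mark) when is_list(allowed), do: mark in allowed

  # Marks from `marks` that the node spec does not permit.
  def disallowed(marks_spec, marks) do
    Enum.reject(marks, &allowed?(marks_spec, &1))
  end
end

MarkPermissionSketch.disallowed([:bold, :italic], [:bold, :underline])
# => [:underline]

MarkPermissionSketch.disallowed(nil, [:bold])
# => [:bold]

MarkPermissionSketch.disallowed(:all, [:bold, :link])
# => []
```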

Schema Functions

The Quillon.Schema module provides utilities for querying schemas:

schema = Quillon.Schema.default()

# Check existence
Quillon.Schema.node_type?(schema, :paragraph)  # => true
Quillon.Schema.mark_type?(schema, :bold)       # => true

# Get specifications
Quillon.Schema.get_node_spec(schema, :heading)
# => %{content: "inline*", group: :block, marks: :all, attrs: %{level: %{required: true}}}

Quillon.Schema.get_mark_spec(schema, :link)
# => %{inclusive: false, keep_on_split: true, attrs: %{...}}

# Query groups
Quillon.Schema.get_group(schema, :block)
# => [:paragraph, :heading, :divider, ...]

# Check mark permissions
Quillon.Schema.allowed_marks(schema, :paragraph)  # => :all
Quillon.Schema.allowed_marks(schema, :divider)    # => nil
Quillon.Schema.mark_allowed?(schema, :paragraph, :bold)  # => true

# Check conflicts
Quillon.Schema.marks_conflict?(schema, :subscript, :superscript)  # => true
Quillon.Schema.marks_conflict?(schema, :bold, :italic)            # => false

Best Practices

1. Validate at Boundaries

Validate documents at system boundaries:

# When receiving from API
def create_document(params) do
  with {:ok, doc} <- Quillon.from_json(params["content"]),
       {:ok, doc} <- Quillon.validate(doc) do
    save_document(doc)
  end
end

# When loading from database
def load_document(id) do
  raw = get_from_db(id)
  {:ok, doc} = Quillon.from_json(raw)
  {:ok, doc} = Quillon.validate(doc)
  doc
end

2. Fail Fast in Development

Use validate!/1 in tests and development:

# In tests - fail immediately on invalid docs
test "creates valid document" do
  doc = MyApp.create_document(attrs)
  Quillon.validate!(doc)  # Raises with details if invalid
end

3. Handle Errors Gracefully in Production

Use validate/1 in production for graceful error handling:

require Logger  # Logger.error/2 is a macro, so Logger must be required first

def process_document(doc) do
  case Quillon.validate(doc) do
    {:ok, valid_doc} ->
      {:ok, render(valid_doc)}

    {:error, errors} ->
      Logger.error("Invalid document", errors: errors)
      {:error, :invalid_document}
  end
end

4. Custom Validation Rules

Layer business rules on top of schema validation:

def validate_blog_post(doc) do
  with {:ok, doc} <- Quillon.validate(doc),
       :ok <- validate_has_title(doc),
       :ok <- validate_word_count(doc, min: 100),
       :ok <- validate_no_empty_paragraphs(doc) do
    {:ok, doc}
  end
end

defp validate_has_title(doc) do
  case find_heading(doc, level: 1) do
    nil -> {:error, "Blog post requires an H1 title"}
    _ -> :ok
  end
end
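
The helpers referenced above (find_heading and the word count check) are left to the application. One possible shape for them, assuming documents are {type, attrs, children} tuples with text stored in a text node's :text attribute, is sketched here. `BlogRulesSketch` and its functions are hypothetical and not part of Quillon.

```elixir
defmodule BlogRulesSketch do
  # Hypothetical helpers, not part of Quillon's API.

  # Depth-first list of every node in the tree, root first.
  def flatten({_type, _attrs, children} = node) do
    [node | Enum.flat_map(children, &flatten/1)]
  end

  # First heading with the given level, or nil when there is none.
  def find_heading(doc, level: level) do
    Enum.find(flatten(doc), &match?({:heading, %{level: ^level}, _}, &1))
  end

  # Whitespace-separated words across all text nodes.
  def word_count(doc) do
    doc
    |> flatten()
    |> Enum.map(fn
      {:text, %{text: text}, _} -> length(String.split(text))
      _other -> 0
    end)
    |> Enum.sum()
  end
end

doc =
  {:document, %{},
   [
     {:heading, %{level: 1}, [{:text, %{text: "My Post", marks: []}, []}]},
     {:paragraph, %{}, [{:text, %{text: "Hello brave new world", marks: []}, []}]}
   ]}

BlogRulesSketch.find_heading(doc, level: 2)
# => nil

BlogRulesSketch.word_count(doc)
# => 6
```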

Default Schema Reference

Here's the complete default schema for reference:

Nodes

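| Type | Content | Group | Marks | Required Attrs |
|---|---|---|---|---|
| document | block+ | - | - | - |
| paragraph | inline* | block | all | - |
| heading | inline* | block | all | level |
| divider | nil | block | nil | - |
| text | nil | inline | all | text |
| blockquote | block+ | block | - | - |
| callout | block+ | block | - | type |
| code_block | nil | block | - | code |
| image | nil | block | - | src |
| video | nil | block | - | src |
| bullet_list | list_item+ | block | - | - |
| ordered_list | list_item+ | block | - | - |
| list_item | block+ | list_content | - | - |
| table | table_row+ | block | - | - |
| table_row | table_cell+ | table_content | - | - |
| table_cell | block+ | table_row_content | - | - |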

Marks

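| Mark | Inclusive | Keep on Split | Excludes | Required Attrs |
|---|---|---|---|---|
| bold | true | true | - | - |
| italic | true | true | - | - |
| underline | true | true | - | - |
| strike | true | true | - | - |
| code | false | true | link | - |
| link | false | true | - | href |
| subscript | true | true | superscript | - |
| superscript | true | true | subscript | - |
| highlight | true | true | - | color |
| font_color | true | true | - | color |
| mention | false | false | - | id, type, label |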