MDEx.Document (MDEx v0.8.5)
Tree representation of a Markdown document.
%MDEx.Document{
  nodes: [
    %MDEx.Paragraph{
      nodes: [
        %MDEx.Code{num_backticks: 1, literal: "Elixir"}
      ]
    }
  ]
}
Each node may contain attributes and children nodes as in the example above where MDEx.Document
contains a MDEx.Paragraph node which contains a MDEx.Code node with the attributes :num_backticks and :literal.
You can check out each node's documentation in the Document Nodes section, for example MDEx.HtmlBlock.
The MDEx.Document module represents the root of a document and implements several behaviours and protocols
to enable operations to fetch, update, and manipulate the document tree.
In these examples we will be using the ~MD sigil.
Tree Traversal
Understanding tree traversal is fundamental to working with MDEx documents, as it affects how all
Enum functions, Access operations, and other protocols behave.
The document tree is enumerated using depth-first pre-order traversal. This means:
- The parent node is visited first
- Then each child node is visited recursively
- Children are processed in the order they appear in the :nodes list
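As a rough illustration of the visiting order listed above, here is a minimal sketch that uses plain maps instead of MDEx structs. The `PreOrderSketch` module and its `flatten/1` function are hypothetical names introduced only for this example; this is not how MDEx implements enumeration internally.

```elixir
defmodule PreOrderSketch do
  # Hypothetical helper, not part of MDEx: returns the node itself followed by
  # all of its descendants, in depth-first pre-order.
  def flatten(%{nodes: children} = node) do
    [node | Enum.flat_map(children, &flatten/1)]
  end

  # Leaf nodes have no :nodes key, so they contribute only themselves.
  def flatten(node), do: [node]
end

# Plain-map stand-in for the tree produced by "**bold** text".
tree = %{
  type: :document,
  nodes: [
    %{
      type: :paragraph,
      nodes: [
        %{type: :strong, nodes: [%{type: :text, literal: "bold"}]},
        %{type: :text, literal: " text"}
      ]
    }
  ]
}

order = tree |> PreOrderSketch.flatten() |> Enum.map(& &1.type)
IO.inspect(order)
# prints [:document, :paragraph, :strong, :text, :text]
```

The parent always comes before its children, and the text inside the strong node comes before the sibling text that follows it.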
This traversal order affects all Enum functions, including Enum.at/2, Enum.map/2, Enum.find/2, etc.
iex> doc = ~MD[# Hello]
iex> Enum.at(doc, 0)
%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1, setext: false}]}
iex> Enum.at(doc, 1)
%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1, setext: false}
iex> Enum.at(doc, 2)
%MDEx.Text{literal: "Hello"}
More complex traversal with nested elements:
iex> doc = ~MD[**bold** text]
iex> Enum.at(doc, 0)
%MDEx.Document{nodes: [%MDEx.Paragraph{nodes: [%MDEx.Strong{nodes: [%MDEx.Text{literal: "bold"}]}, %MDEx.Text{literal: " text"}]}]}
iex> Enum.at(doc, 1)
%MDEx.Paragraph{nodes: [%MDEx.Strong{nodes: [%MDEx.Text{literal: "bold"}]}, %MDEx.Text{literal: " text"}]}
iex> Enum.at(doc, 2)
%MDEx.Strong{nodes: [%MDEx.Text{literal: "bold"}]}
iex> Enum.at(doc, 3)
%MDEx.Text{literal: "bold"}
iex> Enum.at(doc, 4)
%MDEx.Text{literal: " text"}
Enumerable
The Enumerable protocol allows us to call Enum functions to iterate over and manipulate the document tree.
All enumeration follows the depth-first traversal order described above.
Count the nodes in a document:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> Enum.count(doc)
7
Count how many nodes have the :literal attribute:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> Enum.reduce(doc, 0, fn
...>   %{literal: _literal}, acc -> acc + 1
...>
...>   _node, acc -> acc
...> end)
3
Check if a node is a member of the document:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> Enum.member?(doc, %MDEx.Code{literal: "elixir", num_backticks: 1})
true
Map each node to its module name:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> Enum.map(doc, fn %node{} -> inspect(node) end)
["MDEx.Document", "MDEx.Heading", "MDEx.Text", "MDEx.Paragraph", "MDEx.Code", "MDEx.Paragraph", "MDEx.Code"]
Collectable
The Collectable protocol allows you to build documents by collecting nodes or merging multiple documents together.
This is particularly useful for programmatically constructing documents from various sources.
Merge two documents together using Enum.into/2:
iex> first_doc = ~MD[# First Document]
iex> second_doc = ~MD[# Second Document]
iex> Enum.into(second_doc, first_doc)
%MDEx.Document{
  nodes: [
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "First Document"}], level: 1, setext: false},
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Second Document"}], level: 1, setext: false}
  ]
}
Collect individual nodes into a document:
iex> chunks = [
...> %MDEx.Text{literal: "Hello "},
...> %MDEx.Code{literal: "world", num_backticks: 1}
...> ]
iex> document = Enum.into(chunks, %MDEx.Document{})
%MDEx.Document{
  nodes: [
    %MDEx.Text{literal: "Hello "},
    %MDEx.Code{literal: "world", num_backticks: 1}
  ]
}
iex> MDEx.to_html!(document)
"Hello <code>world</code>"
Build a document incrementally by collecting mixed content:
iex> chunks = [
...> %MDEx.Heading{nodes: [%MDEx.Text{literal: "Title"}], level: 1, setext: false},
...> %MDEx.Paragraph{nodes: []},
...> %MDEx.Text{literal: "Some text"},
...> %MDEx.ListItem{nodes: [%MDEx.Text{literal: "Item 1"}]},
...> %MDEx.Text{literal: " - WIP"},
...> ]
iex> document = Enum.into(chunks, %MDEx.Document{})
%MDEx.Document{
  nodes: [
    %MDEx.Heading{
      level: 1,
      nodes: [%MDEx.Text{literal: "Title"}],
      setext: false
    },
    %MDEx.Paragraph{
      nodes: [%MDEx.Text{literal: "Some text"}]
    },
    %MDEx.List{
      bullet_char: "-",
      delimiter: :period,
      is_task_list: false,
      list_type: :bullet,
      marker_offset: 0,
      nodes: [%MDEx.ListItem{nodes: [%MDEx.Text{literal: "Item 1 - WIP"}], list_type: :bullet, marker_offset: 0, padding: 2, start: 1, delimiter: :period, bullet_char: "-", tight: true, is_task_list: false}],
      padding: 2,
      start: 1,
      tight: true
    }
  ]
}
iex> MDEx.to_html!(document)
"<h1>Title</h1>\n<p>Some text</p>\n<ul>\n<li>Item 1 - WIP</li>\n</ul>"
Access
The Access behaviour gives you the ability to fetch and update nodes using different types of keys.
Access operations also follow the depth-first traversal order when searching through nodes.
Access by Index
You can access nodes by their position in the depth-first traversal using integer indices:
iex> doc = ~MD[# Hello]
iex> doc[0] # First node (the document itself)
%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1, setext: false}]}
iex> doc[1] # Second node (the heading)
%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1, setext: false}
iex> doc[2] # Third node (the text)
%MDEx.Text{literal: "Hello"}
Negative indices access nodes from the end:
iex> doc = ~MD[# Hello **world**]
iex> doc[-1] # Last node
%MDEx.Text{literal: "world"}
Access by Node Type
Starting with a simple Markdown document, let's fetch only the text node by matching the MDEx.Text node:
iex> ~MD[# Hello][%MDEx.Text{literal: "Hello"}]
[%MDEx.Text{literal: "Hello"}]
That's essentially the same as:
doc = %MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1, setext: false}]}
Enum.filter(
  doc,
  fn node -> node == %MDEx.Text{literal: "Hello"} end
)
The key can also be a module, an atom, or even a function. For example:
Fetch all Code nodes, either by MDEx.Code module or the :code atom representing the Code node:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> doc[MDEx.Code]
[%MDEx.Code{num_backticks: 1, literal: "elixir"}, %MDEx.Code{num_backticks: 1, literal: "rust"}]
iex> doc[:code]
[%MDEx.Code{num_backticks: 1, literal: "elixir"}, %MDEx.Code{num_backticks: 1, literal: "rust"}]
Dynamically fetch Code nodes where the :literal (node content) starts with "eli" using a function to filter the result:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> doc[fn node -> String.starts_with?(Map.get(node, :literal, ""), "eli") end]
[%MDEx.Code{num_backticks: 1, literal: "elixir"}]
That's the most flexible option, for cases where structs, modules, or atoms are not enough to match the node you want.
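To make the four selector kinds more concrete, here is a minimal sketch of how each one could be reduced to a predicate function. Everything in it is hypothetical: `SelectorSketch`, `SelectorSketch.Text`, `to_predicate/1`, and the rule that derives the atom name from the module name are assumptions made for illustration, not MDEx's actual implementation.

```elixir
defmodule SelectorSketch do
  # Hypothetical stand-in for a node struct; NOT an MDEx module.
  defmodule Text do
    defstruct literal: ""
  end

  # A function selector is already a predicate.
  def to_predicate(fun) when is_function(fun, 1), do: fun

  # A struct selector matches nodes that are equal to it.
  def to_predicate(%_{} = expected), do: &(&1 == expected)

  # A module or atom selector matches on the node's struct module, or on a
  # short name derived from it (assumed rule: last segment, underscored).
  def to_predicate(selector) when is_atom(selector) do
    fn
      %mod{} -> mod == selector or short_name(mod) == selector
      _other -> false
    end
  end

  defp short_name(mod) do
    mod |> Module.split() |> List.last() |> Macro.underscore() |> String.to_atom()
  end
end

nodes = [
  struct!(SelectorSketch.Text, literal: "a"),
  struct!(SelectorSketch.Text, literal: "b")
]

by_module = Enum.filter(nodes, SelectorSketch.to_predicate(SelectorSketch.Text))
by_atom = Enum.filter(nodes, SelectorSketch.to_predicate(:text))
by_struct = Enum.filter(nodes, SelectorSketch.to_predicate(hd(nodes)))
by_fun = Enum.filter(nodes, SelectorSketch.to_predicate(&(&1.literal == "b")))
```

In this sketch the module and atom selectors return both nodes, while the struct and function selectors each return a single node.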
The Access behaviour also allows us to update nodes that match a selector.
In the example below we'll capitalize the content of all MDEx.Code nodes:
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...>
...> Continue...
...> """
iex> update_in(doc, [:document, Access.key!(:nodes), Access.all(), :code, Access.key!(:literal)], fn literal ->
...> String.upcase(literal)
...> end)
%MDEx.Document{
  nodes: [
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Languages"}], level: 1, setext: false},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "ELIXIR"}]},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "RUST"}]},
    %MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Continue..."}]}
  ]
}
String.Chars
Calling Kernel.to_string/1 on a document formats it as CommonMark text:
iex> to_string(~MD[# Hello])
"# Hello"
Fragments (nodes without the parent %Document{}) are also formatted:
iex> to_string(%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 1})
"# Hello"
Traverse and Update
You can also use the low-level MDEx.traverse_and_update/2 and MDEx.traverse_and_update/3 APIs
to traverse each node of the AST and either update the nodes or do some calculation with an accumulator.
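A minimal sketch of both arities follows, written as a standalone script. It assumes MDEx 0.8.x can be fetched with `Mix.install/1`, that the callback for `traverse_and_update/2` returns a node, and that the callback for `traverse_and_update/3` returns a `{node, acc}` tuple. Check the MDEx module documentation for the exact contract before relying on it.

```elixir
# Assumption: run as a standalone script, so MDEx is fetched from Hex first.
Mix.install([{:mdex, "~> 0.8.5"}])

doc = MDEx.parse_document!("# Languages\n\n`elixir`\n\n`rust`\n")

# traverse_and_update/2: return a (possibly modified) node for each node visited.
upcased =
  MDEx.traverse_and_update(doc, fn
    %MDEx.Code{literal: literal} = node -> %{node | literal: String.upcase(literal)}
    node -> node
  end)

# traverse_and_update/3: thread an accumulator through the traversal.
# Assumed callback contract: return {node, acc} for every node.
{_unchanged_doc, code_count} =
  MDEx.traverse_and_update(doc, 0, fn
    %MDEx.Code{} = node, acc -> {node, acc + 1}
    node, acc -> {node, acc}
  end)

# String.Chars formats the updated document back to CommonMark.
IO.puts(to_string(upcased))
IO.inspect(code_count, label: "code nodes")
```

Compared with `update_in/3` and a selector path, this style avoids spelling out the path to the nodes, at the cost of a callback that must handle every node type.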
Practical Examples
Here are some common patterns for working with MDEx documents that combine the protocols described above.
Update all code block nodes filtered by the selector function
Prepend the line "// Modified" to Rust code blocks:
iex> doc = ~MD"""
...> # Code Examples
...>
...> ```elixir
...> def hello do
...> :world
...> end
...> ```
...>
...> ```rust
...> fn main() {
...> println!("Hello");
...> }
...> ```
...> """
iex> selector = fn
...> %MDEx.CodeBlock{info: "rust"} -> true
...> _ -> false
...> end
iex> update_in(doc, [:document, Access.key!(:nodes), Access.all(), selector], fn node ->
...> %{node | literal: "// Modified\n" <> node.literal}
...> end)
%MDEx.Document{
  nodes: [
    %MDEx.Heading{
      nodes: [%MDEx.Text{literal: "Code Examples"}],
      level: 1,
      setext: false
    },
    %MDEx.CodeBlock{
      info: "elixir",
      literal: "def hello do\n :world\nend\n"
    },
    %MDEx.CodeBlock{
      info: "rust",
      literal: "// Modified\nfn main() {\n println!(\"Hello\");\n}\n"
    }
  ]
}
Collect headings by level
iex> doc = ~MD"""
...> # Main Title
...>
...> ## Section 1
...>
...> ### Subsection
...>
...> ## Section 2
...> """
iex> Enum.reduce(doc, %{}, fn
...> %MDEx.Heading{level: level, nodes: [%MDEx.Text{literal: text}]}, acc ->
...> Map.update(acc, level, [text], &[text | &1])
...> _node, acc -> acc
...> end)
%{
  1 => ["Main Title"],
  2 => ["Section 2", "Section 1"],
  3 => ["Subsection"]
}
Extract and transform task list items
iex> doc = ~MD"""
...> # Todo List
...>
...> - [ ] Buy groceries
...> - [x] Call mom
...> - [ ] Read book
...> """
iex> Enum.map(doc, fn
...> %MDEx.TaskItem{checked: checked, nodes: [%MDEx.Paragraph{nodes: [%MDEx.Text{literal: text}]}]} ->
...> {checked, text}
...> _ -> nil
...> end)
...> |> Enum.reject(&is_nil/1)
[
  {false, "Buy groceries"},
  {true, "Call mom"},
  {false, "Read book"}
]
Bump all heading levels, except level 6
iex> doc = ~MD"""
...> # Main Title
...>
...> ## Subtitle
...>
...> ###### Notes
...> """
iex> selector = fn
...> %MDEx.Heading{level: level} when level < 6 -> true
...> _ -> false
...> end
iex> update_in(doc, [:document, Access.key!(:nodes), Access.all(), selector], fn node ->
...> %{node | level: node.level + 1}
...> end)
%MDEx.Document{
  nodes: [
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Main Title"}], level: 2, setext: false},
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Subtitle"}], level: 3, setext: false},
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Notes"}], level: 6, setext: false}
  ]
}
Types
@type md_node() :: MDEx.FrontMatter.t() | MDEx.BlockQuote.t() | MDEx.List.t() | MDEx.ListItem.t() | MDEx.DescriptionList.t() | MDEx.DescriptionItem.t() | MDEx.DescriptionTerm.t() | MDEx.DescriptionDetails.t() | MDEx.CodeBlock.t() | MDEx.HtmlBlock.t() | MDEx.Paragraph.t() | MDEx.Heading.t() | MDEx.ThematicBreak.t() | MDEx.FootnoteDefinition.t() | MDEx.FootnoteReference.t() | MDEx.Table.t() | MDEx.TableRow.t() | MDEx.TableCell.t() | MDEx.Text.t() | MDEx.TaskItem.t() | MDEx.SoftBreak.t() | MDEx.LineBreak.t() | MDEx.Code.t() | MDEx.HtmlInline.t() | MDEx.Raw.t() | MDEx.Emph.t() | MDEx.Strong.t() | MDEx.Strikethrough.t() | MDEx.Superscript.t() | MDEx.Link.t() | MDEx.Image.t() | MDEx.ShortCode.t() | MDEx.Math.t() | MDEx.MultilineBlockQuote.t() | MDEx.Escaped.t() | MDEx.WikiLink.t() | MDEx.Underline.t() | MDEx.Subscript.t() | MDEx.SpoileredText.t() | MDEx.EscapedTag.t() | MDEx.Alert.t()
Fragment of a Markdown document, a single node. May contain children nodes.
Selector used to match nodes in the document.
Valid selectors can be the module or struct, an atom representing the node name, or a function that receives a node and returns a boolean.
See MDEx.Document for more info and examples.
@type t() :: %MDEx.Document{nodes: [md_node()]}
Tree root of a Markdown document, including all children nodes.
Functions
Callback implementation for Access.fetch/2.
See the Access section for examples.
Callback implementation for Access.get_and_update/3.
See the Access section for examples.
Callback implementation for Access.fetch/2.
See the Access section for examples.