MDEx (MDEx v0.8.1)

Hex Version Hex Docs MIT

Fast and Extensible Markdown for Elixir.

Features

Examples

Livebook examples are available at Pages / Examples

Installation

Add the :mdex dependency:

def deps do
  [
    {:mdex, "~> 0.8"}
  ]
end

Usage

iex> MDEx.to_html!("# Hello :smile:", extension: [shortcodes: true])
"<h1>Hello 😄</h1>"
iex> import MDEx.Sigil
iex> ~MD[
...> # Hello :smile:
...> ]HTML
"<h1>Hello 😄</h1>"
iex> import MDEx.Sigil
iex> ~MD[
...> # Hello :smile:
...> ]
%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello "}, %MDEx.ShortCode{code: "smile", emoji: "😄"}], level: 1, setext: false}]}

Foundation

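The library is built on top of comrak (Markdown parsing and rendering), ammonia (HTML sanitization), and autumn (syntax highlighting).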

Parsing

Converts Markdown to an AST data structure that can be inspected and manipulated to change the content of the document programmatically.

The data structure format is inspired by Floki (with :attributes_as_maps = true), so both libraries share similar APIs and the same mental model whether you are working with Markdown or HTML documents. Each node is represented as a struct: the struct name is the node name, and its fields hold the node's attributes and children. For example:

%MDEx.Heading{
  level: 1,
  nodes: [...]
}

The parent node that represents the root of the document is the MDEx.Document struct, where you can find more information about the AST and the available operations.

The complete list of nodes is in the Document Nodes section.
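As a rough sketch of that workflow, the snippet below parses a document, demotes every heading by one level, and renders the result. It uses only parse_document!/2, traverse_and_update/2, and to_html!/2 as documented on this page; the input text and the demote-by-one rule are illustrative assumptions.

```elixir
# Assumes :mdex is available, e.g. via the deps entry shown under Installation.
doc = MDEx.parse_document!("# Title\n\nSome text")

demoted =
  MDEx.traverse_and_update(doc, fn
    # Nodes are structs, so a struct update changes only the level field.
    %MDEx.Heading{level: level} = node when level < 6 -> %{node | level: level + 1}
    # Every other node (including the root document) is returned unchanged.
    node -> node
  end)

html = MDEx.to_html!(demoted)
IO.puts(html)
```

Because the callback pattern matches on struct names, adding a clause for another node type is enough to extend the transformation.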

Formatting

Formatting is the process of converting from one format to another, for example from AST or Markdown to HTML. Formatting to XML and to Markdown is also supported.

You can use MDEx.parse_document/2 to generate an AST or any of the to_* functions to convert to Markdown (CommonMark), HTML, JSON, or XML.
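A minimal sketch of converting a single parsed document into each supported output format. It relies only on the raising to_* variants listed on this page; the sample input is arbitrary.

```elixir
# Assumes :mdex is available in the project.
doc = MDEx.parse_document!("# Hello")

# One AST, several output formats.
html = MDEx.to_html!(doc)
json = MDEx.to_json!(doc)
xml = MDEx.to_xml!(doc)
markdown = MDEx.to_markdown!(doc)

Enum.each([html: html, json: json, xml: xml, markdown: markdown], fn {format, output} ->
  IO.puts("#{format}:\n#{output}\n")
end)
```

Parsing once and reusing the document avoids re-parsing the same Markdown for every target format.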

Types

extension_options()

@type extension_options() :: [
  strikethrough: boolean(),
  tagfilter: boolean(),
  table: boolean(),
  autolink: boolean(),
  tasklist: boolean(),
  superscript: boolean(),
  header_ids: binary() | nil,
  footnotes: boolean(),
  description_lists: boolean(),
  front_matter_delimiter: binary() | nil,
  multiline_block_quotes: boolean(),
  alerts: boolean(),
  math_dollars: boolean(),
  math_code: boolean(),
  shortcodes: boolean(),
  wikilinks_title_after_pipe: boolean(),
  wikilinks_title_before_pipe: boolean(),
  underline: boolean(),
  subscript: boolean(),
  spoiler: boolean(),
  greentext: boolean(),
  image_url_rewriter: binary() | nil,
  link_url_rewriter: binary() | nil
]

List of comrak extension options.

Example

MDEx.to_html!("~~strikethrough~~", extension: [strikethrough: true])
#=> "<p><del>strikethrough</del></p>"
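Options that take a string template are passed the same way as boolean options. The sketch below reuses the link and template from the :link_url_rewriter description on this page; nothing beyond to_html!/2 and that option is assumed.

```elixir
# Assumes :mdex is available in the project.
html =
  MDEx.to_html!(
    "[my link](http://unsafe.example.com/bad)",
    # {@url} is replaced with the original link destination.
    extension: [link_url_rewriter: "https://safe.example.com/norefer?url={@url}"]
  )

IO.puts(html)
```

Per the option description, the rendered href should point to the safe.example.com URL with the original destination appended.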

options()

@type options() :: [
  document: markdown :: String.t() | MDEx.Document.t(),
  extension: extension_options(),
  parse: parse_options(),
  render: render_options(),
  syntax_highlight: syntax_highlight_options() | nil,
  sanitize: sanitize_options() | nil,
  features: keyword()
]

Options to customize the parsing and rendering of Markdown documents.

Examples

  • Enable the table extension:

      MDEx.to_html!("""
      | lang |
      |------|
      | elixir |
      """,
      extension: [table: true]
      )
  • Syntax highlight using inline style and the github_light theme:

      MDEx.to_html!("""
      ## Code Example
    
      ```elixir
      Atom.to_string(:elixir)
      ```
      """,
      syntax_highlight: [
        formatter: {:html_inline, theme: "github_light"}
      ])
  • Sanitize HTML output, in this example disallow <a> tags:

      MDEx.to_html!("""
      ## Links won't be displayed
    
      <a href="https://example.com">Example</a>
      """,
      sanitize: [
        rm_tags: ["a"]
      ])

Options

  • :document - Markdown document, either a string or a MDEx.Document struct. The default value is "".

  • :extension (keyword/0) - Enable extensions. See comrak's ExtensionOptions for more info and examples. The default value is [].

    • :strikethrough (boolean/0) - Enables the strikethrough extension from the GFM spec. The default value is false.

    • :tagfilter (boolean/0) - Enables the tagfilter extension from the GFM spec. The default value is false.

    • :table (boolean/0) - Enables the table extension from the GFM spec. The default value is false.

    • :autolink (boolean/0) - Enables the autolink extension from the GFM spec. The default value is false.

    • :tasklist (boolean/0) - Enables the task list extension from the GFM spec. The default value is false.

    • :superscript (boolean/0) - Enables the superscript Comrak extension. The default value is false.

    • :header_ids - Enables the header IDs Comrak extension. The default value is nil.

    • :footnotes (boolean/0) - Enables the footnotes extension per cmark-gfm. The default value is false.

    • :description_lists (boolean/0) - Enables the description lists extension. The default value is false.

    • :front_matter_delimiter - Enables the front matter extension. The default value is nil.

    • :multiline_block_quotes (boolean/0) - Enables the multiline block quotes extension. The default value is false.

    • :alerts (boolean/0) - Enables GitHub style alerts. The default value is false.

    • :math_dollars (boolean/0) - Enables math using dollar syntax. The default value is false.

    • :math_code (boolean/0) - Enables the math code extension from the GFM spec. The default value is false.

    • :shortcodes (boolean/0) - Phrases wrapped inside of ':' blocks will be replaced with emojis. The default value is false.

    • :wikilinks_title_after_pipe (boolean/0) - Enables wikilinks using title after pipe syntax. The default value is false.

    • :wikilinks_title_before_pipe (boolean/0) - Enables wikilinks using title before pipe syntax. The default value is false.

    • :underline (boolean/0) - Enables underlines using double underscores. The default value is false.

    • :subscript (boolean/0) - Enables subscript text using single tildes. The default value is false.

    • :spoiler (boolean/0) - Enables spoilers using double vertical bars. The default value is false.

    • :greentext (boolean/0) - Requires at least one space after a > character to generate a blockquote, and restarts blockquote nesting across unique lines of input. The default value is false.

    • :image_url_rewriter - Wraps embedded image URLs using a string template.

      Example:

      Given this image ![alt text](http://unsafe.com/image.png) and this rewriter:

      image_url_rewriter: "https://example.com?url={@url}"

      Renders <p><img src="https://example.com?url=http://unsafe.com/image.png" alt="alt text" /></p>

      Notes:

      • The @url assign is always passed to the template.
      • Function callbacks are not supported, only string templates. Transform the Document AST for more complex cases.

      The default value is nil.

    • :link_url_rewriter - Wraps link URLs using a string template.

      Example:

      Given this link [my link](http://unsafe.example.com/bad) and this rewriter:

      link_url_rewriter: "https://safe.example.com/norefer?url={@url}"

      Renders <p><a href="https://safe.example.com/norefer?url=http://unsafe.example.com/bad">my link</a></p>

      Notes:

      • The @url assign is always passed to the template.
      • Function callbacks are not supported, only string templates. Transform the Document AST for more complex cases.

      The default value is nil.

  • :parse (keyword/0) - Configure parsing behavior. See comrak's ParseOptions for more info and examples. The default value is [].

    • :smart (boolean/0) - Punctuation (quotes, full stops, and hyphens) is converted into 'smart' punctuation. The default value is false.

    • :default_info_string - The default info string for fenced code blocks. The default value is nil.

    • :relaxed_tasklist_matching (boolean/0) - Whether only a simple x or X marks a tasklist item as checked, or any other symbol is allowed as well. The default value is false.

    • :relaxed_autolinks (boolean/0) - Relax parsing of autolinks: allow links to be detected inside brackets and allow all URL schemes. It is intended to allow a very specific type of autolink detection, such as [this http://and.com that] or {http://foo.com}, on a best-effort basis. The default value is true.

  • :render (keyword/0) - Configure rendering behavior. See comrak's RenderOptions for more info and examples. The default value is [].

    • :hardbreaks (boolean/0) - Soft line breaks in the input translate into hard line breaks in the output. The default value is false.

    • :github_pre_lang (boolean/0) - GitHub-style <pre lang="xyz"> is used for fenced code blocks with info tags. The default value is false.

    • :full_info_string (boolean/0) - Enable full info strings for code blocks. The default value is false.

    • :width (integer/0) - The wrap column when outputting CommonMark. The default value is 0.

    • :unsafe (boolean/0) - Allow rendering of raw HTML and potentially dangerous links. The default value is false.

    • :escape (boolean/0) - Escape raw HTML instead of clobbering it. The default value is false.

    • :list_style - Set the type of bullet list marker to use. Either one of :dash, :plus, or :star. The default value is :dash.

    • :sourcepos (boolean/0) - Include source position attributes in HTML and XML output. The default value is false.

    • :escaped_char_spans (boolean/0) - Wrap escaped characters in a <span> to allow any post-processing to recognize them. The default value is false.

    • :ignore_setext (boolean/0) - Ignore setext headings in input. The default value is false.

    • :ignore_empty_links (boolean/0) - Ignore empty links in input. The default value is false.

    • :gfm_quirks (boolean/0) - Enables GFM quirks in HTML output which break CommonMark compatibility. The default value is false.

    • :prefer_fenced (boolean/0) - Prefer fenced code blocks when outputting CommonMark. The default value is false.

    • :figure_with_caption (boolean/0) - Render the image as a figure element with the title as its caption. The default value is false.

    • :tasklist_classes (boolean/0) - Add classes to the output of the tasklist extension. This allows tasklists to be styled. The default value is false.

    • :ol_width (integer/0) - Render ordered list with a minimum marker width. Having a width lower than 3 doesn't do anything. The default value is 1.

    • :experimental_minimize_commonmark (boolean/0) - Minimise escapes used in CommonMark output (-t commonmark) by removing each individually and seeing if the resulting document roundtrips. Brute-force and expensive, but produces nicer output. Note that the result may not in fact be minimal. The default value is false.

  • :syntax_highlight - Apply syntax highlighting to code blocks.

    Examples:

      syntax_highlight: [formatter: {:html_inline, theme: "github_dark"}]
    
      syntax_highlight: [formatter: {:html_linked, theme: "github_light"}]

    See Autumn for more info and examples.

    The default value is [formatter: {:html_inline, [theme: "onedark"]}].

  • :sanitize - Cleans HTML using ammonia after rendering.

    It's disabled by default but you can enable its conservative set of default options as:

    [sanitize: MDEx.default_sanitize_options()]

    Or customize one of the options. For example, to disallow <a> tags:

    [sanitize: [rm_tags: ["a"]]]

    The example above merges rm_tags: ["a"] into the default set of options, which is essentially the same as:

    sanitize = Keyword.put(MDEx.default_sanitize_options(), :rm_tags, ["a"])
    [sanitize: sanitize]

    See the Safety section for more info.

    The default value is nil.

  • :features (keyword/0) - This option is deprecated. Use :syntax_highlight or :sanitize instead.

    • :sanitize - This option is deprecated. Use :sanitize (in :options) instead.

    • :syntax_highlight_theme - This option is deprecated. Use :syntax_highlight (in :options) instead.

    • :syntax_highlight_inline_style (boolean/0) - This option is deprecated. Use :syntax_highlight (in :options) instead.

parse_options()

@type parse_options() :: [
  smart: boolean(),
  default_info_string: binary() | nil,
  relaxed_tasklist_matching: boolean(),
  relaxed_autolinks: boolean()
]

List of comrak parse options.

Example

MDEx.to_html!("\"Hello\" -- world...", parse: [smart: true])
#=> "<p>“Hello” – world…</p>"

render_options()

@type render_options() :: [
  hardbreaks: boolean(),
  github_pre_lang: boolean(),
  full_info_string: boolean(),
  width: integer(),
  unsafe: boolean(),
  escape: boolean(),
  list_style: term(),
  sourcepos: boolean(),
  escaped_char_spans: boolean(),
  ignore_setext: boolean(),
  ignore_empty_links: boolean(),
  gfm_quirks: boolean(),
  prefer_fenced: boolean(),
  figure_with_caption: boolean(),
  tasklist_classes: boolean(),
  ol_width: integer(),
  experimental_minimize_commonmark: boolean()
]

List of comrak render options.

Example

MDEx.to_html!("<script>alert('xss')</script>", render: [unsafe: true])
#=> "<script>alert('xss')</script>"
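To compare how the raw HTML related render options behave, a small sketch can render the same input three ways. The input string is an arbitrary assumption, and the comments paraphrase the option descriptions on this page rather than guaranteeing exact output.

```elixir
# Assumes :mdex is available in the project.
raw = "<b>bold</b> text"

# Default (unsafe: false): raw HTML is not rendered as-is.
default_html = MDEx.to_html!(raw)

# escape: true escapes the raw HTML instead of clobbering it.
escaped_html = MDEx.to_html!(raw, render: [escape: true])

# unsafe: true passes raw HTML through; only use it with trusted input.
unsafe_html = MDEx.to_html!(raw, render: [unsafe: true])

Enum.each([default: default_html, escape: escaped_html, unsafe: unsafe_html], fn {label, html} ->
  IO.puts("#{label}: #{html}")
end)
```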

sanitize_options()

@type sanitize_options() :: [
  tags: [binary()],
  add_tags: [binary()],
  rm_tags: [binary()],
  clean_content_tags: [binary()],
  add_clean_content_tags: [binary()],
  rm_clean_content_tags: [binary()],
  tag_attributes: %{optional(binary()) => [binary()]},
  add_tag_attributes: %{optional(binary()) => [binary()]},
  rm_tag_attributes: %{optional(binary()) => [binary()]},
  tag_attribute_values: %{
    optional(binary()) => %{optional(binary()) => [binary()]}
  },
  add_tag_attribute_values: %{
    optional(binary()) => %{optional(binary()) => [binary()]}
  },
  rm_tag_attribute_values: %{
    optional(binary()) => %{optional(binary()) => [binary()]}
  },
  set_tag_attribute_values: %{
    optional(binary()) => %{optional(binary()) => binary()}
  },
  set_tag_attribute_value: %{
    optional(binary()) => %{optional(binary()) => binary()}
  },
  rm_set_tag_attribute_value: %{optional(binary()) => binary()},
  generic_attribute_prefixes: [binary()],
  add_generic_attribute_prefixes: [binary()],
  rm_generic_attribute_prefixes: [binary()],
  generic_attributes: [binary()],
  add_generic_attributes: [binary()],
  rm_generic_attributes: [binary()],
  url_schemes: [binary()],
  add_url_schemes: [binary()],
  rm_url_schemes: [binary()],
  url_relative: term() | {atom(), binary()} | {atom(), {binary(), binary()}},
  link_rel: binary() | nil,
  allowed_classes: %{optional(binary()) => [binary()]},
  add_allowed_classes: %{optional(binary()) => [binary()]},
  rm_allowed_classes: %{optional(binary()) => [binary()]},
  strip_comments: boolean(),
  id_prefix: binary() | nil
]

List of ammonia options.

Example

iex> MDEx.to_html!("<h1>Title</h1><p>Content</p>", sanitize: [rm_tags: ["h1"]], render: [unsafe: true])
"Title<p>Content</p>"
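The same result can be sketched by starting from the defaults explicitly, mirroring the Keyword.put/3 approach described under the :sanitize option. Only default_sanitize_options/0 and to_html!/2 are used; the input is the one from the example above.

```elixir
# Assumes :mdex is available in the project.
markdown = "<h1>Title</h1><p>Content</p>"

# Start from the library defaults and override a single key.
sanitize = Keyword.put(MDEx.default_sanitize_options(), :rm_tags, ["h1"])

# unsafe: true lets the raw HTML reach the sanitizer instead of being dropped earlier.
html = MDEx.to_html!(markdown, sanitize: sanitize, render: [unsafe: true])
IO.puts(html)
```

Building the keyword list explicitly is handy when several keys need to be overridden at once or when the options are shared across calls.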

source()

@type source() :: markdown :: String.t() | MDEx.Document.t() | MDEx.Pipe.t()

Input source document.

Examples

  • From Markdown to HTML

    iex> MDEx.to_html!("# Hello")
    "<h1>Hello</h1>"
  • From Markdown to MDEx.Document

    iex> MDEx.parse_document!("Hello")
    %MDEx.Document{
      nodes: [
        %MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Hello"}]}
      ]
    }
  • From MDEx.Document to HTML

    iex> MDEx.to_html!(%MDEx.Document{
    ...>   nodes: [
    ...>     %MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Hello"}]}
    ...>   ]
    ...> })
    "<p>Hello</p>"

You can also leverage MDEx.Document as an intermediate data type to convert between formats:

  • From JSON to HTML:

    iex> json = ~s|{"nodes":[{"nodes":[{"literal":"Hello","node_type":"MDEx.Text"}],"level":1,"setext":false,"node_type":"MDEx.Heading"}],"node_type":"MDEx.Document"}|
    iex> {:json, json} |> MDEx.parse_document!() |> MDEx.to_html!()
    "<h1>Hello</h1>"
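A short round trip sketch, assuming only the functions shown on this page: Markdown is serialized to JSON, parsed back into an MDEx.Document, and rendered as HTML.

```elixir
# Assumes :mdex is available in the project.
json = MDEx.to_json!("# Hello")

html =
  {:json, json}
  |> MDEx.parse_document!()
  |> MDEx.to_html!()

IO.puts(html)
```

This pattern can be useful for storing a parsed document (for example in a cache or database) and rendering it later without the original Markdown.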

syntax_highlight_options()

@type syntax_highlight_options() :: [{:formatter, Autumn.formatter()}]

Applies syntax highlighting to code blocks using autumn.

Example

MDEx.to_html!("""
```elixir
{:mdex, "~> 0.1"}
```
""", syntax_highlight: [formatter: {:html_inline, theme: "nord"}])
#=> <pre class="athl" style="color: #d8dee9; background-color: #2e3440;"><code class="language-elixir" translate="no" tabindex="0"><span class="line" data-line="1"><span style="color: #88c0d0;">&lbrace;</span><span style="color: #ebcb8b;">:mdex</span><span style="color: #88c0d0;">,</span> <span style="color: #a3be8c;">&quot;~&gt; 0.1&quot;</span><span style="color: #88c0d0;">&rbrace;</span>
#=> </span></code></pre>

Functions

anchorize(text)

@spec anchorize(String.t()) :: String.t()

Converts a given text string to a format that can be used as an "anchor", such as in a Table of Contents.

This uses the same algorithm GFM uses for anchor ids, so it can be used reliably.

Repeated anchors

GFM will dedupe multiple repeated anchors with the same value by appending an incrementing number to the end of the anchor. That is beyond the scope of this function, so you will have to handle it yourself.
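One possible way to handle repeated anchors is sketched below. AnchorDedupe is a hypothetical helper, not part of MDEx, and the "-1", "-2" suffix scheme is an assumption modeled on the GFM behavior described above. It does not guard against a suffixed anchor colliding with a heading that already has that exact name.

```elixir
defmodule AnchorDedupe do
  @moduledoc "Hypothetical helper (not part of MDEx) that makes a list of anchors unique."

  # Keeps the first occurrence unchanged and appends -1, -2, ... to later repeats.
  def dedupe(anchors) when is_list(anchors) do
    {deduped, _seen} =
      Enum.map_reduce(anchors, %{}, fn anchor, seen ->
        count = Map.get(seen, anchor, 0)
        unique = if count == 0, do: anchor, else: "#{anchor}-#{count}"
        {unique, Map.put(seen, anchor, count + 1)}
      end)

    deduped
  end
end

# Input is a list of anchors, such as the values returned by MDEx.anchorize/1.
AnchorDedupe.dedupe(["intro", "usage", "intro", "intro"])
# => ["intro", "usage", "intro-1", "intro-2"]
```

In practice the input list would come from mapping heading texts through MDEx.anchorize/1, in document order.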

Examples

iex> MDEx.anchorize("Hello World")
"hello-world"

iex> MDEx.anchorize("Hello, World!")
"hello-world"

iex> MDEx.anchorize("Hello -- World")
"hello----world"

iex> MDEx.anchorize("Hello World 123")
"hello-world-123"

iex> MDEx.anchorize("你好世界")
"你好世界"

default_extension_options()

@spec default_extension_options() :: extension_options()

Returns the default options for the :extension group.

default_parse_options()

@spec default_parse_options() :: parse_options()

Returns the default options for the :parse group.

default_render_options()

@spec default_render_options() :: render_options()

Returns the default options for the :render group.

default_sanitize_options()

@spec default_sanitize_options() :: sanitize_options()

Returns the default options for the :sanitize group.

default_syntax_highlight_options()

@spec default_syntax_highlight_options() :: syntax_highlight_options()

Returns the default options for the :syntax_highlight group.

new(options \\ [])

@spec new(options()) :: MDEx.Pipe.t()

Builds a new MDEx.Pipe instance.

Once the pipe is complete, call one of the to_* functions (to_html/2, to_json/2, to_xml/2, or to_markdown/2) to format the document.

Examples

  • Build a pipe with :document:

    iex> mdex = MDEx.new(document: "# Hello")
    iex> MDEx.to_html(mdex)
    {:ok, "<h1>Hello</h1>"}
    
    iex> mdex = MDEx.new(document: "Hello ~world~", extension: [strikethrough: true])
    iex> MDEx.to_json(mdex)
    {:ok, ~s|{"nodes":[{"nodes":[{"literal":"Hello ","node_type":"MDEx.Text"},{"nodes":[{"literal":"world","node_type":"MDEx.Text"}],"node_type":"MDEx.Strikethrough"}],"node_type":"MDEx.Paragraph"}],"node_type":"MDEx.Document"}|}
  • Pass a :document when formatting:

    iex> mdex = MDEx.new(extension: [strikethrough: true])
    iex> MDEx.to_xml(mdex, document: "Hello ~world~")
    {:ok, ~s|<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE document SYSTEM "CommonMark.dtd">
    <document xmlns="http://commonmark.org/xml/1.0">
      <paragraph>
        <text xml:space="preserve">Hello </text>
        <strikethrough>
          <text xml:space="preserve">world</text>
        </strikethrough>
      </paragraph>
    </document>|}

Notes

  1. Source :document is automatically parsed into MDEx.Document before the pipeline runs so every step receives the same data type.

  2. You can pass the document when creating the pipe:

MDEx.new(document: "# Hello") |> MDEx.to_html()

Or pass it only when formatting, which is useful for reusing the same pipe with different documents and formats:

mdex = MDEx.new()
# ... attach plugins and steps

MDEx.to_html(mdex, document: "# Hello HTML")
MDEx.to_json(mdex, document: "# Hello JSON")

parse_document(source, options \\ [])

@spec parse_document(markdown :: String.t() | {:json, String.t()}, options()) ::
  {:ok, MDEx.Document.t()} | {:error, any()}

Parses source and returns an MDEx.Document.

Source can be either a Markdown string or a tagged JSON string.

Examples

  • Parse Markdown with default options:

    iex> MDEx.parse_document!("""
    ...> # Languages
    ...>
    ...> - Elixir
    ...> - Rust
    ...> """)
    %MDEx.Document{
      nodes: [
        %MDEx.Heading{nodes: [%MDEx.Text{literal: "Languages"}], level: 1, setext: false},
        %MDEx.List{
          nodes: [
            %MDEx.ListItem{
              nodes: [%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Elixir"}]}],
              list_type: :bullet,
              marker_offset: 0,
              padding: 2,
              start: 1,
              delimiter: :period,
              bullet_char: "-",
              tight: false
            },
            %MDEx.ListItem{
              nodes: [%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Rust"}]}],
              list_type: :bullet,
              marker_offset: 0,
              padding: 2,
              start: 1,
              delimiter: :period,
              bullet_char: "-",
              tight: false
            }
          ],
          list_type: :bullet,
          marker_offset: 0,
          padding: 2,
          start: 1,
          delimiter: :period,
          bullet_char: "-",
          tight: true
        }
      ]
    }
  • Parse Markdown with custom options:

    iex> MDEx.parse_document!("Darth Vader is ||Luke's father||", extension: [spoiler: true])
    %MDEx.Document{
      nodes: [
        %MDEx.Paragraph{
          nodes: [
            %MDEx.Text{literal: "Darth Vader is "},
            %MDEx.SpoileredText{nodes: [%MDEx.Text{literal: "Luke's father"}]}
          ]
        }
      ]
    }
  • Parse JSON:

    iex> json = ~s|{"nodes":[{"nodes":[{"literal":"Title","node_type":"MDEx.Text"}],"level":1,"setext":false,"node_type":"MDEx.Heading"}],"node_type":"MDEx.Document"}|
    iex> MDEx.parse_document!({:json, json})
    %MDEx.Document{
      nodes: [
        %MDEx.Heading{
          nodes: [%MDEx.Text{literal: "Title"}],
          level: 1,
          setext: false
        }
      ]
    }
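When failures should be handled rather than raised, the tagged tuples returned by the non-raising functions compose with `with`. MarkdownRenderer below is a hypothetical wrapper module, shown only as a sketch of that pattern.

```elixir
defmodule MarkdownRenderer do
  @moduledoc "Hypothetical wrapper (not part of MDEx) around the non-raising API."

  # Returns {:ok, html}, or the first {:error, reason} produced by either step.
  def render(markdown) when is_binary(markdown) do
    with {:ok, doc} <- MDEx.parse_document(markdown),
         {:ok, html} <- MDEx.to_html(doc) do
      {:ok, html}
    end
  end
end

# Assumes :mdex is available in the project.
IO.inspect(MarkdownRenderer.render("# Hello"))
```

Any value that does not match {:ok, _} falls through `with` unchanged, so callers can pattern match on the error tuple themselves.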

parse_document!(source, options \\ [])

@spec parse_document!(markdown :: String.t() | {:json, String.t()}, options()) ::
  MDEx.Document.t()

Same as parse_document/2 but raises if the parsing fails.

parse_fragment(markdown, options \\ [])

@spec parse_fragment(String.t(), options()) :: {:ok, MDEx.Document.md_node()} | nil

Parses a Markdown string and returns only the node that represents the fragment.

Usually that means filtering out the parent document and paragraphs.

That's useful to generate fragment nodes and inject them into the document when you're manipulating it.

Use parse_document/2 to generate a complete document.

Experimental

Consider this function experimental and subject to change.

Examples

iex> MDEx.parse_fragment("# Elixir")
{:ok, %MDEx.Heading{nodes: [%MDEx.Text{literal: "Elixir"}], level: 1, setext: false}}

iex> MDEx.parse_fragment("<h1>Elixir</h1>")
{:ok, %MDEx.HtmlBlock{nodes: [], block_type: 6, literal: "<h1>Elixir</h1>\n"}}
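A minimal sketch of injecting a fragment into an existing document, assuming the :nodes field of MDEx.Document can be updated with a plain struct update. The heading texts are arbitrary.

```elixir
# Assumes :mdex is available in the project.
doc = MDEx.parse_document!("# Languages")

doc =
  case MDEx.parse_fragment("## Elixir") do
    # Append the parsed node to the document's children.
    {:ok, fragment} -> %{doc | nodes: doc.nodes ++ [fragment]}
    # parse_fragment/2 may return nil; keep the document unchanged in that case.
    nil -> doc
  end

html = MDEx.to_html!(doc)
IO.puts(html)
```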

parse_fragment!(markdown, options \\ [])

@spec parse_fragment!(String.t(), options()) :: MDEx.Document.md_node()

Same as parse_fragment/2 but raises if the parsing fails or returns nil.

Experimental

Consider this function experimental and subject to change.

safe_html(unsafe_html, options \\ [])

@spec safe_html(
  String.t(),
  options :: [sanitize: sanitize_options() | nil, escape: [atom()]]
) :: String.t()

Utility function to sanitize and escape HTML.

Examples

iex> MDEx.safe_html("<script>console.log('attack')</script>")
""

iex> MDEx.safe_html("<custom_tag>Hello</custom_tag>")
"Hello"

iex> MDEx.safe_html("<custom_tag>Hello</custom_tag>", sanitize: [add_tags: ["custom_tag"]], escape: [content: false])
"<custom_tag>Hello</custom_tag>"

iex> MDEx.safe_html("<h1>{'Example:'}</h1><code>{:ok, 'MDEx'}</code>")
"&lt;h1&gt;{&#x27;Example:&#x27;}&lt;&#x2f;h1&gt;&lt;code&gt;&lbrace;:ok, &#x27;MDEx&#x27;&rbrace;&lt;&#x2f;code&gt;"

iex> MDEx.safe_html("<h1>{'Example:'}</h1><code>{:ok, 'MDEx'}</code>", escape: [content: false])
"<h1>{'Example:'}</h1><code>&lbrace;:ok, 'MDEx'&rbrace;</code>"

Options

  • :sanitize - cleans HTML after rendering. Defaults to MDEx.default_sanitize_options/0.

  • :escape - which entities should be escaped. Defaults to [:content, :curly_braces_in_code].

    • :content - escape common chars like <, >, &, and others in the HTML content;
    • :curly_braces_in_code - escape { and } only inside <code> tags, particularly useful for compiling HTML in LiveView;

to_commonmark(document)

This function is deprecated. Use `to_markdown/1` instead.

to_commonmark(document, options)

This function is deprecated. Use `to_markdown/2` instead.

to_commonmark!(document)

This function is deprecated. Use `to_markdown!/1` instead.

to_commonmark!(document, options)

This function is deprecated. Use `to_markdown!/2` instead.

to_html(source, options \\ [])

@spec to_html(source(), options()) ::
  {:ok, String.t()}
  | {:error, MDEx.DecodeError.t()}
  | {:error, MDEx.InvalidInputError.t()}

Convert Markdown, MDEx.Document, or MDEx.Pipe to HTML.

Examples

iex> MDEx.to_html("# MDEx")
{:ok, "<h1>MDEx</h1>"}

iex> MDEx.to_html("Implemented with:\n1. Elixir\n2. Rust")
{:ok, "<p>Implemented with:</p>\n<ol>\n<li>Elixir</li>\n<li>Rust</li>\n</ol>"}

iex> MDEx.to_html(%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "MDEx"}], level: 3, setext: false}]})
{:ok, "<h3>MDEx</h3>"}

iex> MDEx.to_html("Hello ~world~ there", extension: [strikethrough: true])
{:ok, "<p>Hello <del>world</del> there</p>"}

iex> MDEx.to_html("<marquee>visit https://beaconcms.org</marquee>", extension: [autolink: true], render: [unsafe: true])
{:ok, "<p><marquee>visit <a href=\"https://beaconcms.org\">https://beaconcms.org</a></marquee></p>"}

Fragments of a document are also supported:

iex> MDEx.to_html(%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "MDEx"}]})
{:ok, "<p>MDEx</p>"}

to_html!(source, options \\ [])

@spec to_html!(source(), options()) :: String.t()

Same as to_html/2 but raises an error if the conversion fails.

to_json(source)

@spec to_json(source()) ::
  {:ok, String.t()}
  | {:error, MDEx.DecodeError.t()}
  | {:error, MDEx.InvalidInputError.t()}

Convert Markdown, MDEx.Document, or MDEx.Pipe to JSON using default options.

Use to_json/2 to pass custom options.

Examples

iex> MDEx.to_json("# Hello")
{:ok, ~s|{"nodes":[{"nodes":[{"literal":"Hello","node_type":"MDEx.Text"}],"level":1,"setext":false,"node_type":"MDEx.Heading"}],"node_type":"MDEx.Document"}|}

iex> MDEx.to_json("1. First\n2. Second")
{:ok, ~s|{"nodes":[{"start":1,"nodes":[{"start":1,"nodes":[{"nodes":[{"literal":"First","node_type":"MDEx.Text"}],"node_type":"MDEx.Paragraph"}],"delimiter":"period","padding":3,"list_type":"ordered","marker_offset":0,"bullet_char":"","tight":false,"is_task_list":false,"node_type":"MDEx.ListItem"},{"start":2,"nodes":[{"nodes":[{"literal":"Second","node_type":"MDEx.Text"}],"node_type":"MDEx.Paragraph"}],"delimiter":"period","padding":3,"list_type":"ordered","marker_offset":0,"bullet_char":"","tight":false,"is_task_list":false,"node_type":"MDEx.ListItem"}],"delimiter":"period","padding":3,"list_type":"ordered","marker_offset":0,"bullet_char":"","tight":true,"is_task_list":false,"node_type":"MDEx.List"}],"node_type":"MDEx.Document"}|}

iex> MDEx.to_json(%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 3, setext: false}]})
{:ok, ~s|{"nodes":[{"nodes":[{"literal":"Hello","node_type":"MDEx.Text"}],"level":3,"setext":false,"node_type":"MDEx.Heading"}],"node_type":"MDEx.Document"}|}

Fragments of a document are also supported:

iex> MDEx.to_json(%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Hello"}]})
{:ok, ~s|{"nodes":[{"nodes":[{"literal":"Hello","node_type":"MDEx.Text"}],"node_type":"MDEx.Paragraph"}],"node_type":"MDEx.Document"}|}

to_json(source, options)

@spec to_json(source(), options()) ::
  {:ok, String.t()}
  | {:error, MDEx.DecodeError.t()}
  | {:error, MDEx.InvalidInputError.t()}

Convert Markdown, MDEx.Document, or MDEx.Pipe to JSON using custom options.

Examples

iex> MDEx.to_json("Hello ~world~", extension: [strikethrough: true])
{:ok, ~s|{"nodes":[{"nodes":[{"literal":"Hello ","node_type":"MDEx.Text"},{"nodes":[{"literal":"world","node_type":"MDEx.Text"}],"node_type":"MDEx.Strikethrough"}],"node_type":"MDEx.Paragraph"}],"node_type":"MDEx.Document"}|}

to_json!(source)

@spec to_json!(source()) :: String.t()

Same as to_json/1 but raises an error if the conversion fails.

to_json!(source, options)

@spec to_json!(source(), options()) :: String.t()

Same as to_json/2 but raises an error if the conversion fails.

to_markdown(source)

@spec to_markdown(MDEx.Document.t() | MDEx.Pipe.t()) ::
  {:ok, String.t()} | {:error, MDEx.DecodeError.t()}

Convert MDEx.Document or MDEx.Pipe to Markdown using default options.

Use to_markdown/2 to pass custom options.

Example

iex> MDEx.to_markdown(%MDEx.Document{nodes: [%MDEx.Heading{nodes: [%MDEx.Text{literal: "Hello"}], level: 3, setext: false}]})
{:ok, "### Hello"}

to_markdown(source, options)

@spec to_markdown(MDEx.Document.t() | MDEx.Pipe.t(), options()) ::
  {:ok, String.t()} | {:error, MDEx.DecodeError.t()}

Convert MDEx.Document or MDEx.Pipe to Markdown using custom options.
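As a sketch, custom options are passed the same way as in the other to_* functions. The example assumes the :list_style render option (documented under options()) also applies to Markdown output; the input list is arbitrary and the exact output is not asserted here.

```elixir
# Assumes :mdex is available in the project.
doc = MDEx.parse_document!("- Elixir\n- Rust")

# Assumption: :list_style selects the bullet marker used when writing Markdown.
{:ok, markdown} = MDEx.to_markdown(doc, render: [list_style: :star])
IO.puts(markdown)
```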

to_markdown!(document)

@spec to_markdown!(MDEx.Document.t()) :: String.t()

Same as to_markdown/1 but raises MDEx.DecodeError if the conversion fails.

to_markdown!(document, options)

@spec to_markdown!(MDEx.Document.t(), options()) :: String.t()

Same as to_markdown/2 but raises MDEx.DecodeError if the conversion fails.

to_xml(source, options \\ [])

@spec to_xml(source(), options()) ::
  {:ok, String.t()}
  | {:error, MDEx.DecodeError.t()}
  | {:error, MDEx.InvalidInputError.t()}

Convert Markdown, MDEx.Document, or MDEx.Pipe to XML.

Examples

iex> {:ok, xml} = MDEx.to_xml("Hello ~world~ there", extension: [strikethrough: true])
iex> xml
~s|<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text xml:space="preserve">Hello </text>
    <strikethrough>
      <text xml:space="preserve">world</text>
    </strikethrough>
    <text xml:space="preserve"> there</text>
  </paragraph>
</document>|

iex> {:ok, xml} = MDEx.to_xml("<marquee>visit https://beaconcms.org</marquee>", extension: [autolink: true], render: [unsafe: true])
iex> xml
~s|<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <html_inline xml:space="preserve">&lt;marquee&gt;</html_inline>
    <text xml:space="preserve">visit </text>
    <link destination="https://beaconcms.org" title="">
      <text xml:space="preserve">https://beaconcms.org</text>
    </link>
    <html_inline xml:space="preserve">&lt;/marquee&gt;</html_inline>
  </paragraph>
</document>|

Fragments of a document are also supported:

iex> {:ok, xml} = MDEx.to_xml(%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "MDEx"}]})
iex> xml
~s|<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text xml:space="preserve">MDEx</text>
  </paragraph>
</document>|

to_xml!(source, options \\ [])

@spec to_xml!(source(), options()) :: String.t()

Same as to_xml/2 but raises an error if the conversion fails.

traverse_and_update(ast, fun)

@spec traverse_and_update(MDEx.Document.t(), (MDEx.Document.md_node() ->
                                          MDEx.Document.md_node())) ::
  MDEx.Document.t()

Low-level function to traverse and update the Markdown document preserving the tree structure format.

See MDEx.Document for more information about the tree structure and for higher-level functions using the Access and Enumerable protocols.

Examples

Traverse an entire Markdown document:

iex> import MDEx.Sigil
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> MDEx.traverse_and_update(doc, fn
...>   %MDEx.Code{literal: "elixir"} = node -> %{node | literal: "ex"}
...>   %MDEx.Code{literal: "rust"} = node -> %{node | literal: "rs"}
...>   node -> node
...> end)
%MDEx.Document{
  nodes: [
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Languages"}], level: 1, setext: false},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "ex"}]},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "rs"}]}
  ]
}

Or fragments of a document:

iex> fragment = MDEx.parse_fragment!("Lang: `elixir`")
iex> MDEx.traverse_and_update(fragment, fn
...>   %MDEx.Code{literal: "elixir"} = node -> %{node | literal: "ex"}
...>   node -> node
...> end)
%MDEx.Paragraph{nodes: [%MDEx.Text{literal: "Lang: "}, %MDEx.Code{num_backticks: 1, literal: "ex"}]}

traverse_and_update(ast, acc, fun)

@spec traverse_and_update(MDEx.Document.t(), any(), (MDEx.Document.md_node() ->
                                                 MDEx.Document.md_node())) ::
  MDEx.Document.t()

Low-level function to traverse and update the Markdown document preserving the tree structure format and keeping an accumulator.

See MDEx.Document for more information about the tree structure and for higher-level functions using the Access and Enumerable protocols.

Example

iex> import MDEx.Sigil
iex> doc = ~MD"""
...> # Languages
...>
...> `elixir`
...>
...> `rust`
...> """
iex> MDEx.traverse_and_update(doc, 0, fn
...>   %MDEx.Code{literal: "elixir"} = node, acc -> {%{node | literal: "ex"}, acc + 1}
...>   %MDEx.Code{literal: "rust"} = node, acc -> {%{node | literal: "rs"}, acc + 1}
...>   node, acc -> {node, acc}
...> end)
{%MDEx.Document{
  nodes: [
    %MDEx.Heading{nodes: [%MDEx.Text{literal: "Languages"}], level: 1, setext: false},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "ex"}]},
    %MDEx.Paragraph{nodes: [%MDEx.Code{num_backticks: 1, literal: "rs"}]}
  ]
}, 2}

Also works with fragments.
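The accumulator can also collect data without changing the tree. The sketch below gathers heading entries for a table of contents, combining traverse_and_update/3 with anchorize/1. It only handles headings whose single child is plain text, which is a simplifying assumption.

```elixir
# Assumes :mdex is available in the project.
doc = MDEx.parse_document!("# Intro\n\ntext\n\n## Usage")

{_doc, headings} =
  MDEx.traverse_and_update(doc, [], fn
    # Record {level, text, anchor} for simple headings and leave the node as-is.
    %MDEx.Heading{level: level, nodes: [%MDEx.Text{literal: text}]} = node, acc ->
      {node, [{level, text, MDEx.anchorize(text)} | acc]}

    # All other nodes pass through without touching the accumulator.
    node, acc ->
      {node, acc}
  end)

# Entries were prepended, so reverse them to restore visiting order.
toc = Enum.reverse(headings)
IO.inspect(toc)
```

Headings with inline formatting (emphasis, code, links) would need an extra step to flatten their child nodes into text before calling anchorize/1.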