Md (md v0.9.1)

Md is a markup parser that allows fully customized syntax definitions and understands a wide range of markdown out of the box.

It is stream-aware, extendable, flexible, and blazingly fast, with support for callbacks and more.

Main Focus

This library is not yet another markdown parser; rather, it is a highly configurable and extendable parser for any custom markdown-like markup. It was created mostly to allow custom markdown syntax, like ^foo^ for superscript or ⇓bar⇓ for subscript. It also supports custom parsers for anything markdown-inspired that the generic parsers cannot handle (that is, something more complex than standard markdown provides).

The library provides callbacks for all the default syntax handlers, as well as for custom handlers, allowing on-the-fly modification of whatever is currently being processed.

Md parses the incoming stream once and keeps the state, producing an AST of the input document. It can recover from errors, collecting them along the way.
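The doctest for parse/2 at the end of this page shows the AST as a list of {tag, attributes, children} tuples with plain strings as leaves. As a minimal sketch of how such a tree could be consumed, the helper below collects its plain text. AstText is a hypothetical module, not part of Md, and the nested AST literal is an assumption about how inline markup nests inside a paragraph.

```elixir
defmodule AstText do
  @moduledoc false
  # Hypothetical helper, not part of Md: collects the plain text of an AST
  # shaped like the `ast` field of %Md.Parser.State{}.

  # A list of nodes: concatenate the text of each node.
  def text(nodes) when is_list(nodes), do: Enum.map_join(nodes, &text/1)

  # An element node: ignore tag and attributes, descend into the children.
  def text({_tag, _attrs, children}), do: text(children)

  # A leaf: the text is kept as is.
  def text(leaf) when is_binary(leaf), do: leaf
end

# Assumed AST; the nesting of :b inside :p is an assumption, not taken from Md.
ast = [{:p, nil, ["It’s all ", {:b, nil, ["bold"]}, "!"]}]

IO.puts(AstText.text(ast))
# prints: It’s all bold!
```

Because the tree is plain Elixir data, ordinary pattern matching is enough to walk it.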

It currently does not support (and I frankly doubt it ever will) lists with embedded quotes and other contrived syntax. If one needs to parse common markdown perfectly, Md is probably not the right choice.

But if one wants to extend the syntax easily and almost without limits, Md might be a good fit.

Markup Handling

Md recognizes several different syntax patterns:

  • custom: a custom parser implementing the Md.Parser behaviour is called
  • substitute: simple substitution, like "<" → "&lt;"
  • escape: characters to be treated as is, not as a part of the syntax
  • comment: characters to be treated as a comment and discarded in the output
  • flush: something that breaks the paragraph flow, like a triple dash
  • magnet: markup for a single word following the pattern, like #tag
  • block: a whole block of input treated separately, like a triple-backtick block
  • shift: the same as block, but the opening marker should precede each line and "\n" is treated as the closing marker
  • pair: an opening marker followed by a closing marker, and then a subsequent pair of opening and closing markers, like ![name](#anchor); the second element might be an internal shortcut to a deferred disclosure
  • disclosure: the disclosure of elements previously declared as a pair with a deferred parameter
  • paragraph: a header, blockquote, or such, followed by a paragraph flow break
  • list: a list, like - one\n- two
  • tag: allowed tags (e.g. <sup>2</sup>)
  • brace: the most common markdown feature, like text decoration or such (e.g. **bold**)

Syntax description

The syntax must be configured at compile time, because parse/2 handlers are generated at compile time. It is a map with a settings key

settings: %{
  outer: :p,
  span: :span,
  empty_tags: ~w|img hr br|a
}

and key ⇒ list_of_tuples key-value pairs, each providing a textual markup representation and its handling rules. Here is an excerpt from the default parser for braces:

  brace: %{
    "*" => %{tag: :b},
    "_" => %{tag: :i},
    "**" => %{tag: :strong, attributes: %{class: "nota-bene"}},
    "__" => %{tag: :em},
    "~" => %{tag: :s},
    "~~" => %{tag: :del},
    "`" => %{tag: :code, mode: :raw, attributes: %{class: "code-inline"}}
  }

For more examples of the properties allowed for each kind of handler, see the sources (for the time being).
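To make the shape of this configuration concrete, the snippet below assembles the settings and a shortened version of the brace excerpt as plain Elixir data and inspects two values. It is only an illustration of the data structure: the variable name syntax is arbitrary, and nothing here calls into Md.

```elixir
# The settings and a shortened brace excerpt from above, as one plain map.
syntax = %{
  settings: %{
    outer: :p,
    span: :span,
    # ~w|...|a is the word-list sigil with the `a` modifier: a list of atoms.
    empty_tags: ~w|img hr br|a
  },
  brace: %{
    "*" => %{tag: :b},
    "`" => %{tag: :code, mode: :raw, attributes: %{class: "code-inline"}}
  }
}

IO.inspect(syntax.settings.empty_tags)
# prints: [:img, :hr, :br]

# Handling rules are looked up by the literal markup string.
IO.inspect(get_in(syntax, [:brace, "`", :attributes, :class]))
# prints: "code-inline"
```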

Predefined parsers

Md comes with a generic predefined parser Md.Parser.Default, which includes all the markup currently supported by Md.

A custom parser definition would usually be based on the Md.Parser.Syntax.Void syntax, as shown below:

defmodule MyParser do
  use Md.Parser

  alias Md.Parser.Syntax.Void

  @default_syntax Map.put(Void.syntax(), :settings, Void.settings())
  @syntax @default_syntax |> Map.merge(%{
    comment: [{"<!--", %{closing: "-->"}}],
    paragraph: [
      {"##", %{tag: :h2}},
      {"###", %{tag: :h3}},
      {">", %{tag: :blockquote}}
    ],
    list: [
      {"- ", %{tag: :li, outer: :ul}},
      {"+ ", %{tag: :li, outer: :ol}}
    ],
    brace: [
      {"*", %{tag: :b}},
      {"_", %{tag: :i}},
      {"~", %{tag: :s}},
      {"`", %{tag: :code, mode: :raw, attributes: %{class: "code-inline"}}}
    ]
  })
end

Either the @syntax module attribute must be declared, or the DSL must be used as shown below, or the syntax must be passed as an argument in the call to use Md.Parser. Separate declarations are collected and merged.

defmodule MyDSLParser do
  @my_syntax %{brace: [{"***", %{tag: "u"}}]}
  
  use Md.Parser, syntax: @my_syntax
  import Md.Parser.DSL

  comment "<!--", %{closing: "-->"}
  ...
end

Instead of the @syntax module attribute, one might use

  • a parameter to use Md.Parser, as in use Md.Parser, syntax: map()
  • a DSL declaration like paragraph {"#", %{tag: :h1}}.
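As a rough sketch of the first option, the script below passes a syntax map to use Md.Parser that declares the ^foo^ and ⇓bar⇓ braces mentioned under Main Focus. SupSubParser is a hypothetical module name, and the :sup and :sub tags are assumptions modelled on the brace examples above. Whether the declared braces are merged with any defaults, and what exact markup comes out, is not confirmed here, so no expected output is shown.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches md from Hex.
Mix.install([{:md, "~> 0.9"}])

defmodule SupSubParser do
  # Hypothetical parser: only the two custom braces are declared explicitly.
  @my_syntax %{
    brace: [
      {"^", %{tag: :sup}},
      {"⇓", %{tag: :sub}}
    ]
  }

  use Md.Parser, syntax: @my_syntax
end

# generate/3 takes the input, the parser module, and options (see Functions).
html = Md.generate("E = mc^2^ and H⇓2⇓O", SupSubParser, format: :none)
IO.puts(html)
```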

Functions

generate(input)

See Md.Parser.generate/1.

generate(input, options)

See Md.Parser.generate/2.

generate(input, parser, options)

See Md.Parser.generate/3.

parse(input, state \\ %State{})

Interface to the library. Use parse/2 to parse the input into the state, and generate/{1,2,3} to produce HTML from the input.

Examples

iex> Md.parse("   foo")
%Md.Parser.State{ast: [{:p, nil, ["foo"]}], mode: [:finished]}

iex> Regex.replace(~r/\s+/, Md.generate("*bold*"), "")
"<p><b>bold</b></p>"

iex> Md.generate("It’s all *bold* and _italic_!", Md.Parser.Default, format: :none)
"<p>It’s all <b>bold</b> and <i>italic</i>!</p>"