TODO
Features
-
Introduce optional features
- Option record / look at other packages for inspo
-
Optionally strip YAML frontmatter
- tests
- enable/disable flavors and options in frontmatter
-
Footnotes [1]
- tests (well, there is one test)
- implementation
-
Inline Footnotes^[Like this.]
- tests
- implementation
-
Heading IDs [2]
- tests
- implementation
-
Emoji shortcodes [3]
- tests
- implementation
-
Definition Lists [4]
- tests
- implementation
-
Smartypants [5]
- tests
- fancy quotes
- backticks-style quotes (?)
- em dashes
- ellipsis
-
GFM [6]
- test suite
- tables
- task list items
- strikethrough
- autolinks?
-
tagfilter: Probably not, use safe HTML (see below)
-
Obsidian Flavored Markdown [7]
- internal links
- embed files
- block references (?)
- defining a block (?)
-
comments: Probably not, just use HTML comments
- highlights
- callouts
-
Safe HTML
strip or escape HTML blocks and inline HTML
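For the optional YAML frontmatter item above, a minimal sketch of the stripping step could look like the following. It is written in Python purely as an illustration (mork itself is a Gleam package), and `strip_yaml_frontmatter` is a hypothetical helper, not mork's actual function. It assumes frontmatter is a block that starts with `---` on the very first line and ends at the next line that is exactly `---` or `...`.

```python
def strip_yaml_frontmatter(source: str) -> tuple[str, str]:
    """Split a document into (frontmatter, body).

    Assumption: frontmatter opens with `---` on the first line and closes
    at the next line that is exactly `---` or `...`.
    """
    lines = source.split("\n")
    if lines[0].rstrip() != "---":
        # No opening delimiter: the whole input is Markdown.
        return "", source
    for i in range(1, len(lines)):
        if lines[i].rstrip() in ("---", "..."):
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return frontmatter, body
    # Opening delimiter without a closing one: leave the input untouched.
    return "", source
```

Note that a document which legitimately opens with a `---` thematic break and contains another one later would be misread as frontmatter under this assumption, which is one reason to keep the feature optional.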
[1] https://www.markdownguide.org/extended-syntax/#footnotes
[2] https://www.markdownguide.org/extended-syntax/#heading-ids
[3] https://www.markdownguide.org/extended-syntax/#emoji
[4] https://www.markdownguide.org/extended-syntax/#definition-lists
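For the Smartypants item in the feature list above, the dash and ellipsis substitutions can be sketched in a few lines. This is an illustrative Python sketch, not mork's implementation. The function name `smarten` is hypothetical, and the mapping (`---` to an em dash, `--` to an en dash, `...` to an ellipsis) is an assumption, since Smartypants implementations differ on how they treat `--`. Fancy quotes are left out because they need context about the surrounding characters.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"
ELLIPSIS = "\u2026"


def smarten(text: str) -> str:
    """Replace ASCII dash runs and three dots with typographic characters."""
    # Order matters: the longer dash run must be replaced first,
    # otherwise `---` would become an en dash followed by a hyphen.
    text = text.replace("---", EM_DASH)
    text = text.replace("--", EN_DASH)
    text = text.replace("...", ELLIPSIS)
    return text
```

A real implementation would also have to skip code spans, code blocks, and raw HTML, which this sketch does not attempt.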
Performance
-
Create a cache for splitters and regexes
In progress, but a lot of stuff is still not cached.
-
Try to move more of the regex-based parsing to using splitters / BitArray
-
Inline parsing performance:
- Clean up
- Optimize escape parsing a la houdini
- Optimize as much as possible without OTP, then...
- Parse inlines concurrently
-
See if we can use glentities for HTML entity decoding
Note: A first pass at this failed because the CommonMark spec suite expectation doesn’t match the glentities output. :/ May try again and take a closer look at why it fails.
-
Generally minimize copies as much as possible
-
Create a new parser for inlines which isn’t line-based at all, instead of misusing the line-based block parser.
-
When parsing the block structure, we should create inline parsing tasks, store a reference to the task in a task list and push the task to a job scheduler. Then during HTML conversion we can resolve jobs on demand and process them in parallel on the BEAM (and async on JS). So the block structure only contains inline IDs, and you have to query the context for the actual inlines.
-
Use string_tree when producing HTML output instead of just concatenating strings together.
Note: I tried a naive pass at this and everything became super slow.
-
Try building the Gleam website with mork instead of jot
-
Refactor the API
- Rename mork.parse to mork.from_string
- Rename mork.parse_with_options to mork.parse
- Make mork.strip_frontmatter take a Bool argument
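The job scheduler idea from the Performance list (the block structure holds only inline IDs, and the actual inlines are resolved through the context during HTML conversion) might be shaped roughly like this. The sketch is in Python for illustration only. `parse_inlines`, `Paragraph`, `Context`, `schedule`, `resolve`, and `render` are all hypothetical names, and a thread pool stands in for BEAM processes or JS promises.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass, field


def parse_inlines(raw: str) -> str:
    """Hypothetical stand-in for the real inline parser."""
    return raw.strip().upper()


@dataclass
class Paragraph:
    inline_id: int  # the block stores only an ID, not the parsed inlines


@dataclass
class Context:
    executor: ThreadPoolExecutor
    jobs: dict[int, Future] = field(default_factory=dict)

    def schedule(self, raw: str) -> int:
        """Queue an inline parsing job and return its ID."""
        inline_id = len(self.jobs)
        self.jobs[inline_id] = self.executor.submit(parse_inlines, raw)
        return inline_id

    def resolve(self, inline_id: int) -> str:
        """Wait for the job if needed and return the parsed inlines."""
        return self.jobs[inline_id].result()


def to_html(blocks: list[Paragraph], ctx: Context) -> str:
    # Inlines are only looked up here, at HTML conversion time.
    return "".join(f"<p>{ctx.resolve(b.inline_id)}</p>" for b in blocks)


def render(paragraphs: list[str]) -> str:
    with ThreadPoolExecutor() as executor:
        ctx = Context(executor)
        # Block parsing schedules inline jobs and keeps only their IDs.
        blocks = [Paragraph(ctx.schedule(raw)) for raw in paragraphs]
        return to_html(blocks, ctx)
```

Calling `render(["a ", " b"])` schedules both inline jobs before any of them is resolved, so they can run while block parsing continues. In Python the threads only illustrate the structure; the actual speedup would depend on the target runtime.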