Makeup.Lexer behaviour (Makeup v1.1.0)

A lexer turns raw source code into a list of tokens.

Summary

Callbacks

Lexes a string into a list of tokens.

Matches groups in a list of tokens.

Post-processes a list of tokens before matching the contained groups.

Parses the given string into a parsec result that includes a list of tokens.

Parses the smallest number of tokens that make sense. It's a parsec.

Functions

Merges adjacent tokens of the same type and with the same attributes.

Splits a list of tokens on newline characters ("\n").

Merges the token values into the original string.

Callbacks

Specs

lex(String.t(), list()) :: [Makeup.Lexer.Types.token()]

Lexes a string into a list of tokens.

Specs

match_groups([Makeup.Lexer.Types.token()], String.t()) :: [
  Makeup.Lexer.Types.token()
]

Matches groups in a list of tokens.

Specs

Post-processes a list of tokens before matching the contained groups.

Specs

root(String.t()) :: Makeup.Lexer.Types.parsec_result()

Parses the given string into a parsec result that includes a list of tokens.

Specs

root_element(String.t()) :: Makeup.Lexer.Types.parsec_result()

Parses the smallest number of tokens that make sense. It's a parsec.

Functions

Specs

Merges adjacent tokens of the same type and with the same attributes.

Doing this will require iterating over the list of tokens again, so only do this if you have a good reason.
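As a rough illustration of merging, the sketch below calls `Makeup.Lexer.merge/1` on a hand-written token list. It assumes tokens are `{type, metadata, value}` tuples with a map as metadata (the shape of `Makeup.Lexer.Types.token()` as understood here) and that Makeup can be fetched from Hex with the version requirement shown. The token types are arbitrary examples.

```elixir
# Assumption: Makeup is available from Hex under this version requirement.
Mix.install([{:makeup, "~> 1.1"}])

# Hand-written tokens in the assumed {type, metadata, value} shape.
# The first two tokens share both type and metadata, so they are
# candidates for merging.
tokens = [
  {:name, %{}, "foo"},
  {:name, %{}, "bar"},
  {:whitespace, %{}, " "},
  {:name, %{}, "baz"}
]

merged = Makeup.Lexer.merge(tokens)

# Merging never adds tokens.
IO.inspect(length(merged) <= length(tokens), label: "no more tokens than before")

# Merging must not change the text the tokens represent.
IO.inspect(
  Makeup.Lexer.unlex(merged) == Makeup.Lexer.unlex(tokens),
  label: "same text"
)
```

Because the merge is an extra pass over the whole list, it is probably only worth calling when a downstream step benefits from fewer, larger tokens.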

split_into_lines(tokens)

Specs

split_into_lines([Makeup.Lexer.Types.token()]) :: [[Makeup.Lexer.Types.token()]]

Splits a list of tokens on newline characters ("\n").

The result is a list of lists of tokens with no newlines.
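A small sketch of the line-splitting behaviour described above, under the same assumptions as before: tokens are `{type, metadata, value}` tuples and Makeup is installable from Hex with the version requirement shown. It checks only the documented property (no newlines remain in any token value) and makes no claim about the exact shape of each line.

```elixir
# Assumption: Makeup is available from Hex under this version requirement.
Mix.install([{:makeup, "~> 1.1"}])

# Two "lines" of source separated by a newline token.
tokens = [
  {:name, %{}, "a"},
  {:whitespace, %{}, "\n"},
  {:name, %{}, "b"}
]

lines = Makeup.Lexer.split_into_lines(tokens)

# `lines` is a list of token lists. Flatten it to inspect every token
# value; tuples are left intact by List.flatten/1.
no_newlines? =
  lines
  |> List.flatten()
  |> Enum.all?(fn {_type, _meta, value} ->
    not String.contains?(IO.iodata_to_binary(value), "\n")
  end)

IO.inspect(length(lines), label: "number of lines")
IO.inspect(no_newlines?, label: "values free of newlines")
```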

Specs

unlex([Makeup.Lexer.Types.token()]) :: String.t()

Merges the token values into the original string.

Inverts the output of a lexer. That is, if lexer is a module that implements this behaviour, then:

string |> lexer.lex() |> Makeup.Lexer.unlex() == string

This only works for a correctly implemented lexer, of course. The above identity can be treated as a lexer invariant for newly implemented lexers.
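The invariant can be turned into a quick round-trip check. The sketch below uses a hypothetical toy module, `WordLexer`, which is not part of Makeup and does not implement the full behaviour: it only provides a `lex/2` that splits the input into whitespace and non-whitespace chunks. The token shape `{type, metadata, value}` and the Hex version requirement are assumptions.

```elixir
# Assumption: Makeup is available from Hex under this version requirement.
Mix.install([{:makeup, "~> 1.1"}])

defmodule WordLexer do
  # Hypothetical toy lexer for illustration only. It splits the input on
  # runs of whitespace and keeps the separators, so no character is lost.
  def lex(string, _opts \\ []) do
    ~r/\s+/
    |> Regex.split(string, include_captures: true, trim: true)
    |> Enum.map(fn chunk ->
      type = if String.trim(chunk) == "", do: :whitespace, else: :text
      {type, %{}, chunk}
    end)
  end
end

samples = ["", "def add(a, b)", "  two\nlines\t", "no_spaces"]

# For every sample, lexing and then unlexing should give back the input.
results =
  for source <- samples do
    source |> WordLexer.lex() |> Makeup.Lexer.unlex() == source
  end

IO.inspect(Enum.all?(results), label: "round-trip invariant holds")
```

A check like this fits naturally into a new lexer's test suite, ideally with inputs that exercise edge cases such as empty strings and trailing whitespace.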