Makeup.Lexer behaviour (Makeup v1.2.1)

A lexer turns raw source code into a list of tokens.

Callbacks

@callback lex(String.t(), list()) :: [Makeup.Lexer.Types.token()]

Lexes a string into a list of tokens.

@callback match_groups([Makeup.Lexer.Types.token()], String.t()) :: [
  Makeup.Lexer.Types.token()
]

Matches groups in a list of tokens.

@callback postprocess([Makeup.Lexer.Types.token()], list()) :: [
  Makeup.Lexer.Types.token()
]

Post-processes a list of tokens before matching the contained groups.
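
As an illustration of what a post-processing step might look like, the sketch below retags name tokens whose value is a reserved word. It is a standalone example rather than a real lexer: the module name `PostprocessSketch`, the keyword list, and the `{type, meta, value}` token shape are all assumptions made for this example.

```elixir
defmodule PostprocessSketch do
  # Hypothetical module for illustration; it does not declare
  # @behaviour Makeup.Lexer so that it runs without dependencies.
  # Tokens are assumed to be {type, meta, value} tuples.
  @keywords ~w(def do end)

  def postprocess(tokens, _opts \\ []) do
    Enum.map(tokens, fn
      # Promote reserved words from :name to :keyword.
      {:name, meta, value} when value in @keywords -> {:keyword, meta, value}
      # Leave every other token untouched.
      token -> token
    end)
  end
end

tokens = [{:name, %{}, "def"}, {:whitespace, %{}, " "}, {:name, %{}, "foo"}]
processed = PostprocessSketch.postprocess(tokens)
IO.inspect(processed)
```

Doing this kind of retagging after the main pass can keep the parser itself simpler, at the cost of one more traversal of the token list.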

@callback root(String.t()) :: Makeup.Lexer.Types.parsec_result()

Parses the given string into a parsec result that includes a list of tokens.

@callback root_element(String.t()) :: Makeup.Lexer.Types.parsec_result()

Parses the smallest number of tokens that make sense. This callback is itself a parsec.

Functions

@spec merge([Makeup.Lexer.Types.token()]) :: [Makeup.Lexer.Types.token()]

Merges adjacent tokens of the same type and with the same attributes.

Doing this will require iterating over the list of tokens again, so only do this if you have a good reason.
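
The merging behaviour described above can be sketched with plain pattern matching. This is a minimal illustration, not the library's implementation: `MergeSketch` is a hypothetical module, and it assumes tokens are `{type, meta, value}` tuples whose values are binaries.

```elixir
defmodule MergeSketch do
  # Hypothetical helper for illustration only.
  # Assumes {type, meta, value} tokens with binary values.

  # Two neighbours with the same type and the same metadata collapse into one.
  def merge([{type, meta, v1}, {type, meta, v2} | rest]),
    do: merge([{type, meta, v1 <> v2} | rest])

  # Otherwise keep the head and continue with the tail.
  def merge([token | rest]), do: [token | merge(rest)]
  def merge([]), do: []
end

tokens = [
  {:name, %{}, "fo"},
  {:name, %{}, "o"},
  {:whitespace, %{}, " "},
  {:name, %{language: :elixir}, "bar"}
]

merged = MergeSketch.merge(tokens)
IO.inspect(merged)
```

In this sketch the first two tokens are joined into a single `"foo"` token, while the last one stays separate because its metadata differs. The extra pass over the list is the cost the note above refers to.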

@spec split_into_lines([Makeup.Lexer.Types.token()]) :: [[Makeup.Lexer.Types.token()]]

Splits a list of tokens on newline characters (\n).

The result is a list of lists of tokens with no newlines.
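
To make the shape of the result concrete, here is a rough standalone sketch of line splitting. It is not Makeup's code and may differ from it in edge cases (for example, how empty fragments or a trailing newline are handled). `SplitSketch` is a hypothetical module, and tokens are assumed to be `{type, meta, value}` tuples with binary values.

```elixir
defmodule SplitSketch do
  # Hypothetical helper for illustration only.
  def split_into_lines(tokens) do
    {lines, current} =
      Enum.reduce(tokens, {[], []}, fn {type, meta, value}, {lines, current} ->
        # The first fragment still belongs to the line being built.
        [first | rest] = String.split(value, "\n")
        current = push(current, {type, meta, first})

        # Every further fragment follows a "\n": close the line, start a new one.
        Enum.reduce(rest, {lines, current}, fn piece, {lines, current} ->
          {[Enum.reverse(current) | lines], push([], {type, meta, piece})}
        end)
      end)

    Enum.reverse([Enum.reverse(current) | lines])
  end

  # Skip empty fragments left over from splitting (a choice made for this sketch).
  defp push(acc, {_type, _meta, ""}), do: acc
  defp push(acc, token), do: [token | acc]
end

tokens = [{:keyword, %{}, "def"}, {:whitespace, %{}, "\n  "}, {:name, %{}, "x"}]
lines = SplitSketch.split_into_lines(tokens)
IO.inspect(lines)
```

Here the whitespace token that spans the line break is cut in two: the newline itself disappears, and the indentation becomes the first token of the second line.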

@spec unlex([Makeup.Lexer.Types.token()]) :: String.t()

Merges the token values into the original string.

Inverts the output of a lexer. That is, if lexer is a module that implements this behaviour, then:

string |> lexer.lex() |> Makeup.Lexer.unlex() == string

This only works for a correctly implemented lexer, of course. The above identity can be treated as a lexer invariant for newly implemented lexers.
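
One way to exercise this invariant is a small round-trip check. The sketch below uses a toy lexer so that it runs without any dependencies: `ToyLexer` is hypothetical, does not implement the real behaviour, and its `unlex/1` is only a stand-in for `Makeup.Lexer.unlex/1` under the assumption that tokens are `{type, meta, value}` tuples.

```elixir
defmodule ToyLexer do
  # Hypothetical lexer for illustration only: it splits the input into
  # runs of whitespace and runs of non-whitespace.
  def lex(text, _opts \\ []) do
    ~r/\s+|\S+/
    |> Regex.scan(text)
    |> Enum.map(fn [chunk] ->
      type = if String.trim(chunk) == "", do: :whitespace, else: :text
      {type, %{}, chunk}
    end)
  end

  # Stand-in for Makeup.Lexer.unlex/1: concatenates the token values in order.
  def unlex(tokens) do
    tokens
    |> Enum.map(fn {_type, _meta, value} -> value end)
    |> IO.iodata_to_binary()
  end
end

string = "def add(a, b),\n  do: a + b\n"
round_trip = string |> ToyLexer.lex() |> ToyLexer.unlex()

# A lexer that drops or rewrites characters would fail this check.
IO.inspect(round_trip == string)
```

For a real lexer, the same comparison could be run over a set of sample source files in its test suite, replacing the toy functions with the lexer's `lex` and `Makeup.Lexer.unlex`.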