gecko/lexer

Types

A lexer that tokenizes input strings.

Fields

  • toks: List of token functions to try in order
  • eof: Token value to return when input is empty

Example

let lexer = Lexer([
  gen_naked("(", fn(_) { OParen }),
  gen_naked(")", fn(_) { CParen }),
  gen_rule("[a-zA-Z]+", Ident),
], Eof)
pub type Lexer(tt) {
  Lexer(
    toks: List(
      fn(String, String, fn(String) -> tt) -> option.Option(
        #(String, Token(tt)),
      ),
    ),
    eof: tt,
  )
}

Constructors

  • Lexer(
      toks: List(
        fn(String, String, fn(String) -> tt) -> option.Option(
          #(String, Token(tt)),
        ),
      ),
      eof: tt,
    )
A position in a source file, tracked as a file name, row, and column.

pub type Loc {
  Loc(file: String, row: Int, col: Int)
}

Constructors

  • Loc(file: String, row: Int, col: Int)

Metadata about how a token was matched.

  • Rule: Token was matched using a regex pattern
  • Naked: Token was matched using a literal string
pub type Naked {
  Rule(regex: String)
  Naked
}

Constructors

  • Rule(regex: String)
  • Naked

Internal wrapper type for tokens matched by token functions.

This type is public for use in type signatures but should not be exposed directly to library users. Use the next function instead, which automatically unwraps tokens.

Stores the matched text and metadata about how it was matched.

pub type Token(tt) {
  Token(
    wrapper: Naked,
    token_type: fn(String) -> tt,
    word: String,
  )
}

Constructors

  • Token(wrapper: Naked, token_type: fn(String) -> tt, word: String)

A function that attempts to match a token at the start of input.

Takes:

  • input: The input string to match against
  • check: Additional context (currently unused)
  • constructor: Function to construct token (currently unused)

Returns Some(#(remaining, wrapper)) if matched, None otherwise.

pub type TokenFn(tt) =
  fn(String, String, fn(String) -> tt) -> option.Option(
    #(String, Token(tt)),
  )
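Token functions need not come from gen_naked or gen_rule; one can be written by hand. A minimal sketch matching a single leading space, ignoring the check argument as described above. Storing the constructor in the Token's token_type field mirrors what gen_naked and gen_rule presumably do so that next can unwrap it; this is an assumption about the internals, not documented behaviour:

```gleam
import gleam/option.{type Option, None, Some}
import gleam/string
import gecko/lexer.{type Token, Naked, Token}

// A hand-written token function: matches exactly one leading space.
pub fn match_space(
  input: String,
  _check: String,
  constructor: fn(String) -> tt,
) -> Option(#(String, Token(tt))) {
  case string.pop_grapheme(input) {
    // Matched: return the rest of the input plus a wrapped token.
    Ok(#(" ", rest)) ->
      Some(#(rest, Token(wrapper: Naked, token_type: constructor, word: " ")))
    // No leading space: this token function does not apply.
    _ -> None
  }
}
```

Such a function can be placed directly in a Lexer's toks list alongside generated ones.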

Values

pub fn advance_loc(loc: Loc, text: String) -> Loc

Advance a Loc by a string, updating row/col for newlines.
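For instance, advancing over text containing a newline moves the row forward and restarts the column. A sketch; the file name and the 1-based starting position are assumptions, and the exact column convention after a newline depends on the implementation:

```gleam
import gecko/lexer.{Loc, advance_loc}

pub fn demo() {
  let start = Loc(file: "main.gk", row: 1, col: 1)
  // "let\nx" spans one newline: the row increases by one and the
  // column restarts counting after the newline.
  advance_loc(start, "let\nx")
}
```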

pub fn gen_naked(
  s: String,
  constructor: fn(String) -> tt,
) -> fn(String, String, fn(String) -> tt) -> option.Option(
  #(String, Token(tt)),
)

Creates a token function that matches a literal string.

Example

let lparen = gen_naked("(", fn(_) { OParen })

Parameters

  • s: The literal string to match
  • constructor: Function that takes matched text and returns a token

Returns

A TokenFn that matches the literal string at the start of input.

pub fn gen_rule(
  regex: String,
  constructor: fn(String) -> tt,
) -> fn(String, String, fn(String) -> tt) -> option.Option(
  #(String, Token(tt)),
)

Creates a token function that matches a regex pattern.

Uses the gleam_regexp library for regex matching. Supports standard regex syntax as provided by the library.

Example

let identifier = gen_rule("[a-zA-Z][a-zA-Z0-9_]*", Ident)
let number = gen_rule("[1-9][0-9]*", fn(s) { Number(parse_int(s)) })

Parameters

  • regex: A regex pattern string
  • constructor: Function that takes matched text and returns a token

Returns

A TokenFn that matches the regex pattern at the start of input.

pub fn next(
  lexer: Lexer(tt),
  source: String,
  loc: Loc,
) -> #(String, Loc, tt)

Attempt to read the next token from the start of source, tracking position.

  • If source is empty, returns #("", loc, eof).
  • Otherwise, tries each token function in order and returns the first match.
  • If no token matches the current input, returns #("", loc, eof).
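A typical driver calls next in a loop until the eof token is produced. A sketch using a hypothetical tokenize helper (not part of the library), assuming tokens can be compared with ==; note that because next also returns eof when nothing matches, this loop cannot distinguish a lexing error from the end of input:

```gleam
import gecko/lexer.{type Lexer, type Loc, next}

// Collect tokens until eof is produced.
pub fn tokenize(lx: Lexer(tt), source: String, loc: Loc, eof: tt) -> List(tt) {
  let #(rest, next_loc, tok) = next(lx, source, loc)
  case tok == eof {
    // Stop at eof, keeping it as the final token.
    True -> [tok]
    // Otherwise keep the token and continue on the remaining input.
    False -> [tok, ..tokenize(lx, rest, next_loc, eof)]
  }
}
```

To detect lexing errors, use next_opt below instead.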
pub fn next_opt(
  lexer: Lexer(tt),
  source: String,
  loc: Loc,
) -> option.Option(#(String, Loc, tt))

Attempt to read the next token from the start of source, tracking position.

  • If source is empty, returns Some(#(source, loc, eof)).
  • Otherwise, tries each token function in order and returns the first match.
  • If no token matches the current input, returns None.
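Because next_opt returns None only on a failed match, it lets a caller report the location of unlexable input. A sketch with a hypothetical tokenize_all helper (not part of the library):

```gleam
import gleam/option.{None, Some}
import gecko/lexer.{type Lexer, type Loc, next_opt}

// Tokenize the whole input, stopping at eof.
// Returns Error with the location of the first unmatchable input.
pub fn tokenize_all(
  lx: Lexer(tt),
  source: String,
  loc: Loc,
  eof: tt,
) -> Result(List(tt), Loc) {
  case next_opt(lx, source, loc) {
    // No token function matched: report where lexing failed.
    None -> Error(loc)
    Some(#(rest, next_loc, tok)) ->
      case tok == eof {
        True -> Ok([tok])
        False ->
          case tokenize_all(lx, rest, next_loc, eof) {
            Ok(toks) -> Ok([tok, ..toks])
            Error(e) -> Error(e)
          }
      }
  }
}
```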