Predicator.Lexer (predicator v2.2.0)

Lexical analyzer for predicator expressions.

The lexer converts an input string into a list of tokens, each carrying its position in the input for detailed error reporting. Each token includes:

  • Token type and value
  • Line and column position
  • Length for precise error highlighting

Example

iex> Predicator.Lexer.tokenize("score > 85")
{:ok, [
  {:identifier, 1, 1, 5, "score"},
  {:gt, 1, 7, 1, ">"},
  {:integer, 1, 9, 2, 85},
  {:eof, 1, 11, 0, nil}
]}

Types

lexer_state()

@type lexer_state() :: %{
  input: binary(),
  position: non_neg_integer(),
  line: pos_integer(),
  column: pos_integer(),
  tokens: [token()]
}

Internal lexer state for position tracking.

position()

@type position() ::
  {line :: pos_integer(), column :: pos_integer(), length :: pos_integer()}

Position information for a token.

Contains:

  • line - 1-based line number
  • column - 1-based column number
  • length - number of characters the token spans in the input (for strings, this includes the surrounding quotes)
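
These three fields are enough to underline a token in its source line. The following is a minimal sketch, assuming a hypothetical Highlight.underline/2 helper that is not part of Predicator. It assumes columns count characters, and it draws at least one caret so that a zero-length token (such as :eof in the examples) still gets a marker.

```elixir
defmodule Highlight do
  # Hypothetical helper, not part of Predicator.
  # Takes the original source and a {line, column, length} tuple and
  # returns the offending source line followed by a caret marker line.
  def underline(source, {line, column, length}) do
    # Lines are 1-based, so subtract 1 to index into the split list.
    source_line = source |> String.split("\n") |> Enum.at(line - 1, "")

    # Columns are 1-based: pad with (column - 1) spaces, then draw the carets.
    marker =
      String.duplicate(" ", column - 1) <> String.duplicate("^", max(length, 1))

    source_line <> "\n" <> marker
  end
end
```

Calling Highlight.underline("score > 85", {1, 9, 2}) returns the source line followed by a second line with two carets under 85.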

result()

@type result() :: {:ok, [token()]} | {:error, binary(), pos_integer(), pos_integer()}

Lexer result - either success with tokens or error with details.

token()

@type token() ::
  {:identifier, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:integer, pos_integer(), pos_integer(), pos_integer(), integer()}
  | {:string, pos_integer(), pos_integer(), pos_integer(), binary(),
     :double | :single}
  | {:boolean, pos_integer(), pos_integer(), pos_integer(), boolean()}
  | {:date, pos_integer(), pos_integer(), pos_integer(), Date.t()}
  | {:datetime, pos_integer(), pos_integer(), pos_integer(), DateTime.t()}
  | {:gt, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:lt, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:gte, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:lte, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:eq, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:ne, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:equal_equal, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:plus, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:minus, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:multiply, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:divide, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:modulo, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:and_and, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:or_or, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:bang, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:and_op, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:or_op, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:not_op, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:lparen, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:rparen, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:lbracket, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:rbracket, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:comma, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:in_op, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:contains_op, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:function_name, pos_integer(), pos_integer(), pos_integer(), binary()}
  | {:eof, pos_integer(), pos_integer(), pos_integer(), nil}

A lexical token with position information.

Format: {type, line, column, length, value}

String tokens append a sixth element, :double or :single, recording the quote style.
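
Because :string tokens are 6-tuples while every other token is a 5-tuple, code that consumes tokens generically needs a clause for each shape. A minimal sketch, assuming a hypothetical TokenInspect.describe/1 helper (not part of Predicator) that normalizes both shapes into a map:

```elixir
defmodule TokenInspect do
  # Hypothetical helper, not part of Predicator.
  # The :string clause must come first: it matches the 6-element shape
  # that carries the quote style as its last element.
  def describe({:string, line, column, length, value, quote_style}) do
    %{type: :string, line: line, column: column, length: length,
      value: value, quote: quote_style}
  end

  # Every other token type in the token() spec is a 5-tuple.
  def describe({type, line, column, length, value}) do
    %{type: type, line: line, column: column, length: length, value: value}
  end
end
```

The :quote key and the map layout are illustrative choices, not something the library defines.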

Functions

tokenize(input)

@spec tokenize(binary()) :: result()

Tokenizes an input string into a list of tokens.

Parameters

  • input - The expression string to tokenize

Returns

  • {:ok, tokens} - Successfully tokenized input
  • {:error, message, line, column} - Lexical error with position
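
Callers typically branch on the two result shapes. A minimal sketch, assuming a hypothetical LexReport.summarize/1 helper (not part of Predicator) that reduces a success to its token types and formats an error with its position:

```elixir
defmodule LexReport do
  # Hypothetical helper, not part of Predicator.
  # Success: keep only the token type, the first element of each tuple.
  def summarize({:ok, tokens}) do
    {:ok, Enum.map(tokens, &elem(&1, 0))}
  end

  # Failure: fold the position into a human-readable message.
  def summarize({:error, message, line, column}) do
    {:error, "#{message} (line #{line}, column #{column})"}
  end
end
```

With the library loaded, the result of Predicator.Lexer.tokenize/1 can be piped straight into LexReport.summarize/1. The wording of the error messages the lexer produces is not shown in this page, so none is assumed here.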

Examples

iex> Predicator.Lexer.tokenize("score > 85")
{:ok, [
  {:identifier, 1, 1, 5, "score"},
  {:gt, 1, 7, 1, ">"},
  {:integer, 1, 9, 2, 85},
  {:eof, 1, 11, 0, nil}
]}

iex> Predicator.Lexer.tokenize("age >= 18")
{:ok, [
  {:identifier, 1, 1, 3, "age"},
  {:gte, 1, 5, 2, ">="},
  {:integer, 1, 8, 2, 18},
  {:eof, 1, 10, 0, nil}
]}

iex> Predicator.Lexer.tokenize("name = \"John\"")
{:ok, [
  {:identifier, 1, 1, 4, "name"},
  {:eq, 1, 6, 1, "="},
  {:string, 1, 8, 6, "John", :double},
  {:eof, 1, 14, 0, nil}
]}

iex> Predicator.Lexer.tokenize("score > 85 AND age >= 18")
{:ok, [
  {:identifier, 1, 1, 5, "score"},
  {:gt, 1, 7, 1, ">"},
  {:integer, 1, 9, 2, 85},
  {:and_op, 1, 12, 3, "AND"},
  {:identifier, 1, 16, 3, "age"},
  {:gte, 1, 20, 2, ">="},
  {:integer, 1, 23, 2, 18},
  {:eof, 1, 25, 0, nil}
]}
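
Token lists like the ones above are easy to query with pattern matching. As an illustration, here is a minimal sketch of a hypothetical TokenQuery.identifiers/1 helper (not part of Predicator) that collects the distinct identifier names an expression refers to, in order of first appearance:

```elixir
defmodule TokenQuery do
  # Hypothetical helper, not part of Predicator.
  # Keeps the value of every :identifier token and drops everything else.
  def identifiers(tokens) do
    tokens
    |> Enum.flat_map(fn
      {:identifier, _line, _column, _length, name} -> [name]
      _other -> []
    end)
    |> Enum.uniq()
  end
end
```

Applied to the tokens of "score > 85 AND age >= 18", this returns ["score", "age"].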