Sifter.Query.Parser (Sifter v0.2.0)
View SourceThe Parser transforms tokens from Sifter.Query.Lexer into structured Sifter.AST nodes.
This parser implements a recursive descent parser with operator precedence parsing (Pratt parsing) to handle complex boolean expressions, field predicates, and set operations in the Sifter query language.
Features
- Operator precedence: Handles correct precedence with
AND(precedence 20) binding tighter thanOR(precedence 10) - Boolean logic: Supports
AND,OR, andNOToperators with proper associativity - Field predicates: Parses field-based comparisons with various operators (
:,<,<=,>,>=) - Set operations: Handles
IN,NOT IN, andALLoperations with list validation - Wildcard support: Processes prefix/suffix wildcards (
field:prefix*,field:*suffix) for equality operations - Grouping: Supports parentheses for explicit precedence control
- Full-text search: Handles bare terms as full-text search expressions
Parser Architecture
The parser uses a two-phase approach:
- Prefix parsing: Handles terms that can appear at the start of expressions (NOT, parentheses, field predicates, bare terms)
- Infix parsing: Manages binary operators (AND, OR) with precedence climbing
Precedence Rules
From highest to lowest precedence:
NOTprefix operator (binds to the immediately following term)ANDconnector (precedence 20, left-associative)ORconnector (precedence 10, left-associative)
Wildcard Processing
For equality operations (: operator), the parser automatically converts wildcard patterns:
field:prefix*→%AST.Cmp{op: :starts_with, value: "prefix"}field:*suffix→%AST.Cmp{op: :ends_with, value: "suffix"}field:"*literal*"→%AST.Cmp{op: :eq, value: "*literal*"}(quoted wildcards are literals)
Wildcards are not allowed in:
- Relational comparisons (
<,<=,>,>=) - Set operations (
IN,NOT IN,ALL) unless quoted
Error Handling
The parser provides detailed error reporting with the following error types:
:unrecognized_token- Invalid token in context (e.g., starting with connector):unexpected_eof_after_operator- Missing right-hand side after operator:missing_rhs- Missing value after comparison operator:invalid_wildcard_position- Wildcard in invalid position (e.g., middle of term):wildcard_not_allowed_for_relop- Wildcard used with relational operator:empty_list- Empty parentheses where list expected:trailing_comma_in_list- Trailing comma in list:missing_comma_in_list- Missing comma between list items
Notes
The parser is designed to be used after tokenization by Sifter.Query.Lexer and produces
AST nodes that can be consumed by a database query builder or further transformation passes.
Summary
Functions
Parses a list of tokens from Sifter.Query.Lexer into a Sifter.AST tree.
Types
@type token() :: {atom(), binary(), any(), {non_neg_integer(), non_neg_integer()}}
Functions
@spec parse([token()]) :: {:ok, Sifter.AST.t()} | {:error, {atom(), token()}}
Parses a list of tokens from Sifter.Query.Lexer into a Sifter.AST tree.
This is the main entry point for the parser. It takes a list of tokens produced by the lexer and transforms them into a structured AST that represents the query's logical structure.
Parameters
tokens- A list of 4-tuples{type, lexeme, literal, location}produced bySifter.Query.Lexer.tokenize/1
Return Values
{:ok, ast}- Successfully parsed AST node{:error, {error_type, token}}- Parse error with the problematic token