dream_test/gherkin/step_trie
Step Trie - Fast step definition lookup using Cucumber Expressions.
Provides O(word count) step matching instead of O(step definitions) linear search. This is the core data structure for efficient Gherkin step matching.
Performance Characteristics
- Insert: O(words in pattern) - typically 5-10 words
- Lookup: O(words in step text) - independent of total step definitions
- Memory: O(total words across all patterns)
Placeholder Syntax
Patterns use Cucumber Expression placeholders:
| Placeholder | Matches | Captures |
|---|---|---|
| {int} | Integer like 42, -5 | CapturedInt(Int) |
| {float} | Decimal like 3.14, 0.5 | CapturedFloat(Float) |
| {string} | Quoted text like "hello" | CapturedString(String) |
| {word} | Single word like apple | CapturedWord(String) |
| {} | Any single token | CapturedWord(String) |
Prefix and Suffix Support
Placeholders can have literal prefixes and suffixes:
- ${float} matches $3.99 and captures 3.99
- {int}% matches 50% and captures 50
- ${float}USD matches $19.99USD and captures 19.99
This works by splitting patterns and text at placeholder/numeric boundaries,
so ${float} becomes ["$", "{float}"] in the trie, and $3.99 becomes
["$", "3.99"] during matching.
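As a sketch of how a suffixed placeholder behaves end to end (`percent_handler` here is a hypothetical handler value; `new`, `insert`, and `lookup` are this module's functions):

```gleam
let trie =
  new()
  |> insert("When", "I apply a {int}% discount", percent_handler)

// The pattern is stored with the "%" as its own literal segment, and
// "15%" in the step text is split into ["15", "%"] at the numeric
// boundary, so this matches and captures [CapturedInt(15)].
lookup(trie, "When", "I apply a 15% discount")
```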
Matching Priority
When multiple patterns could match, the trie uses this priority order:
- Literal words - exact string match (highest priority)
- {string} - quoted string capture
- {int} - integer capture
- {float} - decimal capture
- {word} - single word capture
- {} - any word capture (lowest priority)
This ensures the most specific step definition wins.
Example
let trie = new()
|> insert("Given", "I have {int} items", count_handler)
|> insert("Given", "I have an empty cart", empty_handler)
|> insert("Then", "the total is ${float}", total_handler)
// Matches empty_handler (literal "an" beats {int})
lookup(trie, "Given", "I have an empty cart")
// Matches count_handler, captures [CapturedInt(42)]
lookup(trie, "Given", "I have 42 items")
// Matches total_handler, captures [CapturedFloat(19.99)]
lookup(trie, "Then", "the total is $19.99")
Types
A value captured from step text during matching.
Each captured placeholder produces a typed value.
pub type CapturedValue {
CapturedInt(Int)
CapturedFloat(Float)
CapturedString(String)
CapturedWord(String)
}
Constructors
- CapturedInt(Int): Integer value from {int}
- CapturedFloat(Float): Float value from {float}
- CapturedString(String): String value from {string} (without quotes)
- CapturedWord(String): Word value from {word} or {}
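A handler usually needs to turn captures back into plain values. As an illustrative sketch (not part of this module's API), a recursive helper can pull out the first captured integer:

```gleam
fn first_int(captures: List(CapturedValue)) -> option.Option(Int) {
  case captures {
    // First element is an integer capture: return it
    [CapturedInt(n), ..] -> option.Some(n)
    // Skip any other capture and keep searching
    [_, ..rest] -> first_int(rest)
    [] -> option.None
  }
}
```

With captures like `[CapturedWord("now"), CapturedInt(42)]`, this returns `option.Some(42)`.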
Result of a successful step match.
Contains the matched handler and captured values.
pub type StepMatch(handler) {
StepMatch(handler: handler, captures: List(CapturedValue))
}
Constructors
- StepMatch(handler: handler, captures: List(CapturedValue))

  Arguments:
  - handler: The handler that matched
  - captures: Captured values in order of appearance
Segment types for step patterns.
Each segment type has different matching behavior and priority:
- LiteralWord: Exact word match, highest priority
- StringParam: Matches quoted strings
- IntParam: Matches integers
- FloatParam: Matches decimal numbers
- WordParam: Matches a single unquoted word
- AnyParam: Matches any single word (lowest priority)
pub type StepSegment {
LiteralWord(String)
IntParam
FloatParam
StringParam
WordParam
AnyParam
}
Constructors
- LiteralWord(String): Literal word requiring exact match. Example: "have" in "I have {int} items"
- IntParam: Integer parameter capture. Matches: 42, -5, 0
- FloatParam: Float parameter capture. Matches: 3.14, -0.5, 2.0
- StringParam: Quoted string parameter capture. Matches: "hello world", "test"
- WordParam: Single unquoted word capture. Matches any non-whitespace word
- AnyParam: Anonymous capture (matches any single word). Lowest priority among params
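The distinction between the quoted and unquoted word captures can be sketched with two overlapping patterns (`quoted_handler` and `word_handler` are hypothetical handler values):

```gleam
let trie =
  new()
  |> insert("When", "I search for {string}", quoted_handler)
  |> insert("When", "I search for {word}", word_handler)

// StringParam has higher priority and matches the quoted token,
// capturing [CapturedString("red shoes")] with the quotes stripped.
lookup(trie, "When", "I search for \"red shoes\"")

// An unquoted token is not a quoted string, so it falls through to
// the WordParam pattern, capturing [CapturedWord("shoes")].
lookup(trie, "When", "I search for shoes")
```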
Radix trie for fast step lookup.
The trie is the main data structure holding all step definitions. Patterns are organized by word segments for efficient O(words) lookup.
pub opaque type StepTrie(handler)
A node in the step trie.
Each node can store handlers for different keywords and has children for different segment types. Children are organized by priority.
pub opaque type StepTrieNode(handler)
Values
pub fn insert(
trie: StepTrie(handler),
keyword: String,
pattern: String,
handler: handler,
) -> StepTrie(handler)
Insert a step pattern into the trie.
Adds a handler at the path specified by the pattern. If a handler already exists for the same keyword and pattern, it is replaced.
Parameters
- trie: The trie to insert into
- keyword: Step keyword as string ("Given", "When", "Then", or "*" for any)
- pattern: Step pattern with placeholders (e.g., "I have {int} items")
- handler: The handler to store
Example
let trie = new()
|> insert("Given", "I have {int} items", have_items)
|> insert("When", "I add {int} more", add_items)
pub fn lookup(
trie: StepTrie(handler),
keyword: String,
text: String,
) -> option.Option(StepMatch(handler))
Look up a step in the trie.
Searches for a handler matching the given keyword and step text. Returns the matched handler and captured values, or None if no match.
Lookup is O(word count) - independent of total step definitions.
Parameters
- trie: The trie to search
- keyword: Step keyword as string ("Given", "When", "Then")
- text: Step text to match (e.g., "I have 42 items")
Returns
- Some(StepMatch): Contains matched handler and captured values
- None: No step definition matched
Example
let result = lookup(trie, "Given", "I have 42 items")
case result {
  // captures == [CapturedInt(42)]
  option.Some(StepMatch(_handler, captures)) -> captures
  // No matching step definition
  option.None -> []
}
pub fn new() -> StepTrie(handler)
Create a new empty step trie.
Returns a trie with no step definitions.
Example
let trie = new()
|> insert("Given", "I have {int} items", handler)
pub fn parse_step_pattern(pattern: String) -> List(StepSegment)
Parse a step pattern string into segments.
Splits the pattern by whitespace and converts each token into a StepSegment.
Placeholders like {int} become typed parameter segments, while other tokens
become literal word segments.
Placeholder Syntax
| Placeholder | Segment Type | Matches |
|---|---|---|
| {int} | IntParam | Integers like 42, -5 |
| {float} | FloatParam | Decimals like 3.14 |
| {string} | StringParam | Quoted strings like "hello" |
| {word} | WordParam | Single unquoted words |
| {} | AnyParam | Any single token |
Prefix and Suffix Handling
Placeholders can have literal prefixes and/or suffixes attached. The parser automatically splits these into separate segments:
- ${float} → [LiteralWord("$"), FloatParam]
- {int}% → [IntParam, LiteralWord("%")]
- ${float}USD → [LiteralWord("$"), FloatParam, LiteralWord("USD")]
This enables patterns like "the price is ${float}" to match text like
"the price is $19.99" and capture 19.99 as the float value.
Examples
parse_step_pattern("I have {int} items")
// [LiteralWord("I"), LiteralWord("have"), IntParam, LiteralWord("items")]
parse_step_pattern("the total is ${float}")
// [LiteralWord("the"), LiteralWord("total"), LiteralWord("is"),
// LiteralWord("$"), FloatParam]
parse_step_pattern("I apply a {int}% discount")
// [LiteralWord("I"), LiteralWord("apply"), LiteralWord("a"),
// IntParam, LiteralWord("%"), LiteralWord("discount")]
pub fn tokenize_step_text(text: String) -> List(String)
Tokenize step text for matching.
Prepares step text for trie lookup by splitting it into tokens that align with how patterns are parsed. This enables matching patterns with prefixed or suffixed placeholders.
Tokenization Rules
- Whitespace splitting: Text is split on spaces
- Quote preservation: Quoted strings like "Red Widget" stay as one token
- Numeric boundary splitting: Tokens are split at boundaries between numeric and non-numeric characters
The numeric boundary splitting is key for prefix/suffix support. When a
pattern like ${float} is parsed, it becomes ["$", "{float}"]. For
matching to work, the text $19.99 must also become ["$", "19.99"].
Examples
// Basic tokenization
tokenize_step_text("I have 5 items")
// ["I", "have", "5", "items"]
// Quoted strings preserved
tokenize_step_text("I add \"Red Widget\" to cart")
// ["I", "add", "\"Red Widget\"", "to", "cart"]
// Currency prefix split from number
tokenize_step_text("the total is $19.99")
// ["the", "total", "is", "$", "19.99"]
// Percentage suffix split from number
tokenize_step_text("I apply a 15% discount")
// ["I", "apply", "a", "15", "%", "discount"]
// Multiple numeric boundaries
tokenize_step_text("price is $99.99USD")
// ["price", "is", "$", "99.99", "USD"]