dream_test/gherkin/step_trie

Fast step-definition lookup using Cucumber-Expression-style placeholders.

This is the data structure behind dream_test/gherkin/steps. It matches step text in time proportional to the number of tokens in the step text, not the number of registered step definitions.

Placeholder syntax

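Patterns may contain the placeholders {int}, {float}, {string}, {word}, and {} (anonymous). Literal prefix or suffix text can be attached to a placeholder. For example, ${float}USD matches $19.99USD and captures 19.99 as a float.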

Example

      step_trie.lookup(trie, "Then", "the total is $19.99USD")
      |> should
      |> be_equal(
        Some(
          step_trie.StepMatch(handler: "total_usd", captures: [
            step_trie.CapturedFloat(19.99),
          ]),
        ),
      )
      |> or_fail_with("expected float capture for $19.99USD")

Types

A value captured from step text during matching.

Each captured placeholder produces a typed value.

pub type CapturedValue {
  CapturedInt(Int)
  CapturedFloat(Float)
  CapturedString(String)
  CapturedWord(String)
}

Constructors

  • CapturedInt(Int)

    Integer value from {int}

  • CapturedFloat(Float)

    Float value from {float}

  • CapturedString(String)

    String value from {string} (without quotes)

  • CapturedWord(String)

    Word value from {word} or {}
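
Because CapturedValue is a plain custom type, a step handler can pattern match on it directly. The following is a minimal sketch; the helper name describe is hypothetical and not part of this module.

```gleam
import dream_test/gherkin/step_trie.{
  type CapturedValue, CapturedFloat, CapturedInt, CapturedString, CapturedWord,
}
import gleam/float
import gleam/int

// Hypothetical helper (not part of step_trie): render a captured value
// as text, for example in a diagnostic message.
pub fn describe(value: CapturedValue) -> String {
  case value {
    CapturedInt(n) -> "int " <> int.to_string(n)
    CapturedFloat(f) -> "float " <> float.to_string(f)
    CapturedString(s) -> "string " <> s
    CapturedWord(w) -> "word " <> w
  }
}
```

The case expression is exhaustive, so the compiler will flag this helper if a new constructor is ever added.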

Result of a successful step match.

Contains the matched handler and captured values.

pub type StepMatch(handler) {
  StepMatch(handler: handler, captures: List(CapturedValue))
}

Constructors

  • StepMatch(handler: handler, captures: List(CapturedValue))

    Arguments

    handler

    The handler that matched

    captures

    Captured values in order of appearance
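
Since captures are stored in order of appearance, a caller can filter them by type without losing their relative position. A small sketch, assuming a hypothetical helper named int_captures:

```gleam
import dream_test/gherkin/step_trie.{type StepMatch, CapturedInt}
import gleam/list

// Hypothetical helper (not part of step_trie): keep only the integer
// captures of a match, preserving their order of appearance.
pub fn int_captures(step_match: StepMatch(handler)) -> List(Int) {
  list.filter_map(step_match.captures, fn(value) {
    case value {
      CapturedInt(n) -> Ok(n)
      _ -> Error(Nil)
    }
  })
}
```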

Segment types for step patterns.

Each segment type has different matching behavior and priority:

  • LiteralWord: Exact word match, highest priority
  • StringParam: Matches quoted strings
  • IntParam: Matches integers
  • FloatParam: Matches decimal numbers
  • WordParam: Matches a single unquoted word
  • AnyParam: Matches any single word (lowest priority)

pub type StepSegment {
  LiteralWord(String)
  IntParam
  FloatParam
  StringParam
  WordParam
  AnyParam
}

Constructors

  • LiteralWord(String)

    Literal word requiring exact match. Example: “have” in “I have {int} items”

  • IntParam

    Integer parameter capture. Matches: 42, -5, 0

  • FloatParam

    Float parameter capture. Matches: 3.14, -0.5, 2.0

  • StringParam

    Quoted string parameter capture. Matches: "hello world", "test"

  • WordParam

    Single unquoted word capture. Matches any non-whitespace word

  • AnyParam

    Anonymous capture (matches any single word). Lowest priority among params
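
Every constructor except LiteralWord produces a capture, so the number of captures a pattern yields can be read off its segment list. The sketch below illustrates this; capture_count is a hypothetical name, not a function of this module.

```gleam
import dream_test/gherkin/step_trie.{type StepSegment, LiteralWord}
import gleam/list

// Hypothetical helper (not part of step_trie): count the segments that
// capture a value, i.e. everything that is not a literal word.
pub fn capture_count(segments: List(StepSegment)) -> Int {
  segments
  |> list.filter(fn(segment) {
    case segment {
      LiteralWord(_) -> False
      _ -> True
    }
  })
  |> list.length
}
```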

Radix trie for fast step lookup.

The trie is the main data structure holding all step definitions. Patterns are organized by word segments for efficient O(words) lookup.

pub opaque type StepTrie(handler)

A node in the step trie.

Each node can store handlers for different keywords and has children for different segment types. Children are organized by priority.

pub opaque type StepTrieNode(handler)
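
Both types are opaque, so their real layout is not documented here. To illustrate why lookup cost follows the number of words rather than the number of definitions, the following is a simplified sketch of a literal-only word trie. The Node type and both functions are hypothetical; the sketch ignores keywords, parameters, and priorities, which the real trie handles.

```gleam
import gleam/dict.{type Dict}
import gleam/option.{type Option, None, Some}

// Hypothetical, simplified node: an optional handler plus children
// keyed by the next literal word.
pub type Node(handler) {
  Node(handler: Option(handler), children: Dict(String, Node(handler)))
}

pub fn empty() -> Node(handler) {
  Node(handler: None, children: dict.new())
}

// Walk (or create) one child per word, then store the handler at the end.
pub fn insert_words(
  node: Node(handler),
  words: List(String),
  handler: handler,
) -> Node(handler) {
  case words {
    [] -> Node(..node, handler: Some(handler))
    [word, ..rest] -> {
      let child = case dict.get(node.children, word) {
        Ok(existing) -> existing
        Error(_) -> empty()
      }
      let updated = insert_words(child, rest, handler)
      Node(..node, children: dict.insert(node.children, word, updated))
    }
  }
}

// One child lookup per word: the number of steps depends on the length
// of the input, not on how many patterns were inserted.
pub fn find_words(node: Node(handler), words: List(String)) -> Option(handler) {
  case words {
    [] -> node.handler
    [word, ..rest] ->
      case dict.get(node.children, word) {
        Ok(child) -> find_words(child, rest)
        Error(_) -> None
      }
  }
}
```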

Values

pub fn insert(
  trie trie: StepTrie(handler),
  keyword keyword: String,
  pattern pattern: String,
  handler handler: handler,
) -> StepTrie(handler)

Insert a step pattern into the trie.

Adds a handler at the path specified by the pattern. If a handler already exists for the same keyword and pattern, it is replaced.
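
The replacement behavior can be sketched as follows. The function name lookup_after_reinsert and the handler strings are illustrative only.

```gleam
import dream_test/gherkin/step_trie.{type StepMatch}
import gleam/option.{type Option}

// Illustrative only: insert the same keyword and pattern twice, then
// look the step up. Per the description above, the second handler
// should be the one returned.
pub fn lookup_after_reinsert() -> Option(StepMatch(String)) {
  step_trie.new()
  |> step_trie.insert(
    keyword: "Given",
    pattern: "I have {int} items",
    handler: "first",
  )
  |> step_trie.insert(
    keyword: "Given",
    pattern: "I have {int} items",
    handler: "second",
  )
  |> step_trie.lookup(keyword: "Given", text: "I have 42 items")
}
```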

Parameters

  • trie: The trie to insert into
  • keyword: Step keyword as string (“Given”, “When”, “Then”, or “*” for any)
  • pattern: Step pattern with placeholders (e.g., “I have {int} items”)
  • handler: The handler to store

Example

      let trie =
        step_trie.new()
        |> step_trie.insert(
          keyword: "Given",
          pattern: "I have {int} items",
          handler: "count",
        )

pub fn lookup(
  trie trie: StepTrie(handler),
  keyword keyword: String,
  text text: String,
) -> option.Option(StepMatch(handler))

Look up a step in the trie.

Searches for a handler matching the given keyword and step text. Returns the matched handler and captured values, or None if no match.

Lookup is O(word count): its cost does not depend on the total number of registered step definitions.

Parameters

  • trie: The trie to search
  • keyword: Step keyword as string (“Given”, “When”, “Then”)
  • text: Step text to match (e.g., “I have 42 items”)

Returns

  • Some(StepMatch): Contains matched handler and captured values
  • None: No step definition matched
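
A caller usually has to handle both outcomes. The sketch below shows one way to do so; describe_lookup and its message format are hypothetical.

```gleam
import dream_test/gherkin/step_trie.{type StepTrie, StepMatch}
import gleam/int
import gleam/list
import gleam/option.{None, Some}

// Hypothetical helper (not part of step_trie): summarize a lookup
// result as text, covering both the Some and None cases.
pub fn describe_lookup(
  trie: StepTrie(String),
  keyword: String,
  text: String,
) -> String {
  case step_trie.lookup(trie, keyword: keyword, text: text) {
    Some(StepMatch(handler: name, captures: captures)) ->
      name <> " (" <> int.to_string(list.length(captures)) <> " captures)"
    None -> "undefined step: " <> keyword <> " " <> text
  }
}
```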

Example

      step_trie.lookup(trie, "Then", "the total is $19.99USD")
      |> should
      |> be_equal(
        Some(
          step_trie.StepMatch(handler: "total_usd", captures: [
            step_trie.CapturedFloat(19.99),
          ]),
        ),
      )
      |> or_fail_with("expected float capture for $19.99USD")

pub fn new() -> StepTrie(handler)

Create a new empty step trie.

Returns a trie with no step definitions.
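
When step definitions are already available as a list, the empty trie is a natural starting value for a fold. A minimal sketch, assuming a hypothetical helper from_definitions that takes (keyword, pattern, handler) tuples:

```gleam
import dream_test/gherkin/step_trie.{type StepTrie}
import gleam/list

// Hypothetical helper (not part of step_trie): build a trie from a list
// of #(keyword, pattern, handler) tuples, starting from an empty trie.
pub fn from_definitions(
  definitions: List(#(String, String, handler)),
) -> StepTrie(handler) {
  list.fold(definitions, step_trie.new(), fn(trie, definition) {
    let #(keyword, pattern, handler) = definition
    step_trie.insert(trie, keyword: keyword, pattern: pattern, handler: handler)
  })
}
```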

Example

      let trie =
        step_trie.new()
        |> step_trie.insert(
          keyword: "Given",
          pattern: "I have an empty cart",
          handler: "empty",
        )

pub fn parse_step_pattern(
  pattern pattern: String,
) -> List(StepSegment)

Parse a step pattern string into segments.

Splits the pattern by whitespace and converts each token into a StepSegment. Placeholders like {int} become typed parameter segments, while other tokens become literal word segments.
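
As an illustration, the sketch below parses a pattern that mixes literal words with several placeholder kinds. The pattern and the function name mixed_segments are made up for this example, and the expected result in the comment is derived from the documented placeholder mapping rather than from a recorded run.

```gleam
import dream_test/gherkin/step_trie.{type StepSegment}

// Illustrative only. Based on the documented mapping, the result is
// expected to be:
//   [WordParam, LiteralWord("adds"), IntParam, StringParam,
//    LiteralWord("to"), AnyParam]
pub fn mixed_segments() -> List(StepSegment) {
  step_trie.parse_step_pattern("{word} adds {int} {string} to {}")
}
```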

Placeholder Syntax

Placeholder   Segment Type   Matches
{int}         IntParam       Integers like 42, -5
{float}       FloatParam     Decimals like 3.14
{string}      StringParam    Quoted strings like "hello"
{word}        WordParam      Single unquoted words
{}            AnyParam       Any single token

Prefix and Suffix Handling

Placeholders can have literal prefixes and/or suffixes attached. The parser automatically splits these into separate segments:

  • ${float} → [LiteralWord("$"), FloatParam]
  • {int}% → [IntParam, LiteralWord("%")]
  • ${float}USD → [LiteralWord("$"), FloatParam, LiteralWord("USD")]

This enables patterns like "the price is ${float}" to match text like "the price is $19.99" and capture 19.99 as the float value.
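
The price example above can be written end to end as follows. The function name match_price and the handler string are illustrative.

```gleam
import dream_test/gherkin/step_trie.{type StepMatch}
import gleam/option.{type Option}

// Illustrative only: register a pattern with a prefixed placeholder and
// look up matching step text. Per the description above, the match
// should carry a single CapturedFloat(19.99).
pub fn match_price() -> Option(StepMatch(String)) {
  step_trie.new()
  |> step_trie.insert(
    keyword: "Then",
    pattern: "the price is ${float}",
    handler: "price",
  )
  |> step_trie.lookup(keyword: "Then", text: "the price is $19.99")
}
```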

Example

      step_trie.parse_step_pattern("the total is ${float}USD")
      |> should
      |> be_equal([
        step_trie.LiteralWord("the"),
        step_trie.LiteralWord("total"),
        step_trie.LiteralWord("is"),
        step_trie.LiteralWord("$"),
        step_trie.FloatParam,
        step_trie.LiteralWord("USD"),
      ])
      |> or_fail_with(
        "expected ${float}USD to split into literal + FloatParam segments",
      )

pub fn tokenize_step_text(text text: String) -> List(String)

Tokenize step text for matching.

Prepares step text for trie lookup by splitting it into tokens that align with how patterns are parsed. This enables matching patterns with prefixed or suffixed placeholders.

Tokenization Rules

  1. Whitespace splitting: Text is split on spaces
  2. Quote preservation: Quoted strings like "Red Widget" stay as one token
  3. Numeric boundary splitting: Tokens are split at boundaries between numeric and non-numeric characters

The numeric boundary splitting is key for prefix/suffix support. When a pattern like ${float} is parsed, it becomes ["$", "{float}"]. For matching to work, the text $19.99 must also become ["$", "19.99"].
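
One way to see this alignment is to compare the number of pattern segments with the number of text tokens. The helper below is a hypothetical sanity check, not the library's matching logic: equal counts do not imply a match.

```gleam
import dream_test/gherkin/step_trie
import gleam/list

// Hypothetical helper (not part of step_trie): check that a pattern and
// a step text break into the same number of pieces.
pub fn same_token_count(pattern: String, text: String) -> Bool {
  let segments = step_trie.parse_step_pattern(pattern)
  let tokens = step_trie.tokenize_step_text(text)
  list.length(segments) == list.length(tokens)
}
```

For the pattern "the price is ${float}" and the text "the price is $19.99", both sides should split into five pieces according to the rules above.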

Example

      step_trie.tokenize_step_text("I add \"Red Widget\" and pay $19.99USD")
      |> should
      |> be_equal([
        "I",
        "add",
        "\"Red Widget\"",
        "and",
        "pay",
        "$",
        "19.99",
        "USD",
      ])
      |> or_fail_with(
        "expected tokenization to preserve quotes and split $19.99USD",
      )