ex_spirit v0.4.0 ExSpirit.Parser
ExSpirit.Parser is the parsing section of ExSpirit, designed to parse some kind of stream of data (whether a binary, a list, or perhaps an actual stream) into a data structure of your own design.
Definitions
Terminal Parser: A terminal parser is one that does not operate over any other parser; it is ‘terminal’ in its location.
Combination Parser: A combination parser is one that takes a parser as an input and does something with it, such as repeating it, surrounding it, or ignoring its output.
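The two definitions above can be illustrated with a minimal sketch in plain Elixir. It does not use ExSpirit at all: the ParserSketch module, its context map (rest, result, error), and the digit and many functions are hypothetical names chosen for illustration, loosely mirroring the context fields that the examples on this page inspect.

```elixir
defmodule ParserSketch do
  # Hypothetical context: remaining input, last result, and an error slot.
  def new_context(input), do: %{rest: input, result: nil, error: nil}

  # Terminal parser: consumes input directly and calls no other parser.
  def digit(%{error: nil, rest: <<c, rest::binary>>} = context) when c in ?0..?9 do
    %{context | rest: rest, result: c - ?0}
  end

  def digit(%{error: nil} = context), do: %{context | result: nil, error: "expected a digit"}
  def digit(context), do: context

  # Combination parser: takes another parser (a function) and repeats it
  # until it fails, collecting the results in a list.
  def many(%{error: nil} = context, parser), do: many(context, parser, [])
  def many(context, _parser), do: context

  defp many(context, parser, acc) do
    case parser.(context) do
      %{error: nil} = next -> many(next, parser, [next.result | acc])
      _failed -> %{context | result: Enum.reverse(acc)}
    end
  end
end

context = ParserSketch.new_context("123x") |> ParserSketch.many(&ParserSketch.digit/1)
IO.inspect({context.error, context.result, context.rest})
```

Here digit is terminal because it only looks at the input, while many is a combination parser because its behaviour depends entirely on the parser handed to it.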
Usage
Just add use ExSpirit.Parser to a module to make it into a parsing module.
To also add the text parsing functions from the ExSpirit.Parsing.Text module, add text: true to the use call. For example:
defmodule MyModule do
use ExSpirit.Parser, text: true
end
Note that the ExSpirit.Parser module must be used, not imported!
The functions and macros below are meant to be defined inside your own module through code generation orchestrated by the __using__ macro.
Importing the module will not bring any useful functions or macros into your scope, only “virtual” functions and macros that exist for documentation purposes.
Summary
Functions
When this module is used, it imports what is required, defines some inline functions for speed, and loads in other parsers
The alternative parser runs the parsers in the inline list (cannot be a variable) and returns the result of the first one that succeeds, or the error of the last one
The branch combination parser is designed for efficient branching based on the result from another parser
Defining a rule defines a parser as well as some associated information
Succeeds if at the “End Of Input”, else fails
Takes a parser but if it fails then it returns a hard error that will prevent further parsers, even in branch tests, from running
The fail parser always fails, documenting the user information passed in
Get something(s) from the state and put it into the locations in the parser that are marked with &1-* bindings
Takes and runs a parser but ignores the result of the parser, instead returning nil
Returns the entire parsed text from the parser, regardless of the actual return value
Looks ahead to confirm success, but does not update the context when successful
Looks ahead to confirm failure, but does not update the context when failed
The no_skip combination parser takes a parser and clears the skipper so that it does no skipping
The parse function is applied to the input and a parser call, such as in
Runs a function and parser with both the context before and after the function call
Runs a function with the context
Runs a function with the result
Pushes something onto the list stored in the state at the specified key
Puts something into the state at the specified key
Repeats over a parser a bounded number of times, returning the results as a list
The repeat function parser allows you to pass in a parser function to repeat over, but is otherwise identical to repeat, especially as repeat delegates to repeatFn
The Sequence operator runs all of the parsers in the inline list (cannot be a variable) and returns their results as a list
Runs the skipper now
The skipper combination parser takes a parser and changes the skipper within it to the one you pass in for the duration of the parser that you pass in
The success parser always returns the passed in value, default of nil, successfully like a parsed value
Wraps the result of the passed in parser in a standard erlang 2-tuple, where the first element is the tag that you pass in and the second is the result of the parser
Tests whether the context is valid
Functions
When this module is used, it imports what is required, defines some inline functions for speed, and loads in other parsers.
Parameters:
- text: true|false -> Will use the Text Parsing module as well
The alternative parser runs the parsers in the inline list (cannot be a variable) and returns the result of the first one that succeeds, or the error of the last one.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("FF", alt([uint(16), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, 255, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("Test", alt([uint(16), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, nil, ""}
The branch combination parser is designed for efficient branching based on the result from another parser.
It allows you to parse something and then use the result of that parser either to look up a value in a map or to call into a user function, either of which can return a parser function that will then be used to continue parsing.
It takes two arguments: the first is the initial parser, and the second is either a user function of value -> parserFn or a map of values => parserFn, where the key is looked up from the result of the first parser. If the parserFn is nil then branch fails, else the parserFn is executed to continue parsing. Because of the anonymous function calls this has a slight overhead, so only use it when switching parsers dynamically based on a parsed value that is more complex than a simple alt parser can handle, or when the count is more than a few branches in size.
This returns only the output from the parser in the map, not the lookup parser.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> symbol_map = %{?b => &uint(&1, 2), ?d => &uint(&1, 10), ?x => &uint(&1, 16)}
iex> context = parse("b101010", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
iex> context = parse("d213478", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 213478, ""}
iex> context = parse("xe1DCf", branch(char(), symbol_map))
iex> {context.error, context.result, context.rest}
{nil, 925135, ""}
iex> context = parse("a", branch(char(), symbol_map))
iex> {context.error.message, context.result, context.rest}
{"Tried to branch to `97` but it was not found in the symbol_map", nil, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> symbol_mapper = fn
iex> ?b -> &uint(&1, 2)
iex> ?d -> &uint(&1, 10)
iex> ?x -> &uint(&1, 16)
iex> _value -> nil # Always have a default case. :-)
iex> end
iex> context = parse("b101010", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
iex> context = parse("d213478", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 213478, ""}
iex> context = parse("xe1DCf", branch(char(), symbol_mapper))
iex> {context.error, context.result, context.rest}
{nil, 925135, ""}
iex> context = parse("a", branch(char(), symbol_mapper))
iex> {context.error.message, context.result, context.rest}
{"Tried to branch to `97` but it was not found in the symbol_map", nil, ""}
Defining a rule defines a parser as well as some associated information.
Such associated information can be its name for error reporting purposes or a mapping function so you can convert the output on the fly (fantastic for in-line AST generation, for example!), among others.
It is used like any other normal terminal rule.
All of the following examples use this definition of rules in a module:
defmodule ExSpirit.Tests.Parser do
use ExSpirit.Parser, text: true
defrule testrule(
seq([ uint(), lit(? ), uint() ])
)
defrule testrule_pipe(
seq([ uint(), lit(? ), uint() ])
), pipe_result_into: Enum.map(fn i -> i-40 end)
defrule testrule_fun(
seq([ uint(), lit(? ), uint() ])
), fun: (fn context -> %{context | result: {"altered", context.result}} end).()
defrule testrule_context(context) do
%{context | result: "always success"}
end
defrule testrule_context_arg(context, value) do
%{context | result: value}
end
end
## Examples
# You can use `defrule`s as any other terminal parser
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [42, 64], ""}
# `defrule`s also set up a stack of calls down a context so you know
# 'where' an error occurred, so name the rules descriptively
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 fail", testrule())
iex> {contexts.error.context.rulestack, contexts.result, contexts.rest}
{[:testrule], nil, "fail"}
# `defrule`s can map the result to return a different one:
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_pipe())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [2, 24], ""}
# `defrule`s can also operate over the context itself to do anything
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_fun())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, {"altered", [42, 64]}, ""}
# `defrule`s can also be a context function by only passing in `context`
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_context())
iex> {contexts.error, contexts.result, contexts.rest}
{nil, "always success", "42 64"}
# `defrule`s with a context can have other arguments too, but context
# must always be first
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", testrule_context_arg(:success))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, :success, "42 64"}
Succeeds if at the “End Of Input”, else fails.
If the argument is statically pass_result: true then it passes on the prior return value.
If the argument is statically result: whatever, with whatever being what you want to return, then it will set the result to that value on success.
pass_result must be false to use result: value, otherwise the result option is skipped.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> eoi())
iex> {context.error, context.result, context.rest}
{nil, nil, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> eoi(pass_result: true))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> eoi(result: :success))
iex> {context.error, context.result, context.rest}
{nil, :success, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42a", uint() |> eoi())
iex> {is_map(context.error), context.result, context.rest}
{true, nil, "a"}
Takes a parser but if it fails then it returns a hard error that will prevent further parsers, even in branch tests, from running.
The purpose of this parser is to report parsing errors hard at the correct parsing site. For example, if you are parsing an alt of parsers and you parse out a ‘let’ followed by an identifier, then if the identifier fails you do not want to let the alt try the next parser. Instead you want to fail out hard with an error message related to the proper place the parse failed, rather than trying other parsers that you know will not succeed anyway.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("do 10", lit("do ") |> expect(uint()))
iex> {context.error, context.result, context.rest}
{nil, 10, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", lit("do ") |> expect(uint()))
iex> %ExSpirit.Parser.ExpectationFailureException{} = context.error
iex> {context.error.message, context.result, context.rest}
{"Parsing uint with radix of 10 had 0 digits but 1 minimum digits were required", nil, "nope"}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", alt([ lit("do ") |> expect(uint()), lit("blah") ]))
iex> %ExSpirit.Parser.ExpectationFailureException{} = context.error
iex> {context.error.message, context.result, context.rest}
{"Parsing uint with radix of 10 had 0 digits but 1 minimum digits were required", nil, "nope"}
# Difference without the `expect`
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("do nope", alt([ lit("do ") |> uint(), lit("blah") ]))
iex> %ExSpirit.Parser.ParseException{} = context.error
iex> {context.error.message =~ "Alt failed all branches:", context.result, context.rest}
{true, nil, "do nope"}
The fail parser always fails, documenting the user information passed in.
Examples
iex> import ExSpirit.Parser
iex> context = parse("", fail(42))
iex> {context.error.extradata, context.result, context.rest}
{42, nil, ""}
Get something(s) from the state and put it into the locations in the parser that are marked with &1-* bindings.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:A", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into([:test], char(&1)))
iex> {context.error, context.result, context.rest}
{nil, ?A, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:B", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into([:test], char(&1)))
iex> {String.starts_with?(context.error.message, "Tried parsing out any of the the characters of"), context.result, context.rest}
{true, nil, "B"}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("A:B", char() |> put_state(:test, :result) |> lit(?:) |> get_state_into(:test, :result))
iex> {context.error, context.result, context.rest}
{nil, ?A, "B"}
Takes and runs a parser but ignores the result of the parser, instead returning nil.
Can be given the option of pass_result: true to pass the previous result on.
Examples
# `ignore` will run the parser but return no result
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("Test", ignore(char([?a..?z, ?T])))
iex> {context.error, context.result, context.rest}
{nil, nil, "est"}
# `ignore` will pass on the previous result if you want it to
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42Test", uint() |> ignore(char([?a..?z, ?T]), pass_result: true))
iex> {context.error, context.result, context.rest}
{nil, 42, "est"}
Returns the entire parsed text from the parser, regardless of the actual return value.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("A256B", lexeme(char() |> uint()))
iex> {context.error, context.result, context.rest}
{nil, "A256", "B"}
Looks ahead to confirm success, but does not update the context when successful.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("AA", lit(?A) |> lookahead(lit(?A)) |> char())
iex> {context.error, context.result, context.rest}
{nil, ?A, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("AB", lit(?A) |> lookahead(lit(?A)) |> char())
iex> {String.starts_with?(context.error.message, "Lookahead failed"), context.result, context.rest}
{true, nil, "B"}
Looks ahead to confirm failure, but does not update the context when failed.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("AB", lit(?A) |> lookahead_not(lit(?A)) |> char())
iex> {context.error, context.result, context.rest}
{nil, ?B, ""}
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("AA", lit(?A) |> lookahead_not(lit(?A)) |> char())
iex> {String.starts_with?(context.error.message, "Lookahead_not failed"), context.result, context.rest}
{true, nil, "A"}
The no_skip combination parser takes a parser and clears the skipper so that it does no skipping.
Good for parsing non-skippable content within a larger parser.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse(" Test:42 ", lit("Test:") |> no_skip(uint()), skipper: lit(?\s))
iex> {context.error, context.result, context.rest}
{nil, 42, " "}
The parse function is applied to the input and a parser call, such as in:
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint())
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
Runs a function and parser with both the context before and after the function call.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> fun = fn {pre, post} -> %{post|result: {pre, post}} end
iex> context = parse("42", pipe_context_around(fun.(), uint()))
iex> {pre, post} = context.result
iex> {context.error, pre.column, post.column, context.rest}
{nil, 1, 3, ""}
Runs a function with the context.
TODO: Expand this a lot.
Examples
iex> import ExSpirit.Parser
iex> fun = fn c -> %{c|result: 42} end
iex> context = parse("a", pipe_context_into(fun.()))
iex> {context.error, context.result, context.rest}
{nil, 42, "a"}
Runs a function with the result.
Examples
iex> import ExSpirit.Parser
iex> fun = fn nil -> 42 end
iex> context = parse("a", pipe_result_into(fun.()))
iex> {context.error, context.result, context.rest}
{nil, 42, "a"}
Pushes something onto the list stored in the state at the specified key.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> push_state(:test, :result))
iex> {context.error, context.result, context.rest, context.state}
{nil, 42, "", %{test: [42]}}
Puts something into the state at the specified key.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("42", uint() |> put_state(:test, :result))
iex> {context.error, context.result, context.rest, context.state}
{nil, 42, "", %{test: 42}}
Repeats over a parser a bounded number of times, returning the results as a list.
It does have a slight overhead compared to a repeat count known ahead of time due to an anonymous function call, but that is necessary when performing a dynamic number of repetitions without mutable variables.
The optional arguments are the minimum number of repeats required, default of 0, and the maximum number of repeats, default of -1 (infinite).
Examples
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T)))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1, 10))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 1, 2))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T], "TX"}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeat(char(?T), 4))
iex> {context.error.message, context.result, context.rest}
{"Repeating over a parser failed due to not reaching the minimum amount of 4 with only a repeat count of 3", nil, "X"}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTT", repeat(char(?T), 4))
iex> {context.error.message, context.result, context.rest}
{"Repeating over a parser failed due to not reaching the minimum amount of 4 with only a repeat count of 3", nil, ""}
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("", repeat(char(?T)))
iex> {context.error, context.result, context.rest}
{nil, [], ""}
The repeat function parser allows you to pass in a parser function to repeat over, but is otherwise identical to repeat, especially as repeat delegates to repeatFn.
See ExSpirit.Parser.repeat/4 for more.
Examples
iex> import ExSpirit.Parser, only: :macros
iex> import ExSpirit.Tests.Parser
iex> context = parse("TTTX", repeatFn(fn c -> c |> char(?T) end))
iex> {context.error, context.result, context.rest}
{nil, [?T, ?T, ?T], "X"}
The Sequence operator runs all of the parsers in the inline list (cannot be a variable) and returns their results as a list.
Any nils returned are not added to the result list, and if the result list has only a single value then that value is returned directly without being wrapped in a list.
Examples
# `seq` parses a sequence returning the return of all of them, removing nils,
# as a list if more than one or the raw value if only one, if any fail then
# all fail.
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42 64", seq([uint(), lit(" "), uint()]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, [42, 64], ""}
# Here `seq` returns only a single value
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> contexts = parse("42Test", seq([uint(), lit("Test")]))
iex> {contexts.error, contexts.result, contexts.rest}
{nil, 42, ""}
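The result-shaping rule described for seq (drop nils, then unwrap a lone value) can be modelled in a few lines of plain Elixir. This is only a sketch of the rule as stated in the text, not the library's implementation; SeqResultSketch and collapse are hypothetical names. The text does not say what happens when every result is nil, so the sketch simply returns the empty list in that case.

```elixir
defmodule SeqResultSketch do
  # Drop nil results, then unwrap the list if exactly one value survives.
  # Assumption: an all-nil input yields [], which the source does not specify.
  def collapse(results) do
    case Enum.reject(results, &is_nil/1) do
      [single] -> single
      many -> many
    end
  end
end

# Mirrors seq([uint(), lit(" "), uint()]) on "42 64": the literal contributes nil.
IO.inspect(SeqResultSketch.collapse([42, nil, 64]))

# Mirrors seq([uint(), lit("Test")]) on "42Test": only one value survives.
IO.inspect(SeqResultSketch.collapse([42, nil]))
```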
Runs the skipper now.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse(" a", skip(), skipper: chars(?\s, 0))
iex> {context.error, context.result, context.rest}
{nil, nil, "a"}
The skipper combination parser takes a parser and changes the skipper within it to the one you pass in for the duration of the parser that you pass in.
Examples
# You can change a skipper for a parser as well with `skipper`
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse(" Test: 42 ", lit("Test:") |> skipper(uint(), lit(?\t)), skipper: lit(?\s))
iex> {context.error, context.result, context.rest}
{nil, 42, " "}
The success parser always returns the passed in value, default of nil, successfully like a parsed value.
Examples
iex> import ExSpirit.Parser
iex> context = parse("", success(42))
iex> {context.error, context.result, context.rest}
{nil, 42, ""}
Wraps the result of the passed in parser in a standard erlang 2-tuple, where the first element is the tag that you pass in and the second is the result of the parser.
Examples
iex> import ExSpirit.Parser
iex> import ExSpirit.Tests.Parser
iex> context = parse("ff", tag(:integer, uint(16)))
iex> {context.error, context.result, context.rest}
{nil, {:integer, 255}, ""}
Tests whether the context is valid.
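No example accompanies this entry. Every example on this page treats a context whose error field is nil as a successful parse, so one plausible reading is that "valid" means "no error recorded". The sketch below models that reading in plain Elixir; ValidContextSketch and valid? are hypothetical names, and the actual check performed by the library may differ.

```elixir
defmodule ValidContextSketch do
  # Assumption: a context is valid exactly when its error field is nil.
  def valid?(%{error: nil}), do: true
  def valid?(%{error: _}), do: false
end

ok = %{error: nil, result: 42, rest: ""}
failed = %{error: "expected a digit", result: nil, rest: "x"}

IO.inspect({ValidContextSketch.valid?(ok), ValidContextSketch.valid?(failed)})
```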