Parsey (parsey v0.0.3)
A library for setting up basic parsing of simple (non-complex) nested inputs.
Parsing behaviour is defined using rulesets, which take the form [rule].
Rules are matched in the order in which they are defined, so the first rule
in the set has a higher priority than the last rule in the set.
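The ordering rule can be illustrated without the library. The following is a minimal sketch, not Parsey's implementation: `first_match` is a hypothetical helper that walks a keyword list of regexes and returns the first one that matches, which is the sense in which earlier rules take priority.

```elixir
# Hypothetical ruleset: plain regexes keyed by rule name, highest priority first.
rules = [
  integer: ~r/\A\d+/,
  word: ~r/\A\w+/
]

# Hypothetical helper (not part of Parsey): return the first rule that matches.
first_match = fn input ->
  Enum.find_value(rules, fn { name, regex } ->
    case Regex.run(regex, input) do
      [match | _] -> { name, match }
      nil -> nil
    end
  end)
end

# Both rules could match "123abc", but :integer is listed first and wins.
first_match.("123abc") # => { :integer, "123" }
first_match.("abc123") # => { :word, "abc123" }
```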
A rule is a named matching expression. The name of a rule can be any atom,
and multiple rules may share the same name. The matching expression can be
either a Regex or a function.
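As a sketch of what such rules might look like, the ruleset below follows the `name: %{ match: ... }` shape used in the example for `Parsey.parse`. The rule names and patterns are made up for illustration, and the function matcher returns `nil` when it does not match, mirroring that example.

```elixir
rules = [
  # Two rules sharing the name :number; a keyword list permits duplicate keys.
  number: %{ match: ~r/\A0x[0-9a-fA-F]+/ },
  number: %{ match: ~r/\A\d+/ },
  # A function matcher: returns a list of { index, length } tuples, or nil.
  greeting: %{ match: fn
    <<"hello", _ :: binary>> -> [{ 0, 5 }]
    _ -> nil
  end }
]

Keyword.get_values(rules, :number) |> length() # => 2
```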
Rules may additionally be configured with options that specify any extra
data to be returned in the ast, how the ruleset is modified for further
matching (which rules to exclude, include or re-define), and whether the rule
should be ignored (not added to the ast).
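The sketch below shows rule maps that use only the option keys appearing in the example for `Parsey.parse` (`ignore`, `exclude`, `option`, `rules`). The comments describe how those keys appear to behave in that example and should be read as assumptions rather than a specification; the rule names are hypothetical. The `option` function is called by hand here, with the input and the matcher's indices, to show the value it would contribute.

```elixir
# Assumption: `exclude: nil` keeps same-named rules available for nested input,
# and `option` computes extra data (here the tag name) from the matched indices.
tag = %{
  match: ~r/\A<(\w+)>/,
  exclude: nil,
  option: fn input, [_, { index, length }] -> String.slice(input, index, length) end
}

# Assumption: an ignored rule consumes input but adds nothing to the ast.
comment = %{ match: ~r/\A#.*/, ignore: true }

# Assumption: `rules: []` re-defines the ruleset as empty, so the match is a leaf.
literal = %{ match: ~r/\A\d+/, rules: [] }

input = "<b>1"
indices = Regex.run(tag.match, input, return: :index) # => [{0, 3}, {1, 1}]
tag.option.(input, indices)                           # => "b"
```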
The default behaviour of a matched rule is to remove all rules with the same
name from the ruleset, and then attempt to further match the matched input
against the new ruleset. The ast is returned on completion.
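The default ruleset modification can be pictured with `Keyword.delete/2`, which removes every entry stored under a key. This is only an illustration of the described behaviour on a hypothetical ruleset, not a claim about how Parsey implements it.

```elixir
# Hypothetical ruleset containing two rules named :element.
rules = [
  whitespace: ~r/\A\s/,
  element: ~r/\A<.*?>/,
  element: ~r/\A\[.*?\]/,
  value: ~r/\A\d+/
]

# After an :element rule matches, all rules with that name are dropped
# before the matched input is parsed further.
nested_rules = Keyword.delete(rules, :element)

Keyword.keys(nested_rules) # => [:whitespace, :value]
```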
Matchers (both regexes and functions) return a list of indices,
[{ index, length }]. The first tuple in the list (List.first) indicates the
portion of the input to be removed, while the last tuple (List.last)
indicates the portion of the input to be focused on (parsed further).
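To make the index list concrete, the sketch below uses `Regex.run/3` with `return: :index`, which produces the same `[{ index, length }]` shape: the whole match first, then each capture group. The input string is made up, and it is ASCII only so that byte offsets and character offsets coincide.

```elixir
input = "<b>bold</b> rest"

# The whole match comes first, followed by the capture group.
indices = Regex.run(~r/\A<b>(.*?)<\/b>/, input, return: :index)
# => [{0, 11}, {3, 4}]

{ remove_index, remove_length } = List.first(indices)
{ focus_index, focus_length } = List.last(indices)

# The portion that would be removed from the input.
removed = binary_part(input, remove_index, remove_length) # => "<b>bold</b>"

# The portion that would be focused on (parsed further).
focused = binary_part(input, focus_index, focus_length)   # => "bold"
```

When a matcher returns a single tuple, `List.first` and `List.last` refer to the same tuple, so the removed portion and the focused portion are identical.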
Functions
Parse the given input using the specified ruleset.
Example
iex> rules = [
...>     whitespace: %{ match: ~r/\A\s/, ignore: true },
...>     element_end: %{ match: ~r/\A<\/.*?>/, ignore: true },
...>     element: %{ match: fn
...>         input = <<"<", _ :: binary>> ->
...>             # Split into the chunks that follow each "<".
...>             elements = String.splitter(input, "<", trim: true)
...>
...>             # Length of the opening tag, including the leading "<".
...>             [first] = Enum.take(elements, 1)
...>             [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, first, return: :index)
...>             tag_length = tag_length + 1
...>
...>             # Track nesting depth until the matching closing tag is reached.
...>             { 0, length } = Stream.drop(elements, 1) |> Enum.reduce_while({ 1, 0 }, fn
...>                 element = <<"/", _ :: binary>>, { 1, length } ->
...>                     [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, element, return: :index)
...>                     { :halt, { 0, length + tag_length + 1 } }
...>                 element = <<"/", _ :: binary>>, { count, length } -> { :cont, { count - 1, length + String.length(element) + 1 } }
...>                 element, { count, length } -> { :cont, { count + 1, length + String.length(element) + 1 } }
...>             end)
...>
...>             # Whole element, tag name, and the content after the opening tag.
...>             length = length + String.length(first) + 1
...>             [{ 0, length }, {1, tag_length - 2}, { tag_length, length - tag_length }]
...>         _ -> nil
...>     end, exclude: nil, option: fn input, [_, { index, length }, _] -> String.slice(input, index, length) end },
...>     value: %{ match: ~r/\A\d+/, rules: [] }
...> ]
iex> input = """
...> <array>
...> <integer>1</integer>
...> <integer>2</integer>
...> </array>
...> <array>
...> <integer>3</integer>
...> <integer>4</integer>
...> </array>
...> """
iex> Parsey.parse(input, rules)
[
    { :element, [
        { :element, [value: ["1"]], "integer" },
        { :element, [value: ["2"]], "integer" }
    ], "array" },
    { :element, [
        { :element, [value: ["3"]], "integer" },
        { :element, [value: ["4"]], "integer" }
    ], "array" }
]