Parsey (parsey v0.0.3)

A library to set up basic parsing requirements for non-complex nested inputs.

Parsing behaviour is defined using rulesets, which take the form [rule]. The rules in a ruleset are tried in the order they are defined, so the first rule in the set has a higher priority than the last rule in the set.

A rule is a named matching expression. The name of a rule can be any atom, and multiple rules may share the same name. The matching expression can be either a Regex or a function.
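
As a rough illustration, the sketch below builds a small ruleset as a keyword list that mixes both matcher kinds. The rule names (`:whitespace`, `:bool`, `:number`) and the `bool` function are made up for this example and are not part of the library; the function only follows the matcher() type, returning a list of `{index, length}` tuples on a match and `nil` otherwise.

```elixir
# Hypothetical function matcher: matches a literal "true" or "false" at the
# start of the input and reports how many bytes were matched.
bool = fn
  <<"true", _::binary>> -> [{0, 4}]
  <<"false", _::binary>> -> [{0, 5}]
  _ -> nil
end

# A ruleset is a list of {name, matcher} rules. Two rules share the name
# :number; the decimal pattern is listed first so it is tried before the
# integer pattern.
rules = [
  whitespace: ~r/\A\s+/,
  bool: bool,
  number: ~r/\A\d+\.\d+/,
  number: ~r/\A\d+/
]

IO.inspect(Keyword.keys(rules))   # [:whitespace, :bool, :number, :number]
IO.inspect(bool.("false!"))       # [{0, 5}]
IO.inspect(bool.("maybe"))        # nil
```

A keyword list keeps duplicate keys and preserves order, which is why it suits a prioritised ruleset in which several rules share a name.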

Rules may additionally be configured to specify the option that will be returned in the ast, how the ruleset is modified (which rules to exclude, include or re-define), and whether the rule should be ignored (not added to the ast).
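
A configured rule is written as a map instead of a bare matcher. The following is only a sketch of what such a ruleset might look like: the keys come from the rule() type, while the rule names and the `:hashtag` option value are invented for illustration, and the comments describe the intent as stated above, not verified library behaviour.

```elixir
rules = [
  # ignore: true asks for the match to be consumed but left out of the ast.
  whitespace: %{match: ~r/\A\s+/, ignore: true},
  # option: attaches an extra value to the rule; per the ast() type it would
  # be expected as the third element of the resulting tuple.
  tag: %{match: ~r/\A#\w+/, option: :hashtag},
  # rules: re-defines the ruleset applied to the matched portion; an empty
  # list presumably leaves nothing further to match inside it.
  number: %{match: ~r/\A\d+/, rules: []}
]

# Every configured rule carries its matcher under the :match key.
IO.inspect(Enum.map(rules, fn {name, rule} -> {name, Map.keys(rule)} end))
```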

The default behaviour of a matched rule is to remove all rules with the same name from the ruleset, and then try to further match the matched input with the new ruleset, returning the ast on completion.

A matcher (whether a regex or a function) returns a list of indices [{ index, length }]. The first tuple in the list (List.first) indicates the portion of the input to be removed, while the last tuple (List.last) indicates the portion of the input to be focused on (parsed further). A function matcher returns nil when it does not match.
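
To make the index list concrete, here is a small sketch using only the standard library. It assumes a regex matcher yields the same shape as `Regex.run/3` with `return: :index` (the function matcher in the example further down calls it that way); the input string and the variable names are illustrative.

```elixir
input = "<b>bold</b> tail"

# Regex.run/3 with return: :index yields {byte_index, length} tuples:
# the whole match first, followed by each capture group.
indices = Regex.run(~r/\A<b>(.*?)<\/b>/, input, return: :index)

{remove_at, remove_len} = List.first(indices)
{focus_at, focus_len} = List.last(indices)

# The first tuple covers everything taken off the input ...
removed = binary_part(input, remove_at, remove_len)
# ... the last tuple covers the part that would be parsed further ...
focused = binary_part(input, focus_at, focus_len)
# ... and whatever follows the removed portion is left for later rules.
rest_at = remove_at + remove_len
rest = binary_part(input, rest_at, byte_size(input) - rest_at)

IO.inspect(indices)   # [{0, 11}, {3, 4}]
IO.inspect(removed)   # "<b>bold</b>"
IO.inspect(focused)   # "bold"
IO.inspect(rest)      # " tail"
```

When a matcher returns a single tuple, `List.first` and `List.last` coincide, so the removed portion and the focused portion are the same span.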

Summary

Functions

parse(input, rules): Parse the given input using the specified ruleset.

Types

ast()
ast() :: String.t() | {name(), [ast()]} | {name(), [ast()], option()}

excluder()
excluder() :: name() | {name(), option()}

formatter()
formatter() :: String.t() | (String.t() -> String.t())

matcher()
matcher() :: Regex.t() | (String.t() -> nil | [{integer(), integer()}])

name()
name() :: atom()

option()
option() :: any()

rule()
rule() ::
  {name(), matcher()}
  | {name(),
     %{
       match: matcher(),
       capture: non_neg_integer(),
       format: formatter(),
       option: option(),
       ignore: boolean(),
       skip: boolean(),
       exclude: excluder() | [excluder()],
       include: rule() | [rule()],
       rules: rule() | [rule()]
     }}

Functions

parse(input, rules)
parse(String.t(), [rule()]) :: [ast()]

Parse the given input using the specified ruleset.

Example

iex> rules = [
...>     whitespace: %{ match: ~r/\A\s/, ignore: true },
...>     element_end: %{ match: ~r/\A<\/.*?>/, ignore: true },
...>     element: %{ match: fn
...>         input = <<"<", _ :: binary>> ->
...>             elements = String.splitter(input, "<", trim: true)
...>
...>             [first] = Enum.take(elements, 1)
...>             [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, first, return: :index)
...>             tag_length = tag_length + 1
...>
...>             { 0, length } = Stream.drop(elements, 1) |> Enum.reduce_while({ 1, 0 }, fn
...>                 element = <<"/", _ :: binary>>, { 1, length } ->
...>                     [{ 0, tag_length }] = Regex.run(~r/\A.*?>/, element, return: :index)
...>                     { :halt, { 0, length + tag_length + 1 } }
...>                 element = <<"/", _ :: binary>>, { count, length } -> { :cont, { count - 1, length + String.length(element) + 1 } }
...>                 element, { count, length } -> { :cont, { count + 1, length + String.length(element) + 1 } }
...>             end)
...>
...>             length = length + String.length(first) + 1
...>             [{ 0, length }, {1, tag_length - 2}, { tag_length, length - tag_length }]
...>         _ -> nil
...>     end, exclude: nil, option: fn input, [_, { index, length }, _] -> String.slice(input, index, length) end },
...>     value: %{ match: ~r/\A\d+/, rules: [] }
...> ]
iex> input = """
...> <array>
...>     <integer>1</integer>
...>     <integer>2</integer>
...> </array>
...> <array>
...>     <integer>3</integer>
...>     <integer>4</integer>
...> </array>
...> """
iex> Parsey.parse(input, rules)
[
    { :element, [
        { :element, [value: ["1"]], "integer" },
        { :element, [value: ["2"]], "integer" }
    ], "array" },
    { :element, [
        { :element, [value: ["3"]], "integer" },
        { :element, [value: ["4"]], "integer" }
    ], "array" }
]
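
The result follows the ast() type: a node is either a string, a `{name, children}` tuple, or a `{name, children, option}` tuple. As a sketch of how such a tree could be consumed, the hypothetical `AstWalk.leaves/1` helper below (not part of Parsey) collects the leaf strings from the output shown above.

```elixir
defmodule AstWalk do
  # Hypothetical helper, not part of Parsey: collects all leaf strings of an
  # ast in document order.
  def leaves(nodes) when is_list(nodes), do: Enum.flat_map(nodes, &leaves/1)
  def leaves(text) when is_binary(text), do: [text]
  def leaves({_name, children}), do: leaves(children)
  def leaves({_name, children, _option}), do: leaves(children)
end

# The ast copied from the example output above.
ast = [
  {:element,
   [
     {:element, [value: ["1"]], "integer"},
     {:element, [value: ["2"]], "integer"}
   ], "array"},
  {:element,
   [
     {:element, [value: ["3"]], "integer"},
     {:element, [value: ["4"]], "integer"}
   ], "array"}
]

IO.inspect(AstWalk.leaves(ast))   # ["1", "2", "3", "4"]
```

Matching on tuple size keeps the helper agnostic to what the option holds, which matters because option() is typed as any().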