# `ShotDs.Util.Lexer`
[🔗](https://github.com/jcschuster/ShotDs/blob/v1.2.1/lib/shot_ds/util/lexer.ex#L1)

Contains a `tokenize/1` function for tokenizing a string representing a
formula in THF syntax or a TPTP TH0/TH1 problem file using `NimbleParsec`.
This is mainly used as a preprocessing step for parsing. For information about
the returned structure, see
https://hexdocs.pm/nimble_parsec/NimbleParsec.html.

Every token is a 3-tuple `{type, value, byte_offset}` where `byte_offset` is
the position of the first byte of the token in the original input string.
Downstream parsers use this to report the position of type errors.

## Examples

    iex> {:ok, tokens, "", _, _, _} = tokenize("A & B")
    iex> tokens
    [{:var "A", 0}, {:and, "&", 2}, {:var, "B", 4}]

# `tokens`

```elixir
@type tokens() :: [{atom(), String.t() | atom(), non_neg_integer()}]
```

Tokens are generated by the lexer as a list of `{type, value, byte_offset}`
triples, where `type` is a label given as atom, `value` is the original
string representation of the token, and `byte_offset` is the position in
the original input.

# `byte_offset_to_line_col`

```elixir
@spec byte_offset_to_line_col(String.t(), non_neg_integer()) ::
  {pos_integer(), pos_integer()}
```

Converts a byte offset into the source string into a `{line, column}` pair
(1-indexed). Used for decorating type-error messages produced by the parser.

# `tokenize`

```elixir
@spec tokenize(binary(), keyword()) ::
  {:ok, [term()], rest, context, line, byte_offset}
  | {:error, reason, rest, context, line, byte_offset}
when line: {pos_integer(), byte_offset},
     byte_offset: non_neg_integer(),
     rest: binary(),
     reason: String.t(),
     context: map()
```

Parses the given `binary` as tokenize.

Returns `{:ok, [token], rest, context, position, byte_offset}` or
`{:error, reason, rest, context, line, byte_offset}` where `position`
describes the location of the tokenize (start position) as `{line, offset_to_start_of_line}`.

To column where the error occurred can be inferred from `byte_offset - offset_to_start_of_line`.

## Options

  * `:byte_offset` - the byte offset for the whole binary, defaults to 0
  * `:line` - the line and the byte offset into that line, defaults to `{1, byte_offset}`
  * `:context` - the initial context value. It will be converted to a map

---

*Consult [api-reference.md](api-reference.md) for complete listing*