00 Introduction
Nibble is a parser combinator library with a twist: it includes a lexer combinator library as well! If some of those words already started to sound like gibberish to you then don’t worry, this introduction is going to get you up to speed on the core concepts!
Your first parser!
```gleam
import gleam/option.{None, Some}
import nibble
import nibble/lexer

type T {
  Hello
  Name(String)
}

fn lexer() {
  lexer.simple([
    lexer.token("hello", Hello),
    lexer.variable("\\w", "\\w", Name),
    // Skip over whitespace, we don't care about it!
    lexer.whitespace(Nil)
      |> lexer.ignore,
  ])
}

fn parser() {
  use _ <- nibble.do(nibble.token(Hello))
  use name <- nibble.do(name_parser())

  nibble.return("You are greeting " <> name)
}

fn name_parser() {
  use tok <- nibble.take_map("Expected a name")

  case tok {
    Name(name) -> Some(name)
    _ -> None
  }
}

pub fn main() {
  let input = "Hello Joe"

  // Both steps can fail, so both return a Result: lexing fails on
  // unexpected characters, parsing fails on unexpected tokens.
  let assert Ok(tokens) = lexer.run(input, lexer())
  let assert Ok(greeting) = nibble.run(tokens, parser())

  greeting //=> "You are greeting Joe"
}
```
Terminology
Throughout Nibble’s docs we use words that not all Gleamlins might have come across before. Here’s a quick rundown of the important terms and concepts to know:
What is a combinator?
Although you can find more-formal definitions of what a combinator is (looking at you, combinatory logic), we’re Gleamlins here and we like to keep things simple. For our purposes we can think of combinators as functions that work together like building blocks for more complex behaviour.
You’ll have seen combinators already if you’ve ever written any code using `gleam/dynamic`! With `gleam/dynamic` you combine decoders together to create more complex ones:

```gleam
dynamic.field("wibble", dynamic.list(dynamic.int))
```
We can take the simple `dynamic.int` decoder and combine it with `dynamic.list` to get back a decoder that can decode a list of ints. And we can combine that decoder with `dynamic.field` to get back a decoder that can decode a list of ints from an object field called `"wibble"`! We can keep going, continuing to build decoders up from smaller pieces: this is the essence of combinators!
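To see that in action: a decoder is ultimately just a function from `Dynamic` data to a `Result`, so we can call it directly. This is only a sketch, and the names `decode_wibble` and `data` are hypothetical:

```gleam
import gleam/dynamic

// Decode a list of ints from the "wibble" field of some dynamic `data`,
// e.g. a value that came from parsed JSON.
fn decode_wibble(data: dynamic.Dynamic) {
  let decoder = dynamic.field("wibble", dynamic.list(dynamic.int))

  // For a value shaped like {"wibble": [1, 2, 3]} this
  // produces Ok([1, 2, 3]).
  decoder(data)
}
```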
What is a parser?
In the broadest sense, a parser takes an unstructured sequence of stuff (often characters in a string or tokens in a list) and turns it into something more structured. At their core, all parsers can be thought of as variations of the same basic idea:
```gleam
type Parser(a, b) =
  fn(List(a)) -> #(b, List(a))
```
In the real world parsers tend to be a bit more complex than this, including things like errors and failure cases, position tracking, and so on. But in essence parsers are combinators, and just like `gleam/dynamic` that means we can combine them together to parse very complex things.
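For example, accounting for failure might mean wrapping the return value in a `Result`, so the leftover input only comes back when parsing succeeds. This is just a sketch to build intuition, not Nibble’s actual `Parser` type:

```gleam
// A parser that can fail: on success we get the parsed value and the
// remaining input, on failure we get some error of type `e`.
type Parser(a, b, e) =
  fn(List(a)) -> Result(#(b, List(a)), e)
```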
What is a lexer?
A lexer is a special kind of parser that takes an input string and turns it into a list of tokens. Lexers are common outside of parser combinator libraries, and see a lot of use in parser generators like lex and flex as well as in hand-written parsers.
The idea is that often we want to work at a slightly higher level of abstraction than plain characters. Lexers apply simple rules to turn groups of characters into tokens: we saw in the example above that the lexer turned the string `"Hello Joe"` into the list of tokens `[Hello, Name("Joe")]`.
This is Nibble’s twist: we include a lexer combinator library as part of the package! This lets your parsers focus on higher-level transformations while the lexer deals with the simpler stuff.
If we were writing a parser for a programming language, we might want to parse variable declarations like:
```
let x = 10
```
Using Nibble’s approach we would write a lexer that turns this input into a list of tokens:
```
[Let, Name("x"), Eq, Int(10)]
```
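Here’s a rough sketch of what that lexer might look like, following the same shape as the lexer at the top of this guide. This is illustrative only: the `TokenT` type is hypothetical, and the `lexer.keyword` and `lexer.int` matchers are assumptions that may differ between Nibble versions:

```gleam
type TokenT {
  Let
  Name(String)
  Eq
  Int(Int)
}

fn declaration_lexer() {
  lexer.simple([
    // Assumed: `keyword` matches "let" only when the next character is
    // not a word character, so "letter" still lexes as a Name.
    lexer.keyword("let", "\\W", Let),
    lexer.token("=", Eq),
    // Assumed: `int` turns runs of digits into an Int-carrying token.
    lexer.int(Int),
    lexer.variable("\\w", "\\w", Name),
    lexer.whitespace(Nil)
      |> lexer.ignore,
  ])
}
```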
Now the parser we write doesn’t have to worry about the details of how to turn the characters `"1"` and `"0"` into the integer `10`, or how to match the string `"let"` but not the string `"letter"`. Instead we can focus on the higher-level structure of the input and let our parser more closely match how we think about the problem:
```gleam
use _ <- nibble.do(nibble.token(Let))
use name <- nibble.do(name_parser())
use _ <- nibble.do(nibble.token(Eq))
use expr <- nibble.do(int_parser())

nibble.return(Variable(name, expr))
```
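For completeness, `int_parser` could follow the same `take_map` pattern as `name_parser` earlier, assuming the hypothetical `Int` token from the lexer sketch above:

```gleam
fn int_parser() {
  use tok <- nibble.take_map("Expected an int")

  case tok {
    // Pull the Gleam Int out of the token and succeed with it.
    Int(n) -> Some(n)
    _ -> None
  }
}
```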