Zippers
Introduction
Elixir provides the functions Macro.traverse/4
, Macro.prewalk/3
and Macro.postwalk/3
to traverse
the AST. They are very useful if we want to collect the nodes of some type, or to perform localized
manipulations to the current node we are visiting, but it becomes increasingly hard to use when
we need to remove a node, insert siblings, or get information about other nodes. To mitigate these
issues, we can reach for a zipper.
A zipper is a data structure that represents a location in the data, composed by the focus, ie the current element we are standing at, and the context or metadata that holds the rest of the data from the perspective of the focus. This is simpler to visualize with a list. Let's consider the list of integers from 1 to 10, and let's stand at the 5th element:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
↑
focus
From the perspective of the 5
, the surrounding elements are the numbers from 4
to 1
at it's left,
and the numbers from 6
to 10
at it's right. We can encode that information in a tuple:
focus
↓
{[4, 3, 2, 1], 5, [6, 7, 8, 9, 10]}
↑ ↑
left siblings right siblings
That data structure, composed of the focus and the rest of the data as viewed from the focus, is called
a zipper. In a traditional list, if we wanted to go to the previous element, we would have to
start from the beginning of the list and traverse it all the way down until the current element minus
one position. With a list zipper, however, all we need to do is to pop the first element of the
left siblings to get the new focus, and put the old focus in the list of right siblings. After that
we get a new zipper focused on 4
:
{[3, 2, 1], 4, [5, 6, 7, 8, 9, 10]}
If we wanted to move to the right, we would perform the same operation, but popping from the right and prepending to the left. Such movements are performed in constant time, making zippers an attractive option when a lot of movements over a list are required, but things start to get more interesting when we use zippers with trees. When moving across a list, we are constrained to horizontal movements(left and right), but in trees we have an extra dimension to move around(up and down). So we need to encode at least four things: the focus, the left and right siblings, and the parent nodes in the tree.
Sourceror provides a zipper implementation for the Elixir AST, let's see what it looks like. First, we need to install Sourceror:
Mix.install([
{:sourceror, "~> 0.9"}
])
The module that implements zippers for the Elixir AST is Sourceror.Zipper
, we will alias it Z
for
convenience:
alias Sourceror.Zipper, as: Z
To create a zipper for a tree, we use Z.zip/1
. Let's first start with a tree of lists first, since
they are easier to visualize than a full blown AST:
zipper = Z.zip([1, [2, 3], 4, [5, [6, 7]], 8])
For now it's not very interesting, it's just a 2-tuple where the first element is the tree we passed
in, and the second one is just nil
, meaning we are at the topmost node in the tree. Let's see what
happens when we go down a level:
zipper |> Z.down()
Now it's getting more interesting! Now the first element is 1
, meaning we went down a level and we
are now focusing at the first element of the list. The second element of the tuple is now a map holding
the rest of the tree from the perspective of 1
. We are at the first element, so our left siblings
l
is nil
, our right siblings are the rest of the elements in the list, and our parent is the
zipper of the parent node. If we now move one position to the right, we get a different zipper:
zipper |> Z.down() |> Z.right()
Now we are focusing [2, 3]
, which was popped from the right siblings and the 1
we had in the
previous step was added to the left siblings. We didn't move vertically, so the parent tree stayed
the same.
Besides moving around, we can also change the tree as a localized operation. For example, let's add an element to the right of our position:
zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!)
We can see that the atom :to_the_right!
was added to our right siblings. let's now remove it:
zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!) |> Z.right() |> Z.remove()
A couple interesting things happened. First, the item was removed, we can see it nowhere in the zipper.
Second, the focus moved to the previous element, but inside of the [2, 3]
list. Whenever we remove
an item, the focus is moved to the previous element in depth-first order. It's as if we did a
Macro.postwalk
step, but backwards. Let's look a what happens if we go higher in the tree:
zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!) |> Z.up()
Here instead of removing the added node we went back up. The list was reconstructed and the second
element is now nil
, meaning we reached the top. One way is that when we go down a level, the
tree is deconstructed to adjust to the new focus, ie the zipper is being opened, and when we go up
a level the tree is reconstructed, ie the zipper is being closed.
Now, while it's useful to know how to move up, down, left and right, this only works if we know the
shape of the tree before hand. In the real world, we never know the shape of the tree, and so we use
other kinds of movements to traverse it. For zippers, we have Z.next/1
and Z.prev/1
to move
one step forward or backwards in depth-first pre-order, and Z.traverse/2-3
to perform a depth-first
pre-order traversal while calling a function at each location.
To demonstrate the ability of zippers to look around at the rest of the tree, we will traverse a tree of lists and integers, multiplying each integer by the integer at it's right, or removing it if it's already the rightmost sibling, or the following element is another list:
[1, 2, 3, [4, 5, 6, [7, 8, 9], 10, 11], 12, 13, 14]
|> Z.zip()
|> Z.traverse(fn
{x, _} = zipper when is_integer(x) ->
with {right, _} when is_integer(right) <- Z.right(zipper) do
Z.replace(zipper, x * right)
else
_ -> Z.remove(zipper)
end
zipper ->
zipper
end)
|> Z.node()
The traversal function takes a zipper representing the current position, and must return a new zipper.
When we get back to the top of the tree, the second element will be the atom :end
, signifying that
the traversal ended, preventing subsequent calls to Z.next/1
from going down the tree again. The
Z.node/1
function just extracts the current node from the zipper, so we can look at the final result
instead of a zipper.
That wraps up the introduction to zippers. We'll see them in action in the "Multi alias expansion" livebook, where we use them to simplify the traversal and transformations.