Zippers

Introduction

Elixir provides the functions Macro.traverse/4, Macro.prewalk/3 and Macro.postwalk/3 to traverse the AST. They are very useful if we want to collect the nodes of some type, or to perform localized manipulations to the current node we are visiting, but it becomes increasingly hard to use when we need to remove a node, insert siblings, or get information about other nodes. To mitigate these issues, we can reach for a zipper.

A zipper is a data structure that represents a location in the data, composed by the focus, ie the current element we are standing at, and the context or metadata that holds the rest of the data from the perspective of the focus. This is simpler to visualize with a list. Let's consider the list of integers from 1 to 10, and let's stand at the 5th element:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
             
           focus

From the perspective of the 5, the surrounding elements are the numbers from 4 to 1 at it's left, and the numbers from 6 to 10 at it's right. We can encode that information in a tuple:

             focus
               
{[4, 3, 2, 1], 5, [6, 7, 8, 9, 10]}
                         
left siblings      right siblings

That data structure, composed of the focus and the rest of the data as viewed from the focus, is called a zipper. In a traditional list, if we wanted to go to the previous element, we would have to start from the beginning of the list and traverse it all the way down until the current element minus one position. With a list zipper, however, all we need to do is to pop the first element of the left siblings to get the new focus, and put the old focus in the list of right siblings. After that we get a new zipper focused on 4:

{[3, 2, 1], 4, [5, 6, 7, 8, 9, 10]}

If we wanted to move to the right, we would perform the same operation, but popping from the right and prepending to the left. Such movements are performed in constant time, making zippers an attractive option when a lot of movements over a list are required, but things start to get more interesting when we use zippers with trees. When moving across a list, we are constrained to horizontal movements(left and right), but in trees we have an extra dimension to move around(up and down). So we need to encode at least four things: the focus, the left and right siblings, and the parent nodes in the tree.

Sourceror provides a zipper implementation for the Elixir AST, let's see what it looks like. First, we need to install Sourceror:

Mix.install([
  {:sourceror, "~> 0.9"}
])

The module that implements zippers for the Elixir AST is Sourceror.Zipper, we will alias it Z for convenience:

alias Sourceror.Zipper, as: Z

To create a zipper for a tree, we use Z.zip/1. Let's first start with a tree of lists first, since they are easier to visualize than a full blown AST:

zipper = Z.zip([1, [2, 3], 4, [5, [6, 7]], 8])

For now it's not very interesting, it's just a 2-tuple where the first element is the tree we passed in, and the second one is just nil, meaning we are at the topmost node in the tree. Let's see what happens when we go down a level:

zipper |> Z.down()

Now it's getting more interesting! Now the first element is 1, meaning we went down a level and we are now focusing at the first element of the list. The second element of the tuple is now a map holding the rest of the tree from the perspective of 1. We are at the first element, so our left siblings l is nil, our right siblings are the rest of the elements in the list, and our parent is the zipper of the parent node. If we now move one position to the right, we get a different zipper:

zipper |> Z.down() |> Z.right()

Now we are focusing [2, 3], which was popped from the right siblings and the 1 we had in the previous step was added to the left siblings. We didn't move vertically, so the parent tree stayed the same.

Besides moving around, we can also change the tree as a localized operation. For example, let's add an element to the right of our position:

zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!)

We can see that the atom :to_the_right! was added to our right siblings. let's now remove it:

zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!) |> Z.right() |> Z.remove()

A couple interesting things happened. First, the item was removed, we can see it nowhere in the zipper. Second, the focus moved to the previous element, but inside of the [2, 3] list. Whenever we remove an item, the focus is moved to the previous element in depth-first order. It's as if we did a Macro.postwalk step, but backwards. Let's look a what happens if we go higher in the tree:

zipper |> Z.down() |> Z.right() |> Z.insert_right(:to_the_right!) |> Z.up()

Here instead of removing the added node we went back up. The list was reconstructed and the second element is now nil, meaning we reached the top. One way is that when we go down a level, the tree is deconstructed to adjust to the new focus, ie the zipper is being opened, and when we go up a level the tree is reconstructed, ie the zipper is being closed.

Now, while it's useful to know how to move up, down, left and right, this only works if we know the shape of the tree before hand. In the real world, we never know the shape of the tree, and so we use other kinds of movements to traverse it. For zippers, we have Z.next/1 and Z.prev/1 to move one step forward or backwards in depth-first pre-order, and Z.traverse/2-3 to perform a depth-first pre-order traversal while calling a function at each location.

To demonstrate the ability of zippers to look around at the rest of the tree, we will traverse a tree of lists and integers, multiplying each integer by the integer at it's right, or removing it if it's already the rightmost sibling, or the following element is another list:

[1, 2, 3, [4, 5, 6, [7, 8, 9], 10, 11], 12, 13, 14]
|> Z.zip()
|> Z.traverse(fn
  {x, _} = zipper when is_integer(x) ->
    with {right, _} when is_integer(right) <- Z.right(zipper) do
      Z.replace(zipper, x * right)
    else
      _ -> Z.remove(zipper)
    end

  zipper ->
    zipper
end)
|> Z.node()

The traversal function takes a zipper representing the current position, and must return a new zipper. When we get back to the top of the tree, the second element will be the atom :end, signifying that the traversal ended, preventing subsequent calls to Z.next/1 from going down the tree again. The Z.node/1 function just extracts the current node from the zipper, so we can look at the final result instead of a zipper.

That wraps up the introduction to zippers. We'll see them in action in the "Multi alias expansion" livebook, where we use them to simplify the traversal and transformations.