Sunday, November 27, 2016

99 Clojure Problems – 64: Binary Tree Layout (1)

"Given a binary tree as defined before being either a vector of the form [v l r] or nil. As a preparation for drawing the tree, a layout algorithm is required to determine the position of each node in a rectangular grid. In this layout strategy, the position of a node v is obtained by the following two rules:

  • x(v) is equal to the position of the node v in the inorder
  • y(v) is equal to the depth of the node v in the tree sequence"

Example

The Problem

I thought it would be interesting to use this problem as an excuse to take a closer look at the zipper implementation that comes with Clojure in the clojure.zip namespace.

One solution to problem 64 is an in-order traversal of the tree while transforming or editing the individual nodes to enrich them with positional information. As said in the problem statement, the x-coordinate is equal to the position of the current node in the in-order traversal. We therefore need to carry some context forward during our traversal. The calculation of the y-coordinate needs contextual information about the current depth as well.

Introduction to Zippers

Zippers solve the problem that such transformations or edits in trees are quite expensive in immutable persistent data structures. This can become an issue when working with deeper trees and many edits. Every edit would trigger the creation of a new path to the root node in order to be non-destructive. In an immutable persistent (tree) data structure the old versions of the tree 'persist' and can be accessed as before the edit.

Zippers as described by Huet make these edits local to the node we are focussing on at one given moment (the focal node). Zippers achieve this by keeping contextual information around that describes the path back up to the root node (and the local neighbours). The edits stay local to the focal node and are only reconstructed into a new tree structure bit by bit as you 'zip' up the tree when moving upwards.

While this concept is really simple at its core, it took me a while to wrap my head around the mechanics of working with Clojure's zippers when you want to use them for a non-trivial edit that needs additional context, as it is the case for problem 64.

There are a couple of good blog posts describing Clojure's zipper implementation but Alex Miller's DeveloperWorks post from 2011 made the penny drop for me. It is targeting Java developers but you can skip the Java bits (but not the initial problem statement) and jump right into the section about zippers at the end. I found it also helpful to read the source of clojure.zip.

A thorough introduction to zippers is out of the scope of this post, but I want to draw your attention to the traversal of a tree using zippers that was not immediately obvious to me.

Consider one of the binary trees we have build in one of the previous posts:

 
  v
 / \
l   r 

Using a zipper we can traverse this tree, moving—in this example—down and right, even though our tree graph does not have an edge between the two branch nodes:


> v                        v                         v
 / \  clojure.zip/down    / \  clojure.zip/right    / \
l   r                   >l   r                     l  >r
From a zippers point of view our tree has additional traversal 'edges' between the neighbour nodes.
 
     [l] - v - [r]  //either l and r exists in a binary tree
          / \
         l - r 

Tree Zipper

Clojure comes with built-in zippers for seqs, the results of xml/parse and vectors. I decided to built my own zipper, because, while our trees are based on vectors, the built-in vector zipper does not know anything about our domain specific interpretation of those vectors.

(defn tree-zip
  [t]
  (z/zipper 
    (complement nil?) 
    rest 
    (fn [n c] (with-meta (apply vector (first n) c) (meta n))) t))

You define a zipper in clojure.zip with three functions. First, a branch detector which in our case treats everything that is not nil as a potential branch. Second a function to extract the children of a node, in our case just the tail of a branch node. Finally a node creator function, that takes a node and children and creates a new node, in our case by replacing the children of the existing node n with the new children.

We preserve the metadata when creating a new node because these three zipper functions are passed through the zipper by storing them as metadata on the nodes.

In-Order Zipper

clojure.zip comes with the 'next' function that iterates the tree depth-first. We need in-order traversal. I implemented a variant of 'next' that does that. Due to the applicative nature of zippers you can use arbitrary look-ahead and walk around the tree to make the next navigation decision, backtracking once it becomes obvious that the current path is not the next valid in-order traversal step.

Putting it all together

The tree-edit function itself is a variation on one of Alex Miller's visitors from the DeveloperWorks post I mentioned earlier. We traverse the tree using an iterator function (here: in-order) and call an edit function on every node.

(defn tree-edit [zipper f next]
  (loop [loc zipper state {}]  
    (let [node (z/node loc)
          path (z/path loc)
          depth (inc (count path))
          [new-node new-state :as res] (f node (assoc state :depth depth))
          new-loc (if (= new-node node)
                    loc
                    (z/replace loc new-node))
          next-loc (next new-loc)]
      (if (end-in-order? next-loc)
        (z/root new-loc)
        (recur next-loc new-state)))))

The edit function takes the current tree node and a context object that we pass around as we traverse the tree. To solve problem 64 we need to know the depth of the current location in the tree. I solved that by injecting the current depth into the context map from the tree traversing 'tree-edit' function. The depth itself can be calculated from the zippers location information.

My full solution is, as always, on Github.

Read more about this series.