Sunday, November 27, 2016

99 Clojure Problems – 64: Binary Tree Layout (1)

"Given a binary tree as defined before being either a vector of the form [v l r] or nil. As a preparation for drawing the tree, a layout algorithm is required to determine the position of each node in a rectangular grid. In this layout strategy, the position of a node v is obtained by the following two rules:

  • x(v) is equal to the position of the node v in the inorder
  • y(v) is equal to the depth of the node v in the tree sequence"

Example

The Problem

I thought it would be interesting to use this problem as an excuse to take a closer look at the zipper implementation that comes with Clojure in the clojure.zip namespace.

One solution to problem 64 is an in-order traversal of the tree while transforming or editing the individual nodes to enrich them with positional information. As said in the problem statement, the x-coordinate is equal to the position of the current node in the in-order traversal. We therefore need to carry some context forward during our traversal. The calculation of the y-coordinate needs contextual information about the current depth as well.

Introduction to Zippers

Zippers solve the problem that such transformations or edits in trees are quite expensive in immutable persistent data structures. This can become an issue when working with deeper trees and many edits. Every edit would trigger the creation of a new path to the root node in order to be non-destructive. In an immutable persistent (tree) data structure the old versions of the tree 'persist' and can be accessed as before the edit.

Zippers as described by Huet make these edits local to the node we are focussing on at one given moment (the focal node). Zippers achieve this by keeping contextual information around that describes the path back up to the root node (and the local neighbours). The edits stay local to the focal node and are only reconstructed into a new tree structure bit by bit as you 'zip' up the tree when moving upwards.

While this concept is really simple at its core, it took me a while to wrap my head around the mechanics of working with Clojure's zippers when you want to use them for a non-trivial edit that needs additional context, as it is the case for problem 64.

There are a couple of good blog posts describing Clojure's zipper implementation but Alex Miller's DeveloperWorks post from 2011 made the penny drop for me. It is targeting Java developers but you can skip the Java bits (but not the initial problem statement) and jump right into the section about zippers at the end. I found it also helpful to read the source of clojure.zip.

A thorough introduction to zippers is out of the scope of this post, but I want to draw your attention to the traversal of a tree using zippers that was not immediately obvious to me.

Consider one of the binary trees we have build in one of the previous posts:

 
  v
 / \
l   r 

Using a zipper we can traverse this tree, moving—in this example—down and right, even though our tree graph does not have an edge between the two branch nodes:


> v                        v                         v
 / \  clojure.zip/down    / \  clojure.zip/right    / \
l   r                   >l   r                     l  >r
From a zippers point of view our tree has additional traversal 'edges' between the neighbour nodes.
 
     [l] - v - [r]  //either l and r exists in a binary tree
          / \
         l - r 

Tree Zipper

Clojure comes with built-in zippers for seqs, the results of xml/parse and vectors. I decided to built my own zipper, because, while our trees are based on vectors, the built-in vector zipper does not know anything about our domain specific interpretation of those vectors.

(defn tree-zip
  [t]
  (z/zipper 
    (complement nil?) 
    rest 
    (fn [n c] (with-meta (apply vector (first n) c) (meta n))) t))

You define a zipper in clojure.zip with three functions. First, a branch detector which in our case treats everything that is not nil as a potential branch. Second a function to extract the children of a node, in our case just the tail of a branch node. Finally a node creator function, that takes a node and children and creates a new node, in our case by replacing the children of the existing node n with the new children.

We preserve the metadata when creating a new node because these three zipper functions are passed through the zipper by storing them as metadata on the nodes.

In-Order Zipper

clojure.zip comes with the 'next' function that iterates the tree depth-first. We need in-order traversal. I implemented a variant of 'next' that does that. Due to the applicative nature of zippers you can use arbitrary look-ahead and walk around the tree to make the next navigation decision, backtracking once it becomes obvious that the current path is not the next valid in-order traversal step.

Putting it all together

The tree-edit function itself is a variation on one of Alex Miller's visitors from the DeveloperWorks post I mentioned earlier. We traverse the tree using an iterator function (here: in-order) and call an edit function on every node.

(defn tree-edit [zipper f next]
  (loop [loc zipper state {}]  
    (let [node (z/node loc)
          path (z/path loc)
          depth (inc (count path))
          [new-node new-state :as res] (f node (assoc state :depth depth))
          new-loc (if (= new-node node)
                    loc
                    (z/replace loc new-node))
          next-loc (next new-loc)]
      (if (end-in-order? next-loc)
        (z/root new-loc)
        (recur next-loc new-state)))))

The edit function takes the current tree node and a context object that we pass around as we traverse the tree. To solve problem 64 we need to know the depth of the current location in the tree. I solved that by injecting the current depth into the context map from the tree traversing 'tree-edit' function. The depth itself can be calculated from the zippers location information.

My full solution is, as always, on Github.

Read more about this series.

Saturday, May 28, 2016

99 Clojure Problems – 63: Construct a Complete Binary Tree

"A complete binary tree with height H is defined as follows: The levels 1,2,3,...,H-1 contain the maximum number of nodes (i.e 2(i-1) at the level i, note that we start counting the levels from 1 at the root). In level H, which may contain less than the maximum possible number of nodes, all the nodes are 'left-adjusted'. This means that in a levelorder tree traversal all internal nodes come first, the leaves come second, and empty successors (the Ends which are not really nodes!) come last.

Particularly, complete binary trees are used as data structures (or addressing schemes) for heaps.

We can assign an address number to each node in a complete binary tree by enumerating the nodes in levelorder, starting at the root with number 1. In doing so, we realize that for every node X with address A the following property holds: The address of X's left and right successors are 2*A and 2*A+1, respectively, supposed the successors do exist. This fact can be used to elegantly construct a complete binary tree structure. Write a method completeBinaryTree that takes as parameters the number of nodes and the value to put in each node."

Example

ninety-nine-clojure.bintrees> (complete-binary-tree 6 'x)                                 |
[x [x [x nil nil] [x nil nil]] [x [x nil nil] nil]]

A simple not-stack-safe approach uses recursion. One key insight here is to see that the address of the node is also at the same time the discriminator to terminate the recursion in the base case. If the address of the node we are going to construct would exceed the desired size of the tree we have to stop. This property holds also for the multiple recursion we are doing when constructing the branches of the binary tree.

Verify

You can verify the completeness of your tree by writing a predicate. My solution is a translation of the Haskell solutions of the 99 problems

This predicate verifies that the first empty slot in a node comes after the last populated node which is another way of expressing the completeness property.

(defn complete-tree? [t]
  (let [minmax (fn minmax [[v l r] idx]
                 (cond
                   (nil? v) [0 idx]
                   :else (let [[max-l min-l] (minmax l (* 2 idx))
                               [max-r min-r] (minmax r (inc (* 2 idx)))]
                           [(max idx max-l max-r) (min min-l min-r)])))]
    (apply < (minmax t 1))))

A Note on Design Decisions

While working on this predicate a design decision made early on in this section on trees bit me. By using Clojure's nil to express the emptiness of a node I overloaded (or 'complected' to parrot Rich Hickey's choice of words) the meaning of nil (nothing) to also express the absence of a child node in the tree and the empty tree. My traversal functions ignore nil, collapse it or elude it when concatenating branches of the tree for traversal. This is usually not a problem when you are only interested in the values of your tree. But here we want to know when in the levelorder traversal the first nil appears.

I thought a bit about this and came to the conclusion that the effects of that choice could have been mitigated by choosing a more expressive representation of the tree. My vector based approach is very minimal and works nicely in many cases because it relies on the seq abstraction. Vector destructuring works even with nil because nil seems to support nth. By creating a record or a map we can still destructure and be succinct in the handling of the tree but solve problems like determining the first occurrence of nil without resorting to a reimplementation of traversal.

Read more about this series.

Sunday, March 20, 2016

99 Clojure Problems – 62B: Collect the Nodes at a Given Level in a List.

"A node of a binary tree is at level N if the path from the root to the node has length N-1. The root node is at level 1. Write a method at-level to collect all nodes at a given level in a list."
(deftest p62b-at-level
  (is (= '(b c) (at-level '[a [b nil nil] [c [d nil nil] [e nil nil]]] 2))))

The basic idea for the solution was to see the tree as a list of nodes and to realise that the nodes at a given level start at node 2^(level - 1) and that this is also the size of each level. Solution on Github.

Read more about this series.

Tuesday, March 08, 2016

99 Clojure Problems – 61A: Collect the Leaves of a Binary Tree in a List

"61A (*) Collect the leaves of a binary tree in a list. A leaf is a node with no successors. Write a method leaves to collect them in a list."

(deftest p61a-leaves
  (is (= '(b d e) (leaves '[a [b nil nil] [c [d nil nil] [e nil nil]]]))))

A variation on the previous exercise. Solution.

Read more about this series.

Wednesday, March 02, 2016

99 Clojure Problems – 62: Collect the Internal Nodes of Binary Tree in a List

"An internal node of a binary tree has either one or two non-empty successors. Write a method internals to collect them in a list."
(deftest p62-internals
  (is (= '(a c) (internals '[a [b nil nil] [c [d nil nil] [e nil nil]]]))))

I used the same idea as in the two previous exercises: walk the tree and filter, just filtering for branch nodes this time. Again using a predicate I wrote earlier. Solution on Github.

Read more about this series.

Sunday, February 21, 2016

99 Clojure Problems – 61: Count the Leaves of a Binary Tree.

"A leaf is a node with no successors. Write a method leaf-count to count them."

(deftest p61-count-leaves
  (is  (= (leaf-count '[x [x [x nil nil] nil] [x nil nil]]) 2)))

I simply walked the tree and counted all the leaves. Both functions, for detecting leaves and walking the tree, existed from previous exercises. See the solution on Github.

Read more about this series.

Wednesday, February 17, 2016

99 Clojure Problems – 60 (alternative solution): Construct Height-balanced Binary Trees with a Given Number of Nodes.

This post describes an alternative solution to problem 60 based on logic programming. Check out my previous post for a functional Clojure solution.

The original 99 problems were compiled to teach Prolog. Clojure's core.logic implements a minimal logic DSL called miniKanren using Clojure as the host language. In this post I reimplement the solution to problem 60 based on core.logic. If you want to learn more about miniKanren 'The Reasoned Schemer' and minikanren.org are probably good places to start.

My solution is not fully relational, I rely on the groundness of certain terms. See the comments below. Full source can be found on Github.

"Construct height-balanced binary trees with a given number of nodes. Consider a height-balanced binary tree of height H. What is the maximum number of nodes it can contain? Clearly, MaxN = 2H - 1.

However, what is the minimum number MinN? This question is more difficult. Try to find a recursive statement and turn it into a function min-hbal-nodes that takes a height and returns MinN."

(defne min-nodes [h n]
  ([0 0])
  ([1 1])
  ([h n]
   (fd/> h 1)
   (fresh [h1 h2 n1 n2]
     (is h1 h dec)
     (is h2 h1 dec)
     (min-nodes h1 n1)
     (min-nodes h2 n2)
     (fd/in n1 n2 n (fd/interval 0 Integer/MAX_VALUE))
     (fd/eq (= (+ 1 n1 n2) n)))))

(defn min-hbal-nodes  
  [h]
  (run 1 [q]
    (min-nodes h q)))

"On the other hand, we might ask: what is the maximum height H a height-balanced binary tree with N nodes can have?"


(defne max-height [n h h1 n1]
  ([n h h1 n1]
   (fd/> n1 n)
   (is h h1 dec)) ;;non-relational
  ([n h h1 n1]
   (fd/<= n1 n)
   (fresh [h2 n2]
     (is h2 h1 inc)
     (min-nodes h2 n2)
     (max-height n h h2 n2))))

"Now, we can attack the main problem: construct all the height-balanced binary trees with a given nuber of nodes."


(defne num-nodes [t n]
  ([nil 0])
  ([[_ l r] n]
   (fresh [nl nr]
     (num-nodes l nl)
     (num-nodes r nr)
     (fd/in nl nr n (fd/interval 0 Integer/MAX_VALUE))
     (fd/eq (= (+ 1 nl nr) n)))))

(defne min-height [n h]
  ([0 0])
  ([n h]
   (fresh [n1 h1]
     (fd/in n1 (fd/interval 0 Integer/MAX_VALUE))
     (fd/> n 0)
     (project [n n1]
              (== (quot n 2) n1)) ;; non-relational
     (min-height n1 h1)
     (fd/+ h1 1 h))))


(defn hbal-tree-nodes [n t]
  (fresh [hmin hmax h]
    (min-height n hmin)
    (max-height n hmax 1 1)
    (fd/>= h hmin)
    (fd/<= h hmax)
    (hbal-tree h t);; requires solution to P59
    (num-nodes t n)))

(defn all-hbal-trees [n]
  (run* [q]
    (hbal-tree-nodes n q)))

Usage:

ninety-nine-clojure.bintrees> (min-hbal-nodes 4)
7
ninety-nine-clojure.bintrees> (all-hbal-trees 4 )
([x [x [x nil nil] nil] [x nil nil]] [x [x nil nil] [x [x nil nil] nil]] [x [x nil [x nil nil]] [x nil nil]] [x [x nil nil] [x nil [x nil nil]]])

Read more about this series.

Saturday, January 16, 2016

99 Clojure Problems – 60: Construct Height-balanced Binary Trees with a Given Number of Nodes.

"Construct height-balanced binary trees with a given number of nodes. Consider a height-balanced binary tree of height H. What is the maximum number of nodes it can contain? Clearly, MaxN = 2H - 1.

However, what is the minimum number MinN? This question is more difficult. Try to find a recursive statement and turn it into a function min-hbal-nodes that takes a height and returns MinN.

On the other hand, we might ask: what is the maximum height H a height-balanced binary tree with N nodes can have?

Now, we can attack the main problem: construct all the height-balanced binary trees with a given nuber of nodes."

Example:

ninety-nine-clojure.bintrees> (min-hbal-nodes 4)
7
ninety-nine-clojure.bintrees> (all-hbal-trees 4 'x)
([x [x [x nil nil] nil] [x nil nil]] [x [x nil nil] [x [x nil nil] nil]] [x [x nil [x nil nil]] [x nil nil]] [x [x nil nil] [x nil [x nil nil]]])

Discussion

The solution to problem 60 builds upon the solution to problem 59, so if you have not done that one yet, now would be a good time.

As always I don't want to give away the solution in the blog post right away to give you a chance to try it yourself first. But if you get stuck, it is all on Github.

Approach

The question contains a hint towards a possible solution but it was not immediately clear to me how the minimum number of nodes at a given height MinN helps us in finding all height-balanced trees with a given number of nodes.

It helped me to explore the problem starting with the expected result and progressing to the intermediate computations mentioned in the problem description later: What do we want to achieve and which parts do we have already?

  1. We have a method "height-balanced-trees" to calculate all height-balanced trees with a certain height from P59.
  2. We could now just filter the result of calling this existing function, leaving those trees in the result that have the required number of nodes and have the desired solution to problem 60.
  3. But at which height do we start generating trees and where do we stop? We cannot generate trees for all possible heights.
  4. Working our way back towards the beginning of the question, we could use the maximum height H of a tree with a given number of nodes as the upper bound of the input to the "height-balanced-trees" function from problem 59.
  5. That leaves us with the missing lower bound: What is the minimum height of a tree with a given number of nodes? We need another function to answer that question. Let's start with that one.

Minimum Height of a Tree with a Given Number of Nodes

If we recursively half the given number of nodes and keep track of how many times we did that we have the minimum height, because it is a height-balanced binary tree. But we could also exploit the fact that the height of a balanced binary tree of n nodes is log2 n + 1.

Maximum Height of a Tree with a Given Number of Nodes

Now this part is a bit trickier. We now need to know what the minimum number of nodes at a given height is, as indicated in the question. Why? Let's look at the number of nodes for all the trees at a given height:

H=3: (4 5 6 7)
H=4:       (7 8 9 10 11 12 13 14 15)
H=5:                   (12 13 14 15 16 17 18 19 20 ...

To pick an interesting example: A tree with seven nodes is at most four levels high. We know that because the minimum number of nodes at height five is greater than seven. If we look at all MinN for increasing heights and stop when MinN gets bigger then our N, we know that the previous height is the maximum height we are looking for.

Now how do we calculate the MinN? Interestingly the MinN(H) is always the MinN(H-1) + MinN(H-2) +1. A formula that seems like a twist on the Fibonacci sequence. I don't have a good intuition as to why this is, but you can just derive this rule by looking at trees with small Hs.

The mentioned formula translates nicely into a recursive function.

Solution and Outlook

Putting it all together is then relatively straight forward. So much so that I will share the code right here:

(defn all-hbal-trees
  "P60 (**)  ... Now, we can attack the main problem: construct all the
  height-balanced binary trees with a given nuber of nodes."
  [n v]
  (->>  (range (min-height-hbal n) (inc (max-height-hbal n)))
        (mapcat #(height-balanced-trees % v))
        (filter #(= n (num-nodes %)))))

Reusing the code from problem 59 is certainly nice but this solution seemed rather wasteful to me as we are throwing away many intermediate results in the filter step. Feeling a bit underwhelmed by my solution I started exploring the original Prolog solution to the problem and reimplemented it using David Nolen's core.logic library. We will cover this additional experiment in the next post.

Read more about this series.