Saturday, May 28, 2016

99 Clojure Problems – 63: Construct a Complete Binary Tree

"A complete binary tree with height H is defined as follows: The levels 1,2,3,...,H-1 contain the maximum number of nodes (i.e 2(i-1) at the level i, note that we start counting the levels from 1 at the root). In level H, which may contain less than the maximum possible number of nodes, all the nodes are 'left-adjusted'. This means that in a levelorder tree traversal all internal nodes come first, the leaves come second, and empty successors (the Ends which are not really nodes!) come last.

Particularly, complete binary trees are used as data structures (or addressing schemes) for heaps.

We can assign an address number to each node in a complete binary tree by enumerating the nodes in levelorder, starting at the root with number 1. In doing so, we realize that for every node X with address A the following property holds: The address of X's left and right successors are 2*A and 2*A+1, respectively, supposed the successors do exist. This fact can be used to elegantly construct a complete binary tree structure. Write a method completeBinaryTree that takes as parameters the number of nodes and the value to put in each node."

Example

ninety-nine-clojure.bintrees> (complete-binary-tree 6 'x)                                 |
[x [x [x nil nil] [x nil nil]] [x [x nil nil] nil]]

A simple not-stack-safe approach uses recursion. One key insight here is to see that the address of the node is also at the same time the discriminator to terminate the recursion in the base case. If the address of the node we are going to construct would exceed the desired size of the tree we have to stop. This property holds also for the multiple recursion we are doing when constructing the branches of the binary tree.

Verify

You can verify the completeness of your tree by writing a predicate. My solution is a translation of the Haskell solutions of the 99 problems

This predicate verifies that the first empty slot in a node comes after the last populated node which is another way of expressing the completeness property.

(defn complete-tree? [t]
  (let [minmax (fn minmax [[v l r] idx]
                 (cond
                   (nil? v) [0 idx]
                   :else (let [[max-l min-l] (minmax l (* 2 idx))
                               [max-r min-r] (minmax r (inc (* 2 idx)))]
                           [(max idx max-l max-r) (min min-l min-r)])))]
    (apply < (minmax t 1))))

A Note on Design Decisions

While working on this predicate a design decision made early on in this section on trees bit me. By using Clojure's nil to express the emptiness of a node I overloaded (or 'complected' to parrot Rich Hickey's choice of words) the meaning of nil (nothing) to also express the absence of a child node in the tree and the empty tree. My traversal functions ignore nil, collapse it or elude it when concatenating branches of the tree for traversal. This is usually not a problem when you are only interested in the values of your tree. But here we want to know when in the levelorder traversal the first nil appears.

I thought a bit about this and came to the conclusion that the effects of that choice could have been mitigated by choosing a more expressive representation of the tree. My vector based approach is very minimal and works nicely in many cases because it relies on the seq abstraction. Vector destructuring works even with nil because nil seems to support nth. By creating a record or a map we can still destructure and be succinct in the handling of the tree but solve problems like determining the first occurrence of nil without resorting to a reimplementation of traversal.

Read more about this series.

No comments: