Saturday, October 05, 2013

Tries, JSNI and GWT

It's been a long time since the last post. I have been busy developing a web app based on GWT recently. While it's not exactly cutting edge technology, there are still some interesting aspects.

For example, there is the SuggestBox implementation that comes with GWT. You can use it to implement predictive text or auto-suggestions for a text input.

The suggest box uses a so called SuggestOracle to calculate the suggestions for a given string entered by the user. The default implementation that comes with GWT is the MultiWordSuggestOracle, which itself uses a combination of hash tables and a prefix tree to do its job.

A Trie is a Tree

A prefix tree or trie (as in retrieval) is a search tree where the nodes do not contain the keys themselves. Instead, the key for a particular node is determined by its position in the tree. More specifically, the key for a particular node is the concatenation of all the prefixes on the edges above it.

     X(he)        > prefix:""
      \
       X(ll)      > prefix: "he"
        \
         X(o)     > prefix: "hell"

If you look a my attempt at visualising a prefix tree for the word "Hello" above, you will notice that the root node is special in that its prefix is the empty string. The other thing to notice is that my example tree uses a prefix size of two characters. This has certain consequences for lookup time and data storage as we will see shortly. (Note: I put the values on the nodes to make it easier to read.)

What happens if you want to store more than one word in your prefix tree? Let's say we want to add the word "hell" to our trie.

     X(he)                 X(h)       
      \                     \
   (ll)X(ll)      OR         X(e)
        \                     \
      (o)X                     X(l)
                                \
                                 O(l)
                                  \
                                   O(o)

First of all the word "hell" was already in the tree as it is a substring of "hello". But the tree did not "know" about it. The tree on the left is the same tree I introduced before but on the first child node the additional "ll" string has been added. This is how the prefix tree that comes with GWT works. When adding new words, in this case "hell", we walk down the tree and we check on every node whether the remaining string length is equal or less than the prefix size (2). If so, we store the value directly on the node. If not, we look for a matching prefix tree below or create one. The prefix size increases by a power of 2 for each layer. (To keep it simpler I left that bit out in the illustration.)

If we compare this to a prefix tree with a prefix length of one (illustrated above on the right), there are two observations to make. First, you need a way of marking nodes of interest that are not leaf nodes. In our case we want to make clear that the string "hell" is a meaningful value and not just a substring of "hello". The GWT trie uses a separate storage location to hold "ll". This means that when looking up all values starting with "he", you have to look at those values stored on the nodes below "he" directly ("ll") and all values on child nodes ("llo"). In the case of the single character prefix you need some way of marking nodes that contain suffixes (i.e. that are the end of a meaningful word). I highlighted those nodes in the graphic by using an "O".

Oh, performance!

The second observation is that using a prefix size greater than one reduces the depth of the tree considerably. But, of course, there is a trade-off. If you choose your prefix too long, you lose the property that made the trie the right choice for the task: efficient lookup. The time complexity for a lookup in a prefix tree is determined by the length of the query string n and not the size of the tree, it is therefore O(n).

By increasing the prefix size, the deeper we descend into the tree, we give up that property. The longer your prefix, the more values will be stored on a particular node. The time complexity is now dominated by the size of the tree and the data structure used to store values on the nodes. Because you have to check every suffix on a particular node for a partial match, time complexity is now O(m), where m is the number of values stored on that node. So the worst case performance for this modified prefix tree with a start prefix length of p is O((logp(n)m) on a degenerated tree with all values on one node.

I can only guess that the reason for this design decision was an assumption about typical usage patterns of the auto-suggest feature. If the user accepts a suggestion after typing only a couple of characters, the increasing prefix size makes the tree actually more efficient. The lookup is reasonably efficient for short queries, where the prefixes are still short and you do not have to descend into a very deep tree to find all suggestions.

This implementation also avoids deep trees consisting of many objects, because in a typical prefix tree you would find one (sub-)tree object per character. The creators seem to have traded lookup efficiency for easy retrieval, smaller memory footprint and a lower cost of object creation in JavaScript.

There is much more to be said about prefix trees, but what I want to try here is to go from the high level concepts down to the implementation details in Java and eventually to the resulting JavaScript all in one post. So let's look at the implementation now.

JSNI is the inline assembly of GWT

Well, sort of. JSNI allows you to include bits of "native" JavaScript in your Java code. By writing a "native" method in Java and including the JavaScript in a special comment block, you instruct the GWT compiler to include that bit of code with only minimal checking directly into the resulting JavaScript object.

In this example you can also see how the interaction with those parts of the PrefixTree class works that have not been implemented in JSNI but in Java. It's not exactly elegant or easy too read. Suddenly, the assembly comparison seems not so far off. Once you put JavaScript into a forced marriage with Java, it loses all elegance and succinctness. What you can see, though, is how all the members of the PrefixTree class are translated to JavaScript by referring to them in "mangled" syntax that uses fully-qualified names for everything.

If you try to find the method I just used as an example for the JSNI syntax in the output of the GWT compiler, you won't find it. At least not where you expect it. The compiler has inlined the "clear" method into the constructor of the prefix tree. The generated code is very readable compared to its JSNI counterpart.

If you turn on the emulated stack trace feature of GWT, the compiler keeps track of the original source on every call by storing the current location in a global variable. All the readability is gone. But it shows again how the code has been inlined. You can now see the line numbers from the original Java source file:

All non trivial methods of the PrefixTree class are implemented using JSNI in GWT.

What does it buy you?

First, there is a cost attached: you lose almost all tool support in your IDE. There is no refactoring tool that will pick up usages inside a JSNI block. It is easy to get your JavaScript wrong, introduce a memory leak and so forth.

What you're promised in return is performance. To test that I reimplemented the PrefixTree in pure Java, compiled it with GWT and compared the two implementations in a recent version of Firefox. I saw that the optimised method to calculate the possible suffixes for a given prefix was about four to six times faster than my reimplementation.

My simple test setup was to implement both versions using a small subset of about 2000 geographic names in the USA, run the same queries on both of the inputs and profile them using the build-in Firefox profiler.

The version without JSNI shown above spends about 20% of the overall runtime looking up suggestions in the trie. It also noteworthy that about 80% of the time is spent outside the prefix tree code.

Looking at the optimised version, the speed-up is striking but, at the same time, the ratio of tree lookup / time spent on things not related to the lookup is even more off-balance. The reason is to be found in the way the MultiWordSuggestOracle is implemented. As it is a multi-word suggest-box, it needs to map partial matches across word boundaries back to the original input strings (e.g. a query for "Mou" would match "Aquarius Mountains" etc.) It therefore keeps a separate copy of all "real" suggestions in a hash map. It then splits up the originals at the word boundaries and inserts them as a normalised, "fragmented" version of each suggestion into the tree. Looking up the original suggestions and sorting the final results takes about 50% of the time; the actual tree lookup amounts to only 4.5%.

Conclusion

I find it very rewarding to look at a piece of technology and to try to dig up layer after layer to understand its inner workings and the design decisions behind it. What I found in this case supports my general opinion about GWT in that there is a whole lot of cleverness and well thought-out solutions to be found under the hood: the right data structure with some well advised optimisations yields a convincing solution. But it has showed me again the ambiguity that comes with most optimisations. While the 5x performance gain justifies the use of JSNI, the component itself is only one small part in the greater context of the auto-suggest feature. In this bigger context the gain is much smaller and limited to the stock implementation that comes with GWT. While the PrefixTree and its optimisations are well isolated from other code by making the class package private, it also means you cannot reuse any of it easily in your auto-suggest implementation.