Wednesday, December 31, 2014

99 Clojure Problems-50: Huffman Code


(deftest huffmann
  (is (= '([a "0"] [c "100"] [b "101"] [f "1100"] [e "1101"] [d "111"])
         (-> '([a 45] [b 13] [c 12] [d 16] [e 9] [f 5])


Huffman coding is a variable length coding based on the frequency of each possible source value. You might have a look at the wikipedia article (or a good book on discrete mathematics and/or algorithms as suggested on the original "99 problems" page) to get a basic understanding of what we are trying to do here.

I used a variation of the two queues method: one queue for the initial set of leaf nodes, the other for the aggregate or branch nodes. The queues have to be ordered by ascending frequency. If you can use a priority queue that does the ordering for you that's fine. You then dequeue the two items with the lowest frequency from each queue (or from just one queue if the other is empty) and combine the two leaves into a branch node. You continue with this procedure until there is only one node left, which is your Huffman tree.

Have a look at the code, but, as usual, first try it yourself. I'm not entirely happy with my solution, which seems to be correct, but some of its semantics are rather subtle. Did you spot that the check for an empty queue is implicit in the default value of Long/MAX_VALUE when accessing the next value polled from the queue? As always, I am happy to get feedback on the solutions presented.

This is the last post for 2014. I set out at the beginning of the year to solve all 99 problems in Clojure that were mentioned in the original 99 Prolog problems compilation. I managed to do the first 50 and I really enjoyed them. Nevertheless, when learning a new language, solving these kinds of nicely prepared educational problems takes you only half the way. I think that the next step should be to build an actual system in Clojure. I still found doing these exercises rewarding and useful, as they allow me to do some Clojure in the limited spare time I have. I am going to continue with the series next year hoping to solve the remaining problems.

Tuesday, October 28, 2014

99 Clojure Problems – 49: Gray Code

"An n-bit Gray code is a sequence of n-bit strings constructed according to certain rules. For example:

n = 1: C(1) = ("0", "1").
n = 2: C(2) = ("00", "01", "11", "10").
n = 3: C(3) = ("000", "001", "011", "010", "110", "111", "101", "100").

Find out the construction rules and write a function to generate Gray codes. See if you can use memoization to make the function more efficient."


deftest gray-2
  (is (= ["00" "01" "11" "10"] (gray 2))))

(deftest gray-3
  (is (= ["000" "001" "011" "010" "110" "111" "101" "100"] (gray 3))))


I did not work out the construction rules myself but simply looked at Wikipedia. I recommend to read the relevant article yourself before continuing. Based on my reading I came up with two solutions.

The first solution follows the name giving generative principle of forming new lists of gray codes by reflecting existing sublists. Or more precisely from Wikipedia:

"The binary-reflected Gray code list for n bits can be generated recursively from the list for n−1 bits by reflecting the list (i.e. listing the entries in reverse order), concatenating the original list with the reversed list, prefixing the entries in the original list with a binary 0, and then prefixing the entries in the reflected list with a binary 1"

This lead to an immediate translation into code based on strings as can be seen on Github.

Adding memoization as requested by the exercise is very easy in Clojure as there is a dedicated memoize function that returns a memoized version for a referentially transparent function.

Leaving aside the inefficiencies of this solution there is a more immediate problem with this implementation because it is not "stack safe". That means for every recursive step a new stack frame will be created and the code will only unwind the stack when it has reached the exit condition leading to a stack overflow for deep recursive calls. Using the gray code example, depending on your VM configuration and whether or not you are using memoization, you start running into trouble around 20 bits or so.

The second solution is based on the gray code conversion algorithm whereby you convert binary to gray code by xor'ing the binary number with itself right shifted by one. The only complication is that you have to calculate the range of possible numbers first. This is 2^n where n is the number of bits you want to support in your gray code encoding.

You calculate an exponent in Clojure by literally multiplying the number as many times as the exponent states:

(reduce * (repeat n 2))
or use a library like numeric-tower that features an exponent function. To create the identical output as the string based version I used the common lisp formatting build into clojure.pprint:
(cl-format nil (str "~" n "'0b") %)
This allows me to print the binary/gray code elegantly left padded with zeros. Alternatively you could use Java interop with String.format and Long.toBinaryString.

There is probably more to be said about gray code and there are more efficient solutions. The two main takeaways here on the language level are how easy it is in Clojure to create a memoized function and the possibility to use Common Lisp formatting instead of Java's if it seems more appropriate.

Read more about this series.

Wednesday, August 20, 2014

99 Clojure Problems – 48: Truth Tables for Logical Expressions (3)

Generalize problem 47 in such a way that the logical expression may contain any number of logical variables. Define table in a way that (table vector expr) prints the truth table for the expression expr, which contains the logical variables enumerated in vector.


ninety-nine-clojure.logic> (table (i [a b c], (a and (b or c) equ a and b or a and c)))

|    :a |    :b |    :c | :result |
|  true |  true |  true |    true |
|  true |  true | false |    true |
|  true | false |  true |    true |
|  true | false | false |    true |
| false |  true |  true |    true |
| false |  true | false |    true |
| false | false |  true |    true |
| false | false | false |    true |


This is more of a refactoring of the previous solution than anything else. I learned a bit more about how macros work in Clojure and discovered that some of the problems I solved last time were actually non-problems.

There are three places in our current code that prevent us from specifying expressions with arbitrary many logical variables: The input values were hard-coded to the four possible combinations of two boolean arguments. The header used when printing the truth table was hard-coded. Finally the macro that supported the defintion of a logical expresion in infix notation was limited to two arguments as well.

Possible input values

We want to create all possible ordered sets that can be created out of the n sets of input parameters for each logical variable. For two boolean parameters (that's what we had so far) it looks like this:

        |  true         false         
  true  | [true true]   [true false]  
  false | [false true]  [false false] 

I initially thought of this problem as a for comprehension in the form of

(for [x [true false] y [true false]] [x y])

To make that work for n arguments I would have to implement another macro to create a for comprehension with n generators. When I realised that the operation here is just an n-fold Cartesian product, I knew that this was not necessary. I decided to use clojure.math.combinatorics which was already a dependency of the project to calculate the Cartesian product of n sets of Boolean parameters instead.

Generating functions with a given arity

I had introduced the 'i' macro to have a simple and elegant way to express that its argument was to be interpreted as a logical expression in infix notation.

(i (a or b))

To support n logical variables we now need to move away from this succinct syntax a bit and specify the logical variables in use in a separate argument.

(i [a b c] (a and (b or c)))

Implementing this was way easier than I thought. It looks similar to this:

(defmacro [bindings & expr]
   `(fn ~bindings (infix ~expr)))
I also noticed that the deep code walking and code replacement I introduced last time, was completely unnecessary. I thought you need generated symbols, but that is only true if you want to use let inside a syntax quote, which I did not. I had probably stumbled over Clojure's quoting rules, which are one of the more complicated bits of the language.

This new argument was interesting in another respect: I wanted to use it to create a header for the truth table. But wanted to avoid this duplication:

(table [a b] (i [a b] (a or b)))

Extracting metadata from a function

A possible solution: Clojure functions have metadata about their arity which is exactly what we want and need to generate the header of the truth table. But the metadata is only present on functions created via defn. The one used here was anonymous. In addition you should be able to do
(table and)

That is not even a function. As we already know 'and' is a macro in Clojure but it has the necessary metadata. Even more than we need:

(:arglists (meta #'and))
([] [x] [x & next])

The hash-quote is a reader macro that translates to (var and) which resolves to the var object (as opposed to it's value) where the metadata is stored. The problem here: 'and' has more than one argument list and the relevant one in this context contains a var arg.

I ended up offering both APIs:

(table '[a b] and)
(table (i [a b] (a and b)))

The second variant uses the metadata to infer the header of the truth table while it's explicit in the first variant. I'm not entirely happy with that result as it seems a bit brittle: in the second variant it just uses the first argument list and falls flat on its face when the argument does not contain any metadata, which could well happen if you just specify an anonymous function yourself instead of using the 'i' macro.

Have a look at the complete code here on Github.

Read more about this series.

Tuesday, August 12, 2014

99 Clojure Problems – 47: Truth Tables for Logical Expressions (2)

"Continue problem P46 by defining and/2, or/2, etc as being operators. This allows to write the logical expression in the more natural way, as in the example: A and (A or not B). Define operator precedence as usual; i.e. as in Java."

If we want to have this 'natural way' of writing logical expressions we will have to introduce infix notation into Clojure's prefix world.


ninety-nine-clojure.logic> (table (i (a and (a or not b))))

|    :a |    :b | :result |
|  true |  true |    true |
|  true | false |    true |
| false |  true |   false |
| false | false |   false |


How do we get the desired syntax?

If you have read the previous article you know that we are deep in macro-land and there is no way back. This can be seen as a macro anti-pattern (See "Macros all the way down" in Clojure for the Brave and True).

We are already in a situation where our truth table function can only be a macro because we want to pass in unquoted macros as parameters. This is also true for the infix problem: What we need in a first step is a macro that allows us to write something like this:

(table #(infix (%1 and (%1 or not %2))))

Remember the truth table macro expects expressions that take two arguments. A first step towards that goal is to create an anonymous function with two arguments (%1, %2) and process our logical expression with a yet to write infix function or macro to turn it back into Clojure code. The infix notation is not easy to read due to the numbered arguments but we will work on that later.

Infix in Clojure

It is a well established exercise to implement infix operators in a Lisp. There is the exercise 2.58 in SICP for example that teaches us about the benefits of prefix notation over infix notation. The chapter 1.22 in the "Joy of Clojure" shows us in a similar fashion how we can avoid all the troubles in Clojure that usually come with the rules of operator precedence and still have some infix notation if we want it.

Now to get a basic infix notation we just need a rather simple macro that puts the operator into function position and adds parens around it.

([a op b]
     `(~op (infix ~a) (infix ~b)))
It recursively calls itself on the symbols 'a' and 'b', which can be composite expressions in parens like:
(a and b) 
or a simple boolean value. I came up with this to deal with both cases:
 ([x] (if (seq? x)
         `(infix ~@x)
It says: If the argument is a sub-expression in list form and therefore sequential apply the infix macro again otherwise return the value.

We have got all the binary operators covered assuming that they all have the same precedence. (Which they don't e.g. in Java which was mentioned in the original Prolog problem description). A bit trickier are now the unary operators like 'not'. If we use parens as in '(not a) and b)' it will work out of the box, as this negation of 'a' is already valid Clojure. I am supporting expressions like 'a and not b' with the following extension of my infix macro:

([a b c d]
     (if (unary? a)
       `(~c (~a (infix ~b)) (infix ~d) )
       `(~b (infix ~a) (~c (infix ~d)))))

The basic idea here is to take advantage of Clojure's support for multi-arity functions/macros and deal with a unary operator before the second value/sub-expression by interpreting any call with four arguments in that way. Now this falls obviously down as soon as we have something like 'not a and not b'. We will have to cover that case seperately.

Optional noise reduction

I already mentioned that the usage of the infix macros in combination with the truth tables leaves room for improvement:

(table #(infix (%1 and (%1 or not %2))))

The numbered arguments of the anonymous function are a bit ugly. We would rather have something like this:

(table (i (a and (a or not b))))

We imagine that 'i' is the macro that triggers the infix magic and the rest is an easy to read logical expression. We imagine further that 'i' is a macro that transform our expression into a function of two arguments 'a' and 'b'. The result of the transformation would be the following piece of Clojure code:

(fn [a b] (and a (or a (not b))))

[Update: As it turns out the symbol replacement described below is unnecessary. Stay tuned for problem 48 for more detail on this.]

The problem with that idea: Clojure won't allow you to introduce new bindings in a syntax quote (`) that might capture existing variables. The gensym mechanism is there for you to generate unique symbols to be used in a macro that won't accidentally hijack existing variables.

A post on Fogus' blog about deep code walking macros lead me to this implementation:

(defmacro i
  [& args]  
  (let [a (gensym)
        b (gensym)
        code (clojure.walk/postwalk-replace {'a a 'b b} args)]
    `(fn [~a ~b] (infix ~@code))))

It has its limitations: you can only use 'a' and 'b' as variables, as they are hard-coded. I then uses a deep code walking function to replace the 'a's and 'b's with generated symbols, which will also be used to generate a wrapper function around the logical expressions.

That was not very hard, but there would be a bit more work involved to get rid of the limitations. We will look at that when we tackle the next problem which will require us to support logical expressions of arbitrary arity. The source for my complete solution is as always on Github.

Read more about this series.

Tuesday, August 05, 2014

99 Clojure Problems – 46: Truth Tables For Logical Expressions

Define functions and, or, nand, nor, xor, impl, and equ (for logical equivalence) which return true or false according to the result of their respective operations; e.g. (and A B) is true if and only if both A and B are true.

A logical expression in two variables can then be written in prefix notation, as in the following example: (and (or A B) (nand A B)).

Now, write a predicate table which prints the truth table of a given logical expression in two variables.


ninety-nine-clojure.logic> (table impl)

|    :a |    :b | :result |
|  true |  true |    true |
|  true | false |   false |
| false |  true |    true |
| false | false |    true |


This problem has three parts: Implementing the logical expressions as functions. Implementing the truth table function and finally—and this is Clojure specific—thinking about how to deal with the built-in logical expressions 'and' and 'or'.

Logical expressions in Clojure

Once you have implemented 'and', 'or' and 'not' you can implement all other logical expressions in terms of these three "primitives". In Clojure 'and' and 'or' are macros. I don't want to give an introduction to macros here (I recommend having a look at the most excellent Joy of Clojure or Clojure for the Brave and True) But I will say that much: As Clojure's code is just data, macros allow you to transform that data before compilation into other code/data.

Bearing that in mind and trying to understand clojure.core/and we hope to get an idea how to implement the most basic logical expressions for this problem ourselves. This is how the 'and' macro is implemented in Clojure:

(defmacro and
  ([] true)
  ([x] x)
  ([x & next]
   `(let [and# ~x]
      (if and# (and ~@next) and#))))

Even without understanding every last detail of this macro definition, we can see that Clojure's 'and' transforms the and-expression into a recursive series of if-expressions.

Filling in concrete values and using macroexpand makes things a bit clearer:

ninety-nine-clojure.logic> (macroexpand '(and true false))
(let* [and__3941__auto__ true] 
  (if and__3941__auto__ 
     (clojure.core/and false) 
Expanded to the first level, we see how the and-expression is now an 'if' inside a 'let'. A recursive call to 'and' for the second argument is still unexpanded. This should give us enough inspiration to implement simple 'and' and 'or' functions of our own.

But we can also reuse the existing control structures Clojure offers right away to build the higher level expressions for 'nand', 'nor', 'equ', 'impl' and 'xor'.

Creating a truth table

The next step is to build the table function. I cheated a bit and used Clojure's print-table, hard-coding the possible inputs for a function with two boolean arguments. Print-table takes a sequence of maps for each row. To get the desired effect I interleaved a descriptive header with the result of evaluating the logical expressions and turned that into a sorted map. But it should not be too hard to built a simple printing function yourself.

Make it work with the built-in expressions

Now this last part is only relevant if you were lazy, as I was, and reused the existing Clojure expressions for 'and' and 'or'. Try to create a truth table for them and you will get a response similar to the following:

(table and)
CompilerException java.lang.RuntimeException: Can't take value of a macro: #'clojure.core/and, compiling:(NO_SOURCE_PATH:1:1) 
Now we already know that 'and' is implemented as a macro. Unless we are quoting we can't use it as a value:
(table 'and)

If we want to provide an API where the user does not need to know whether the argument to our table function is a macro and therefore has to be quoted, we have to write a macro ourselves.

Now this is a bit tricky and took a while. The hardest bit was was a helper macro I created to evaluate the logical expression (possibly a macro) for all input values, even though this is shamelessly taken from an answer on stackoverflow. The helper wraps the macro in a function where it is eval'd after making it referable by quoting it like so (with exemplary values added for clarity):

(list 'and true false)
I called it evaluator:
(defmacro evaluator [expr]
  `(fn [& args#]
     (eval `(~'~expr ~@args#))))
It uses two levels of syntax-quoting (`). The first to allow the introduction of the args-symbol and the second to allow the unquote-splicing (~@args#) of the args vector in the quoted function call (~'~expr) which we are building inside our evaluator function.

This kind of complexity is in my experience the exception and I am not even sure if there is not an easier solution to my problem than the one I found. But given the fact that the argument to our function is already a macro, I don't see a solution avoiding macros. Please let me know if you think I'm wrong. In the meantime have a look at the source code.

Read more about this series.

Sunday, July 13, 2014

99 Clojure Problems: 41: A List Of Goldbach Compositions

"Given a range of integers by its lower and upper limit, print a list of all even numbers and their Goldbach composition."


(print-goldbach (range 9 21))
10 = 3 + 7
12 = 5 + 7
14 = 3 + 11
16 = 3 + 13
18 = 5 + 13
20 = 3 + 17

"In most cases, if an even number is written as the sum of two prime numbers, one of them is very small. Very rarely, the primes are both bigger than, say, 50. Try to find out how many such cases there are in the range 2..3000.

Example (minimum value of 50 for the primes)":

(print-goldbach (range 2000) 50)
992 = 73 + 919
1382 = 61 + 1321
1856 = 67 + 1789
1928 = 61 + 1867


A simple problem as we have already implemented all the bits that do most of the work. To create a list of Goldbach compositions we take the input range (note the range semantics in Clojure: the upper limit is exclusive) filter out the even numbers that are greater than 2 (as required by Goldbach's conjecture). We then map the Goldbach function over the range and apply the mimimum filter for the first prime in the results (see the second example above).

We format the result and println it, generating a new line for every Goldbach composition. Have a look at the source once you tried it yourself.

The interesting aspect of this problem is the use of doall, which we have to call at the very end. Why? The intermediate representation of the data is a lazy sequence. By mapping println over that sequence we declare that we want to trigger a side effect (the printing). But the effect does not occur, due to laziness, until we force it with doall.

Read more about this series.

Monday, July 07, 2014

99 Clojure Problems – 40: Goldbach's Conjecture

Goldbach's conjecture says that every positive even number greater than 2 is the sum of two prime numbers. Example: 28 = 5 + 23. It is one of the most famous facts in number theory that has not been proved to be correct in the general case. It has been numerically confirmed up to very large numbers. Write a predicate to find the two prime numbers that sum up to a given even integer.


(deftest p40-goldbach
  (is (= [5 23] (goldbach 28))))

(def even-nats-greater-two (gen/such-that #(< 2 %) (gen/fmap #(* 2 %) gen/nat)))

(defspec goldbach-conjecture
  (prop/for-all [i even-nats-greater-two]
                (let [res (goldbach i)]
                   (= 2 (count res))
                   (every? prime? res)
                   (= i (reduce + res))))))


A naive approach just looks at all prime numbers below the input value. We are then only interested in those primes that if subtracted from that input leave us with another prime number. Both numbers obtained by that method form the result. We already got a list of prime numbers, Clojure's take-while and filter do the rest. I expressed the fact that Goldbach's conjecture is only defined for even natural numbers greater than two with a precondition.

If you have been following this series you might have noticed that the example I give in the blog posts is almost always a unit test. I think this a quite succinct way to illustrate the intended behaviour of a function. But this time you will have noticed that there is something else.

I have added a dependency to clojure.test.check. Test.check is a QuickCheck inspired property-based testing library for Clojure. You can learn more about QuickCheck by having a look at the original paper.

Looking at the code above you will notice that a generative test consists of two parts. First a property (within the defspec). In this case I specified that for all even natural numbers greater than two the Goldbach function should return two prime numbers whose sum is equal to the input value. This property is an almost literal translation of Goldbach's conjecture.

Secondly you need to specify how to generate all possible input values that the test runner should use. Again: Instead of doing this explicitly for every single input value, as you would have to in a traditional exemplary unit test, you just define a generator. I used the built-in generator for natural numbers called nat. But in order to create only valid input values for the Goldbach function the generated values need to be greater than two and even. You can achieve this by using combinators on the original generator. A fmap * 2 shifts the domain of the generator into the even values. I then used the such-that combinator to filter out everything below two.

While writing these generative tests seems like more work at first sight you get back a more thorough verification of your program in return.

Property-based testing recognises the impossibility of coming up with test inputs that expose all edge cases. Instead it encourages you to define properties of your function that you expect to hold. The flip side of the coin is that property based tests won't contribute as much to the design of your APIs and your system as the traditional TDD approach did. But it is a very valuable attempt at automated verification.

Read more about this series.

Sunday, July 06, 2014

99 Clojure Problems: 39 – A List Of Prime Numbers

Given a range of integers by its lower and upper limit, construct a list of all prime numbers in that range.


(deftest p39-primes-range
  (is (= '(7 11 13 17 19 23 29 31) (primes-range 7 31))))


This is a quick one. As we have already defined a lazy seq of prime numbers to solve P35 we can reuse that. To produce the desired range drop from that sequence until we reach the lower bound and then take until we reach the upper bound. Have a look at the solution if this was too terse.

Read more about this series.

Thursday, June 19, 2014

99 Clojure Problems – 37/38: Calculate Euler's Totient Function phi(m) (Improved)

See problem P34 for the definition of Euler's totient function. If the list of the prime factors of a number m is known in the form of problem P36 then the function phi(m>) can be efficiently calculated as follows: Let [[p1, m1], [p2, m2], [p3, m3], ...] be the list of prime factors (and their multiplicities) of a given number m. Then phi(m) can be calculated with the following formula: $$\varphi(m) = (p_1-1) p_1^{(m_1-1)} (p_2-1) p_2^{(m_2-1)} (p_3-1) p_3^{(m_3-1)} \ldots $$

I combined this with the next problem: P38 (*) Compare the two methods of calculating Euler's totient function. Use the solutions of problems P34 and P37 to compare the algorithms. Try to calculate phi(10090) as an example.


(deftest p37-totient-improved
  (is (= 144 (fast-totient 315))))
The output of comparing the two functions using the criterium benchmarking library looks like this on my machine (running it in a REPL):

Benchmarking the original approach

Evaluation count : 13200 in 60 samples of 220 calls.
             Execution time mean : 4.661557 ms
    Execution time std-deviation : 111.338965 µs
   Execution time lower quantile : 4.577765 ms ( 2.5%)
   Execution time upper quantile : 4.905024 ms (97.5%)
                   Overhead used : 130.196103 ns

Found 5 outliers in 60 samples (8.3333 %)
 low-severe  2 (3.3333 %)
 low-mild  3 (5.0000 %)
 Variance from outliers : 11.0453 % Variance is moderately inflated by outliers

Benchmarking the product based algorithm

Evaluation count : 1510020 in 60 samples of 25167 calls.
             Execution time mean : 39.865790 µs
    Execution time std-deviation : 463.957713 ns
   Execution time lower quantile : 39.517915 µs ( 2.5%)
   Execution time upper quantile : 40.800218 µs (97.5%)
                   Overhead used : 130.196103 ns

Found 4 outliers in 60 samples (6.6667 %)
 low-severe  2 (3.3333 %)
 low-mild  2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers


The given formula follows from Euler's product formula. Translating the formula into Clojure is pretty straightforward. We can reuse the result from P36 and reduce the resulting map of prime numbers and their multiplicity by plugging in the product formula directly as the reducer function.

If you look at the code you will find that there is some additional "key/val-noise" due to the fact that we are getting a map back from the function calculating the multiplicities.

Running the benchmarks shows that this solution is about two orders of magnitude faster than the original solution based on the divisor sum and using Clojures ratio datatype.

There is a precondition though: The prime numbers have to be precalculated. If you have to calculate the prime numbers on the fly the product formula approach is only one order of magnitude faster:

Evaluation count : 259980 in 60 samples of 4333 calls.
             Execution time mean : 233.261376 µs
    Execution time std-deviation : 13.268456 µs
   Execution time lower quantile : 218.678716 µs ( 2.5%)
   Execution time upper quantile : 262.518850 µs (97.5%)
                   Overhead used : 131.115424 ns

Found 2 outliers in 60 samples (3.3333 %)
 low-severe  2 (3.3333 %)
 Variance from outliers : 41.8220 % Variance is moderately inflated by outliers

Read more about this series.

Friday, June 13, 2014

99 Clojure Problems – 36: Determine the Prime Factors of a Given Positive Integer (2)

Construct a list containing the prime factors and their multiplicity. Alternately, use a map for the result.


(deftest p36-prime-factors-multiplicity
  (is (= {3 2, 5 1, 7 1} (prime-factors-multiplicity 315) )))


If you have been following this series this problem might sound familiar. In fact it is just a slight variation on the combination of Problem 35 (prime factors) and Problem 10 (run-length encoding). I followed Phil Gold in using a map as the output format to make the problem a bit more interesting.

While you could reimplement the solution from scratch, I think it is good practice to reuse existing code. In this case all we have to do is call prime-factors on the input, run-length encode, reverse the individual pairs of primes and their multiplicity as required by the problem description. Turning it into a map is then just a matter of flattening the result and applying the assoc function to an empty map and our intermediate result. Try it yourself before you look at the code.

Read more about this series.

Thursday, May 29, 2014

99 Clojure Problems – 35: Determine the Prime Factors of a Given Positive Integer


(deftest p35-prime-factors
  (is (= '(3 3 5 7) (prime-factors 315))))


If you have ever done the "Prime Factors Kata" the solution to this problem is probably already in your muscle memory. One reason why this problem works well as an exercise is that there exists a very simple solution to it. In fact the solution can be expressed in just a few lines of code (in almost any language).

This simplest approach is to do trial division with increasing positive integers. You don't even have to bother checking for primality as things just fall into place: dividing by two for example effectively "eliminates" all multiples of two, in the sense that the next successful division can only happen with a number that is not a multiple of two. You do quite a lot of unnecessary divisions in the process though.

In my Clojure solution I chose to calculate a lazy (memoized) sequence of primes first and then do trial division using primes only. This reduces the number of divisions you have to do, assuming the primes are already calculated.

As long as your number is divisible by one of the small primes and does not have too many digits, trial division might be sufficient. For anything else you have to start looking at things like the general number field sieve.

Read more about this series.

Thursday, May 01, 2014

99 Clojure Problems – 34: Calculate Euler's Totient Function phi(m)

Euler's so-called totient function phi(m) is defined as the number of positive integers r (1 <= r <= m) that are coprime to m.


(deftest p34-totient
  (is (= 8 (totient-euler 20)))
  (is (= 4 (totient-euler 5))))


There are multiple ways of calculating the totient function. One pretty straightforward way is to check for every positive integer less than m whether it is coprime to m and count their number.

But when reading up on Wikipedia, Euler's classical formula struck me as a use case for Clojure's ratio data type.

It is defined as the sum over all positive devisors d of n: $$\sum_{d\mid n}\varphi(d)=n $$ One way of deriving that formula is to look at all the fractions between 0 and 1 with denominator n. For n = 10: $$\tfrac{1}{10} \tfrac{2}{10} \tfrac{3}{10} \tfrac{4}{10} \tfrac{5}{10} \tfrac{6}{10} \tfrac{7}{10} \tfrac{8}{10} \tfrac{9}{10} \tfrac{10}{10} $$

That is just mapping / over a range of numbers in Clojure.

Put them to lowest terms: $$\tfrac{1}{10} \tfrac{1}{5} \tfrac{3}{10} \tfrac{2}{5} \tfrac{1}{2} \tfrac{3}{5} \tfrac{7}{10} \tfrac{4}{5} \tfrac{9}{10} \tfrac{1}{1} $$ Clojure reduces it's ratios automatically to lowest terms.

We then look at the ratios that still have n as their denominator (here 10): $$\tfrac{1}{10} \tfrac{3}{10} \tfrac{7}{10} \tfrac{9}{10} $$ They turn out to be the ones whose nominators are coprimes of n and their number is the result of the totient function. In Clojure that is just a filter and a count over the result of the application of map.

This solution is probably not the fastest way of calculating the totient but it is faster than calculating coprimes based on the previous solution.

If I interpret the Criterium benchmark results correctly it is about 1.7x faster

ninety-nine-clojure.arithmetic> (bench (totient-euler 2000))
Evaluation count : 70320 in 60 samples of 1172 calls.
             Execution time mean : 867.964067 µs
    Execution time std-deviation : 14.561162 µs
   Execution time lower quantile : 859.192054 µs ( 2.5%)
   Execution time upper quantile : 901.487425 µs (97.5%)
                   Overhead used : 133.031916 ns

Found 7 outliers in 60 samples (11.6667 %)
 low-severe  7 (11.6667 %)
 Variance from outliers : 6.2554 % Variance is slightly inflated by outliers
ninety-nine-clojure.arithmetic> (bench (totient 2000))
Evaluation count : 41400 in 60 samples of 690 calls.
             Execution time mean : 1.496732 ms
    Execution time std-deviation : 63.212003 µs
   Execution time lower quantile : 1.450532 ms ( 2.5%)
   Execution time upper quantile : 1.636526 ms (97.5%)
                   Overhead used : 133.031916 ns

Found 5 outliers in 60 samples (8.3333 %)
 low-severe  3 (5.0000 %)
 low-mild  2 (3.3333 %)
 Variance from outliers : 28.6860 % Variance is moderately inflated by outliers

Read more about this series.

Thursday, April 24, 2014

99 Clojure Problems – 33: Determine Whether Two Positive Integers Are Coprime


(deftest p33-coprime-integers
  (is (coprime? 14 15))
  (is (not (coprime? 14 21))))


Two positive integers are coprime if the only positive integer that divides them both is 1. That is equivalent to their greatest common divisor being 1.

The solution is trivial building upon the solution to the last problem.

Read more about this series.

Sunday, April 20, 2014

99 Clojure Problems – 32: Determine the Greatest Common Divisor of Two Positive Integer Numbers

Use Euclid's algorithm.


(deftest p32-euclids-algorithm
  (is (= 9 (gcd 36 63)))
  (is (= 21 (gcd 462 1071)))
  (is (= 1  (gcd 1 1))))


The greatest common divisor of two positive integers is the largest integer that divides them both without a remainder. A common form of Euclid's algorithm uses division and comparison. We start with a pair of numbers (here represented by the two arguments of the function). We then form a new pair consisting of the smaller of the two numbers and the remainder of dividing the larger number by the smaller one. We continue this process until the remainder is zero. The first number of the pair is then the greatest common divisor.

A simple recursive solution pretty much follows this description. We compare the second argument with zero to determine the exit condition. If we find it to be zero we return the first argument. Otherwise we recursively call the function itself with the second argument in the first argument's place and the remainder of the division in the second place.

The use of recursion can cause problems as the available stack size is not knowable and depends on runtime configuration. While Lamé's Theorem states that the number of steps in Euclid's algorithm never exceed five times the number of digits in the smaller number, this doesn't guarantee that we won't run out of stack frames given Clojure's support for integer sizes exceeding 64 bits. Using recur here makes the function "stack-safe."

If you look at the description of the algorithm above you will notice the implicit assumption that the first number is always the larger of the two. But we don't enforce the order of arguments (it could be done easily using Clojure's support for preconditions) because the first iteration of the algorithm will simply reorder them. This becomes clear when we make the first couple of substitutions explicit:

(gcd 36 63)
(gcd 63 (rem 36 63))
(gcd 63 36)

Read more about this series.

Thursday, April 10, 2014

99 Clojure Problems – 31: Determine Whether a Given Integer Number Is Prime


(deftest p31-test-primes
  (is (not (prime? 4)))
  (is (prime? 7)))


My first solution is based on the straight forward approach of testing for divisibility by all integers starting with 2 until sqrt(n). If a number is prime it will be only divisible by itself (and 1).

My second solution is a translation of a primality test based on Fermat's Little Theorem as outlined in SICP 1.2.6. It says that a number n is certainly not prime if raised to a random positive integer a less than n and divided by n is not equal to a.

This second approach is faster for large numbers but it is a probabilistic algorithm. This means in our case if the remainder of a^n divided by n is a, the number n is only probably a prime. While the probability gets higher every time we successfully run the test with another number, there is no "guarantee".

In addition the Fermat test can be fooled by a couple of extremely rare numbers (called Carmichael numbers). Read the relevant chapter in SICP for more on this topic.

Read more about this series.

Wednesday, April 02, 2014

Scala Magic

Today we were discussing the benefits of handling errors or missing values without resorting to exceptions via Scala's Option data type.

Imagine, you have a list of options on Strings and you want to work with the values (the Somes) and ignore the absent values (the Nones). One way of doing that would be something like this:

val names = List(Some("Peter"), None, Some("Mary"))
names.flatMap(n =>"Hello " + _))
You might remember faintly what the type signature of flatMap was and think: hang on a minute, what's going on here?
def flatMap[A,B](f: A => List[B]): List[B]
This is obviously simplified compared to the actual Scala standard library version of flatMap (look for TraversableLike#flatMap). But still flatMap takes a function from A to List[B], while Option's map looks like this:
@inline final def map[B](f: A => B): Option[B] =
    if (isEmpty) None else Some(f(this.get))

So how does the result of map which is of type Option[String] turn magically into List[Option[B]]? Implicit conversion is the answer. If scalac finds a type mismatch between what we are passing in and what the function expects it looks for an implicit view definition to satisfy the expression.

We can use scalac's print option to see what's going on under the hood:

p_brc@brc-mbp ~/s/s/S/s/t/s/intro> scalac -print Tester.scala 

//lots of code removed 
// here is the call to flatMap

        (new anonymous class anonfun$1(): Function1)
      }, immutable.this.List.canBuildFrom());

// more synthetic classes  removed
// here is the synthetic class representing the outer lambda function:

  @SerialVersionUID(0) final  class Tester$$anonfun$1 extends runtime.AbstractFunction1 with Serializable {
    final def apply(n: Option): Iterable = scala.this.Option.option2Iterable({
      (new anonymous class anonfun$apply$1(Tester$$anonfun$1.this): Function1)
    final  def apply(v1: Object): Object = Tester$$anonfun$1.this.apply(v1.$asInstanceOf[Option]());
    def (): anonymous class anonfun$1 = {

You can see how the lambda is represented by a synthetic class Tester$$anonfun$1. In its apply method the code we defined is wrapped in a call to option2Iterable.

Option.option2Iterable is the implicit view definition that was in scope and was automagically injected by the compiler. It is quite simple and looks like this:

implicit def option2Iterable[A](xo: Option[A]): Iterable[A] = xo.toList

Riddle solved. Are we happy now? Maybe, but implicit conversions are certainly one of Scala's features that can lead to confusing if not unintelligible code quite easily. But then again, that is nothing new and if you look at the related literature, be it Odersky, Spoon, Venners: Programming in Scala or Suereth: Scala in Depth, the chapters covering implicit conversions are full of warnings and calls for restraint. Therefore: advance with caution.

Tuesday, April 01, 2014

99 Clojure Problems – 28: Sorting a List of Lists According to Length of Sublists

  1. "We suppose that a list contains elements that are lists themselves. The objective is to sort the elements of the list according to their length. E.g. short lists first, longer lists later, or vice versa."
  2. "Again, we suppose that a list contains elements that are lists themselves. But this time the objective is to sort the elements according to their length frequency; i.e. in the default, sorting is done ascendingly, lists with rare lengths are placed first, others with a more frequent length come later."


(deftest p28-sublist-sort
  (is (= '((o) (d e) (d e) (m n) (a b c)  (f g h)  (i j k l))
         (lsort '((a b c) (d e) (f g h) (d e) (i j k l) (m n) (o))))))

(deftest p28-sublist-freq-sort
  (is (= '((i j k l) (o) (a b c) (f g h) (d e) (d e) (m n))
         (lsort-freq '((a b c) (d e) (f g h) (d e) (i j k l) (m n) (o))))))

"Note that in the above example, the first two lists in the result L have length 4 and 1, both lengths appear just once. The third and forth list have length 3; there are two list of this length. And finally, the last three lists have length 2. This is the most frequent length."


The solution to a) is trivial if we use Clojure's sort-by and base the key function on the count of the sublists.

The solution to b) is based on the frequencies of the sublists. Clojure comes with the frequencies function that we use on the lengths of the sublists. All that is left is to call sort on the list of lists and use the frequencies map to determine the order of the sublists.

Read more about this series.

Monday, March 31, 2014

99 Clojure Problems – 27: Group the Elements of a Set into Disjoint Subsets

  1. In how many ways can a group of 9 people work in 3 disjoint subgroups of 2, 3 and 4 persons? Write a function that generates all the possibilities.
  2. Generalize the above predicate in a way that we can specify a list of group sizes and the predicate will return a list of groups.

Note that we do not want permutations of the group members; i.e. ((Aldo, Beat), ...) is the same solution as ((Beat, Aldo), ...). However, we make a difference between ((Aldo, Beat), (Carla, David), ...) and ((Carla, David), (Aldo, Beat), ...).

You may find more about this combinatorial problem in a good book on discrete mathematics under the term "multinomial coefficients".


(deftest p27-multinomial-coeffients
  (is (=
       '((("Peter" "Paul") ("Mary")) 
         (("Peter" "Mary") ("Paul")) 
         (("Paul" "Mary") ("Peter")))
       (group [2 1] ["Peter" "Paul" "Mary"]))))


The mulinomial coefficients are a generalisation of the binomial coefficients that we saw in the last problem. $${n \choose k_1, k_2, \ldots, k_m} = \frac{n!}{k_1!\, k_2! \cdots k_m!}$$ I used this to validate some aspects of my solution as you can see in the tests. Showing the correctness of your solution in a traditional unit test gets quite unwieldy when you have, as in this example, 1260 combinations to check.

My solution to part a) of the problem is based on the idea to reuse the enumeration of all k-combinations of n elements from problem 26. We then take all possible combinations of four people out of the group of nine and remove these people from the set of all people by using Clojures set functions. We continue to do that for the 3-combinations and the 2-combinations and add them into a result vector.

The repeated set creation/difference code is neither elegant nor performant but a pragmatic choice given that the combinations function was supposed to work with/return lists. Another area for improvement is the calculation of the last k-combinations. It is redundant as its result is just the remainder of the original input sequence.

The solution to part b) is a bit more involved but very closely related in terms of its structure. It's a recursive solution that starts with calculating the combinations for the first group and mapping the result to a recursive self-call. In each recursive call we remove the elements/persons already "used" from the input set. We also remove the k for which we just calculated the combinations from the list of group sizes. The whole operation is wrapped in a mapcat to flatten the result into just one level as you can see in the example.

Read more about this series.

Sunday, March 30, 2014

99 Clojure Problems – 26: Generate the Combinations of K Distinct Objects Chosen From the N Elements of a List

"In how many ways can a committee of 3 be chosen from a group of 12 people? We all know that there are C(12,3) = 220 possibilities (C(N,K) denotes the well-known binomial coefficients). For pure mathematicians, this result may be great. But we want to really generate all the possibilities."


(deftest p26-combinations
  (is (= '((0 1) (0 2) (1 2)) (combinations 2 (range 3))))
  (is (= 220 (count (combinations 3 (range 12))))))


Here is my thinking, which is inspired by SICPs counting change example: If you want to enumerate all combinations of 2 distinct objects chosen from the list (0 1 2) you could fix the first element to be the first element of the list. That leaves you with a list of two elements and one more object to choose. There are only two combinations for C(2, 1): the two remaining elements of the list. Join them to the first elements we have already chosen and we end up with (0 1) and (0 2). Do the same thing again: Enumerate all combinations of two distinct objects from the remaining list (1 2), there is just one: the list itself is the answer.

We can generalize these observations into a solution: combinations of k distinct objects chosen from a list of n elements are the first element of the list conjoined to the combinations of k-1 distinct objects from the rest of the list, plus all combinations of k distinct objects from the list without its first element (again: the rest or tail of the list).

Read more about this series.

Wednesday, March 19, 2014

99 Clojure Problems – 25: Generate a Random Permutation of the Elements of a List


I am a bit at a loss when it comes to providing a meaningful test case that illustrates the problem, as I did in the previous posts in this series.

Based on my current understanding we would have to show that the randomness of the results matches a uniform random distribution. But lets look at the solutions first. I will come back to that question at the end.


There are at least three solutions to this problem that I am aware of.

1. Idiomatic/Java Interop

The first one is to reuse the idiomatic variant of the random-select, which is based on java.util.Collections#shuffle, which in turn implements AFAIK the well known Fisher-Yates shuffle algorithm. While a functional purist might sniff at the mutation involved I think, this is perfectly fine in this case as the mutability is encapsulated in a functional/immutable shell. Your sequence is translated into a mutable collection to use the efficient algorithm based on swapping. The result is then transformed back into a persistent data structure: a pattern often used to get the best of both worlds.

2. Naive Functional

Solution number two is functional but not very efficient. If I am not mistaken the running time of this solution should be O(n^2). We are calling remove-at n times, which calls split-at. Split-at has linear complexity.

3. Perfect Functional Shuffle

The third solution is nothing I came up with myself. I merely followed a hint given by Phil Gold. It pointed to Oleg Kiselyov's post about perfect functional shuffle in Haskell.

Implementing it in Clojure turned out to be quite a lot of work. I have never done any Haskell and I am not sure if I got all of if right. It is probably best you have a look at Oleg's text first.

I did not read up much on Haskell syntax, but nonetheless: it started to make sense after looking at the code for a while even without knowing the language. The biggest difference with effects on the structure of the program: Haskell comes with continuation passing style built right in— something I could not or did not want to recreate in Clojure.

The gist of the solution is to build a binary tree out of the input sequence. The tree allows for efficient extraction of branches and is by its very nature at most ceil(log2 n) deep. The overall complexity of the solution should be therefore O(n*log n). Again, read the original text for more detail.

I decided to use core.match and add with it the first dependency to the project. This allowed me to stick closer to the original solution which makes heavy use of Haskell's pattern matching.

Translating the essential two functions was pretty straightforward and core.match offered all I needed. The syntax is a bit clunkier (notably the guard clauses) than Haskell's succinctness but it's not in a different ballpark alltogether.

The first interesting function is called pair-leaves. It takes a sequence of nodes—initially all leaves—(represented by vectors of the form [:node|:leaf count value]) and joins them together to form a binary tree. We need four pattern matches to do that: matching two leaves, a node and a leaf, a leaf and a node and finally one matching two nodes.

This gives us the binary tree. We now take a sequence of random numbers (each being an independent sample of a uniform random distribution [0...n-1]) and extract the numbers at the positions indicated by those random indices from the tree.

We need six pattern matches to extract from the tree. (Oleg needs five as Haskell apparently has something like a multiple guard with sub-branches which spares him writing the last match.) The patterns are:

  1. The first element is always a leaf on the left.
  2. Any but the first element when our tree consists of two leaves: the right leaf.
  3. Any but the first element where our tree is bigger than two leaves: extract recursively to the right.
  4. Any but the first element where n + 1 is equal to (or greater) than the count of the root node and we have a leaf as the right subtree. This is actually the biggest deviation from Oleg's code and possibly a mistake on my part.
    But as I understand it, it is due to the lack of CPS in my solution. Oleg's continuations take the flow of execution back up into the shuffle function where he deals with the case of a single leaf. My solution has to be able to deal with the leaf case inside the extract-tree — by weakening the guard here.
  5. The last two are closely related: a tree with a non-leaf sub-tree to the left and none of the above: if the number we are looking for is in the left subtree (as indicated by its count on the node) we recursivly extract to the left, otherwise to the right.

Verifying the Results

What is left to do is to verify the property of uniform random distribution. The assumtions about time complexity as outlined by Oleg seem obvious to me. Let's look at a histogram:

This histogram left me wondering whether I had made a mistake. It does seem to show a quite clearly discernible repetitive sawtooth pattern. I created the data with this snippet of Clojure:

(def input '[a b c d e])
(def perms (vec (combo/permutations input)))
(def samples (repeatedly 10000 (fn [] (.indexOf perms (random-permute-functional input)))))

Compare this with a histogram of 10000 results generated by the first solution based on Fisher-Yates:

The most likely explanation is a mistake in my implementation or my reasoning. I have not found it yet, but I am quite keen on any hints. Could it be that perfect functional shuffle is not quite perfect? Probably not.


Of course not. Andy Fingerhut was so kind to point out my mistake to me: You have to choose the random numbers uniformly distributed within [0,n-1] for the first number, [0, n-2] for the second number and so on. Otherwise you will bias your shuffle towards whatever is the rightmost branch of your tree, as every number > n where n is the size of your current tree will extract the rightmost element. You can find the corrected version here. The histogram looks much better now too:

Read more about this series.

Friday, February 28, 2014

99 Clojure Problems – 24: Draw N Different Random Numbers From The Set 1...M


(deftest p24-lotto
  (let [draw (lotto 6 49)]
    (is (= 6 (count draw)))
    (is (>= 49 (apply max draw)))))


This is simple if you solved the previous problem. Just reuse random-select to select n numbers from the set 1...m which you pass in as the second argument.

Read more about this series.

Saturday, February 22, 2014

99 Clojure Problems – 23: Extract a Given Number of Randomly Selected Elements From a List


(deftest p23-random-select
  (is (= 3 (count (random-select 3 '(1 2 3 4 5 6))) )))


There is a simple and idiomatic solution to this problem based on shuffle (which delegates to java.util.Collections#shuffle).

I did another implementation based on recursive loop'ing and the Clojure's rand function (which delegates to java.lang.Math#random). The basic idea is to remove a randomly selected element (reusing remove-at from problem 20) and repeat the process n times. The recipe is fairly simple: use a loop binding with the function parameters and an aggregator. Remove the random element. Then destructure the result, cons'ing the removed element onto the result aggregator and passing the remaining elements into the recur call. Decrement n with each recursion. Exit at (= 0 n).

Read more about this series.

Monday, February 17, 2014

99 Clojure Problems – 22: Create a List Containing All Integers Within a Given Range


(deftest p22-range
  (is (= '(4 5 6 7 8 9) (my-range 4 9))))


There is a function in clojure.core called 'range' that does the job.

My solution is fairly simple compared to the built-in. It uses a lazy-seq, Clojure's alternative to streams. In the body of the lazy-seq we 'cons' the start element onto either an empty list or onto ourselves via a recursive call. We decide which alternative we choose by comparing the start element with the specified end element. If they are equal we have generated all the values we need and end the recursion.

Delayed computation is the necessary precondition for solving the range problem for arbitrarily large ranges of numbers. Delaying the computation allows us to compute the next value in the range only in the very moment we need it and once we continue with our computation it can be thrown away and the memory it consumed can be freed. In our case, on the JVM, the garbage collector will take care of the last part, unless you hold onto the head of your range somewhere.

I have mentioned lazy-seqs before. But a much better introduction with motivating examples can be found as always in SICP. Rich Hickey did not seem to like the stream approach. What he did like were the benefits of delayed computation. If you compare Clojure's form of laziness to the SICP examples you will find that in SICP a completely separate stream API has been established to allow delayed computation. Clojure's lazy-seq keeps compatibility with the existing higher order functions like map, filter and reduce while offering the same degree of laziness.

Read more about this series.

Thursday, February 13, 2014

99 Clojure Problems – 21: Insert an Element at a given Position into a List


(deftest p21-insert-at
  (is (= '(1 new 2 3 4) (insert-at 'new 1 '(1 2 3 4)))))


My solution to this problem is based on the same idea as the previous one: Reuse split from problem 17 to cut the input sequence into two parts. Then concatenate the parts, cons'ing the element to insert to the tail before doing so.

Read more about this series.

Tuesday, February 11, 2014

99 Clojure Problems – 20: Remove the Kth Element from a List

Return the list and the removed element in a tuple. Elements are numbered from 0.


(deftest p20-remove-at
  (is (= '((a c d) b) (remove-at 1 '(a b c d)))))


A quick one: Reuse split from problem 17 to cut the input sequence into two parts. I called them head and tail, which is not quite accurate as head usually refers to the first element of a sequence only. You can then concatenate the head part and the tail part again into a single sequence. Calling rest on the tail before concatenating removes the right element. Calling first on the tail allows you to return the removed element as required (Source).

Read more about this series.

Saturday, February 08, 2014

99 Clojure Problems – 19: Rotate a List N Places to the Left


(deftest p19-rotate
  (is (= '(d e f g h i j k a b c) (rotate 3 '(a b c d e f g h i j k))))
  (is (= '(j k a b c d e f g h i) (rotate -2 '(a b c d e f g h i j k)))))


The examples illustrate that the sign of the first parameter determines the direction of rotation. A positive integer is meant to indicate that we want to rotate N places to left counting from the beginning of the input sequence. A negative integer means start counting from the end of the input sequence.

My solution calculates the split point accordingly and reuses the solution to problem 17 to split the input sequence. All of this happens in an initial let binding, thus we can use destructuring to bind the result of the split operation to two symbols. I called the second sublist "newhead" and the first sublist "newtail". We can then simple concatenate the two parts in the order indicated by the symbol names to create the rotated list.

Read more about this series.

Friday, February 07, 2014

99 Clojure Problems – 18: Extract a Slice from a List

Given two indices, I and K, the slice is the list containing the elements from and including the Ith element up to but not including the Kth element of the original list. Start counting the elements with 0.


(deftest p18-slice
  (is (= '(d e f g) (slice 3 7 '(a b c d e f g h i j k)))))


If we drop the first I elements we get a lazy sequence starting with the Ith element (because we count starting with 0). If we then take the difference between K and I and remove as many elements from our sequence we obtain the desired slice (Source).

Read more about this series.

Tuesday, February 04, 2014

99 Clojure Problems – 17: Split a List into Two Parts

The length of the first part is given. Use a tuple for your result.


(deftest p17-split
  (is (= ['(a b c) '(d e f g h i j k)]) (split 3 '(a b c d e f g h i j k))))


There is a nice, functional, built-in solution in Clojure. I wanted to do it anyway.

My solution uses a nested, recursive function. Strictly speaking you cannot nest function definitions in Clojure in the same way as you can e.g. in Scala. But what you can do is to define a binding via let pointing to a function within another function. Since writing the solution I have learned that you can use letfn for this purpose.

The local function itself is fairly simple. It returns a vector (the tuple if you will) of the two lists and takes the length n of the first list, the list to break apart and the the initially empty result accumulator. We then "recur" by decrementing n, conjoining the first element to the result and passing on the rest of the sequence. When n becomes zero or we don't have anything left to split, we return the result.

Read more about this series.

Thursday, January 30, 2014

99 Clojure Problems – 16: Drop Every Nth Element of a List


(deftest p16-drop-every-nth
  (is (= '( a  b  d  e  g  h  j  k) (drop-every 3  '(a b c d e f g h i j k)))))


I have two solutions to offer. The basic idea for both of them is to go through the input list and transform it to a result list, counting the elements as they are cons'ed to the result. But I would only do the cons'ing if the value of the counter was not a multiple of n thereby dropping every nth element as required.

The first one is a recursive solution. I used a let binding to define an inner function with the additional counter argument. Without any further precautions the function would be stack-consuming and a stack overflow would be due quite quickly (on my machine with a standard 'lein repl' it can handle 4221 elements). In order to prevent that I wrapped the meat of the function in a lazy-seq

 +--------+   +--------+  +--------+
 |LazySeq |   |LazySeq |  |LazySeq |
 |--------|   |--------|  |--------|
 |+--+---+|   |+--+---+|  |        |
 ||a | +----> ||b | +---> |(drop..)|
 |+--+---+|   |+--+---+|  |        |
 +--------+   +--------+  +--------+

lazy-seq is just a macro that turns the expressions it is given into a LazySeq object. If you followed the link and were surprised by Java code: LazySeq is part of the Java underpinnings of Clojure. The lazy part in lazy-seq means that it does not immediately realise the given expressions (and in our case use up all the stack through recursion). Instead it returns and does nothing until it is accessed as a sequence (that's the seq part). Once realised it caches its value. See my poor man's illustration above to find two realised seqs (a and b) and one that has not yet computed its value and still contains the function to produce it.

With this arrangement in place, the only limit is the size of your heap. At least if you prevent the garbage collector from freeing up memory by collecting the elements of the list we have already computed but won't need anymore. This will happen if you hold on to the head of the list somewhere.

Solution number two is based on the same idea but makes use of a built-in function called 'keep-indexed'. I discovered it a couple of days after my first implementation. It's a higher-order function that takes a decider as its first argument. The decider function gets called for every element in the input sequence. The decider itself takes two arguments the first being the zero-based index of the current item and the second being the item itself. The function is called 'keep-indexed' because it keeps every element for which the decider returns a non-nil value. Implementing drop-every-nth is thus reduced to a very simple anonymous function that does the division (as explained above)—we end up with a quite elegant one-liner.

Read more about this series.

Tuesday, January 28, 2014

99 Clojure Problems – 15: Duplicate the Elements of a List a Given Number of Times


(deftest p15-duplicate-n
  (is (= '(a a a b b b c c c c c c d d d) (duplicate-n 3 '(a b c c d)))))


Again, the solution is just a slight variation on the solution to the previous problem. We are reducing the list into an accumulator as before, but this time we use Clojure's built-in 'repeat' to achieve the specified number of duplications per element. 'Repeat' is returning a list so we have to concat the accumulator with the repeated elements to get our results back.

This is a two star problem, meaning it should take an experienced programmer about 60-90 minutes to solve. Presumably this problem is much harder to solve in Prolog than in Clojure.

Read more about this series.

Monday, January 27, 2014

99 Clojure Problems – 14: Duplicate the Elements of a List


(deftest p14-duplicate
  (is (= '(a a b b c c c c d d) (duplicate '(a b c c d)))))


A straightforward solution just uses reduce to conjoin the current element twice with an accumulator.

Read more about this series.

Sunday, January 26, 2014

99 Clojure Problems – 13: Run-length Encoding of a List (Direct Solution)

Implement the so-called run-length encoding data compression method directly. I.e. don't use other methods you've written (like P09's pack); do all the work directly.


(deftest p13-decode
  (is (= '( a a a a b c c a a d e e e e) (decode '((4 a) (1 b) (2 c) (2 a) (1 d) (4 e))) )))


Maybe I missed the point here, but my solution is identical to the solution of problem 10 after the substitution of pack with its body.

Read more about this series.

Thursday, January 23, 2014

99 Clojure Problems – 12: Decode a Run-length Encoded List

Given a run-length code list generated as specified in problem P10, construct its uncompressed version.


(deftest p12-decode
  (is (= '( a a a a b c c a a d e e e e) (decode '((4 a) (1 b) (2 c) (2 a) (1 d) (4 e))) )))


Again, the solution to this problem is fairly straightforward: We map the encoded list over a function that applies Clojure's built-in repeat to each sublist's second element (the character) as many times as the first element of the sublist specifies. This leaves us with a list of lists, each containing the repeating elements. We obtain the final result by flattening the nested structure in to a single list.

Read more about this series.

Wednesday, January 22, 2014

99 Clojure Problems – 11: Modified Run-length Encoding


(deftest p10-encode-modified
  (is (= '((4 a)  b  (2 c)  (2 a)  d  (4 e)) (encode-modified '( a a a a b c c a a d e e e e)))))


Source. A simple map over the result of P10 leads to the desired result.

Read more about this series.

Tuesday, January 21, 2014

99 Clojure Problems – 10: Run-length Encoding of a List

Use the result of problem P09 to implement the so-called run-length encoding data compression method. Consecutive duplicates of elements are encoded as tuples (N, E) where N is the number of duplicates of the element E.


(deftest p10-encode
  (is (= '((4 a) (1 b) (2 c) (2 a) (1 d) (4 e)) (encode '( a a a a b c c a a d e e e e) ))))


My simple solution maps over the result of problem 9 using a mapping function that replaces each sublist with a tuple consisting of the length of the sublist and the element.

Read more about this series.

Monday, January 20, 2014

99 Clojure Problems – 9: Pack Consecutive Duplicates of List Elements into Sublists


(deftest p09-pack
  (is (= '( ( a a a a) ( b) ( c c) ( a a) ( d) ( e e e e)) (pack  '( a a a a b c c a a d e e e e)))))


The key to the solution was the partition-by function. I believe I first learnt about it when looking at other people's solutions to a previous problem (While I find it helpful and enlightening to look at other people's code, I am quite strict about not peeking at solutions to problems I have not solved myself yet.)

Partition-by is a higher-order function that takes a grouping function as it's first argument. It pipes all values through that function and lumps the values together in a subsequence (read sublist) until the grouping functions returns a new value—which marks the start of the next subsequence.

By choosing the identity as our grouping function we get exactly the desired result.

Read more about this series.

Friday, January 17, 2014

99 Clojure Problems – 8: Eliminate Consecutive Duplicates of List Elements


(deftest p08-compress
  (is (= '[a b c a d e]  (compress  '[a a a a b c c a a d e e e e]))))



A simple reduce that only conjoins the current element to the result if it is not identical to its predecessor. It uses last to find out who that predecessor was. This is not the most efficient way to solve the problem, as we know from problem 1, last has linear time complexity.

Read more about this series.

Thursday, January 16, 2014

99 Clojure Problems – 7: Flatten a Nested List Structure


(deftest p07-flatten
  (is (= '(1 1 2 3 5 8) (flatten (list (list 1 1) 2 (list 3 (list 5 8)))))))


This was the first problem that forced me think a bit harder. It is marked with two asterisks in the original prolog version of the "challenge", meaning a skilled programmer should be able to solve the problem in 30-90 minutes.

It is also the first problem where I'm not happy with my solution. It feels quite imperative too me and its nested if-structures are not exactly easy on the eyes. I used a multiple arity function, which means the function accepts two (or more) different parameter lists. I did this to supply an initial value for an accumulator. Thinking about it now it seems cleaner to me to hide these implementation details completely from the user. This could be achieved by using a let binding or something similar inside the function without polluting the API. The following first "if" checks whether we are dealing with a sequence at all. If not we are just conjoining this non-sequential element with the accumulator. If the input is a sequence we need another "if" to check for an empty sequence which is used as the escape hatch in the recursion that follows: We thread the accumulator through two recursive calls of the function one for the head of the sequence, one for the rest.

Feeling a bit unsatisfied with what I had come up with I went looking for other people's solutions, again, something I had not done before. I found rodnaph github repo with a rather elegant solution based on reduce (I have included it with minor modifications in my code for easy comparison). Reduce saves one noisy if and leaves us with just one check to decide whether an element is a sequence or not. If it is we call the function recursively. But instead of joining individual elements as I did he just concats the resulting flattened sublist onto the accumulator.

Finally I had a quick look at Clojure's built-in flatten. It interprets the nested lists as a tree and uses its tree-seq function to turn this tree into a lazy sequence of its elements. Tree-seq is a higher-order function that takes a branch detector function as its first argument to decide when to descent into the depths of the given data structure. By using sequential? as the branch detector we follow the nested lists as we walk the tree. Tree-seq then conses all nodes together in a lazy sequence. We are not quite done with the resulting sequence because it contains lots of "duplicates" as each branch node also contains in our case all of its children—the root node of our tree for example is the whole nested input list. Therefore we have to filter out all sequences in the result thereby removing the branching nodes. We return the remaining leaves—our flattened list.

Read more about this series.

Wednesday, January 15, 2014

99 Clojure Problems – 6: Find Out Whether a List Is a Palindrome


(deftest p06-test
  (is (= true (palindrome? '(1 2 3 3 2 1))))
  (is (= false (palindrome? '(1 2 3)))))


One definition of a palindrome is "a sequence of symbols or elements, whose meaning may be interpreted the same way in either forward or reverse direction" [Wikipedia]. The definition contains already a possible solution. To test whether a given sequence of elements is a palindrome we can compare it with the same sequence in reverse order. We can reuse the reverse function defined to solve the last problem.

Read more about this series.

Tuesday, January 14, 2014

99 Clojure Problems – 5: Reverse a List


(deftest p05-reverse
  (is (= '(3 2 1) (my-reverse '(1 2 3)))))


Now this might be a bit disappointing but my two solutions to this problem are structurally very similar to the solutions to problem no. 4. One way of reversing a list is to recursively call reverse with the tail of the list until all that is left is an empty list. We cons the head of the current list to the result on the way and return the reversed result when we hit the base case.

Another solution could use reduce to cons the elements together in a similar fashion. The val parameter initialised with an empty list is used again as an accumulator to build up the result.

Finally there is, of course, also the built-in reverse.

Read more about this series.

Monday, January 13, 2014

99 Clojure Problems – 4: Find the Number of Elements of a List


(deftest p04-test
  (is (= 3 (my-count (list 1 2 3)))))


There is the built-in count that solves this problem.

I implemented two possible solutions for this problem:

Source 1. This is again a variation on the theme of using the loop/recur construct (that I also used in the previous problems) with an additional parameter binding aka accumulator to keep track of the state—in this case the number of elements encountered so far.

Source 2 I'm not entirely sure if this was really my own idea or if I saw it or something similar somewhere—I should have made a note. This solution uses reduce, a standard higher-order function found in most functional languages. Clojure's implementation allows for a val argument that we use to accumulate the count by incrementing by one for each element in the collection passed as the third argument.

Read more about this series.

Sunday, January 12, 2014

99 Clojure Problems – 3: Find the Kth Element of a List


(deftest p03-kth
  (is (= 3 (kth 2 (list 1 2 3 4)))))


There is a built-in function called nth that solves this problem.


A relatively straightforward recursive solution uses recur on the function itself while decrementing the k and dropping the first element on each recursive call until we reach k = 0 at which point we return the first element. Strictly speaking this is a linear iterative process and not a recursive one as all the state of the process is in its parameter bindings and not in the results of the stack of recursive calls.

Read more about this series.

Saturday, January 11, 2014

Ninety Nine Clojure Problems

My New Year's resolution for 2014: I am going to solve 99 Clojure problems inspired by "S-99: Ninety Nine Scala Problems" and "P-99: Ninety Nine Prolog Problems". It might be a bit ambitious, but we will see.

I thought it would be a good idea to cover each problem and its solution in a short blog post. I will try to do it in a format that does not reveal the solution right away (there will be a link to the solution in each post).

Another word of warning: I am really just starting my Clojure adventure and the code presented here does not claim to be correct or idiomatic Clojure. I hope to see continuous improvement throughout the process though. If you have suggestions or more elegant solutions to one of the problems, please let me know.

The classification of the problems works as follows: * means easy, ** means 60-90 minutes if you are a skilled programmer, *** means difficult. The levels of difficulty are not based on my own experience but taken from the original Prolog problems.



Logic and Codes

Binary Trees