Thursday, May 29, 2014

99 Clojure Problems – 35: Determine the Prime Factors of a Given Positive Integer

Example

(deftest p35-prime-factors
  (is (= '(3 3 5 7) (prime-factors 315))))

Solution

If you have ever done the "Prime Factors Kata" the solution to this problem is probably already in your muscle memory. One reason why this problem works well as an exercise is that there exists a very simple solution to it. In fact the solution can be expressed in just a few lines of code (in almost any language).

This simplest approach is to do trial division with increasing positive integers. You don't even have to bother checking for primality as things just fall into place: dividing by two for example effectively "eliminates" all multiples of two, in the sense that the next successful division can only happen with a number that is not a multiple of two. You do quite a lot of unnecessary divisions in the process though.

In my Clojure solution I chose to calculate a lazy (memoized) sequence of primes first and then do trial division using primes only. This reduces the number of divisions you have to do, assuming the primes are already calculated.

As long as your number is divisible by one of the small primes and does not have too many digits, trial division might be sufficient. For anything else you have to start looking at things like the general number field sieve.

Read more about this series.

Thursday, May 01, 2014

99 Clojure Problems – 34: Calculate Euler's Totient Function phi(m)

Euler's so-called totient function phi(m) is defined as the number of positive integers r (1 <= r <= m) that are coprime to m.

Example

(deftest p34-totient
  (is (= 8 (totient-euler 20)))
  (is (= 4 (totient-euler 5))))

Solution

There are multiple ways of calculating the totient function. One pretty straightforward way is to check for every positive integer less than m whether it is coprime to m and count their number.

But when reading up on Wikipedia, Euler's classical formula struck me as a use case for Clojure's ratio data type.

It is defined as the sum over all positive devisors d of n: $$\sum_{d\mid n}\varphi(d)=n $$ One way of deriving that formula is to look at all the fractions between 0 and 1 with denominator n. For n = 10: $$\tfrac{1}{10} \tfrac{2}{10} \tfrac{3}{10} \tfrac{4}{10} \tfrac{5}{10} \tfrac{6}{10} \tfrac{7}{10} \tfrac{8}{10} \tfrac{9}{10} \tfrac{10}{10} $$

That is just mapping / over a range of numbers in Clojure.

Put them to lowest terms: $$\tfrac{1}{10} \tfrac{1}{5} \tfrac{3}{10} \tfrac{2}{5} \tfrac{1}{2} \tfrac{3}{5} \tfrac{7}{10} \tfrac{4}{5} \tfrac{9}{10} \tfrac{1}{1} $$ Clojure reduces it's ratios automatically to lowest terms.

We then look at the ratios that still have n as their denominator (here 10): $$\tfrac{1}{10} \tfrac{3}{10} \tfrac{7}{10} \tfrac{9}{10} $$ They turn out to be the ones whose nominators are coprimes of n and their number is the result of the totient function. In Clojure that is just a filter and a count over the result of the application of map.

This solution is probably not the fastest way of calculating the totient but it is faster than calculating coprimes based on the previous solution.

If I interpret the Criterium benchmark results correctly it is about 1.7x faster

ninety-nine-clojure.arithmetic> (bench (totient-euler 2000))
Evaluation count : 70320 in 60 samples of 1172 calls.
             Execution time mean : 867.964067 µs
    Execution time std-deviation : 14.561162 µs
   Execution time lower quantile : 859.192054 µs ( 2.5%)
   Execution time upper quantile : 901.487425 µs (97.5%)
                   Overhead used : 133.031916 ns

Found 7 outliers in 60 samples (11.6667 %)
 low-severe  7 (11.6667 %)
 Variance from outliers : 6.2554 % Variance is slightly inflated by outliers
nil
ninety-nine-clojure.arithmetic> (bench (totient 2000))
Evaluation count : 41400 in 60 samples of 690 calls.
             Execution time mean : 1.496732 ms
    Execution time std-deviation : 63.212003 µs
   Execution time lower quantile : 1.450532 ms ( 2.5%)
   Execution time upper quantile : 1.636526 ms (97.5%)
                   Overhead used : 133.031916 ns

Found 5 outliers in 60 samples (8.3333 %)
 low-severe  3 (5.0000 %)
 low-mild  2 (3.3333 %)
 Variance from outliers : 28.6860 % Variance is moderately inflated by outliers
nil

Read more about this series.