Infinite Streams in Groovy

streams

Streams

LazyLists



class LazyList

{

private def car;

private Closure cdr;

LazyList(def car, Closure cdr) { this.car=car; this.cdr=cdr; }

public def getCar() { return car; }

public LazyList getCdr() { if(cdr != null){ return cdr(); } else { cdr; } }

}





public static cons(def val, Closure c) { return new LazyList(val, c); }





public List take(int n)

{

def r = []; def l = this;

n.times { r.add(l.car); l = l.cdr(); }

return r;

}





def integers(n) { cons(n, { integers(n+1) }) }

def naturalnumbers = integers(1)



naturalnumbers.take(10).each { print "$it

"; }



Big deal! What about filters?



public LazyList filter(Closure pred)

{

if(getCar() != null)

if(pred(getCar()))

return cons(getCar(), { getCdr().filter(pred) })

else

return getCdr().filter(pred)

return l

}





def evennumbers = naturalnumbers.filter { it % 2 == 0 }



The Sieve of Eratosthenes



def sieve(list) { cons(list.car, { sieve(list.cdr.filter { x-> x % list.car != 0 }) }) }





def sieve(list)

{

cons(list.car,

{

sieve(list.cdr.filter { x-> x % list.car != 0 })

})

}





def primes = sieve(integers)

primes.take(100).each { print "$it

" }





def fibo(a,b) { cons(a, { fibo(b, a+b) }) }

fibo(0,1).take(100).each { print "$it

" }





public LazyList compose(Closure c, LazyList l)

{

return cons(c(getCar(), l.car), { getCdr().compose(c, l.cdr) })

}



public LazyList map(Closure c)

{

return cons(c(getCar()), { getCdr().map(c); });

}





def twinprimes = primes.compose({x,y -> [y-x == 2, x]}, primes.cdr).filter({it[0] == true}).map({ it[1] });

twinprimes.take(10).each { print "($it, ${it+2})

" }



Caveats



public static delay(Closure c)

{

boolean evaled=false;

def val=null;

return {

Object[] args ->

if(!evaled) { evaled=true; val=c(args); return val }

else return val;

}

}





public static cons(def val, Closure c)

{

return new LazyList(val, delay(c));

}



In my last post, I showed how you can use byte-code manipulation and Generics to achieve a modicum of functional programming in Java without overly cumbersome syntax. However, it still feels a little clunky.But how about in Groovy? Groovy's support for real closure syntax turns out to make functional programming, er, Groovy! I won't bother writing up the basics, since they have already been done elsewhere, such as Functional Programming in Groovy What I will do is show how to build a classic data structure used in functional programming,, using Groovy. And then I'll show a classic application of functional streams: a stream of the Sieve of Eratosthenes, in one line of code.A stream is a collection, like a list, only that it doesn't necessarily produce its contents until needed, and it may in fact, be infinitely large. It bears some resemblance to streams used in I/O, such as InputStream in Java. For example, one could create a stream of integers, by implementing a class IntegerInputStream whose read() method returns the integers in montonically increasing order.But our streams won't subclass InputStream, ours will be a sort of linked-list.First, we'll create a new data structure, LazyList which represents a pair of values: a concrete value 'car', and a closure 'cdr' which returns another LazyList.Notice that the getCdr() function doesn't return the cdr field, it returns the result of executing the closure.Next we'll need a way of constructing a new LazyList, so we add the following method to class LazyListThis factory method just uses the LazyList constructor (for now). And finally, we need a method for grabbing some values out of the list.We're not done yet, but we're already capable of defining our first infinite stream.The first line says that the integers starting from N, is the result of constructing a list consisting of the element N and the integers starting from N+1. Note that the recursive call to integers(n+1) is inside a closure. Groovy is an eager language, and all function parameters are evaluated before calling the function. Without the closure, this would result in an infinite loop and overflow the stack.The second line says that the natural numbers are the integers starting from 1.The third line says to take the first 10 elements from the natural numbers, iterate over the result, printing them out.Yeah, you could do that easily with [1..10].each { print "$it

" }. So let's move on. What if we want to filter an infinite LazyList? We can do that too, with the following method added to LazyList.Which says that to filter a LazyList with closure predicate 'pred', we call pred() on the first element of the list, and if true, we construct a new list using this element, and a closure which will invoke filter on the rest of the elements of the list. Otherwise, we recursively call filter until we find an element that satisfies the predicate.Now we can define the even numbers...One of the classic examples used to demonstrate the power of streams is the Sieve of Eratosthenes , which is to define a stream whose successive list of values are the prime numbers.This can in fact be done in a single line of code using LazyList:although that is kinda ugly, but formatted this wayis easier to read. If you read the Wikipedia link above, you'll see that the sieve takes a list, whose first element is prime, and returns a new list, with this first element, followed by a sieve of another list with all multiplies of the prime number filtered out.Now we can print the first hundred prime numbers:or the first 100 fibonacci numbersAnd with just two other magic functionswe can even print out the first 10 twin primes:First, don't expect this code to run fast. It generates alot of garbage lazylists, uses recursion, and it also recalculates values. The first problem probably can't be fixed, the second problem can be fixed by rewriting the LazyList functions to be tail-recursive and then hacking Groovy to make sure it optimizes tail-recursion.The final problem (recalcuation) can be fixed with memoization. Recall the cons() function which appears to do nothing but invoke the LazyList constructor? Let's introduce a new function:and modify consso that now the Closure passed to cons() is only executed once. Subsequent values are returned from the cached value. Of course, this only helps performance if the closure block is computationally expensive to evaluate. Otherwise, the overhead of creating the memoization closure probably incurs even worse penalties.That's it for today's hack. Practical? Maybe not. Groovy? Yes. If you want to use these techniques and run them more efficiently on the JVM, you may want to look at some of the Scheme/Lisp dialects available for Java.-Ray

Labels: java