Stack overflows from deep recursion in Java?

Increasing the stack size will only serve as a temporary bandage. As others have pointed out, what you really want is tail call elimination, and Java does not have this for various reasons. However, you can cheat if you want.

Red pill in hand? OK, this way please.

There are ways in which you can exchange stack for heap. For example, instead of making a recursive call within a function, have it return a lazy data structure that makes the call when evaluated. You can then unwind the “stack” with Java’s for-construct. I’ll demonstrate with an example. Consider this Haskell code:

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = (f x) : map f xs

Note that this function never evaluates the tail of the list. So the function doesn’t actually need to make a recursive call. In Haskell, it actually returns a thunk for the tail, which is called if it’s ever needed. We can do the same thing in Java (this uses classes from Functional Java):

// The recursive call lives inside the P1 thunk, so it runs only when the tail is forced.
public <B> Stream<B> map(final F<A, B> f, final Stream<A> as)
  {return as.isEmpty()
     ? nil()
     : cons(f.f(as.head()), new P1<Stream<B>>()
         {public Stream<B> _1()
           {return map(f, as.tail()._1());}});}

Note that Stream<A> consists of a value of type A and a value of type P1, which is like a thunk that returns the rest of the stream when _1() is called. While it certainly looks like recursion, the recursive call to map is not made when map runs. Instead it becomes part of the Stream data structure.

This can then be unwound with a regular for-construct.

for (Stream<B> b = bs; b.isNotEmpty(); b = b.tail()._1())
  {System.out.println(b.head());}
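
If you don’t have Functional Java on hand, the same trick can be sketched with nothing but java.util.function. This is only a rough illustration: LazyList, from and sumFirst are names I made up for the sketch and are not part of any library, and unlike a real Stream the tail here is not memoized.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical minimal lazy cons list: a head plus a thunk for the tail. */
final class LazyList<A> {
    private final A head;                      // unused for the empty list
    private final Supplier<LazyList<A>> tail;  // null marks the empty list

    private LazyList(A head, Supplier<LazyList<A>> tail) {
        this.head = head;
        this.tail = tail;
    }

    static <A> LazyList<A> nil() { return new LazyList<>(null, null); }

    static <A> LazyList<A> cons(A head, Supplier<LazyList<A>> tail) {
        return new LazyList<>(head, tail);
    }

    boolean isEmpty() { return tail == null; }
    A head() { return head; }
    LazyList<A> tail() { return tail.get(); }  // forces exactly one more cell

    /** Infinite list n, n+1, n+2, ... built one cell at a time. */
    static LazyList<Long> from(long n) {
        return cons(n, () -> from(n + 1));
    }

    /** Looks recursive, but the call to map sits in the Supplier and runs later. */
    static <A, B> LazyList<B> map(Function<A, B> f, LazyList<A> as) {
        if (as.isEmpty()) {
            return nil();
        }
        return cons(f.apply(as.head()), () -> map(f, as.tail()));
    }

    /** Unwinds up to n cells with a plain loop, so stack depth stays constant. */
    static long sumFirst(LazyList<Long> xs, int n) {
        long sum = 0;
        for (LazyList<Long> c = xs; n > 0 && !c.isEmpty(); c = c.tail(), n--) {
            sum += c.head();
        }
        return sum;
    }

    public static void main(String[] args) {
        LazyList<Long> doubled = map(x -> x * 2, from(1));
        System.out.println(sumFirst(doubled, 1_000_000)); // 1000001000000
    }
}
```

Each trip around the loop forces one thunk, which builds one cell and returns, so a million "recursive" calls never nest more than a couple of frames deep.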

Here’s another example, since you were talking about Project Euler. This program uses mutually recursive functions and does not blow the stack, even for millions of calls:

import fj.*; import fj.data.Natural;
import static fj.data.Enumerator.naturalEnumerator;
import static fj.data.Natural.*; import static fj.pre.Ord.naturalOrd;
import fj.data.Stream; import fj.data.vector.V2;
import static fj.data.Stream.*; import static fj.pre.Show.*;

public class Primes
  {public static Stream<Natural> primes()
    {return cons(natural(2).some(), new P1<Stream<Natural>>()
       {public Stream<Natural> _1()
         {return forever(naturalEnumerator, natural(3).some(), 2)
                 .filter(new F<Natural, Boolean>()
                   {public Boolean f(final Natural n)
                      {return primeFactors(n).length() == 1;}});}});}

   public static Stream<Natural> primeFactors(final Natural n)
     {return factor(n, natural(2).some(), primes().tail());}

   public static Stream<Natural> factor(final Natural n, final Natural p,
                                        final P1<Stream<Natural>> ps)
     {for (Stream<Natural> ns = cons(p, ps); true; ns = ns.tail()._1())
          {final Natural h = ns.head();
           final P1<Stream<Natural>> t = ns.tail();
           if (naturalOrd.isGreaterThan(h.multiply(h), n))
              return single(n);
           else {final V2<Natural> dm = n.divmod(h);
                 if (naturalOrd.eq(dm._2(), ZERO))
                    return cons(h, new P1<Stream<Natural>>()
                      {public Stream<Natural> _1()
                        {return factor(dm._1(), h, t);}});}}}

   public static void main(final String[] a)
     {streamShow(naturalShow).println(primes().takeWhile
       (naturalOrd.isLessThan(natural(Long.valueOf(a[0])).some())));}}

Another thing you can do to exchange stack for heap is to use multiple threads. The idea is that instead of making a recursive call, you create a thunk that makes the call, hand this thunk off to a new thread and let the current thread exit the function. This is the idea behind things like Stackless Python.

The following is an example of that in Java. Apologies that it’s a bit opaque to look at without the import static clauses:

public static <A, B> Promise<B> foldRight(final Strategy<Unit> s,
                                          final F<A, F<B, B>> f,
                                          final B b,
                                          final List<A> as)
  {return as.isEmpty()
     ? promise(s, P.p(b))
     : liftM2(f).f
         (promise(s, P.p(as.head()))).f
         (join(s, new P1<Promise<B>>()
            {public Promise<B> _1()
              {return foldRight(s, f, b, as.tail());}}));}

Strategy<Unit> s is backed by a thread pool, and the promise function hands a thunk to the thread pool, returning a Promise, which is very much like java.util.concurrent.Future, only better. The point is that the method above folds a right-recursive data structure to the right in O(1) stack, which ordinarily requires tail-call elimination. So we’ve effectively achieved TCE, in exchange for some complexity. You would call this function as follows:

Strategy<Unit> s = Strategy.simpleThreadStrategy();
int x = foldRight(s, Integers.add, 0, range(1, 10000)).claim();
System.out.println(x); // 49995000

Note that this latter technique works perfectly well for nonlinear recursion. That is, it will run in constant stack even for algorithms that don’t have tail calls.
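
To see the hand-off idea in isolation, here is a rough sketch that uses only java.util.concurrent instead of Functional Java’s Promise. It is deliberately simpler than the foldRight above: it carries an accumulator rather than composing promises, so it shows only the "submit the next call and return" part. HandOffSum, sumTo and step are hypothetical names for this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: each step is handed to a pool instead of called directly. */
final class HandOffSum {
    /** Adds 1..n; the returned future completes once the last step has run. */
    static CompletableFuture<Long> sumTo(ExecutorService pool, long n) {
        CompletableFuture<Long> result = new CompletableFuture<>();
        step(pool, n, 0L, result);
        return result;
    }

    private static void step(ExecutorService pool, long n, long acc,
                             CompletableFuture<Long> result) {
        if (n == 0) {
            result.complete(acc);
        } else {
            // Queue the "recursive call" as a task, then return right away,
            // so this frame is gone before the next step starts.
            pool.execute(() -> step(pool, n - 1, acc + n, result));
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println(sumTo(pool, 100_000).join()); // 5000050000
        } finally {
            pool.shutdown();
        }
    }
}
```

The pending work lives in the executor’s queue on the heap, which is exactly the stack-for-heap trade. A production version would also need to complete the future exceptionally if a step throws.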

Another thing you can do is employ a technique called trampolining. A trampoline is a computation, reified as a data structure, that can be stepped through. The Functional Java library includes a Trampoline data type that I wrote, which effectively lets you turn any function call into a tail call. As an example, here is a trampolined foldRightC that folds to the right in constant stack:

public final <B> Trampoline<B> foldRightC(final F2<A, B, B> f, final B b)
  {return Trampoline.suspend(new P1<Trampoline<B>>()
    {public Trampoline<B> _1()
      {return isEmpty()
         ? Trampoline.pure(b)
         : tail().foldRightC(f, b).map(f.f(head()));}});}

It’s the same principle as using multiple threads, except that instead of invoking each step in its own thread, we construct each step on the heap, very much like using a Stream, and then we run all the steps in a single loop with Trampoline.run.
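
For a feel of what such a loop does, here is a stripped-down trampoline in plain Java. It is a sketch and not Functional Java’s Trampoline: it has no map or bind, so it only handles calls that are already in tail position, and the names TrampolineSketch, Bounce, done and more are my own for this illustration.

```java
import java.util.function.Supplier;

final class TrampolineSketch {
    /** Hypothetical minimal trampoline: a finished value or a suspended step. */
    static final class Bounce<T> {
        private final T value;                   // meaningful once step is null
        private final Supplier<Bounce<T>> step;  // non-null while suspended

        private Bounce(T value, Supplier<Bounce<T>> step) {
            this.value = value;
            this.step = step;
        }

        static <T> Bounce<T> done(T value) { return new Bounce<>(value, null); }

        static <T> Bounce<T> more(Supplier<Bounce<T>> step) {
            return new Bounce<>(null, step);
        }

        /** Runs every suspended step in one loop instead of nesting calls. */
        T run() {
            Bounce<T> current = this;
            while (current.step != null) {
                current = current.step.get();
            }
            return current.value;
        }
    }

    // Mutually recursive, but each "call" is only a heap-allocated step.
    static Bounce<Boolean> isEven(int n) {
        if (n == 0) return Bounce.done(true);
        return Bounce.more(() -> isOdd(n - 1));
    }

    static Bounce<Boolean> isOdd(int n) {
        if (n == 0) return Bounce.done(false);
        return Bounce.more(() -> isEven(n - 1));
    }

    public static void main(String[] args) {
        System.out.println(isEven(1_000_000).run()); // true
    }
}
```

Written as direct mutual recursion, a depth of a million would likely overflow the default stack. Here each call returns a Bounce immediately, and run() does the stepping.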
