Functional programming.

Or how to not chase a tail.

Jul 18, 2021

With growing popularity of functional programming, its patterns gradually find their way into mainstream imperative languages like C++ and Rust. One of such patterns is recursion. In fact it’s so important for functional programming that languages like Scheme go all the way to requiring all of its implementations to be properly tail-recursive:

3.5 Proper tail recursion
Implementations of Scheme are required to be properly tail-recursive.

It’s very rare for language specifications to mandate required optimizations, so there there better be a good reason for this:

Rationale:
Intuitively, no space is needed for an active tail call because the continuation that is used in the tail call has the same semantics as the continuation passed to the procedure containing the call. Although an improper implementation might use a new continuation in the call, a return to this new continuation would be followed immediately by a return to the continuation passed to the procedure. A properly tail-recursive implementation returns to that continuation directly.

But what about C++? Should we expect tail call optimization? Let’s find out. First, we can take a look at an example of delegation. Suppose we know how to check if given number is even efficiently and want to implement a check for odd numbers. A most natural way to achieve this would be to do something like this:

In other words, a number is even, if it’s not odd. Unfortunately, we get the following assembly for it

which does not look very efficient but should be expected, since is_odd is not in a tail position as we have to negate its result before returning it to is_even callers.

So let’s try to implement is_even using is_odd in a tail position

In other words, if a number is even, then its predecessor, as well as successor, is odd. Not only produced assembly looks much better

but clang also adds a helpful comment letting us know that it recognized a tail call.

This is great, but if performance of a critical function at stake, how can we make sure that we get the same guarantee as Scheme developers? Unfortunately there is no standard way of ensuring this, but recently Clang has acquired a custom musttail attribute that can do just that so both

and

can be used to ensure at compile time that tail call optimization is applicable. Also, in case you’re not sure if function call is in a tail position, you can use this attribute to ask the compiler. For instance, if we try to apply the attribute to a function call that is not in a tail position like

Clang will produce a helpful error message

Interestingly, our compilers are so smart that they are able to spot opportunities to rewrite our code to take advantage of a tail call optimization. So both both versions of factorial computation

produce the same highly optimized assembly.

I was very glad to see Scala adopting tailrec annotation which verifies that the method will be compiled with tail call optimization years ago and excited to see musttail attribute supported by clang. Hopefully it’s coming to your compiler of choice soon as well.

Software Bits Newsletter

Discussion about this post