As soon as C++ got lambdas, developers use and abuse them to make code more concise and better encapsulated. Since they come from lambda calculus and functional programming languages where recursion is very common, it’s tempting to use C++ lambdas for recursive functions as well. The first natural way attempt to implement factorial that looks something like
unfortunately would result in a compiler error
After some web searching, developers usually end up with a second attempt that uses std::function
It seems ok, but what about concerns about std::function issues some of which were mentioned in
Well, the generated assembly looks somewhat suspicious
but before we come to any conclusions it would be nice to have something working to compare this with. Is there a way to have a recursive implementation based on pure lambdas? Yes and that’s where Y combinators come into the picture. I’ll spare you the details that are easy to find and go directly to the code
It looks a little strange since we are passing the lambda as a parameter to itself, but this way we can recursively call the function. Its assembly looks a little scary
but as soon as SIMD registers show up in the assembly, it’s usually a good sign that compilers are able to vectorize some computations.
At this point we can spend a lot of time analyzing generated assembly and predicting potential performance impact or we can run some benchmarks
#include <functional> | |
int fact_std_func(int n) { | |
std::function<int(int)> fact = [&](auto n) { | |
if (n == 1) return 1; | |
return n * fact(n-1); | |
}; | |
return fact(n); | |
} | |
int fact_comb(int n) { | |
auto fact = [](auto n, auto self) -> int { | |
if (n == 1) return 1; | |
return n * self(n-1, self); | |
}; | |
return fact(n, fact); | |
} | |
const int n = 1000; | |
static void FactStdFunc(benchmark::State& state) { | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(fact_std_func(n)); | |
} | |
} | |
BENCHMARK(FactStdFunc); | |
static void FactComb(benchmark::State& state) { | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(fact_comb(n)); | |
} | |
} | |
BENCHMARK(FactComb); |
The results look unambiguous with 13X runtime difference.
This is yet another example of how useful Y combinators are and a reason why there is even a proposal to add Y combinator into the standard library. Until then, you know what to do.
a couple of notes
on gcc a regular function is still faster https://quick-bench.com/q/0BdxXp1mxOElTdTqdcQkF3kk2gM
constexpr function with lambda does not always calculate compile-time results