My favorite way of exploring a fascinating world of software engineering is by reading code. And since we usually discuss performance topics, today we’re going to use a real-world example from TigerBeetle. TigerBeetle is a distributed financial accounting database designed for mission critical safety and performance. It is the world's fastest financial accounting database, capable of processing 1 million journal entries per second on consumer-grade hardware. TigerBeetle is also the smallest and toughest financial accounting database, with an incredible storage fault model.
It’s a great example mechanical sympathy in action and I highly recommend read about its design principles and distributed consensus approach, since it uses neither Paxos nor Raft.
But today, instead of design principles and complicated data structures, we’re going to look at the implementation of div_ceil that returns a result of division rounded to the next larger integer
Frankly, I was expecting an implementation that is used much more frequently in programming competitive world
It looks more concise, doesn’t have a special case for the 0 denominator and features a shorter assembly
vs
Btw, just for fun, I’ve also included its variant with special case for 0 before we benchmark them.
Depending on how easy for CPU to predict this branch, it can end up having no impact as can be seen from
Looks like our alternative implementation is better all around, doesn’t it? Well, there is a small gotcha - adding numerator and denominator may result in overflow, which is an undefined behavior for signed integers. While such overflow is defined for unsigned integers, something like div_ceil3(std::numeric_limits<unsigned int>::max(), 2)
evaluates to 0, which is unlikely something that clients expect.
That’s yet another example of why undefined behavior is useful - it’s a way we can establish invariants for compiler that cannot be verified at compile-time. As such, if you know that your div_ceil function is never used for inputs that can lead to overflow, consider getting a 30% performance boost.