11 Comments

The optimization can also be done for unsigned integers, it just has to be done slightly differently. https://ideone.com/EBwo8e

So undefined behavior isn't really enabling much more of an optimization here. It's just that the compiler misses this optimization for unsigned ints.

Expand full comment

so why the compiler cannot use the more optimized, no loop method? unsigned has a well defined wrap around outcome.

Expand full comment
author

precisely because unsigned integer overflow behavior is defined, so if compiler performs any optimization, there is a chance that the computation result would be different from the one we'd get without the optimization. Note that it's not about correctness, but about result being equal to the one we expect for unoptimized computation.

Expand full comment

sorry about the dup comment. I thought my comment was not posted and needed to re-submit. It's interesting what you are saying. Basically, it is preferring 100% correctness versus speed that might not be 100% correct as the non super-opt version.

So if I take an max unsigned int (numeric_limits<uint32_t>::max()) and will do n(n+1)/2 vs doing the loop. would it be diff? n+1 will be 0 since there is a wrap around and the whole thing will be 0. vs the loop?

Expand full comment
author

again, this has nothing to do with correctness - it's about being consistent with unoptimized version. And yes, the results can be different and you can easily test this locally or using online editor like ideone - https://ideone.com/Cbuqcm

Expand full comment

thank you!

Expand full comment
author

you are very welcome!

Expand full comment

so in the case of unsigned, what could be wrong if the compiler would be using the more optimized, no loop code gen? why exactly it says "I cant do it" ? in unsigned, you have a defined wrap around, right?

Expand full comment
Comment deleted
Expand full comment
author

unfortunately that's not the case - you just have a different code that clang is able to optimize, but the original version, listed in the post, does not enjoy the same optimization - https://godbolt.org/z/9xeM3qEff

Expand full comment
Comment deleted
Expand full comment
author

sure, but that's exactly the point of undefined behavior :)

Expand full comment
Comment deleted
Expand full comment
author

thank you, Chris, but the point of the post was to emphasize the role of undefined behavior and not come up with a clever way to perform summation of the first N integers, even though it may very well be useful in practice. At the same time, I really enjoy super interesting discoveries, like yours, that come as a result of compiler optimization analysis, so please keep sharing them!

Expand full comment