precisely because unsigned integer overflow behavior is defined, so if compiler performs any optimization, there is a chance that the computation result would be different from the one we'd get without the optimization. Note that it's not about correctness, but about result being equal to the one we expect for unoptimized computation.
sorry about the dup comment. I thought my comment was not posted and needed to re-submit. It's interesting what you are saying. Basically, it is preferring 100% correctness versus speed that might not be 100% correct as the non super-opt version.
So if I take an max unsigned int (numeric_limits<uint32_t>::max()) and will do n(n+1)/2 vs doing the loop. would it be diff? n+1 will be 0 since there is a wrap around and the whole thing will be 0. vs the loop?
again, this has nothing to do with correctness - it's about being consistent with unoptimized version. And yes, the results can be different and you can easily test this locally or using online editor like ideone - https://ideone.com/Cbuqcm
so in the case of unsigned, what could be wrong if the compiler would be using the more optimized, no loop code gen? why exactly it says "I cant do it" ? in unsigned, you have a defined wrap around, right?
unfortunately that's not the case - you just have a different code that clang is able to optimize, but the original version, listed in the post, does not enjoy the same optimization - https://godbolt.org/z/9xeM3qEff
thank you, Chris, but the point of the post was to emphasize the role of undefined behavior and not come up with a clever way to perform summation of the first N integers, even though it may very well be useful in practice. At the same time, I really enjoy super interesting discoveries, like yours, that come as a result of compiler optimization analysis, so please keep sharing them!
The optimization can also be done for unsigned integers, it just has to be done slightly differently. https://ideone.com/EBwo8e
So undefined behavior isn't really enabling much more of an optimization here. It's just that the compiler misses this optimization for unsigned ints.
so why the compiler cannot use the more optimized, no loop method? unsigned has a well defined wrap around outcome.
precisely because unsigned integer overflow behavior is defined, so if compiler performs any optimization, there is a chance that the computation result would be different from the one we'd get without the optimization. Note that it's not about correctness, but about result being equal to the one we expect for unoptimized computation.
sorry about the dup comment. I thought my comment was not posted and needed to re-submit. It's interesting what you are saying. Basically, it is preferring 100% correctness versus speed that might not be 100% correct as the non super-opt version.
So if I take an max unsigned int (numeric_limits<uint32_t>::max()) and will do n(n+1)/2 vs doing the loop. would it be diff? n+1 will be 0 since there is a wrap around and the whole thing will be 0. vs the loop?
again, this has nothing to do with correctness - it's about being consistent with unoptimized version. And yes, the results can be different and you can easily test this locally or using online editor like ideone - https://ideone.com/Cbuqcm
thank you!
you are very welcome!
so in the case of unsigned, what could be wrong if the compiler would be using the more optimized, no loop code gen? why exactly it says "I cant do it" ? in unsigned, you have a defined wrap around, right?
unfortunately that's not the case - you just have a different code that clang is able to optimize, but the original version, listed in the post, does not enjoy the same optimization - https://godbolt.org/z/9xeM3qEff
sure, but that's exactly the point of undefined behavior :)
thank you, Chris, but the point of the post was to emphasize the role of undefined behavior and not come up with a clever way to perform summation of the first N integers, even though it may very well be useful in practice. At the same time, I really enjoy super interesting discoveries, like yours, that come as a result of compiler optimization analysis, so please keep sharing them!