This is a good point. I did not realize the aliasing part until you pointed it out. Since "&" parameters are just addresses, the compiler has to account for possible aliasing.
Yes, and the worst part is that passing by reference makes absolutely no sense for the types this function is used with most of the time, e.g. ints and floats :(
Are you talking about vectorization? If so, would you mind clarifying why aliasing is a problem when there is no write operation in std::clamp?
There are 2 different issues in play. The 1st is the performance difference between std::clamp and its hand-written version, due to the extra memory loads that happen when std::clamp is used:
std_clamp(float const&, float const&, float const&):
movss xmm1, DWORD PTR [rsi]
maxss xmm1, DWORD PTR [rdi]
movss xmm0, DWORD PTR [rdx]
minss xmm0, xmm1
ret
fast_clamp(float, float, float):
minss xmm2, xmm0
maxss xmm2, xmm1
movaps xmm0, xmm2
ret
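For reference, here is a rough guess at the kind of source that sits behind a listing like the one above. The thread does not show the code that was actually compiled, so both function bodies are assumptions; only the two signatures come from the listing, and the exact instructions you get will depend on compiler and flags.

```cpp
#include <algorithm>

// Assumed wrapper: every argument is a reference, so the callee receives
// three addresses and has to load each float from memory.
float std_clamp(const float& v, const float& lo, const float& hi) {
    return std::clamp(v, lo, hi);
}

// Assumed hand-written version: every argument is passed by value,
// so the floats arrive in registers and no loads are needed.
float fast_clamp(float v, float lo, float hi) {
    return std::max(lo, std::min(v, hi));
}
```

Both return the same result for ordinary inputs with lo <= hi; the difference is only in how the arguments reach the function.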
And the 2nd is what the benchmark measures: the perf difference of processing vectors with these 2 functions. Note that both functions do write the clamped results into the vector.
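To make the write part concrete, here is a minimal sketch of why stores into the vector matter once the bounds are references. It is not the benchmark from this thread: the names clamp_by_ref and clamp_by_val and the separate in/out vectors are made up for illustration. The point is that a `const float&` bound may legally refer to an element of the output, so the compiler cannot assume the bound stays the same across iterations.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical loop with bounds passed by reference. `lo` or `hi` may refer
// to an element of `out`, so each store to out[i] can change a bound and the
// bounds have to be re-read from memory on every iteration.
void clamp_by_ref(const std::vector<float>& in, std::vector<float>& out,
                  const float& lo, const float& hi) {
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::clamp(in[i], lo, hi);
}

// Hypothetical loop with bounds passed by value. The copies cannot be
// touched by the stores, so they can stay in registers for the whole loop.
void clamp_by_val(const std::vector<float>& in, std::vector<float>& out,
                  float lo, float hi) {
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::max(lo, std::min(in[i], hi));
}
```

If a caller passes out[0] as `lo`, the by-reference loop sees the lower bound change after the first store, while the by-value loop keeps using the value copied at the call. The compiler has to keep the by-reference loop correct for that case even if no real caller ever does it.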
Got it, thanks