One of the unfortunate consequences of C-string’s design that relies on null terminators instead of storing explicit size. Countless buffer overflows and memory corruptions incidents focus on the security aspects of this problem, but today we’ll look at some performance implications. It’s easy to see that when comparing for string equality, knowing string sizes helps to skip linear traversal in case they are different but that’s not all. In C string comparisons can be peformed using strcmp
Code snippet above from glibc library does exactly what we’d expect - looks for the first pair of characters that are different and returns their difference. Obviously this code assumes that passed strings are null-terminated and its behavior is undefined otherwise. Oftentimes, at least one of the strings is known at compile-time, so we can bound the number of traversed charecters. strncmp accepts a third argument that indicates the maximum number of characters that are used for comparison. It’s easy to extend strcmp implementation to take into account the maximum number, but interestingly glibc’s implementation of strncmp looks a little different
It does loop unrolling to handle comparisons in blocks of 4. It seems like a reasonable thing to do to reduce the number of n > 0 checks but since it does strictly more work than strcmp it seems like it should be slower, right?
Let’s find out.
#include <string.h> | |
/* Compare no more than N characters of S1 and S2, | |
returning less than, equal to or greater than zero | |
if S1 is lexicographically less than, equal to or | |
greater than S2. */ | |
int | |
STRNCMP (const char *s1, const char *s2, size_t n) | |
{ | |
unsigned char c1 = '\0'; | |
unsigned char c2 = '\0'; | |
if (n >= 4) | |
{ | |
size_t n4 = n >> 2; | |
do | |
{ | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0' || c1 != c2) | |
return c1 - c2; | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0' || c1 != c2) | |
return c1 - c2; | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0' || c1 != c2) | |
return c1 - c2; | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0' || c1 != c2) | |
return c1 - c2; | |
} while (--n4 > 0); | |
n &= 3; | |
} | |
while (n > 0) | |
{ | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0' || c1 != c2) | |
return c1 - c2; | |
n--; | |
} | |
return c1 - c2; | |
} | |
/* Compare S1 and S2, returning less than, equal to or | |
greater than zero if S1 is lexicographically less than, | |
equal to or greater than S2. */ | |
int | |
STRCMP (const char *p1, const char *p2) | |
{ | |
const unsigned char *s1 = (const unsigned char *) p1; | |
const unsigned char *s2 = (const unsigned char *) p2; | |
unsigned char c1, c2; | |
do | |
{ | |
c1 = (unsigned char) *s1++; | |
c2 = (unsigned char) *s2++; | |
if (c1 == '\0') | |
return c1 - c2; | |
} | |
while (c1 == c2); | |
return c1 - c2; | |
} | |
char * name = "perf"; | |
static void BH_strncmp(benchmark::State& state) { | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(STRNCMP(name, "perf", 5)); | |
} | |
} | |
BENCHMARK(BH_strncmp); | |
static void BH_strcmp(benchmark::State& state) { | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(STRCMP(name, "perf")); | |
} | |
} | |
BENCHMARK(BH_strcmp); |
And the results are
2.2X faster! Don’t forget that the numbers can be different depending on the inputs, compilers and standard library implementations, so don’t take my word for it and run your own benchmarks. My main point point is that since strncmp is a little more safe than strcmp performance shouldn’t be a reason you prefer strcmp over strncmp.