In previous article we’ve managed to speed up number encoding in rustc by getting rid of reverse invocation and ended up with
Attentive readers must have noticed the .unwrap() call on the result of from_utf8. It’s required because from_utf8 returns Err
if the slice is not UTF-8 with a description as to why the provided slice is not UTF-8. It’s pretty obvious that in order to decide if error should be returned, from_utf8 has to validate provided slice, which is not free.
Fair enough, but is it possible for the slice to ever contain non UTF-8 characters? All characters come from BASE_64 array and all of its characters are valid UTF-8 characters. In such case, why not skip the validation?
Of course, we leave a safety comment explaining why this unsafe block is actually safe and use the following benchmark to see if it makes any difference
use criterion::{criterion_group, criterion_main, Criterion}; | |
use std::str; | |
pub const MAX_BASE: usize = 64; | |
pub const ALPHANUMERIC_ONLY: usize = 62; | |
pub const CASE_INSENSITIVE: usize = 36; | |
const BASE_64: &[u8; MAX_BASE] = | |
b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@$"; | |
#[inline] | |
pub fn push_str(mut n: u128, base: usize, output: &mut String) { | |
debug_assert!(base >= 2 && base <= MAX_BASE); | |
let mut s = [0u8; 128]; | |
let mut index = 0; | |
let base = base as u128; | |
loop { | |
s[index] = BASE_64[(n % base) as usize]; | |
index += 1; | |
n /= base; | |
if n == 0 { | |
break; | |
} | |
} | |
s[0..index].reverse(); | |
output.push_str(str::from_utf8(&s[0..index]).unwrap()); | |
} | |
#[inline] | |
pub fn push_str2(mut n: u128, base: usize, output: &mut String) { | |
debug_assert!(base >= 2 && base <= MAX_BASE); | |
let mut s = [0u8; 128]; | |
let mut index = s.len(); | |
let base = base as u128; | |
loop { | |
index -= 1; | |
s[index] = BASE_64[(n % base) as usize]; | |
n /= base; | |
if n == 0 { | |
break; | |
} | |
} | |
output.push_str(unsafe { | |
// SAFETY: `s` is populated using only valid utf8 characters from `BASE_64` | |
str::from_utf8_unchecked(&s[index..]) | |
}); | |
} | |
fn bench_push_str(c: &mut Criterion) { | |
let mut group = c.benchmark_group("push_str"); | |
let xs = [0, 1, 35, 36, 37, u64::MAX as u128, u128::MAX]; | |
group.bench_function("old", |b| { | |
b.iter(|| { | |
let mut total = 0; | |
let mut s = String::new(); | |
for base in 2..37 { | |
for n in xs { | |
s.clear(); | |
push_str(n, base, &mut s); | |
total += s.len(); | |
} | |
} | |
total | |
}) | |
}); | |
group.bench_function("new", |b| { | |
b.iter(|| { | |
let mut total = 0; | |
let mut s = String::new(); | |
for base in 2..37 { | |
for n in xs { | |
s.clear(); | |
push_str2(n, base, &mut s); | |
total += s.len(); | |
} | |
} | |
total | |
}) | |
}); | |
group.finish(); | |
} | |
criterion_group!(benches, bench_push_str); | |
criterion_main!(benches); |
The results on M1 macbook air:
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old time: [14.670 µs 14.852 µs 15.074 µs]
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) high mild
7 (7.00%) high severe
push_str/new time: [12.573 µs 12.674 µs 12.801 µs]
Found 11 outliers among 100 measurements (11.00%)
7 (7.00%) high mild
4 (4.00%) high severe
Not bad for a single liner? Compilers have a difficult job of optimizing code without affecting correctness. Sometimes we have knowledge that is outside of compiler’s jurisdiction and can help it by taking responsibility on maintaining some of the invariants on our own. When making a decision to do this, don’t forget to consider the cost of maintenance - when assumptions or code change, someone has to go through manually maintained invariants to make sure they are still up-to-date. In this particular case, the change is fairly localized and shouldn’t be hard to maintain, but you can see that reviewers are not rushing to accept it.