Avoiding allocations at all costs.

Or spending CPU cycles to prevent allocations.

May 01, 2021

While browsing the Apache Arrow project, I noticed a string replacement call applied to a pattern pat, and started to wonder what happens when pat contains neither % nor _. I expected the replace function to return the original string, but its documentation got me worried:

replace creates a new String, and copies the data from this string slice into it. While doing so, it attempts to find matches of a pattern. If it finds any, it replaces them with the replacement string slice.
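To make the documented behavior concrete, here is a small check (a sketch, not code from Arrow): it calls replace on an input without any % and compares the result's buffer address with the input's. The helper name replace_copies is made up for illustration.

```rust
/// Hypothetical helper: returns true when `replace` produced an equal string
/// that lives in a different buffer than the input.
fn replace_copies(input: &str) -> bool {
    // The input is assumed to contain no `%`, so nothing gets replaced...
    let output: String = input.replace('%', ".*");
    // ...yet the result is an owned String with its own storage.
    output == input && output.as_ptr() != input.as_ptr()
}
```

For a non-empty input without %, this should return true: the contents are identical, but they were copied into a new buffer.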

So let's take a look at the source code.
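What follows is not the verbatim standard library source but a simplified sketch of its general shape as I understand it. The real str::replace is generic over patterns and may differ in detail between Rust versions; replace_sketch is a hypothetical name and only accepts &str patterns.

```rust
/// Simplified stand-in for `str::replace`, restricted to `&str` patterns.
fn replace_sketch(s: &str, from: &str, to: &str) -> String {
    // The result starts as a fresh, empty String; it never borrows from `s`.
    let mut result = String::new();
    let mut last_end = 0;
    for (start, part) in s.match_indices(from) {
        // Copy the text between the previous match and this one,
        // then the replacement itself.
        result.push_str(&s[last_end..start]);
        result.push_str(to);
        last_end = start + part.len();
    }
    // With zero matches `last_end` is still 0, so this copies all of `s`.
    result.push_str(&s[last_end..]);
    result
}
```

The point to notice is the last push_str: when the loop body never runs, the whole input is still copied into the new String.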

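So replace builds a new String no matter what, copying the whole input even when nothing matches; for any non-empty input that means a heap allocation. What does it mean in practice? Let's use two benchmarks to compare the performance of a replacement without a match against a plain check for a match.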
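The comparison can be approximated with a crude timing loop like the one below. This is a sketch under assumptions, not the benchmark behind the numbers in this post: time_it, replace_no_match, has_match and compare are hypothetical names, and a proper harness would give far more reliable results than Instant.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times `iters` calls of `f`. A crude stand-in for a real benchmark harness.
fn time_it<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box discourages the optimizer from discarding the call.
        black_box(f());
    }
    start.elapsed()
}

/// Replacement on an input assumed to contain no `%`: allocates and copies.
fn replace_no_match(input: &str) -> String {
    input.replace('%', ".*")
}

/// Plain check for `%`: scans the input without allocating.
fn has_match(input: &str) -> bool {
    input.contains('%')
}

/// Returns (time spent replacing, time spent checking) for the same input.
fn compare(input: &str, iters: u32) -> (Duration, Duration) {
    let replace_time = time_it(iters, || replace_no_match(black_box(input)));
    let check_time = time_it(iters, || has_match(black_box(input)));
    (replace_time, check_time)
}
```

Calling compare with a wildcard-free input and a large iteration count gives a rough feel for the gap; the exact ratio will depend on input length, allocator and machine.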

The numbers are roughly what I expected: the check is ~5x faster than a replacement without a match.

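Go's strings.Replace takes this into account and performs a number of checks to avoid allocations: it returns the input string unchanged when old and new are equal, when the replacement count n is 0, or when old does not occur in the input.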

Arguably, allocations are more expensive in managed languages like Go since, in addition to the allocation cost, they contribute to GC pressure. As such, it's not surprising to see this level of attention to memory in Go's codebase. At the same time, depending on how common the no-match scenario is, it may be worthwhile to handle it separately in Rust.
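One way to handle the no-match case separately in Rust is to return a Cow<str>, which can either borrow the input or own a new String. The following is a minimal sketch; replace_cow is a hypothetical helper, not something the standard library or Arrow provides.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces `from` with `to`, but borrows the input
/// instead of copying it when there is nothing to replace.
fn replace_cow<'a>(input: &'a str, from: &str, to: &str) -> Cow<'a, str> {
    if input.contains(from) {
        // At least one match: pay for the allocation.
        Cow::Owned(input.replace(from, to))
    } else {
        // No match: hand back the original slice, no allocation.
        Cow::Borrowed(input)
    }
}
```

The trade-off is the one from the subtitle: when there is a match, the input is scanned twice (once by contains, once by replace), so this only pays off if the no-match case is common enough.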
