Even though std::vector is not known for its efficiency when it comes to element removal, there is often no reason to switch to a different container. As such, most of us eventually find ourselves having to remove some elements from the vector.
To prove that it’s not a totally made up situation, here is an example from duckdb
It seems like there isn’t much we can do about this - after all erasing is inherently expensive, since it has to shift left all elements after the one we erase. As such, our only hope is to have as few elements after removed one as possible. Obviously, in case of a single element to remove, the number of shifted elements is set in stone, but what if we need to remove a lot of elements?
Say, we have a vector {1,2,3} and we need to remove all elements. How many elements do we have to shift? To remove 1, we’ll need to shift 2 and 3, to remove 2, we’ll need to shift just 3 and nothing has to be moved when removing 3. It’s easy to notice that in order to remove all elements from the vector of size (n + 1), we’ll have to shift (n), (n-1), …, 1 elements, which is n*(n+1)/2.
But what if we remove these elements from the end? To remove 3, there is nothing to shift, to remove 2, since 3 is no longer there, there is nothing to shift. There is also nothing to shift once only 1 is left. This looks a lot more promising, isn’t it?
Let’s see what benchmarks have to say about this
As expect, removing from the end is a lot faster - 8.6X for tested input. Obviously, removing all elements using a single range erase is even more efficient. But in such case it’s better to use a simple clear.
But this range erase is very useful when removing elements based on some condition. We can use std::partition to “split” vector into “good” and “bad” parts and erase the range corresponding to “bad”. This approach minimizes the number of shifts and is very concise. In fact, this idiom, was so popular, but in C++20, there is now std::erase_if
So next time you have to remove elements from the std::vector, remember that there are many ways to go about it, some of which are not that expensive.
Good read. How about thread safety of erased vector?