Python is a programming language known for its simplicity and versatility. It is widely used for web development, data science, machine learning and scripting, thanks in large part to a vast number of third-party libraries. It’s also known for its fairly poor performance, which is why most compute-heavy libraries are powered by C/C++.
But what if Python is a perfect fit for our project, but there is just one algorithm that would benefit from some C++ love? It’s not too hard to write native extensions for Python or use something like Cython, but doing so certainly complicates project setup and deployment. Today, we’ll take a look at an interesting alternative: cppyy.
To make things concrete, we’ll implement a simple compute-heavy algorithm: finding the longest common subsequence. In C++, a naive implementation can look like
which is not much longer than Python’s
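As a minimal sketch of the Python side, here is one straightforward dynamic-programming version that returns the length of the longest common subsequence. The function name `lcs_length` and the row-by-row layout are assumptions made for illustration, so the details may differ from the implementation benchmarked in this post.

```python
def lcs_length(a: str, b: str) -> int:
    """Return the length of the longest common subsequence of a and b."""
    # prev[j] is the LCS length of the already processed prefix of `a`
    # and the first j characters of `b`.
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]  # LCS with an empty prefix of `b` is always 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either `a` or `b`.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

The two nested loops do O(len(a) * len(b)) work in pure Python, which is exactly the kind of tight inner loop where interpreter overhead dominates.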
With cppyy all it takes to integrate it is
That’s right - it’s just a matter of passing our C++ code snippet to cppyy.cppdef. Now we can call it
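To sketch what this integration can look like end to end: the C++ source lives in a Python string, `cppyy.cppdef` compiles it, and the function becomes reachable through `cppyy.gbl`. The names `CPP_SOURCE`, `lcs_length_cpp` and `load_cpp_lcs` are hypothetical, and the C++ body is an assumed dynamic-programming version rather than the exact code from this post.

```python
import functools

# C++ source handed to cppyy as a plain string.
# `lcs_length_cpp` is a hypothetical name chosen for this sketch.
CPP_SOURCE = r"""
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

int lcs_length_cpp(const std::string& a, const std::string& b) {
    // Two rolling rows of the DP table; index 0 stays 0 in both.
    std::vector<int> prev(b.size() + 1, 0), curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            curr[j] = (a[i - 1] == b[j - 1])
                ? prev[j - 1] + 1
                : std::max(prev[j], curr[j - 1]);
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}
"""


@functools.lru_cache(maxsize=None)
def load_cpp_lcs():
    """Compile CPP_SOURCE once and return the callable C++ function."""
    import cppyy  # third-party: pip install cppyy

    # Defining the same function twice is a C++ redefinition error,
    # hence the lru_cache guard around this call.
    cppyy.cppdef(CPP_SOURCE)
    return cppyy.gbl.lcs_length_cpp
```

With cppyy installed, `load_cpp_lcs()("AGGTAB", "GXTXAYB")` should return 4, since Python `str` arguments are converted to `std::string` at the call boundary.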
And now it’s time to find out if this “effort” was worth it.
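One way to run such a comparison is a small `timeit` harness like the sketch below. The helper names `random_string` and `best_time`, the four-letter alphabet and the repeat counts are all assumptions for illustration, not the setup used to produce the numbers in this post.

```python
import random
import timeit


def random_string(n, alphabet="ACGT", seed=0):
    """Build a reproducible random string of length n."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(n))


def best_time(fn, a, b, repeat=5, number=1):
    """Return the best observed time in seconds for one call of fn(a, b)."""
    # Warm-up call, so any one-time cost on first use is not counted.
    fn(a, b)
    timings = timeit.repeat(lambda: fn(a, b), repeat=repeat, number=number)
    # The minimum is the least noisy estimate of the achievable time.
    return min(timings) / number
```

Calling `best_time` once with the Python implementation and once with the cppyy-backed one, on the same pair of strings, gives two numbers whose ratio is the speedup.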
Almost a 50x speedup! Given that it’s just a matter of adding cppyy to a requirements list and writing a few lines of C++, I’d say that in this particular case it was definitely worth it.
As always, before adopting cppyy, benchmark its performance on representative inputs. Crossing runtime boundaries may involve type conversions and other kinds of overhead that, for cheap functions, can outweigh the speedup.
Feel free to explore these examples using Colab.
Does/will it support C++ execution policies? Parallel algorithms would be nice for otherwise concurrent Python. Sort of what Python ML libraries do.
Is there an option to precompile? I'm assuming the huge deviation for the cpp version was caused by the first iteration compiling the c++.