Sometimes I wander into wonderful rabbit holes.
This week's adventure: "What's the fastest way to multiply two 32-bit numbers on an ARM Cortex-M0 CPU?"
If you're thinking Karatsuba multiplication (https://en.wikipedia.org/wiki/Karatsuba_algorithm), you’re wrong.
While Karatsuba offers a lower asymptotic complexity (around O(n^1.58)), it only pays off when used recursively on larger integers.
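On the Cortex-M0, Karatsuba runs much slower than the plain naïve multiply. With only one level of splitting, the single multiplication it saves has to pay for extra additions, subtractions, and carry handling. It turns out simplicity wins at this scale.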
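To make the comparison concrete, here is a rough C sketch of the two approaches. It is not the code behind the Godbolt link. It assumes the goal is the full 64-bit product built from 16-bit halves, since the Cortex-M0's MULS instruction only returns the low 32 bits of a product. The function names mul_schoolbook and mul_karatsuba are made up for illustration.

```c
#include <stdint.h>

/* Naive (schoolbook) 32x32 -> 64 multiply: four 16x16 partial products. */
uint64_t mul_schoolbook(uint32_t a, uint32_t b)
{
    uint32_t a0 = a & 0xFFFFu, a1 = a >> 16;   /* low and high halves of a */
    uint32_t b0 = b & 0xFFFFu, b1 = b >> 16;   /* low and high halves of b */

    uint32_t lo  = a0 * b0;                    /* each product fits in 32 bits */
    uint32_t hi  = a1 * b1;
    uint64_t mid = (uint64_t)(a0 * b1) + (a1 * b0);  /* sum may need 33 bits */

    return ((uint64_t)hi << 32) + (mid << 16) + lo;
}

/* One-level Karatsuba: three multiplies, but more adds and a wider middle term. */
uint64_t mul_karatsuba(uint32_t a, uint32_t b)
{
    uint32_t a0 = a & 0xFFFFu, a1 = a >> 16;
    uint32_t b0 = b & 0xFFFFu, b1 = b >> 16;

    uint32_t z0 = a0 * b0;
    uint32_t z2 = a1 * b1;
    /* (a0 + a1) and (b0 + b1) can be 17 bits each, so this product can
       exceed 32 bits; it is computed in 64 bits here to stay correct. */
    uint64_t z1 = (uint64_t)(a0 + a1) * (b0 + b1) - z0 - z2;

    return ((uint64_t)z2 << 32) + (z1 << 16) + z0;
}
```

Both functions return the same value for any inputs. The catch in the Karatsuba version is the middle term: the 17-bit sums no longer fit a single 32-bit multiply, so the "saved" multiplication comes back as extra work on a 32-bit core.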
Godbolt Source:
https://godbolt.org/z/4rP7bnv6r