Speeding up C++ Code with Template Lambdas -- Daniel Lemire
Integer division is one of the most expensive integer operations a C++ program can perform, but when the divisor is known at compile time, the compiler can often replace it with much cheaper instructions. This post explores different approaches (templates, lambda expressions, and template metaprogramming) to speed up division while keeping the code clean and efficient.
From the article:
Let us consider a simple C++ function which divides all values in a range of integers:
    void divide(std::span<int> i, int d) {
        for (auto& value : i) {
            value /= d;
        }
    }

A division between two integers is one of the most expensive operations you can do over integers: it is much slower than a multiplication which is, in turn, more expensive than an addition. If the divisor d is known at compile time, this function can be much faster. For example, if d is 2, the compiler might optimize away the division and use a shift and a few cheap instructions instead. The same is true of all compile-time constants: the compiler can often do better knowing the constant. (See Lemire et al., Integer Division by Constants: Optimal Bounds, 2021)