CppCon 25 Matrix Multiplication Deep Dive || Cache Blocking, SIMD & Parallelization -- Aliaksei Sala : Standard C++

Registration is now open for CppCon 2026! The conference starts on September 12 and will be held in person in Aurora, CO. To whet your appetite for this year’s conference, we’re posting videos of some of the top-rated talks from last year's conference. Here’s another CppCon talk video we hope you will enjoy – and why not register today for CppCon 2026!

Matrix Multiplication Deep Dive || Cache Blocking, SIMD & Parallelization

by Aliaksei Sala

Summary of the talk:

Matrix multiplication is a fundamental operation in scientific computing, game development, AI, and numerous high-performance applications. While its mathematical definition is simple, achieving optimal performance in C++ is far from trivial.

In this talk, we will explore different optimization techniques for matrix multiplication, from naive implementations to highly tuned versions leveraging modern hardware features. We will cover key performance-enhancing strategies such as loop unrolling, cache blocking, SIMD vectorization, parallelization using threads, and more. Through benchmarking and profiling, we will measure the real impact of these optimizations.
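To give a feel for one of the techniques named above, here is a minimal sketch of cache blocking next to a naive triple loop. It is not taken from the talk: the `Matrix` alias, the function names `multiply_naive` and `multiply_blocked`, the row-major square layout, and the default block size of 64 are all assumptions chosen for illustration, and a good block size in practice depends on the target cache hierarchy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: an n x n matrix stored row-major in a flat vector.
using Matrix = std::vector<double>;

// Naive version: for each output element, walk a row of A and a column of B.
// The column walk over B strides through memory, which is unfriendly to caches.
Matrix multiply_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Cache-blocked (tiled) version: the three outer loops pick a tile of A, B and C,
// and the three inner loops do the multiplication restricted to that tile, so the
// working set stays small enough to be reused while it is still in cache.
// The block size of 64 is an illustrative assumption, not a tuned value.
Matrix multiply_blocked(const Matrix& a, const Matrix& b, std::size_t n,
                        std::size_t block = 64) {
    assert(a.size() == n * n && b.size() == n * n);
    assert(block > 0);
    Matrix c(n * n, 0.0);
    for (std::size_t ii = 0; ii < n; ii += block) {
        const std::size_t i_end = std::min(ii + block, n);
        for (std::size_t kk = 0; kk < n; kk += block) {
            const std::size_t k_end = std::min(kk + block, n);
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t j_end = std::min(jj + block, n);
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a_ik = a[i * n + k];
                        // Innermost loop runs along contiguous rows of B and C.
                        for (std::size_t j = jj; j < j_end; ++j) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
    return c;
}
```

Both functions compute the same product; the blocked one only reorders the work. Calling `multiply_blocked(a, b, n, 32)` with different block sizes and timing it against `multiply_naive(a, b, n)` is one simple way to see how much the tile size matters on a given machine.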

By the end of this session, attendees will gain insights into two critical questions:

How hard is it to implement an optimized matrix multiplication in C++?

How effective is C++ for achieving peak performance in this task?

This talk is suitable for developers interested in performance optimization, computational efficiency, and modern C++ techniques for numerical computing.
