CppCon Program Highlights, 13 of N: Performance, Performance, and Parallel Performance : Standard C++

The CppCon 2014 conference program has been posted for the upcoming September conference. We've received requests that the program continue to be posted in "bite-sized" posts, a few sessions at a time, to make the 100+ sessions easier to absorb, so here is another set of talks. This series of posts will conclude once the entire conference program has been posted in this way.

Many CppCon talks touch on efficiency and performance optimization, with good reason. Here are five specific talks that focus on this topic, from algorithmic efficiency to full-throttle performance on modern parallel hardware, and how support for the latest mainstream architectures is coming quickly to a standard near you. As always, these talks are by Those Who Know and Those Who Do -- several of these speakers are personally leading the work on high performance and parallelism at major C++ compiler companies and in the C++ standards process.

In this post:

Efficiency with Algorithms, Performance with Data Structures
It's About Time
Parallelism in the Standard C++: What to Expect in C++ 17
Overview of Parallel Programming in C++
Decomposing a Problem for Parallel Execution

Efficiency with Algorithms, Performance with Data Structures

Why do you write C++ code? There is a good chance it is in part because of concerns about the performance of your software. Whether they stem from needing to run on every smaller mobile devices, squeezing the last few effects into video game, or because every watt of power in your data center costs too much, C++ programmers throughout the industry have an insatiable desire for writing high performance code.

Unfortunately, even with C++, this can be really challenging. Over the past twenty years processors, memory, software libraries, and even compilers have radically changed what makes C++ code fast. Even measuring the performance of your code can be a daunting task. This talk will dig into how modern processors work, what makes them fast, and how to exploit them effectively with modern C++ code. It will teach you how modern C++ optimizers see your code today, and how that is likely to change in the coming years. It will teach you how to reason better about the performance of your code, and how to write your code so that it performs better. You will even learn some tricks about how to measure the performance of your code.

Speaker: Chandler Carruth, Google and Member of Board of Directors, Standard C++ Foundation. Chandler Carruth leads the Clang team at Google, building better diagnostics, tools, and more. Previously, he worked on several pieces of Google’s distributed build system. He makes guest appearances helping to maintain a few core C++ libraries across Google’s codebase, and is active in the LLVM and Clang open source communities. He received his M.S. and B.S. in Computer Science from Wake Forest University, but disavows all knowledge of the contents of his Master’s thesis. He is regularly found drinking Cherry Coke Zero in the daytime and pontificating over a single malt scotch in the evening.

It's About Time

This session will build up your mental model about time at the very small scale.

We'll examine progressively smaller units of time and what modern computers can accomplish in these units of time. We'll put these in perspective with human factors on these scales. For example, we'll consider how fast a computer must respond in order for a human to consider it "instantaneous."

Then we'll build up our mental model of how expensive it is for us to use specific data types, containers, and operations in our code.

We'll close by looking at why and how we should measure, always measure.

Speaker: Jon Kalb. Jon has been programming in C++ for over twenty years. During the last two decades he has written C++ for Apple, Dow Chemical, Intuit, Lotus, Microsoft, Netscape, Sun, Yahoo! and some less well‐known companies. He taught C++ in the graduate school at Golden Gate University for three years and is a founding moderator of the Boost‐User and Boost‐Interest mailing lists. Jon is active in the Silicon Valley chapter of the ACCU and programs the C++ track at the Silicon Valley Code Camp. Jon blogs at http://slashshlash.info@_JonKalb JonKalb on G+ LinkedIn

Parallelism in the Standard C++: What to Expect in C++ 17

It is 2014 and parallel programming has entered the mainstream. No longer is it the domain of the few highly trained experts. The tools available in the C++ today make parallelism accessible - if not yet easy - to average developers.

However, writing efficient cross-platform parallel code in C++ is still hard. The standard constructs available in C++ 11/14 are too basic and too low-level. More advanced tools exist, but most are either vendor-specific or don't work on all platforms.

In this presentation, we'll talk about the joint effort spearheaded by several members of the ISO C++ Committee to bring parallelism into the C++ Standard Template Library. The project known as the "Parallel STL" aims to bring muliticore and SIMD parallelism into the next revision of the ISO C++ Standard.

Speaker: Artur Laksberg, Microsoft. Artur Laksberg leads the Visual C++ Libraries development team at Microsoft. His interests include concurrency, programming language and library design, and modern C++. Artur is one of the co-authors of the Parallel STL proposal; his team is now working on the prototype implementation of the proposal.

Overview of Parallel Programming in C++

Parallel programming was once considered to be the exclusive realm of weather forecasters and particle physicists working on multi-million dollar super computers while the rest us relied on chip manufacturers to crank out faster CPUs every year. That era has come to an end. Clock speedups have been largely replaced by having more CPUs on a chip. Your typical smart phone now has 2 to 4 cores and your typical laptop or tablet has 4 to 8 cores. Servers have dozens of cores and supercomputers have thousands of cores.

If you want to speed up a computation on modern hardware, you need to take advantage of the multiple cores available. This talk is provides an overview of the parallelism landscape. We'll explore the what, why, and how of parallel programming, discuss the distinction between parallelism and concurrency and how they overlap, and learn about the problems that one runs into. We'll conclude with an overview of existing parallelism technologies in C++ and the future directions being considered for parallel programming in standard C++.

Decomposing a Problem for Parallel Execution

So you want to speed up your computation using multicore parallel execution and you've picked a parallelism framework. What now? Parallelism frameworks give you the tools you need, but they don't actually parallelize the code; that's your job. To take advantage of parallel hardware, you must decompose your computation into tasks that can be computed in parallel. In this session, I'll present a real-world problem (the n-bodies problem) and guide you through several different ways in which it can be decomposed for parallel execution. We'll look at how to achieve scalability, resolve data races, and avoid negative multi-core cache effects. At the end of this session, you should have a conceptual understanding of parallel programming fundamentals that can be applied to a wide range of problems using a variety of frameworks.

Speaker: Pablo Halpern, Parallel Programming Languages Architect, Intel. Pablo Halpern has been programming in C++ since 1989 and has been a member of the C++ Standards Committee since 2007. He is currently the Parallel Programming Languages Architect at Intel Corp., where he coordinates the efforts of teams working on Cilk Plus, TBB, OpenMP, and other parallelism languages, frameworks, and tools targeted to C++, C, and Fortran users. Pablo came to Intel from Cilk Arts, Inc., which was acquired by Intel in 2009. During his time at Cilk Arts, he co-authored the paper "Reducers and other Cilk++ Hyperobjects", which won best paper at the SPAA 2009 conference. His current work is focused on creating simpler and more powerful parallel programming languages and tools for Intel's customers and promoting adoption of parallel constructs into the C++ and C standards. He lives with his family in southern New Hampshire, USA. When not working on parallel programming, he enjoys studying the viola, skiing, snowboarding, and watching opera. Twitter handle: @PabloGHalpern