performance

C++17 in detail: Parallel Algorithms—Bartlomiej Filipek

Let’s see how C++17 can make writing parallel code a bit easier.

C++17 in details: Parallel Algorithms

by Bartlomiej Filipek

From the article:

With C++17 we get a lot of algorithms that can be executed in a parallel/vectorized way. That’s amazing, as it’s a solid abstraction layer. With this making, apps is much easier. A similar thing could be achieved possibly with C++11/14 or third-party APIs, but now it’s all in the standard.

CppCon 2016: Instruction Re-ordering Everywhere: The C++ ‘As-If’ Rule and the Role…—Charles Bay

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

Instruction Re-ordering Everywhere: The C++ 'As-If' Rule and the Role of Sequence

by Charles Bay

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

This is an introductory (i.e., "First Principles") dive into instruction re-ordering (at compile-time, and at run-time) due to conspiring by the compiler and CPU to make most efficient use of execution units and resources within the CPU processor core. Discussion is made of the role for sequence, for tracing of data flows and control flows, how "out-of-order" execution occurs within the compiler and CPU, and why that's a "good thing". The importance of the C++ "As-If" rule that allows these optimizations is explained.

Exploration is made of imperative versus sequential devices, physical versus logical sequences, and the role of the CPU cache line. At the end of this talk, it will be obvious for how and why instruction re-ordering occurs, and the programmer's need to consider logical dependencies (and not instruction order) when defining algorithms.

This talk is ideal for any programmer confused after observing instruction reordering in their running systems, and provides a solid basis to begin reasoning about how to leverage parallelism and be concerned with concurrency.

The reviews at r/cpp_review have begun!

Participate in the first two reviews at r/cpp_review:

The reviews have begun

by Jens Weller

From the article

A few weeks ago I announced a C++ review community, which since then has grown to 250+ members on reddit. There has been great feedback and discussions since then, so that the idea is now ready to be tested.  With August, the first review period has started

CppCon 2016: The Blaze High Performance Math Library—Klaus Iglberger

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

The Blaze High Performance Math Library

by Klaus Iglberger

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

In this presentation we introduce the Blaze C++ math library, a hot contender for the linear algebra performance throne. Blaze is an open-source, high-performance library for dense and sparse arithmetic. It combines elegance and ease of use with HPC-grade performance, making it one of the most intuitive and at the same time fastest C++ math libraries available.

We demonstrate its basic linear algebra functionality by means of several BLAS level 1 to 3 operations and explain why Blaze outperforms even well established linear algebra libraries. Additionally, we present some advanced features that enable users to adapt Blaze to special circumstances: custom data structures, custom operations, and the customizable error reporting mechanism.

C++ Weekly Episode 75: Why You Cannot Move From Const—Jason Turner

Episode 75 of C++ Weekly.

Why You Cannot Move From Const

by Jason Turner

About the show:

You may have noticed that it's possible to use std::move with a const object, but have you stopped to consider what it does? What you think is a move is silently reverting to a copy without your knowing. In this episode Jason explains what is happening and why.

CppCon 2016: Implementing `static` control flow in C++14—Vittorio Romeo

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

Implementing `static` control flow in C++14

by Vittorio Romeo

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

There has always been great interest in imperative compile-time control flow: as an example, consider all the existing `static_if` proposals and the recently accepted `constexpr_if` construct for C++17.

What if you were told that it is actually possible to implement imperative control flow in C++14?

In this tutorial, the implementation and design of a compile-time `static_if` branching construct and of a compile-time `static_for` iteration construct will be shown and analyzed. These constructs will then be compared to traditional solutions and upcoming C++17 features, examining advantages and drawbacks.

itCppCon17: C++ executors to enable heterogeneous computing in tomorrow’s C++ today—Michael Wong

Videos of the Italian C++ Conference 2017 are popping up. Here is the keynote:

C++ executors to enable heterogeneous computing in tomorrow's C++ today

by Michael Wong

Slides

Summary of the talk:

For a long time, C++ has been outpaced by other models such as CUDA, OpenMP, OpenCL, HSA, which has offered Heterogeneous computing capability to enable dispatch to GPU, DSP, FPGA or other accelerators.
SG14 is an ISO C++ SG that works on low-latency, in the areas of Games, Financial, Embedded programming. One of their mandate is support of heterogeneous in Native C++, without having to drop to some other model as is needed today.
This talk will describe the first and possibly the most important step towards supporting heterogeneous computing with a description of C++ executors, an interface between concurrency constructs, and the agents/resources, one of which can be a GPU core, or a SIMD unit.  I have led a small group of experts from Google, Nvidia, Codeplay, and NASDAQ, to define a specification to enable separation of concerns in defining where, when, and how execution is done in service of the constructs in existing C++ standard library, Concurrency, Parallelism, Transactional Memory, and Networking TS.
This talk will help you understand these TSes which currently only works on CPUs and how they will tie together in future to enable execution on accelerators based on C++ models we already have today such as SYCL, HPX, Kokkos, and Raja. This will enable native C++ to be usable on machine learning algorithms on future self-driving cars, or medical devices.

CppCon 2016: Constant Fun—Dietmar Kühl

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

Constant Fun

by Dietmar Kühl

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

This presentation discusses why it is useful to move some of the processing to compile time and shows some applications of doing so. In particular it shows how to create associative containers created at compile time and what is needed from the types involved to make it possible. The presentation also does some analysis to estimate the costs in terms for compile-time and object file size.

Specifically, the presentation discusses:
- implications of static and dynamic initialization – the C++ language rules for implementing constexpr functions and classes supporting constexpr objects.
- differences in error handling with constant expressions.
- sorting sequences at compile time and the needed infrastructure – creating constant associative containers with compile-time and run-time look-up.

CppCon 2016: The C++17 Parallel Algorithms Library and Beyond—Bryce Adelstein Lelbach

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

The C++17 Parallel Algorithms Library and Beyond

by Bryce Adelstein Lelbach

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

One of the major library features in C++17 is a parallel algorithms library (formerly the Parallelism Technical Specification v1). The parallel algorithms library has both parallel versions of the existing algorithms in the standard library and a handful of new algorithms inspired by common patterns from parallel programming (such as std::reduce() and std::transform_reduce()).

We’ll talk about what’s in the parallel algorithms library, and how to utilize it in your code today. Also, we’ll discuss some exciting future developments relating to the parallel algorithms library which are targeted for the second version of the Parallelism Technical Specification – executors, and asynchronous parallel algorithms.

Future Ruminations—Sean Parent

This post is a lengthy answer to a question from Alisdair Meredith via Twitter

Future Ruminations

by Sean Parent

From the article:

The question is regarding the numerous proposals for a better future class template for C++, including the proposal from Felix Petriconi, David Sankel, and myself.

It is a valid question for any endeavor. To answer it, we need to define what we mean by a future so we can place bounds on the solution. We also need to understand the problems that a future is trying to solve, so we can determine if a future is, in fact, a useful construct for solving those problems.

The proposal started with me trying to solve a fairly concrete problem; how to take a large, heavily threaded application, and make it run in a single threaded environment (specifically, compiled to asm.js with the Emscripten compiler) but also be able to scale to devices with many cores. I found the current standard and boost implementation of futures to be lacking. I open sourced my work on a better solution, and discussed this in my Better Code: Concurrency talk. Felix heard my CppCast interview on the topic, and became the primary contributor to the project.