performance

Quick Q: Why doesn't c++ have a specified order for evaluating function arguments?

Quick A: In order to get the best performance.

recently on SO:

Why doesn't c++ have a specified order for evaluating function arguments?

C++ leaves as much as practical up to the implementation of the compiler, especially where optimization opportunities might occur. Before fixing something like this, existing compiler writers are consulted to see if there is a cost.

Some order of evaluation guarantees surrounding overloaded operators and complete-argument rules where added in C++17. But it remains that which argument goes first is left unspecified. In C++17, it is now specified that the expression giving what to call (the code on the left of the ( of the function call) goes before the arguments, and whichever argument is evaluated first is evaluated fully before the next one is started, and in the case of an object method the value of the object is evaluated before the arguments to the method are. (There may be some minor errors in this description: ask a narrow question about it for a more vetted answer).

These where vetted and determined to not cause significant prolems for existing compilers. Reordering of arguments was considered, and discarded, presumably for good reasons. Or even poor reasons, like existing compilers have a default order and it could break (possibly non-compliant) code on that compiler to force them all to do it in one global order.

In short, C++ leaves lots of freedom to compiler writers. This has let compiler writers find optimization opportunities in strange crannies. The kind of optimizations available today are way beyond what the original writers of C++, let alone C++, may have suspected where practical or be considered reasonable.

CppCon 2015 C++ Atomics: The Sad Story of memory_order_consume--Paul E. McKenney

Have you registered for CppCon 2016 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2015 for you to enjoy. Here is today’s feature:

C++ Atomics: The Sad Story of memory_order_consume: A Happy Ending At Last?

by Paul E. McKenney

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

One of the big advantages of C++ atomics is that developers can now let the compiler know about the intent behind their multi-threaded synchronization mechanisms. At least they can tell the compiler as long as the synchronization mechanism in question is not RCU. You see, all production compilers promote RCU's memory_order_consume to memory_order_acquire. Although this promotion does ensure correctness, it also ensures the additional overhead of memory-barrier instructions on weakly ordered systems and of needlessly suppressed compiler optimizations on all systems.

All previous attempts to resolve this issue have foundered on either standard-committee reluctance to eviscerate the standard for a special case, compiler-writer reluctance to eviscerate their compilers for a special case, and kernel-developers reluctance to eviscerate their source base for late-to-the-party compiler support.

But now there is a glimmer of hope in the guise of a small set of small patches to the Linux kernel that eliminate the most challenging use cases. Will this hope be realized? Come to this talk to here the story, which by September will hopefully have a happy ending!

CppCon 2015 Concurrency TS Editor's Report--Artur Laksberg

Have you registered for CppCon 2016 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2015 for you to enjoy. Here is today’s feature:

Concurrency TS Editor's Report

by Artur Laksberg

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

In this presentation we will talk about the new C++ concurrency features that have been included in the Concurrency Technical Specification.

The TS should be of interest to anyone writing concurrent code in C++. The proposal includes improved futures for wait-free composition of asynchronous operations (including their relationship with C++ 'await'), new synchronization constructs as well as atomic smart pointers.

C++/Graphics Workshop--Stephanie Hurlburt

Are you interested?

C++/Graphics Workshop

by Stephanie Hurlburt

Description of the event:

Ever been curious about C++ and graphics programming, but not sure where to start?
Maybe you are an artist who'd like to build your own tools. Maybe you're a game developer wishing your games would run faster, or have even better graphics effects. Regardless, knowledge of the way graphics work at a low level is an empowering skill.
We'll be covering real-time graphics with C++/OpenGL as well as raytracing. It'll be aimed at beginners, but everyone is welcome.
This'll be an intimate workshop, meant for around 20 people. We'll give a talk and then walk you through some hands-on examples. Be sure to bring a laptop if you can!

CppCon 2015 Parallelizing the C++ Standard Template Library--Grant Mercer & Danial Bourgeois

Have you registered for CppCon 2016 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2015 for you to enjoy. Here is today’s feature:

Parallelizing the C++ Standard Template Library

by Grant Mercer & Danial Bourgeois

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

As the era of frequency scaling comes to an end, multi-core parallelism has become an essential focal point in computational research. Mainstream languages, however, have not yet adapted to take full advantage of parallelism provided by the hardware. While new languages such as Rust and Swift are catching on and implementing multi-core algorithms in their libraries, C++ has only started to do so. A parallel Standard Library could bring with it many positive features that users can begin taking advantage of.

This talk will focus around two standards proposals, N4409 and N4406. N4409 outlines the details of a parallel Standard Library and features of these new parallel algorithms. The complementary N4406 outlines abstractions to take advantage of various mechanisms for parallel execution. We will cover the reasons why the new Standard Library would be beneficial to C++ users and our experience implementing these algorithms in HPX. The presentation will address what exactly the two proposals define, the challenges we faced, and the results we collected. In addition, we will discuss extensions made to these proposals and the C++11/14 standard in HPX to support these semantics in a distributed environment.

Zeroing Memory is Hard (VC++ 2015 arrays)--Bruce Dawson

Here is a curious behaviour:

Zeroing Memory is Hard (VC++ 2015 arrays)

by Bruce Dawson

From the article:

Quick, what’s the difference between these two C/C++ definitions of initialized local variables?

char buffer[32] = { 0 };
char buffer[32] = {};

One difference is that the first is legal in C and C++, whereas the second is only legal in C++.

Okay, so let’s focus our attention on C++. What do these two definitions mean?

HPX version 0.9.99 released -- STE||AR Group

The STE||AR Group has released V0.9.99 of HPX -- A general purpose parallel C++ runtime system for applications of any scale.

HPX V0.9.99 Released

The newest version of HPX (V0.9.99) is now available for download! Please see here for the release notes.

HPX exposes an API fully conforming to the concurrency related parts of the C++11/C++14/C++17 standards, extended and applied to distributed and heterogeneous computing, and aligned with the ongoing standardization discussions.

From the announcement:

  • Version 1.0 approaches! This release is significant as HPX is nearly feature complete. Over the next several months we will continue to test and polish HPX’s API and documentation. The feedback and experience gained from the community’s utilization of this release will provide guidance on where to add the finishing touches for V1.0.
  • We have put a lot of effort into improving the overall performance and stability of the library. Applications written using HPX have shown to outperform equivalent applications which are based on more conventional parallelization methods.
  • In this release we finished the implementation of transparent migration of components to another locality (e.g. physical compute-node). It is now possible to trigger a migration operation without ‘stopping the world’ for the object to migrate. HPX will make sure that no work is being executed on the object while it is moved and that all subsequently scheduled work for the migrated object will be transparently forwarded to the new locality. The global address of the migrated object does not change, thus the application will not have to be changed in any way to support this new functionality.

 

CppCon 2015 completion T : Improving the future T with monads--Travis Gockel

Have you registered for CppCon 2016 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2015 for you to enjoy. Here is today’s feature:

completion T : Improving the future T with monads

by Travis Gockel

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

std::future provides us a mechanism for asynchronous communication between a provider and receiver. However, the C++14 standard does not allow for actual asynchronous programming, as the only ways to interact with an std::future are blocking calls. The proposed then helps, but the interface is awkward and can be extremely slow when handling exceptions. Here, I will talk about completion a high-performance, async-only and monadic alternative to std::future and how it is used at SolidFire.

CppCon 2015 Executors for C++ - A Long Story ...--Detlef Vollmann

Have you registered for CppCon 2016 in September? Don’t delay – Early Bird registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2015 for you to enjoy. Here is today’s feature:

Executors for C++ - A Long Story ...

by Detlef Vollmann

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

Executors will be a base building block in C++ for asynchronous, concurrent and parallel work. The job of an executor is simple: run the tasks that are posted. So the first proposals for executors in C++ had a very simple interface. However, being a building block, the executor should provide an interface that's useful for all kind of higher level abstractions and needs to work together with different types of concurrency, like co-operative multi-tasking or GPU like hardware. This presentation will look at the evolution of the executor proposals for C++ and what they'll provide for normal application programmers.