performance

Both keynotes from Meeting C++ 2015 are now online!

See Chandler Carruth and Lars Knoll giving the keynotes at Meeting C++ this year:

Both Keynotes from Meeting C++ 2015 are online!

by Jens Weller

From the article:

Great news: Since yesterday, both of the keynotes from this years Meeting C++ conference are on youtube! Both keynote speakers chose to speak on a specific topic, and delivered very well. There is also a playlist for Meeting C++ 2015.

All Meeting C++ Lightning Talk videos are online

Meeting C++ just started a week ago, and I already managed to edit and upload all lightning talks:

Meeting C++ 2015 - all lightning talks are now online at youtube

by Jens Weller

From the article:

This year for the very first time we had lightning talks at the Meeting C++ conference. Two sessions with each 5 lightning talks were held...

HPX version 0.9.11 released -- STE||AR Group

The STE||AR Group has released V0.9.11 of HPX -- A general purpose parallel C++ runtime system for applications of any scale.

HPX V0.9.11 Released

The newest version of HPX (V0.9.11) is now available for download! Please see here for the release notes.

HPX exposes an API fully conforming to the concurrency related parts of the C++11 and C++14 standards, extended and applied to distributed computing.

From the announcement:

  • In this release our team has focused on developing higher level C++ programming interfaces which simplify the use of HPX in applications and ensure their portability in terms of code and performance. We paid particular attention to align all of these changes with the existing C++ Standard or with the ongoing standardization work. Other major features include the introduction of executors and various policies which enable customizing the ‘where’ and ‘when’ of task and data placement.
  • This release consolidates many of the APIs exposed by HPX. We introduced a new uniform way of creating (local and remote) objects, we added distribution policies allowing to manage and customize data placement and migration, we unified the way various types of parallelism are made available to the user.

Random Acts of Optimization --Tony Albrecht

Discussion regarding systematic approach to go about optimization of logic.

Random Acts of Optimization

by Tony Albrecht

From the article:

The three stages mentioned here, while seemingly obvious, are all too often overlooked when programmers seek to optimize. Just to reiterate:

    1. Identification: profile the application and identify the worst performing parts.
    2. Comprehension: understand what the code is trying to achieve and why it is slow.
    3. Iteration: change the code based on step 2 and then re-profile. Repeat until fast enough.

The solution above is not the fastest possible version, but it is a step in the right direction—the safest path to performance gains is via iterative improvements.

Do You Prefer Fast or Precise?--Jim Hogg

A nice article explaining the troubles of float numbers, and what effects it can have. It is talking in the case of Visual C++, but the problems are the same for other compilers.

Do You Prefer Fast or Precise?

by Jim Hogg

From the article:

Floating Point Basics

In C++, a float can store a value in the 3 (approximate) disjoint ranges { [-E+38, -E-38], 0, [E-38, E+38] }. Each float consumes 32 bits of memory. In this limited space, a float can only store approximately 4 billion different values. It does this in a cunning way, where adjacent values for small numbers lie close together; while adjacent values for big numbers lie far apart. You can count on each float value being accurate to about 7 decimal digits.

Floating Point Calculations

We all understand how a computer calculates with ints. But what about floats? One obvious effect is that if I add a big number and a small number, the small one may simply get lost. For example, E+20 + E-20 results in E+20 – there are not enough bits of precision within a float to represent the precise/exact/correct value...

Video available: Chandler Carruth, "Tuning C++: Benchmarks, and CPUs, and Compilers!" -- CppCon

Chandler's talk about benchmarking, cheating the compiler's optimizer and optimizing code from the recent CppCon is online.

Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My! (YouTube)

by Chandler Carruth, CppCon 2015

From the talk's outline:

A primary use case for C++ is low latency, low overhead, high performance code. But C++ does not give you these things for free, it gives you the tools to control these things and achieve them where needed. How do you realize this potential of the language? How do you tune your C++ code and achieve the necessary performance metrics?

This talk will walk through the process of tuning C++ code from benchmarking to performance analysis. It will focus on small scale performance problems ranging from loop kernels to data structures and algorithms. It will show you how to write benchmarks that effectively measure different aspects of performance even in the face of advanced compiler optimizations and bedeviling modern CPUs. It will also show how to analyze the performance of your benchmark, understand its behavior as well as the CPUs behavior, and use a wide array of tools available to isolate and pinpoint performance problems. The tools and some processor details will be Linux and x86 specific, but the techniques and concepts should be broadly applicable.

Should you be using something instead of what you should use instead?--Scott Meyers

The title is confusing, the article is not and should be read!

Should you be using something instead of what you should use instead?

by Scott Meyers

From the article:

The April 2000 C++ Report included an important article by Matt Austern: "Why You Shouldn't Use set—and What to Use Instead." It explained why lookup-heavy applications typically get better performance from applying binary search to a sorted std::vector than they'd get from the binary search tree implementation of a std::set. Austern reported that in a test he performed on a Pentium III using containers of a million doubles, a million lookups in a std::set took nearly twice as long as in a sorted std::vector...