Performance

Blaze 2.2 released

Blaze, an open-source, high-performance C++ math library for dense and sparse arithmetic, has released a new version.

Blaze 2.2 Released

After a total of five and a half months -- a little late for SC'14, but right on time for Meeting C++ -- we have finally released Blaze 2.2! But the wait was worthwhile! This release comes with several bug fixes and hundreds of improvements, many of them based on your hints, suggestions, and ideas. Thank you very much for your support and help in making the Blaze library even better!

The big new feature of Blaze 2.2 is symmetric matrices. And this is not just any implementation of symmetric matrices, but one of the most complete and powerful implementations available. See the Blaze tutorial to get an idea of how symmetric matrices work and how they can help you prevent some inadvertent pessimizations of your code.
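To give a flavor of the feature, here is a minimal sketch based on Blaze's documented SymmetricMatrix adaptor (the exact spelling of types and headers is best checked against the 2.2 tutorial):

    #include <blaze/Math.h>

    int main()
    {
       using blaze::DynamicMatrix;
       using blaze::SymmetricMatrix;

       // A 3x3 dense symmetric matrix: the adaptor guarantees symmetry,
       // so writing A(0,2) also sets A(2,0).
       SymmetricMatrix< DynamicMatrix<double> > A( 3UL );
       A(0,0) = 1.0;  A(0,1) = 2.0;  A(0,2) = 3.0;
       A(1,1) = 4.0;  A(1,2) = 5.0;
       A(2,2) = 6.0;

       DynamicMatrix<double> B( 3UL, 3UL, 1.0 );

       // Expressions can exploit the symmetry; e.g. trans(A) is effectively free.
       DynamicMatrix<double> C = A * B + trans( A );

       return 0;
    }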


Efficiency with Algorithms, Performance with Data Structures -- Chandler Carruth

At the recent CppCon 2014, Chandler Carruth gave a great talk on using Modern C++ for writing high-performance applications.

Efficiency with Algorithms, Performance with Data Structures

by Chandler Carruth

From the video introduction:

C++ programmers throughout the industry have an insatiable desire for writing high performance code. Unfortunately, even with C++, this can be really challenging. Over the past twenty years processors, memory, software libraries, and even compilers have radically changed what makes C++ code fast. Even measuring the performance of your code can be a daunting task. This talk will dig into how modern processors work, what makes them fast, and how to exploit them effectively with modern C++ code.

New optimizations for X86 in upcoming GCC 5.0 -- Evgeny Stupachenko

Fresh on the Intel Developer Zone blog:

New optimizations for X86 in upcoming GCC 5.0

by Evgeny Stupachenko

From the article:

Part 1. Vectorization of loads/stores group.

GCC 5.0 significantly improves vector code quality for load groups and store groups. By a loads/stores group I mean an iterated, consecutive sequence of loads/stores. For example:

x = a[i], y = a[i + 1], z = a[i + 2] iterated over “i” is a loads group of size 3

...

The most frequent case where loads/stores groups are applicable is an array of structures.
  1. Image conversion (RGB structure to some other format) ...
  2. N-dimensional coordinates (normalize an array of XYZ points) ...
  3. Multiplication of vectors by a constant matrix: ...

... GCC 5.0:

  1. Introduces vectorization of load/store groups of size 3
  2. Improves load groups vectorization for all supported sizes
  3. Maximizes load/store groups performance by generating code that is more optimal for particular x86 CPU...
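To make the "array of structures" case concrete, here is an illustrative loop (the struct and the grayscale weights are made up, not taken from the article): each iteration reads p[i].r, p[i].g, and p[i].b, i.e. a loads group of size 3 of the kind GCC 5.0 can now vectorize.

    #include <cstddef>
    #include <cstdint>

    struct RGB { std::uint8_t r, g, b; };   // array of structures

    // Convert RGB to 8-bit grayscale with integer weights. The three loads
    // p[i].r, p[i].g, p[i].b iterated over i form a loads group of size 3.
    void to_gray(const RGB* p, std::uint8_t* gray, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
            gray[i] = static_cast<std::uint8_t>(
                (77 * p[i].r + 150 * p[i].g + 29 * p[i].b) >> 8);
    }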


HPX version 0.9.9 released -- STE||AR Group

The STE||AR Group has released V0.9.9 of HPX -- A general purpose parallel C++ runtime system for applications of any scale.

HPX V0.9.9 Released

The newest version of HPX (V0.9.9) is now available for download! Please see here for the release notes.

HPX now exposes an API fully conforming to the concurrency-related parts of the C++11 and C++14 standards, extended and applied to distributed computing.

From the announcement:

  • We completed the refactoring of hpx::future to be properly C++11 standards conforming.
  • We overhauled our build system to support newer CMake features to make it more robust and more portable.
  • We implemented a large part of the parallel algorithms and other parallel facilities proposed by C++ Technical Specifications N4104, N4088, and N4107.
  • We added many examples such as the 1D Stencil and the Matrix Transpose series.
  • We significantly improved the performance of the library and the existing documentation.
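As a small illustration of the C++11-conforming hpx::future/hpx::async API mentioned above (a sketch only -- header names and details may vary between HPX versions):

    #include <hpx/hpx_main.hpp>
    #include <hpx/include/lcos.hpp>
    #include <iostream>

    int square(int x) { return x * x; }

    int main()
    {
        // hpx::async/hpx::future mirror std::async/std::future from C++11,
        // with .then() continuations in the spirit of the concurrency TS.
        hpx::future<int> f = hpx::async(square, 6);
        hpx::future<int> g = f.then([](hpx::future<int> r) { return r.get() + 1; });

        std::cout << g.get() << '\n';   // prints 37
        return 0;
    }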

C++ and Zombies: a moving question

One of the issues I've been thinking about since C++Now: move and move-destruction.

C++ and Zombies: a moving question

by Jens Weller

From the article:

This has been on my list of things to think about since C++Now. At C++Now, I realized that we might have zombies in the C++ standard, and that there are two factions: one states that it is OK to have well-defined zombies, while the other thinks that you'd better kill them.
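For readers who haven't met the term, the zombies in question are moved-from objects, which the standard library leaves in a "valid but unspecified" state. A minimal illustration:

    #include <iostream>
    #include <string>
    #include <utility>

    int main()
    {
        std::string s = "I'm alive";
        std::string t = std::move(s);   // s is now a "zombie": valid but unspecified

        // Only operations without preconditions are guaranteed safe on s,
        // e.g. assignment or destruction.
        s = "revived";                  // bringing the zombie back to life
        std::cout << s << ' ' << t << '\n';
    }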

Insights into new and C++

I've written down some basic thoughts on new and the new standards:

Insights into new and C++

by Jens Weller

From the article:

Every now and then, I've been thinking about this, so this blog post is also a summary of my thoughts on the topic of dynamic memory allocation and C++. Since I wrote the blog entries on smart pointers, and with C++14 giving us make_unique, raw new and delete seem to be disappearing from C++ in our future code...
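A minimal before/after sketch of that point (Widget is just a placeholder type):

    #include <memory>

    struct Widget { int value = 0; };

    void raw_style()
    {
        Widget* w = new Widget;   // manual lifetime management
        // ... if anything here throws, w leaks ...
        delete w;
    }

    void modern_style()
    {
        // C++14: no naked new, exception-safe, ownership is explicit
        auto w = std::make_unique<Widget>();
        w->value = 42;
    }   // deleted automatically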

The Drawbacks of Implementing Move Assignment in Terms of Swap -- Scott Meyers

Hot off the Meyers press: how would you implement move assignment, and why? Scott Meyers explains two related issues:

The Drawbacks of Implementing Move Assignment in Terms of Swap

by Scott Meyers

From the article:

More and more, I bump into people who, by default, want to implement move assignment in terms of swap. This disturbs me, because (1) it's often a pessimization in a context where optimization is important, and (2) it has some unpleasant behavioral implications as regards resource management.
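A sketch of the two approaches under discussion (Widget and its members are made up for illustration; see the article for Scott's full analysis):

    #include <cstddef>
    #include <utility>

    class Widget
    {
    public:
        ~Widget() { delete[] data_; }

        // (1) Move assignment in terms of swap: the moved-from object ends up
        //     holding the target's old resource, so that resource is released
        //     later than necessary, and the swap touches more data than a
        //     plain transfer.
        // Widget& operator=(Widget&& rhs) noexcept
        // {
        //     std::swap(data_, rhs.data_);
        //     std::swap(size_, rhs.size_);
        //     return *this;
        // }

        // (2) Direct move assignment: release our resource now, then steal rhs's.
        Widget& operator=(Widget&& rhs) noexcept
        {
            delete[] data_;             // old resource freed immediately
            data_ = rhs.data_;
            size_ = rhs.size_;
            rhs.data_ = nullptr;        // rhs is left empty, not holding our old buffer
            rhs.size_ = 0;
            return *this;
        }

    private:
        int*        data_ = nullptr;
        std::size_t size_ = 0;
    };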

Vector of Objects vs Vector of Pointers Updated -- Bartlomiej Filipek

More in the "contiguous enables fast" department:

Vector of Objects vs Vector of Pointers Updated

by Bartlomiej Filipek

From the article:

For 1000 particles we need, on average, 2000 cache-line reads! This is 78% more cache-line reads than in the first case! Additionally, the hardware prefetcher cannot figure out the pattern -- it is random -- so there will be a lot of cache misses and stalls.

In our experiment, the pointer code for 80k particles was more than 266% slower than the contiguous case.
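The contrast in a nutshell (Particle is illustrative, not the article's exact benchmark code):

    #include <memory>
    #include <vector>

    struct Particle { float x, y, z, vx, vy, vz; };

    // Contiguous storage: particles sit next to each other in memory, so the
    // hardware prefetcher can stream cache lines ahead of the loop.
    void update_objects(std::vector<Particle>& ps, float dt)
    {
        for (auto& p : ps) { p.x += p.vx * dt; p.y += p.vy * dt; p.z += p.vz * dt; }
    }

    // Pointer indirection: each particle lives in its own heap allocation, so
    // every iteration chases a pointer to an effectively random address.
    void update_pointers(std::vector<std::unique_ptr<Particle>>& ps, float dt)
    {
        for (auto& p : ps) { p->x += p->vx * dt; p->y += p->vy * dt; p->z += p->vz * dt; }
    }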

Fast Polymorphic Collections -- Joaquín M López Muñoz

On the theme of "contiguous enables fast":

Fast Polymorphic Collections

by Joaquín M López Muñoz

From the article:

poly_collection behaves excellently and is virtually unaffected by the size of the container. For n < 10^5, the differences in performance between poly_collection and a std::vector of std::unique_ptrs are due to worse virtual-call branch prediction in the latter case; when n > 10^5, massive cache misses are added to the first degrading factor.
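The sketch below shows only the underlying layout idea -- one contiguous vector per concrete type versus a vector of pointers to a base class -- and is not the interface of the article's poly_collection:

    #include <memory>
    #include <vector>

    struct Base   { virtual void update() = 0; virtual ~Base() = default; };
    struct Bullet : Base { void update() override { /* ... */ } };
    struct Rocket : Base { void update() override { /* ... */ } };

    // Baseline: one heap allocation per object, scattered layout,
    // one hard-to-predict virtual call per element.
    void update_all(std::vector<std::unique_ptr<Base>>& v)
    {
        for (auto& p : v) p->update();
    }

    // Layout idea behind a polymorphic collection: keep each concrete type in
    // its own contiguous vector, so same-type objects are visited back to back
    // (better cache locality, better branch prediction for the virtual call).
    struct SimplePolyCollection
    {
        std::vector<Bullet> bullets;
        std::vector<Rocket> rockets;

        void update_all()
        {
            for (auto& b : bullets) b.update();
            for (auto& r : rockets) r.update();
        }
    };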

Parsing XML at the Speed of Light--Arseny Kapoulkine

Some high-performance techniques that you can use for more than just parsing, including this week's darling of memory management:

Parsing XML at the Speed of Light

a chapter from "The Performance of Open Source Applications"
by Arseny Kapoulkine

From the chapter:

This chapter describes various performance tricks that allowed the author to write a very high-performing parser in C++: pugixml. While the techniques were used for an XML parser, most of them can be applied to parsers of other formats or even unrelated software (e.g., memory management algorithms are widely applicable beyond parsers). ...

Optimizing software is hard. In order to be successful, optimization efforts almost always involve a combination of low-level micro-optimizations, high-level performance-oriented design decisions, careful algorithm selection and tuning, balancing among memory, performance, implementation complexity, and more. Pugixml is an example of a library that needs all of these approaches to deliver a very fast production-ready XML parser -- even though compromises had to be made to achieve this. A lot of the implementation details can be adapted to different projects and tasks, be it another parsing library or something else entirely.
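One of the widely applicable memory-management ideas the chapter discusses is carving the parser's many tiny node allocations out of large pools rather than calling the general-purpose allocator each time. A toy version of that idea (not pugixml's actual allocator) might look like this:

    #include <cstddef>
    #include <cstdlib>
    #include <new>
    #include <vector>

    // A toy bump (arena) allocator: carve allocations out of large pages and
    // free everything at once. Assumes each allocation is smaller than a page.
    class Arena
    {
    public:
        void* allocate(std::size_t size)
        {
            size = (size + 7) & ~std::size_t(7);        // keep 8-byte alignment
            if (offset_ + size > page_size_) new_page();
            void* p = page_ + offset_;
            offset_ += size;
            return p;
        }

        ~Arena()
        {
            for (char* p : pages_) std::free(p);        // bulk release
        }

    private:
        void new_page()
        {
            page_ = static_cast<char*>(std::malloc(page_size_));
            if (!page_) throw std::bad_alloc();
            pages_.push_back(page_);
            offset_ = 0;
        }

        static constexpr std::size_t page_size_ = 32 * 1024;
        std::vector<char*> pages_;
        char*       page_   = nullptr;
        std::size_t offset_ = page_size_;   // forces a page on first allocate()
    };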

Continue reading...