performance

The SoA Vector – Part 2: Implementation in C++—Sidney Congard

The series continues and end.

The SoA Vector – Part 2: Implementation in C++

by Sidney Congard

From the article:

Like we saw in the first part of this series on SoA, the SoA is a way to organise the data of a collection of objects to optimise the performance of certain use cases: traversing the collection by accessing the same data member of all the objects:

struct person {
   std::string name;
   int age;
};

std::vector<person> persons = ...

for (auto& person : persons)
{
   ++person.age;
}
struct person {
   std::string name;
   int age;
};

std::vector<person> persons = ...

for (auto& person : persons)
{
   ++person.age;
}

The SoA in its barest expression is this:

struct persons {
    std::vector<std::string> names;
    std::vector<int> ages;
};
struct persons {
    std::vector<std::string> names;
    std::vector<int> ages;
};

By putting all the ages next to each other in memory, we optimise the performance of the traversal. But such a structure is not a container in itself, and is in particular not compatible with the STL.

Let’s design an SoA collection with an interface as close as possible to std::vector<persons>, but with the SoA structure of components stored in separate arrays...

CppCast Episode 180: Semantic Merge for C++ code, Plastic SCM and more on version control

Episode 180 of CppCast the first podcast for C++ developers by C++ developers. In this episode Rob and Jason are joined by Pablo Santos from Códice Software the company that develops a merge tool that parses and merges even refactored C++ code:

CppCast Episode 180: Semantic Merge for C++ code, Plastic SCM and more on version control

About the interviewee:

Prior to entering start-up mode to launch Plastic SCM back in 2005, Pablo worked as R&D engineer in fleet control software development (GMV, Spain) and later digital television software stack (Sony, Belgium). Then he moved to a project management position (GCC, Spain) leading the evolution of an ERP software package for industrial companies. During these years he became an expert in version control and software configuration management working as a consultant and participating in several events as a speaker. Pablo founded Codice SoftwaLogo-semantic-vertical-negative.pngre in 2005 and since then is focused on his role as chief engineer designing and developing Plastic SCM and SemanticMerge among other SCM products.

The SoA Vector – Part 1: Optimizing the Traversal of a Collection—Sidney Congard

It's all for speed.

The SoA Vector – Part 1: Optimizing the Traversal of a Collection

by Sidney Congard

From the article:

I like C++ because it offers a good compromise between writing expressive and fast code. But, I discovered a problem where I didn’t know any way to hide the implementation detail away from its use: The “Structure of Arrays” (SoA) versus the “Array of Structures” (AoS) problem.

This is the first part of a series of two articles:

  • what ‘SoA’ is about and what benefits it brings (part 1)
  • how to implement an SoA vector in C++ (part 2)

So let’s see what those SoA and AoS are all about...

How to optimize C and C++ code in 2018—Iurii Krasnoshchok

Are you aware?

How to optimize C and C++ code in 2018

by Iurii Krasnoshchok

From the article:

We are still limited by our current hardware. There are numerous areas where it just not good enough: neural networks and virtual reality to name a few. There are plenty of devices where battery life is crucial, and we must count every single CPU tick. Even when we’re talking about clouds and microservices and lambdas, there are enormous data centers that consume vast amounts of electricity.

Even boring tests routine may quietly start to take 5 hours to run. And this is tricky. Program performance doesn‘t matter, only until it does.

A modern way to squeeze performance out of silicon is to make hardware more and more sophisticated...

How to Boost Performance with Intel Parallel STL and C++17 Parallel Algorithms—Bartlomiej Filipek

Another one.

How to Boost Performance with Intel Parallel STL and C++17 Parallel Algorithms

by Bartlomiej Filipek

From the article:

C++17 brings us parallel algorithms. However, there are not many implementations where you can use the new features. The situation is getting better and better, as we have the MSVC implementation and now Intel’s version will soon be available as the base for libstdc++ for GCC.

Since the library is important, I’ve decided to see how to use it and what it offers...

A zero cost abstraction?—Josh Peterson

Safe and performant?

A zero cost abstraction?

by Josh Peterson

From the article:

Recently Joachim (CTO at Unity) has been talking about “performance by default”, the mantra that software should be as fast as possible from the outset. This is driving the pretty cool stuff many at Unity are doing around things like ECS, the C# job system, and Burst (find lots more about that here).

One question Joachim has asked internally of Unity developers is (I’m paraphrasing here): “What is the absolute lower bound of time this code could use?” This strikes me as a really useful way to think about performance. The question changes from “How fast is this?” to “How fast could this be?”. If the answers to those two questions are not the same, the next question is “Do we really need the additional overhead?”

Another way to think about this is to consider the zero-cost abstraction, a concept much discussed in the C++ and Rust communities. Programmers are always building abstractions, and those abstractions often lead to the difference between “how fast it is” and “how fast it could be”. We want to provide useful abstractions that don’t hurt performance...

The Amazing Performance of C++17 Parallel Algorithms, is it Possible?—Bartlomiej Filipek

Are you using it?

The Amazing Performance of C++17 Parallel Algorithms, is it Possible?

by Bartlomiej Filipek

From the article:

With the addition of Parallel Algorithms in C++17, you can now easily update your “computing” code to benefit from parallel execution. In the article, I’d like to examine one STL algorithm which naturally exposes the idea of independent computing. If your machine has 10-core CPU, can you always expect to get 10x speed up? Maybe more? Maybe less? Let’s play with this topic.