Accelerating your C++ on GPU with SYCL—Simon Brand

A post on writing GPGPU code in C++ using the SYCL standard from the Khronos group.

Accelerating your C++ on GPU with SYCL

By Simon Brand

From the article:

Leveraging the power of graphics cards for compute applications is all the rage right now in fields such as machine learning, computer vision and high-performance computing. Technologies like OpenCL expose this power through a hardware-independent programming model, allowing you to write code which abstracts over different architecture capabilities. The dream of this is “write once, run anywhere”, be it an Intel CPU, AMD discrete GPU, DSP, etc. Unfortunately, for everyday programmers, OpenCL has something of a steep learning curve; a simple Hello World program can be a hundred or so lines of pretty ugly-looking code. However, to ease this pain, the Khronos group have developed a new standard called SYCL, which is a C++ abstraction layer on top of OpenCL. Using SYCL, you can develop these general-purpose GPU (GPGPU) applications in clean, modern C++ without most of the faff associated with OpenCL.

Better C++ / Chicago July 12-14, 2017

Join us for a 3 day training event in Chicago, IL, USA July 12-14, 2017

Better C++ / Chicago

by Jason Turner

About the training:

Through this training you will gain a better understanding of how to write clean, maintainable, and well performing C++ code.

The topics covered apply to all types of C++ development: embedded, system or application development.

Jason's classes are highly interactive and have a limited class size to ensure that everyone has sufficient opportunity to participat

A la carte tickets are available for those wishing to attend only part of the training.

Wednesday: Demystifying C++11 and Beyond

C++11, 14, and 17 added many new features to C++ that have made many question the overhead of using these new features and the complexity they add to the language. We will make an in depth examination of these features to give you confidence in using and deploying modern C++ techniques in your organization.

Thursday: Understanding Object Lifetime in C++

C++ has what very few other languages have: a well defined object life cycle. Understanding this key aspect of the language is critical for writing high quality C++.
We will describe the lifecycle of an object in C++ and work through increasingly complex examples. There will be something for C++ developers of all skill levels to learn.

Friday: C++ Best Practices

On the final day of the course we will cover a series of tangible best practice rules for how to write C++ code that is maintainable and efficient by default.
We will wrap up with a discussion of how to use the tools available to maintain code quality.

Frozen - An header-only, constexpr alternative to gperf for C++14 users—Serge Guelton

Check this out!

Frozen - An header-only, constexpr alternative to gperf for C++14 users

by Serge Guelton

From the article:

An open source, header-only library that provides fast, immutable, constexpr-compatible implementation of std::set, std::map, std::unordered_map and std::unordered_set to C++14 users. It can be used as an alternative to gperf...

CppCon 2016: Want fast C++? Know your hardware!—Timur Doumler

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

Want fast C++? Know your hardware!

by Timur Doumler

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

As C++ evolves, it provides us with better and more powerful tools for optimal performance. But often, knowing the language very well is not enough. It is just as important to know your hardware. Modern computer architectures have many properties that can impact the performance of C++ code, such as cache locality, cache associativity, true and false sharing between cores, memory alignment, the branch predictor, the instruction pipeline, denormals, and SIMD. In this talk, I will give an overview over these properties, using C++ code. I will present a series of code examples, highlighting different effects, and benchmark their performance on different machines with different compilers, sometimes with surprising results. The talk will draw a picture of what every C++ developer needs to know about hardware architecture, provide guidelines on how to write modern C++ code that is cache-friendly, pipeline-friendly, and well-vectorisable, and highlight what to look for when profiling it.

CppCon 2016: Using Types Effectively—Ben Deane

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

Using Types Effectively

by Ben Deane

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

C++ has a pretty good type system, and modern C++ gives us a greater ability than ever before to use that type system for good: to make APIs easier to use and harder to misuse, to make our datatypes more closely express our intent, and generally to make code safer, more obvious in function and perhaps even faster.

This is an interactive session - incorporating games played between presenter and audience, even - taking a look at choices available to us as datatype and API designers, and examining how a little knowledge about the algebra of algebraic datatypes can help. We'll see why std::optional and (hopefully soon) std::variant will quickly become an essential part of everyone's toolbox, and also explore how types can be used to express not just the structure of data, but also the behaviour of objects and functions.

CppCon 2016: The strange details of std::string at Facebook—Nicholas Ormrod

Have you registered for CppCon 2017 in September? Don’t delay – Registration is open now.

While we wait for this year’s event, we’re featuring videos of some of the 100+ talks from CppCon 2016 for you to enjoy. Here is today’s feature:

The strange details of std::string at Facebook

by Nicholas Ormrod

(watch on YouTube) (watch on Channel 9)

Summary of the talk:

Standard strings are slowing you down. Strings are everywhere. Changing the performance of std::string has a measurable impact on the speed of real-world C++ programs. But how can you make strings better? In this talk, we'll explore how Facebook optimizes strings, especially with our open-source std::string replacement, fbstring. We'll dive into implementation tradeoffs, especially the storage of data in the struct; examine which standard rules can and cannot be flouted, such as copy-on-write semantics; and share some of the things we've learned along the way, like how hard it is to abolish the null-terminator. War stories will be provided.

Packing Bools, Performance tests

Can you pack 8 bools into one BYTE efficiently?

Packing Bools, Performance tests

by Bartlomiej Filipek

From the article:

The simple problem of packing seems to show a lot of performance issues. If the packing is really needed? Can we make the code parallel? What are the branching effects here?

HPX version 1.0 released—STE||AR Group

The STE||AR Group has released V1.0 of HPX -- A C++ Standard library for parallelism and concurrency.

HPX V1.0 Released

The newest version of HPX (V1.0) is now available for download! Please see here for the release notes.

HPX exposes an API fully conforming to the concurrency related parts of the C++11/C++14/C++17 standards, extended and applied to distributed and heterogeneous computing, and aligned with the ongoing standardization discussions.

From the announcement:

  • HPX is a general purpose parallel C++ runtime system for applications of any scale. It implements all of the related facilities as defined by the C++ Standard. As of this writing, HPX provides the only widely available open-source implementation of the new C++17 parallel algorithms. Additionally, HPX implements functionalities proposed as part of the ongoing C++ standardization process, such as large parts of the C++ Concurrency TS, task blocks, data-parallel algorithms, executors, index-based parallel for loops, and many more. It also extends the existing C++ Standard APIs to the distributed case (e.g. compute clusters) and for heterogeneous systems (e.g. GPUs).
  • HPX seamlessly enables a new asynchronous C++ Standard Programming Model which tends to improve the parallel efficiency of our applications and helps reduce complexities usually associated with concurrency


Zapcc 1.0 - A Much Faster C++ Compiler

We are happy to announce Zapcc, a Clang/LLVM based C++ compiler, with much faster compilations.


by the Zapcc team

About the compiler:

Typically, Zapcc accelerate full C++ builds by x2-x5, and incremental builds by x10 and more.

Zapcc is available for evaluation on the companies website.

Feel free to try it out!