In this post we talk about how data structure data layout effects software performance and how, by modifying it, we can speed up the access and modification of the data structure.
Faster Hash Maps, Binary Trees etc. Through Data Layout Modification
By Ivica Bogosavljević
From the article:
This post is a logical continuation of the previous post about class layout and performance, where we talked about how class size and class layout influences software performance.
The basic premises of good memory performance hold in this case as well:
- Everything that is accessed together should be stored together. Otherwise, the memory subsystem needs to more resources to fetch data.
- Exploite the cache line organization of the memory subsystem. Since data is brought in blocks of 64 bytes from the memory to the data caches, our programs should consume all of it.
- Keep data compact in memory speeds up access and modification. Smaller data more easily fits in the faster parts of the memory subsystem.
Techniques
Different techniques apply for different data structures. But in essence, all techniques revolve around the same idea of making the logically neighboring data physically close as well.
Polymorphism and Vectors of Pointers
In C++, to use polymorphism, you need to access the object through a pointer or a reference. And to access a collection of polymorphic objects, you need a vector of pointers (or some other data structure that holds pointers). As a data structure, a vector of pointers has two problems:
- For each pointed object there needs to be a call to a system allocator to allocate the memory for the objects.
- To access an object, a pointer needs to be dereferenced.
Add a Comment
Comments are closed.