Overview

So far the standard has been quiet about whether there is a default executor (see N3785 for more about executors). A well-written library that wishes to be flexible should take an executor as a parameter. Unfortunately, there is today no good default for such a library to fall back on. Programmers want a way to run tasks asynchronously without having to create new executors.
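As a rough illustration of a library that takes its executor as a parameter, here is a minimal sketch. Everything in namespace sketch is hypothetical: the execute() member function, both executor types, and the for_each_index and sum_of_squares helpers are assumptions made for this example, not the interface proposed in N3785.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical executor: runs each task immediately on the calling thread.
struct inline_executor {
    void execute(std::function<void()> task) { task(); }
};

// Hypothetical executor: one new thread per task, joined in the destructor.
class thread_per_task_executor {
public:
    void execute(std::function<void()> task) {
        threads_.emplace_back(std::move(task));
    }
    ~thread_per_task_executor() {
        for (auto& t : threads_) t.join();
    }
private:
    std::vector<std::thread> threads_;
};

// Library routine: the caller decides where the work runs.
template <class Executor>
void for_each_index(Executor& ex, int n, std::function<void(int)> body) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        ex.execute([body, i] { body(i); });
    }
}

// Caller side: sums i*i for i in [0, n) on the chosen executor type.
template <class Executor>
int sum_of_squares(int n) {
    std::atomic<int> total{0};
    {
        Executor ex;
        for_each_index(ex, n, [&total](int i) { total += i * i; });
    }  // ex is destroyed here, so every submitted task has finished
    return total.load();
}

}  // namespace sketch
```

The library routine never names a concrete executor, which is exactly why the caller is left without an obvious default to pass in.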

This problem is compounded by the fact that thread pools are often much faster (and use fewer resources) than a thread-per-task executor. Many systems (e.g. Windows) provide a good system thread pool which would be a great default. We would like to make it easy to use that thread pool, but a thread pool is generally limited in size and is likely limited in queue depth. This adds new constraints to tasks running here, including the number of concurrently blocked threads. There should be a way to specify a real thread-per-task executor for when it is necessary to avoid these constraints.

An executor that reuses threads could also be faster than a thread-per-task executor, but it would leak thread locals between tasks. Many users would be willing to trade off the leakage for the performance, but this makes it complicated to use thread locals with these executors.
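The thread-local leakage described above can be shown with plain std::thread and no executor at all. The sketch below is only an illustration; the counter and the function names are made up for this example.

```cpp
#include <cassert>
#include <thread>

namespace sketch {

// Per-thread state that a task might wrongly assume starts at zero.
thread_local int tasks_seen_on_this_thread = 0;

// The task reports how many tasks (itself included) have run on its thread.
inline int task() { return ++tasks_seen_on_this_thread; }

// Thread-per-task: the second task gets a fresh thread and a fresh counter.
inline int second_task_result_thread_per_task() {
    int second = 0;
    std::thread([] { task(); }).join();
    std::thread([&second] { second = task(); }).join();
    return second;
}

// Thread reuse: both tasks share one worker, so the counter carries over.
inline int second_task_result_reused_thread() {
    int second = 0;
    std::thread worker([&second] {
        task();
        second = task();
    });
    worker.join();
    return second;
}

// Example use: the same task observes different state under each policy.
inline void example_use() {
    assert(second_task_result_thread_per_task() == 1);
    assert(second_task_result_reused_thread() == 2);
}

}  // namespace sketch
```

A task that is correct under thread-per-task can therefore change behavior on a thread-reusing executor without any change to its own code.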

Another related issue is describing where a std::async task runs asynchronously. If we have a default executor, it should be easy to run an async task there. It has been proposed many times (e.g. N3970) to let std::async take an optional executor as an argument. But if no executor is specified, should it run on the std::default_executor? If the default executor has more constraints than a thread-per-task executor, those limitations would then also apply to async. This has the potential to break existing code. Still, most users of async would probably prefer to use a fast thread pool instead of spinning up a thread.

Beyond all of this (and not discussed in depth here), there’s the looming promise of fibers, and it’s uncertain how fibers might be integrated into the standard, and how they would affect executors.

Design Options

We’ve been discussing a number of proposals; they all have pros and cons. They all center around 3 new standard executors:

std::thread_executor

spins up a thread for each task

std::default_pool_executor

uses a thread pool, possibly from the system
generally faster than std::thread_executor
may have concurrent blocked thread limitations, but should have “enough” threads that programmers shouldn’t worry generally about it
has thread-local sharing
may have queueing effects

std::default_executor

defined differently in each proposal
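To make the difference between the first two executors concrete, here is a minimal sketch of what each might look like. These are hypothetical stand-ins in namespace sketch, not the proposed std:: types: the execute() interface, the fixed worker count, and the drain-on-destruction behavior are all assumptions. A real default_pool_executor would more likely wrap the system thread pool.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for std::thread_executor: one new thread per task.
class thread_executor {
public:
    void execute(std::function<void()> task) { threads_.emplace_back(std::move(task)); }
    ~thread_executor() { for (auto& t : threads_) t.join(); }
private:
    std::vector<std::thread> threads_;
};

// Stand-in for std::default_pool_executor: fixed workers draining one queue.
class pool_executor {
public:
    explicit pool_executor(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            workers_.emplace_back([this] { run(); });
    }
    void execute(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(task));
        }
        ready_.notify_one();
    }
    // Lets the workers drain the queue, then joins them.
    ~pool_executor() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (auto& w : workers_) w.join();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to run
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Submits `tasks` tasks that each record the id of the thread they ran on.
template <class Executor>
void record_thread_ids(Executor& ex, int tasks, std::mutex& m, std::set<std::thread::id>& ids) {
    for (int i = 0; i < tasks; ++i)
        ex.execute([&m, &ids] {
            std::lock_guard<std::mutex> lock(m);
            ids.insert(std::this_thread::get_id());
        });
}

inline std::size_t threads_used_by_thread_executor(int tasks) {
    std::mutex m;
    std::set<std::thread::id> ids;
    { thread_executor ex; record_thread_ids(ex, tasks, m, ids); }
    return ids.size();
}

inline std::size_t threads_used_by_pool(std::size_t workers, int tasks) {
    std::mutex m;
    std::set<std::thread::id> ids;
    { pool_executor ex(workers); record_thread_ids(ex, tasks, m, ids); }
    return ids.size();
}

}  // namespace sketch
```

With eight tasks, the thread-per-task stand-in uses eight distinct threads, while a two-worker pool uses at most two. This is where both the speed advantage and the thread-local sharing and queueing effects listed above come from.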

We don’t think any specific favorite proposal is obviously the best, but would like to hear what others think. Here’s our list:

1. Do Nothing

We always have the option of not providing any default executor. We still provide an explicit std::thread_executor, and force users to explicitly use it. If they want to use the fast system thread pool, they use the non-portable my_system::thread_pool_executor.

Pros:

Easy (from the standards perspective)!
Async doesn’t change. (i.e. still spins up a thread)

Cons:

There’s no standardized pool executor, even though most tasks would be happiest running on the system thread pool executor.
Users will have to make a lot of executors, and the standard might need to provide an ExecutorService (like Java).
Libraries have to refer to every individual system’s thread pool they are compatible with.

2. Provide a std::default_executor (thread per task) and std::default_pool_executor

We would provide all three executors in the standard, with default_executor being the same as thread_executor (except maybe allowing for thread reuse, i.e. thread-local sharing).

We could also rename default_pool_executor to default_fast_executor, to emphasize why you’d use it, rather than the implementation details.

If std::async is extended to take an executor as parameter, there’s now a standardized easy way to use the system’s thread pool. If an executor is not specified, it runs on default_executor.
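For illustration, an executor-taking async could look roughly like the sketch below. The names async_on, async_default, and default_executor() are hypothetical, and sketch::thread_executor is only a stand-in for the proposed std::thread_executor. One known difference from std::async is that the futures returned here come from std::packaged_task, so their destructors do not block.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for std::thread_executor: one thread per task, joined on destruction.
// Not safe for concurrent execute() calls; this is only a sketch.
class thread_executor {
public:
    void execute(std::function<void()> task) { threads_.emplace_back(std::move(task)); }
    ~thread_executor() { for (auto& t : threads_) t.join(); }
private:
    std::vector<std::thread> threads_;
};

// Under option 2 the default executor is thread-per-task.
inline thread_executor& default_executor() {
    static thread_executor ex;
    return ex;
}

// async with an explicit executor: run f on ex, return a future for its result.
template <class Executor, class F>
auto async_on(Executor& ex, F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    // packaged_task is move-only, so share it to fit into std::function.
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(f));
    std::future<R> result = task->get_future();
    ex.execute([task] { (*task)(); });
    return result;
}

// async without an executor: falls back to the default executor.
template <class F>
auto async_default(F f) -> std::future<decltype(f())> {
    return async_on(default_executor(), std::move(f));
}

// Example use: one task on an explicit executor, one on the default.
inline int demo() {
    thread_executor explicit_ex;
    std::future<int> a = async_on(explicit_ex, [] { return 6 * 7; });
    std::future<int> b = async_default([] { return 1; });
    assert(a.valid() && b.valid());
    return a.get() + b.get();
}

}  // namespace sketch
```

Passing a pool-backed executor to async_on instead would be the "standardized easy way" to reach the system thread pool that this option describes.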

Pros:

Pretty easy!
Async doesn’t change. (By default it uses default_executor, i.e. spins up a thread)
Async can use the thread pool if explicitly passed
Libraries can specify they run ok on std::default_pool_executor, and can use it by default.
In practice, implementations may offer non-conforming options to use std::default_pool_executor as the default.

Cons:

Async still defaults to thread per task
Implementations that default to using default_pool_executor are non-conforming
Different systems have different numbers of threads in their thread pool. Defining how those limitations affect the code has complications. This might involve adding a definition of “blocked threads” and forcing interfaces to specify when they “block a thread”. Then the user needs to count up the total number of potential blocked threads, and make sure it’s below N (And what is N? Is it a template parameter? “Implementation defined”?). Alternately we may just say you shouldn’t block these threads, but that limits their usefulness.

Of course in practice the user has to worry about this today, without the benefit of the standard explicitly specifying these constraints in a portable way.

There are two defaults, so which one is the real default?

3. Provide the two new executors (thread pool & thread per task), and let the user specify at compile time which should be the std::default_executor.

The standard doesn’t really discuss compile time options, but we could figure out a way to allow the default executor to be selected at compile time. This allows an implementation to default to the system thread pool, but allows the escape valve of forcing thread-per-task if you need it.

This comes in two varieties:

A) std::async doesn’t change, continues to create a new thread per task
B) std::async by default uses std::default_executor to create new tasks
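One conceivable way to select the default executor at compile time is a build flag that switches a type alias. The sketch below assumes a made-up macro, SKETCH_DEFAULT_EXECUTOR_IS_POOL, and tag-only executor types. The reuses_threads() property and the safe_for_thread_local_scratch check are likewise invented here, to show how a library might state a constraint against whichever default was selected.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Tag-only stand-ins; execute() is omitted because only selection is shown.
struct thread_executor {
    static constexpr bool reuses_threads() { return false; }
};
struct default_pool_executor {
    static constexpr bool reuses_threads() { return true; }
};

// Hypothetical build flag, e.g. -DSKETCH_DEFAULT_EXECUTOR_IS_POOL.
#if defined(SKETCH_DEFAULT_EXECUTOR_IS_POOL)
using default_executor = default_pool_executor;
#else
using default_executor = thread_executor;
#endif

// A library can check its constraint against the selected default.
template <class Executor = default_executor>
constexpr bool safe_for_thread_local_scratch() {
    return !Executor::reuses_threads();
}

// Example use, written for a build with the flag unset.
inline void example_use() {
    assert(safe_for_thread_local_scratch());
    assert(!safe_for_thread_local_scratch<default_pool_executor>());
}

}  // namespace sketch
```

A library that cannot tolerate thread reuse could turn this check into a static_assert, so that flipping the flag fails the build instead of silently changing behavior.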

Pros:

Async can use the thread pool or thread-per-task if explicitly passed
Libraries can specify they run ok on std::default_pool_executor
In practice, implementations may now offer a conforming option to use std::default_pool_executor as the default.
In the distant future, other executors like fibers or GPU pools could potentially use this hook to become the default.

Cons:

Under A) std::async still defaults to thread per task.
Under B) the constraints on std::async change based on a compile time flag!
If a library uses std::default_executor but can’t run on std::default_pool_executor then it’s not safe to change the default_executor. See the discussion above about specifying constraints. This is even worse under B), where libraries using async have the same issue.
Different implementations may have different default executors, making porting code more difficult. You’ll have to specify your default executor, and it may not be one that performs well on every platform.

4. Add std::default_executor but with weak enough guarantees that it can be implemented by a system thread pool

Making default_executor compatible with thread pools will strongly encourage use of system thread pools. We still provide std::thread_executor as an escape valve, which will work for any existing code.

This one also comes in two varieties:

A) std::async doesn’t change, continues to create a new thread per task
B) std::async by default uses std::default_executor to create new tasks

Pros:

Most people use the better thread pools
If A)

std::async doesn’t change
You can use std::async(std::default_executor) to get faster async

If B)

everyone gets faster async by default
You can use std::async(std::thread_executor) to get old behavior

Cons:

We need to specify these weakened guarantees on default_executor such that it is compatible with all existing system thread pools, but also useful enough to be a decent default.

The lifetime of thread locals in the executor is very loose. They may be destroyed at basically any time, including after statics.
It’s only safe to block “a few” threads in the default executor, where we want “a few” to be large enough that most users don’t have to worry

In practice each platform’s default_executor will have its own properties, and users will accidentally rely on those properties, making porting more difficult.
If A)

default std::async is still threads

If B)

We’re changing the behavior of std::async. This is a breaking change, although the fix (specify std::thread_executor) is pretty easy to apply (even just as a text replacement). The majority of users won’t be broken though, and most will enjoy faster code.

Conclusion

We think that (1) & (2) are probably not great because they will tend to keep users away from the system thread pools.

(3) is interesting, and the most flexible solution. It allows advanced users to run exactly the code they want, and it encourages all libraries to specify what constraints they have on executors. In practice most users would move to using system thread pools by default, but we don’t break any existing code. It also provides a hook for other executors to come in the future. All that said, it introduces an “at compile time” concept that’s hard to explain in standardese, and many libraries will never specify their executor constraints, which forces users to either run with thread-per-task, or run without correctness guarantees.

(4 B) would be great, as it would move everyone to system thread pools unless they explicitly move away from it. Unfortunately it would add new constraints to std::async, which would change existing behavior. If async is being changed or replaced in the future it might make sense to use the default_executor, but doing that now would be difficult.

(4 A) may be the most reasonable compromise. It makes the default executor fairly weak in guarantees, but still a good default for users who “don’t care”. It is tricky to nail down exactly what the guarantees should be, but we suspect people who pay attention to the guarantees will prefer explicitly specifying their executor anyway. Unfortunately it is verbose to use the thread pool with std::async, but it doesn’t change any existing behavior.