C++ Concurrent Queues

ISO/IEC JTC1 SC22 WG21 N3533 - 2013-03-12

Lawrence Crowl
Chris Mysen

Introduction
Conceptual Interface
    Basic Operations
    Non-Waiting Operations
    Non-Blocking Operations
    Push Front Operations
    Closed Queues
    Empty and Full Queues
    Queue Names
    Element Type Requirements
    Exception Handling
    Queue Ordering
    Lock-Free Implementations
Concrete Queues
    Locking Buffer Queue
    Lock-Free Buffer Queue
Additional Conceptual Tools
    Fronts and Backs
    Streaming Iterators
    Storage Iterators
    Binary Interfaces
    Managed Indirection
Implementation
Revision History

Introduction

Queues provide a mechanism for communicating data between components of a system.

The existing deque in the standard library is an inherently sequential data structure. Its reference-returning element access operations cannot synchronize access to those elements with other queue operations. So, concurrent pushes and pops on queues require a different interface to the queue structure.

Moreover, concurrency adds a new dimension for performance and semantics. Different queue implementations must trade off uncontended operation cost, contended operation cost, and element order guarantees. Some of these trade-offs will necessarily result in semantics weaker than a serial queue.

Conceptual Interface

We provide basic queue operations, and then extend those operations to cover other important issues.

Basic Operations

The essential solution to the problem of concurrent queuing is to shift to value-based operations, rather than reference-based operations.

The basic operations are:

void queue::push(const Element&);
void queue::push(Element&&);

Push the Element onto the queue.

Element queue::value_pop();

Pop an Element from the queue. The element will be moved out of the queue in preference to being copied.

These operations will wait when the queue is full or empty. (Not all queue implementations can actually be full.) These operations may block for mutual exclusion as well.
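As a minimal sketch of the value-based interface above, assuming a mutex and two condition variables guarding a std::deque: the class name sketch_waiting_queue, its capacity parameter, and the helper push_impl are hypothetical and are not taken from the proposal or its reference implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical sketch; the name and layout are not from the proposal.
template <typename Element>
class sketch_waiting_queue {
public:
    explicit sketch_waiting_queue(std::size_t capacity) : capacity_(capacity) {}

    void push(const Element& e) { push_impl(e); }
    void push(Element&& e) { push_impl(std::move(e)); }

    // Returns by value, so no reference into the queue escapes the lock.
    Element value_pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });  // wait while empty
        Element e(std::move(items_.front()));  // move out in preference to copying
        items_.pop_front();
        lock.unlock();
        not_full_.notify_one();
        return e;
    }

private:
    template <typename T>
    void push_impl(T&& e) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });  // wait while full
        items_.push_back(std::forward<T>(e));
        lock.unlock();
        not_empty_.notify_one();
    }

    std::mutex mutex_;
    std::condition_variable not_empty_, not_full_;
    std::deque<Element> items_;
    const std::size_t capacity_;
};

// Single-threaded usage; with more threads the calls would wait as described.
inline void basic_usage() {
    sketch_waiting_queue<int> q(2);
    q.push(1);
    q.push(2);
    assert(q.value_pop() == 1);
    assert(q.value_pop() == 2);
}
```

Because value_pop hands back a copy or moved value rather than a reference, the element is fully detached from the queue before the caller ever sees it.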

Non-Waiting Operations

Waiting on a full or empty queue can take a while, which has an opportunity cost. Avoiding that wait enables algorithms to avoid queuing speculative work when a queue is full, to do other work rather than wait for a push on a full queue, and to do other work rather than wait for a pop on an empty queue.

queue_op_status queue::try_push(const Element&);
queue_op_status queue::try_push(Element&&);

If the queue is full, return queue_op_status::full. Otherwise, push the Element onto the queue. Return queue_op_status::success.

queue_op_status queue::try_pop(Element&);

If the queue is empty, return queue_op_status::empty. Otherwise, pop the Element from the queue. The element will be moved out of the queue in preference to being copied. Return queue_op_status::success.

These operations will not wait when the queue is full or empty. They may block for mutual exclusion.
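The non-waiting operations might be sketched as follows, again assuming a mutex-protected std::deque. The class sketch_try_queue, the helper try_push_impl, the function drain_available, and the enumerator order of queue_op_status are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Status names follow the text; the enumerator order is an assumption.
enum class queue_op_status { success, empty, full };

// Hypothetical sketch; not the proposal's buffer_queue.
template <typename Element>
class sketch_try_queue {
public:
    explicit sketch_try_queue(std::size_t capacity) : capacity_(capacity) {}

    queue_op_status try_push(const Element& e) { return try_push_impl(e); }
    queue_op_status try_push(Element&& e) { return try_push_impl(std::move(e)); }

    queue_op_status try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mutex_);  // may block for mutual exclusion
        if (items_.empty()) return queue_op_status::empty;  // never waits
        out = std::move(items_.front());
        items_.pop_front();
        return queue_op_status::success;
    }

private:
    template <typename T>
    queue_op_status try_push_impl(T&& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.size() >= capacity_) return queue_op_status::full;  // never waits
        items_.push_back(std::forward<T>(e));
        return queue_op_status::success;
    }

    std::mutex mutex_;
    std::deque<Element> items_;
    const std::size_t capacity_;
};

// Usage: consume whatever is available now, then return to other work.
inline int drain_available(sketch_try_queue<int>& q) {
    int sum = 0;
    int item = 0;
    while (q.try_pop(item) == queue_op_status::success) sum += item;
    return sum;
}
```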

Non-Blocking Operations

For cases when blocking for mutual exclusion is undesirable, we have non-blocking operations. The interface is the same as the try operations but is allowed to also return queue_op_status::busy in case the operation is unable to complete without blocking.

queue_op_status queue::nonblocking_push(const Element&);
queue_op_status queue::nonblocking_push(Element&&);

If the operation would block, return queue_op_status::busy. Otherwise, if the queue is full, return queue_op_status::full. Otherwise, push the Element onto the queue. Return queue_op_status::success.

queue_op_status queue::nonblocking_pop(Element&);

If the operation would block, return queue_op_status::busy. Otherwise, if the queue is empty, return queue_op_status::empty. Otherwise, pop the Element from the queue. The element will be moved out of the queue in preference to being copied. Return queue_op_status::success.

These operations will neither wait nor block. They may, however, not do anything.
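One way to obtain this behavior from a locking queue is to acquire the mutex with std::try_to_lock and report busy when the lock is not obtained. The sketch below assumes that approach; sketch_nb_queue, nb_push_impl, and retry_while_busy are hypothetical names, and a lock-free queue would satisfy the same interface without any mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

enum class queue_op_status { success, empty, full, busy };

// Hypothetical sketch; not the proposal's implementation.
template <typename Element>
class sketch_nb_queue {
public:
    explicit sketch_nb_queue(std::size_t capacity) : capacity_(capacity) {}

    queue_op_status nonblocking_push(const Element& e) { return nb_push_impl(e); }
    queue_op_status nonblocking_push(Element&& e) { return nb_push_impl(std::move(e)); }

    queue_op_status nonblocking_pop(Element& out) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock()) return queue_op_status::busy;  // would have blocked
        if (items_.empty()) return queue_op_status::empty;
        out = std::move(items_.front());
        items_.pop_front();
        return queue_op_status::success;
    }

private:
    template <typename T>
    queue_op_status nb_push_impl(T&& e) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock()) return queue_op_status::busy;  // would have blocked
        if (items_.size() >= capacity_) return queue_op_status::full;
        items_.push_back(std::forward<T>(e));
        return queue_op_status::success;
    }

    std::mutex mutex_;
    std::deque<Element> items_;
    const std::size_t capacity_;
};

// Usage: repeat an operation until it reports something other than busy.
// A real caller would do other useful work between attempts.
template <typename Op>
queue_op_status retry_while_busy(Op op) {
    queue_op_status s = op();
    while (s == queue_op_status::busy) s = op();
    return s;
}
```

Since try_lock is permitted to fail even when the mutex is free, callers should treat busy as "try again later" rather than as evidence of contention.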

The non-blocking operations highlight a terminology problem. In terms of synchronization effects, nonblocking_push on queues is equivalent to try_lock on mutexes. And so one could conclude that the existing try_push should be renamed nonwaiting_push and nonblocking_push should be renamed try_push. However, at least Threading Building Blocks uses the existing terminology. Perhaps a better approach is to drop try_push altogether and use nonwaiting_push and nonblocking_push instead.

Push Front Operations

Occasionally, one may wish to return a popped item to the queue. We can provide for this with push_front operations.

void queue::push_front(const Element&);
void queue::push_front(Element&&);

Push the Element onto the front of the queue, i.e. at the end of the queue that is normally popped.

queue_op_status queue::try_push_front(const Element&);
queue_op_status queue::try_push_front(Element&&);

If the queue is full, return queue_op_status::full. Otherwise, push the Element onto the front of the queue, i.e. at the end of the queue that is normally popped. Return queue_op_status::success.

queue_op_status queue::nonblocking_push_front(const Element&);
queue_op_status queue::nonblocking_push_front(Element&&);

If the operation would block, return queue_op_status::busy. Otherwise, if the queue is full, return queue_op_status::full. Otherwise, push the Element onto the front of the queue, i.e. at the end of the queue that is normally popped. Return queue_op_status::success.

This feature was requested at the Spring 2012 meeting. However, we do not think the feature works.

In short, we do not think that in a concurrent environment push_front provides sufficient semantic value to justify its cost.

Closed Queues

Threads using a queue for communication need some mechanism to signal when the queue is no longer needed. The usual approach is to add an additional out-of-band signal. However, this approach suffers from the flaw that threads waiting on either full or empty queues need to be woken up when the queue is no longer needed. To do that, you need access to the condition variables used for full/empty blocking, which considerably increases the complexity and fragility of the interface. It also has performance implications, because it requires additional mutexes or atomics. Rather than require an out-of-band signal, we chose to directly support such a signal in the queue itself, which considerably simplifies coding.

To achieve this signal, a thread may close a queue. Once closed, no new elements may be pushed onto the queue. Push operations on a closed queue will either return queue_op_status::closed (when they have a status return) or throw queue_op_status::closed (when they do not). Elements already on the queue may be popped off. When a queue is empty and closed, pop operations will either return queue_op_status::closed (when they have a status return) or throw queue_op_status::closed (when they do not).

The additional operations are as follows. They are essentially equivalent to the basic operations except that they return a status, avoiding an exception when queues are closed.

void queue::close();

Close the queue.

bool queue::is_closed();

Return true iff the queue is closed.

queue_op_status queue::wait_push(const Element&);
queue_op_status queue::wait_push(Element&&);

If the queue was closed, return queue_op_status::closed. Otherwise, push the Element onto the queue. Return queue_op_status::success.

queue_op_status queue::wait_push_front(const Element&);
queue_op_status queue::wait_push_front(Element&&);

If the queue is closed, return queue_op_status::closed. Otherwise, push the Element onto the front of the queue, i.e. at the end of the queue that is normally popped. Return queue_op_status::success.

queue_op_status queue::wait_pop(Element&);

If the queue is empty and closed, return queue_op_status::closed. Otherwise, if the queue is empty, return queue_op_status::empty. Otherwise, pop the Element from the queue. The element will be moved out of the queue in preference to being copied. Return queue_op_status::success.

The push and pop operations will wait when the queue is full or empty. All these operations may block for mutual exclusion as well.
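The closing protocol can be sketched as below. This is an illustration under stated assumptions, not the proposal's implementation: sketch_closable_queue and consume_all are hypothetical names, wait_push takes its argument by value for brevity, the status enumeration is reduced to the two values used here, and wait_pop is assumed to wait while the queue is empty and still open.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

enum class queue_op_status { success, closed };

// Hypothetical sketch; not the proposal's buffer_queue.
template <typename Element>
class sketch_closable_queue {
public:
    explicit sketch_closable_queue(std::size_t capacity) : capacity_(capacity) {}

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();  // wake every waiter so it can observe the close
        not_full_.notify_all();
    }

    bool is_closed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return closed_;
    }

    queue_op_status wait_push(Element e) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return closed_ || items_.size() < capacity_; });
        if (closed_) return queue_op_status::closed;
        items_.push_back(std::move(e));
        lock.unlock();
        not_empty_.notify_one();
        return queue_op_status::success;
    }

    queue_op_status wait_pop(Element& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return queue_op_status::closed;  // empty and closed
        out = std::move(items_.front());  // remaining elements may still be popped
        items_.pop_front();
        lock.unlock();
        not_full_.notify_one();
        return queue_op_status::success;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_, not_full_;
    std::deque<Element> items_;
    const std::size_t capacity_;
    bool closed_ = false;
};

// Consumer loop: runs until the queue is both empty and closed.
inline int consume_all(sketch_closable_queue<int>& q) {
    int sum = 0;
    int item = 0;
    while (q.wait_pop(item) == queue_op_status::success) sum += item;
    return sum;
}
```

The consumer needs no out-of-band flag: the status returned by wait_pop is the termination signal.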

There are use cases for opening a queue that is closed. While we are not aware of an implementation in which the ability to reopen a queue would be a hardship, we also imagine that such an implementation could exist. Open should generally only be called if the queue is closed and empty, providing a clean synchronization point, though it is possible to call open on a non-empty queue. An open operation following a close operation is guaranteed to be visible after the close operation and the queue is guaranteed to be open upon completion of the open call. (But of course, another close call could occur immediately thereafter.)

void queue::open();

Open the queue.

Note that when is_closed() returns false, there is no assurance that any subsequent operation finds the queue still open, because some other thread may close it concurrently.

If an open operation is not available, there is an assurance that once closed, a queue stays closed. So, unless the programmer takes care to ensure that all other threads will not close the queue, only a return value of true has any meaning.

Empty and Full Queues

It is sometimes desirable to know if a queue is empty.

bool queue::is_empty();

Return true iff the queue is empty.

This operation is useful only during intervals when the queue is known to not be subject to pushes and pops from other threads. Its primary use case is assertions on the state of the queue at the end of its lifetime, or when the system is in a quiescent state (where there are no outstanding pushes).

We can imagine occasional use for knowing when a queue is full, for instance in system performance polling. The motivation is significantly weaker though.

bool queue::is_full();

Return true iff the queue is full.

Not all queues will have a full state, and these would always return false.

Queue Names

It is sometimes desirable for queues to be able to identify themselves. This feature is particularly helpful for run-time diagnostics, especially when 'ends' are dynamically passed around between threads. See Managed Indirection below. There is some debate on this facility, but we see no way to effectively replicate the facility.

const char* queue::name();

Return the name string provided as a parameter to queue construction.

Element Type Requirements

The above operations require element types with a default constructor, copy/move constructors, copy/move assignment operators, and destructor. These operations may be trivial. The default constructor and destructor shall not throw. The copy/move constructors and copy/move assignment operators may throw, but must leave the objects in a valid state for subsequent operations.
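These requirements could be checked at compile time with standard type traits. The trait meets_queue_element_requirements below is hypothetical and not part of the proposal; it also assumes that "copy/move" means both copy and move operations must be available, which is only one possible reading.

```cpp
#include <type_traits>

// Hypothetical helper, not part of the proposal.
// Assumption: both copy and move operations are required.
template <typename Element>
struct meets_queue_element_requirements
    : std::integral_constant<
          bool,
          std::is_nothrow_default_constructible<Element>::value &&  // shall not throw
          std::is_nothrow_destructible<Element>::value &&           // shall not throw
          std::is_copy_constructible<Element>::value &&             // may throw
          std::is_move_constructible<Element>::value &&
          std::is_copy_assignable<Element>::value &&
          std::is_move_assignable<Element>::value> {};

// Example type without a default constructor.
struct no_default_ctor {
    explicit no_default_ctor(int) {}
};

static_assert(meets_queue_element_requirements<int>::value,
              "int satisfies the element requirements");
static_assert(!meets_queue_element_requirements<no_default_ctor>::value,
              "a type without a default constructor does not");
```

A queue template could place such a static_assert in its class body to turn a missing operation into a readable diagnostic.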

Exception Handling

Concurrent queues cannot completely hide the effect of exceptions, in part because changes cannot be transparently undone when other threads are observing the queue.

Queues may rethrow exceptions from storage allocation, mutexes, or condition variables.

If the element type operations required do not throw exceptions, then only the exceptions above are rethrown.

When an element copy/move may throw, some queue operations have additional behavior.

Queue Ordering

The conceptual queue interface makes minimal guarantees.

In particular, the conceptual interface does not guarantee that the sequentially consistent order of element pushes matches the sequentially consistent order of pops. Concrete queues could specify more specific ordering guarantees.

Lock-Free Implementations

Lock-free queues will have some trouble waiting for the queue to be non-empty or non-full. Therefore, we propose two closely related concepts: a full concurrent queue concept as described above, and a non-waiting concurrent queue concept that has all the operations except push, push_front, wait_push, wait_push_front, value_pop and wait_pop. That is, it has blocking operations (presumably emulated with busy wait) but not waiting operations. We propose naming these WaitingConcurrentQueue and NonWaitingConcurrentQueue, respectively.

Note: Adopting this conceptual split requires splitting some of the facilities defined later.

It may be helpful to know if a concurrent queue has a lock free implementation.

bool queue::is_lock_free();

Return true iff the queue has a lock-free implementation.
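To illustrate why a lock-free queue naturally offers only non-waiting operations, here is a sketch of a much simpler structure than a general lock-free queue: a single-producer, single-consumer ring buffer built on two atomic indices. The class sketch_spsc_queue is hypothetical and is not the proposal's lock_free_buffer_queue; it is correct only when exactly one thread pushes and exactly one thread pops, and it stores at most N - 1 elements.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

enum class queue_op_status { success, empty, full };

// Hypothetical single-producer/single-consumer ring buffer.
// Holds at most N - 1 elements; Element must be default constructible.
template <typename Element, std::size_t N>
class sketch_spsc_queue {
public:
    // Call from the single producer thread only.
    queue_op_status try_push(const Element& e) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire))
            return queue_op_status::full;  // nothing to wait on, so just report
        slots_[tail] = e;
        tail_.store(next, std::memory_order_release);  // publish the element
        return queue_op_status::success;
    }

    // Call from the single consumer thread only.
    queue_op_status try_pop(Element& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire))
            return queue_op_status::empty;
        out = slots_[head];
        head_.store((head + 1) % N, std::memory_order_release);  // release the slot
        return queue_op_status::success;
    }

    bool is_lock_free() const {
        return head_.is_lock_free() && tail_.is_lock_free();
    }

private:
    Element slots_[N];
    std::atomic<std::size_t> head_{0};  // next slot to pop
    std::atomic<std::size_t> tail_{0};  // next slot to fill
};
```

There is no mutex or condition variable for a caller to sleep on, so a waiting push or pop could only be emulated by spinning on try_push or try_pop.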

Concrete Queues

In addition to the concept, the standard needs at least one concrete queue. We describe two concrete queues.

Locking Buffer Queue

We provide a concrete concurrent queue in the form of a fixed-size buffer_queue. It meets the WaitingConcurrentQueue concept. It provides for construction of an empty queue, and construction of a queue from a pair of iterators. Constructors take a parameter specifying the maximum number of elements in the buffer. Constructors may also take a parameter specifying the name of the queue. If the name is not present, it defaults to the empty string.

The buffer_queue deletes the default constructor, the copy constructor, and the copy assignment operator. We feel that their benefit might not justify their potential confusion.

Lock-Free Buffer Queue

We provide a concrete concurrent queue in the form of a fixed-size lock_free_buffer_queue. It meets the NonWaitingConcurrentQueue concept. The queue is still under development, so details may change.

Additional Conceptual Tools

There are a number of tools that support use of the conceptual interface. These tools are not part of the queue interface, but provide restricted views or adapters on top of the queue useful in implementing concurrent algorithms.

Fronts and Backs

Restricting an interface to one side of a queue is a valuable code structuring tool. This restriction is accomplished with the classes generic_queue_front and generic_queue_back parameterized on the concrete queue implementation. These act as pointers with access to only the front or the back of a queue. The front of the queue is where elements are popped. The back of the queue is where elements are pushed.

void send( int number, generic_queue_back<buffer_queue<int>> arv );

These fronts and backs are also able to provide begin and end operations that unambiguously stream data into or out of a queue.
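The restriction can be sketched with two small pointer-like wrappers. All names below (sketch_queue, sketch_queue_back, sketch_queue_front, and this variant of send) are hypothetical stand-ins; the real generic_queue_front and generic_queue_back expose the full set of pop or push operations rather than the single operation shown here.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical unbounded stand-in queue, for illustration only.
template <typename Element>
class sketch_queue {
public:
    typedef Element value_type;

    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(e);
    }
    bool try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<Element> items_;
};

// View of the back of a queue: only pushing is possible.
template <typename Queue>
class sketch_queue_back {
public:
    explicit sketch_queue_back(Queue* q) : queue_(q) {}
    void push(const typename Queue::value_type& e) { queue_->push(e); }

private:
    Queue* queue_;
};

// View of the front of a queue: only popping is possible.
template <typename Queue>
class sketch_queue_front {
public:
    explicit sketch_queue_front(Queue* q) : queue_(q) {}
    bool try_pop(typename Queue::value_type& out) { return queue_->try_pop(out); }

private:
    Queue* queue_;
};

// A producer that can only push; a pop through `back` would not compile.
inline void send(int number, sketch_queue_back<sketch_queue<int>> back) {
    back.push(number);
}
```

Passing only a back to a producer documents its role in the signature and lets the compiler reject accidental pops.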

Streaming Iterators

In order to enable the use of existing algorithms streaming through concurrent queues, they need to support iterators. Output iterators will push to a queue and input iterators will pop from a queue. Stronger forms of iterators are in general not possible with concurrent queues.

Iterators implicitly require waiting for the advance, so iterators are only supportable with the WaitingConcurrentQueue concept.

void iterate(
    generic_queue_back<buffer_queue<int>>::iterator bitr,
    generic_queue_back<buffer_queue<int>>::iterator bend,
    generic_queue_front<buffer_queue<int>>::iterator fitr,
    generic_queue_front<buffer_queue<int>>::iterator fend,
    int (*compute)( int ) )
{
    while ( fitr != fend && bitr != bend )
        *bitr++ = compute(*fitr++);
}

Note that contrary to existing iterator algorithms, we check both iterators for reaching their end, as either may be closed at any time.

Note that with suitable renaming, the existing standard front insert and back insert iterators could work as is. However, there is nothing like a pop iterator adapter.
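An output iterator that pushes, modeled on std::back_insert_iterator, might look like the sketch below. The names sketch_queue, sketch_push_iterator, and stream_into are hypothetical, the stand-in queue is unbounded so the push never waits, and closing is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <mutex>
#include <vector>

// Hypothetical unbounded stand-in queue, for illustration only.
template <typename Element>
class sketch_queue {
public:
    typedef Element value_type;

    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(e);
    }
    bool try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<Element> items_;
};

// Hypothetical output iterator: each assignment through it pushes one element.
template <typename Queue>
class sketch_push_iterator {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef void difference_type;
    typedef void pointer;
    typedef void reference;

    explicit sketch_push_iterator(Queue& q) : queue_(&q) {}

    sketch_push_iterator& operator=(const typename Queue::value_type& e) {
        queue_->push(e);
        return *this;
    }
    // Dereference and increment are no-ops, as for the standard insert iterators.
    sketch_push_iterator& operator*() { return *this; }
    sketch_push_iterator& operator++() { return *this; }
    sketch_push_iterator operator++(int) { return *this; }

private:
    Queue* queue_;
};

// Usage: an unmodified standard algorithm streams a range into the queue.
inline void stream_into(const std::vector<int>& src, sketch_queue<int>& q) {
    std::copy(src.begin(), src.end(), sketch_push_iterator<sketch_queue<int>>(q));
}
```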

Storage Iterators

In addition to iterators that stream data into and out of a queue, we could provide an iterator over the storage contents of a queue. Such an iterator, even when implementable, would most likely be valid only when the queue is otherwise quiescent. We believe such an iterator would be most useful for debugging, which may well require knowledge of the concrete class.

Binary Interfaces

The standard library is template based, but it is often desirable to have a binary interface that shields clients from the concrete implementations. For example, std::function is a binary interface to a callable object (of a given signature). We achieve this capability in queues with type erasure.

We provide a queue_base class template parameterized by the value type. Its operations are virtual. This class provides the essential independence from the queue representation.

We also provide queue_front and queue_back class templates parameterized by the value type. These are essentially generic_queue_front<queue_base<Value>> and generic_queue_back<queue_base<Value>>, respectively. Indeed, they probably could be template aliases.

To obtain a pointer to queue_base from a non-virtual concurrent queue, construct an instance of the queue_wrapper class template, which is parameterized on the queue and derived from queue_base. Upcasting a pointer to the queue_wrapper instance to a pointer to queue_base thus erases the concrete queue type.

extern void seq_fill( int count, queue_back<int> b );

buffer_queue<int> body( 10 /*elements*/, /*named*/ "body" );
queue_wrapper<buffer_queue<int>> wrap( body );
seq_fill( 10, wrap.back() );
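The type-erasure mechanism itself can be sketched with an abstract base and a forwarding wrapper. The names sketch_queue, sketch_queue_base, sketch_queue_wrapper, and sketch_seq_fill are hypothetical, and only two operations are forwarded; the actual queue_base and queue_wrapper cover the full interface.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical non-virtual stand-in queue, for illustration only.
template <typename Element>
class sketch_queue {
public:
    typedef Element value_type;

    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(e);
    }
    bool try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<Element> items_;
};

// Abstract interface that depends only on the value type.
template <typename Value>
class sketch_queue_base {
public:
    virtual ~sketch_queue_base() {}
    virtual void push(const Value& v) = 0;
    virtual bool try_pop(Value& out) = 0;
};

// Adapts any queue with matching members to the abstract interface.
template <typename Queue>
class sketch_queue_wrapper
    : public sketch_queue_base<typename Queue::value_type> {
public:
    typedef typename Queue::value_type value_type;

    explicit sketch_queue_wrapper(Queue& q) : queue_(&q) {}
    void push(const value_type& v) override { queue_->push(v); }
    bool try_pop(value_type& out) override { return queue_->try_pop(out); }

private:
    Queue* queue_;
};

// Compiled once against the base; it never sees the concrete queue type.
inline void sketch_seq_fill(int count, sketch_queue_base<int>& q) {
    for (int i = 0; i < count; ++i) q.push(i);
}
```

The cost of the indirection is one virtual call per operation, which is the usual trade for a stable binary interface.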

Managed Indirection

Long running servers may have the need to reconfigure the relationship between queues and threads. The ability to pass 'ends' of queues between threads with automatic memory management eases programming.

To this end, we provide shared_queue_front and shared_queue_back template classes. These act as reference-counted versions of the queue_front and queue_back template classes. These shared versions act on a queue_counted class template, which is a counted version of queue_base.

The concrete counted queues are the queue_owner class template, which takes ownership of a raw pointer to a queue, and the queue_object class template, which constructs an instance of the implementation queue within itself. Both class templates are derived from queue_counted.

queue_owner<buffer_queue<int>> own( new buffer_queue<int>(10, "own") );
seq_fill( 10, own.back() );
queue_object<buffer_queue<int>> obj( 10, "obj" );
seq_fill( 10, obj.back() );

The share_queue_ends(Args ... args) template function will provide a pair of shared_queue_front and shared_queue_back to a dynamically allocated queue_object instance containing an instance of the specified implementation queue. When the last of these fronts and backs is deleted, the queue itself will be deleted. Also, when the last of the fronts or the last of the backs is deleted, the queue will be closed.

auto x = share_queue_ends<buffer_queue<int>>( 10, "shared" );
shared_queue_front<int> f(x.first);
shared_queue_back<int> b(x.second);
b.push(3);
assert(3 == f.value_pop());
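The close-on-last-end behavior can be approximated with std::shared_ptr, as in the sketch below. This is not how the proposal's queue_counted works; it assumes a second shared_ptr whose custom deleter closes the queue instead of deleting it. The names sketch_int_queue, sketch_shared_back, and sketch_shared_front are hypothetical, and for brevity only the backs trigger the close.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical closable stand-in queue of int, for illustration only.
class sketch_int_queue {
public:
    bool try_push(int v) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (closed_) return false;
        items_.push_back(v);
        return true;
    }
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
    }
    bool is_closed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return closed_;
    }

private:
    std::mutex mutex_;
    std::deque<int> items_;
    bool closed_ = false;
};

class sketch_shared_back {
public:
    explicit sketch_shared_back(std::shared_ptr<sketch_int_queue> q)
        : queue_(std::move(q)),
          // The deleter closes rather than deletes; it runs when the last
          // copy of this back goes away.
          close_token_(queue_.get(), [](sketch_int_queue* p) { p->close(); }) {}

    bool try_push(int v) { return queue_->try_push(v); }

private:
    // Declared first so it is destroyed last: the queue outlives the token.
    std::shared_ptr<sketch_int_queue> queue_;
    std::shared_ptr<sketch_int_queue> close_token_;
};

class sketch_shared_front {
public:
    explicit sketch_shared_front(std::shared_ptr<sketch_int_queue> q)
        : queue_(std::move(q)) {}

    bool try_pop(int& out) { return queue_->try_pop(out); }
    bool is_closed() { return queue_->is_closed(); }

private:
    std::shared_ptr<sketch_int_queue> queue_;
};
```

Copies of a back share one close token, so the queue closes exactly once, when the final copy is destroyed, while any front still holding the queue can drain the remaining elements.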

Implementation

A free, open-source implementation of these interfaces is available from the Google Concurrency Library project at http://code.google.com/p/google-concurrency-library/. The concrete buffer_queue is in ..../source/browse/include/buffer_queue.h. The concrete lock_free_buffer_queue is in ..../source/browse/include/lock_free_buffer_queue.h. The corresponding implementation of the conceptual tools is in ..../source/browse/include/queue_base.h.

Revision History

This paper revises N3434 = 12-0043 - 2012-01-14 as follows.

N3434 revised N3353 = 12-0043 - 2012-01-14 as follows.