Document #: | D2839R1 |
Date: | 2023-06-08 |
Project: | Programming Language C++ |
Audience: | EWGI |
Reply-to: | Brian Bi <bbi10@bloomberg.net>, Joshua Berne <jberne4@bloomberg.net> |
Relocation, in which a new object is constructed with the same value as a source object while the lifetime of that source object is simultaneously ended, is a fundamental building block of many algorithms. Numerous previous proposals have attempted to capture the most common case for this operation, where it is trivial. In this proposal, we extend this to its natural conclusion of enabling the definition of a user-provided relocation constructor. This constructor is formed by defining a new kind of reference, an owning reference (spelled T~), which not only names an existing object but also carries the responsibility of its destruction. We then show how such references can be integrated with automatic variables and generalized object destruction to provide a complete and powerful extension to C++.
Numerous proposals have attempted to introduce the ability to tie together the construction of one object with the destruction of a source object, migrating the value of that source object in the process. We discuss some such proposals in Section 9. Normal C++ move-initialization accomplishes migration of an object’s value but fails to address the source object’s lifetime, thus requiring that the source object support being in a valueless state (i.e., a state having a logically valid value that must be accounted for by any function having a wide contract, while semantically representing the absence of a value).
When an object’s lifetime can also be ended as part of moving its value to a new object, performance can be improved in various ways:
On top of that, although not all move-initializations immediately precede the destruction of the source object, many of them do:
std::vector often involves chains of move-construct and destroy operations.
Surprisingly, types that do not include references to themselves tend to not only be relocatable, but to be relocatable in a trivial fashion; i.e., the relocation can be accomplished by simply invoking memcpy on the source object and then not invoking that object’s destructor but nonetheless ending its lifetime. This case is so prevalent, and the performance benefits of taking advantage of it within containers are so significant, that numerous historical proposals have been written to add just trivial relocation to the library or language: [P1029R1], [P1144R6], and [P2786R0].
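As a rough illustration of what relocating by memcpy means in practice, the following sketch copies the bytes of an object into new storage and then never touches the source again. The names relocate_trivially, Point, and demo_trivial_relocation are hypothetical and appear in none of the cited proposals. The sketch restricts itself to trivially copyable types, because those are the only types for which current standard C++ defines the behavior of std::memcpy; the cited proposals aim to cover a wider class of types.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical example type: no self-references, trivially copyable.
struct Point {
    int x;
    int y;
};

// Hypothetical helper (an assumption for illustration, not a proposed API):
// relocate `*src` into the raw storage at `dst` by copying its bytes, then
// treat `*src` as dead. Restricted to trivially copyable types, for which
// std::memcpy is well-defined in current C++.
template <class T>
T* relocate_trivially(T* src, void* dst) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "this sketch only handles trivially copyable types");
    std::memcpy(dst, src, sizeof(T));
    // No destructor is called for `*src`; the caller must not use it again.
    return std::launder(static_cast<T*>(dst));
}

// Relocates a Point between two buffers and returns the sum of its fields.
int demo_trivial_relocation() {
    alignas(Point) unsigned char from[sizeof(Point)];
    alignas(Point) unsigned char to[sizeof(Point)];
    Point* p = ::new (from) Point{3, 4};
    Point* q = relocate_trivially(p, to);
    return q->x + q->y;
}
```

The design point is that no code runs for the source object after the copy: its storage is simply abandoned, which is the step that a move followed by a destructor call cannot skip.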
A common initial response to these proposals for trivial relocation is confusion about proposing a trivial version of an operation where we do not have a nontrivial option. Certainly no other fundamental C++ operation is supported only when trivial and provides no mechanism for a user to define their own version of that operation. This proposal aims to provide the complete context in which to understand how trivial relocation could fit into a larger picture. In particular, should a trivial relocation proposal such as [P2786R0] move forward, it would be completely compatible with extending to arbitrary user-defined relocation through the owning references that we propose here. The ability to opt-in to trivial relocation in a manner similar to that specified in [P2786R0] will still be needed, as it would never be well-founded to implicitly circumvent a user-provided move operation or destructor.
By expressing relocation through owning references, rather than simply providing library functions that allow such relocation, we also extend the ability to leverage relocation in more places as well as to safely prevent one of the most common issues with move operations, use after move.
The fundamental change we introduce is a new reference type, T~, which we call an owning reference. In the same way that T&& is a reification of the rvalue value category, owning references are a reification of a new rlvalue (relocating value) value category.
T&, T&&, and T~ all identify an object in memory with a value. Through any of these references, the value of the referred-to object is accessible.
T&& and T~ both carry with them ownership of the value of the object they refer to. By creating and passing around such a reference, the recipient of the reference is free to not only access but also consume the value of the referred-to object.
An owning reference can be in one of two states at a given program point, engaged or disengaged, and that state is statically determined.
At any point where it might be ambiguous whether an owning reference is engaged or disengaged, it will immediately become disengaged, removing any intrinsic need for a runtime determination of the usability of an owning reference. For example, disengaging an owning reference inside an if statement will force it to be disengaged after the if statement completes, even when a (possibly empty or missing) else branch did nothing explicit with that owning reference.
A function taking an owning reference will therefore destroy the passed-in object when control flow leaves the function:
void f1(T~ t) {
    // `t` goes out of scope; `~T()` is called implicitly
}
When the reloc operator is applied to an engaged owning reference, the result is an rlvalue expression that owns that reference, and the operand is disengaged:
void f2(T~ t) {
    reloc t;          // `t` is hereby disengaged; the rlvalue result expires at the
                      // end of the full-expression and calls `~T()`.
    t.doSomething();  // Ill formed because `t` is already disengaged.
    // Nothing extra happens at end of scope.
}
An rlvalue expression can be used to initialize an owning reference, which will then take ownership of the object that the rlvalue refers to.
void f3(T~ t) {
    T~ t2 = reloc t;   // `t2` now owns the object originally referred to by `t`.
    t.doSomething();   // Ill formed because `t` is disengaged.
    t2.doSomething();  // OK
    reloc t2;          // `~T()` is called, as in `f2`.
    t2.doSomething();  // Ill formed because `t2` is disengaged.
    // Nothing extra happens at end of scope.
}
The reloc operator is also a primary mechanism for passing ownership from one function to another, as the owning reference itself, when used by name, is an lvalue and not an rlvalue:
void take(T~ t);
void f4(T~ t) {
    take(reloc t);    // Passes ownership of the object referred to by `t`;
                      // that object will be destroyed by `take` at some point.
    t.doSomething();  // Ill formed because `t` is disengaged.
}
Even when passed to a function that does not take an owning reference parameter, the object being relocated with the reloc operator will still be destroyed, because the rlvalue result of the reloc operator has taken ownership and expires at the end of the full-expression:
void takeVal(T&& t);
void f5(T~ t) {
    takeVal(reloc t);  // Object is now owned by the rlvalue result of `reloc t`;
                       // `~T()` is called at the end of the full-expression.
    t.doSomething();   // Ill formed because `t` is disengaged.
}
As with the other reference types and value categories, owning references and rlvalues participate in reference collapsing and have special overload resolution rules as well as certain other behaviors specific to owning references. These details follow many of the intuitions that are built up by starting from lvalue references and considering increased levels of rights that are granted to the holder of the reference. We expand on this reasoning below.
The functionality of an owning reference is a fundamental subset of the functionality of an automatic variable. An automatic variable owns both the storage for an object and the object’s lifetime (and therefore destroys and deallocates the object when it goes out of scope); one could conceive of an entity that represents the responsibility for destroying an object, but has no responsibility for its storage. An owning reference is precisely such an entity, and we simply propose having every automatic variable be associated with an owning reference to the object that exists within the storage that the automatic variable represents. These owning references can, like all other owning references, be disengaged with the reloc operator:
void f6() {
    T at;
    {
        T~ r = reloc at;  // `at` is disengaged; `r` now owns the `T` object.
        // `r` goes out of scope and calls `~T()`.
    }
    at.doSomething();  // Ill formed
    // `at` goes out of scope and its storage is deallocated.
}
For objects with dynamic storage duration, it is also helpful to have a way to give the responsibility for destruction of that object back to the compiler by creating an owning reference to the specified object:
void f7(T& t) {
    T~ ot = static_cast<T~>(t);  // Creation of owning reference is forced.
    // `ot` goes out of scope and destroys its referent.
}
Similarly to std::move, this same static_cast is provided with a more explicit name by the std::force_relocate function (Section 6.9.3).
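Current C++ has no statically tracked owning reference, but the division of responsibility described above (an entity that destroys an object without owning its storage) can be approximated at run time. The sketch below is only an emulation under stated assumptions: OwningHandle, Tracked, and demo_owning_handle are hypothetical names introduced for illustration, and engagement is tracked with a nullable pointer at run time rather than statically, as the proposal requires.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical type that counts destructor calls so the effect is observable.
struct Tracked {
    static inline int destroyed = 0;
    ~Tracked() { ++destroyed; }
};

// Hypothetical runtime emulation of an owning reference. Unlike the proposed
// `T~`, engagement is tracked at run time with a nullable pointer.
template <class T>
class OwningHandle {
    T* obj_;  // null when disengaged
public:
    explicit OwningHandle(T& obj) : obj_(&obj) {}
    // Transfers ownership, loosely analogous to `T~ b = reloc a;`.
    OwningHandle(OwningHandle&& other) noexcept
        : obj_(std::exchange(other.obj_, nullptr)) {}
    OwningHandle(const OwningHandle&) = delete;
    OwningHandle& operator=(const OwningHandle&) = delete;
    // Destroys the object if still engaged; never deallocates its storage.
    ~OwningHandle() {
        if (obj_) std::destroy_at(obj_);
    }
    bool engaged() const { return obj_ != nullptr; }
};

// Loosely mirrors `f7`: ownership of an existing object is forced, passed on
// once, and the object is destroyed exactly once. Returns the destructor count.
int demo_owning_handle() {
    Tracked::destroyed = 0;
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];
    Tracked* p = ::new (storage) Tracked;
    {
        OwningHandle<Tracked> a(*p);
        OwningHandle<Tracked> b(std::move(a));
        assert(!a.engaged() && b.engaged());
    }  // `b` destroys the object; `a` is disengaged and does nothing.
    return Tracked::destroyed;
}
```

The emulation pays for a runtime null check and still permits use of a disengaged handle; those are the two costs that static tracking of engagement is meant to remove.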
The fundamental ability of an owning reference to pass ownership of an object from one function to another naturally lends itself to a mechanism to define the relocation operation for a type: it is a constructor that takes an owning reference of the same type:
struct S {
    S(S~ s) = default;  // relocation constructor
};
As with all other special member functions, this constructor may be implicitly defined, and we propose rules that govern its defaulted definition. It is often helpful to think of relocation as first performing a move construction and then destroying the source object, as if we had delegated the implementation of S(S~) to S(S&&):
struct S {
    S(S~ s) : S(std::move(s))  // New object is move-constructed from xvalue denoting source object.
    {
        // `s` goes out of scope and destroys the source object.
    }
};
In cases where both the constructor selected for an rvalue argument and the destructor are not user-provided, the defaulted definition performs memberwise relocation.
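The move-then-destroy mental model can be written down directly in current C++. The following is a minimal sketch, assuming raw storage managed by the caller; relocate_at and demo_move_and_destroy are hypothetical names used only for illustration and are not part of the proposal.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper emulating "relocation = move-construct + destroy" in
// current C++. The name `relocate_at` is an assumption for illustration.
template <class T>
T* relocate_at(T* src, void* dst) {
    T* result = ::new (dst) T(std::move(*src));  // move-construct the new object
    std::destroy_at(src);                        // end the source object's lifetime
    return result;
}

// Relocates a std::string from one raw buffer to another and returns a copy
// of the relocated value.
std::string demo_move_and_destroy() {
    alignas(std::string) unsigned char from[sizeof(std::string)];
    alignas(std::string) unsigned char to[sizeof(std::string)];

    std::string* src = ::new (from) std::string("relocated value");
    std::string* dst = relocate_at(src, to);  // `src` must not be used afterwards

    std::string copy = *dst;
    std::destroy_at(dst);  // the destination object is destroyed exactly once
    return copy;
}
```

Nothing in today's language stops later code from touching `src` after the call; the proposed static disengagement is what would make such a use ill formed.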
Destructors currently have a unique interaction with ownership. When an object’s destructor begins, all of its member and base class subobjects are scheduled to be destroyed when the destructor’s body exits. In effect, as soon as the lifetime of the object ends (i.e., the destructor call begins), the destructor’s body acquires ownership of each subobject.
We propose that when an object’s destructor begins execution, hidden owning references are declared to refer to each of the object’s member and base class subobjects:
struct B {};
struct D : B {
    S d_s;
    ~D()
    {
        // Hidden owning reference declared to `static_cast<B&>(*this)`.
        // Hidden owning reference declared to `this->d_s`.
        // Statements written by the programmer go here.
        // `d_s` is destroyed when its owning reference goes out of scope.
        // `B` subobject is destroyed when its owning reference goes out of scope.
    }
};
Certain expressions that reference member and base class subobjects are instead considered to name the corresponding owning references, which may then be relocated:
void takeS(S~ s);
D::~D() {
    takeS(reloc d_s);  // Ownership of `d_s` is passed to `takeS`.
    // `d_s` is not destroyed a second time; the owning reference is already disengaged.
    // `B` subobject is destroyed when its owning reference goes out of scope.
}
This deconstruction of an object into owning references to its subobjects, which always occurs implicitly in a destructor, can be explicitly triggered in other contexts using the reloc_begin_destruction operator:
void f8() {
    D d;
    reloc_begin_destruction d;
    // `d` still refers to same object, but `~D` will not be invoked;
    // owning references exist for `d.d_s` and `static_cast<B&>(d)`.
    reloc d.d_s;  // Hidden owning reference referring to `d.d_s` is disengaged.
    // `B` subobject is destroyed.
}
The reloc_begin_destruction operator allows a relocation constructor to decompose a source object and relocate the source object’s subobjects to initialize the destination object’s subobjects:
D::D(D~ source) {
    reloc_begin_destruction source;
    this : B(reloc static_cast<B&>(source)),
           d_s(reloc source.d_s);
    // Owning references to `d_s` and `B` subobjects are already disengaged;
    // no implicit destructor calls occur at the end of scope.
}
Various use cases for this flexibility of ordering and the ability to relocate members will be explored below.
Our proposal is layered in three parts, with each part dependent on only the previous parts. Later parts could easily be delayed for future Standards while reaping a subset of the benefits and expressivity with a smaller initial feature.
Part I introduces owning references and the core language rules governing their behavior that are needed to support defaulted relocation constructors, which the authors believe will suffice for the vast majority of use cases that can benefit from nontrivial relocation.
Part II is a minimal extension to enable users to write their own relocation constructors. A new syntax is proposed to allow the compiler to track which subobjects have been relocated from within the ctor-initializer of a user-defined relocation constructor.
Part III builds on top of Part II to provide further usability benefits in the implementation of relocation constructors and to enable destructors to relocate subobjects.
Appendix A discusses extensions that depend only on Part I but are not included in any of the three main parts because of their potential impact upon existing code. Appendix D discusses a further usability improvement that could be added on top of Part I, II, or III, but the main proposal does not depend on Appendix D.
For every object type T, we propose the introduction of a type called owning reference to T, which is denoted by T~. An owning reference binds only to a value category known as the rlvalue. An rlvalue denotes an object that can be relocated from, just as an xvalue denotes an object that can be moved from. (See Appendix C for a discussion of alternative names.)
A variable of owning reference type may be either “engaged” or “disengaged.” An engaged owning reference owns an object: When the owning reference’s lifetime ends, the object it owns is destroyed. (If the owning reference is a function parameter, the implicit destructor call at the end of the reference’s lifetime is performed in callee context because the caller cannot know whether the callee has disengaged its owning reference parameter.) Whether a particular owning reference is engaged or disengaged at a particular program point is always known statically according to the rules that we will describe later in this section, and no runtime flags need to be maintained to track that status.
The name of a variable of type T~ is an lvalue, just like the name of any other reference variable. The lvalue can be converted to an rlvalue using the reloc operator, which will be discussed in more detail later in this section. The resulting rlvalue is then engaged, and the original variable is disengaged. An id-expression that names a disengaged owning reference is ill formed. If a variable of owning reference type is disengaged along some paths of control flow, it is implicitly disengaged at the end of all other branches (i.e., immediately before they rejoin the branch containing the explicit disengagement), as necessary, to ensure that it is known to be disengaged when the branches rejoin.
struct T {
    int m;
};
void g(T& x);
void f(T~ ref) {  // `ref` is engaged and owns some object.
    g(ref);  // OK; `ref` is an lvalue.
    T~ ref2 = reloc ref;
    // `ref` is disengaged; `ref2` is engaged and owns the object.
    g(ref);   // ill formed; `ref` is disengaged
    ++ref.m;  // ditto
    g(ref2);  // OK
    if (rand() % 2) {
        {
            T~ ref3 = reloc ref2;
            // `ref3` is engaged and `ref2` is disengaged.
            // `ref3`'s lifetime ends here; `ref3.~T()` is called.
        }
        g(ref2);  // error
    } else {
        g(ref2);  // OK
        // `ref2` is implicitly disengaged here; `ref2.~T()` is called.
    }
    g(ref2);  // error
}
See also Section 6.8 below.
A relocation constructor is a nontemplate constructor of a class T whose first parameter is of type T~ and all of whose remaining parameters (if any) have default arguments. From the previous paragraph, it is apparent that when the constructor’s parameter is initialized, the parameter has unique ownership of the object it refers to, and any other rlvalue referring to the same object will have become disengaged. A relocation constructor, like any other constructor, creates an object. Because the T~ parameter’s lifetime ends at the closing brace of the constructor, the source object’s lifetime will end by the time the relocation constructor returns.
Certain types will have implicitly declared relocation constructors that are declared and defined in a similar manner to other special member functions but are unconditionally noexcept.
For a class type C such that overload resolution for direct-initializing C from an xvalue of C succeeds and finds a nondeleted constructor and C has a nondeleted destructor, the declared relocation constructor will have the more restrictive of the access levels of those two functions.
Throwing relocation constructors raise difficult specification problems. When a relocation constructor throws, some of the source object’s subobjects will have been destroyed already, and destroying the remaining subobjects might not be safe because the order of destruction in a relocation constructor is opposite to the usual order of destruction (i.e., in a destructor). Like throwing move constructors, throwing relocation constructors are likely to cause problems for authors of generic code. For these reasons, we do not propose to allow throwing relocation constructors at this time.
The behavior of an (implicitly or explicitly) defaulted relocation constructor is described in the following list.
std::memcpy. As with any other relocation constructor, a trivial relocation constructor ends the lifetime of the source object. However, a trivial relocation constructor does not actually call the destructor for the source object, because the notion of trivial relocatability is that performing a bitwise copy followed by forgetting to destroy the old object has the desired semantics.
If C is a class type described by the second bullet in the previous list:
If both the constructor selected for direct-initialization of C from an xvalue of C and the destructor of C are defaulted at their first declaration and if all direct subobjects and virtual base class subobjects of C are relocatable, the defaulted relocation constructor for C performs a memberwise relocation of C’s direct subobjects and virtual base class subobjects. (The rationale is that we can assume that such a class does not need to be patched up after a memberwise relocation; any such required patchups would have to be performed by the constructor selected for move-construction, which would therefore have to be user provided.)
Otherwise, the relocation constructor for C behaves as if it delegates to the constructor of C that would be selected to perform a move:
T(T~ source) : T(static_cast<T&&>(source)) {}
After the target constructor returns, the lifetime of the source parameter ends; according to the rules described in Section 6.1, the source object’s destructor is then implicitly invoked, implying that the relocation constructor implements move-and-destroy semantics.
An rlvalue of type cv1 T can be implicitly converted to an rlvalue of type cv2 T if cv2 is more cv-qualified than cv1.
An rlvalue of type T can be implicitly converted to T&&. This conversion occurs automatically when the rlvalue expression is the left operand of the . or .* operator.
A prvalue of object type T can be implicitly converted to an rlvalue of type T, which has the effect of materializing a temporary that is owned by the resulting rlvalue. During overload resolution, this conversion is considered better than binding to an rvalue reference or const lvalue reference. For example:
void foo(T~ r);   // 1
void foo(T&& r);  // 2
int main() {
    foo(T{});
}
A temporary of type T is materialized and converted to an rlvalue. Then the overload marked // 1 is called, and r is bound to the resulting rlvalue. When the owning reference r is destroyed at the end of its lifetime, it implicitly destroys the temporary object (unless r was disengaged prior to the end of its lifetime). The binding of the rlvalue to the temporary object extends the storage lifetime for the object in the same manner as the binding of any other reference and suppresses the implicit destruction of the temporary object at the end of the full-expression in which it was created; the rlvalue has ownership and is responsible for destroying the object.
There is no implicit conversion from D~ to B~, where B is a base class of D. Allowing such a conversion would allow the referenced object to be passed to B’s relocation constructor, leaving the complete D object in a partially destroyed state. Since such a conversion is not permitted, the implicit destructor call at the end of the lifetime of an engaged owning reference does not perform dynamic dispatch.
A glvalue of type T can be explicitly converted to T~ by static_cast. This generates an rlvalue referring to the object that the glvalue refers to, which implies that this rlvalue will be responsible for destroying the object, and the caller must ensure that they do not otherwise destroy the object. Such casts should therefore generally be used only with objects that have dynamic storage duration.
An rlvalue of type T decays to simply T when deduced by value. An owning reference behaves like any other reference when named by a simple-capture. As when capturing a variable of object type, the programmer must ensure that the lambda closure object does not outlive the captured entity, lest the reference become dangling. A lambda closure object cannot have an owning reference member, for reasons that are discussed in Section 6.7.
struct T {
    int m;
};
template <class U>
void g(U u);
void f1(T~ ref) {
    g(reloc ref);
    // Calls `g<T>`, not `g<T~>`;
    // the parameter `u` is relocated from the object that `ref` refers to.
    // `ref` is disengaged and the lifetime of the object `ref` refers to ends.
    g(reloc ref);  // ill formed
}
auto f2(T~ ref) {
    auto result = [ref] { return ref.m; };
    // The closure type has a member of type `T`, which is *copied* from the
    // object that `ref` refers to.
    g(reloc ref);  // OK; `ref` was not previously disengaged.
    return result;
}
auto f3(T~ ref) {
    auto result = [&ref] { return ref.m; };
    g(result);      // OK
    g(reloc ref);   // OK; `ref` was not previously disengaged.
    return result;  // UB; the reference is now dangling.
}
We propose to change the current model of automatic variables to allow them to be relocated using the reloc operator. Automatic variables that are not passed to the reloc operator will continue to be implicitly destroyed upon scope exit, just as they always have been, though that mechanism now becomes defined in terms of owning references.
To accomplish these objectives, we propose that for each automatic variable, x, an implicit owning reference (call it __x~) is considered to be declared immediately after the locus of x’s declaration in the same scope. Immediately after its declaration, __x~ is engaged and owns x. The variable x is no longer inherently implicitly destroyed when it goes out of scope, but since __x~ owns x, it will destroy x upon scope exit, unless some other owning reference takes over ownership of x first or an rlvalue referring to x has been passed to a relocation constructor or otherwise disengaged. The reference __x~ cannot be named directly but is needed to define the reloc operator (see below). An id-expression naming x is ill formed if __x~ is disengaged.
Although one might occasionally want to construct a new object in the storage location designated by x and re-engage __x~ to that object, we do not currently propose to allow x to be named for such purposes, nor do we propose any method by which a disengaged owning reference can be re-engaged, because of the complexity of specifying such a feature and because the safety of doing so isn’t clear. See Appendix A for more discussion.
struct T {
    int m;
};
int main() {
    T x = {0};
    T y;
    T~ r = reloc x;  // `__x~` is disengaged; `r` owns `x`.
    ++x.m;           // ill formed; `__x~` is disengaged.
    ++r.m;           // OK; `r` is an lvalue.
    // `r` goes out of scope and destroys `x`.
    // `__y~` goes out of scope and destroys `y`.
    // `y` goes out of scope; `~T()` is not implicitly called.
    // `__x~` goes out of scope and does nothing since already disengaged.
    // `x` goes out of scope; `~T()` is not implicitly called.
}
The reloc operator is used to obtain an rlvalue expression that owns a given entity and to disengage the previous owner. For these purposes, it may be applied to the following categories of id-expressions.
For a variable x of type T~ belonging to a block scope or function parameter scope associated with the immediately enclosing function definition, the result is an rlvalue referring to the object that x referred to; x is thereby disengaged.
For a variable x of object type belonging to a block scope associated with the immediately enclosing function definition, the result is reloc __x~.
We do not propose allowing reloc to be applied to a reference variable that is extending the lifetime of the temporary object it is bound to, because the reference might be to a subobject of the temporary object. See Section 6.3.
Because some ABIs require function parameters of object type to be destroyed on the caller side, applying reloc to the names of such parameters is not permitted in general; if the programmer wishes to relocate from a function parameter, they should ensure that the function parameter is declared with type T~ rather than T. However, as an optional add-on to Part I, we propose adopting an idea from [D2785]: If T is a relocate-only type (i.e., a type that has no eligible copy constructor and no eligible move constructor but does have an eligible relocation constructor), then it is permitted to relocate from a T function parameter (implying that callee-destroy is required for such types, which currently do not exist).
Because T~ is a reference type, there shall be no pointers to T~, references to T~, or arrays of T~. Writing out a type such as T~& directly is ill formed. However, owning references participate in reference collapsing.
These reference collapsing rules follow the “principle of lesser privilege” that currently governs the collapsing of lvalue and rvalue references. Owning references give the most privileges (the holder is permitted to destroy the object it refers to, possibly relocating its value to another object), followed by rvalue references (the holder is permitted to take ownership of the held resources, leaving the object in a moved-from state, but is not permitted to destroy the object), and lvalue references.
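For lvalue and rvalue references, the principle of lesser privilege is already observable in current C++: whenever an lvalue reference is involved, the collapsed type is an lvalue reference. The sketch below checks only these existing rules. The alias templates and the function name are hypothetical helpers introduced for illustration, and the rules involving T~ cannot be checked with today’s compilers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical aliases that add a reference to whatever T is, triggering
// reference collapsing when T is itself a reference type.
template <class T> using AddLvalueRef = T&;
template <class T> using AddRvalueRef = T&&;

// Returns true if, in current C++, the reference granting fewer privileges
// wins whenever two reference kinds are combined.
constexpr bool collapsing_follows_lesser_privilege() {
    return std::is_same_v<AddLvalueRef<int&>, int&>     // &  + &  -> &
        && std::is_same_v<AddLvalueRef<int&&>, int&>    // && + &  -> &
        && std::is_same_v<AddRvalueRef<int&>, int&>     // &  + && -> &
        && std::is_same_v<AddRvalueRef<int&&>, int&&>;  // && + && -> &&
}

// The same property is also verified at compile time.
static_assert(collapsing_follows_lesser_privilege());
```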
It follows that in the presence of owning references, forwarding references should be spelled “T~”, where T is a template parameter of a function that has a parameter of type T~. The template argument for T is then deduced as an lvalue reference, rvalue reference, or nonreference when the function argument is, respectively, an lvalue, xvalue, or rlvalue. (Since, as discussed in Section 6.3, a prvalue of type U prefers to be bound to U~ rather than U&&, using such a prvalue as the function argument will also result in T being deduced as U.) A forwarding reference that is spelled T&& can bind to an rlvalue of type U but cannot forward it as an rlvalue; the function parameter type will be U&&, not U~.
The issue of how to actually perform forwarding (which is typically done using an expression of the form std::forward<T>(r), static_cast<T&&>(r), or static_cast<decltype(r)>(r) in current C++) is thorny. When r is an owning reference, reloc must be used so that the disengagement of r that must be performed at the call site is visible to the compiler. However, it is essential to support a single syntax that perfectly forwards r regardless of whether it is an lvalue reference, rvalue reference, or owning reference; any alternative that would force users to implement a compile-time switch to call reloc on forwarding references of owning reference type, and an ordinary static_cast (or call to std::forward) in other cases, is not workable. For this reason, we propose to resurrect the proposal for a unary >> forwarding operator, which was described in [P0644R1] and rejected in Albuquerque (November 2017). When applied to a forwarding reference that is an owning reference, >> would be equivalent to reloc, and when applied to any other entity, >> would be equivalent to a static_cast as originally proposed. A function template that needs to perfectly forward one or more arguments would then take this form:
template <class T, class... Args>
foo<T> make_foo(Args~... args) {
    return foo<T>(>> args...);
}
As an alternative to the >> forwarding operator, we propose to adopt an idea from [D2785], wherein reloc can also be applied to lvalue references and rvalue references, not only to the entities described in the previous section. Using reloc as the forwarding operator, the above function template could be written:
template <class T, class... Args>
foo<T> make_foo(Args~... args) {
    return foo<T>(reloc args...);
}
The main disadvantage of reloc as the forwarding operator is that it would use the same keyword for two essentially distinct operators: an operator that disengages its operand to allow the compiler to track who owns a particular object and an operator that simply casts to lvalue reference or rvalue reference to facilitate perfect forwarding. To mitigate this disadvantage, we propose that when reloc is applied to an lvalue or rvalue reference, that operand shall be a forwarding reference. This restriction would not completely eliminate the inelegance and possible confusion arising from the use of reloc as the forwarding operator. For this reason, we believe that the unary >> operator would provide a better solution for perfect forwarding in the presence of owning references.
A third option is to specify that static_cast<T~>(r) implicitly applies reloc to r when r is a forwarding reference with declared type T~. The above function template could then be written:
template <class T, class... Args>
foo<T> make_foo(Args~... args) {
    return foo<T>(static_cast<Args~>(args)...);
}
This syntax is much more verbose than the >> and reloc syntaxes and would likely increase the popularity of FWD macros. We consider this outcome undesirable. We also believe that it is dangerous to allow the static_cast operator, which can accept any expression as an operand, to implicitly disengage its operand only when that operand has a very specific form. We do not propose this syntax for forwarding, but include it only for completeness.
We discuss some alternative specifications for forwarding references in Appendix B.
A variable of owning reference type must have automatic storage duration. The purpose of this rule is to make it harder to accidentally create an owning reference that later becomes dangling. The rules we propose make it ill formed to reference an owning reference of automatic storage duration after it has become disengaged; there does not seem to be a similar strategy to prevent such unsafe accesses to owning references of static and dynamic storage duration. In the particular case of dynamic storage duration, there is a considerable risk that an owning reference attempts to destroy an object whose storage has already been released or reused (e.g., a variable of automatic storage duration whose block has already been exited).
Because owning reference variables are required to have automatic storage duration, they are not permitted as nonstatic data members. (The alternative — namely to make classes containing nonstatic data members of owning reference type ineligible to have any storage duration other than automatic storage duration — would create more problems than it solves.)
In addition, ~ is not permitted as a ref-qualifier in a function declarator; no variable could take ownership in such a case (considering that this is a pointer).
Explicit object parameters are permitted to have owning reference type. Note that calling a function with such an explicit object parameter will usually result in the implicit destruction of the object argument:
struct S {
    /* ... */
    void self_destruct(this S~ self);
};
S s;
(reloc s).self_destruct();
We discuss an application for explicit object parameters of owning reference type in Part III.
Structured binding declarations are not permitted to have owning
reference type; they suffer from the same issue as
~
on an implicit object member
function: you can’t actually name the entity to which the
ref-qualifier applies (known as e in 9.6
[dcl.struct.bind]).
If all flow-of-control paths through a particular branch result in a jump that exits the scope to which an owning reference belongs, the other branches do not implicitly disengage the owning reference. The reason for this exception to the usual implicit disengagement rules is that the branch containing the jump cannot rejoin the other branches, so implicit disengagement is not required in the other branches to prevent a situation in which the owning reference may or may not be disengaged after such rejoining.
struct T {
    void method();
};
void g(T);

T f(bool b) {
    T t;
    if (b) {
        return reloc t;
    } else {
        t.method(); // OK
    }
    return reloc t; // OK
    // `__t~` is disengaged and goes out of scope.
    // `t` goes out of scope.
}
A jump construct is not permitted to jump from a point where an owning reference is disengaged to a point that follows the definition of the owning reference but precedes an id-expression naming the owning reference. An implicit jump from the end of a loop back to its beginning is considered to occur.
int g(int~ r);
int f1(int~ r) {
    while (true) {
        g(reloc r); // ill formed
    }
}
int f2(int~ r) {
    while (true) {
        int x = 0;
        g(reloc x); // OK
        if (rand() % 8 == 0) {
            g(reloc r);
            return; // OK
        }
    }
}
When an rlvalue expression is evaluated and is not otherwise disengaged by the end of the containing full-expression, the rlvalue is implicitly disengaged as part of the last step in evaluating the full-expression; the timing of this implicit disengagement is the same as the timing of the implicit destructor call for a hypothetical temporary object that was created at the point at which the rlvalue expression was evaluated. This implicit disengagement can occur, for example, when the rlvalue expression is a discarded-value expression or when it is converted to an xvalue instead of being used to initialize an owning reference variable.
void f(T~ ref) {
    U{}, (reloc ref), V{};
    // `V` object destroyed, then `ref.~T()` called, and then `U` object destroyed.
}
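The timing described above mirrors behavior that already exists in C++. The following sketch uses only standard C++ and is not part of this proposal; the type Noisy and the function destruction_order are hypothetical names introduced for illustration. It records the order in which three temporaries created in one full-expression are destroyed, with the middle temporary standing in for the referent of the rlvalue.

```cpp
#include <cassert>
#include <string>

// Hypothetical logging type: appends its tag to a log when destroyed.
struct Noisy {
    std::string* log;
    char tag;
    Noisy(std::string* l, char t) : log(l), tag(t) {}
    ~Noisy() { log->push_back(tag); }
};

// Creates three temporaries in a single full-expression and returns the
// order in which their destructors ran.
inline std::string destruction_order() {
    std::string log;
    // The comma operator evaluates left to right, so the temporaries are
    // created in the order U, R, V. All three live until the end of the
    // full-expression and are then destroyed in reverse order of creation.
    (void)Noisy{&log, 'U'}, (void)Noisy{&log, 'R'}, (void)Noisy{&log, 'V'};
    return log;
}
```

Under these assumptions, destruction_order() yields the sequence V, R, U, which is the same order that the comment in the example above gives for the `V` object, `ref.~T()`, and the `U` object.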
If an evaluation that disengages an owning reference variable is indeterminately sequenced or unsequenced relative to another evaluation in the same full-expression that names the owning reference variable (where an id-expression naming an automatic variable is considered to name its implicit owning reference for the purpose of this rule), the program is ill formed because we have no guarantee that the latter occurs before the former.
struct S {
    S(int x, int~ r);
};
void bar(int~ r) {
    S s1(r, reloc r); // Ill formed; `reloc r` may occur before the copy.
    S s2{r, reloc r}; // OK; `x` is copied from `r`, and then `reloc r` is evaluated.
}
See Part II for an example in which this rule must be carefully understood.
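The distinction between the two initializations above rests on an ordering guarantee that current C++ already provides: the initializer-clauses of a braced-init-list are evaluated in the order in which they appear, whereas function arguments in parentheses are indeterminately sequenced. The following sketch in standard C++ illustrates that guarantee; Pair, next_value, make_braced, and make_parenthesized are hypothetical names and are not part of this proposal.

```cpp
#include <cassert>

// Hypothetical type with a two-parameter constructor.
struct Pair {
    int first;
    int second;
    Pair(int a, int b) : first(a), second(b) {}
};

// Each call returns the next integer, so the results record evaluation order.
inline int next_value(int& counter) { return ++counter; }

// Braced initialization: the initializer-clauses are evaluated left to right.
inline Pair make_braced() {
    int counter = 0;
    return Pair{next_value(counter), next_value(counter)};
}

// Parenthesized initialization: the two argument evaluations are
// indeterminately sequenced, so either (1, 2) or (2, 1) is permitted.
inline Pair make_parenthesized() {
    int counter = 0;
    return Pair(next_value(counter), next_value(counter));
}
```

make_braced() always produces first == 1 and second == 2. For make_parenthesized(), only the sum of the two members is portable, which is the kind of uncertainty that makes `S s1(r, reloc r)` ill formed under the proposed rule.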
Because the evaluation of a ternary conditional expression entails
control flow, it performs implicit disengagement in the same manner as
an if
statement:
int bar(int~);
void foo(bool b) {
    int x = 0;
    int y = b ? bar(reloc x) : x;
    // OK; if `b` is false, `__x~` is implicitly disengaged after the third
    // operand is evaluated.
    int z = x; // Ill formed; `__x~` is disengaged.
}
Some of this section’s subsections propose new library facilities. The list is not exhaustive; should Part I move forward, a future revision of this paper will propose further library facilities.
return statements
The return x;
statements that
currently implicitly move will behave as if by
return reloc x;
instead. Note
that if no relocation constructor is available, the prvalue of
T~
will implicitly convert to an
rvalue of T
, so the move
constructor will be selected. The behavior of returning an object whose
type does not have a relocation constructor (or whose type has a
defaulted relocation constructor that is defined as deleted) will
therefore be unchanged by this rule.
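For reference, the existing implicit move that this rule builds on can be observed in current C++. The sketch below assumes C++17 and uses the hypothetical names Counted, pass_through, and moves_for_returning_parameter; none of them are part of this proposal. Returning a by-value function parameter is a case where copy elision is not permitted, so the return statement performs exactly one implicit move.

```cpp
#include <cassert>

// Hypothetical type that counts its copy and move constructions.
struct Counted {
    static inline int copies = 0;
    static inline int moves = 0;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};

// `x` is a function parameter, so the return statement cannot elide the
// construction of the result; it implicitly moves from `x` instead.
inline Counted pass_through(Counted x) {
    return x;
}

// Resets the counters, runs the experiment, and reports the number of moves.
inline int moves_for_returning_parameter() {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = pass_through(Counted{});
    (void)result;
    return Counted::moves;
}
```

Under the proposed rule, this same return statement would behave as if by return reloc x; and, because Counted would have no relocation constructor in this sketch, the move constructor would still be the one selected.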
If x
is an
id-expression (possibly parenthesized) naming an automatic
variable of type T~
belonging to
a block scope or function parameter scope associated with the
immediately enclosing function definition, the expression
x.~T()
destroys the referenced
object and disengages x
. (This
effect can also be achieved by evaluating
reloc x
in a discarded-value
expression context, but the pseudo-destructor syntax is more
evocative.)
std::force_relocate function
We propose that a library function,
std::force_relocate
, shall be
provided by <utility>
:
template <class T>
constexpr T~ force_relocate(T&& r) {
    return static_cast<T~>(r);
}
The std::force_relocate
function can be used by, e.g., a
std::vector
-like container when
reallocating. Let’s look at an example of how such reallocation can be
performed. The reallocation does not suppress any implicit destructor
call that would occur for its argument; the caller must remember not to
destroy the source object separately.
template <class T>
void my_vector<T>::reallocate(size_type new_capacity) {
    T* new_buf = std::allocator_traits<Alloc>::allocate(alloc_, new_capacity);
    for (size_type i = 0; i < size_; i++) {
        ::new (static_cast<void*>(new_buf + i)) T(std::force_relocate(buf_[i]));
    }
    capacity_ = new_capacity;
    buf_ = new_buf;
}
(Factory functions such as std::allocator_traits<Alloc>::construct
should be updated to accept a pack of the new forwarding reference,
Args~...
. We have not yet
enumerated all Standard Library function templates to which this change
should be made. After the Standard Library function templates are
updated with this change, the above placement-new expression should be
replaced by a call to std::allocator_traits<Alloc>::construct
.)
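For comparison, the closest approximation available in current C++ pairs a move construction with an explicit destructor call. The sketch below is an illustration under stated assumptions, not the proposed facility: relocate_at, reallocate_buffer, and demo_reallocate are hypothetical names, std::allocator stands in for the container's allocator, and T's move constructor is assumed not to throw.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Emulates relocation: move-construct the destination, then destroy the source.
template <class T>
void relocate_at(T* source, T* destination) {
    ::new (static_cast<void*>(destination)) T(std::move(*source));
    std::destroy_at(source);
}

// Transfers `size` elements into a new buffer of `new_capacity` elements and
// releases the old buffer. Assumes that moving a T does not throw.
template <class T>
T* reallocate_buffer(T* old_buf, std::size_t size,
                     std::size_t old_capacity, std::size_t new_capacity) {
    std::allocator<T> alloc;
    T* new_buf = alloc.allocate(new_capacity);
    for (std::size_t i = 0; i < size; ++i) {
        relocate_at(old_buf + i, new_buf + i);
    }
    alloc.deallocate(old_buf, old_capacity);
    return new_buf;
}

// Grows a two-element buffer of strings to capacity four and joins the result.
inline std::string demo_reallocate() {
    std::allocator<std::string> alloc;
    std::string* buf = alloc.allocate(2);
    ::new (static_cast<void*>(buf)) std::string("alpha");
    ::new (static_cast<void*>(buf + 1)) std::string("beta");
    buf = reallocate_buffer(buf, 2, 2, 4);
    std::string joined = buf[0] + "," + buf[1];
    std::destroy_at(buf);
    std::destroy_at(buf + 1);
    alloc.deallocate(buf, 4);
    return joined;
}
```

In this emulation every source element passes through a moved-from state before it is destroyed; a relocation constructor would avoid that intermediate state.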
std::relocate_ptr smart pointer
We propose a smart pointer type that is similar to
std::unique_ptr
but can be only
relocated (not moved). Like
std::unique_ptr
, the smart
pointer type guarantees that the deleter it holds will eventually be
called to release the resources owned by the raw pointer it owns.
However, while a std::unique_ptr
can be accidentally dereferenced after it has been moved from (and
become null), a
std::relocate_ptr
cannot be
accessed in any way after it has been relocated from and omits the
release
and
reset
functions that can be used
to change its value to null. We expect that
std::relocate_ptr
can be used in
place of std::unique_ptr
in most
situations where std::unique_ptr
is currently used, leading to safer code.
std::disengage function
We propose a library function,
std::disengage
:
template <class T>
constexpr void disengage(T~) requires is_object_v<T>;
Calling disengage
(unsurprisingly) disengages the rlvalue argument and ends the lifetime
of the object to which it refers, without calling any
destructors or relocation constructors. (Therefore, the effect of
calling disengage
is different
from that of an implicit disengagement that is inserted by the
compiler along some branches of control flow; such implicit
disengagements always call the destructor.)
#include <utility>
struct T {
    int m;
};
int main() {
    T x{1};
    T& r = x;
    std::disengage(x); // OK; `x.~T()` not called.
    int y = x.m; // Ill formed; `x` is disengaged.
    int z = r.m; // UB; dangling reference
}
Although it is not particularly useful in Part I, std::disengage has the sole
effect of ending the lifetime of the object to which the rlvalue refers. This effect can
be easily misused to subvert RAII but may be useful in user-provided
relocation constructors; see Part II.
A user cannot implement
std::disengage
because it
behaves as if it stashes away an owning reference in some place where
the latter can live until the program terminates, which is not possible
in user code since all owning references have automatic storage
duration.
If Part I of this proposal is adopted, we expect that the vast majority of relocatable types will be trivially relocatable, and for the vast majority of nontrivially relocatable types, the defaulted relocation constructor (which will move then destroy) will do the right thing, because the necessary nontrivial work will already have been done when writing the move constructor and destructor. However, as relocate-only types become more common, so will class types that cannot be moved because they contain relocate-only subobjects. In some cases, patch-ups will need to be performed after memberwise relocation of these types, and since such types cannot be given a move constructor that performs the patch-ups, users must be able to write their own relocation constructor. In other words, if Part I is adopted without provisions to enable users to provide their own relocation constructors, relocation in C++ will become a victim of its own success. However, we propose Part II separately from Part I because specifying the semantics of user-provided relocation constructors involves additional complexities with less clear-cut solutions.
When users are allowed to write their own relocation constructors, the source object must not be implicitly destroyed, since the relocation operation takes the place of destruction. Therefore, the relocation constructor must ensure that each subobject of the source object is either relocated from or destroyed to avoid leaks. For usability and safety, we must ensure that destruction occurs automatically for each source subobject that is not relocated (i.e., the burden should not be on the user to remember to destroy them). (A relocation constructor thus offers the same guarantee with respect to its source object as a destructor, except it destroys the subobjects in the opposite order.) If the implicit destruction of subobjects that were not relocated does not occur in the ctor-initializer, then the body of the relocation constructor will see a source object that is partially alive. This situation is likely to result in unsafe code. The desire to prevent this situation leads to the conclusion that implicit destruction should occur in the ctor-initializer.
For the compiler to know which source subobjects to implicitly
destroy, there must be a mechanism for the compiler to know which
destination subobjects will be constructed by relocation from the
corresponding source subobjects. The [D2785] approach in this area is for
relocation constructors to implicitly relocate each destination
subobject from the corresponding source subobject unless the subobject
is explicitly named in the ctor-initializer. However, since we
have owning references in our proposal, we can support more general
constructors that do not have the exact signature
T::T(T~)
, and such constructors
can have additional parameters as well. This raises the question of
which such constructors should receive this implicit relocation
treatment. We explain a use case for such extended relocation
constructors below.
We propose the reloc
specifier (distinct from the
reloc
operator that was
introduced in Part I) that may be applied only to a parameter of a
constructor for type T
, where
the parameter must have type T~
and at most one parameter may have this specifier. The use of this
specifier marks the corresponding parameter to have its subobjects
implicitly relocated to the destination subobject unless overridden by
the ctor-initializer. The
reloc
specifier is an
implementation detail of the definition of the constructor and is not
part of the constructor’s signature. We recommend that it be omitted on
nondefining declarations of a constructor.
struct S {
    relocate_ptr<Foo> d_foo;
    std::mutex d_mutex;

    // S(S~) = default;
    // Would be defined as deleted because `std::mutex` is neither relocatable nor movable.
    S(reloc S~ src)
        : // `d_foo` is implicitly relocated from `src.d_foo`.
          // Explicit mem-initializer for `d_mutex` overrides implicit relocation.
          d_mutex()
          // Destructor of `src.d_mutex` is implicitly called here.
    {}
};
The reloc
specifier also
tells the compiler to implicitly call
std::disengage
on the owning
reference parameter when the ctor-initializer is left (either
because it has completed or because it was interrupted by an exception),
unless the constructor is a delegating constructor. When the
ctor-initializer completes normally, this implicit
disengagement is necessary because after the ctor-initializer
runs, each subobject of the T
object owned by the owning reference parameter will be either relocated
or destroyed; it would make no sense for the owning reference to remain engaged
and then destroy its referent at the end of the constructor, thus double-destroying the
object it would otherwise continue to own. When the
ctor-initializer is interrupted by an exception, implicit
disengagement is needed to ensure that the subobjects of the source
object that have already been relocated from or implicitly destroyed are
not destroyed a second time, since the entire source object’s destructor
would be called if the owning reference were not disengaged first.
The reloc
specifier is not
permitted in a delegating constructor because its semantics of
performing memberwise relocation and destruction do not make sense for a
constructor that does not itself initialize any subobjects.
Note that if the programmer does not mark a parameter
reloc
and instead attempts to
manually relocate from one of its subobjects in the
ctor-initializer, the compiler will tell the programmer that
the reloc
operator can be
applied to only an id-expression, not to the class member
access or cast expression that they would need to write to reference the
subobject they are trying to relocate from. We recommend that
implementations try to provide a helpful diagnostic in such cases:
struct S {
    T d_foo;
    S(S~ other) : d_foo(reloc other.d_foo) {
    //                  ^^^^^^^^^^^^^^^^^
    // Possible error message:
    // "The `reloc` operator may only be applied to the name
    // of a variable; to relocate from subobjects of `other`,
    // declare `other` with the `reloc` specifier".
        std::cout << "S(S~)\n";
    }
};
A possible extension (that we are not currently proposing) is to
extend the reloc
specifier
(possibly spelled differently) to other kinds of parameters (typically
for copy and move constructors), with the same meaning of “use this
parameter as the default source for initialization of subobjects” (i.e.,
by copy or move depending on the parameter type). The implicit
disengagement would still apply to only owning references.
For the same reasons that we propose that relocation constructors be
implicitly noexcept
when
implicitly declared or when explicitly defaulted on their first
declaration (see Part I), we also propose that every constructor having
a parameter that is declared
reloc
be implicitly
noexcept
and that it be a
diagnosable error if such a constructor is declared
noexcept(false)
.
struct S1 {
    T d_foo;
    S1(reloc S1~ other) {} // `S1::S1(S1~)` is implicitly `noexcept`.
};
struct S2 {
    T d_foo;
    S2(reloc S2~ other) noexcept(false) {} // Ill formed.
};
struct S3 {
    T d_foo;
    S3(S3~ other); // `S3::S3(S3~)` is not implicitly `noexcept`.
};
S3::S3(reloc S3~ other) {} // Ill formed: this definition is implicitly
                           // `noexcept`, and doesn't match the declaration.
                           // We recommend that implementations try to provide
                           // a useful error message here suggesting to add
                           // `noexcept` in the declaration.
struct S4 {
    T d_foo;
    S4(S4~ other) noexcept;
};
S4::S4(reloc S4~ other) {} // OK
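The situation for S3 and S4 has a close analogue in current C++, where the exception specification of a constructor defaulted on its first declaration is deduced, while a user-provided constructor is potentially throwing unless declared noexcept on every declaration. The following sketch shows that analogue with move constructors; the types Defaulted, UserProvided, and Declared are hypothetical and are not part of this proposal.

```cpp
#include <cassert>
#include <type_traits>

// Defaulted on first declaration: the exception specification is deduced
// from the members, so this move constructor is `noexcept`.
struct Defaulted {
    int value = 0;
    Defaulted() = default;
    Defaulted(Defaulted&&) = default;
};

// User-provided without `noexcept`: potentially throwing.
struct UserProvided {
    int value = 0;
    UserProvided() = default;
    UserProvided(UserProvided&& other) : value(other.value) {}
};

// User-provided; the declaration and the out-of-class definition must agree.
struct Declared {
    int value = 0;
    Declared() = default;
    Declared(Declared&& other) noexcept;
};
inline Declared::Declared(Declared&& other) noexcept : value(other.value) {}

static_assert(std::is_nothrow_move_constructible<Defaulted>::value,
              "deduced as noexcept");
static_assert(!std::is_nothrow_move_constructible<UserProvided>::value,
              "potentially throwing");
static_assert(std::is_nothrow_move_constructible<Declared>::value,
              "explicitly noexcept");
```

Omitting noexcept from only the definition of Declared's move constructor would be rejected by the compiler, which parallels the mismatch diagnosed for S3 above.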
A small vector is a class template that provides inline
storage for up to N
objects of
type T
, where
N
is a template parameter, and that
either employs dynamic memory allocation when the user attempts to store
more than N
objects or causes
the operation to fail (e.g., by throwing an exception or terminating the
program).
If Part I of this proposal is accepted, library authors might
implement a small_vector
template that supports relocate-only types. Such a
small_vector
would itself be
relocate-only. Because a
small_vector
stores its elements
inline, the relocation of a
small_vector
object invalidates
all iterators into that object.
Consider now a struct S
that
holds both a
small_vector<T>
(where
T
is a relocate-only type) and
an iterator into that
small_vector<T>
.
S
will not be trivially
relocatable, since the iterator member must be patched up during
relocation, nor will S
have a
usable defaulted relocation constructor, since it is not movable (see
Section 6.2). The author of S
must implement a relocation constructor:
struct S {
    small_vector<T> d_v;
    small_vector<T>::iterator d_it;

    S(const S&) = delete;
    S& operator=(const S&) = delete;

    S(S~ src);
};
To relocate both d_v
and
d_it
correctly to the
destination object, the relocation constructor must compute the value
d_it - d_v.begin()
(call it
idx
) for the source object, and
then initialize the destination’s
d_it
member with
idx + d_v.begin()
. Because
d_v
will be implicitly relocated
by the ctor-initializer, the computation of
idx
cannot be deferred to the
compound-statement of the relocation constructor. It follows
that S::S(S~)
needs to compute
idx
and then immediately
delegate to another constructor that actually relocates
d_v
:
struct S {
    // other members previously described...
    S::S(size_t idx, reloc S~ src)
        : d_it(d_v.begin() + idx) {}
    S::S(S~ src) : S{src.d_it - src.d_v.begin(), reloc src} {}
}
A number of features of the above implementation are noteworthy:
- The delegating call uses braces rather than parentheses; parentheses would not guarantee the evaluation order needed to prevent src from being accessed after relocation (and would therefore be ill formed).
- If the reloc specifier is omitted for the two-parameter constructor, the compiler will attempt to default-initialize d_v and then destroy src.d_v at the end of the ctor-initializer, which is unlikely to be the semantics the programmer desired.
- Because of the reloc specifier, the construction of the d_v member by relocation from src.d_v is implicit. The explicit mem-initializer for d_it overrides the implicit relocation that would otherwise occur for d_it.
We believe that such subtleties make user-provided relocation
constructors an expert-only feature, and even experts are likely to err.
The additional features that we propose in Part III will simplify the
implementation of S
but at the
cost of further complexity in the language specification.
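The compute-then-delegate pattern used for S can be tried out in current C++ with a move constructor in place of a relocation constructor. The sketch below is only an approximation: Holder and demo_patch_up are hypothetical names, std::array stands in for small_vector because it also stores its elements inline, and std::move merely casts, so the ordering hazard discussed above does not arise here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a class that holds inline storage together with
// an iterator into that storage.
struct Holder {
    std::array<int, 4> d_v;
    std::array<int, 4>::iterator d_it;

    Holder(std::array<int, 4> v, std::size_t idx)
        : d_v(v), d_it(d_v.begin() + idx) {}

    // Step 1: compute the index while `src` is intact, then delegate.
    Holder(Holder&& src)
        : Holder(static_cast<std::size_t>(src.d_it - src.d_v.begin()),
                 std::move(src)) {}

    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

private:
    // Step 2: transfer the elements, then rebuild the iterator so that it
    // points into the new object's own storage.
    Holder(std::size_t idx, Holder&& src)
        : d_v(std::move(src.d_v)), d_it(d_v.begin() + idx) {}
};

// Moves a Holder and checks that the iterator was patched up correctly.
inline bool demo_patch_up() {
    Holder a({10, 20, 30, 40}, 2);
    Holder b(std::move(a));
    return b.d_it == b.d_v.begin() + 2 && *b.d_it == 30;
}
```

Unlike the relocation version, a mistake in this move-based version (for example, copying src.d_it directly) compiles without complaint and leaves the iterator pointing into the old object.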
The example given in Part II for a class containing a small vector of
a relocate-only type can be rewritten much more simply if we introduce a
feature that allows the construction of bases and members of the
destination object to be deferred until some point in the
compound-statement of its constructor. We propose that such
deferred construction be performed by a new kind of statement called a
delayed-ctor-initializer, consisting of
this :
and followed by a list of
mem-initializers and terminated by a semicolon:
struct S {
    small_vector<T> d_v;
    small_vector<T>::iterator d_it;

    S(const S&) = delete;
    S& operator=(const S&) = delete;

    S::S(reloc S~ src) {
        const size_t idx = src.d_it - src.d_v.begin();
        this : d_it(d_v.begin() + idx);
    }
}
When the definition of a constructor contains a delayed-ctor-initializer, it shall not contain a ctor-initializer and shall not implicitly initialize bases and members prior to the constructor’s compound-statement.
Since C++11, no compelling need for delayed-ctor-initializers in the language has arisen because delegating constructors can be employed as an alternative. We believe that the example from Part II, with its various gotchas, demonstrates a case in which delegation is particularly difficult to use correctly and difficult to read when used correctly due to the interaction of delegation with owning references. Thus, we propose delayed-ctor-initializers as part of this paper. Note that we propose to allow delayed-ctor-initializers in all constructors, not just relocation constructors. We believe that judicious use of delayed-ctor-initializers can result in less error-prone implementation of move constructors. (When programmers introduce a bug related to evaluation order in delegating move constructors, they will not receive a compile-time diagnostic as they would for a relocation constructor like the one discussed in Part II, so although delayed-ctor-initializers are important in supporting user-defined relocation constructors, delayed-ctor-initializers could end up being used more widely in move constructors than relocation constructors.)
It shall be a diagnosable error to define a constructor in which control flow can pass through more
than one delayed-ctor-initializer.
To be more specific, suppose a hypothetical owning reference variable
named __r
were declared at the
very beginning of the constructor’s compound-statement, and
each delayed-ctor-initializer were replaced by
reloc __r;
. The constructor is
ill formed if the transformed version would be ill formed due to
potentially referencing __r
when
__r
is disengaged. Any implicit
disengagement of __r
and
destruction of its referent that would occur due to the rules about
owning references will instead result in the implicit execution of a
delayed-ctor-initializer of the form
this : ;
.
A subobject’s lifetime does not begin until the subobject is
initialized by the delayed-ctor-initializer. In order to make
delayed-ctor-initializers safer to use, we propose that a
potentially evaluated id-expression that would be transformed
into a member access to a subobject is ill formed if no
delayed-ctor-initializer precedes the point at which such
member access would occur. To be precise, such an id-expression
shall appear only within a delayed-ctor-initializer or at a
point where it would be ill formed to name
__r
. However, certain uses of
glvalues referring to yet-to-be-constructed subobjects have well-defined
behavior as specified in 6.7.3
[basic.life] and
11.9.5
[class.cdtor]. If the
programmer needs to form a glvalue referring to a nonstatic data member
m
prior to a
delayed-ctor-initializer in a constructor of class
C
, they can use the construct
this->*(&C::m)
.
The implicit relocation and disengagement semantics provided by the
reloc
specifier might not always
be desired. In some cases, the programmer might wish for more explicit
control. Consider, for example, an allocator-extended
relocation constructor that does not use the allocator from the source
object but instead uses an allocator supplied by the caller. That
allocator must then be passed down to the allocator-extended relocation
constructors of any subobjects that use allocators:
using allocator_type = ...;
struct S {
    allocator_type d_alloc;
    S(S~ src);
    S(allocator_type alloc, S~ src);
};
struct T {
    allocator_type d_alloc;
    S d_s;
    T(T~ src);
    T(allocator_type alloc, T~ src);
};
Implementing T
’s
allocator-extended relocation constructor using the tools provided by
Part II is not possible because that constructor needs some way to
execute a mem-initializer resembling
d_s(alloc, reloc src.d_s)
, but
src.d_s
isn’t an
id-expression, so under the rules in Parts I and II,
reloc src.d_s
is ill formed. As
we explained in Part II, such constructs are not permitted because they
make it impossible, in general, for the compiler to know which
subobjects of the source object must be prevented from being destroyed a
second time.
To allow this allocator-extended relocation constructor to be
implemented, we need to specify the meaning of
reloc src.d_s
. The intuition
behind the semantics of such an expression is that
reloc
acts upon an owning
reference with a known declaration, so
src.d_s
must behave as if it
names an owning reference variable, even though
src.d_s
is not one (nor could it
be, since owning references are not permitted as members). Furthermore,
if owning references are to exist to the subobjects of
src
, then at the point where
such owning references exist, there must not be an owning reference to
the complete object that is still planning on destroying it, lest the
subobjects of src
be destroyed
twice.
We must therefore have an operation that acts upon the owning
reference src
, such that after
this operation has been executed,
src
still refers to the same
object as it did before, and referring to
src
is still well formed, but
src
no longer intends to destroy
the object to which it refers. Such an owning reference is said to be
under destruction. We chose this name because the state of an
owning reference that is under destruction parallels the state of an
object whose destructor has begun execution (namely, its base and member
subobjects remain to be destroyed). The current “placeholder” syntax for
this operation is
reloc_begin_destruction src
.
(The identifier
reloc_begin_destruction
seems
unlikely to have been used in real code but is unappealing; we hope
to propose a better syntax eventually.)
The allocator-extended relocation constructor described at the beginning of this section can be implemented easily:
struct T {
    allocator_type d_alloc;
    S d_s;
    T(T~ src);
    T(allocator_type alloc, T~ src) {
        reloc_begin_destruction src;
        this : d_alloc(alloc),
               d_s(alloc, reloc src.d_s);

        // `src.d_s` goes out of scope and was already disengaged.
        // `src.d_alloc` goes out of scope and is destroyed.
    }
};
The small vector example from the previous section can be rewritten
so that it employs explicit relocation instead of the
reloc
specifier:
struct S {
    small_vector<T> d_v;
    small_vector<T>::iterator d_it;

    S(const S&) = delete;
    S& operator=(const S&) = delete;

    S::S(S~ src) {
        const size_t idx = src.d_it - src.d_v.begin();
        reloc_begin_destruction src;
        this : d_v(reloc src.d_v),
               d_it(d_v.begin() + idx);

        // `src.d_it` goes out of scope and is destroyed.
        // `src.d_v` goes out of scope and was already disengaged.
    }
}
Control flow that potentially calls
reloc_begin_destruction
twice on
the same owning reference is ill formed. Essentially,
reloc_begin_destruction r
is
permitted only if replacing all such evaluations for a given
r
with
reloc r
would not result in any
such invented reloc r
violating
the rules on use of owning references after disengagement. Implicit
calls to reloc_begin_destruction
are inserted as necessary (in a manner similar to implicit
disengagements) to ensure that whether an owning reference is under
destruction at a particular point is statically known.
reloc_begin_destruction
is
permitted outside of a constructor but only if every nonstatic data
member, direct base class, and virtual base class of the operand would
be accessible at the point where
reloc_begin_destruction
occurs.
reloc_begin_destruction src
must not only place src
under destruction, but also must initialize the subordinate owning
references src.d_alloc
and
src.d_s
. In general, there is
one such subordinate owning reference for each direct nonstatic data
member of object type and direct base class and one for each virtual
base class if src
is not itself
a subordinate owning reference (i.e., it refers to a complete object).
These subordinate owning references are declared in the same order in
which the subobjects would be constructed so that when the subordinate
owning references go out of scope, they destroy the corresponding
subobjects in the same order in which the destructor of the complete
object would destroy them.
At a particular point in the constructor, if
src
is under destruction, then a
member access expression naming a member of
src
, whose left operand is the
id-expression src
(possibly parenthesized), is instead considered to name the subordinate
owning reference corresponding to the named member. If
src
has a direct base class of
type B
, the syntax
static_cast<B&>(src)
names the subordinate owning reference corresponding to that base class
subobject. (The syntax is not
static_cast<B~>(src)
because the result is an lvalue; this is consistent with the idea that
the expression names an owning reference variable and that the
name of an owning reference variable is always an lvalue referring to
the owned object.)
struct S2 {
    std::string d_s1;
    std::string d_s2;
    std::string d_s3 = "not used for this object yet";

    S2(S2~ src) {
        reloc_begin_destruction src;
        // declares subordinate references to `src.d_s1`, `src.d_s2`, and
        // `src.d_s3`, in that order;
        // `src` is now under destruction
        this : d_s1(reloc src.d_s1)
               // disengages subordinate reference to `src.d_s1`
             , d_s2(reloc src.d_s2)
               // disengages subordinate reference to `src.d_s2`
             ;
        // initializes `d_s3` using its default member initializer
        std::cout << src.d_s1.size();
        // Ill formed: src.d_s1 names subordinate reference,
        // which is disengaged.
        std::cout << src.d_s3.size();
        // OK
        // Subordinate reference to `src.d_s3` goes out of scope and
        // destroys `src.d_s3`.
        // Subordinate reference to `src.d_s2` goes out of scope and is
        // already disengaged.
        // Subordinate reference to `src.d_s1` goes out of scope and is
        // already disengaged.
        // `src` goes out of scope, but it is under destruction so it does
        // not call `src.~S2()`.
    }
};
The meaning of src.d_s
depends on whether src
is under
destruction, so performing a member access through
src
at a point where control
flow is ambiguous as to whether
reloc_begin_destruction src
has
been evaluated is ill formed.
When a constructor containing a delayed-ctor-initializer
also has a parameter that bears the
reloc
specifier, the implicit
relocation and destruction semantics of the
reloc
specifier do not go into
effect until the delayed-ctor-initializer is executed.
struct S1 {
    T d_foo;
    T d_bar;
    S1(S1~ source) {
        if (rand() % 2) {
            reloc_begin_destruction source;
            this : d_foo(source.d_foo),
                   d_bar(0);
            // `source.d_foo` may no longer be referenced;
            // `source.d_bar` may still be referenced.
        }
        // implicitly:
        // else {
        //     reloc_begin_destruction source;
        //     this : ;
        // }
    }
};
struct S2 {
    T d_foo;
    T d_bar;
    S2(reloc S2~ source) {
        if (rand() % 2) {
            this : d_bar(0);
            // `d_foo` is implicitly initialized by relocation;
            // `source.d_foo` is destroyed by `T`'s relocation constructor
            // while `source.d_bar` is implicitly destroyed;
            // `std::disengage(reloc source)` is called implicitly.
        }
        // implicitly:
        // else {
        //     this : ;
        // }
    }
}
Evaluating
reloc_begin_destruction
for an
owning reference parameter that is declared with the
reloc
specifier is ill formed.
(The reloc
specifier, described
in Part II, implicitly performs a function that is very similar to
reloc_begin_destruction
prior to
entering the ctor-initializer. However, note that the implicit
destruction semantics afforded by the
reloc
specifier will destroy
subobjects of the source object in the opposite order from the source
object’s destructor, while the subordinate references declared by
reloc_begin_destruction
will go
out of scope in the same order as their corresponding subobjects would
be destroyed by the source object’s destructor.)
In Part II, if an exception interrupts a ctor-initializer
that has implicitly relocated one or more members due to the
reloc
specifier, the source
object may be left in a partially destroyed state. Preventing this
partially destroyed state from being observed was one of the motivations
for forcing all constructors that have a parameter with the
reloc
specifier to be
noexcept
, although it can still
be observed in a handler of the constructor’s
function-try-block, if any.
Subordinate owning references solve this problem. When a delayed-ctor-initializer is interrupted by an exception, constructed subobjects of the object being constructed are destroyed in reverse order of construction, just like in an ordinary ctor-initializer; then, in the absence of intervention by the programmer, any subordinate owning references referring to subobjects of the source object clean up subobjects that have not already been relocated from or otherwise had their lifetimes ended:
struct T;
T getT(T~); // may throw

struct S {
    T d_t1;
    T d_t2;
    T d_t3;
    T d_t4;
    S::S(S~ src) {
        reloc_begin_destruction src;
        // Implicitly declares:
        // T~ __r1; // owns `src.d_t1`
        // T~ __r2; // owns `src.d_t2`
        // T~ __r3; // owns `src.d_t3`
        // T~ __r4; // owns `src.d_t4`
        try {
            this : d_t1(reloc src.d_t1),
                   d_t2(getT(reloc src.d_t2)),
                   d_t3(reloc src.d_t3),
                   d_t4(reloc src.d_t4);
        } catch (...) {
            // All subobjects of `*this` were already destroyed by stack unwinding.
            // `src.d_t1` and `src.d_t2` were already relocated from.
            // `__r4` goes out of scope; `src.d_t4` is destroyed.
            // `__r3` goes out of scope; `src.d_t3` is destroyed.
            // `__r2` and `__r1` go out of scope, but are already disengaged.
            // Exception is implicitly rethrown, as in a _function-try-block_.
        }
    }
};
This destruction order could be problematic if members that are later
in declaration order (and are normally destroyed before members that are
earlier in declaration order) hold references to members that are
earlier in declaration order and have already been destroyed during the
execution of the delayed-ctor-initializer. In general, there
might not be a way to recover from this situation. However, subordinate
owning references give the programmer as much flexibility as possible.
Within the catch
block, the
programmer could for example force
src.d_t3
to be destroyed before
src.d_t4
(using the statement
reloc src.d_t3;
), or force
src.d_t4
to not be destroyed
(using
std::disengage(reloc src.d_t4);
).
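The unwinding behavior that a delayed-ctor-initializer shares with an ordinary ctor-initializer can already be observed in current C++. The following is a minimal sketch in standard C++ only, without owning references; the names `trace`, `Member`, `Holder`, and `runDemo` are hypothetical and exist purely for illustration. It shows that members constructed before the exception are destroyed in reverse order before the handler of a function-try-block runs, and that the exception is then rethrown implicitly.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical event log: "+x" marks construction of member x, "-x" its destruction.
inline std::string& trace() {
    static std::string log;
    return log;
}

struct Member {
    char d_id;
    explicit Member(char id, bool fail = false) : d_id(id) {
        if (fail) throw std::runtime_error("construction failed");
        trace() += '+';
        trace() += d_id;
    }
    ~Member() {
        trace() += '-';
        trace() += d_id;
    }
};

struct Holder {
    Member d_a;
    Member d_b;
    Member d_c;

    // The initializer of `d_c` throws after `d_a` and `d_b` have been constructed.
    Holder() try : d_a('a'), d_b('b'), d_c('c', true) {
    } catch (...) {
        // `d_b` and then `d_a` were destroyed before this handler was entered.
        trace() += '!';
        // The exception is implicitly rethrown when the handler ends.
    }
};

// Returns the recorded sequence of events.
inline std::string runDemo() {
    trace().clear();
    try {
        Holder h;
        (void)h;
    } catch (const std::runtime_error&) {
        trace() += '^';  // the rethrown exception reached the caller
    }
    return trace();
}
```

Calling `runDemo()` yields the string `+a+b-b-a!^`. What current C++ cannot express is the second half of the cleanup described above, in which the not-yet-relocated subobjects of the source object are destroyed by the subordinate owning references.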
When the reloc
specifier is
not used to declare a user-provided relocation constructor, the
user-provided relocation constructor, like any other user-provided
constructor, is implicitly
noexcept(false)
; it may also be
explicitly declared
noexcept(false)
. We recommend
that implementations issue a warning whenever a constructor for a class
C
has a parameter of type
C~
on which
reloc_begin_destruction
is
called, and that constructor is implicitly
noexcept(false)
. We expect that
in the vast majority of cases, such a function will have the signature
C::C(C~)
, and
noexcept(false)
will not be
intended. Should noexcept(false)
be desired, the programmer can silence the warning by writing
noexcept(false)
explicitly.
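The default that motivates the recommended warning is the same one that applies to user-provided move constructors today. As a rough illustration, the sketch below uses move constructors as a stand-in for relocation constructors (an assumption, since `C~` is not available in current C++); the class names are hypothetical.

```cpp
#include <type_traits>

// Stand-in for a class whose user-provided constructor omits `noexcept`.
struct ImplicitlyThrowing {
    ImplicitlyThrowing() {}
    ImplicitlyThrowing(ImplicitlyThrowing&&) {}  // implicitly noexcept(false)
};

// The same constructor with the exception specification spelled out.
struct ExplicitlyNothrow {
    ExplicitlyNothrow() {}
    ExplicitlyNothrow(ExplicitlyNothrow&&) noexcept {}
};

// A defaulted constructor has its exception specification deduced.
struct Defaulted {
    Defaulted() = default;
    Defaulted(Defaulted&&) = default;
};

static_assert(!std::is_nothrow_move_constructible<ImplicitlyThrowing>::value,
              "a user-provided constructor is noexcept(false) unless declared otherwise");
static_assert(std::is_nothrow_move_constructible<ExplicitlyNothrow>::value,
              "an explicit noexcept is honored");
static_assert(std::is_nothrow_move_constructible<Defaulted>::value,
              "a defaulted constructor of an empty class is deduced to be noexcept");
```

The first assertion is the case the warning targets: nothing in the declaration signals that the author considered whether the constructor can throw.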
Currently, when a destructor’s body completes, the destructor
implicitly calls the destructors of direct members, direct base classes,
and (only if the destructor is for a complete object) virtual base
classes, in reverse order of construction. We propose that the
responsibility for these destructor calls be transferred from the
destructor itself to a set of subordinate owning references that are
implicitly declared at the opening brace of the destructor body. In
effect, a destructor as it is currently written has a hidden owning
reference to the object it is destroying, and it begins by executing
reloc_begin_destruction
on that
hidden owning reference. An id-expression naming a direct
member of the destructor’s class implicitly names the corresponding
subordinate owning reference.
struct Base {};
struct Derived : Base {
  T d_member;
  ~Derived() {
    // Implicitly declared:
    //   Derived~ __self~;
    // Implicitly inserted by the compiler:
    //   reloc_begin_destruction __self~;
    // Equivalent to `reloc __self~.d_member`.
    reloc d_member;
    // Subordinate owning reference referring to `d_member` goes out of scope.
    // Subordinate owning reference referring to `Base` subobject
    // goes out of scope and calls `~Base()`.
    // `__self~` goes out of scope, but it is under destruction, so it does not
    // attempt to re-destroy the object to which it refers.
    // Under the new rules, `~Derived()` does not automatically call `~T()` and `~Base()`.
    // Subordinate owning references are responsible for destruction.
  }
};
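For comparison, the order in which a destructor in current C++ destroys members and base classes after its body completes can be made visible with a small logging sketch. This is standard C++ only; `destructionLog`, `Part`, and `runDestructionDemo` are hypothetical names, and the `Base` and `Derived` classes here are independent of the example above.

```cpp
#include <cassert>
#include <string>

// Hypothetical log that records the order of destructor activity.
inline std::string& destructionLog() {
    static std::string log;
    return log;
}

struct Part {
    const char* d_name;
    explicit Part(const char* name) : d_name(name) {}
    ~Part() {
        destructionLog() += d_name;
        destructionLog() += ' ';
    }
};

struct Base {
    ~Base() { destructionLog() += "Base "; }
};

struct Derived : Base {
    Part d_first{"first"};
    Part d_second{"second"};
    // After the body runs, members are destroyed in reverse declaration
    // order, followed by the base class subobject.
    ~Derived() { destructionLog() += "body "; }
};

// Returns the recorded destruction order for one `Derived` object.
inline std::string runDestructionDemo() {
    destructionLog().clear();
    {
        Derived d;
        (void)d;
    }
    return destructionLog();
}
```

`runDestructionDemo()` returns `body second first Base `. Under the proposed rules the same order would arise from the subordinate owning references going out of scope, unless the destructor body relocates from one of them first.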
Subordinate owning references to members enable, for example, a destructor to return a resource to a pool by relocation, where that resource is owned by a member of the destructor’s class:
struct S {
  relocate_ptr<Resource> d_resource;
  ResourcePool* d_pool;
  std::string d_name;

  ~S() {
    d_pool->release(reloc d_resource);
    // `ResourcePool::release` takes ownership of subordinate owning reference.
    // Subordinate owning reference corresponding to `d_name` goes out of scope
    // and destroys `d_name`.
    // Subordinate owning reference corresponding to `d_pool` goes out of scope.
    // Subordinate owning reference corresponding to `d_resource` is already disengaged.
  }
};
At times, it might be necessary for the destructor of a class
S
to explicitly refer to the
hidden owning reference that owns
S
. For example, if
S
has a base class
B
, how can the destructor refer
to the subordinate owning reference that owns the
B
subobject? We would like to
avoid giving special meaning to the expression
*this
;
*this
is not an
id-expression, so it does not name the object being
destroyed, and it would be counterintuitive for
static_cast<B&>(*this)
to be considered to name the subordinate owning reference.
According to this logic, reloc static_cast<B&>(*this)
is not valid.
We propose that the syntax
~T(this T~ self)
be permitted
for declaring a destructor. Note that destructors are currently not
permitted to use explicit object parameter syntax; e.g.,
~T(this T& self)
is not
permitted. We propose to permit a destructor to use explicit object
parameter syntax solely in the case where the parameter is an owning
reference to the class type to which the destructor belongs. In an
explicit object parameter destructor, the hidden owning reference
__self~
in an implicit object
parameter destructor declaration is replaced by the explicit owning
reference self
.
reloc_begin_destruction self
is
implicitly executed at the beginning of the destructor’s
compound-statement, and if
m
is a direct member of the
destructor’s class, either m
or
self.m
can be used to name the
subordinate owning reference. The syntax static_cast<Base&>(self)
must be used to name the subordinate owning reference corresponding to a
direct base class subobject of type
Base
.
struct ResourceBase {
  relocate_ptr<Resource> d_resource;
  ResourcePool* d_pool;
};
struct S : ResourceBase {
  std::string d_name;

  ~S(this S~ self) {
    // `reloc_begin_destruction self` is executed implicitly.
    reloc_begin_destruction static_cast<ResourceBase&>(self);
    // The subordinate owning reference corresponding to `ResourceBase`
    // is now itself under destruction and itself has two subordinate
    // owning references.
    d_pool->release(reloc d_resource);
    // `ResourcePool::release` takes ownership of subordinate subordinate
    // owning reference referring to `d_resource`.
    // Subordinate subordinate owning reference for `d_pool` goes out of scope.
    // Subordinate subordinate owning reference for `d_resource`
    // is already disengaged and goes out of scope.
    // Subordinate owning reference for `ResourceBase` goes out of scope;
    // it is under destruction and does not attempt to re-destroy the object to which it refers.
    // Subordinate owning reference for `d_name` goes out of scope and destroys `d_name`.
    // `self` goes out of scope; it is under destruction, so it does not
    // attempt to re-destroy the object to which it refers.
  }
};
Note that a subordinate owning reference does not have any special
properties, other than the fact that it can be named using syntax that
is derived from the syntax used to name the complete object’s owning
reference. It therefore follows that
reloc_begin_destruction
can
operate on a subordinate owning reference and produce subordinate
subordinate owning references which themselves have all the properties
of owning references. The above implementation could also be simplified
to avoid subordinate subordinate owning references, by giving an
explicit name to the owning reference that owns the
ResourceBase
base class:
S::~S(this S~ self) {
  auto~ base = reloc static_cast<ResourceBase&>(self);
  // `base` now owns the `ResourceBase` subobject; the subordinate owning
  // reference is now disengaged.
  std::cout << d_pool << std::endl;
  // Ill formed; the access to `d_pool` is through the subordinate owning
  // reference to the `ResourceBase` class, but that owning reference is
  // already disengaged.
  reloc_begin_destruction base;
  base.d_pool->release(reloc base.d_resource);
}
All destructors that have an explicit object parameter of owning
reference type are considered prospective (just like implicit object
parameter destructors) until the end of the class definition. Overload
resolution is then performed among all destructors that have an explicit
object parameter to select the one that is the most constrained. If both
a selected destructor with no parameters and a selected destructor with
an explicit owning reference parameter are present, the class definition
is ill formed. If the selected destructor has an explicit owning
reference parameter, any (explicit or implicit) call to that destructor
implicitly applies reloc
, as
necessary, to initialize the destructor’s parameter.
WG21 members have created many proposals for relocation in C++. We will first discuss the other known proposals for nontrivial relocation and explain why we are proposing the introduction of owning references while the other nontrivial relocation proposals make do without them. Afterward, we will discuss the interaction of our proposal with the two most recent trivial relocation proposals, which are known to be actively pursued by their authors.
[D2785] is most similar to our proposal
and introduces no additional types but does introduce a new kind of
prvalue obtained from relocating a glvalue: a prvalue that already has
storage backing it (as opposed to one that will construct an object into
the storage determined by the context). In effect, D2785 also proposes a
new value category but does not propose a generalized vocabulary for
manipulating expressions of this value category. The type
T~
in our proposal is a
reification of the fourth value category, just as
T&&
is a reification of
the xvalue category that was introduced in C++11.
We believe that the introduction of the rlvalue category — and of owning reference types that may bind to them — results in a conceptually simpler model than the model of D2785.
Owning references also provide practical benefits. Because an owning reference can be perfectly forwarded with only the runtime cost of copying a pointer and the actual relocation only occurs at the end of this process, our approach never requires intermediate relocations when multiple function calls intervene between the scope in which a source object is declared and the scope in which a destination object is constructed by relocation from that source object:
void consume(T t);
void logAndConsume(T~ r) {
  std::cout << "Eating: " << &r << std::endl;
  consume(reloc r);  // calls relocation constructor
}
void f() {
  T src;
  logAndConsume(reloc src);
}
Functions with T~
parameters
in our approach are expected to be declared with parameters of type
T
in the D2785 approach:
void consume(T t);
void logAndConsume(T r) {
  std::cout << "Eating: " << &r << std::endl;
  consume(reloc r);  // calls relocation constructor
}
void f() {
  T src;
  logAndConsume(reloc src);  // may call relocation constructor
}
In the above snippet, the creation of a new
T
object named
r
when calling
logAndConsume
can be elided if
that function is inlined or if that function is given an ABI in which
the T
parameter is implicitly
passed by reference, which is not possible in general. (The
implementation decision to use such an ABI would necessarily affect
all functions with the same signature as
logAndConsume
.) Giving users the
ability to explicitly declare parameters to have type
T~
gives them a way to select
which ABI they want and also avoids the issue in the D2785 approach
wherein the value of &r
depends on whether r
has been
elided by the implementation.
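The cost difference between the two styles can be approximated in current C++ by counting constructor calls. The sketch below is only an analogy: it uses rvalue references and move construction as stand-ins for owning references and relocation (so the source object's lifetime is not ended), and because it passes an xvalue, the by-value hop is never elided here, whereas the D2785 approach permits elision in some cases. The names `Counted`, `hopByValue`, and `hopByReference` are hypothetical.

```cpp
#include <cassert>
#include <utility>

// Counts how many times an object is move-constructed.
struct Counted {
    static int s_moves;
    Counted() {}
    Counted(Counted&&) noexcept { ++s_moves; }
};
int Counted::s_moves = 0;

inline void consume(Counted) {}

// Analogue of the by-value style: the intermediate function owns a new object.
inline void hopByValue(Counted c) { consume(std::move(c)); }

// Analogue of the reference style: the intermediate function only forwards.
inline void hopByReference(Counted&& c) { consume(std::move(c)); }

// Number of constructions when the argument passes through a by-value hop.
inline int movesThroughValueHop() {
    Counted::s_moves = 0;
    Counted src;
    hopByValue(std::move(src));
    return Counted::s_moves;
}

// Number of constructions when the argument passes through a reference hop.
inline int movesThroughReferenceHop() {
    Counted::s_moves = 0;
    Counted src;
    hopByReference(std::move(src));
    return Counted::s_moves;
}
```

With one intermediate function, the by-value chain performs two constructions and the reference chain performs one; each additional by-value hop adds one more.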
The fundamental relocation operation in [N4158] is a call to a customization
point called
uninitialized_destructive_move
,
which takes two pointer arguments called
from
and
to
and constructs an object at
to
having the value held by
*from
while also ending the
lifetime of *from
.
A pure library facility such as that proposed by N4158 cannot be used
to relocate automatic variables because the call to
uninitialized_destructive_move
does not suppress the implicit destructor call when the variable goes
out of scope. Therefore, the N4158 approach is necessarily
pointer-based, while our approach is value-based and enables a natural
coding style where objects that are to be relocated can be declared as
local variables of object type.
In N4158, a programmer can avoid heap allocation for objects that are to be relocated by constructing them into a stack buffer but must then ensure that if the relocation actually occurs, the object is not thereafter accessed, and that if the relocation does not occur, the object’s destructor is eventually called to release any resources owned by the object. In our approach, by declaring the object as an automatic variable, the necessary guarantees are provided by the compiler. Our approach is therefore safer than N4158 because, by making most such accesses ill formed, it prevents accidental access to objects that might have been relocated from. (However, we are not proposing a borrow checker for C++; an lvalue reference to a local variable that is then relocated from can still be used to attempt to perform an access of that variable, resulting in undefined behavior.)
The N4158 approach encourages the programmer to pass the source
object by pointer until the point at which the relocation will actually
occur; doing so avoids unnecessary intermediate relocations, but the
intent to relocate cannot be perfectly forwarded when the
source pointer is passed, since it will appear to a factory function as
just a pointer, and the factory function will pass that pointer to a
constructor rather than calling
uninitialized_destructive_move
.
In our approach, rlvalues can be perfectly forwarded by functions that
have a forwarding reference parameter spelled
T~
, and no additional machinery
is required.
The
uninitialized_destructive_move
function proposed by N4158 could be implemented as follows under our
proposal:
template <class T>
void uninitialized_destructive_move(T* from, T* to) {
::new (static_cast<void*>(to)) T(static_cast<T~>(*from));
}
Note that customization of the functionality of the implementation
shown above would be accomplished by customizing the relocation
constructor of T
, not by
declaring an overload. Also note that because our proposal specifies a
move-and-destroy fallback behavior for defaulted relocation constructors, we
need not explicitly specify such fallback behavior for the std::uninitialized_destructive_move
function template.
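For readers who want to see what the move-and-destroy fallback amounts to without the proposed language features, here is a minimal sketch in current C++. It is not the N4158 interface: `destructiveMoveFallback`, `destroyAt`, and `runRelocationDemo` are hypothetical names, and the function returns the pointer produced by placement new so that the caller can use the new object without further ceremony.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Move-constructs `*from` into the storage at `to`, then ends the lifetime
// of `*from`. Returns a pointer to the newly constructed object.
template <class T>
T* destructiveMoveFallback(T* from, T* to) {
    T* result = ::new (static_cast<void*>(to)) T(std::move(*from));
    from->~T();
    return result;
}

// Small helper so the destructor can be named generically.
template <class T>
void destroyAt(T* p) {
    p->~T();
}

inline std::string runRelocationDemo() {
    // Both objects live in raw buffers: an automatic variable of type
    // std::string would still be destroyed implicitly at scope exit.
    alignas(std::string) unsigned char sourceBuffer[sizeof(std::string)];
    alignas(std::string) unsigned char targetBuffer[sizeof(std::string)];

    std::string* source =
        ::new (static_cast<void*>(sourceBuffer)) std::string("relocated value");
    std::string* target = destructiveMoveFallback(
        source, reinterpret_cast<std::string*>(targetBuffer));

    // `*source` has been destroyed and must not be accessed from here on.
    std::string copy = *target;
    destroyAt(target);
    return copy;
}
```

The manual buffers and the explicit `destroyAt` call are exactly the bookkeeping that a pointer-based facility leaves to the programmer.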
[P0023R0] uses the syntax
new (dest) >>T(*src)
to
construct a T
object at
dest
having the value held by
*src
. The actual relocation is
performed by a function called a relocator, introduced in the
scope of T
by a declarator of
the form >>T(T& src)
.
Despite looking very different from N4158, P0023 has similar
limitations; it cannot be used to safely relocate objects with automatic
storage duration, does not prevent use-after-relocation, and does not
provide a facility for perfectly forwarding the intent to evaluate
new (dest) >>T(*src)
instead of new (dest) T(src)
,
where src
is an argument of
pointer type.
[P1144R7] and [P2786R0] are similar proposals. Both involve library
facilities, such as
std::relocate
and
std::relocate_at
, that take a
pair of pointers to T
and either
perform trivial relocation (when
T
is trivially relocatable) or
destroy the source object after move-constructing the destination
object. Such library facilities can be used to speed up operations that
are currently expressed exclusively in terms of move plus destroy, e.g.,
the operation of relocating elements of
std::vector<T>
when the
vector’s capacity is increased.
The definition of the category of implicitly trivially relocatable types differs slightly between P1144R7 and P2786R0. We do not express an opinion on which definition should be chosen; debate in EWG should resolve this question, and the authors of P1144R7 and P2786R0 are well positioned to argue their respective cases. The same is true for the syntax and semantics of explicitly declaring a class type to be trivially relocatable, which also differ between P1144R7 and P2786R0. Our proposal can build upon the trivial relocatability machinery of either P1144R7 or P2786R0. (Clearly, a class with a user-provided relocation constructor, as described in Part II, will not be trivially relocatable, and a diagnostic should be required if the programmer attempts to declare the class to be trivially relocatable.)
Because P1144R7 and P2786R0 both employ pointer-based approaches to
performing relocation, they suffer from the same limitations as N4158
and P0023 with respect to automatic variables. Nevertheless, the use of
such pointer-based interfaces for relocating objects of dynamic storage
duration is compatible with our proposal. For example, if P1144R7 is
accepted, our proposal will be to modify the specification of
std::relocate_at
so that when
T
is not trivially relocatable
but has a usable relocation constructor, that constructor will be called
in preference to performing a move-and-destroy operation.
[P1029R3] proposed a pure core language
extension to define a move constructor as performing a bitwise copy from
source to destination, followed by resetting the source object by
copying into it the bit pattern that would be produced by its default
constructor (which is required to be
constexpr
).
P1029R3 was made deliberately minimal to have the best possible chance of being adopted into C++23, and its author is no longer pursuing it. In particular, P1029R3 offers no form of nontrivial relocation.
Our proposal offers the semantics of a P1029R3-style relocation:
template <class T, int = (T(), 0)>
void P1029R3_relocate(T* from, T* to)
    requires std::is_trivially_relocatable_v<T> {
  static constexpr T zero;
  std::memcpy(to, from, sizeof(T));
  std::memcpy(from, &zero, sizeof(T));
}
x.~T()
, where
x
is an automatic variable of
type T
, can also disengage
__x~
. This rule would make
naming x
after this line ill
formed, unless and until a placement new expression is used to recreate
x
. This rule would also break
some existing code but potentially increase safety by preventing access
to a variable whose lifetime has ended. Specifying this rule would be
painful: We would need to define the class of expressions we consider to
be placement new expressions for the purposes of this rule, and we would
also need to introduce another set of control flow rules. (The control
flow rules for Part I assume that owning references can be disengaged
along some paths of control flow; here, we would need rules that account
for owning references being able to become re-engaged.)
reloc
to be applied
to function parameters.
We considered three possible approaches to perfectly forwarding owning references. We propose the first approach below but are open to polling to determine the best choice.
T~
syntax
The approach we propose in this paper is that a function parameter
whose declared type is T~
, where
T
is the name of a template
parameter of the function, is a forwarding reference. The main advantage
of this syntax is its consistency with the reference collapsing rules in
the same way as the current forwarding reference syntax,
T&&
. Like
T&&
, the syntax
T~
requires only the addition of
a special template argument deduction rule to ensure that
T
is deduced as a type that will
give the appropriate reference type after the collapsing rules are
applied to T~
.
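The existing `T&&` machinery that the `T~` syntax is modeled on can be checked directly in current C++. The following sketch only exercises today's reference collapsing and deduction rules; it says nothing about how `T~` itself would collapse, and the names `ForwardingParameter` and `deducedAsLvalueReference` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// The parameter type that results from substituting T into `T&&`.
template <class T>
using ForwardingParameter = T&&;

static_assert(std::is_same<ForwardingParameter<int>, int&&>::value,
              "non-reference T yields an rvalue reference");
static_assert(std::is_same<ForwardingParameter<int&>, int&>::value,
              "`int& &&` collapses to `int&`");
static_assert(std::is_same<ForwardingParameter<int&&>, int&&>::value,
              "`int&& &&` collapses to `int&&`");

// The special deduction rule: an lvalue argument deduces T as an lvalue
// reference type, so the collapsed parameter type binds to the lvalue.
template <class T>
constexpr bool deducedAsLvalueReference(T&&) {
    return std::is_lvalue_reference<T>::value;
}
```

Passing a named `int` variable to `deducedAsLvalueReference` returns `true`, while passing a literal returns `false`; the proposal needs an analogous deduction rule so that an rlvalue argument selects the owning reference type.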
The T~
syntax has two
disadvantages. The first is that all function templates that currently
perform perfect forwarding using
T&&
, including Standard
Library function templates, would need to be updated to accept
T~
; otherwise, they would
forward rlvalues as xvalues, not as rlvalues. The second is that when a
programmer wants to write a function template that accepts rlvalues,
not glvalues, of any type and deduces that type, a constraint
must be introduced into the declaration, as we have done in our proposed
declaration of std::disengage
.
This annoyance very rarely arises in the context of
T&&
forwarding
references, because few situations arise where a function template must
accept only rvalues but doesn’t care about the types of those
rvalues. We anticipate that this annoyance will occur much more
frequently if the T~
syntax for
forwarding references is adopted.
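The kind of constraint in question can be illustrated with today's forwarding references. The sketch below restricts a `T&&` function template to rvalue arguments of any type; it is an analogy for the constraint a `T~` template would need to reject glvalues, not the proposed declaration of `std::disengage`. The names `acceptRvalueOnly` and `IsAccepted` are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Accepts rvalues of any type; lvalue arguments deduce T as an lvalue
// reference type and are removed from overload resolution.
template <class T,
          typename std::enable_if<!std::is_lvalue_reference<T>::value, int>::type = 0>
void acceptRvalueOnly(T&&) {}

// Detects whether `acceptRvalueOnly` can be called with an argument whose
// type and value category are those of `std::declval<Arg>()`.
template <class Arg, class = void>
struct IsAccepted : std::false_type {};

template <class Arg>
struct IsAccepted<Arg, decltype(acceptRvalueOnly(std::declval<Arg>()))>
    : std::true_type {};

static_assert(IsAccepted<int>::value, "prvalue-like argument is accepted");
static_assert(IsAccepted<int&&>::value, "xvalue argument is accepted");
static_assert(!IsAccepted<int&>::value, "lvalue argument is rejected");
static_assert(!IsAccepted<const int&>::value, "const lvalue argument is rejected");
```

The extra template parameter is the boilerplate the text refers to: without it, the template would silently accept lvalues as well.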
T&&
syntax
To enable all function templates that currently accept forwarding
references to perfectly forward rlvalues without any changes to their
signatures, EWG could adopt an approach in which
T&&
gains the ability to
perfectly forward rlvalues. However, all such approaches known to the
authors have considerable disadvantages. Furthermore, adopting such an
approach will not enable such function templates to
automatically begin supporting rlvalue forwarding; while the
signatures of such functions would not need to change, the function
implementations could not effect rlvalue forwarding using the syntax
std::forward<T>(t)
, since
such a function call would never be able to disengage
t
and transfer ownership to the
result of the function call.
One possible approach that would allow the
T&&
syntax to perfectly
forward rlvalues is to specify that
T~&&
collapses to
T~
, not
T&&
. Unfortunately, this
violates the principle of lesser privilege discussed in the Summary of
Part I. We do not fully understand the practical implications of such a
counterintuitive reference collapsing rule. Standardizing this rule
might not be catastrophic for the safety of the language, because a
variable whose type is spelled
T&&
and turns out to be
an owning reference is unlikely to be destroyed unintentionally; the
reloc
operator must be used to
transfer ownership to another owning reference. However, having
T~&&
collapse to
T~
would interfere with the
declarations of the std::get
function templates for
std::tuple
. When
std::get<T~>
is called on
an rvalue of type std::tuple
,
one of whose element types is
T~
, the declared return type is
U&&
, where
U
is
T~
. If
T~&&
is
T~
, then the result of the call
is an rlvalue, which it should not be, since the only way to return an
rlvalue would be to leave the tuple in a partially relocated state. This
issue with std::get
was not
immediately obvious to the authors, and other unanticipated issues are
likely if this reference collapsing rule is adopted.
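The role that reference collapsing plays in the return type of `std::get` can be verified with existing reference types. This sketch uses only current C++ and the by-type overload of `std::get` for rvalue tuples; the alias `GetFromRvalueTuple` is hypothetical.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// The type of `std::get<Element>(t)` where `t` is an rvalue of type
// `std::tuple<Element>`. The declared return type is `Element&&`.
template <class Element>
using GetFromRvalueTuple =
    decltype(std::get<Element>(std::declval<std::tuple<Element>>()));

static_assert(std::is_same<GetFromRvalueTuple<int>, int&&>::value,
              "an object element is returned as an rvalue reference");
static_assert(std::is_same<GetFromRvalueTuple<int&>, int&>::value,
              "`int& &&` collapses, so an lvalue reference element stays an lvalue");
static_assert(std::is_same<GetFromRvalueTuple<int&&>, int&&>::value,
              "an rvalue reference element is returned as an rvalue reference");
```

Because the result type is whatever `Element&&` collapses to, a rule that collapsed `T~&&` to `T~` would make the corresponding call yield an owning reference, which is the problem described above.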
A variant of subapproach 1 is to retain the natural reference
collapsing rules in which
T~&&
collapses to
T&&
but add a special
exemption solely for forwarding references: When
U&&
is a forwarding
reference and U
is
T~
, the result is
T~
, while in all other contexts,
the result would be T&&
.
That the meaning of code would depend too much on whether a reference is
a forwarding reference is the main disadvantage of this approach.
A variant of subapproach 2 is to make
T~&&
collapse to
T~
in forwarding reference
context and be ill formed in every other context. Such an approach
avoids the main disadvantage of subapproach 1 but suffers from the
same issue as subapproach 1 concerning
std::get
and forces writers of
generic code to guard against the creation of the
T~&&
type.
Another subapproach for avoiding counterintuitive reference
collapsing outside of forwarding references is to specify that
T~&&
is neither
T~
nor
T&&
but is adjusted to
T~
in a function declaration
that uses a forwarding reference. In all other contexts,
T~&&
would be an
abominable type, and attempting to declare a variable or
evaluate an expression whose type would be
T~&&
would be ill
formed. This subapproach suffers from the same issue with
std::get
as subapproaches 1 and
3 but might avoid the disadvantages of subapproach 3 in other contexts.
Unfortunately, if WG21 adopts this subapproach and it later turns out to
be untenable, removing the abominable types from the language and
specifying a different behavior will be difficult.
We could invent a new syntax for forwarding references that would
perfectly forward rlvalues, such as
T&&&
or
T~~
. This approach would avoid
all the disadvantages associated with the
T&&
syntax and the
second disadvantage associated with the
T~
syntax but suffers from a
severe disadvantage of its own: It removes design space for more general
improvements to perfect forwarding, such as a syntax that would enable
forwarding of overload sets or braced-init-lists.
We propose the T~
syntax
because, although it is imperfect, its problems are less severe than
those introduced by all known alternatives. The problems with the
T~
syntax parallel the problems
with the existing T&&
syntax in current C++, which have proven to be tractable.
A possible alternative name for rlvalue is dvalue, the d of which connotes permission to destroy the referent. The name dvalue is analogous to xvalue, whereas using the term rlvalue is more comparable to referring to xvalues as mvalues, i.e., connoting the likely (but not certain) fate of the object rather than the permission granted to the holder of the reference.
const
objects
When a class T
has a
has a
relocation constructor whose parameter type is
T~
, that parameter cannot bind
to an rlvalue of type const T
,
such as the rlvalue that would result from calling
reloc
on a
const T
variable:
struct T {
  T();
  T(T&&) = delete;
  T(T~ src);
};
int main() {
  T x;
  const T y = reloc x;  // OK
  const T z = reloc y;  // Error: `T~` cannot bind to `reloc y`.
}
In practice, the inability to relocate from
const
objects is likely to be
viewed as a limitation on the use of
const
variables, not a
limitation on the use of relocation: A programmer should therefore not
define a local variable const
if
they intend to relocate from that variable.
This limitation on the use of
const
variables will be familiar
to C++ programmers because a
const
variable also cannot be
efficiently moved using the move constructor or move assignment operator
for its type, as illustrated by the example below:
struct T {
  T();
  T(const T&);
  T(T&&);
};
void foo(const T&);
T bar() {
  T x;
  foo(x);
  return x;
}
The programmer might wish to declare
x
const
so that the fact that
foo
does not modify
x
is visible at the call site.
Unfortunately, if x
is declared
const
and the implementation
does not elide x
entirely, the
copy constructor will be called to initialize the return value of
bar
instead of the move
constructor.
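The fallback from move to copy for a `const` source can be observed deterministically if the source is passed through `std::move` instead of being returned. The sketch below makes that substitution deliberately, since copy elision makes the `return x;` form implementation dependent; `Tracked`, `constSourceIsCopied`, and `nonConstSourceIsMoved` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Counts copy and move constructions.
struct Tracked {
    static int s_copies;
    static int s_moves;
    Tracked() {}
    Tracked(const Tracked&) { ++s_copies; }
    Tracked(Tracked&&) noexcept { ++s_moves; }
};
int Tracked::s_copies = 0;
int Tracked::s_moves = 0;

// `std::move` on a const object yields `const Tracked&&`, which cannot bind
// to `Tracked&&`, so overload resolution selects the copy constructor.
inline bool constSourceIsCopied() {
    Tracked::s_copies = 0;
    Tracked::s_moves = 0;
    const Tracked source;
    Tracked target = std::move(source);
    (void)target;
    return Tracked::s_copies == 1 && Tracked::s_moves == 0;
}

// The same code with a non-const source selects the move constructor.
inline bool nonConstSourceIsMoved() {
    Tracked::s_copies = 0;
    Tracked::s_moves = 0;
    Tracked source;
    Tracked target = std::move(source);
    (void)target;
    return Tracked::s_copies == 0 && Tracked::s_moves == 1;
}
```

Both functions return `true`, which mirrors the trade-off described above: declaring the variable `const` documents intent at the call site but forfeits the move.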
We believe that experience with move semantics has proven that being
forced to declare some variables
non-const
for the sake of
efficiency is not a serious problem in practice. Nevertheless, we offer
an additional proposal that would eliminate this issue and permit
const
variables to be relocated
from.
An object no longer has a value once its destructor begins execution.
For this reason, within a const
object’s own destructor, the restrictions imposed by
const
no longer apply; the
destructor has non-const
access
to any subobjects that were not themselves declared
const
. Because a relocation
constructor subsumes the destructor of the source object, we propose to
also give relocation constructors
non-const
access to
const
source objects.
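The existing rule for destructors can be demonstrated in current C++. In this minimal sketch, the destructor of a `const` object modifies one of the object's own members; `Counter` and `destroyConstCounter` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Counter {
    int d_value;
    int* d_report;  // where the destructor publishes the final value

    Counter(int value, int* report) : d_value(value), d_report(report) {}

    ~Counter() {
        // Within the destructor, `this` is a pointer to non-const, even when
        // the object being destroyed was declared const.
        static_assert(std::is_same<decltype(this), Counter*>::value,
                      "destructors have non-const access");
        d_value += 1;
        *d_report = d_value;
    }
};

// Creates and destroys a const `Counter`, returning what its destructor reported.
inline int destroyConstCounter() {
    int report = 0;
    {
        const Counter counter(41, &report);
        (void)counter;
    }
    return report;
}
```

`destroyConstCounter()` returns 42. The proposal extends this same latitude to a relocation constructor, on the grounds that it performs the source object's destruction.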
It would not be appropriate to allow
T~
to bind to
const T
rlvalues in general,
because such an operation could then occur outside relocation and
subvert the const-correctness of the program. For the same reason,
invoking reloc
on a
const T
object should not yield
a non-const
T
rlvalue. Therefore, we propose
the following modifications to Parts I, II, and III:
An implicitly declared relocation constructor, as described in
Part I, is declared with parameter type
const T~
instead of
T~
, and can therefore accept an
rlvalue of type const T
. A
user-declared relocation constructor that is explicitly defaulted may
have parameter type either T~
or
const T~
. In either case, if a
defaulted relocation constructor has parameter type
const T~
and performs a
move-then-destroy operation, the constructor treats the source object as
non-const
, as if the
constructor’s definition were:
T(const T~ source) : T(const_cast<T&&>(source)) {}
A user-provided relocation constructor, as described in Part II,
may (but need not) be declared with parameter type
const T~
. In the definition of a
constructor, if the reloc
specifier pertains to a parameter of type
const T~
, that parameter is
treated as non-const
at the
start of the ctor-initializer. For example:
struct MoveOnly {
  MoveOnly(MoveOnly&&) = default;
  MoveOnly(const MoveOnly~) = delete;
};
struct S {
  MoveOnly d_other;
  small_vector<T> d_v;
  small_vector<T>::iterator d_it;

  S::S(size_t idx, reloc const S~ src)
  : d_other(std::move(src.d_other)),  // OK; `src.d_other` is not `const`.
                                      // `src.d_other` is implicitly destroyed.
    // `d_v` is implicitly relocated from `src.d_v`.
    d_it(d_v.begin() + idx) {}

  S::S(const S~ src) : S{src.d_it - src.d_v.begin(), reloc src} {}
};
Note that we do not propose that such a parameter be treated as
non-const
within the
parameter-declaration-clause, trailing-return-type, or
noexcept-specifier of the function definition, as such a rule
could result in a mismatch between the declaration and the definition of
the function whenever any of those parts of the function definition
depends on the type of the parameter.
If r
has type
const T~
, then following an
invocation of
reloc_begin_destruction r
that
appears within a constructor of
T
, as described in Part III, the
id-expression r
has
type T
instead of
const T
, and the original
const
ness of
r
does not propagate to the
declared types of the subordinate references. For example:
struct S {
  MoveOnly d_other;
  small_vector<T> d_v;
  small_vector<T>::iterator d_it;

  S::S(const S~ src) {
    const size_t idx = src.d_it - src.d_v.begin();
    // `src` is an lvalue of type `const S`.
    reloc_begin_destruction src;
    // `src` is an lvalue of type `S`.
    this : d_other(std::move(src.d_other)),  // OK; `src.d_other` is an
                                             // lvalue of type `MoveOnly`.
           d_v(reloc src.d_v),
           d_it(d_v.begin() + idx);
  }
};
Allowing a relocation constructor to treat a
const
object as
non-const
is potentially
dangerous if that object is allocated in read-only memory; any attempt
to write to such an object is likely to cause a fault. For destructors,
this problem is solved by implementations declining to allocate an
object in read-only memory unless the object has constant destruction
(of which trivial destruction is a common special case).
The fact that a const
variable of static storage duration might be allocated in read-only
memory does not pose a serious problem, because such an object could
only be the source object for a relocation if the programmer employs an
unusual construct such as
static_cast<T~>
or
std::force_relocate
.
Implementations can also allocate some
const
automatic variables in
read-only memory, but we are not aware of any implementations that will
do so when the variable is odr-used, and think it is unlikely that any
such implementations exist. Therefore, there is no risk of a relocation
constructor attempting to write to an object that is stored in read-only
memory.
T~
function parameters can bind
to const
rlvalues
An alternative approach for relocating
const
variables is to permit
function parameters of type T~
to bind to rlvalues of type
const T
, even though such a
reference binding is not permitted in any other context. This approach
has the advantage that it is simple to specify, and does not require
relocation constructors to be counterintuitively declared to accept
const T~
parameters that they
will potentially use to modify the referents. We also believe that this
approach avoids most of the disadvantages associated with subverting
const
-correctness in general:
once the calling function uses
reloc
to cede ownership of a
const T
variable to the called
function, the calling function can typically no longer access the
variable, and therefore, cannot observe modifications that are made
through the T~
parameter.
However, this alternative approach seems subtly dangerous in that it
permits an unbounded set of functions to modify
const
objects. In current C++, a
const T
object can only be
modified by constructors of T
and the destructor of T
. We
propose to bestow this special ability upon additional member functions
of T
while they are in the
process of destroying the
const T
object. The alternative
approach would allow arbitrary functions to modify
const T
objects whose lifetimes
have not yet ended, which could violate assumptions that must hold in
order for a program to be correct. For this reason, we mention this
alternative approach as an option but do not propose it.