Fixing std::bit_cast of types with padding bits
- Document number: D3969R0
- Date: 2026-02-02
- Audience: LEWG
- Project: ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
- Reply-to: Jan Schultke <janschultke@gmail.com>
- GitHub Issue: wg21.link/P3969/github
- Source: github.com/eisenwave/cpp-proposals/blob/master/src/bit-cast-padding.cow
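When the source type has padding bits and the destination type does not,
std::bit_cast degenerates into an alternative spelling for std::unreachable()
(some exceptions apply).
Two viable solutions to the problem are presented:
diagnosing the degenerate form
and adding a function with alternative behavior,
or simply changing the current behavior of std::bit_cast.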
Contents
Introduction
Design
Advantages of the two-function solution
Advantages of the single-function solution
Can't you clear padding bits before bit-casting?
Padding bits are finicky
No padding bits during constant evaluation
std::clear_padding is not ergonomic for bit-casting
std::clear_padding is less capable
Can't you make std::bit_cast produce unspecified or erroneous values?
The problem of bit-casting union types
Constraints vs Mandates
Requiring std::bit_cast UB to be diagnosed in constant expressions
Bumping the feature-test macro
Implementation experience
Wording
Two-function solution
[version.syn]
[bit.syn]
[bit.cast]
Single-function solution
[version.syn]
[bit.cast]
References
1. Introduction
The expression std::bit_cast<__int128>(0.0L) has undefined behavior at compile time.
That is because an 80-bit x87 long double has 6 bytes of padding,
and it is undefined behavior to map those padding bits onto non-padding bits
in the destination type via std::bit_cast.
[bit.cast] does not disqualify this use of std::bit_cast
from being a constant expression.
Surprisingly, the undefined behavior in such cases does not depend on the argument.
A specialization std::bit_cast<To, From> is an alternative spelling for std::unreachable()
if From has padding bits and To does not,
a degenerate form.
Despite not depending on the argument,
the degenerate form of std::bit_cast does not violate the
Constraints or Mandates element,
leaving the bug undetected.
Compilers also have no warning for the degenerate form at the time of writing.
If the padding bits of From are all mapped onto unsigned char
or std::byte objects within To,
the behavior is well-defined.
This behavior is a footgun, and is not very useful.
If users wanted a function that always has UB,
they should be writing std::unreachable(), not std::bit_cast.
Surprisingly, the UB is not even required to be diagnosed within constant evaluation;
see §2.7 (Requiring std::bit_cast UB to be diagnosed in constant expressions) for details.
Furthermore, it would be useful if bit-casting between long double
and a 128-bit integer type was easily possible.
After all, reinterpreting floating-point types and integer types
is part and parcel of implementing mathematical functions like those in <cmath>.
It would also be useful if this could be done portably in constant expressions.
Another case where the degenerate form may arise frequently is bit-casting _BitInt
(supported by Clang as an extension and proposed in [P3666R2]),
considering that most _BitInt types (at least 7/8) have padding bits.
It is possible today to bit-cast a padded type to an unpadded type without undefined behavior, although doing so requires multiple steps.
2. Design
To address these issues with std::bit_cast,
there are two viable approaches:
- Make the degenerate form of std::bit_cast ill-formed. Also add a new std::bit_cast_zero_padding function which treats padding bits in the source as zero instead of as indeterminate. Other than that, this new function has the same behavior as std::bit_cast.
- Make std::bit_cast behave like std::bit_cast_zero_padding without adding any new function. This should be done as a DR against C++20.
These are referred to as the two-function solution and single-function solution below, respectively.
2.1. Advantages of the two-function solution
The single-function solution is problematic because
std::bit_cast can be used to convert padded types to a byte array without undefined behavior
and with zero overhead.
Wiping padding bits would add more cost to existing code.
With only a single function, there is also no way to opt out of that cost
other than using std::memcpy instead,
and that only works outside of constant evaluation.
Furthermore, if users assumed std::bit_cast to clear padding,
they may inadvertently access uninitialized memory on older compiler versions,
where that behavior is not implemented yet.
Perfectly well-defined C++29 code with no erroneous behavior
that uses std::bit_cast could be copied and pasted into older code bases,
and suddenly obtain undefined behavior.
Last but not least,
users may be surprised by std::bit_cast changing the value of any bits.
Conceptually, it is a reinterpretation of existing bits as a new type,
and it is desirable to express behavior like zeroing of padding explicitly.
2.2. Advantages of the single-function solution
The obvious benefit of changing the behavior of std::bit_cast
is that existing UB in users' code disappears,
without any refactoring effort.
This would especially be the case if the proposal is treated as a DR against C++20.
Additionally,
some may argue that the behavior of std::bit_cast_zero_padding should be the default anyway,
considering that it's safer
to use.
2.3. Can't you clear padding bits before bit-casting?
In the discussion of this proposal prior to publication,
it was suggested to clear the padding before bit-casting.
That is, standardizing a std::clear_padding function that users call on the source object before std::bit_cast.
However, there are severe problems with this approach, explained below.
2.3.1. Padding bits are finicky
There are only a few places in the standard where padding bits receive a useful value. For example, zero-initialization is also stated to result in padding bits being zeroed. In most scenarios (e.g. local variables), the padding bits have erroneous or indeterminate value. Even when the padding bits have defined value, lvalue-to-rvalue conversion does not propagate padding bits, and the assignment operator may render them indeterminate or erroneous.
This makes it highly questionable to access padding bits
and rely on them having any specific value.
If the user forgets to call std::clear_padding or falsely assumes
that padding bits are already cleared,
they could easily access uninitialized memory
(which may be a security vulnerability).
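As a small illustration of how narrow the guarantees are, the sketch below relies on the one case mentioned above where padding bits receive a useful value: zero-initialization of an object with static storage duration. The names Padded, zeroed and all_bytes_zero are hypothetical, and the layout with padding after the first member is an assumption about typical ABIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical example type with padding after `tag` on typical ABIs.
struct Padded {
    std::uint8_t tag;
    std::uint32_t value;
};

// Objects with static storage duration are zero-initialized, and
// zero-initialization of a class type also zeroes its padding bits.
static Padded zeroed;

// Inspects every byte of the object representation, padding included.
// This is only meaningful for objects whose padding is known to have a
// value, such as `zeroed`; for an ordinary local variable the padding
// bytes are indeterminate or erroneous and must not be examined like this.
inline bool all_bytes_zero(const Padded& p) {
    std::array<unsigned char, sizeof(Padded)> bytes;
    std::memcpy(bytes.data(), &p, sizeof(Padded));
    for (unsigned char b : bytes) {
        if (b != 0) {
            return false;
        }
    }
    return true;
}
```

Copying `zeroed` into another object by assignment or initialization would not be required to carry the zeroed padding along, which is why relying on padding values outside of such narrow cases is fragile.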
2.3.2. No padding bits during constant evaluation
Besides the safety issues,
the approach of clearing padding bits in the object
does not make any sense for constant evaluation.
For instance, Clang does not store an object representation for values
during constant evaluation.
When bit-casting, one is generated on the fly.
This would likely mean that a constexpr std::clear_padding is effectively
not implementable in current compilers.
2.3.3. std::clear_padding is not ergonomic for bit-casting
We typically pass large types by reference,
even if they are trivially copyable.
Assuming we want to cast a type to another type
while clearing padding,
the procedure has a lot of steps:
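This procedure gets even more complicated when the source object cannot be modified in place
(for example, because we only have a reference to const),
in which case we need to create a temporary variable that we can mutate
with std::clear_padding.
Regardless, this procedure is fairly complex compared to using a
function that does it all in one go.
All of that complexity yields no advantage;
even if std::clear_padding were constexpr,
the remaining steps are not,
so the procedure as a whole cannot be made constexpr.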
2.3.4. std::clear_padding is less capable
Last but not least, std::clear_padding is strictly less capable
than std::bit_cast_zero_padding
because (at least with current compiler technology)
std::clear_padding is not a viable solution during constant evaluation.
However, std::clear_padding can be implemented
in terms of std::bit_cast_zero_padding.
2.4. Can't you make std::bit_cast produce unspecified or erroneous values?
A possible approach would be to make std::bit_cast produce
unspecified bit values instead of indeterminate bit values.
That is, std::bit_cast<__int128>(0.0L) would create a __int128
with 10 predictable bytes and 6 bytes with unspecified value.
There are two problems with this idea:
- Since the byte values are now unspecified, UBSan (undefined behavior sanitizer) can no longer diagnose accessing/branching based on the upper 6 bytes as a bug. The bug (possibly CWE-908: Use of Uninitialized Resource) didn't go away, it just became non-conforming to diagnose it with termination.
- This approach should not work for constant evaluation because it would add non-determinism at compile time.
Overall, this design sweeps the problem under the rug
with little to no benefit to the user.
It is also possible to make the result have erroneous value.
However, once again, this approach could not be used to portably
bit-cast long double to __int128,
especially not during constant evaluation;
the degenerate form of std::bit_cast would then always produce erroneous values,
so it makes no sense to let it compile in the first place.
This solution would only benefit the case of bit-casting to a byte array;
perhaps that is worth pursuing,
but the only way not to add cost to std::bit_cast (with no opt-out)
would be to give the bytes an unspecified value
that is considered an erroneous value.
This provides minimal (if any) benefit,
and could be explored in a separate paper;
it is a separate issue from the one presented in this paper.
2.5. The problem of bit-casting union types
Consider bit-casting an object of a four-byte union type U to an integer type of the same size while its one-byte member c is active.
There are two possible interpretations of why this has undefined behavior:
- The set of padding bits changes based on which union alternative is active, and U has three padding bytes when c is active.
- Even if U has no padding bits, when c is active, the following three bytes of the value representation of U are indeterminate.
The latter interpretation is more reasonable because padding bytes are intuitively a property of the type, given that object and value representations are defined as properties of types. It would be a surprising wording strategy if we considered the set of padding bits to change at run-time.
2.6. Constraints vs Mandates
The degenerate form of std::bit_cast should be diagnosed using
a Mandates element (that is, a static_assert).
That is because the condition for the degenerate form
is relatively complicated and may change in the future.
Also, Constraints tempts the user to test whether
std::bit_cast is safe
using a requires-expression,
but this test can have false positives.
The detection of the degenerate form would only tell the user whether
all possible arguments result in undefined behavior.
Conceptually, Constraints for std::bit_cast
should tell the user whether bit-casting is technically feasible
due to sizes matching and types being trivially copyable,
whereas Mandates should catch misuses such as passing
consteval-only types or types that result in the degenerate form.
2.7. Requiring std::bit_cast UB to be diagnosed in constant expressions
[bit.cast] paragraph 4, bullet 2 explicitly makes indeterminate result bits undefined behavior
inside std::bit_cast,
which arguably makes it library UB,
which is generally not required to be diagnosed during constant evaluation.
I argue that it should be diagnosed.
While std::bit_cast is technically a library feature,
it is spiritually a core language feature,
and just acts as a portable spelling for the underlying
__builtin_bit_cast intrinsic in compilers.
Core language UB is generally diagnosed as per [expr.const].
It should be noted that [P0476R1] never motivated this lack of diagnostics,
and it is likely an unintentional wording defect anyway.
After all, in the cases where std::bit_cast has library UB,
it also produces an indeterminate result,
and constant expressions do not allow for indeterminate scalar prvalues
([expr.const] definition of "expression,constant").
The only reason why std::bit_cast calls are arguably undiagnosed
library UB is that as soon as undefined behavior occurs,
the usual rules of the language are thrown out the window.
Crucially, the library UB
precedes the production of an indeterminate result
in [bit.cast].
2.8. Bumping the feature-test macro
For both the two-function and single-function solution,
the __cpp_lib_bit_cast macro should be bumped:
- For the two-function solution, this lets the user detect the presence of std::bit_cast_zero_padding.
- For the single-function solution, this lets the user detect whether they can evaluate std::bit_cast<__int128>(0.0L) without undefined behavior.
3. Implementation experience
The proposed behavior of std::bit_cast_zero_padding is already implemented
by the __builtin_bit_cast compiler intrinsic in GCC.
In fact, implementing the single-function solution
would only require GCC maintainers to bump their feature-test macro.
There is no implementation experience for the detection of the degenerate form in the two-function solution, and such detection would require compiler support because there exists no way to query which bits or bytes of a type are padding bits, or whether a type has padding bits in the first place.
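The std::has_unique_object_representations trait is not a substitute: for example, std::has_unique_object_representations_v<float> is false
despite float not having padding bits.
These false positives make it not suitable for detecting the presence of padding bits.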
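The false positives can be observed directly with a short sketch. The variable names and the type Padded are hypothetical, and the results for float and Padded reflect the behavior of typical implementations (IEEE 754 float, padding after the first member of Padded) rather than a guarantee of the standard.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical padded type (padding after `tag` on typical ABIs).
struct Padded {
    std::uint8_t tag;
    std::uint32_t value;
};
static_assert(sizeof(Padded) == sizeof(std::uint64_t), "assumed by this sketch");

// Fixed-width integer types have no padding bits: the trait is true.
inline constexpr bool int_unique =
    std::has_unique_object_representations_v<std::uint32_t>;

// A type with padding bytes: the trait is false, as one would hope.
inline constexpr bool padded_unique =
    std::has_unique_object_representations_v<Padded>;

// float has no padding bits, yet the trait is also false on typical
// implementations, because equal or special values (such as +0.0 and -0.0)
// do not have a unique representation.
inline constexpr bool float_unique =
    std::has_unique_object_representations_v<float>;
```

Since float and Padded both yield false, the trait cannot distinguish "has padding bits" from "has several representations of the same value".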
4. Wording
The changes are relative to [N5014].
4.1. Two-function solution
[version.syn]
Bump the feature-test macro in [version.syn] as follows:
[bit.syn]
Change [bit.syn] as follows:
[bit.cast]
Change [bit.cast] as follows:
Rename the heading "Function template bit_cast" to "Bit-casting" [bit.cast].
Constraints:
- sizeof(To) == sizeof(From) is true;
- is_trivially_copyable_v<To> is true;
- is_trivially_copyable_v<From> is true.
Mandates:
- Neither To nor From are consteval-only types ([basic.types.general]); and
- for some argument of type From, the result of the function call expression is well-defined.
Constant When:
To, From, and the types of all subobjects of To and From
are types T such that:
- is_union_v<T> is false;
- is_pointer_v<T> is false;
- is_member_pointer_v<T> is false;
- is_volatile_v<T> is false;
- T has no non-static data members of reference type.
Returns:
An object of type To.
Implicitly creates objects nested within the result ([intro.object]).
Each bit of the value representation of the result
is equal to the corresponding bit in the object representation of from.
Padding bits of the result are unspecified.
For the result and each object created within it,
if there is no value of the object's type corresponding to the value representation produced,
the behavior is undefined.
If there are multiple such values,
which value is produced is unspecified.
A bit in the value representation of the result is indeterminate
if it does not correspond to a bit in the value representation of from
or corresponds to a bit for which the smallest enclosing object
is not within its lifetime or has an indeterminate value ([basic.indet]).
A bit in the value representation of the result is erroneous
if it corresponds to a bit for which the smallest enclosing object has an erroneous value.
For each bit b in the value representation of the result
that is indeterminate or erroneous,
let u be the smallest object containing that bit enclosing b:
- If u is of unsigned ordinary character type or std::byte type, u has an indeterminate value if any of the bits in its value representation are indeterminate, or otherwise has an erroneous value.
- Otherwise, if b is indeterminate, the behavior is undefined.
- Otherwise, the behavior is erroneous, and the result is as specified above.
The result does not otherwise contain any indeterminate or erroneous values.
Remarks: A function call expression whose behavior is undefined as per the Returns element is not a core constant expression ([expr.const]).
Append the following declaration to [bit.cast]:
Effects:
Equivalent to bit_cast<To>(from),
except that if a bit b in the value representation of the result
does not correspond to a bit in the value representation of from,
b is zero, not indeterminate.
[Example:
The following example assumes that
is .
bit_cast < char8_t , S > char8_t { 0 } — end example]
4.2. Single-function solution
[version.syn]
Bump the feature-test macro in [version.syn] as follows:
[bit.cast]
Change [bit.cast] as follows:
Function template bit_cast [bit.cast]
Constraints:
- sizeof(To) == sizeof(From) is true;
- is_trivially_copyable_v<To> is true;
- is_trivially_copyable_v<From> is true.
Mandates:
Neither To nor From are consteval-only types ([basic.types.general]).
Constant When:
To, From, and the types of all subobjects of To and From
are types T such that:
- is_union_v<T> is false;
- is_pointer_v<T> is false;
- is_member_pointer_v<T> is false;
- is_volatile_v<T> is false;
- T has no non-static data members of reference type.
Returns:
An object of type To.
Implicitly creates objects nested within the result ([intro.object]).
Each bit of the value representation of the result
is equal to the corresponding bit in the object representation of from.
Padding bits of the result are unspecified.
For the result and each object created within it,
if there is no value of the object's type corresponding to the value representation produced,
the behavior is undefined.
If there are multiple such values,
which value is produced is unspecified.
A bit in the value representation of the result is zero
if it does not correspond to a bit in the value representation of from, and is indeterminate if it
corresponds to a bit for which the smallest enclosing object
is not within its lifetime or has an indeterminate value ([basic.indet]).
A bit in the value representation of the result is erroneous
if it corresponds to a bit for which the smallest enclosing object has an erroneous value.
For each bit b in the value representation of the result
that is indeterminate or erroneous,
let u be the smallest object containing that bit enclosing b:
- If u is of unsigned ordinary character type or std::byte type, u has an indeterminate value if any of the bits in its value representation are indeterminate, or otherwise has an erroneous value.
- Otherwise, if b is indeterminate, the behavior is undefined.
- Otherwise, the behavior is erroneous, and the result is as specified above.
The result does not otherwise contain any indeterminate or erroneous values.
Remarks: A function call expression whose behavior is undefined as per the Returns element is not a core constant expression ([expr.const]).