Fixing std::bit_cast of types with padding bits
- Document number: D3969R0
- Date: 2026-01-17
- Audience: LEWG
- Project: ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
- Reply-to: Jan Schultke <janschultke@gmail.com>
- GitHub Issue: wg21.link/P3969/github
- Source: github.com/eisenwave/cpp-proposals/blob/master/src/bit-cast-padding.cow
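Bit-casting a type with padding bits to a type without padding bits via std::bit_cast
degenerates into an alternative spelling for std::unreachable()
(some exceptions apply).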
I propose alternative behavior in the case of padded source types.
Contents
Introduction
Design
Why not make std::bit_cast_zero_padding the default behavior?
Can't you clear padding bits before bit-casting?
Can't you make std::bit_cast produce unspecified or erroneous values?
The problem of bit-casting union types
Constraints vs Mandates
Implementation experience
Wording
[version.syn]
[bit.syn]
[bit.cast]
References
1. Introduction
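Bit-casting an 80-bit x87 long double to a 128-bit integer type via std::bit_cast has undefined behavior.
That is because an 80-bit x87 long double has 6 bytes of padding,
and it is undefined behavior to map those padding bits onto non-padding bits
in the destination type via std::bit_cast.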
Surprisingly, the undefined behavior in such cases does not depend on the argument.
A specialization std::bit_cast<To, From> is an alternative spelling for std::unreachable()
if From has padding bits and To does not;
this is the degenerate form.
Despite not depending on the argument,
the degenerate form of std::bit_cast does not violate the
Constraints or Mandates element,
leaving the bug undetected.
Compilers also have no warning for the degenerate form at the time of writing.
However, if the padding bits of From are all mapped onto
unsigned char or std::byte objects within To,
the behavior is well-defined.
This behavior is a footgun, and is not very useful.
If users want a function that always has UB,
they should be writing std::unreachable(), not std::bit_cast.
Furthermore, it would be useful if bit-casting between
long double and a 128-bit integer type was possible.
After all, reinterpreting floating-point types and integer types
is part and parcel of implementing mathematical functions like those in <cmath>.
It would also be useful if this could be done portably in constant expressions.
Another case where the degenerate form may arise frequently is bit-casting _BitInt
(supported by Clang as an extension and proposed in [P3666R2]),
considering that most _BitInt types (at least 7/8) have padding bits.
2. Design
The paper proposes the following changes:
- Make the degenerate form of std::bit_cast ill-formed by expanding the Mandates element.
- Add a new std::bit_cast_zero_padding function which maps padding bits in the source object onto zero bits in the result. Other than that, this new function has the same behavior as std::bit_cast.
2.1. Why not make std::bit_cast_zero_padding the default behavior?
It may also be possible to alter the behavior of std::bit_cast to clear padding,
rather than creating a new function.
However, this would be problematic because std::bit_cast can be used
to convert padded types to a byte array without undefined behavior.
Wiping padding bits would add more cost to existing code.
Furthermore, if users assumed std::bit_cast to wipe padding,
they may inadvertently access uninitialized memory on older compiler versions,
where that behavior is not implemented yet.
Perfectly well-defined C++29 code with no erroneous behavior
that uses std::bit_cast could be copied and pasted into older code bases,
and suddenly obtain undefined behavior.
Last but not least,
users may be surprised by std::bit_cast changing the value of any bits.
Conceptually, it is a reinterpretation of existing bits as a new type,
and it is desirable to express behavior like zeroing of padding explicitly.
2.2. Can't you clear padding bits before bit-casting?
In the discussion of this proposal prior to publication,
it was suggested to clear the padding before bit-casting,
that is, to standardize a facility that clears the padding bits of an object in place.
However, this approach does not make any sense in the C++ object model because the state of padding bits is not observable, and attempting to modify and later read their value is futile.
From a hardware perspective,
a long double may be stored on the x87 floating-point stack,
so while it superficially has 6 padding bytes,
those only exist on paper, not in hardware.
Similarly, _BitInt(20) may superficially have 4 padding bits,
but can be stored in a 20-bit register,
where none of those bits actually exist.
The assumption that padding bits cannot be observed, may not even exist, and don't have to be preserved is crucial for compiler optimization.
For example, a long double cannot be copied bit for bit via the floating-point stack
because its padding bits are lost in the process.
Even when code uses std::memcpy,
which copies all bytes in the object representation,
Clang can discard all padding bytes by loading the value onto the floating-point stack.
Besides the hardware perspective,
the approach of clearing padding bits in the object
does not make any sense for constant evaluation.
For instance, Clang does not store an object representation for values
during constant evaluation.
When bit-casting, one is generated on the fly.
2.3. Can't you make std::bit_cast produce unspecified or erroneous values?
A possible approach would be to make std::bit_cast produce
unspecified bit values instead of indeterminate bit values.
That is, bit-casting an 80-bit x87 long double would create a 128-bit integer
with 10 predictable bytes and 6 bytes with unspecified value.
There are two problems with this idea:
- Since the byte values are now unspecified, UBSan (undefined behavior sanitizer) can no longer diagnose accessing/branching based on the upper 6 bytes as a bug. The bug (possibly CWE-908: Use of Uninitialized Resource) didn't go away, it just became non-conforming to diagnose it with termination.
- This approach should not work for constant evaluation because it would add non-determinism at compile time.
Overall, this design sweeps the problem under the rug
with little to no benefit to the user.
It is also possible to make the result have erroneous value.
However, once again, this approach could not be used to portably
bit-cast long double to a 128-bit integer type,
especially not during constant evaluation;
the degenerate form of std::bit_cast would then always produce erroneous values,
so it makes no sense to let it compile in the first place.
This solution would only benefit the case of bit-casting to a byte array;
perhaps that is worth pursuing,
but the only way not to add cost to std::bit_cast (with no opt-out)
would be to give the bytes an unspecified value
that is considered an erroneous value.
This provides minimal (if any) benefit,
and could be explored in a separate paper;
it is a separate issue from the one presented in this paper.
2.4. The problem of bit-casting union types
Consider the following code:
There are two possible interpretations of why this code has undefined behavior:
- The set of padding bits changes based on which union alternative is active, and U has three padding bytes when c is active.
- Even if U has no padding bits, when c is active, the following three bytes of the value representation of U are indeterminate.
The latter interpretation is more reasonable because padding bytes are intuitively a property of the type, given that object and value representations are defined as properties of types. It would be a surprising wording strategy if we considered the set of padding bits to change at run-time.
2.5. Constraints vs Mandates
The degenerate form of std::bit_cast should be diagnosed using
a Mandates element (that is, static_assert).
That is because the condition for the degenerate form
is relatively complicated and may change in the future.
Also, Constraints tempts the user to test whether
std::bit_cast is safe
by checking whether the call expression is well-formed (for example, in a requires-expression),
but this test can have false positives.
The detection of the degenerate form would only tell the user whether
all possible arguments result in undefined behavior.
Conceptually, Constraints for std::bit_cast
should tell the user whether bit-casting is technically feasible
due to sizes matching and types being trivially copyable,
whereas Mandates should catch misuses such as passing
consteval-only types or types that result in the degenerate form.
3. Implementation experience
None yet.
There exists no way to query which bits or bytes of a type are padding bits,
or whether a type has padding bits in the first place.
Therefore, an implementation requires compiler intrinsics,
both for detecting the degenerate form of std::bit_cast
and for std::bit_cast_zero_padding.
4. Wording
The changes are relative to [N5014].
[version.syn]
Add a feature-test macro to [version.syn] as follows:
[bit.syn]
Change [bit.syn] as follows:
bit_cast [bit.cast]
Change [bit.cast] as follows:
Function template bit_cast Bit-casting [bit.cast]
template<class To, class From> constexpr To bit_cast(const From& from) noexcept;
Constraints:
- sizeof(To) == sizeof(From) is true;
- is_trivially_copyable_v<To> is true; and
- is_trivially_copyable_v<From> is true.
Mandates:
- Neither To nor From are consteval-only types ([basic.types.general]); and
- for some argument of type From, the result of the function call expression is well-defined.
Constant When:
To, From, and the types of all subobjects of To and From
are types T such that:
- is_union_v<T> is false;
- is_pointer_v<T> is false;
- is_member_pointer_v<T> is false;
- is_volatile_v<T> is false; and
- T has no non-static data members of reference type.
Returns:
An object of type To.
Implicitly creates objects nested within the result ([intro.object]).
Each bit of the value representation of the result
is equal to the corresponding bit in the object representation of from.
Padding bits of the result are unspecified.
For the result and each object created within it,
if there is no value of the object's type corresponding to the value representation produced,
the behavior is undefined.
If there are multiple such values,
which value is produced is unspecified.
A bit in the value representation of the result is indeterminate
if it does not correspond to a bit in the value representation of from
or corresponds to a bit for which the smallest enclosing object
is not within its lifetime or has an indeterminate value ([basic.indet]).
A bit in the value representation of the result is erroneous
if it corresponds to a bit for which the smallest enclosing object has an erroneous value.
For each bit b in the value representation of the result
that is indeterminate or erroneous,
let u be the smallest object containing that bit enclosing b:
- If u is of unsigned ordinary character type or std::byte type, u has an indeterminate value if any of the bits in its value representation are indeterminate, or otherwise has an erroneous value.
- Otherwise, if b is indeterminate, the behavior is undefined.
- Otherwise, the behavior is erroneous, and the result is as specified above.
The result does not otherwise contain any indeterminate or erroneous values.
Append the following declaration to [bit.cast]:
Effects:
Equivalent to bit_cast<To>(from),
except that if a bit b in the value representation of the result
does not correspond to a bit in the value representation of from,
b is zero, not indeterminate.
[Example:
The following example assumes that
is .
bit_cast < char8_t , S > char8_t { 0 } — end example]