Fix defects in floating-point std::from_chars
(LWG3081, LWG3082, LWG3456)
- Document number: P4168R0
- Date: 2026-04-05
- Audience: SG6
- Project: ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
- Reply-to: Jan Schultke <janschultke@gmail.com>
- GitHub Issue: wg21.link/P4168/github
- Source: github.com/eisenwave/cpp-proposals/blob/master/src/fix-floating-from-chars.cow
The behavior of floating-point std::from_chars is inconsistent;
the implementations diverge from each other,
and every implementation diverges from the wording in the standard.
Contents
Introduction
Implementation divergence
Defective wording
Range of representable values
No handling of non-ISO/IEC 60559 types
No specification of parsing
Signaling NaNs make std::from_chars unimplementable
Design
Limited design options
Note on ISO/IEC 60559 conformance
Impact on existing code
Alternatives considered
Further edge cases
Implementation experience
Editorial problems
Wording
[version.syn]
[charconv.to.chars]
[charconv.from.chars]
References
1. Introduction
In 2018, [LWG3081] pointed out that the std::from_chars API
does not give the user a means of distinguishing floating-point overflow and underflow.
The perceived defect is a loss of functionality compared to std::strtod.
This issue remains open to this day despite having Priority 2.
On overflow or underflow, libstdc++ leaves value unmodified, but libc++ and MSVC STL set it to infinity or zero.
The returned ec is errc::result_out_of_range for all implementations,
but the wording requires that value is set to the rounded parsed value and
that ec is value-initialized,
which no one implements.
In 2023, [P2827R1] attempted to solve this issue, but died in LEWG eventually. Lack of motivation within the paper was among the feedback given. Perplexingly, [P2827R1] does not mention the implementation divergence between libstdc++ and MSVC STL, which already existed at the time; mentioning it surely would have added motivation.
The issue is much more severe than the LWG issue and paper make it out to be.
The wording is defective, implementations diverge, and users suffer from it
to the point where Boost.Charconv ([BoostCharConv])
should arguably be recommended to users over the standard feature;
at least it is portable and well-specified.
1.1. Implementation divergence
There are three cases of interest where typical implementations likely diverge:
- Treatment of floating-point overflow (always resulting in infinity), such as when parsing "1e+1000" as std::float32_t.
- Treatment of floating-point underflow that results in an inexact zero, such as when parsing "1e-10000" as std::float32_t.
- Treatment of floating-point underflow that results in an inexact subnormal, such as when parsing "1e-45" as std::float32_t.
ISO/IEC 60559 leaves the detection of tininess
up to the implementation.
This means that it is unclear whether the "1e-45" case is analogous
to floating-point underflow.
At the time of writing,
implementations behave as shown in the table below
when parsing a string as float (binary32).
Each cell shows what value the given float& is set to,
as well as the returned std::errc value.
| Input | libstdc++ | libc++ | MSVC STL |
|---|---|---|---|
| … | … / … | … / … | … / … |
| … | unmodified / … | unmodified / … | unmodified / … |
| … | … / … | … / … | … / … |
| … | unmodified / … | … / … | … / … |
| … | unmodified / … | … / … | … / … |
| … | unmodified / … | … / … | … / … |
| … | unmodified / … | … / … | … / … |
1.2. Defective wording
1.2.1. Range of representable values
Out of the implementations above, none comply with the existing wording. [charconv.from.chars] states:
[…] If the parsed value is not in the range representable by the type of value, value is unmodified and the member ec of the return value is equal to errc::result_out_of_range. Otherwise, value is set to the parsed value, after rounding according to round_to_nearest ([round.style]), and the member ec is value-initialized.
The problem is that for several years while std::from_chars existed,
there was no definition of the "range of representable values"
(the notion the quote above relies on)
in the standard.
That definition was only added recently in 2023 via [CWG2723];
the definition in [basic.fundamental] paragraph 13 includes all real numbers
in that range,
meaning that every standard library is incorrect
for returning errc::result_out_of_range
on overflow and underflow.
Ergo, not only is underflow indistinguishable from overflow via error code,
overflow, underflow, and exact rounding are all indistinguishable
if the wording is implemented (which it isn't).
1.2.2. No handling of non-ISO/IEC 60559 types
Another problem with the wording is that it does not account for types
that do not have a representation of infinity or NaN.
The string pattern accepted by std::from_chars
is always permitted to be "inf",
but what should happen when returning a floating-point type without infinity representations?
1.2.3. No specification of parsing
Last but most severe, the entire mechanism of mapping the matched pattern onto a floating-point value is described as:
Otherwise, value is set to the parsed value, after rounding according to round_to_nearest ([round.style]), and the member ec is value-initialized.
The fact that a string such as "inf" results in infinity,
or that a string of digits results in a particular numeric value, is handwaved as
the single word "parsed",
which is not explained anywhere.
While most such cases are obvious, in libc++,
std::to_chars represents signaling NaNs as "nan(snan)",
and it is unclear whether std::from_chars
can recover a signaling NaN from such a string
because there exists no wording for "parsed".
The intent is presumably to handle these cases like std::strtod,
but this is not stated anywhere;
only the syntax of the matched pattern is specified,
not the interpretation of the pattern.
The description of the matched pattern is also deemed underspecified,
as explained in [LWG3456].
1.2.4. Signaling NaNs make std::from_chars unimplementable
Yet another problem is that std::from_chars
cannot be implemented for types that can represent signaling NaNs;
the problem lies in [charconv.to.chars] paragraph 2:
The functions that take a floating-point value but not a precision parameter ensure that the string representation consists of the smallest number of characters such that there is at least one digit before the radix point (if present) and parsing the representation using the corresponding from_chars function recovers value exactly.
It is not entirely clear whether this requirement was intended
to apply to non-finite inputs,
but it technically could be read in such a way.
If so, it would additionally mandate that a distinct string such as "nan(snan)" instead of "nan"
is emitted when the value is a signaling NaN,
which implementations do.
The problem is that the pattern accepted by std::from_chars
is that of std::strtod, which supports
"inf", "nan", and "nan(n-char-sequence)"
but no "snan" as inputs.
std::strtod interprets every NaN input as a quiet NaN,
which is presumably the intent for std::from_chars,
even if the wording does not explicitly say so;
in fact, the wording does not say anything about how these text representations of
infinity or NaN are interpreted as a floating-point value by std::from_chars.
As things stand, the requirement to recover the value exactly
cannot be implemented for signaling NaNs in any reasonable way.
Therefore, it is necessary to exempt signaling NaNs from being recovered.
std::strtod has this exemption, as does ISO/IEC 60559;
see §5.12.1 External character sequences representing zeros, infinities, and NaNs:
Conversion of a signaling NaN in a supported format to an external character sequence should produce a language-defined one of "snan" or "nan" or a sequence that is equivalent except for case, with an optional preceding sign.
2. Design
In short, the MSVC STL and libc++ behavior is proposed, with additional wording clarifications.
2.1. Limited design options
At this stage,
the available design options are extremely limited because
floating-point implementations of std::from_chars
have already existed for years
(MSVC STL since 2018, libstdc++ since 2021, libc++ since 2025).
Changing the behavior of std::from_chars
can break substantial amounts of existing code.
There seem to be only two plausible options with sufficiently low impact:
- Standardize the libstdc++ behavior, which is to leave the given float& unmodified on overflow/underflow.
- Standardize the MSVC STL and libc++ behavior, which is to overwrite the given float& with either zero or infinity.
I argue that the libc++ behavior is strictly better because
- the behavior resolves [LWG3081] by making underflow distinguishable from overflow,
- users may actually want to ignore errc::result_out_of_range and simply take the 0 resulting from underflow and infinity resulting from overflow, which the libc++ behavior makes possible, and
- the behavior is consistent with std::strtod.
To provide the maximum amount of information to the user, it is also important that the user obtains a correctly signed zero on underflow and a correctly signed infinity on overflow. That is, the sign of the result value should always be the sign of the parsed mathematical value, even if rounded to zero or infinity. Both MSVC STL and libc++ provide correctly signed values.
2.2. Note on ISO/IEC 60559 conformance
The choice of (correctly signed) zeros and infinities
to signal underflow and overflow is far from arbitrary;
std::strtod has that design,
and correctly implements the ISO/IEC 60559 (or IEEE-754) operation
convertFromDecimalCharacter
for floating-point types
(see C23 §F.3).
std::from_chars should have that same behavior to increase ISO/IEC 60559
conformance.
[charconv.from.chars] even uses the round_to_nearest rounding mode,
corresponding to roundTiesToEven,
already,
which makes it extremely similar to the ISO/IEC 60559 operation.
As mentioned in §1.2.4. Signaling NaNs make std::from_chars unimplementable,
the treatment of signaling NaNs also has to be addressed to match the convertFromDecimalCharacter
operation.
The easiest way to do so would be to double down on the practice
of always producing a quiet NaN for both quiet and signaling NaNs;
the wording merely needs to relax the exact recoverability requirement for that case.
2.3. Impact on existing code
Furthermore, we must consider how the change in behavior affects existing users, if we standardize the behavior of the other implementation:
- Users of libstdc++ likely ignore the written value on errc::result_out_of_range because it is unmodified. Now modifying the value does not cause any problem if they were ignoring it already.
- Users of MSVC STL and libc++ may rely on infinity or zero being written on errc::result_out_of_range, and suddenly leaving the value unmodified would result in use of erroneous or even indeterminate values for these users.
When considering these two options, standardizing the libstdc++ behavior appears riskier.
2.4. Alternatives considered
Neither [LWG3081] nor [P2827R1] fully resolve the defects in the wording.
Each proposes a different value that should be written for overflow;
the LWG issue proposes , and
the paper proposes .
Neither of these behaviors standardizes existing practice in standard libraries,
is particularly well-motivated,
or matches the behavior of other functions such as std::strtod.
Both alternatives make it difficult to simply ignore the
error and use the value.
2.5. Further edge cases
As explained in §1.2.2. No handling of non-ISO/IEC 60559 types,
there are various edge cases such as parsing the string "inf"
and attempting to store it in a floating-point type
that does not have a representation of infinity.
These edge cases are currently not handled in any explicit way
by major standard library implementations.
To my knowledge, all major standard libraries assume ISO/IEC-60559-like types,
meaning that negative and positive infinity and NaN are also supported.
The proposed solution is to handle these cases
as closely as possible to std::strtod,
as specified in C23 §7.24.1.5.
For example, this means that when "inf" is parsed but not representable in the type,
the greatest finite value is returned instead, and
the error code is errc::result_out_of_range.
3. Implementation experience
The proposed behavior has been released in MSVC STL and libc++, except that the behavior in §2.5. Further edge cases does not need to be implemented and only exists on paper for these implementations.
The proposed behavior has also been implemented in Boost.Charconv
([BoostCharConv]).
4. Editorial problems
Beyond the normative issues, there are also significant editorial problems with [charconv.from.chars] which make it difficult to understand what pattern it actually accepts. Namely, the wording is currently:
Effects: The pattern is the expected form of the subject sequence in the "C" locale, as described for strtod, except that
- the sign '+' may only appear in the exponent part;
- if fmt has chars_format::scientific set but not chars_format::fixed, the otherwise optional exponent part shall appear;
- if fmt has chars_format::fixed set but not chars_format::scientific, the optional exponent part shall not appear; and
- if fmt is chars_format::hex, the prefix "0x" or "0X" is assumed.
In any case, the resulting value is one of at most two floating-point values closest to the value of the string matching the pattern.
We don't actually get to understand what the pattern is here,
just that we should look at strtod and make mental changes relative to that.
C23 §7.24.1.5 paragraph 3
then describes the sequence partially using a mixture of prose and grammar,
containing descriptions such as
[…] a nonempty sequence of decimal digits optionally containing a decimal-point character, then an optional exponent part as defined in 6.4.4.3, excluding any digit separators (6.4.4.2);
Even the C wording doesn't directly describe the pattern but refers to various
other definitions such as exponent part
and digit separators
that must be looked up elsewhere.
Anyone trying to understand what std::from_chars actually accepts
needs to go through a multi-standard scavenger hunt and keep track of two layers
of "excluding" and "except that".
This is an ineffective way to communicate the behavior of std::from_chars;
the wording should be rewritten so that the accepted pattern can be understood
solely through [charconv.from.chars].
[LWG3456] performs such a rewrite,
but does not fully decouple from the C wording.
A proper rewrite can also fix the lack of clarity pointed out in [LWG3082].
5. Wording
The changes are relative to [N5032].
[version.syn]
Bump the feature-test macro in [version.syn] as follows:
.
__cpp_lib_to_chars can be viewed as a representative feature-test
for all of <charconv>.
[charconv.to.chars]
Change [charconv.to.chars] paragraph 2 as follows:
The functions that take a floating-point value
but not a precision parameter ensure that
when the given value is finite,
the string representation consists of the smallest number of characters
such that there is at least one digit before the radix point (if present) and
parsing the representation using the corresponding from_chars function
recovers value exactly.
[Note:
This guarantee applies only if to_chars and from_chars
are executed on the same implementation.
— end note]
If there are several such representations,
the representation with the smallest difference
from the floating-point argument value is chosen,
resolving any remaining ties using rounding
according to round_to_nearest ([round.style]).
[charconv.from.chars]
Change [charconv.from.chars] as follows:
¶
All functions named from_chars analyze the string [first, last) for a pattern,
where [first, last) is required to be a valid range.
If no characters match the pattern,
value is unmodified,
the member ptr of the return value is first and
the member ec is equal to errc::invalid_argument.
[Note: If the pattern allows for an optional sign, but the string has no digit characters following the sign, no characters match the pattern. — end note]
Otherwise, the characters matching the pattern
are interpreted as a representation of a value of the type of value.
The member ptr of the return value points to the first character not matching the pattern,
or has the value last if all characters match.
If the parsed value is not in the range representable by the type of value,
value is unmodified and
the member ec of the return value is equal to errc::result_out_of_range.
Otherwise, value is set to the parsed value,
after rounding according to round_to_nearest ([round.style]),
and the member ec is value-initialized.
¶
Let the pattern be the maximal sequence of characters starting with first that
matches a from-chars-signed-integer-pattern if value is of a signed type, and
matches a from-chars-unsigned-integer-pattern otherwise.
from-chars-signed-integer-pattern:
    -_opt from-chars-unsigned-integer-pattern
from-chars-unsigned-integer-pattern:
    base-digit-seq
base-digit-seq:
    base-digit base-digit-seq_opt
The code points U+0030..U+0039 DIGIT ZERO..NINE
represent digit characters with value 0..9, respectively;
both U+0041..U+005A LATIN CAPITAL LETTER A..Z and
U+0061..U+007A LATIN SMALL LETTER A..Z
represent digit characters with value 10..35, respectively.
and other character types.
¶
Preconditions:
base has a value between 2 and 36 (inclusive).
¶
Effects:
The pattern is the expected form of the subject sequence
in the "C" locale for the given nonzero base,
as described for strtol,
except that no "0b" or "0B" prefix shall appear
if the value of base is 2,
no "0x" or "0X" prefix shall appear if the value of base is 16,
and except that '-' is the only sign that may appear,
and only if value has a signed type.
¶
Effects:
If the pattern is empty,
the effect is as described above.
Otherwise,
an integer value is obtained by interpreting the
base-digit-seq in the given base, and
negating that value if the leading U+002D HYPHEN MINUS is present.
If that integer value is in the range of values representable by the type of value,
value is set to it and
the member ec of the result is value-initialized;
otherwise, value is unmodified and
the member ec of the result is errc::result_out_of_range.
¶ Throws: Nothing.
¶
Let the pattern be the maximal sequence of characters starting with first
that matches a from-chars-floating-pattern.
from-chars-floating-pattern:
    -_opt from-chars-unsigned-floating-pattern
from-chars-unsigned-floating-pattern:
    from-chars-fixed-float
    from-chars-scientific-float
    from-chars-general-float
    from-chars-hex-float
    from-chars-infinity
    from-chars-nan
from-chars-fixed-float:
    digit-sequence decimal-fraction_opt
from-chars-scientific-float:
    digit-sequence decimal-fraction_opt exponent-part
from-chars-general-float:
    digit-sequence decimal-fraction_opt exponent-part_opt
from-chars-hex-float:
    hexadecimal-digit-sequence hexadecimal-fraction_opt binary-exponent-part_opt
decimal-fraction:
    . digit-sequence
hexadecimal-fraction:
    . hexadecimal-digit-sequence
from-chars-infinity:
    INF
    INFINITY
from-chars-nan:
    NAN
    NAN( nan-char-seq_opt )
nan-char-seq:
    nan-char nan-char-seq_opt
nan-char:
    digit
    nondigit
¶
Matching of a from-chars-floating-pattern is subject to the following:
- from-chars-fixed-float, from-chars-scientific-float, from-chars-general-float, and from-chars-hex-float may only be matched if fmt equals chars_format::fixed, chars_format::scientific, chars_format::general, and chars_format::hex, respectively;
- case is ignored in from-chars-infinity and in the leading NAN part of a from-chars-nan; and
- a digit separator ' is not matched anywhere in a from-chars-floating-pattern.
¶
Preconditions:
fmt has the value of one of the enumerators of chars_format.
¶
Effects:
The pattern is the expected form of the subject sequence
in the "C" locale,
as described for strtod, except that
- the sign '+' may only appear in the exponent part;
- if fmt has chars_format::scientific set but not chars_format::fixed, the otherwise optional exponent part shall appear;
- if fmt has chars_format::fixed set but not chars_format::scientific, the optional exponent part shall not appear; and
- if fmt is chars_format::hex, the prefix "0x" or "0X" is assumed.
In any case, the resulting value
is one of at most two floating-point values
closest to the value of the string matching the pattern.
¶ Effects:
- If the pattern is empty, the effect is as described above.
- Otherwise, if the pattern is of the form - from-chars-infinity,
  - if negative infinity is representable in the type of value, value is set to negative infinity and the member ec of the result is value-initialized;
  - otherwise, value is set to the lowest finite value representable in the type of value and the member ec of the result is errc::result_out_of_range.
- Otherwise, if the pattern is of the form from-chars-infinity,
  - if either positive or unsigned infinity is representable in the type of value, value is set to such an infinity and the member ec of the result is value-initialized;
  - otherwise, value is set to the greatest finite value representable in the type of value and the member ec of the result is errc::result_out_of_range.
- Otherwise, if the pattern is of the form -_opt from-chars-nan,
  - if quiet NaN is representable in the type of value, value is set to quiet NaN and the member ec of the result is value-initialized, where the meaning of the optional nan-char-seq is implementation-defined,
  - otherwise, value is set to positive or unsigned zero and the member ec of the result is errc::invalid_argument.
- Otherwise, the character sequence represents a rational number whose value is determined by
  - interpreting the pattern without leading U+002D HYPHEN MINUS but with an added hexadecimal-prefix as a hexadecimal-floating-point-literal if fmt is chars_format::hex, and
  - interpreting the pattern without leading U+002D HYPHEN MINUS as a decimal-floating-point-literal otherwise,
  and which is rounded according to round_to_nearest ([round.style]). value is set to the rounded result, and the member ec of the result is
  - value-initialized if isnormal(value) is true,
  - errc::result_out_of_range if the rounded result is negative or positive infinity or is not in the range of values representable by the type of value,
  - errc::result_out_of_range if the rounded result is zero but the rational number is not zero, or
  - an implementation-defined choice of either errc::result_out_of_range or a value-initialized result if the rounded result is a subnormal.
¶ Throws: Nothing.
See also: ISO/IEC 9899:2024, 7.24.2.6, 7.24.2.8