Document number: N3994
Date: 2014-05-22
Project: Programming Language C++, Evolution Working Group
Reply-to: Stephan T. Lavavej


Range-Based For-Loops: The Next Generation (Revision 1)


I. Introduction

This updates N3853 (see [1]) which proposed the syntax "for (elem : range)", by adding support for attributes and answering additional questions. Please see the original proposal for the rationale behind this feature, which is not repeated here.


II. Standardese

1. In 6.5 [stmt.iter]/1 and A.5 [gram.stmt], after:

iteration-statement:
    [...]
    for ( for-range-declaration : for-range-initializer ) statement

add:

    for ( for-range-identifier : for-range-initializer ) statement

2. In 6.5 [stmt.iter]/1 and A.5 [gram.stmt], after:

for-range-initializer:
    expression
    braced-init-list

add:

for-range-identifier:
    identifier attribute-specifier-seq(opt)

3. At the end of 6.5.4 [stmt.ranged], add a new paragraph:

A range-based for statement of the form
    for ( for-range-identifier : for-range-initializer ) statement
is equivalent to
    for ( auto&& for-range-identifier : for-range-initializer ) statement


III. Questions And Answers

Q15. Has this been implemented?

A15. Yes! David Vandevoorde and Jonathan Caves have reported that they were able to implement N3853 in less than an hour each. As N3853's "Q8. What about attributes?" was an open question, Vandevoorde took the opportunity to support them on both sides of the identifier, while this revision differs for the following reason:

Q16. Why are attributes not permitted to appear before the identifier?

A16. Their meanings would be ambiguous to humans.

A for-range-declaration of the form "auto&& elem" can be marked with attributes in several places: "[[attr1]] auto [[attr2]] && [[attr3]] elem [[attr4]]". (See the definitions of for-range-declaration in 6.5 [stmt.iter]/1, decl-specifier-seq in 7.1 [dcl.spec]/1, ptr-operator in 8 [dcl.decl]/4, and noptr-declarator in 8 [dcl.decl]/4, respectively.)
Here's what these attributes appertain to:

* attr1 appertains to elem. 6.5.4 [stmt.ranged]/1 produces "for-range-declaration = *__begin;", and 7 [dcl.dcl]/1 says: "The attribute-specifier-seq in a simple-declaration appertains to each of the entities declared by the declarators of the init-declarator-list."

* attr2 appertains to auto. 7.1 [dcl.spec]/1 says: "The optional attribute-specifier-seq in a decl-specifier-seq appertains to the type determined by the preceding decl-specifiers (8.3)."

* attr3 appertains to auto&&. 8.3.2 [dcl.ref]/1 says: "The optional attribute-specifier-seq appertains to the reference type."

* attr4 appertains to elem. 8.3 [dcl.meaning]/1 says: "The optional attribute-specifier-seq following a declarator-id appertains to the entity that is declared."

Permitting "for ([[attrBefore]] elem : range)" could lead to confusion - does that expand to "for ([[attrBefore]] auto&& elem : range)" (like attr1) or to "for (auto&& [[attrBefore]] elem : range)" (like attr3)? The Standard could unambiguously choose a particular meaning, but programmers could still be confused - not everyone reads the Standard for a living.

In contrast, permitting "for (elem [[attrAfter]] : range)" isn't problematic. As one would expect, attrAfter appertains to elem because this expands to "for (auto&& elem [[attrAfter]] : range)" (like attr4) with no potential for confusion.

In the highly unlikely event that a programmer needs to apply an attribute to auto or auto&&, they can simply fall back to The Original Syntax of range-for.

Q17. Are you sure that you want to use auto&& (to permit modification) instead of const auto& (to forbid modification)?

A17. Yes. This is a common question, because most loops observe elements and unintentional modification is dangerous. However, some loops have to modify elements - not just through assignments, but also through calling non-const member functions.
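As a minimal sketch of the loops this answer has in mind, the following C++14 code spells out "auto&&" explicitly, because the proposed "for (elem : range)" is not accepted by Standard compilers. All function names and element types here are hypothetical, chosen only for illustration. The sketch shows modification through assignment and through a non-const member function, and also that auto&& picks up constness when the range itself is const:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: modifies elements in place through assignment.
void clear_negatives(std::vector<int>& v) {
    for (auto&& elem : v) {          // elem is int&
        if (elem < 0) {
            elem = 0;
        }
    }
}

// Hypothetical helper: modifies elements in place through a non-const
// member function.
void append_bang(std::vector<std::string>& v) {
    for (auto&& elem : v) {          // elem is std::string&
        elem.push_back('!');
    }
}

// Hypothetical helper: observes a const range. No constness is added by
// the loop; it comes from the range.
std::size_t total_length(const std::vector<std::string>& v) {
    std::size_t n = 0;
    for (auto&& elem : v) {          // elem is const std::string&
        static_assert(
            std::is_const<std::remove_reference_t<decltype(elem)>>::value,
            "auto&& deduces a reference to const for a const range");
        n += elem.size();
    }
    return n;
}

// Small usage demonstration of the helpers above.
void demo_in_place_modification() {
    std::vector<int> ints{1, -2, 3};
    clear_negatives(ints);
    assert(ints[0] == 1 && ints[1] == 0 && ints[2] == 3);

    std::vector<std::string> words{"a", "bc"};
    append_bang(words);
    assert(words[0] == "a!" && words[1] == "bc!");
    assert(total_length(words) == 5);
}
```

Under the proposal, each "for (auto&& elem : v)" above could presumably be written as "for (elem : v)" with the same meaning.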

The philosophy behind this proposal's minimal range-for syntax is that programmers basically never view elements as being separate from their containers (or ranges in general). To avoid surprises, range-for should "transparently" access elements. That certainly means in-place (instead of copying), but it also means with the same constness as the range.

Implementation experience actually exists to guide this decision. Visual C++ provides the non-Standard syntax "for each (Elem elem in range)". In addition to being more verbose and less flexible than C++11's range-for syntax (which permits ADL customization), the implementation of "for each" adds constness for poorly understood reasons, so "for each (Elem& elem in range)" cannot be used to modify elements in-place. This limitation has repeatedly confused users, as encountered on Microsoft's internal mailing lists.

Programmers who really want to add constness when observing a non-const range will still be able to say "for (const auto& elem : range)", but it would be confusing and limiting if "for (elem : range)" silently added constness.

If the EWG wants to make adding constness slightly more convenient, syntax like "for (const elem : range)", "for (elem : const range)", or "for const (elem : range)" could be considered, but isn't being proposed here.

Q18. Are you sure that you want to use auto&& to handle prvalues instead of decltype(auto) or something else?

A18. Yes. N3853 considered various alternatives (see Q4), which would cause more problems than they would solve. In Issaquah, the EWG didn't object to relying on auto&&, which works for proxy objects most of the time (and compilers can freely warn about the dangers).

Thiago Macieira has suggested using decltype(auto) (see [2]), which has slightly different behavior than auto&&. They both produce X& for lvalues and X&& for xvalues. For prvalues, decltype(auto) produces X, while auto&& produces X&&.
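The deduction rules just described can be checked mechanically. The following C++14 sketch is only an illustration: the type X, the object "storage", and the three source functions are hypothetical stand-ins for whatever *__begin might return, and X is deliberately copyable here so that every line compiles.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X { };   // hypothetical element type, copyable and movable

X storage;      // hypothetical object that the references below refer to

X&  get_lvalue()  { return storage; }             // call is an lvalue
X&& get_xvalue()  { return std::move(storage); }  // call is an xvalue
X   get_prvalue() { return X{}; }                 // call is a prvalue

void check_deduced_types() {
    // lvalues: both forms deduce X&
    auto&&         a1 = get_lvalue();
    decltype(auto) d1 = get_lvalue();
    static_assert(std::is_same<decltype(a1), X&>::value, "auto&& gives X&");
    static_assert(std::is_same<decltype(d1), X&>::value, "decltype(auto) gives X&");

    // xvalues: both forms deduce X&&
    auto&&         a2 = get_xvalue();
    decltype(auto) d2 = get_xvalue();
    static_assert(std::is_same<decltype(a2), X&&>::value, "auto&& gives X&&");
    static_assert(std::is_same<decltype(d2), X&&>::value, "decltype(auto) gives X&&");

    // prvalues: auto&& deduces X&& (binding to a lifetime-extended temporary),
    // while decltype(auto) deduces X (a new object)
    auto&&         a3 = get_prvalue();
    decltype(auto) d3 = get_prvalue();
    static_assert(std::is_same<decltype(a3), X&&>::value, "auto&& gives X&&");
    static_assert(std::is_same<decltype(d3), X>::value, "decltype(auto) gives X");

    // The lvalue and xvalue cases all refer to the same object, in place.
    assert(&a1 == &storage && &d1 == &storage);
    assert(&a2 == &storage && &d2 == &storage);
    (void)a3;
    (void)d3;
}
```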
But while decltype(auto) preserves information about the element's value category, it doesn't work with non-copyable/non-movable types, whereas auto&& works (as Marc Glisse observed in c++std-ext-14747 and Richard Smith confirmed in c++std-ext-14749). Here's Smith's example, slightly expanded:

struct X {
    X(int) { }
    X(const X&) = delete;
};

X f() { return { 0 }; }

int main() {
    X&& r1 = f(); // OK: no copying
    auto&& r2 = f(); // also OK
    X x3 = f(); // error: copying
    decltype(auto) x4 = f(); // also error
}

This proposal uses auto&& because it works with anything that *__begin can return.

Q19. Instead of this proposal's semantics, should the syntax "for (elem : range)" be given the semantics of assigning to an elem variable previously declared outside the loop?

A19. No. This question (also raised by Macieira in [2]) is reasonable, and related to N3853's "Q7. What about shadowing?". The minimal syntax "for (elem : range)" can be given only one meaning, so it should definitely be chosen carefully. However, "outside-element-variable" semantics would not be useful in the vast majority of cases, would prevent this proposal from solving the problem of unintentional copies in C++11's range-for, and would actually encourage unnecessary copy assignments.

First, consider traditional iterator/pointer/index loops. (For brevity, I'll refer to iterators, but pointers and indices behave identically here.) It's usually preferable for iterators to be scoped to their loops - usually, but not always. Iterators declared outside of their loops can be used to do a couple of things: carry information into the loop, and carry information out of the loop. Occasionally, a function obtains an iterator inside a range, and wants to loop over the remaining subrange, instead of starting from the beginning. A loop-scoped iterator could be copied from the given iterator, but directly using the given iterator is often simpler (as it avoids introducing an additional variable).
More importantly, longer-lived iterators can be used to carry information outside of their loops. After a loop with one or more potential breaks has finished, the iterator can be inspected to determine whether the loop ran to the end of the range, or broke out earlier.

Next, observe how range-based for-loops are different. They give elements to users, not iterators (although they internally use iterators). They insist on starting at the beginning, which couldn't be affected by "outside-element-variable" semantics, because element values don't contain positioning information like iterator values do. (If the EWG wants to make it convenient to start range-based for-loops somewhere other than the beginning, that should be accomplished via range adapters; such adapters would work equally well with C++11's range-for and this proposal.)

Most importantly, "outside-element-variable" semantics would have significant difficulties with carrying information out of the loop:

* Observing the element that the loop was looking at when it finished (either normally or early) doesn't provide positioning information (i.e. where the loop finished), unlike observing an iterator.

* The case of running to completion, and the case of breaking while observing the range's last element, produce the same observable state for an "outside-element-variable", unlike an iterator.

* Dealing with a potentially empty loop is problematic for an "outside-element-variable", unlike an iterator. This is especially problematic if there are no "sentinel values" available, i.e. values for initializing the outside element that can be distinguished from any elements expected in the input range.

So while outside-iterator loops are occasionally very useful, "outside-element-variable" loops are fraught with peril (and inefficiency due to copy assignments).
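The difficulties listed above can be sketched in C++14. Since "outside-element-variable" semantics don't exist in any Standard, the second function below only emulates them with an explicit copy assignment; the function names, the running-sum break condition, and the "initial" parameter are all hypothetical and serve only as illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Outside-iterator loop: the iterator outlives the loop, so the position
// at which the loop stopped can be reported. A result equal to v.size()
// means that the loop ran to completion.
std::size_t stop_position(const std::vector<int>& v, int limit) {
    int sum = 0;
    auto it = v.begin();
    for (; it != v.end(); ++it) {
        sum += *it;
        if (sum >= limit) {
            break;
        }
    }
    return static_cast<std::size_t>(it - v.begin());
}

// Emulated "outside-element-variable" loop: only the last element value
// seen survives the loop, at the cost of a copy assignment per iteration.
int last_element_seen(const std::vector<int>& v, int limit, int initial) {
    int sum = 0;
    int elem = initial;   // must be initialized with some value
    for (auto&& current : v) {
        elem = current;   // copy assignment
        sum += elem;
        if (sum >= limit) {
            break;
        }
    }
    return elem;
}

// Small usage demonstration of the two loops above.
void demo_outside_variables() {
    const std::vector<int> v{1, 2, 3};
    const std::vector<int> empty;

    // The iterator distinguishes "broke at the last element" from
    // "ran to completion".
    assert(stop_position(v, 6) == 2);
    assert(stop_position(v, 100) == 3);

    // The element variable cannot: both cases leave it equal to 3.
    assert(last_element_seen(v, 6, 0) == 3);
    assert(last_element_seen(v, 100, 0) == 3);

    // An empty range leaves only the initial value behind, which looks
    // just like a range whose last element happened to be 0.
    assert(last_element_seen(empty, 6, 0) == 0);
    assert(stop_position(empty, 6) == 0);
}
```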

N3853 argued that "In addition to reducing overall verbosity, making common cases terse has the bonus effect of making uncommon cases stand out due to their remaining verbosity." For ranges, the most common case is looping over elements in-place. Giving the current element a name (that can be conveniently mentioned by the body of the loop) requires initializing a reference that's scoped to the current iteration - it has to be a reference in order to work in-place, and references can't be rebound. (And as we've just seen, an "outside-element-variable" isn't really useful.) The minimal syntax and in-place semantics of this proposal have been chosen to work together.


IV. Acknowledgements

Thanks to David Vandevoorde and Jonathan Caves for providing implementation experience. Additionally, Vandevoorde suggested the support for attributes that was added in this revision.

Thanks to Thiago Macieira, Marc Glisse, and Richard Smith for their comments.

Thanks to Deskin Miller, Eric Albright, Giovanni Dicanio, and Neil Coles for reviewing this proposal.


V. References

All of the Standardese citations in this proposal are to Working Paper N3936:
http://www.open-std.org/jtc1/sc22/wg21/prot/14882fdis/n3936.pdf

[1] N3853 "Range-Based For-Loops: The Next Generation" by Stephan T. Lavavej:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3853.htm

[2] "Why do range-for loops require a variable declaration?" by Thiago Macieira:
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/BgE7b7aqE08/4fJVPes-8fgJ

(end)