Doc. No.: N3526
Date: 2013-01-21
Reply To: Michael Price
<[email protected]>

Uniform initialization for arrays and class aggregate types

I. Introduction

This document proposes a slight relaxation of the rules for eliding braces from aggregate initialization in order to make initialization of arrays and class aggregates more uniform. This change is required to support class aggregate types with a single subaggregate member that behave similarly to their array counterparts (e.g. std::array).

II. Motivation and Scope

The Theoretical Problem

The C++ Standard defines the behavior for aggregate initialization using initializer list syntax in section 8.5.1. It defines two kinds of types as aggregate types: arrays and classes that meet certain restrictions.

Paragraph 8.5.1.1 reads:

An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no brace-or-equal-initializers for non-static data members (9.2), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).

The remaining paragraphs of 8.5.1 lay out the rules for initialization of these aggregate types. The syntax for initializing the two kinds of aggregate types is uniform for the simple cases, but as complexity grows beyond them, the initialization begins to diverge, as shown in the following examples.

  int aggr_array[2] = {1, 2};

  struct aggr_t {
      int a;
      int b;
  } instance = {3, 4};
  

Initializing aggregates with subaggregates of the same kind is also uniform.

  int aggr_array[2][2] = {{1, 2}, {3, 4}};

  struct aggr_t {
      struct {
          int a;
          int b;
      } x, y;
  } instance = {{1, 2}, {3, 4}};
  

However, when we begin to mix the aggregate types, uniform initialization begins to break down.

  struct aggr_t {
      int a;
      int b;
  }  array_of_aggr[2] = {{1, 2}, {3, 4}};

  struct aggr_ex_t {
      int x[2][2];
  };

  aggr_ex_t bad  = {{1, 2}, {3, 4}};      // Error: Too many initializers, see below for details
  aggr_ex_t good = {{{1, 2}, {3, 4}}};
  

The reason for this subtle behavior lies in paragraphs 8.5.1.2 and 8.5.1.6.

Paragraph 8.5.1.2 reads:

When an aggregate is initialized by an initializer list, as specified in 8.5.4, the elements of the initializer list are taken as initializers for the members of the aggregate, in increasing subscript or member order. Each member is copy-initialized from the corresponding initializer-clause. If the initializer-clause is an expression and a narrowing conversion (8.5.4) is required to convert the expression, the program is ill-formed. [ Note: If an initializer-clause is itself an initializer list, the member is list-initialized, which will result in a recursive application of the rules in this section if the member is an aggregate. — end note]

Paragraph 8.5.1.6 reads:

An initializer-list is ill-formed if the number of initializer-clauses exceeds the number of members or elements to initialize.

The outer {} pair causes paragraph 8.5.1.2 to apply. The first inner {} pair (i.e. {1, 2}) qualifies as the initializer-clause for the first member of aggr_ex_t, namely int x[2][2], and is governed by paragraph 8.5.1.10. The second inner {} pair (i.e. {3, 4}) is then considered an initializer-clause as well, but aggr_ex_t has no further member for it to initialize, so paragraph 8.5.1.6 applies and the statement is ill-formed.
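For comparison, the two forms that the current rules do accept for aggr_ex_t can be sketched as follows. This is a minimal illustration, assuming a C++11 compiler; the function names make_fully_braced, make_flat, and same_elements are hypothetical helpers introduced only for this sketch.

```cpp
#include <cassert>

struct aggr_ex_t {
    int x[2][2];
};

// Fully braced: outer pair for the class, next pair for x,
// innermost pairs for the two rows of x.
inline aggr_ex_t make_fully_braced() {
    aggr_ex_t v = {{{1, 2}, {3, 4}}};
    return v;
}

// Fully elided: the braces for x and for each row are all omitted,
// which the existing brace-elision rule (8.5.1.11) already permits.
inline aggr_ex_t make_flat() {
    aggr_ex_t v = {1, 2, 3, 4};
    return v;
}

// Hypothetical helper: element-wise comparison of the two results.
inline bool same_elements(const aggr_ex_t& a, const aggr_ex_t& b) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            if (a.x[i][j] != b.x[i][j]) return false;
    return true;
}
```

Only the partially braced form {{1, 2}, {3, 4}} falls between these two and is rejected.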

The Practical Problem

This limitation in the current standard was discovered during the process of working on a different proposal to extend the standard library std::array type to support more than one dimension. It turns out that implementing those changes in a way that does not introduce another type name (e.g. std::narray) might be impossible given the current language.

    // A pared-down array implementation
    template <class T, size_t N, size_t ... Rest>
    struct array
    {
        typedef array<T, Rest...> elem_type;
        elem_type __elems_[N];
    };

    template <class T, size_t N>
    struct array<T, N>
    {
        typedef T elem_type;
        elem_type __elems_[N];
    };

    // Using this multidimensional array

    // 1-dimension, everything works as expected
    int _a[2] = {0, 0};
    array<int, 2> a = {0, 0};

    // 2-dimensions, extra braces are needed
    int _b[2][2] = {{1, 1}, {2, 2}};
    array<int, 2, 2> b = {{{1, 1}, {2, 2}}};

    // 3-dimensions, this begins to get out-of-hand
    int _c[2][2][2] = {{{3, 3}, {4, 4}}, {{5, 5}, {6, 6}}};
    array<int, 2, 2, 2> c = {{{{{3, 3}, {4, 4}}}, {{{5, 5}, {6, 6}}}}};
  

All of the redundant braces except the first inner pair can be eliminated by implementing array differently, as seen in the following example. Yet the initialization syntax for this array still differs from that of C-style arrays.

    // A slightly better, pared-down array implementation
    template <class T, size_t N, size_t ... Rest>
    struct array
    {
        typedef typename array<T, Rest...>::value_type value_type[N];
        value_type __elems_;
    };

    template <class T, size_t N>
    struct array<T, N>
    {
        typedef T value_type[N];
        value_type __elems_;
    };
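With this second layout, the earlier 2- and 3-dimensional initializers would need only one extra brace pair each. The following is a minimal sketch, assuming a C++11 compiler; the namespace sketch, the member name elems (used here in place of the reserved-style __elems_), and the functions make_b and make_c are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Same structure as the implementation above: each level prepends
// one array dimension to the value_type of the remaining dimensions.
template <class T, std::size_t N, std::size_t... Rest>
struct array {
    typedef typename array<T, Rest...>::value_type value_type[N];
    value_type elems;
};

template <class T, std::size_t N>
struct array<T, N> {
    typedef T value_type[N];
    value_type elems;
};

} // namespace sketch

// 2 dimensions: one brace pair more than int[2][2] would need.
inline sketch::array<int, 2, 2> make_b() {
    sketch::array<int, 2, 2> b = {{{1, 1}, {2, 2}}};
    return b;
}

// 3 dimensions: still only one brace pair more than int[2][2][2].
inline sketch::array<int, 2, 2, 2> make_c() {
    sketch::array<int, 2, 2, 2> c = {{{{3, 3}, {4, 4}}, {{5, 5}, {6, 6}}}};
    return c;
}
```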
  

A Possible Solution

The desirable outcome is to be able to elide the braces in the scenario I have presented. Currently, paragraph 8.5.1.11 discusses valid brace elision and the semantics of initialization in those cases. I propose that paragraph 8.5.1.11 be modified to also allow brace elision when the aggregate type has exactly one member and that member is a subaggregate: if there is more than one initializer-clause, initialization behaves as if a {} pair surrounded all the initializer-clauses at that nesting level.

Paragraph 8.5.1.11 currently reads:

Braces can be elided in an initializer-list as follows. If the initializer-list begins with a left brace, then the succeeding comma-separated list of initializer-clauses initializes the members of a subaggregate; it is erroneous for there to be more initializer-clauses than members. If, however, the initializer-list for a subaggregate does not begin with a left brace, then only enough initializer-clauses from the list are taken to initialize the members of the subaggregate; any remaining initializer-clauses are left to initialize the next member of the aggregate of which the current subaggregate is a member.
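The existing rule quoted here can be illustrated with a small sketch of the "remaining initializer-clauses" behavior. It assumes a C++11 compiler; the type row_and_tail and the functions make_elided and make_braced are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical aggregate with an array subaggregate followed by a scalar.
struct row_and_tail {
    int row[2];
    int tail;
};

// The initializers for row do not begin with a left brace, so only
// enough clauses (1 and 2) are taken for row; the remaining clause (3)
// is left to initialize the next member, tail.
inline row_and_tail make_elided() {
    row_and_tail v = {1, 2, 3};
    return v;
}

// The equivalent fully braced form.
inline row_and_tail make_braced() {
    row_and_tail v = {{1, 2}, 3};
    return v;
}
```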

The solution presented above is represented in the following example.

  struct aggr_ex_t {
      int x[2][2];
  };

  aggr_ex_t fully_braced = {{{1, 2}, {3, 4}}};

  aggr_ex_t good = {{1, 2}, {3, 4}}; // Behaves as if the
                                     // initializer list had
                                     // been {{{1, 2}, {3, 4}}}

  // Would still be an error per paragraphs 8.5.1.6 and 8.5.1.11 since the
  // initializer-list would be equivalent to {{{1, 2}, {3, 4}, {5, 6}}}
  // and would have too many initializer-clauses.
  aggr_ex_t still_bad =  {{1, 2}, {3, 4}, {5, 6}};
  

Deeper nesting would apply the rules recursively (per the note in 8.5.1.2) as in the following example.

  struct aggr_ex_t {
      int x[2][2];
  };

  struct aggr_more_t {
      aggr_ex_t _a;
  };

  aggr_more_t fully_braced = {{{{1, 2}, {3, 4}}}};

  aggr_more_t good = {{{1, 2}, {3, 4}}};    // Both behave as if the
  aggr_more_t also_good = {{1, 2}, {3, 4}}; // initializer lists had
                                            // been {{{{1, 2}, {3, 4}}}}
  

III. Impact On the Standard

These changes would have the effect that some programs that are ill-formed under the current language would become well-formed. They will not affect programs that are currently well-formed. The changes to the standard document would be confined to the neighborhood of a single paragraph, 8.5.1.11.

IV. Design Decisions

Allowing brace-elision on aggregates with more than one member was considered and rejected as it seems likely to cause difficulties parsing programs that were already well-formed.
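One way to picture the concern is a multi-member aggregate whose partially braced initializer already has a meaning today. This is a minimal sketch, assuming a C++11 compiler; the type two_rows and the function make_two_rows are hypothetical and serve only as an illustration of the kind of program that must keep its current meaning.

```cpp
#include <cassert>

// Hypothetical aggregate with two array members.
struct two_rows {
    int a[2];
    int b[2];
};

// Already well-formed under the current rules: the first inner brace
// pair initializes a, and the second initializes b.
inline two_rows make_two_rows() {
    two_rows v = {{1, 2}, {3, 4}};
    return v;
}
```

If an implicit surrounding brace pair were also assumed here, both {1, 2} and {3, 4} would be offered to a alone, which would conflict with the existing interpretation.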

V. Technical Specifications

After paragraph 8.5.1.10, insert a new paragraph.

If the aggregate object being initialized has a single, subaggregate member, and there is more than a single initializer-clause (in a comma-separated list), then initialization occurs as if there had been a left brace immediately before the first initializer-clause and a right brace immediately after the last initializer-clause. [ Example:
Given the aggregate types

    struct S {
        int x[2][2];
    };

    struct A {
        S s;
    };
    
Then the following declarations are all equivalent
    A a1 = {{{{1, 2}, {3, 4}}}};
    A a2 = {{{1, 2}, {3, 4}}};
    A a3 = {{1, 2}, {3, 4}};
    A a4 = {1, 2, 3, 4};
end example]

Edit paragraph 8.5.1.12 (formerly 8.5.1.11) as follows.

Braces can be elided in an initializer-list as follows. If the initializer-list begins with a left brace, then the succeeding comma-separated list of initializer-clauses initializes the members of a subaggregate; it is erroneous for there to be more initializer-clauses than members (after applying the rule in 8.5.1.11). If, however, the initializer-list for a subaggregate does not begin with a left brace, then only enough initializer-clauses from the list are taken to initialize the members of the subaggregate; any remaining initializer-clauses are left to initialize the next member of the aggregate of which the current subaggregate is a member.

The example attached to paragraph 8.5.1.12 (formerly 8.5.1.11) remains the same.

VI. Acknowledgements

I'd like to thank my employer, Perceptive Software, for the allotment of time to work on this proposal and for their incredible support for career advancement in the R&D department.

Several committee members were willing to answer questions that I have had about formatting and presentation, and for that I'm particularly grateful to Marshall Clow, Beman Dawes, and Herb Sutter.

I'd also like to thank my co-workers who constantly put up with my fascination with this beautiful language, particularly Andrew Regier, Chris Sammis, Adam Riha, John Drouhard, William Lichtenberger, and Don Gay.