Document number: N4126
Date: 2014-07-29
Project: Programming Language C++, Language Evolution Working Group
Reply-to: Oleg Smolsky <oleg.smolsky@gmail.com>

Explicitly defaulted comparison operators

Revision history

N3950 was presented to the Evolution WG at the Rapperswil meeting and the response was very positive. A later revision, N4114, was amended to handle the following points requested at the meeting:

This proposal makes the following changes after the technical review on the c++std-ext list:

I. Introduction

Equality for composite types is an age-old problem which affects an overwhelming share of the programming community, specifically the users who program with Regular[1] types and expect them to compose naturally.

The proposal presents a means of generating default equality/inequality for Regular types, as well as relational operators for Totally Ordered[2] user-defined types. Such types should be trivial to implement and easy to understand.

Specifically, I argue that:

  1. The presence of operator==() implies that the type is Regular. I expect that equality is defined to return true if and only if both objects represent the same value (this relation is transitive, reflexive and symmetric).
  2. The presence of operator<() implies that the type is Totally Ordered.

Finally, the feature is strictly "opt in" so that semantics of existing code remain intact.
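The two expectations above can be spot-checked for a concrete type. The following is a minimal sketch, assuming a hypothetical `point` type and hypothetical helper names (`equality_axioms_hold`, `trichotomy_holds`) that are not part of the proposal; it tests the axioms for given sample values only and proves nothing about the type in general.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical value type, used only for illustration.
struct point
{
    int x, y;
};

bool operator==(const point &a, const point &b)
{
    return a.x == b.x && a.y == b.y;
}

bool operator<(const point &a, const point &b)
{
    return std::tie(a.x, a.y) < std::tie(b.x, b.y);
}

// Checks reflexivity, symmetry and transitivity of == for one triple of values.
bool equality_axioms_hold(const point &a, const point &b, const point &c)
{
    const bool reflexive  = (a == a);
    const bool symmetric  = ((a == b) == (b == a));
    const bool transitive = !((a == b) && (b == c)) || (a == c);
    return reflexive && symmetric && transitive;
}

// Checks trichotomy: exactly one of a < b, b < a, a == b holds.
bool trichotomy_holds(const point &a, const point &b)
{
    const int count = int(a < b) + int(b < a) + int(a == b);
    return count == 1;
}
```

A type whose members are all Regular and Totally Ordered passes these checks for any sample values when its operators are written memberwise, which is the composition property the proposal relies on.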

II. Motivation and scope

A simple means of defining equality is really useful in modern C++ code that operates on types composed of Regular members. The definition of equality is trivial in such cases: member-wise comparison. Inequality should then be generated as its boolean negation.

This proposal focuses on Regular and Totally Ordered types as they naturally compose. Such cases are becoming more prevalent as people program more with value types and so writing the same equality and relational operators becomes tiresome. This is especially true when trying to lexicographically compare members to achieve total ordering.

Consider the following trivial example where a C++ type represents some kind of user record:

struct user
{
    uint32_t id, rank, position;

    std::string first_name, last_name;

    std::string address1, address2, city, state, country;
    uint32_t us_zip_code;

    friend bool operator==(const user &, const user &);
    friend bool operator!=(const user &, const user &);

    friend bool operator<(const user &, const user &);
    friend bool operator>=(const user &, const user &);
    friend bool operator>(const user &, const user &);
    friend bool operator<=(const user &, const user &);
};

Verbosity

The structure consists of regular members and the implementation of the equality operator is trivial yet verbose:

bool operator==(const user &a, const user &b)
{
    return a.id == b.id &&
           a.rank == b.rank &&
           a.position == b.position &&
           a.first_name == b.first_name &&
           a.last_name == b.last_name &&
           a.address1 == b.address1 &&
           a.address2 == b.address2 &&
           a.city == b.city &&
           a.state == b.state &&
           a.country == b.country &&
           a.us_zip_code == b.us_zip_code;
}

Also, the composite type is naturally Totally Ordered, yet that takes even more code:

bool operator<(const user &a, const user &b)
{
    // I could implement the full lexicographical comparison of members manually, yet I
    // choose to cheat by using the standard library (std::tie from <tuple>)
    return std::tie(a.id, a.rank, a.position,
                    a.first_name, a.last_name,
                    a.address1, a.address2, a.city, a.state, a.country,
                    a.us_zip_code)
           <
           std::tie(b.id, b.rank, b.position,
                    b.first_name, b.last_name,
                    b.address1, b.address2, b.city, b.state, b.country,
                    b.us_zip_code);
}

Specifically, this code, while technically required, suffers from the following issues:

Correctness

It is vital that the equal/unequal, less/greater-or-equal and greater/less-or-equal pairs behave as boolean negations of each other. After all, we are building a total ordering and the world would make no sense if both operator==() and operator!=() returned false!

As such, it is common to implement these operators in terms of each other.

Inequality for Regular types:

bool operator!=(const user &a, const user &b) 
{
    return !(a == b);
}

Relational operators for Totally Ordered types:

bool operator>=(const user &a, const user &b)
{
    return !(a < b);
}

bool operator>(const user &a, const user &b)
{
    return b < a;
}

bool operator<=(const user &a, const user &b)
{
    return !(a > b);
}

Notes:

III. The syntax

The proposed syntax: long form

Member-wise generation of special functions is already present in the Standard (see Section 12), so it seems natural to extend the scope of generation and reuse the existing syntax.

The proposed syntax for generating the new explicitly defaulted non-member operators is as follows:

struct Thing
{
    int a, b, c;
    std::string d;
};

bool operator==(const Thing &, const Thing &) = default;
bool operator!=(const Thing &, const Thing &) = default;

There are cases where members are private and so the operators need to be declared as friend. Consider the following syntax:

class AnotherThing
{
    int a, b;

public:
    // ...

    friend bool operator<(AnotherThing, AnotherThing) = default;
    friend bool operator>(AnotherThing, AnotherThing) = default;
    friend bool operator<=(AnotherThing, AnotherThing) = default;
    friend bool operator>=(AnotherThing, AnotherThing) = default;
};

I feel this is a natural choice because:

The proposed syntax: short form

Several committee members expressed a strong desire for a shorter form of notation that would radically reduce the amount of code it takes to declare the non-member operators. Here is the shorthand that expands to the long form defined above.

struct Thing
{
    int a, b, c;
    std::string d;

    default: ==, !=, <, >, <=, >=;   // defines the six non-member functions
};

Notes:

IV. Design decisions and discussion

A library-only solution

It is possible to write a templated "CRTP" base class that implements equality and relational operators. For an example, see Boost.Operators.
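A minimal sketch of such a base class follows. The names `comparable` and `version` are hypothetical and the code is hand-rolled for illustration rather than taken from Boost.Operators. Note that the derived type must still supply operator==() and operator<() by hand; the library can only derive the remaining four operators, not the memberwise comparison itself.

```cpp
#include <cassert>

// Hypothetical CRTP helper, loosely modelled on the idea behind Boost.Operators.
// The derived type supplies operator== and operator<; the rest are derived here.
template <typename Derived>
struct comparable
{
    friend bool operator!=(const Derived &a, const Derived &b) { return !(a == b); }
    friend bool operator>(const Derived &a, const Derived &b)  { return b < a; }
    friend bool operator<=(const Derived &a, const Derived &b) { return !(b < a); }
    friend bool operator>=(const Derived &a, const Derived &b) { return !(a < b); }
};

// Hypothetical user type that opts in via public derivation.
struct version : comparable<version>
{
    int major_no, minor_no;

    version(int ma, int mi) : major_no(ma), minor_no(mi) {}

    // Still written by hand: the memberwise equality...
    friend bool operator==(const version &a, const version &b)
    {
        return a.major_no == b.major_no && a.minor_no == b.minor_no;
    }

    // ...and the memberwise lexicographical ordering.
    friend bool operator<(const version &a, const version &b)
    {
        return a.major_no < b.major_no ||
               (a.major_no == b.major_no && a.minor_no < b.minor_no);
    }
};
```

With this in place, expressions such as `version(1, 2) != version(1, 3)` resolve to the friends injected by the base class through argument-dependent lookup.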

Comment from Nevin Liber: I believe this breaks standard layout if you use private derivation (and it is unlikely that we will want public derivation, given the deprecation of things like unary_function and binary_function).

Other committee members mentioned the "upcoming, to be specified" reflection facilities, yet I feel a first-class language feature is needed now.

Mutable members

The Evolution WG was divided on the mutable treatment. There were two mutually exclusive views:

  1. Exclude mutable members from the comparison operators.
    - such members were invented for caching data derived from the object's state
    - as such, they are not a part of the value type and should not participate in comparisons
  2. Include mutable members when doing comparisons (i.e. no special treatment)
    - such implementation is consistent with special member functions: copy constructor and assignment operator

I prefer option (1) above, yet the only way to resolve the committee deadlock is to make code with such members ill-formed. The user would have to implement the comparison operators manually. The committee thus reserves an option to reconsider the decision at a later stage, as part of a follow-up proposal.
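As an illustration of the manual route, here is a minimal sketch of a type with a mutable cache whose hand-written operators follow option (1) and skip the cached data. The `document` class and its members are hypothetical, chosen only to show the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type with a cache. Under the proposal, "= default" on its
// comparison operators would be ill-formed because of the mutable members.
class document
{
    std::string text_;
    mutable std::size_t cached_words_ = 0;   // derived data, not part of the value
    mutable bool cache_valid_ = false;

public:
    explicit document(std::string text) : text_(std::move(text)) {}

    // Counts space-separated words, computing the result lazily.
    std::size_t word_count() const
    {
        if (!cache_valid_) {
            std::size_t n = 0;
            bool in_word = false;
            for (char ch : text_) {
                if (ch == ' ') {
                    in_word = false;
                } else if (!in_word) {
                    in_word = true;
                    ++n;
                }
            }
            cached_words_ = n;
            cache_valid_ = true;
        }
        return cached_words_;
    }

    // Written by hand: only the value-carrying member takes part.
    friend bool operator==(const document &a, const document &b)
    {
        return a.text_ == b.text_;
    }

    friend bool operator!=(const document &a, const document &b)
    {
        return !(a == b);
    }
};
```

Two documents with the same text compare equal here even when only one of them has populated its cache, which is the behaviour option (1) argues for.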

Single member types

The feedback on the c++std-ext list included the "single member wrapper struct" case where the author expects every overloaded operator of the wrapper to work consistently with those of the member.

Consider the following user-defined type:

struct wrapper
{
    double val;
    
    default: ==, !=, <, >, <=, >=;   // defines the six non-member functions
};

Such a thing would be built from a double and it makes sense to build the equality and relational operators based on the member's operators. This treatment covers every possible case: total order, strict weak order and even partial order (as the ambiguities in the last case are simply passed through to the caller).

Namely,

bool operator==(const wrapper &a, const wrapper &b) 
{
    return a.val == b.val;
}

bool operator!=(const wrapper &a, const wrapper &b) 
{
    return a.val != b.val;
}

bool operator<(const wrapper &a, const wrapper &b) 
{
    return a.val < b.val;
}

bool operator<=(const wrapper &a, const wrapper &b) 
{
    return a.val <= b.val;
}

bool operator>(const wrapper &a, const wrapper &b) 
{
    return a.val > b.val;
}

bool operator>=(const wrapper &a, const wrapper &b) 
{
    return a.val >= b.val;
}

Multi-member types

The original use case for the proposal revolves around user-defined types that contain many regular members. These types must receive memberwise implementations of operator==() and operator<(), and the other operators may be derived.

Namely, consider the shortest implementation of operator>=():

bool operator>=(const thing &t1, const thing &t2)
{
    return !(t1 < t2);
}

Notes on this option:

Conclusion: the most consistent and straightforward option is to follow the dominating single-member case and generate each explicitly defaulted operator fully.
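The difference between the two strategies becomes observable once a member is only partially ordered. The sketch below assumes IEEE 754 floating point semantics (and no "fast math" compiler options); the function names `ge_memberwise` and `ge_derived` are hypothetical stand-ins for the two ways operator>=() could be generated.

```cpp
#include <cassert>
#include <limits>

struct wrapper
{
    double val;
};

// Generated "fully": forwards to the member's own >= operator.
bool ge_memberwise(const wrapper &a, const wrapper &b)
{
    return a.val >= b.val;
}

// Derived from operator<: the boolean negation of "less than".
bool ge_derived(const wrapper &a, const wrapper &b)
{
    return !(a.val < b.val);
}
```

For ordinary values the two agree. When one operand holds a NaN, every built-in comparison yields false, so `ge_memberwise` returns false while `ge_derived` returns true. Only the fully generated form keeps the wrapper consistent with the wrapped double.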

Efficiency of the lexicographical comparison

Consider the following user-defined type:

struct thing
{
    int a, b, c;
};

Option 1: use the members' strict weak order to implement operator<(). This is consistent with std::pair and std::tuple.

bool operator<(const thing &t1, const thing &t2) 
{
    if (t1.a < t2.a) return true;
    if (t2.a < t1.a) return false;
    if (t1.b < t2.b) return true;
    if (t2.b < t1.b) return false;

    return t1.c < t2.c;
}

Option 2: use the members' total order to implement operator<(). This puts an implicit dependency on operator==().

bool operator<(const thing &t1, const thing &t2) 
{
    if (t1.a != t2.a)
        return t1.a < t2.a;
    if (t1.b != t2.b)
        return t1.b < t2.b;
    
    return t1.c < t2.c;
}
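The two options can be compared side by side. The following sketch generalizes `thing` into a hypothetical `triple<T>` template so that the same code can be exercised with int and double members; `less_option1` and `less_option2` are illustrative names. IEEE 754 semantics are assumed for the double case.

```cpp
#include <cassert>
#include <limits>

// Hypothetical generalization of "thing" over the member type.
template <typename T>
struct triple
{
    T a, b, c;
};

// Option 1: relies on operator< only, up to two calls per member.
template <typename T>
bool less_option1(const triple<T> &t1, const triple<T> &t2)
{
    if (t1.a < t2.a) return true;
    if (t2.a < t1.a) return false;
    if (t1.b < t2.b) return true;
    if (t2.b < t1.b) return false;

    return t1.c < t2.c;
}

// Option 2: one operator!= and at most one operator< per member.
template <typename T>
bool less_option2(const triple<T> &t1, const triple<T> &t2)
{
    if (t1.a != t2.a)
        return t1.a < t2.a;
    if (t1.b != t2.b)
        return t1.b < t2.b;

    return t1.c < t2.c;
}
```

For totally ordered members such as int the two functions always return the same result, and Option 2 may do less work when operator<() is expensive. They can disagree when a member is not totally ordered: with a NaN in `a`, Option 1 treats the two `a` values as equivalent and moves on to `b`, whereas Option 2 sees them as unequal and returns the (false) result of comparing them.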

Domain of the operator functions

There are some built-in types that are not totally ordered or cannot always be compared. Namely, < is only defined for pointers of the same type that refer to memory allocated from a single contiguous region, and IEEE floating point numbers have the NaN value, for which comparisons are defined in a very special way.

Design decisions:

  1. Integral types, enumerated types, pointer types and floating point types are supported
  2. The generated operators are only defined in the domain of the members' normal values

V. Technical specifications

Edit section 8.4.2 "Explicitly-defaulted functions [dcl.fct.def.default]"

  1. A function definition of the form:
    attribute-specifier-seq_opt decl-specifier-seq_opt declarator virt-specifier-seq_opt = default ;
    is called an explicitly-defaulted definition. A function that is explicitly defaulted shall
    — be a special member function, or an explicitly defaultable operator function. See [defaultable]

New section in 8.4

After 8.4.3 add a new section

8.4.4 Explicitly defaultable operator functions [defaultable]

The following friend operator functions are explicitly defaultable:

  1. Non-member equality operators: operator==(), operator!=(), see [class.equality]
  2. Non-member relational operators: operator<(), operator>(), operator<=(), operator>=(), see [class.relational]

Edit section "12 Special member functions [special]"

The default constructor (12.1), copy constructor and copy assignment operator (12.8), move constructor and move assignment operator (12.8) and destructor (12.4) are special member functions. These, together with equality operators (12.10) and relational operators (12.11) may be explicitly defaulted as per [dcl.fct.def.default]

New sections in 12

After 12.9 add a new section

12.10 Equality operators [class.equality]

  1. A class may provide overloaded operator==() and operator!=() as per [over.oper]. A default implementation of these non-member operators may be generated via the = default notation as they may be explicitly defaulted as per [dcl.fct.def.default].
  2. The defaulted operator==() definition is generated if and only if all sub-objects are fundamental types or compound types thereof, that provide operator==().
  3. If a class with a defaulted operator==() has a mutable member, the program is ill-formed.
  4. The defaulted operator==() for class X shall take two arguments of type X by value or by const reference and return bool.
  5. The explicitly defaulted non-member operator==() for a class X shall perform memberwise equality comparison of its subobjects. Namely, a comparison of the subobjects that have the same position in both objects against each other until one subobject is not equal to the other.

    Direct base classes of X are compared first, in the order of their declaration in the base-specifier-list, and then the immediate non-static data members of X are compared, in the order in which they were declared in the class definition.

    Let x and y be the parameters of the defaulted operator function. Each subobject is compared in the manner appropriate to its type:

    • if the subobject is of class type, as if by a call to operator==() with the subobject of x and the corresponding subobject of y as function arguments (as if by explicit qualification; that is, ignoring any possible virtual overriding functions in more derived classes);
    • if the subobject is an array, each element is compared in the manner appropriate to the element type;
    • if the subobject is of a scalar type, the built-in == operator is used.
  6. The explicitly-defaulted non-member operator!=() for a non-union class shall be implemented in a manner described in (5) while calling operator!=() and the built-in != operator where appropriate.

Example:

struct T {
    int a, b, c;
    std::string d;
};

bool operator==(const T &, const T &) = default;

Note, floating point values are regular only in the domain of normal values (outside of the NaN) and so the explicitly-defaulted non-member operators are only defined in that domain too.
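To make the comparison order concrete, here is a hand-written sketch of what the defaulted operator==() is specified to be equivalent to for a class with a base class, a class-type member, an array member and a scalar member. The types `base` and `derived` are hypothetical, and the code reflects my reading of paragraph 5 rather than any actual compiler output.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

struct base
{
    int id;
};

bool operator==(const base &a, const base &b)
{
    return a.id == b.id;
}

struct derived : base
{
    std::string name;   // class type: compared via its operator==
    int scores[3];      // array: compared element by element
    double weight;      // scalar: compared via the built-in ==

    derived(int i, const std::string &n, int s0, int s1, int s2, double w)
        : base{i}, name(n), weight(w)
    {
        scores[0] = s0;
        scores[1] = s1;
        scores[2] = s2;
    }
};

// Hand-written equivalent: direct bases first, then members in declaration
// order. The && chain stops at the first subobject that is not equal.
bool operator==(const derived &x, const derived &y)
{
    return static_cast<const base &>(x) == static_cast<const base &>(y) &&
           x.name == y.name &&
           std::equal(std::begin(x.scores), std::end(x.scores),
                      std::begin(y.scores)) &&
           x.weight == y.weight;
}
```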

After 12.10 add a new section

12.11 Relational operators [class.relational]
  1. A class may provide overloaded relational operators as per [over.oper]. A default implementation of a non-member relational operator may be generated via the = default notation as these may be explicitly defaulted as per [dcl.fct.def.default].
  2. The defaulted operator<() definition is generated if and only if all sub-objects are fundamental types or compound types thereof, that provide operator<().
  3. If a class with a defaulted operator<() has a mutable member, the program is ill-formed.
  4. The defaulted operator<() for class X shall take two arguments of type X by value or by const reference and return bool.
  5. The explicitly-defaulted operator<() for a class X shall perform memberwise lexicographical comparison of its subobjects. Namely, a comparison of the subobjects that have the same position in both objects against each other until one subobject is not equivalent to the other. The result of comparing these first non-matching elements is the result of the function.

    Direct base classes of X are compared first, in the order of their declaration in the base-specifier-list, and then the immediate non-static data members of X are compared, in the order in which they were declared in the class definition.

    Let x and y be the parameters of the defaulted operator function. Each subobject is compared in the manner appropriate to its type:

    • if the subobject is of class type, as if by a call to operator<() with the subobject of x and the corresponding subobject of y as function arguments (as if by explicit qualification; that is, ignoring any possible virtual overriding functions in more derived classes);
    • if the subobject is an array, each element is compared in the manner appropriate to the element type;
    • if the subobject is of a scalar type, the built-in < operator is used.
  6. An explicitly-defaulted non-member operator>() for a non-union class shall be implemented in a manner described in (5) while calling operator>() and the built-in > operator where appropriate.
  7. An explicitly-defaulted non-member operator>=() for a non-union class shall be implemented in a manner described in (5) while calling operator>=() and the built-in >= operator where appropriate.
  8. An explicitly-defaulted non-member operator<=() for a non-union class shall be implemented in a manner described in (5) while calling operator<=() and the built-in <= operator where appropriate.

Example:

struct T {
    int a, b;

    friend bool operator<(T, T) = default;
};

Note, pointer comparisons are only defined for a subset of values and floating point values are totally ordered only in the domain of normal values (outside of the NaN), so the explicitly-defaulted non-member operators are only defined in the domain of members' normal values.
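A hand-written sketch of the lexicographical comparison described in paragraph 5 follows, for a class with a base class, an array member and a scalar member. The types `base` and `derived` are hypothetical, and the sketch uses only operator<() on each subobject (the std::pair style); it is one possible reading of the wording, not a definitive implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

struct base
{
    int id;
};

bool operator<(const base &a, const base &b)
{
    return a.id < b.id;
}

struct derived : base
{
    int scores[2];   // array: compared element by element
    double weight;   // scalar: compared via the built-in <

    derived(int i, int s0, int s1, double w) : base{i}, weight(w)
    {
        scores[0] = s0;
        scores[1] = s1;
    }
};

// Hand-written equivalent: the first subobject that differs decides the result.
bool operator<(const derived &x, const derived &y)
{
    // Direct base classes come first.
    const base &bx = x;
    const base &by = y;
    if (bx < by) return true;
    if (by < bx) return false;

    // Then the members in declaration order; arrays go element by element.
    if (std::lexicographical_compare(std::begin(x.scores), std::end(x.scores),
                                     std::begin(y.scores), std::end(y.scores)))
        return true;
    if (std::lexicographical_compare(std::begin(y.scores), std::end(y.scores),
                                     std::begin(x.scores), std::end(x.scores)))
        return false;

    return x.weight < y.weight;
}
```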

After 12.11 add a new section

12.12 Explicitly defaulted equality and relational operators - short form [class.oper-short]
  1. A class may provide explicitly defaulted equality and relational operators as per [class.equality] and [class.relational] respectively. These non-member operators can also be generated via the short form of the notation:
    default: [the comma-separated list of operators];
  2. The following six short-hand names map to the explicitly defaultable equality and relational operators: ==, !=, <, <=, >, >=.
  3. The implementation must expand each term of the short form into a full declaration subject to [class.equality] and [class.relational], while choosing how to pass the arguments in order to maximize performance.

Example:

struct Thing
{
    int a, b, c;
    std::string d;

    default: ==, !=;   // defines equality/inequality non-member functions
};
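For comparison, this is roughly what the short form in the example would stand for if written by hand in today's C++. It is a sketch under two assumptions: the arguments are passed by const reference (the specification leaves that choice to the implementation), and operator!=() is built from the members' own != operators, which is my reading of [class.equality] paragraph 6.

```cpp
#include <cassert>
#include <string>

struct Thing
{
    int a, b, c;
    std::string d;
};

// Hand-written stand-in for the generated operator==: memberwise, in
// declaration order, stopping at the first member that is not equal.
bool operator==(const Thing &x, const Thing &y)
{
    return x.a == y.a &&
           x.b == y.b &&
           x.c == y.c &&
           x.d == y.d;
}

// Hand-written stand-in for the generated operator!=: uses the members'
// != operators and reports true as soon as one pair of members differs.
bool operator!=(const Thing &x, const Thing &y)
{
    return x.a != y.a ||
           x.b != y.b ||
           x.c != y.c ||
           x.d != y.d;
}
```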

VI. Acknowledgments

I believe the fundamental idea comes from Alex Stepanov and his work[3] on regular types. These types are generalizations of the built-in types, so they need to support copying, assignment, and comparison. The C++ language has natively supported the first two points from the beginning and this proposal attempts to address the last one.

I want to thank Andrew Sutton for the early feedback and guidance, as well as Bjarne Stroustrup for loudly asking for consensus on small, fundamental language cleanups that strive to make users' lives easier.

Editorial credits go to Daniel Krügler, Ville Voutilainen, Jens Maurer and Lawrence Crowl - thank you for helping with the technical specification!

Finally, many folks on the c++std-ext list have provided valuable advice and guidance. Thank you for the lively discussion and your help with steering the design!

VII. References

  1. The Regular concept is defined by Stepanov in the following way:
  2. The Totally Ordered concept extends Regular with the following:
  3. Dehnert, James C. and Alexander A. Stepanov. "Fundamentals of generic programming." In Generic Programming, International Seminar on Generic Programming, Dagstuhl Castle, Germany, April/May 1998. Selected Papers, eds. Mehdi Jazayeri, Rüdiger G. K. Loos, and David R. Musser, vol. 1766 of Lecture Notes in Computer Science, pages 1-11. Springer, 2000.
    See http://www.stepanovpapers.com/DeSt98.pdf

See "Elements of programming" by Alexander Stepanov and Paul McJones for a full treatment of Regular and Totally Ordered concepts.