Miscellaneous Technical Issues

What is a function object?

An object that in some way behaves like a function, of course. Typically, that would mean an object of a class that defines the application operator – operator().

A function object is a more general concept than a function because a function object can have state that persists across several calls (like a static local variable) and can be initialized and examined from outside the object (unlike a static local variable). For example:

    class Sum {
        int val;
    public:
        Sum(int i) :val(i) { }
        operator int() const { return val; }        // extract value

        int operator()(int i) { return val+=i; }    // application
    };

    void f(vector<int> v)
    {
        Sum s = 0;  // initial value 0
        s = for_each(v.begin(), v.end(), s);    // gather the sum of all elements
        cout << "the sum is " << s << "\n";

        // or even:
        cout << "the sum is " << for_each(v.begin(), v.end(), Sum(0)) << "\n";
    }

Note that a function object with an inline application operator inlines beautifully because there are no pointers involved that might confuse optimizers. To contrast: current optimizers are rarely (never?) able to inline a call through a pointer to function.

Function objects are extensively used to provide flexibility in the standard library.
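
As a small illustration of that flexibility, here is a minimal sketch of passing a function object to std::sort as its ordering rule. The names ByMagnitude and sortedByMagnitude are invented for this example and are not part of the standard library.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical comparison object: orders ints by absolute value.
struct ByMagnitude {
    bool operator()(int a, int b) const { return std::abs(a) < std::abs(b); }
};

// Returns a copy of v sorted by magnitude; std::sort calls ByMagnitude::operator()
// directly, so the comparison can be inlined.
inline std::vector<int> sortedByMagnitude(std::vector<int> v)
{
    std::sort(v.begin(), v.end(), ByMagnitude());
    return v;
}
```

Called with the elements 3, -1, -2, the function returns them in the order -1, -2, 3.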

How do I convert a value (a number, for example) to a std::string?

Call to_string.
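
A minimal sketch of that call, assuming a C++11 or later compiler (std::to_string is declared in <string>). The helper name describeCount is made up for illustration. For floating-point values std::to_string has traditionally used a fixed printf-style format that you cannot adjust, which is one reason to read on.

```cpp
#include <string>

// Hypothetical helper: builds a message from an int using std::to_string.
inline std::string describeCount(int n)
{
  return "count = " + std::to_string(n);
}
```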

For advanced and corner-case uses that aren’t covered by that answer, read on…

There are two easy ways to do this: you can use the <cstdio> facilities or the <iostream> library. In general, you should prefer the <iostream> library.

The <iostream> library allows you to convert pretty much anything to a std::string using the following syntax (the example converts a double, but you could substitute pretty much anything that prints using the << operator):

// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>

class BadConversion : public std::runtime_error {
public:
  BadConversion(const std::string& s)
    : std::runtime_error(s)
    { }
};

inline std::string stringify(double x)
{
  std::ostringstream o;
  if (!(o << x))
    throw BadConversion("stringify(double)");
  return o.str();
}

The std::ostringstream object o offers formatting facilities just like those for std::cout. You can use manipulators and format flags to control the formatting of the result, just as you can for std::cout.

In this example, we insert x into o via the overloaded insertion operator, <<. This invokes the iostream formatting facilities to convert x into a std::string. The if test makes sure the conversion works correctly — it should always succeed for built-in/intrinsic types, but the if test is good style.

The expression o.str() returns the std::string that contains whatever has been inserted into stream o, in this case the string value of x.

Here’s how to use the stringify() function:

#include "convert.h"

void myCode()
{
  double x = /*...*/ ;
  // ...
  std::string s = "the value is " + stringify(x);
  // ...
}
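
The manipulators mentioned earlier can be sketched as follows. This variant is not part of convert.h as shown; the name stringifyFixed and its digits parameter are assumptions made for the example, and it throws std::runtime_error directly to stay self-contained.

```cpp
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical variant of stringify(): fixed notation with a chosen
// number of digits after the decimal point.
inline std::string stringifyFixed(double x, int digits)
{
  std::ostringstream o;
  if (!(o << std::fixed << std::setprecision(digits) << x))
    throw std::runtime_error("stringifyFixed(double)");
  return o.str();
}
```

For example, stringifyFixed(3.14159, 2) yields "3.14".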

How do I convert a std::string to a number?

Call stoi.
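
A minimal sketch, assuming C++11 or later (std::stoi is declared in <string>). std::stoi reports failure by throwing std::invalid_argument or std::out_of_range, and it stops quietly at the first character it cannot use. The wrapper parseWholeInt below is a made-up name that uses the optional position argument to reject such leftovers.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse the whole string as an int, rejecting leftovers.
inline int parseWholeInt(const std::string& s)
{
  std::size_t used = 0;
  int value = std::stoi(s, &used);   // may throw std::invalid_argument or std::out_of_range
  if (used != s.size())
    throw std::invalid_argument("parseWholeInt(\"" + s + "\"): trailing characters");
  return value;
}
```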

For advanced and corner-case uses that aren’t covered by that answer, read on…

There are two easy ways to do this: you can use the <cstdio> facilities or the <iostream> library. In general, you should prefer the <iostream> library.

The <iostream> library allows you to convert a std::string to pretty much anything using the following syntax (the example converts a double, but you could substitute pretty much anything that can be read using the >> operator):

// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>

class BadConversion : public std::runtime_error {
public:
  BadConversion(const std::string& s)
    : std::runtime_error(s)
    { }
};

inline double convertToDouble(const std::string& s)
{
  std::istringstream i(s);
  double x;
  if (!(i >> x))
    throw BadConversion("convertToDouble(\"" + s + "\")");
  return x;
}

The std::istringstream object i offers formatting facilities just like those for std::cin. You can use manipulators and format flags to control how the input is parsed, just as you can for std::cin.

In this example, we initialize the std::istringstream i passing the std::string s (for example, s might be the string "123.456"), then we extract from i into x via the overloaded extraction operator, >>. This invokes the iostream formatting facilities to convert as much of the string as possible/appropriate based on the type of x.

The if test makes sure the conversion works correctly. For example, if the string contains characters that are inappropriate for the type of x, the if test will fail.

Here’s how to use the convertToDouble() function:

#include "convert.h"

void myCode()
{
  std::string s = /*...a string representation of a number...*/ ;
  // ...
  double x = convertToDouble(s);
  // ...
}

You probably want to enhance convertToDouble() so it optionally checks that there aren’t any left-over characters:

inline double convertToDouble(const std::string& s,
                              bool failIfLeftoverChars = true)
{
  std::istringstream i(s);
  double x;
  char c;
  if (!(i >> x) || (failIfLeftoverChars && i.get(c)))
    throw BadConversion("convertToDouble(\"" + s + "\")");
  return x;
}
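
Note that the left-over check above treats trailing whitespace as left-over characters, so a string such as "12.5 " is rejected. If that is too strict for your input, one possible variation is to skip whitespace before checking for the end of the stream. This is only a sketch: the name convertToDoubleTrimmed is invented here, and it throws std::runtime_error directly so the snippet stands alone.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical variant: accepts trailing whitespace, rejects anything else.
inline double convertToDoubleTrimmed(const std::string& s)
{
  std::istringstream i(s);
  double x;
  if (!(i >> x))
    throw std::runtime_error("convertToDoubleTrimmed(\"" + s + "\")");
  i >> std::ws;      // consume any trailing whitespace
  if (!i.eof())      // something other than whitespace is left over
    throw std::runtime_error("convertToDoubleTrimmed(\"" + s + "\")");
  return x;
}
```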

Can I templatize the above functions so they work with other types?

Yes — for any types that support iostream-style input/output.

For example, suppose you want to convert an object of class Foo to a std::string, or perhaps the reverse: from a std::string to a Foo. You could write a whole family of conversion functions based on the ones shown in the previous FAQs, or you could write a template function so the compiler does the grunt work.

For example, to convert an arbitrary type T to a std::string, provided T supports syntax like std::cout << x, you can use this:

// File: convert.h
#include <iostream>
#include <sstream>
#include <string>
#include <typeinfo>
#include <stdexcept>

class BadConversion : public std::runtime_error {
public:
  BadConversion(const std::string& s)
    : std::runtime_error(s)
    { }
};

template<typename T>
inline std::string stringify(const T& x)
{
  std::ostringstream o;
  if (!(o << x))
    throw BadConversion(std::string("stringify(")
                        + typeid(x).name() + ")");
  return o.str();
}

Here’s how to use the stringify() function:

#include "convert.h"

void myCode()
{
  Foo x;
  // ...
  std::string s = "this is a Foo: " + stringify(x);
  // ...
}

You can also convert from any type that supports iostream input by adding this to file convert.h:

template<typename T>
inline void convert(const std::string& s, T& x,
                    bool failIfLeftoverChars = true)
{
  std::istringstream i(s);
  char c;
  if (!(i >> x) || (failIfLeftoverChars && i.get(c)))
    throw BadConversion(s);
}

Here’s how to use the convert() function:

#include "convert.h"

void myCode()
{
  std::string s = /*...a string representation of a Foo...*/ ;
  // ...
  Foo x;
  convert(s, x);
  // ...
  // ...code that uses x...
}

To simplify your code, particularly for light-weight easy-to-copy types, you probably want to add a return-by-value conversion function to file convert.h:

template<typename T>
inline T convertTo(const std::string& s,
                   bool failIfLeftoverChars = true)
{
  T x;
  convert(s, x, failIfLeftoverChars);
  return x;
}

This simplifies your “usage” code some. You call it by explicitly specifying the template parameter T:

#include "convert.h"

void myCode()
{
  std::string a = /*...string representation of an int...*/ ;
  std::string b = /*...string representation of an int...*/ ;
  // ...
  if (convertTo<int>(a) < convertTo<int>(b))
    /*...*/ ;
}

Why do my compiles take so long?

You may have a problem with your compiler. It may be old, you may have it installed wrongly, or your computer might be an antique. I can’t help you with such problems.

However, it is more likely that the program that you are trying to compile is poorly designed, so that compiling it involves the compiler examining hundreds of header files and tens of thousands of lines of code. In principle, this can be avoided. If this problem is in your library vendor’s design, there isn’t much you can do (except changing to a better library/vendor), but you can structure your own code to minimize re-compilation after changes. Designs that do that are typically better, more maintainable, designs because they exhibit better separation of concerns.

Consider a classical example of an object-oriented program:

    class Shape {
    public:     // interface to users of Shapes
        virtual void draw() const;
        virtual void rotate(int degrees);
        // ...
    protected:  // common data (for implementers of Shapes)
        Point center;
        Color col;
        // ...
    };

    class Circle : public Shape {
    public: 
        void draw() const;
        void rotate(int) { }
        // ...
    protected:
        int radius;
        // ...
    };

    class Triangle : public Shape {
    public: 
        void draw() const;
        void rotate(int);
        // ...
    protected:
        Point a, b, c;
        // ...
    };  

The idea is that users manipulate shapes through Shape’s public interface, and that implementers of derived classes (such as Circle and Triangle) share aspects of the implementation represented by the protected members.

There are three serious problems with this apparently simple idea:

  • It is not easy to define shared aspects of the implementation that are helpful to all derived classes. For that reason, the set of protected members is likely to need changes far more often than the public interface. For example, even though “center” is arguably a valid concept for all Shapes, it is a nuisance to have to maintain a point “center” for a Triangle – for triangles, it makes more sense to calculate the center if and only if someone expresses interest in it.
  • The protected members are likely to depend on “implementation” details that the users of Shapes would rather not have to depend on. For example, much (most?) code using a Shape will be logically independent of the definition of a “color”, yet the presence of Color in the definition of Shape will probably require compilation of header files defining the operating system’s notion of color.
  • When something in the protected part changes, users of Shape have to recompile – even though only implementers of derived classes have access to the protected members.

Thus, the presence of “information helpful to implementers” in the base class that also acts as the interface to users is the source of instability in the implementation, spurious recompilation of user code (when implementation information changes), and excess inclusion of header files into user code (because the “information helpful to implementers” needs those headers). This is sometimes known as the “brittle base class problem.”

The obvious solution is to omit the “information helpful to implementers” for classes that are used as interfaces to users. That is, to make interfaces pure interfaces. That is, to represent interfaces as abstract classes:

    class Shape {
    public:     // interface to users of Shapes
        virtual void draw() const = 0;
        virtual void rotate(int degrees) = 0;
        virtual Point center() const = 0;
        // ...

        // no data
    };

    class Circle : public Shape {
    public: 
        void draw() const;
        void rotate(int) { }
        Point center() const { return cent; }
        // ...
    protected:
        Point cent;
        Color col;
        int radius;
        // ...
    };

    class Triangle : public Shape {
    public: 
        void draw() const;
        void rotate(int);
        Point center() const;
        // ...
    protected:
        Color col;
        Point a, b, c;
        // ...
    };  

The users are now insulated from changes to implementations of derived classes. I have seen this technique decrease build times by orders of magnitude.

But what if there really is some information that is common to all derived classes (or simply to several derived classes)? Simply make that information a class and derive the implementation classes from that also:

    class Shape {
    public:     // interface to users of Shapes
        virtual void draw() const = 0;
        virtual void rotate(int degrees) = 0;
        virtual Point center() const = 0;
        // ...

        // no data
    };

    struct Common {
        Color col;
        // ...
    };

    class Circle : public Shape, protected Common {
    public: 
        void draw() const;
        void rotate(int) { }
        Point center() const { return cent; }
        // ...
    protected:
        Point cent;
        int radius;
    };

    class Triangle : public Shape, protected Common {
    public: 
        void draw() const;
        void rotate(int);
        Point center() const;
        // ...
    protected:
        Point a, b, c;
    };  

What should be done with macros that contain if?

Ideally you’ll get rid of the macro. Macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4, regardless of whether they contain an if (but they’re especially evil if they contain an if).

Nonetheless, even though macros are evil, sometimes they are the lesser of the other evils. When that happens, read this FAQ so you know how to make them “less bad,” then hold your nose and do what’s practical.

Here’s a naive solution:

// Bad
#define MYMACRO(a,b) \
    if (xyzzy) asdf()

This will cause big problems if someone uses that macro in an if statement:

if (whatever)
    MYMACRO(foo,bar);
else
    baz;

The problem is that the else baz nests with the wrong if: the compiler sees this:

if (whatever)
    if (xyzzy) asdf();
    else baz;

Obviously that’s a bug.

The easy solution is to require {...} everywhere, but there’s another solution that I prefer even if there’s a coding standard that requires {...} everywhere (just in case someone somewhere forgets): add a balancing else to the macro definition:

// Good
#define MYMACRO(a,b) \
    if (xyzzy) asdf(); \
    else (void)0

(The (void)0 causes the compiler to generate an error message if you forget to put the ; after the ‘call’.)

Your usage of that macro might look like this:

if (whatever)
    MYMACRO(foo,bar);
                    ↑ // This ; closes off the else (void)0 part
else
    baz;

which will get expanded into a balanced set of ifs and elses:

if (whatever)
    if (xyzzy)
        asdf();
    else
        (void)0;
        ↑↑↑↑↑↑↑↑ // A do-nothing statement
else
    baz;

Like I said, I personally do the above even when the coding standard calls for {...} in all the ifs. Call me paranoid, but I sleep better at night and my code has fewer bugs.

There is another approach that old-line C programmers will remember:

// Okay
#define MYMACRO(a,b) \
    do { \
      if (xyzzy) asdf(); \
    } while (false)

Some people prefer the do {...} while (false) approach, though if you choose to use that, be aware that it might cause your compiler to generate less efficient code. Both approaches cause the compiler to give you an error message if you forget the ; after MYMACRO(foo,bar).
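
To see the balanced-else form at work, here is a small self-contained sketch. The macro COUNT_IF and the function countHits are invented for this illustration; they stand in for MYMACRO and the surrounding if/else.

```cpp
// Hypothetical macro in the "balanced else" style: bump a counter when cond holds.
#define COUNT_IF(cond, counter) \
    if (cond) ++(counter); \
    else (void)0

// Returns the number of hits, or -1 when the outer condition is false.
inline int countHits(bool outer, bool inner)
{
    int hits = 0;
    if (outer)
        COUNT_IF(inner, hits);
    else
        hits = -1;   // pairs with "if (outer)", as intended
    return hits;
}
```

With this definition countHits(true, false) returns 0 and countHits(false, true) returns -1. Had the macro lacked its own else, the trailing else would have attached to the macro's inner if and those two results would be swapped.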

What should be done with macros that have multiple lines?

Answer: Choke, gag, cough. Macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4. Kill them all!! (Just kidding.)

Seriously, sometimes you need to use them anyway, and when you do, read this to learn some safe ways to write a macro that has multiple statements.

Here’s a naive solution:

// Bad
#define MYMACRO(a,b) \
    statement1; \
    statement2; \
    /*...*/ \
    statementN;

This can cause problems if someone uses the macro in a context that demands a single statement. E.g.,

while (whatever)
    MYMACRO(foo, bar);

The naive solution is to wrap the statements inside {...}, such as this:

// Bad
#define MYMACRO(a,b) \
    { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    }

But this will cause compile-time errors with things like the following:

if (whatever)
    MYMACRO(foo, bar);
else
    baz;

…since the compiler will see:

if (whatever)
{
    statement1;
    statement2;
    // ...
    statementN;
} ; else
↑↑↑↑↑↑↑↑ // Compile-time error!
    baz;

One solution is to use a do { <statements go here> } while (false) pseudo-loop. This executes the body of the “loop” exactly once. The macro might look like this:

// Okay
#define MYMACRO(a, b) \
    do { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    } while (false)
                  ↑ // Intentionally not adding a ; here!

The ; gets added by the macro’s user, such as:

if (whatever)
    MYMACRO(foo, bar);
                     ↑ // The user of MYMACRO() adds the ; here
else
    baz;

After expansion, the compiler will see this:

if (whatever)
    do {
        statement1;
        statement2;
        // ...
        statementN;
    } while (false);
                   ↑ // From user's code, not from MYMACRO() itself
else
    baz;

There is an unlikely but possible downside to the above approach: historically some C++ compilers have refused to inline-expand any function containing a loop. If your C++ compiler has that limitation, it will not inline-expand any function that uses MYMACRO(). Chances are this won’t be a problem, either because you don’t use MYMACRO() in any inline functions, or because your compiler (subject to all its other constraints) is willing to inline-expand functions containing loops (provided the inline function meets all your compiler’s other requirements). However, if you are concerned, do some tests with your compiler: examine the resulting assembly code and/or perform a few simple timing tests.
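
A concrete, compilable instance of the do { ... } while (false) form might look like the following sketch. SWAP_INTS and sortPair are hypothetical names chosen for the example (in real code std::swap would be the better tool).

```cpp
// Hypothetical multi-statement macro wrapped in a do/while(false) pseudo-loop.
#define SWAP_INTS(a, b) \
    do { \
        int swapTmp = (a); \
        (a) = (b); \
        (b) = swapTmp; \
    } while (false)

// Puts the smaller value in lo; returns true if the values were swapped.
inline bool sortPair(int& lo, int& hi)
{
    bool swapped = true;
    if (lo > hi)
        SWAP_INTS(lo, hi);   // the user supplies the ; here
    else
        swapped = false;
    return swapped;
}
```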

If you have any problems with your compiler’s willingness to inline-expand functions containing loops, you can change MYMACRO()’s definition to use if (true) {...} else (void)0 instead:

#define MYMACRO(a, b) \
    if (true) { \
        statement1; \
        statement2; \
        /*...*/ \
        statementN; \
    } else \
        (void)0
               ↑ // Intentionally not adding a ; here!

After expansion, the compiler will see a balanced set of ifs and elses:

if (whatever)
    if (true) {
        statement1;
        statement2;
        // ...
        statementN;
    } else
        (void)0;
        ↑↑↑↑↑↑↑ // A do-nothing statement
else
    baz;

The (void)0 in the macro definition forces users to remember the ; after any usage of the macro. If you forgot the ; like this…

foo();
MYMACRO(a, b)
             ↑ // Whoops, forgot the ; here
bar();
baz();

…then after expansion the compiler would see this:

foo();
if (true) {
    statement1;
    statement2;
    // ...
    statementN;
} else
    (void)0 bar();
            ↑↑↑↑↑ // Fortunately(!) this will produce a compile-time error-message
baz();

Even though the specific error message is likely to be confusing, it will at least cause the programmer to notice that something is wrong. That’s a lot better than the alternative: without the (void)0 in the MYMACRO() definition, the compiler would silently generate the wrong code: bar() would never be called, since it would erroneously be on the unreachable else branch of the if.

What should be done with macros that need to paste two tokens together?

Groan. I really hate macros. Yes they’re useful sometimes, and yes I use them. But I always wash my hands afterwards. Twice. Macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4.

Here we go again, desperately trying to make an inherently evil thing a little less evil.

First, the basic approach is to use the ISO/ANSI C and ISO/ANSI C++ “token pasting” feature: ##. On the surface this would look like the following:

Suppose you have a macro called “MYMACRO”, and suppose you’re passing a token as the parameter of that macro, and suppose you want to concatenate that token with the token “Tmp” to create a variable name. For example, the use of MYMACRO(Foo) would create a variable named FooTmp and the use of MYMACRO(Bar) would create a variable named BarTmp. In this case the naive approach would be to say this:

#define MYMACRO(a) \
    /*...*/ a ## Tmp /*...*/

However you need a double layer of indirection when you use ##. Basically you need to create a special macro for “token pasting” such as:

#define NAME2(a,b)         NAME2_HIDDEN(a,b)
#define NAME2_HIDDEN(a,b)  a ## b

Trust me on this — you really need to do this! (And please nobody write me saying it sometimes works without the second layer of indirection. Try concatenating a symbol with __LINE__ and see what happens then.)

Then replace your use of a ## Tmp with NAME2(a,Tmp):

#define MYMACRO(a) \
    /*...*/ NAME2(a,Tmp) /*...*/

And if you have a three-way concatenation to do (e.g., to paste three tokens together), you’d create a NAME3() macro like this:

#define NAME3(a,b,c)         NAME3_HIDDEN(a,b,c)
#define NAME3_HIDDEN(a,b,c)  a ## b ## c
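
The difference the extra layer makes can be observed with a small sketch. PASTE_DIRECT, SHOW, SHOW_HIDDEN and VERSION are hypothetical names introduced only for this demonstration; VERSION plays the role of a macro argument such as __LINE__ but gives the same result on every line.

```cpp
#include <string>

#define NAME2(a,b)         NAME2_HIDDEN(a,b)
#define NAME2_HIDDEN(a,b)  a ## b

// Hypothetical helpers for this sketch only.
#define PASTE_DIRECT(a,b)  a ## b            // single layer: arguments are not expanded first
#define SHOW(x)            SHOW_HIDDEN(x)    // turns the resulting token into a string literal
#define SHOW_HIDDEN(x)     #x
#define VERSION            7                 // stand-in for a macro such as __LINE__

// Yields "itemVERSION": the macro name itself was pasted.
inline std::string pastedDirectly()        { return SHOW(PASTE_DIRECT(item, VERSION)); }

// Yields "item7": VERSION was expanded before the paste.
inline std::string pastedWithIndirection() { return SHOW(NAME2(item, VERSION)); }
```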

Why can’t the compiler find my header file in #include "c:\test.h" ?

Because "\t" is a tab character.

You should use forward slashes ("/") rather than backslashes ("\") in your #include filenames, even on operating systems that use backslashes such as DOS, Windows, OS/2, etc. For example:

#if 1
  #include "/version/next/alpha/beta/test.h"    // RIGHT!
#else
  #include "\version\next\alpha\beta\test.h"    // WRONG!
#endif

Note that you should use forward slashes ("/") on all your filenames, not just on your #include files.

Note that your particular compiler might not treat a backslash within a header-name the same as it treats a backslash within a string literal. For instance, your particular compiler might treat #include "foo\bar\baz" as if the '\' chars were quoted. This is because header names and string literals are different: your compiler will always parse backslashes in string literals in the usual way, with '\t' becoming a tab character, etc., but it might not parse header names using those same rules. In any case, you still shouldn’t use backslashes in your header names since there’s something to lose but nothing to gain.

What are the C++ scoping rules for for loops?

Loop variables declared in the for statement proper are local to the loop body.

The following code used to be legal, but not any more, since i’s scope is now inside the for loop only:

for (int i = 0; i < 10; ++i) {
  // ...
  if ( /* something weird */ )
    break;
  // ...
}

if (i != 10) {
  // We exited the loop early; handle this situation separately
  // ...
}

If you’re working with some old code that uses a for loop variable after the for loop, the compiler will (hopefully!) give you a warning or an error message such as “Variable i is not in scope”.

Unfortunately there are cases when old code will compile cleanly, but will do something different — the wrong thing. For example, if the old code has a global variable i, the above code if (i != 10) silently changes in meaning from the for loop variable i under the old rule to the global variable i under the current rule. This is not good. If you’re concerned, you should check with your compiler to see if it has some option that forces it to use the old rules with your old code.

Note: You should avoid having the same variable name in nested scopes, such as a global i and a local i. In fact, you should avoid globals altogether whenever you can. If you abided by these coding standards in your old code, you won’t be hurt by a lot of things, including the scoping rules for for loop variables.

Note: If your new code might get compiled with an old compiler, you might want to put {...} around the for loop to force even old compilers to scope the loop variable to the loop. And please try to avoid the temptation to use macros for this. Remember: macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4.
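
When code genuinely needs the loop variable after the loop, as the early-exit check earlier in this answer does, one straightforward fix is to declare the variable before the for statement. A minimal sketch, with firstNegative as a made-up example function:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first negative element, or v.size() if none.
inline std::size_t firstNegative(const std::vector<int>& v)
{
  std::size_t i = 0;              // declared before the loop, so it is still in scope afterwards
  for (; i < v.size(); ++i) {
    if (v[i] < 0)
      break;                      // exited the loop early
  }
  return i;                       // i == v.size() means the loop ran to completion
}
```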

Why can’t I overload a function by its return type?

If you declare both char f() and float f(), the compiler gives you an error message, since calling simply f() would be ambiguous.

What is “persistence”? What is a “persistent object”?

A persistent object can live after the program which created it has stopped. Persistent objects can even outlive different versions of the creating program, can outlive the disk system, the operating system, or even the hardware on which the OS was running when they were created.

The challenge with persistent objects is to effectively store their member function code out on secondary storage along with their data bits (and the data bits and member function code of all member objects, and of all their member objects and base classes, etc). This is non-trivial when you have to do it yourself. In C++, you have to do it yourself. C++/OO databases can help hide the mechanism for all this.

How can I create two classes that both know about each other?

Use a forward declaration.

Sometimes you must create two classes that use each other. This is called a circular dependency. For example:

class Fred {
public:
  Barney* foo();  // Error: Unknown symbol 'Barney'
};

class Barney {
public:
  Fred* bar();
};

The Fred class has a member function that returns a Barney*, and the Barney class has a member function that returns a Fred*. You may inform the compiler about the existence of a class or structure by using a “forward declaration”:

class Barney;

This line must appear before the declaration of class Fred. It simply informs the compiler that the name Barney is a class, and further it is a promise to the compiler that you will eventually supply a complete definition of that class.
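
Put together, a complete version of the example might look like this sketch. The constructors and pointer members are additions made here so the classes can actually be used; they are not part of the original declarations.

```cpp
class Barney;  // forward declaration: Barney is a class, defined later

class Fred {
public:
  explicit Fred(Barney* b = nullptr) : barney_(b) { }
  Barney* foo() { return barney_; }   // okay: only a pointer to Barney is used
private:
  Barney* barney_;
};

class Barney {
public:
  explicit Barney(Fred* f = nullptr) : fred_(f) { }
  Fred* bar() { return fred_; }       // Fred is fully defined at this point
private:
  Fred* fred_;
};
```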

What special considerations are needed when forward declarations are used with member objects?

The order of class declarations is critical.

The compiler will give you a compile-time error if the first class contains an object (as opposed to a pointer to an object) of the second class. For example,

class Fred;  // Okay: forward declaration

class Barney {
  Fred x;  // Error: The declaration of Fred is incomplete
};

class Fred {
  Barney* y;
};

One way to solve this problem is to reverse the order of the classes so the “used” class is defined before the class that uses it:

class Barney;  // Okay: forward declaration

class Fred {
  Barney* y;  // Okay: the first can point to an object of the second
};

class Barney {
  Fred x;  // Okay: the second can have an object of the first
};

Note that it is never legal for each class to fully contain an object of the other class since that would imply infinitely large objects. In other words, if an instance of Fred contains a Barney (as opposed to a Barney*), and a Barney contains a Fred (as opposed to a Fred*), the compiler will give you an error.

What special considerations are needed when forward declarations are used with inline functions?

The order of class declarations is critical.

The compiler will give you a compile-time error if the first class contains an inline function that invokes a member function of the second class. For example,

class Fred;  // Okay: forward declaration

class Barney {
public:
  void method()
  {
    x->yabbaDabbaDo();  // Error: Fred used before it was defined
  }
private:
  Fred* x;  // Okay: the first can point to an object of the second
};

class Fred {
public:
  void yabbaDabbaDo();
private:
  Barney* y;
};

There are a number of ways to work around this problem. One workaround would be to define Barney::method() with the keyword inline below the definition of class Fred (though still within the header file). Another would be to define Barney::method() without the keyword inline in file Barney.cpp. A third would be to use nested classes. A fourth would be to reverse the order of the classes so the “used” class is defined before the class that uses it:

class Barney;  // Okay: forward declaration

class Fred {
public:
  void yabbaDabbaDo();
private:
  Barney* y;  // Okay: the first can point to an object of the second
};

class Barney {
public:
  void method()
  {
    x->yabbaDabbaDo();  // Okay: Fred is fully defined at this point
  }
private:
  Fred* x;
};

Just remember this: Whenever you use a forward declaration, you can use only that symbol; you may not do anything that requires knowledge of the forward-declared class. Specifically you may not access any members of the second class.
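
The first workaround listed above, defining Barney::method() with the keyword inline below the definition of class Fred, could be sketched like this. The constructor and the call counter are assumptions added so the effect is observable, and Fred's Barney* member is omitted for brevity.

```cpp
class Fred;  // forward declaration

class Barney {
public:
  explicit Barney(Fred* f) : x(f) { }
  void method();   // declared here, defined after Fred is complete
private:
  Fred* x;
};

class Fred {
public:
  void yabbaDabbaDo() { ++calls_; }
  int calls() const { return calls_; }   // hypothetical accessor for this sketch
private:
  int calls_ = 0;
};

// Still in the header, but below Fred's definition, so Fred's members are known.
inline void Barney::method()
{
  x->yabbaDabbaDo();
}
```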

Why can’t I put a forward-declared class in a std::vector<>?

Because the std::vector<> template needs to know the sizeof() of its contained elements, plus the std::vector<> probably accesses members of the contained elements (such as the copy constructor, the destructor, etc.). For example,

class Fred;  // Okay: forward declaration

class Barney {
  std::vector<Fred> x;  // Error: the declaration of Fred is incomplete
};

class Fred {
  Barney* y;
};

One solution to this problem is to change Barney so it uses a std::vector<> of Fred pointers (raw pointers or smart pointers such as unique_ptr or shared_ptr) rather than a std::vector<> of Fred objects:

class Fred;  // Okay: forward declaration

class Barney {
  std::vector<std::unique_ptr<Fred>> x;  // Okay: Barney can use Fred pointers
};

class Fred {
  Barney* y;
};

Another solution to this problem is to reverse the order of the classes so Fred is defined before Barney:

class Barney;  // Okay: forward declaration

class Fred {
  Barney* y;  // Okay: the first can point to an object of the second
};

class Barney {
  std::vector<Fred> x;  // Okay: Fred is fully defined at this point
};

Just remember this: Whenever you use a class as a template argument in a way that needs the class’s size or members (as std::vector<Fred> does above), the definition of that class must be complete and not simply forward declared.

Why do some people think x = ++y + y++ is bad?

Because it’s undefined behavior, which means the runtime system is allowed to do weird or even bizarre things.

The C++ language says you cannot modify a variable more than once between sequence points. Quoth the standard (section 5, paragraph 4):

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

What’s the value of i++ + i++?

It’s undefined. Basically, in C and C++, if you read a variable twice in an expression where you also write it, the result is undefined. Don’t do that. Another example is:

    v[i] = i++;

Related example:

    f(v[i],i++);

Here, the result is undefined because the order of evaluation of function arguments is undefined.

Having the order of evaluation undefined is claimed to yield better performing code. Compilers could warn about such examples, which are typically subtle bugs (or potential subtle bugs). It’s disappointing that after decades, most compilers still don’t warn, leaving that job to specialized, separate, and underused tools.
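
Spelling out the intended order in separate statements avoids the question entirely. The sketch below shows two possible readings of v[i] = i++ written unambiguously; both function names are invented for the example.

```cpp
#include <cstddef>
#include <vector>

// Reading 1: store the old index in the old slot, then advance.
inline void storeOldIndexThenAdvance(std::vector<int>& v, std::size_t& i)
{
    v[i] = static_cast<int>(i);
    ++i;
}

// Reading 2: advance first, then store the old index in the new slot.
inline void advanceThenStoreOldIndex(std::vector<int>& v, std::size_t& i)
{
    const std::size_t old = i;
    ++i;
    v[i] = static_cast<int>(old);
}
```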

What’s the deal with “sequence points”?

Note: The C++11 standard has expressed the same rules as below in a different way. It no longer refers to “sequence points,” but the effects should be the same as described below.

The C++98 standard said (1.9p7):

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

For example, if an expression contains the subexpression y++, then the variable y will be incremented by the next sequence point. Furthermore if the expression just after the sequence point contains the subexpression ++z, then z will not have yet been incremented at the moment the sequence point is reached.

The “certain specified points” that are called sequence points are (section and paragraph numbers are from the standard):

  • the semicolon (1.9p16)
  • the non-overloaded comma-operator (1.9p18)
  • the non-overloaded || operator (1.9p18)
  • the non-overloaded && operator (1.9p18)
  • the ternary ?: operator (1.9p18)
  • after evaluation of all a function’s parameters but before the first expression within the function is executed (1.9p17)
  • after a function’s returned object has been copied back to the caller, but before the code just after the call has been evaluated (1.9p17)
  • after the initialization of each base and member (12.6.2p3)
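
As a brief illustration of one item in this list, the built-in && operator finishes evaluating its left operand, side effects included, before the right operand is touched, and skips the right operand when the left one is false. The two functions below are hypothetical examples that rely on exactly that.

```cpp
// The null check is complete before p is dereferenced;
// *p is never evaluated when p is null.
inline bool isPositive(const int* p)
{
  return p != nullptr && *p > 0;
}

// The increment of i is complete before the right operand reads i.
inline bool nextIsEven(int& i, int limit)
{
  return i++ < limit && i % 2 == 0;
}
```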