Five Popular Myths about C++, Part 2

myth.png[For your winter reading pleasure, we're pleased to present this three-part series of new material by Bjarne Stroustrup. Part one is here, and part three will be posted next Monday which will complete the series just in time for the holidays. Enjoy. -- Ed.]

 

Five Popular Myths about C++ (Part 2)

by Bjarne Stroustrup

Morgan Stanley, Columbia University, Texas A&M University

 

1. Introduction (repeated from Part 1)

In this three-part series, I will explore, and debunk, five popular myths about C++:

  1. "To understand C++, you must first learn C"
  2. "C++ is an Object-Oriented Language"
  3. "For reliable software, you need Garbage Collection"
  4. "For efficiency, you must write low-level code"
  5. "C++ is for large, complicated, programs only"

If you believe in any of these myths, or have colleagues who perpetuate them, this short article is for you. Several of these myths have been true for someone, for some task, at some time. However, with today’s C++, using widely available up-to date ISO C++ 2011 compilers, and tools, they are mere myths.

I deem these myths “popular” because I hear them often. Occasionally, they are supported by reasons, but more often they are simply stated as obvious, needing no support. Sometimes, they are used to dismiss C++ from consideration for some use.

Each myth requires a long paper or even a book to completely debunk, but my aim here is simply to raise the issues and to briefly state my reasons.

The first two myths were addressed in my first installment.

4. Myth 3: "For reliable software, you need Garbage Collection"

Garbage collection does a good, but not perfect, job at reclaiming unused memory. It is not a panacea. Memory can be retained indirectly and many resources are not plain memory. Consider:

class Filter { // take input from file iname and produce output on file oname
public:
  Filter(const string& iname, const string& oname); // constructor
  ~Filter();                                        // destructor
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

This Filter’s constructor opens two files. That done, the Filter performs some task on input from its input file producing output on its output file. The task could be hardwired into Filter, supplied as a lambda, or provided as a function that could be provided by a derived class overriding a virtual function. Those details are not important in a discussion of resource management. We can create Filters like this:

void user()
{
  Filter flt {“books”,”authors”};
  Filter* p = new Filter{“novels”,”favorites”};
  // use flt and *p
  delete p;
}

From a resource management point of view, the problem here is how to guarantee that the files are closed and the resources associated with the two streams are properly reclaimed for potential re-use.

The conventional solution in languages and systems relying on garbage collection is to eliminate the delete (which is easily forgotten, leading to leaks) and the destructor (because garbage collected languages rarely have destructors and “finalizers” are best avoided because they can be logically tricky and often damage performance). A garbage collector can reclaim all memory, but we need user actions (code) to close the files and to release any non-memory resources (such as locks) associated with the streams. Thus memory is automatically (and in this case perfectly) reclaimed, but the management of other resources is manual and therefore open to errors and leaks.

The common and recommended C++ approach is to rely on destructors to ensure that resources are reclaimed. Typically, such resources are acquired in a constructor leading to the awkward name “Resource Acquisition Is Initialization” (RAII) for this simple and general technique. In user(), the destructor for flt  implicitly calls the destructors for the streams is and os. These constructors in turn close the files and release the resources associated with the streams. The delete would do the same for *p.

Experienced users of modern C++ will have noticed that user() is rather clumsy and unnecessarily error-prone. This would be better:

void user2()
{
  Filter flt {“books”,”authors”};
  unique_ptr<Filter> p {new Filter{“novels”,”favorites”}};
  // use flt and *p
}

Now *p will be implicitly released whenever user() is exited. The programmer cannot forget to do so. The unique_ptr is a standard-library class designed to ensure resource release without runtime or space overheads compared to the use of built-in “naked” pointers.

However, we can still see the new, this solution is a bit verbose (the type Filter is repeated), and separating the construction of the ordinary pointer (using new) and the smart pointer (here, unique_ptr) inhibits some significant optimizations. We can improve this by using a C++14 helper function make_unique that constructs an object of a specified type and returns a unique_ptr to it:

void user3()
{
  Filter flt {“books”,”authors”};
  auto p = make_unique<Filter>(“novels”,”favorites”);
  // use flt and *p
}

Unless we really needed the second Filter to have pointer semantics (which is unlikely) this would be better still:

void user3()
{
  Filter flt {“books”,”authors”};
  Filter flt2 {“novels”,”favorites”};
  // use flt and flt2
}

This last version is shorter, simpler, clearer, and faster than the original.

But what does Filter’s destructor do? It releases the resources owned by a Filter; that is, it closes the files (by invoking their destructors). In fact, that is done implicitly, so unless something else is needed for Filter, we could eliminate the explicit mention of the Filter destructor and let the compiler handle it all. So, what I would have written was just:

class Filter { // take input from file iname and produce output on file oname
public:
  Filter(const string& iname, const string& oname);
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

void user3()
{
  Filter flt {“books”,”authors”};
  Filter flt2 {“novels”,”favorites”};
  // use flt and flt2
}

This happens to be simpler than what you would write in most garbage collected languages (e.g., Java or C#) and it is not open to leaks caused by forgetful programmers. It is also faster than the obvious alternatives (no spurious use of the free/dynamic store and no need to run a garbage collector). Typically, RAII also decreases the resource retention time relative to manual approaches.

This is my ideal for resource management. It handles not just memory, but general (non-memory) resources, such as file handles, thread handles, and locks. But is it really general? How about objects that needs to be passed around from function to function? What about objects that don’t have an obvious single owner?

4.1 Transferring Ownership: move

Let us first consider the problem of moving objects around from scope to scope. The critical question is how to get a lot of information out of a scope without serious overhead from copying or error-prone pointer use. The traditional approach is to use a pointer:

X* make_X()
{
  X* p = new X:
  // ... fill X ..
  return p;
}

void user()
{
  X* q = make_X();
  // ... use *q ...
  delete q;
}

Now who is responsible for deleting the object? In this simple case, obviously the caller of make_X() is, but in general the answer is not obvious. What if make_X() keeps a cache of objects to minimize allocation overhead? What if user() passed the pointer to some other_user()? The potential for confusion is large and leaks are not uncommon in this style of program.

I could use a shared_ptr or a unique_ptr to be explicit about the ownership of the created object. For example:

unique_ptr<X> make_X();

But why use a pointer (smart or not) at all? Often, I don’t want a pointer and often a pointer would distract from the conventional use of an object. For example, a Matrix addition function creates a new object (the sum) from two arguments, but returning a pointer would lead to seriously odd code:

unique_ptr<Matrix> operator+(const Matrix& a, const Matrix& b);
Matrix res = *(a+b);

That * is needed to get the sum, rather than a pointer to it. What I really want in many cases is an object, rather than a pointer to an object. Most often, I can easily get that. In particular, small objects are cheap to copy and I wouldn’t dream of using a pointer:

double sqrt(double); // a square root function
double s2 = sqrt(2); // get the square root of 2

On the other hand, objects holding lots of data are typically handles to most of that data. Consider istream, string, vector, list, and thread. They are all just a few words of data ensuring proper access to potentially large amounts of data. Consider again the Matrix addition. What we want is

Matrix operator+(const Matrix& a, const Matrix& b); // return the sum of a and b
Matrix r = x+y;

We can easily get that.

Matrix operator+(const Matrix& a, const Matrix& b)
{
  Matrix res;
  // ... fill res with element sums ...
  return res;
}

By default, this copies the elements of res into r, but since res is just about to be destroyed and the memory holding its elements is to be freed, there is no need to copy: we can “steal” the elements. Anybody could have done that since the first days of C++, and many did, but it was tricky to implement and the technique was not widely understood. C++11 directly supports “stealing the representation” from a handle in the form of move operations that transfer ownership. Consider a simple 2-D Matrix of doubles:

class Matrix {
  double* elem; // pointer to elements
  int nrow;     // number of rows
  int ncol;     // number of columns
public:
  Matrix(int nr, int nc)                  // constructor: allocate elements
    :elem{new double[nr*nc]}, nrow{nr}, ncol{nc}
  {
    for(int i=0; i<nr*nc; ++i) elem[i]=0; // initialize elements
  }

  Matrix(const Matrix&);                  // copy constructor
  Matrix operator=(const Matrix&);        // copy assignment

  Matrix(Matrix&&);                       // move constructor
  Matrix operator=(Matrix&&);             // move assignment

  ~Matrix() { delete[] elem; }            // destructor: free the elements

// …
};

A copy operation is recognized by its reference (&) argument. Similarly, a move operation is recognized by its rvalue reference (&&) argument. A move operation is supposed to “steal” the representation and leave an “empty object” behind. For Matrix, that means something like this:

Matrix::Matrix(Matrix&& a)                   // move constructor
  :nrow{a.nrow}, ncol{a.ncol}, elem{a.elem}  // “steal” the representation
{
  a.elem = nullptr;                          // leave “nothing” behind
}

That’s it! When the compiler sees the return res; it realizes that res is soon to be destroyed. That is, res will not be used after the return. Therefore it applies the move constructor, rather than the copy constructor to transfer the return value. In particular, for

Matrix r = a+b;

the res inside operator+() becomes empty -- giving the destructor a trivial task -- and res’s elements are now owned by r. We have managed to get the elements of the result -- potentially megabytes of memory -- out of the function (operator+()) and into the caller’s variable. We have done that at a minimal cost (probably four word assignments).

Expert C++ users have pointed out that there are cases where a good compiler can eliminate the copy on return completely (in this case saving the four word moves and the destructor call). However, that is implementation dependent, and I don’t like the performance of my basic programming techniques to depend on the degree of cleverness of individual compilers. Furthermore, a compiler that can eliminate the copy, can as easily eliminate the move. What we have here is a simple, reliable, and general way of eliminating complexity and cost of moving a lot of information from one scope to another.

Often, we don’t even need to define all those copy and move operations. If a class is composed out of members that behave as desired, we can simply rely on the operations generated by default. Consider:

class Matrix {
    vector<double> elem; // elements
    int nrow;            // number of rows
    int ncol;            // number of columns
public:
    Matrix(int nr, int nc)    // constructor: allocate elements
      :elem(nr*nc), nrow{nr}, ncol{nc}
    { }

    // ...
};

This version of Matrix behaves like the version above except that it copes slightly better with errors and has a slightly larger representation (a vector is usually three words).

What about objects that are not handles? If they are small, like an int or a complex<double>, don’t worry. Otherwise, make them handles or return them using “smart” pointers, such as unique_ptr and shared_ptr. Don’t mess with “naked” new and delete operations.

Unfortunately, a Matrix like the one I used in the example is not part of the ISO C++ standard library, but several are available (open source and commercial). For example, search the Web for “Origin Matrix Sutton” and see Chapter 29 of my The C++ Programming Language (Fourth Edition) [11] for a discussion of the design of such a matrix.

4.2 Shared Ownership: shared_ptr

In discussions about garbage collection it is often observed that not every object has a unique owner. That means that we have to be able ensure that an object is destroyed/freed when the last reference to it disappears. In the model here, we have to have a mechanism to ensure that an object is destroyed when its last owner is destroyed. That is, we need a form of shared ownership. Say, we have a synchronized queue, a sync_queue, used to communicate between tasks. A producer and a consumer are each given a pointer to the sync_queue:

void startup()
{
  sync_queue* p  = new sync_queue{200};  // trouble ahead!
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}

I assume that task1, task2, iqueue, and oqueue have been suitably defined elsewhere and apologize for letting the thread outlive the scope in which they were created (using detatch()). Also, you may imagine pipelines with many more tasks and sync_queues. However, here I am only interested in one question: “Who deletes the sync_queue created in startup()?” As written, there is only one good answer: “Whoever is the last to use the sync_queue.” This is a classic motivating case for garbage collection. The original form of garbage collection was counted pointers: maintain a use count for the object and when the count is about to go to zero delete the object. Many languages today rely on a variant of this idea and C++11 supports it in the form of shared_ptr. The example becomes:

void startup()
{
  auto p = make_shared<sync_queue>(200);  // make a sync_queue and return a stared_ptr to it
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}

Now the destructors for task1 and task2 can destroy their shared_ptrs (and will do so implicitly in most good designs) and the last task to do so will destroy the sync_queue.

This is simple and reasonably efficient. It does not imply a complicated run-time system with a garbage collector. Importantly, it does not just reclaim the memory associated with the sync_queue. It reclaims the synchronization object (mutex, lock, or whatever) embedded in the sync_queue to mange the synchronization of the two threads running the two tasks. What we have here is again not just memory management, it is general resource management. That “hidden” synchronization object is handled exactly as the file handles and stream buffers were handled in the earlier example.

We could try to eliminate the use of shared_ptr by introducing a unique owner in some scope that encloses the tasks, but doing so is not always simple, so C++11 provides both unique_ptr (for unique ownership) and shared_ptr (for shared ownership).

4.3 Type safety

Here, I have only addressed garbage collection in connection with resource management. It also has a role to play in type safety. As long as we have an explicit delete operation, it can be misused. For example:

X* p = new X;
X* q = p;
delete p;
// ...
q->do_something();  // the memory that held *p may have been re-used

Don’t do that. Naked deletes are dangerous -- and unnecessary in general/user code. Leave deletes inside resource management classes, such as string, ostream, thread, unique_ptr, and shared_ptr. There, deletes are carefully matched with news and harmless.

4.4 Summary: Resource Management Ideals

For resource management, I consider garbage collection a last choice, rather than “the solution” or an ideal:

  1. Use appropriate abstractions that recursively and implicitly handle their own resources. Prefer such objects to be scoped variables.
  2. When you need pointer/reference semantics, use “smart pointers” such as unique_ptr and shared_ptr to represent ownership.
  3. If everything else fails (e.g., because your code is part of a program using a mess of pointers without a language supported strategy for resource management and error handling), try to handle non-memory resources “by hand” and plug in a conservative garbage collector to handle the almost inevitable memory leaks.

Is this strategy perfect? No, but it is general and simple. Traditional garbage-collection based strategies are not perfect either, and they don’t directly address non-memory resources.

 

Postscript

The first installment of this series addressed

  • Myth 1: "To understand C++, you must first learn C"
  • Myth 2: "C++ is an Object-Oriented Language"

In my next installment I will address

  • Myth 4: "For efficiency, you must write low-level code"
  • Myth 5: "C++ is for large, complicated, programs only"
     

Add a Comment

Comments are closed.

Comments (11)

1 0

Vlad said on Dec 15, 2014 12:58 PM:

Great post! I think it is worth mentioning that even if return value optimization and copy elision is performed by most compilers (so one can live without a move constructor), there is still the problem of copying in an assignment, when the left hand side was previously constructed. In that case, as far as I know, pre C++11 will always perform a copy in the assignment
res = a + b;
But the move assignment operator gets rid of this extra copy.

PS: minor typo: the constructor of Matrix, line
Matrix(int nr, int nc)                  // constructor: allocate elements
:elem{double[nr*nc]}, nrow{nr}, ncol{nc}
should have a
 new double[nr*nc]
as opposed to how it is now.
0 0

Bjarne Stroustrup said on Dec 15, 2014 04:13 PM:

Thanks. Yes, that it would be a good idea to call out assignment

PS Oops! I must have shipped the uncorrected version. Sorry.
0 0

Ezo said on Dec 15, 2014 04:15 PM:

Hi. There is small typo here:
unique_prt<Matrix> operator+(const Matrix& a, const Matrix& b);

prt -> ptr

I agree with the article. I don't understand need for garbage collection. This doesn't seem like elegant solution to the problem, but like a hack. RAII makes more sense.
1 0

Bjarne Stroustrup said on Dec 15, 2014 06:47 PM:

Thanks.
Please don't underestimate the value of garbage collection in some/many areas. My problem with GC is that it is promoted as a panacea, when it is neither ideal or sufficient. We need to do better.
0 0

Nicky C said on Dec 15, 2014 07:44 PM:

RAII should be the default and general resources management technique, and average C++ programers should rarely need one. But it does not mean GC is useless since there is always some situation not "default and general". I do heard that some lock-free algorithms can be more practical when they are allowed to leak memory. Point 3 of the summary is another example. Such specific situations are where GC is useful.

GC is not bad. It is just commonly abused. GC is like a guardian. We don't always need a guardian along side. But when we have to do something highly specific and it is carefully proved to need a guardian, I would like the guardian to be conveniently available. And once we finish the task and no longer need GC, I would hope the guardian can leave happily.

In this sense, the best form of GC in my mind is not an embedded component of the rumtime system, no matter it is compulsory or optional. I would like my guardian to be a library object, and I explicitly ask it for memory.

Perhaps the guardian itself is a resource. Guardian Deployment Is Initialization.
0 0

solomonchild said on Dec 15, 2014 11:49 PM:

A typo here:
In user(), the destructor for flt implicitly calls the destructors for the streams is and os. These *constructors* in turn close the files and release the resources associated with the streams. The delete would do the same for *p.

I think you meant "These destructors".
0 0

Arseny Tolmachev said on Dec 15, 2014 11:52 PM:

By the way, is default dynamic memory allocator prone to memory fragmentation in long-running programs?
0 0

sabetaytoros said on Dec 16, 2014 02:12 AM:

Hi,

There is a shared pointer constructor with a deleter function. Is there the same type constructor for make_shared.

Thanks

0 0

matt said on Dec 16, 2014 05:16 AM:

Hi Bjarne!

I'm wondering, what are your thoughts on GC alternatives like region-based memory management -- together with (compile-time) region inference?
http://en.wikipedia.org/wiki/Region-based_memory_management#Region_inference

Rust is one example:
http://doc.rust-lang.org/nightly/intro.html#ownership
http://doc.rust-lang.org/nightly/intro.html#safety-[em]and[/em]-speed // having a really hard time with reformatting here, so replace square brackets with angle brackets

This seems to fit in with the C++'s zero-overhead philosophy -- while also being close to the current evolution trajectory (e.g., std::unique_ptr also expresses sole ownership, a concept enforced in Rust at compile time).

In addition: "In Rust, pointers are non-nullable and the compiler enforces the invariant that pointers are never dangling."

// Further similarities: https://github.com/rust-lang/rust/wiki/Rust-for-CXX-programmers
0 0

Steven Stewart-Gallus said on Dec 16, 2014 08:37 AM:

The filter example is incorrect. One needs to manually call the close method on the output file stream (calling it on the input stream is debatable) to check for errors as the output file stream destructor does not signal an error or throw an exception.
0 0

MichaelMueller said on Dec 16, 2014 10:49 AM:

GC is frustrating when used in a long-running environment that performs a great deal of memory operations. It becomes a CPU hog over time, and performance suffers at the expense of GC doing its job. You then have to tune your memory allocation patterns or try to get the GC to run when you can afford it to. I'd trade the black-box GC for determinism any day.