Articles & Books

Five Popular Myths about C++, Part 2

myth.png[For your winter reading pleasure, we're pleased to present this three-part series of new material by Bjarne Stroustrup. Part one is here, and part three will be posted next Monday which will complete the series just in time for the holidays. Enjoy. -- Ed.]

 

Five Popular Myths about C++ (Part 2)

by Bjarne Stroustrup

Morgan Stanley, Columbia University, Texas A&M University

 

1. Introduction (repeated from Part 1)

In this three-part series, I will explore, and debunk, five popular myths about C++:

  1. "To understand C++, you must first learn C"
  2. "C++ is an Object-Oriented Language"
  3. "For reliable software, you need Garbage Collection"
  4. "For efficiency, you must write low-level code"
  5. "C++ is for large, complicated, programs only"

If you believe in any of these myths, or have colleagues who perpetuate them, this short article is for you. Several of these myths have been true for someone, for some task, at some time. However, with today’s C++, using widely available up-to date ISO C++ 2011 compilers, and tools, they are mere myths.

I deem these myths “popular” because I hear them often. Occasionally, they are supported by reasons, but more often they are simply stated as obvious, needing no support. Sometimes, they are used to dismiss C++ from consideration for some use.

Each myth requires a long paper or even a book to completely debunk, but my aim here is simply to raise the issues and to briefly state my reasons.

The first two myths were addressed in my first installment.

4. Myth 3: "For reliable software, you need Garbage Collection"

Garbage collection does a good, but not perfect, job at reclaiming unused memory. It is not a panacea. Memory can be retained indirectly and many resources are not plain memory. Consider:

class Filter { // take input from file iname and produce output on file oname
public:
  Filter(const string& iname, const string& oname); // constructor
  ~Filter();                                        // destructor
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

This Filter’s constructor opens two files. That done, the Filter performs some task on input from its input file producing output on its output file. The task could be hardwired into Filter, supplied as a lambda, or provided as a function that could be provided by a derived class overriding a virtual function. Those details are not important in a discussion of resource management. We can create Filters like this:

void user()
{
  Filter flt {“books”,”authors”};
  Filter* p = new Filter{“novels”,”favorites”};
  // use flt and *p
  delete p;
}

From a resource management point of view, the problem here is how to guarantee that the files are closed and the resources associated with the two streams are properly reclaimed for potential re-use.

The conventional solution in languages and systems relying on garbage collection is to eliminate the delete (which is easily forgotten, leading to leaks) and the destructor (because garbage collected languages rarely have destructors and “finalizers” are best avoided because they can be logically tricky and often damage performance). A garbage collector can reclaim all memory, but we need user actions (code) to close the files and to release any non-memory resources (such as locks) associated with the streams. Thus memory is automatically (and in this case perfectly) reclaimed, but the management of other resources is manual and therefore open to errors and leaks.

The common and recommended C++ approach is to rely on destructors to ensure that resources are reclaimed. Typically, such resources are acquired in a constructor leading to the awkward name “Resource Acquisition Is Initialization” (RAII) for this simple and general technique. In user(), the destructor for flt  implicitly calls the destructors for the streams is and os. These constructors in turn close the files and release the resources associated with the streams. The delete would do the same for *p.

Experienced users of modern C++ will have noticed that user() is rather clumsy and unnecessarily error-prone. This would be better:

void user2()
{
  Filter flt {“books”,”authors”};
  unique_ptr<Filter> p {new Filter{“novels”,”favorites”}};
  // use flt and *p
}

Now *p will be implicitly released whenever user() is exited. The programmer cannot forget to do so. The unique_ptr is a standard-library class designed to ensure resource release without runtime or space overheads compared to the use of built-in “naked” pointers.

However, we can still see the new, this solution is a bit verbose (the type Filter is repeated), and separating the construction of the ordinary pointer (using new) and the smart pointer (here, unique_ptr) inhibits some significant optimizations. We can improve this by using a C++14 helper function make_unique that constructs an object of a specified type and returns a unique_ptr to it:

void user3()
{
  Filter flt {“books”,”authors”};
  auto p = make_unique<Filter>(“novels”,”favorites”);
  // use flt and *p
}

Unless we really needed the second Filter to have pointer semantics (which is unlikely) this would be better still:

void user3()
{
  Filter flt {“books”,”authors”};
  Filter flt2 {“novels”,”favorites”};
  // use flt and flt2
}

This last version is shorter, simpler, clearer, and faster than the original.

But what does Filter’s destructor do? It releases the resources owned by a Filter; that is, it closes the files (by invoking their destructors). In fact, that is done implicitly, so unless something else is needed for Filter, we could eliminate the explicit mention of the Filter destructor and let the compiler handle it all. So, what I would have written was just:

class Filter { // take input from file iname and produce output on file oname
public:
  Filter(const string& iname, const string& oname);
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

void user3()
{
  Filter flt {“books”,”authors”};
  Filter flt2 {“novels”,”favorites”};
  // use flt and flt2
}

This happens to be simpler than what you would write in most garbage collected languages (e.g., Java or C#) and it is not open to leaks caused by forgetful programmers. It is also faster than the obvious alternatives (no spurious use of the free/dynamic store and no need to run a garbage collector). Typically, RAII also decreases the resource retention time relative to manual approaches.

This is my ideal for resource management. It handles not just memory, but general (non-memory) resources, such as file handles, thread handles, and locks. But is it really general? How about objects that needs to be passed around from function to function? What about objects that don’t have an obvious single owner?

4.1 Transferring Ownership: move

Let us first consider the problem of moving objects around from scope to scope. The critical question is how to get a lot of information out of a scope without serious overhead from copying or error-prone pointer use. The traditional approach is to use a pointer:

X* make_X()
{
  X* p = new X:
  // ... fill X ..
  return p;
}

void user()
{
  X* q = make_X();
  // ... use *q ...
  delete q;
}

Now who is responsible for deleting the object? In this simple case, obviously the caller of make_X() is, but in general the answer is not obvious. What if make_X() keeps a cache of objects to minimize allocation overhead? What if user() passed the pointer to some other_user()? The potential for confusion is large and leaks are not uncommon in this style of program.

I could use a shared_ptr or a unique_ptr to be explicit about the ownership of the created object. For example:

unique_ptr<X> make_X();

But why use a pointer (smart or not) at all? Often, I don’t want a pointer and often a pointer would distract from the conventional use of an object. For example, a Matrix addition function creates a new object (the sum) from two arguments, but returning a pointer would lead to seriously odd code:

unique_ptr<Matrix> operator+(const Matrix& a, const Matrix& b);
Matrix res = *(a+b);

That * is needed to get the sum, rather than a pointer to it. What I really want in many cases is an object, rather than a pointer to an object. Most often, I can easily get that. In particular, small objects are cheap to copy and I wouldn’t dream of using a pointer:

double sqrt(double); // a square root function
double s2 = sqrt(2); // get the square root of 2

On the other hand, objects holding lots of data are typically handles to most of that data. Consider istream, string, vector, list, and thread. They are all just a few words of data ensuring proper access to potentially large amounts of data. Consider again the Matrix addition. What we want is

Matrix operator+(const Matrix& a, const Matrix& b); // return the sum of a and b
Matrix r = x+y;

We can easily get that.

Matrix operator+(const Matrix& a, const Matrix& b)
{
  Matrix res;
  // ... fill res with element sums ...
  return res;
}

By default, this copies the elements of res into r, but since res is just about to be destroyed and the memory holding its elements is to be freed, there is no need to copy: we can “steal” the elements. Anybody could have done that since the first days of C++, and many did, but it was tricky to implement and the technique was not widely understood. C++11 directly supports “stealing the representation” from a handle in the form of move operations that transfer ownership. Consider a simple 2-D Matrix of doubles:

class Matrix {
  double* elem; // pointer to elements
  int nrow;     // number of rows
  int ncol;     // number of columns
public:
  Matrix(int nr, int nc)                  // constructor: allocate elements
    :elem{new double[nr*nc]}, nrow{nr}, ncol{nc}
  {
    for(int i=0; i<nr*nc; ++i) elem[i]=0; // initialize elements
  }

  Matrix(const Matrix&);                  // copy constructor
  Matrix operator=(const Matrix&);        // copy assignment

  Matrix(Matrix&&);                       // move constructor
  Matrix operator=(Matrix&&);             // move assignment

  ~Matrix() { delete[] elem; }            // destructor: free the elements

// …
};

A copy operation is recognized by its reference (&) argument. Similarly, a move operation is recognized by its rvalue reference (&&) argument. A move operation is supposed to “steal” the representation and leave an “empty object” behind. For Matrix, that means something like this:

Matrix::Matrix(Matrix&& a)                   // move constructor
  :nrow{a.nrow}, ncol{a.ncol}, elem{a.elem}  // “steal” the representation
{
  a.elem = nullptr;                          // leave “nothing” behind
}

That’s it! When the compiler sees the return res; it realizes that res is soon to be destroyed. That is, res will not be used after the return. Therefore it applies the move constructor, rather than the copy constructor to transfer the return value. In particular, for

Matrix r = a+b;

the res inside operator+() becomes empty -- giving the destructor a trivial task -- and res’s elements are now owned by r. We have managed to get the elements of the result -- potentially megabytes of memory -- out of the function (operator+()) and into the caller’s variable. We have done that at a minimal cost (probably four word assignments).

Expert C++ users have pointed out that there are cases where a good compiler can eliminate the copy on return completely (in this case saving the four word moves and the destructor call). However, that is implementation dependent, and I don’t like the performance of my basic programming techniques to depend on the degree of cleverness of individual compilers. Furthermore, a compiler that can eliminate the copy, can as easily eliminate the move. What we have here is a simple, reliable, and general way of eliminating complexity and cost of moving a lot of information from one scope to another.

Often, we don’t even need to define all those copy and move operations. If a class is composed out of members that behave as desired, we can simply rely on the operations generated by default. Consider:

class Matrix {
    vector<double> elem; // elements
    int nrow;            // number of rows
    int ncol;            // number of columns
public:
    Matrix(int nr, int nc)    // constructor: allocate elements
      :elem(nr*nc), nrow{nr}, ncol{nc}
    { }

    // ...
};

This version of Matrix behaves like the version above except that it copes slightly better with errors and has a slightly larger representation (a vector is usually three words).

What about objects that are not handles? If they are small, like an int or a complex<double>, don’t worry. Otherwise, make them handles or return them using “smart” pointers, such as unique_ptr and shared_ptr. Don’t mess with “naked” new and delete operations.

Unfortunately, a Matrix like the one I used in the example is not part of the ISO C++ standard library, but several are available (open source and commercial). For example, search the Web for “Origin Matrix Sutton” and see Chapter 29 of my The C++ Programming Language (Fourth Edition) [11] for a discussion of the design of such a matrix.

4.2 Shared Ownership: shared_ptr

In discussions about garbage collection it is often observed that not every object has a unique owner. That means that we have to be able ensure that an object is destroyed/freed when the last reference to it disappears. In the model here, we have to have a mechanism to ensure that an object is destroyed when its last owner is destroyed. That is, we need a form of shared ownership. Say, we have a synchronized queue, a sync_queue, used to communicate between tasks. A producer and a consumer are each given a pointer to the sync_queue:

void startup()
{
  sync_queue* p  = new sync_queue{200};  // trouble ahead!
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}

I assume that task1, task2, iqueue, and oqueue have been suitably defined elsewhere and apologize for letting the thread outlive the scope in which they were created (using detatch()). Also, you may imagine pipelines with many more tasks and sync_queues. However, here I am only interested in one question: “Who deletes the sync_queue created in startup()?” As written, there is only one good answer: “Whoever is the last to use the sync_queue.” This is a classic motivating case for garbage collection. The original form of garbage collection was counted pointers: maintain a use count for the object and when the count is about to go to zero delete the object. Many languages today rely on a variant of this idea and C++11 supports it in the form of shared_ptr. The example becomes:

void startup()
{
  auto p = make_shared<sync_queue>(200);  // make a sync_queue and return a stared_ptr to it
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}

Now the destructors for task1 and task2 can destroy their shared_ptrs (and will do so implicitly in most good designs) and the last task to do so will destroy the sync_queue.

This is simple and reasonably efficient. It does not imply a complicated run-time system with a garbage collector. Importantly, it does not just reclaim the memory associated with the sync_queue. It reclaims the synchronization object (mutex, lock, or whatever) embedded in the sync_queue to mange the synchronization of the two threads running the two tasks. What we have here is again not just memory management, it is general resource management. That “hidden” synchronization object is handled exactly as the file handles and stream buffers were handled in the earlier example.

We could try to eliminate the use of shared_ptr by introducing a unique owner in some scope that encloses the tasks, but doing so is not always simple, so C++11 provides both unique_ptr (for unique ownership) and shared_ptr (for shared ownership).

4.3 Type safety

Here, I have only addressed garbage collection in connection with resource management. It also has a role to play in type safety. As long as we have an explicit delete operation, it can be misused. For example:

X* p = new X;
X* q = p;
delete p;
// ...
q->do_something();  // the memory that held *p may have been re-used

Don’t do that. Naked deletes are dangerous -- and unnecessary in general/user code. Leave deletes inside resource management classes, such as string, ostream, thread, unique_ptr, and shared_ptr. There, deletes are carefully matched with news and harmless.

4.4 Summary: Resource Management Ideals

For resource management, I consider garbage collection a last choice, rather than “the solution” or an ideal:

  1. Use appropriate abstractions that recursively and implicitly handle their own resources. Prefer such objects to be scoped variables.
  2. When you need pointer/reference semantics, use “smart pointers” such as unique_ptr and shared_ptr to represent ownership.
  3. If everything else fails (e.g., because your code is part of a program using a mess of pointers without a language supported strategy for resource management and error handling), try to handle non-memory resources “by hand” and plug in a conservative garbage collector to handle the almost inevitable memory leaks.

Is this strategy perfect? No, but it is general and simple. Traditional garbage-collection based strategies are not perfect either, and they don’t directly address non-memory resources.

 

Postscript

The first installment of this series addressed

  • Myth 1: "To understand C++, you must first learn C"
  • Myth 2: "C++ is an Object-Oriented Language"

In my next installment I will address

  • Myth 4: "For efficiency, you must write low-level code"
  • Myth 5: "C++ is for large, complicated, programs only"
     

In defense of printf -- Kenny Kerr

To use or not to use cout instead of printf -- that is the question!

In case you missed Kenny Kerr's article on printf vs cout:

In defense of printf

From the article:

Folks seem to enjoy pointing out that I use printf in many of my examples of “modern C++”, as if printf is not really proper C++. Apparently, I should be using cout. The trouble is that cout isn’t exactly modern either...

 

Quick Q: What is noexcept useful for? -- StackOverflow

Quick A: Guaranteeing "this function won't throw" enables optimization opportunities.

Recently on SO:

What is noexcept useful for?

I saw that c++11 added the noexcept keyword. But I don't really understand why is it useful.

If the function throws when it's not supposed to throw - why would I want the program to crash?

So when should I use it?

Also, how will it work along with compiling with /Eha and using _set_se_translator? This means that any line of code can throw c++ exception - because it might throw a SEH exception (Because of accessing protected memory) and it will be translated to c++ exception.

What will happen then?

A gotcha with ptr_vector -- Andrzej Krzemieński

Do you know Boost.Pointer Container? Andrzej gives some useful information in his recent blog entry:

A gotcha with ptr_vector

He gives answers to these questions:

What would you use boost::ptr_vector for? Why would you need to have a vector of pointers, which you want to delete yourself? Is it because:

  1. You want the objects to remain at the same address even if you re-allocate the array under the vector?
  2. You want to inter-operate with a library that already deals with owing pointers?
  3. You want it to be faster than if you were storing values in std::vector?
  4. You want the “polymorphic behavior” of your objects?

Property Testing in C++ -- Jeremy Wright

In case you missed it, this may be midly controversial but we felt it could be of interest:

Property Testing in C++

by Jeremy Wright

From the article:

One might say tests are shiny. I don't know if they are really shiny as much as I found another cool use for uniform_int_distribution<>...

Tutorial to develop a client side web application in C++ with PNaCl -- Chrome Developer Guide

In case you missed it, now you can develop web apps in C++ to run natively on Google Chrome. This tutorial shows how to develop a Native Client module in C++ and build and run it using PNaCl toolchain:

C++ Tutorial: Getting Started (Part 1)

This tutorial shows how to build and run a web application using Portable Native Client (PNaCl). This is a client-side application that uses HTML, JavaScript and a Native Client module written in C++. The PNaCl toolchain is used to enable running the Native Client module directly from a web page...

Portable Native Client or PNacl, which is supported on Google Chrome at the moment, lets you compile and run native code developed in C or C++ within a secure sandbox in browser. PNaCl uses the LLVM toolchain to generate OS-independent and architecture agnostic intermediate bytecode which is later loaded, translated and executed by the browser. To get more information on PNaCl and it's less portable companion NaCl check the native client developer documentation.

C++ Template Metaprogramming Introduction -- Nicolás Brailovsky

tmp-pandora.PNGIn case you missed it, from Nicolás Brailovsky's blog -- an interesting post after you get past a little ranting:

C++ Template Metaprogramming Introduction

by Nicolás Brailovsky

From the article:

First, we need to start with a little clarification: using template <class T> to parametrize a class, something like std::vector does, is not template metaprogramming. That’s just a generic class (Java-pun intended). That is indeed a useful case for templates, but it has little fun in it.

Template metaprogramming is much more fun than mere generic classes...

A few notes:

  • Even if template metaprogramming isn't something that every C++ developer will do on daily basis, it's worth being able to understand when you see it.
  • For homework, try solving the Hanoi Tower problem by using template metaprogramming. [A sample for the 8-Queens problem is available here.]

Five Popular Myths about C++, Part 1

myth.png[For your winter reading pleasure, we're pleased to present this three-part series of new material by Bjarne Stroustrup. This is part one; parts two and three will be posted on the following two Mondays, which will complete the series just in time for Christmas. Enjoy. -- Ed.]

 

Five Popular Myths about C++ (Part 1)

by Bjarne Stroustrup

Morgan Stanley, Columbia University, Texas A&M University

 

1. Introduction

In this three-part series, I will explore, and debunk, five popular myths about C++:

  1. "To understand C++, you must first learn C"
  2. "C++ is an Object-Oriented Language"
  3. "For reliable software, you need Garbage Collection"
  4. "For efficiency, you must write low-level code"
  5. "C++ is for large, complicated, programs only"

If you believe in any of these myths, or have colleagues who perpetuate them, this short article is for you. Several of these myths have been true for someone, for some task, at some time. However, with today’s C++, using widely available up-to date ISO C++ 2011 compilers, and tools, they are mere myths.

I deem these myths “popular” because I hear them often. Occasionally, they are supported by reasons, but more often they are simply stated as obvious, needing no support. Sometimes, they are used to dismiss C++ from consideration for some use.

Each myth requires a long paper or even a book to completely debunk, but my aim here is simply to raise the issues and to briefly state my reasons.

2.  Myth 1: "To understand C++, you must first learn C"

No. Learning basic programming using C++ is far easier than with C.

C is almost a subset of C++, but it is not the best subset to learn first because C lacks the notational support, the type safety, and the easier-to-use standard library offered by C++ to simplify simple tasks. Consider a trivial function to compose an email address:

string compose(const string& name, const string& domain)
{
  return name+'@'+domain;
}

It can be used like this

string addr = compose("gre","research.att.com");

The C version requires explicit manipulation of characters and explicit memory management:

char* compose(const char* name, const char* domain)
{
  char* res = malloc(strlen(name)+strlen(domain)+2); // space for strings, '@', and 0
  char* p = strcpy(res,name);
p += strlen(name);
  *p = '@';
  strcpy(p+1,domain);
  return res;
}

It can be used like this

char* addr = compose("gre","research.att.com");
// …
free(addr); // release memory when done

Which version would you rather teach? Which version is easier to use? Did I really get the C version right? Are you sure? Why?


Finally, which version is likely to be the most efficient? Yes, the C++ version, because it does not have to count the argument characters and does not use the free store (dynamic memory) for short argument strings.

2.1 Learning C++

This is not an odd isolated example. I consider it typical. So why do so many teachers insist on the “C first” approach?

  • Because that’s what they have done for ages.
  • Because that’s what the curriculum requires.
  • Because that’s the way the teachers learned it in their youth.
  • Because C is smaller than C++ it is assumed to be simpler to use.
  • Because the students have to learn C (or the C subset of C++) sooner or later anyway.

However, C is not the easiest or most useful subset of C++ to learn first. Furthermore, once you know a reasonable amount of C++, the C subset is easily learned. Learning C before C++ implies suffering errors that are easily avoided in C++ and learning techniques for mitigating them.

For a modern approach to teaching C++, see my Programming: Principles and Practice Using C++ [13]. It even has a chapter at the end showing how to use C. It has been used, reasonably successfully, with tens of thousands of beginning students in several universities. Its second edition uses C++11 and C++14 facilities to ease learning.

With C++11 [11-12], C++ has become more approachable for novices. For example, here is standard-library vector initialized with a sequence of elements:

vector<int> v = {1,2,3,5,8,13};

In C++98, we could only initialize arrays with lists. In C++11, we can define a constructor to accept a {} initializer list for any type for which we want one.

We could traverse that vector with a range-for loop:

for(int x : v) test(x);

This will call test() once for each element of v.

A range-for loop can traverse any sequence, so we could have simplified that example by using the initializer list directly:

for (int x : {1,2,3,5,8,13}) test(x);

One of the aims of C++11 was to make simple things simple. Naturally, this is done without adding performance penalties.

3.  Myth 2: "C++ is an Object-Oriented Language"

No. C++ supports OOP and other programming styles, but is deliberately not limited to any narrow view of “Object Oriented.” It supports a synthesis of programming techniques including object-oriented and generic programming. More often than not, the best solution to a problem involves more than one style (“paradigm”). By “best,” I mean shortest, most comprehensible, most efficient, most maintainable, etc.

The “C++ is an OOPL” myth leads people to consider C++ unnecessary (when compared to C) unless you need large class hierarchies with many virtual (run-time polymorphic) functions – and for many people and for many problems, such use is inappropriate. Believing this myth leads others to condemn C++ for not being purely OO; after all, if you equate “good” and “object-oriented,” C++ obviously contains much that is not OO and must therefore be deemed “not good.” In either case, this myth provides a good excuse for not learning C++.

Consider an example:

void rotate_and_draw(vector<Shape*>& vs, int r)
{
  for_each(vs.begin(),vs.end(), [](Shape* p) { p->rotate(r); });  // rotate all elements of vs
  for (Shape* p : vs) p->draw();                                  // draw all elements of vs
}

Is this object-oriented? Of course it is; it relies critically on a class hierarchy with virtual functions. It is generic? Of course it is; it relies critically on a parameterized container (vector) and the generic function for_each. Is this functional? Sort of; it uses a lambda (the [] construct). So what is it?  It is modern C++: C++11.

I used both the range-for loop and the standard-library algorithm for_each just to show off features. In real code, I would have use only one loop, which I could have written either way.

3.1 Generic Programming

Would you like this code more generic? After all, it works only for vectors of pointers to Shapes. How about lists and built-in arrays? What about “smart pointers” (resource-management pointers), such as shared_ptr and unique_ptr? What about objects that are not called Shape that you can draw() and rotate()? Consider:

template<typename Iter>
void rotate_and_draw(Iter first, Iter last, int r)
{
  for_each(first,last,[](auto p) { p->rotate(r); });  // rotate all elements of [first:last)
  for (auto p = first; p!=last; ++p) p->draw();       // draw all elements of [first:last)
}

This works for any sequence you can iterate through from first to last. That’s the style of the C++ standard-library algorithms. I used auto to avoid having to name the type of the interface to “shape-like objects.” That’s a C++11 feature meaning “use the type of the expression used as initializer,” so for the for-loop p’s type is deduced to be whatever type first is. The use of auto to denote the argument type of a lambda is a C++14 feature, but already in use.

Consider:

void user(list<unique_ptr<Shape>>& lus, Container<Blob>& vb)
{
rotate_and_draw(lst.begin(),lst.end());
rotate_and_draw(begin(vb),end(vb));
}

Here, I assume that Blob is some graphical type with operations draw() and rotate() and that Container is some container type. The standard-library list (std::list) has member functions begin() and end() to help the user traverse its sequence of elements. That’s nice and classical OOP. But what if Container is something that does not support the C++ standard library’s notion of iterating over a half-open sequence, [b:e)? Something that does not have begin() and end() members? Well, I have never seen something container-like, that I couldn’t traverse, so we can define free-standing begin() and end() with appropriate semantics. The standard library provides that for C-style arrays, so if Container is a C-style array, the problem is solved – and C-style arrays are still very common.

3.2 Adaptation

Consider a harder case: What if Container holds pointers to objects and has a different model for access and traversal? For example, assume that you are supposed to access a Container like this

for (auto p = c.first(); p!=nullptr; p=c.next()) { /* do something with *p */}

This style is not uncommon. We can map it to a [b:e) sequence like this

template<typename T> struct Iter {
  T* current;
  Container<T>& c;
};

template<typename T> Iter<T> begin(Container<T>& c) { return Iter<T>{c.first(),c}; }
template<typename T> Iter<T> end(Container<T>& c)   { return Iter<T>{nullptr}; }
template<typename T> Iter<T> operator++(Iter<T> p)  { p.current = c.next(); return this; }
template<typename T> T*      operator*(Iter<T> p)   { return p.current; }

Note that this is modification is nonintrusive: I did not have to make changes to Container or some Container class hierarchy to map Container into the model of traversal supported by the C++ standard library. It is a form of adaptation, rather than a form of refactoring.

I chose this example to show that these generic programming techniques are not restricted to the standard library (in which they are pervasive). Also, for most common definitions of “object oriented,” they are not object-oriented.

The idea that C++ code must be object-oriented (meaning use hierarchies and virtual functions everywhere) can be seriously damaging to performance. That view of OOP is great if you need run-time resolution of a set of types. I use it often for that. However, it is relatively rigid (not every related type fits into a hierarchy) and a virtual function call inhibits inlining (and that can cost you a factor of 50 in speed in simple and important cases).

 

Postscript

In my next installment, I’ll address “For reliable software, you need Garbage Collection.”

Quick Q: How do I allow iteration on an internal vector without exposing the vector?--StackExchange

Quick A: By exposing wrapped iterators in your interface.

From Programmers StackExchange, StackOverflow's sister site:

Allow iteration of an internal vector without leaking the implementation

I have a class that represents a list of people.

class AddressBook
{
public:
  AddressBook();

private:
  std::vector<People> people;
}
I want to allow clients to iterate over the vector of people. The first thought I had was simply:
std::vector<People> & getPeople { return people; }
However, I do not want to leak the implementation details to the client. I may want to maintain certain invariants when the vector is modified, and I lose control over these invariants when I leak the implementation.
What's the best way to allow iteration without leaking the internals?

Continuation Passing Style -- whanhee

In case you missed it, from Functional C++:

Continuation Passing Style

by whanhee

From the article:

Continuation passing style (CPS) is a method of programming which allows for intricate manipulation of control flow while still maintaining functional purity.

Continuations can be used to implement a whole range of things from exceptions to coroutines, but today we’ll just introduce them and a few interesting and useful concepts...

As summarized in Wikipedia: "continuation-passing style (CPS) is a style of programming in which control is passed explicitly in the form of a continuation."