Five Popular Myths about C++, Part 1


[For your winter reading pleasure, we're pleased to present this three-part series of new material by Bjarne Stroustrup. This is part one; parts two and three will be posted on the following two Mondays, which will complete the series just in time for Christmas. Enjoy. -- Ed.]

 


by Bjarne Stroustrup

Morgan Stanley, Columbia University, Texas A&M University

 

1. Introduction

In this three-part series, I will explore, and debunk, five popular myths about C++:

  1. "To understand C++, you must first learn C"
  2. "C++ is an Object-Oriented Language"
  3. "For reliable software, you need Garbage Collection"
  4. "For efficiency, you must write low-level code"
  5. "C++ is for large, complicated, programs only"

If you believe in any of these myths, or have colleagues who perpetuate them, this short article is for you. Several of these myths have been true for someone, for some task, at some time. However, with today’s C++, using widely available up-to-date ISO C++ 2011 compilers and tools, they are mere myths.

I deem these myths “popular” because I hear them often. Occasionally, they are supported by reasons, but more often they are simply stated as obvious, needing no support. Sometimes, they are used to dismiss C++ from consideration for some use.

Each myth requires a long paper or even a book to completely debunk, but my aim here is simply to raise the issues and to briefly state my reasons.

2.  Myth 1: "To understand C++, you must first learn C"

No. Learning basic programming using C++ is far easier than with C.

C is almost a subset of C++, but it is not the best subset to learn first because C lacks the notational support, the type safety, and the easier-to-use standard library offered by C++ to simplify simple tasks. Consider a trivial function to compose an email address:

string compose(const string& name, const string& domain)
{
  return name+'@'+domain;
}

It can be used like this

string addr = compose("gre","research.att.com");
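For readers who want to compile the C++ version, here is a minimal self-contained sketch. The headers, the using-declaration, and the helper check_compose() are additions for illustration and are not part of the original example.

```cpp
#include <cassert>
#include <string>

using std::string;

// Compose an email address from a user name and a domain.
string compose(const string& name, const string& domain)
{
    return name + '@' + domain;
}

// Illustrative self-check (hypothetical helper); call it from main() to try compose().
void check_compose()
{
    string addr = compose("gre", "research.att.com");
    assert(addr == "gre@research.att.com");
    assert(compose("", "") == "@");   // empty parts still yield a valid string object
}
```

Note that the caller never frees anything: the returned string owns its characters and releases them when it goes out of scope.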

The C version requires explicit manipulation of characters and explicit memory management:

char* compose(const char* name, const char* domain)
{
  char* res = malloc(strlen(name)+strlen(domain)+2); // space for strings, '@', and 0
  char* p = strcpy(res,name);
  p += strlen(name);
  *p = '@';
  strcpy(p+1,domain);
  return res;
}

It can be used like this

char* addr = compose("gre","research.att.com");
// …
free(addr); // release memory when done

Which version would you rather teach? Which version is easier to use? Did I really get the C version right? Are you sure? Why?


Finally, which version is likely to be the most efficient? Yes, the C++ version, because it does not have to count the argument characters and does not use the free store (dynamic memory) for short argument strings.

2.1 Learning C++

This is not an odd isolated example. I consider it typical. So why do so many teachers insist on the “C first” approach?

  • Because that’s what they have done for ages.
  • Because that’s what the curriculum requires.
  • Because that’s the way the teachers learned it in their youth.
  • Because C is smaller than C++ it is assumed to be simpler to use.
  • Because the students have to learn C (or the C subset of C++) sooner or later anyway.

However, C is not the easiest or most useful subset of C++ to learn first. Furthermore, once you know a reasonable amount of C++, the C subset is easily learned. Learning C before C++ implies suffering errors that are easily avoided in C++ and learning techniques for mitigating them.

For a modern approach to teaching C++, see my Programming: Principles and Practice Using C++ [13]. It even has a chapter at the end showing how to use C. It has been used, reasonably successfully, with tens of thousands of beginning students in several universities. Its second edition uses C++11 and C++14 facilities to ease learning.

With C++11 [11-12], C++ has become more approachable for novices. For example, here is a standard-library vector initialized with a sequence of elements:

vector<int> v = {1,2,3,5,8,13};

In C++98, we could only initialize arrays with lists. In C++11, we can define a constructor to accept a {} initializer list for any type for which we want one.
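As a sketch of what defining such a constructor can look like, consider the small class below. The class Samples and its members are hypothetical names chosen for illustration; only std::initializer_list and std::vector come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical type, used only to illustrate an initializer-list constructor.
class Samples {
public:
    // Accepts a {} list of doubles, just as vector<double> does.
    Samples(std::initializer_list<double> values) : data_(values) {}

    std::size_t size() const { return data_.size(); }

    double sum() const
    {
        double s = 0;
        for (double x : data_) s += x;
        return s;
    }

private:
    std::vector<double> data_;
};

// Illustrative self-check (hypothetical helper).
void check_samples()
{
    Samples s = {1.5, 2.5, 4.0};
    assert(s.size() == 3);
    assert(s.sum() == 8.0);
}
```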

We could traverse that vector with a range-for loop:

for(int x : v) test(x);

This will call test() once for each element of v.

A range-for loop can traverse any sequence, so we could have simplified that example by using the initializer list directly:

for (int x : {1,2,3,5,8,13}) test(x);

One of the aims of C++11 was to make simple things simple. Naturally, this is done without adding performance penalties.

3.  Myth 2: "C++ is an Object-Oriented Language"

No. C++ supports OOP and other programming styles, but is deliberately not limited to any narrow view of “Object Oriented.” It supports a synthesis of programming techniques including object-oriented and generic programming. More often than not, the best solution to a problem involves more than one style (“paradigm”). By “best,” I mean shortest, most comprehensible, most efficient, most maintainable, etc.

The “C++ is an OOPL” myth leads people to consider C++ unnecessary (when compared to C) unless you need large class hierarchies with many virtual (run-time polymorphic) functions – and for many people and for many problems, such use is inappropriate. Believing this myth leads others to condemn C++ for not being purely OO; after all, if you equate “good” and “object-oriented,” C++ obviously contains much that is not OO and must therefore be deemed “not good.” In either case, this myth provides a good excuse for not learning C++.

Consider an example:

void rotate_and_draw(vector<Shape*>& vs, int r)
{
  for_each(vs.begin(),vs.end(), [r](Shape* p) { p->rotate(r); }); // rotate all elements of vs
  for (Shape* p : vs) p->draw();                                  // draw all elements of vs
}

Is this object-oriented? Of course it is; it relies critically on a class hierarchy with virtual functions. Is it generic? Of course it is; it relies critically on a parameterized container (vector) and the generic function for_each. Is this functional? Sort of; it uses a lambda (the [] construct). So what is it? It is modern C++: C++11.

I used both the range-for loop and the standard-library algorithm for_each just to show off features. In real code, I would have used only one loop, which I could have written either way.
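A single-loop version might look like the sketch below. The article does not define Shape, so the minimal hierarchy and the Counting_shape test class here are assumptions made only so that the example compiles. Note also that one loop interleaves rotating and drawing per element, which is equivalent only if drawing one shape does not depend on the rotation of another.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal hierarchy; the article does not define Shape.
struct Shape {
    virtual void rotate(int degrees) = 0;
    virtual void draw() = 0;
    virtual ~Shape() {}
};

// Hypothetical concrete shape that just records what was done to it.
struct Counting_shape : Shape {
    int angle = 0;
    int draws = 0;
    void rotate(int degrees) override { angle += degrees; }
    void draw() override { ++draws; }
};

// One range-for loop: rotate, then draw, each element in turn.
void rotate_and_draw(std::vector<Shape*>& vs, int r)
{
    for (Shape* p : vs) {
        p->rotate(r);
        p->draw();
    }
}

// Illustrative self-check (hypothetical helper).
void check_rotate_and_draw()
{
    Counting_shape a, b;
    std::vector<Shape*> vs = {&a, &b};
    rotate_and_draw(vs, 90);
    assert(a.angle == 90 && a.draws == 1);
    assert(b.angle == 90 && b.draws == 1);
}
```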

3.1 Generic Programming

Would you like this code to be more generic? After all, it works only for vectors of pointers to Shapes. How about lists and built-in arrays? What about “smart pointers” (resource-management pointers), such as shared_ptr and unique_ptr? What about objects that are not called Shape that you can draw() and rotate()? Consider:

template<typename Iter>
void rotate_and_draw(Iter first, Iter last, int r)
{
  for_each(first,last,[r](const auto& p) { p->rotate(r); });  // rotate all elements of [first:last)
  for (auto p = first; p!=last; ++p) (*p)->draw();            // draw all elements of [first:last)
}

This works for any sequence you can iterate through from first to last. That’s the style of the C++ standard-library algorithms. I used auto to avoid having to name the type of the interface to “shape-like objects.” That’s a C++11 feature meaning “use the type of the expression used as initializer,” so for the for-loop p’s type is deduced to be whatever type first is. The use of auto to denote the argument type of a lambda is a C++14 feature, but already in use.
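To make the claim concrete, here is a hedged, self-contained sketch that exercises an iterator-based rotate_and_draw() with a type unrelated to Shape. Blob's members are hypothetical, and the sketch uses one plain loop rather than for_each with a generic lambda so that it also compiles as C++11.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <vector>

// Hypothetical type with no relation to any Shape hierarchy.
struct Blob {
    int angle = 0;
    int draws = 0;
    void rotate(int degrees) { angle += degrees; }
    void draw() { ++draws; }
};

// Works for any [first:last) whose elements are pointer-like
// (raw pointers, unique_ptr, shared_ptr, ...).
template<typename Iter>
void rotate_and_draw(Iter first, Iter last, int r)
{
    for (auto p = first; p != last; ++p) {
        (*p)->rotate(r);   // *p is the element; -> reaches the pointed-to object
        (*p)->draw();
    }
}

// Illustrative self-check (hypothetical helper).
void check_generic()
{
    std::list<std::unique_ptr<Blob>> owned;
    owned.push_back(std::unique_ptr<Blob>(new Blob));
    rotate_and_draw(owned.begin(), owned.end(), 45);
    assert(owned.front()->angle == 45 && owned.front()->draws == 1);

    Blob b;
    std::vector<Blob*> raw = {&b};
    rotate_and_draw(raw.begin(), raw.end(), 30);
    assert(b.angle == 30 && b.draws == 1);
}
```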

Consider:

void user(list<unique_ptr<Shape>>& lus, Container<Blob>& vb, int r)
{
  rotate_and_draw(lus.begin(),lus.end(),r);
  rotate_and_draw(begin(vb),end(vb),r);
}

Here, I assume that Blob is some graphical type with operations draw() and rotate() and that Container is some container type. The standard-library list (std::list) has member functions begin() and end() to help the user traverse its sequence of elements. That’s nice and classical OOP. But what if Container is something that does not support the C++ standard library’s notion of iterating over a half-open sequence, [b:e)? Something that does not have begin() and end() members? Well, I have never seen something container-like, that I couldn’t traverse, so we can define free-standing begin() and end() with appropriate semantics. The standard library provides that for C-style arrays, so if Container is a C-style array, the problem is solved – and C-style arrays are still very common.
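The difference between member and free-standing begin() and end() can be seen in a small sketch. The function sum() is a hypothetical generic algorithm written for this illustration; std::begin and std::end are the standard-library functions that handle C-style arrays.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Generic over any [first:last) sequence of ints,
// in the style of the standard-library algorithms.
template<typename Iter>
int sum(Iter first, Iter last)
{
    int total = 0;
    for (auto p = first; p != last; ++p) total += *p;
    return total;
}

// Illustrative self-check (hypothetical helper).
void check_sum()
{
    int arr[] = {1, 2, 3, 5, 8, 13};            // C-style array: no begin()/end() members
    std::list<int> lst = {1, 2, 3, 5, 8, 13};   // standard container: member begin()/end()

    assert(sum(std::begin(arr), std::end(arr)) == 32);  // free-standing begin()/end()
    assert(sum(lst.begin(), lst.end()) == 32);          // member begin()/end()
}
```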

3.2 Adaptation

Consider a harder case: What if Container holds pointers to objects and has a different model for access and traversal? For example, assume that you are supposed to access a Container like this

for (auto p = c.first(); p!=nullptr; p=c.next()) { /* do something with *p */}

This style is not uncommon. We can map it to a [b:e) sequence like this

template<typename T> struct Iter {
  T* current;
  Container<T>& c;
};

template<typename T> Iter<T>  begin(Container<T>& c)  { return Iter<T>{c.first(),c}; }
template<typename T> Iter<T>  end(Container<T>& c)    { return Iter<T>{nullptr,c}; }
template<typename T> Iter<T>& operator++(Iter<T>& p)  { p.current = p.c.next(); return p; }  // advance p itself
template<typename T> T*       operator*(Iter<T> p)    { return p.current; }
template<typename T> bool     operator!=(Iter<T> a, Iter<T> b) { return a.current!=b.current; }  // needed for p!=last

Note that this modification is nonintrusive: I did not have to make changes to Container or some Container class hierarchy to map Container into the model of traversal supported by the C++ standard library. It is a form of adaptation, rather than a form of refactoring.
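The following self-contained sketch shows such an adapter at work. The Container class here is hypothetical: the article specifies only that it offers first() and next() and signals the end with nullptr, so add() and the internal storage are assumptions made to get something that compiles and runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container with a first()/next() traversal model.
// It holds pointers and reports the end of the sequence with nullptr.
template<typename T>
class Container {
public:
    void add(T* p) { elems_.push_back(p); }
    T* first() { pos_ = 0; return get(); }
    T* next()  { ++pos_; return get(); }
private:
    T* get() const { return pos_ < elems_.size() ? elems_[pos_] : nullptr; }
    std::vector<T*> elems_;
    std::size_t pos_ = 0;
};

// Nonintrusive adapter: maps first()/next() onto a [b:e) sequence.
template<typename T> struct Iter {
    T* current;
    Container<T>& c;
};

template<typename T> Iter<T> begin(Container<T>& c) { return Iter<T>{c.first(), c}; }
template<typename T> Iter<T> end(Container<T>& c)   { return Iter<T>{nullptr, c}; }
template<typename T> Iter<T>& operator++(Iter<T>& p) { p.current = p.c.next(); return p; }
template<typename T> T* operator*(Iter<T> p) { return p.current; }
template<typename T> bool operator!=(Iter<T> a, Iter<T> b) { return a.current != b.current; }

// Illustrative self-check (hypothetical helper).
void check_adapter()
{
    int x = 1, y = 2, z = 3;
    Container<int> c;
    c.add(&x); c.add(&y); c.add(&z);

    int total = 0;
    for (int* p : c) total += *p;   // range-for finds the free-standing begin()/end()
    assert(total == 6);
}
```

Because this hypothetical Container keeps its traversal position inside itself, only one traversal can be active at a time; the adapter inherits that limitation from the first()/next() model rather than adding it.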

I chose this example to show that these generic programming techniques are not restricted to the standard library (in which they are pervasive). Also, for most common definitions of “object oriented,” they are not object-oriented.

The idea that C++ code must be object-oriented (meaning use hierarchies and virtual functions everywhere) can be seriously damaging to performance. That view of OOP is great if you need run-time resolution of a set of types. I use it often for that. However, it is relatively rigid (not every related type fits into a hierarchy) and a virtual function call inhibits inlining (and that can cost you a factor of 50 in speed in simple and important cases).
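The two styles can be put side by side in a small sketch. All names here (Scaler, Doubler, sum_virtual, sum_generic) are hypothetical, and the sketch makes no claim about actual speed: whether and how much the generic version is faster depends on the compiler, the optimization level, and whether the compiler manages to devirtualize the call.

```cpp
#include <cassert>
#include <vector>

// Run-time polymorphism: the operation is called through a virtual function.
struct Scaler {
    virtual int apply(int x) const = 0;
    virtual ~Scaler() {}
};

struct Doubler : Scaler {
    int apply(int x) const override { return 2 * x; }
};

int sum_virtual(const std::vector<int>& v, const Scaler& s)
{
    int total = 0;
    for (int x : v) total += s.apply(x);   // target resolved at run time
    return total;
}

// Compile-time polymorphism: the operation is a template parameter,
// so the compiler sees exactly what is called and may inline it.
template<typename Op>
int sum_generic(const std::vector<int>& v, Op op)
{
    int total = 0;
    for (int x : v) total += op(x);        // target resolved at compile time
    return total;
}

// Illustrative self-check (hypothetical helper): both styles compute the same result.
void check_sums()
{
    std::vector<int> v = {1, 2, 3};
    assert(sum_virtual(v, Doubler{}) == 12);
    assert(sum_generic(v, [](int x) { return 2 * x; }) == 12);
}
```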

 

Postscript

In my next installment, I’ll address “For reliable software, you need Garbage Collection.”


Comments (17)


theypsilon said on Dec 9, 2014 09:27 AM:

I don't understand the following lines:

template<typename T> Iter<T> end(Container<T>& c)   { return Iter<T>{nullptr}; }
template<typename T> Iter<T> operator++(Iter<T> p) { p.current = c.next(); return this; }


Wouldn't it be better like this?

template<typename T> Iter<T> end(Container<T>& c)   { return Iter<T>{nullptr,c}; }
template<typename T> Iter<T> operator++(Iter<T> p) { p.current = p.c.next(); return p; }

Chris Bouchard said on Dec 9, 2014 09:57 AM:

theypsilon: You're right. The operator++ function would not have any access to this since it's not a member function. The end function would raise a warning (and should crash and burn if run) because the c reference would be dangling. Your suggestion is certainly one way to fix it, but you could also make c an explicit pointer and set it to nullptr. This would have the property that end(a) == end(b) for all Containers a and b, in case that is desirable.

Bjarne Stroustrup said on Dec 9, 2014 10:29 AM:

:-(. Thanks: So: Never make changes to code for publication after testing.

Gvidon said on Dec 9, 2014 12:49 PM:

> because it does not have to count the argument characters
But it does, when you pass string literals into the function. Also, you didn't check the result of malloc (which, I believe, you did on purpose, but it still made me uneasy)

Alexandre Bourlon said on Dec 9, 2014 02:16 PM:

How can
template<typename T> Iter<T> operator++(Iter<T> p)  { p.current = p.c.next(); return p; }
have any side effect if p is taken by value?

maredsous10 said on Dec 9, 2014 02:54 PM:

void user(list<unique_ptr<Shape>>& lus, Container<Blob>& vb) 


Did you mean lst instead of lus?

Bjarne Stroustrup said on Dec 9, 2014 06:37 PM:

Thanks for the comments.

Please try not to miss main points from an obsession with details.

Srdjan Veljkovic said on Dec 9, 2014 09:29 PM:

I guess I would teach a C newcomer something like this:


errno_t compose_s(char *dest, size_t n, char const *name, char const *domain)
{
return strcpy_s(dest, n, name) || strcat_s(dest, n, "@") || strcat_s(dest, n, domain);
}
// ...
char email[80];
if (0 == compose_s(email, sizeof email, "gre", "research.att.com")) {
// use `email`, it's OK now...
}


No heap allocation, thus faster, yet safe. Sure, you may have to enlarge email[] if compose_s() fails, but you could look at it as a form of "manual small string optimization". :-)

polomora said on Dec 10, 2014 12:39 AM:

Having started programming with C, and then having moved to C++, I wouldn't like to move back to C. Having said that, this statement is definitely a myth
"Learning basic programming using C++ is far easier than with C."

Learning and using C++ is *much* more difficult than C.

Sektor said on Dec 10, 2014 03:36 AM:

About the myth of "C++ is an OOP language", there's another aspect for that, this time from the perspective of fans of "more advanced languages" - more less something like that:

C++ is not really an OOP language and it has the OOP features faked and weak. So, it's better to use some "high-level", "real object-oriented" languages, such as Java. Because it has GC and reflections. :D :D :D

If you are considering Java as "high-level language" by any means, I've created some larger portion of explanation, why it's better to "think again":

http://sektorvanskijlen.wordpress.com/2012/09/14/why-java-is-not-a-high-level-language/

stereomatchingkiss said on Dec 10, 2014 04:06 AM:

About the initialization part


vector<int> v = {1,2,3,5,8,13};


This code looks tight and clean, but it still suffers from the confusion between

vector<int> v(10, 2);

and

vector<int> v{10, 2};

At least I had a gotcha moment about this

@polomora
I have different view, I think c++ is hard to learn(many techniques to learn), but much easier to use than c

mpeg_guy said on Dec 10, 2014 08:19 AM:

Bjarne:

I think the biggest problem with your premise is that C++11 is *not* C++. C++ is what I learned over 20 years ago and was a much better C than C, but a pretty awful OOP language.

I recognize very little of what you describe as "C++". I believe that many of the myths you mention come from those who learned C++ when it was new and not terribly useful (debuggers that worked only for C but not for C++, weird name mangling schemes, etc).

We really ought to call these new languages something else to avoid the confusion.

These are modern languages with truly useful build and debug tools for computer programming. I'm not sure I want my 787 aircraft or Orion space capsule running C++ or C++11, though.

Ray

Fábio Franco said on Dec 10, 2014 10:04 AM:

I still think C should be taught first. In my opinion, C provides a few advantages when preparing the aspiring programmer:

1 - It's syntactically simpler than C++. C++ can get very complex and I don't think syntax should get in the way at this point. It's not the time for object orientation, generics, etc.

2 - You get to learn the nuts and bolts of memory allocation. This is very important to not let the novice getting used to not care about memory allocation, stack and heap. Learning right the first time is much easier than removing bad practices after the damage is done. Of course C++ can also help you with that. But, because C is much more "do it yourself", I believe it's better for learning, instead of using C++ libraries. Later on the novice will have an easier time with C++ and other higher level languages, including auto managed ones.

I'm not a teacher so I do not have that bias. But C was my first language in college. And I'm glad for that. I just wish I had taught myself C instead of VB earlier :-)

Bjarne Stroustrup said on Dec 10, 2014 04:14 PM:

I have bundled some comments rather than answering individual messages.

The comments prove – yet again - that the “Myths” paper was needed. People keep repeating the old hairy rationalizations. Unfortunately, many programmers don’t read long papers and dismiss short ones for being incomplete.

I have observed students being taught by the “C first” approach for many years and seen the programs written by such students for decades. I have taught C++ as the first programming language to many hundreds of students over several years. My claims about the teachability of C++ are based on significant experience, rather than introspection.

I know C and its standard library pretty well. I wrote considerable amounts of C before many of today’s students were even born and contributed significantly to the C language: function prototypes, const, inline, declarations in for-statement, declarations as statements, and more came from my work. I have followed the development of C and the evolution of programming styles in C.

C++ is easier to teach to beginners than C because of a better type system and more notational support. There are also fewer tricks and workarounds to learn. Just imagine how you would teach the styles of programming you use in C using C++; C++’s support for those is better.

I would never dream of giving a beginner’s C++ course that
• didn’t include a thorough grounding in memory management, pointers, etc.
• didn’t give the students a look at “plain C” and some idea of how to use it
• tried to teach all of C++ and every C++ technique.
Similarly, good C teachers do not try to teach all of C and all C techniques to beginners.

http://www.stroustrup.com/programming.html is my answer to the question “How would you teach C++ to beginners?” It works.

For a rather old paper comparing aspects of teachability of C and C++, see B. Stroustrup: Learning Standard C++ as a New Language. C/C++ Users Journal. pp 43-54. May 1999. Today, I could write the C version a bit better and the C++ version quite a bit better. The examples reflect common styles of the time (and were reviewed by expert C and C++ programmers).

C++ today is ISO Standard C++14, rather than what I described 30 years ago or what your teacher may have taught you 20 years ago. Learn C++11/C++14 as supported by current mainstream compilers and get used to it. It is a far better tool than earlier versions of C++. Similarly, C today is ISO Standard C11, rather than K&R C (though I am not sure if the C compilers today are as close to C11 as the C++ compilers are close to C++14).

I am appalled by much that is taught as “good C++.”

C++ is not (and never was meant to be) an “OOP” language. It is a language that supports OOP, other programming techniques, and combinations of such techniques.

C++ has been used for demanding embedded systems and critical systems for years, examples are The Mars Rovers (scene analysis and autonomous operations), The F-35s and F-16s (flight controls), and many, many more: http://www.stroustrup.com/applications.html . And, yes, the Orion space capsule is programmed in C++.

nengels said on Dec 15, 2014 06:40 AM:

Great article! That’s a gift for the holidays =D

I think the best of C++ it’s keep his conceptual principles of design and this is hard to see nowadays. The ideas like "no pay for what you don't use" modeled the language and proved that techniques like a "root" class it's not necessary to write reusable modern code. "Paradigm" agnostic of C++ let him benefits of the best of worlds.

Exciting to see the work of the C++ community and committee to present a more powerful, concise and simple language.

Thanks and Merry Christmas Bjarne!
PS: Please forgive the written errors, English is not my native language, sorry... :-)

// Nicolas Engels

Sektor said on Dec 15, 2014 09:05 AM:

Bjarne,

It would be really nice if everyone can use C++14 right now - especially that there are already compilers on the market that support it.

The reality of software engineering isn't always that bright, though.

For example, in our production environment is that we have some requirements that our product run on Linux CentOS 6.6. The CentOS distribution - a mirror of RedHat - is known of being very clean and stable, however also of having stone-age packages. The latest gcc version provided for repository predicted for that version is 4.4.5, which is C++11 agnostic. Of course, we are considering moving to some later version. In one of our product we even have a library that requires C++11 (gcc practically at least 4.8), so for that occasion I have prepared a kind-of homecooked gcc 4.9 package only for that thing and only for static linkage against libstdc++.

There are also some other products, which require much more portability and some systems have weird compilers, which are more-less C++03 compliant. I wish we could use gcc or clang for them, but it's not so easy.

Luckily the situation isn't like it was with C++98, in case of which the first "enough compliant" compilers appeared on the market something around 2005 year. Many features of C++11 have been added to gcc even before the standard has been closed, same with clang. So, the situation is definitely better. Also many library vendors (such as Qt) have prepared their products to be C++11 friendly.

It doesn't change the fact, though, that the software market reacts slowly for technology changes, especially if it still has lots of legacy software written a long time ago.

InoS Heo said on Dec 26, 2014 02:32 AM:

Hello, Thank you for great article.

>> which version is likely to be the most efficient? Yes, the C++ version,

I have tested the performance of the "compose()" function. It seems that "efficient" depends on whether the compiler supports "short string optimization" or not.

Please see my question on SO if you are interested.

http://stackoverflow.com/questions/27632805/stdstring-performance-for-handling-short-strings

Thanks.