Inheritance — Multiple and Virtual Inheritance

How is this section organized?

This section covers a wide spectrum of questions/answers, ranging from the high-level / strategy / design issues, going all the way down to low-level / tactical / programming issues. We cover them in that order.

Please make sure you understand the high-level / strategy / design issues. Too many programmers worry about getting “it” to compile without first deciding whether they really want “it” in the first place. So please read the first several FAQs in this section before worrying about the (important) mechanical details in the last several FAQs.

Do we really need multiple inheritance?

Not really. We can do without multiple inheritance by using workarounds, exactly as we can do without single inheritance by using workarounds. We can even do without classes by using workarounds. C is a proof of that contention. However, every modern language with static type checking and inheritance provides some form of multiple inheritance. In C++, abstract classes often serve as interfaces and a class can have many interfaces. Other languages – often deemed “not MI” – simply have a separate name for their equivalent to a pure abstract class: an interface. The reason languages provide inheritance (both single and multiple) is that language-supported inheritance is typically superior to workarounds (e.g. use of forwarding functions to sub-objects or separately allocated objects) for ease of programming, for detecting logical problems, for maintainability, and often for performance.

I’ve been told that I should never use multiple inheritance. Is that right?

Grrrrrrrrr.

It really bothers me when people think they know what’s best for your problem even though they’ve never seen your problem!! How can anybody possibly know that multiple inheritance won’t help you accomplish your goals without knowing your goals?!?!?!?!!!

Next time somebody tells you that you should never use multiple inheritance, look them straight in the eye and say, “One size does not fit all.” If they respond with something about their bad experience on their project, look them in the eye and repeat, slower this time, “One size does not fit all.”

People who spout off one-size-fits-all rules presume to make your design decisions without knowing your requirements. They don’t know where you’re going but know how you should get there.

Don’t trust an answer from someone who doesn’t know the question.

So there are times when multiple inheritance isn’t bad?!??

Of course there are!

You won’t use it all the time. You might not even use it regularly. But there are some situations where a solution with multiple inheritance is cheaper to build, debug, test, optimize, and maintain than a solution without multiple inheritance. If multiple inheritance cuts your costs, improves your schedule, reduces your risk, and performs well, then please use it.

On the other hand, just because it’s there doesn’t mean you should use it. Like any tool, use the right tool for the job. If MI (multiple inheritance) helps, use it; if not, don’t. And if you have a bad experience with it, don’t blame the tool. Take responsibility for your mistakes, and say, “I used the wrong tool for the job; it was my fault.” Do not say, “Since it didn’t help my problem, it’s bad for all problems in all industries across all time.” Good workmen never blame their tools.

What are some disciplines for using multiple inheritance?

M.I. rule of thumb #1: Use inheritance only if doing so will remove if / switch statements from the caller code. Rationale: this steers people away from “gratuitous” inheritance (either of the single or multiple variety), which is often a good thing. There are a few times when you’ll use inheritance without dynamic binding, but beware: if you do that a lot, you may have been infected with wrong thinking. In particular, inheritance is not for code-reuse. You sometimes get a little code reuse via inheritance, but the primary purpose for inheritance is dynamic binding, and that is for flexibility. Composition is for code reuse, inheritance is for flexibility. This rule of thumb isn’t specific to MI, but is generic to all usages of inheritance.
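As a rough illustration of rule of thumb #1, here is a minimal sketch in which dynamic binding keeps if / switch statements out of the caller. The Shape, Square, and Rectangle classes and the totalArea() function are hypothetical names invented for this sketch; they do not come from this FAQ.

```cpp
#include <vector>

// Hypothetical hierarchy, used only to illustrate the rule of thumb.
class Shape {
public:
  virtual ~Shape() = default;
  virtual double area() const = 0;
};

class Square : public Shape {
public:
  explicit Square(double side) : side_(side) {}
  double area() const override { return side_ * side_; }
private:
  double side_;
};

class Rectangle : public Shape {
public:
  Rectangle(double width, double height) : width_(width), height_(height) {}
  double area() const override { return width_ * height_; }
private:
  double width_;
  double height_;
};

// Caller code: no if / switch on the kind of shape. Adding a new derived
// class requires no change here.
double totalArea(const std::vector<const Shape*>& shapes)
{
  double sum = 0.0;
  for (const Shape* s : shapes)
    sum += s->area();
  return sum;
}
```

If the caller would never need a function like totalArea(), that is, if nothing would ever be called through a Shape pointer or reference, the rule of thumb suggests that inheritance is probably not earning its keep.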

M.I. rule of thumb #2: Try especially hard to use ABCs when you use MI. In particular, most classes above the join class (and often the join class itself) should be ABCs. In this context, “ABC” doesn’t simply mean “a class with at least one pure virtual function”; it actually means a pure ABC, meaning a class with as little data as possible (often none), and with most (often all) its methods being pure virtual. Rationale: this discipline helps you avoid situations where you need to inherit data or code along two paths, plus it encourages you to use inheritance properly. This second goal is subtle but is extremely powerful. In particular, if you’re in the habit of using inheritance for code reuse (dubious at best; see above), this rule of thumb will steer you away from MI and perhaps (hopefully!) away from inheritance-for-code-reuse in the first place. In other words, this rule of thumb tends to push people toward inheritance-for-interface-substitutability, which is always safe, and away from inheritance-just-to-help-me-write-less-code-in-my-derived-class, which is often (not always) unsafe.

M.I. rule of thumb #3: Consider the “bridge” pattern or nested generalization as possible alternatives to multiple inheritance. This does not imply that there is something “wrong” with MI; it simply implies that there are at least three alternatives, and a wise designer checks out all the alternatives before choosing which is best.

Can you provide an example that demonstrates the above guidelines?

Suppose you have land vehicles, water vehicles, air vehicles, and space vehicles. (Forget the whole concept of amphibious vehicles for this example; pretend they don’t exist for this illustration.) Suppose we also have different power sources: gas powered, wind powered, nuclear powered, pedal powered, etc. We could use multiple inheritance to tie everything together, but before we do, we should ask a few tough questions:

  1. Will the users of LandVehicle need to have a Vehicle& that refers to a LandVehicle object? In particular, will the users call methods on a Vehicle-reference and expect the actual implementation of those methods to be specific to LandVehicles?
  2. Ditto for GasPoweredVehicles: will the users want a Vehicle reference that refers to a GasPoweredVehicle object, and in particular will they want to call methods on that Vehicle reference and expect the implementations to get overridden by GasPoweredVehicle?

If both answers are “yes,” multiple inheritance is probably the best way to go. But before you close the door on the alternatives, here are a few more “decision criteria.” Suppose there are N geographies (land, water, air, space, etc.) and M power sources (gas, nuclear, wind, pedal, etc.). There are at least three choices for the overall design: the bridge pattern, nested generalization, and multiple inheritance. Each has its pros/cons:

  • With the bridge pattern, you create two distinct hierarchies: ABC Vehicle has derived classes LandVehicle, WaterVehicle, etc., and ABC Engine has derived classes GasPoweredEngine, NuclearPoweredEngine, etc. Then the Vehicle has an Engine* (that is, an Engine-pointer), and users mix and match vehicles and engines at run-time. This has the advantage that you only have to write N+M derived classes, which means things grow very gracefully: when you add a new geography (incrementing N) or engine type (incrementing M), you need add only one new derived class. However, you have several disadvantages as well: you only have N+M derived classes, which means you only have at most N+M overrides and therefore N+M concrete algorithms / data structures. If you ultimately want different algorithms and/or data structures in the N×M combinations, you’ll have to work hard to make that happen, and you’re probably better off with something other than a pure bridge pattern. The other thing the bridge doesn’t solve for you is eliminating the nonsensical choices, such as pedal powered space vehicles. You can solve that by adding extra checks when the users combine vehicles and engines at run-time, but it requires a bit of skullduggery, something the bridge pattern doesn’t provide for free. The bridge also restricts users since, although there is a common base class above all geographies (meaning a user can pass any kind of vehicle as a Vehicle&), there is not a common base class above, for example, all gas powered vehicles, and therefore users cannot pass any gas powered vehicle as a GasPoweredVehicle&. Finally, the bridge has the advantage that it shares code between the group of, for example, water vehicles as well as the group of, for example, gas powered vehicles. In other words, the various gas powered vehicles share the code in derived class GasPoweredEngine.
  • With nested generalization, you pick one of the hierarchies as primary and the other as secondary, and you have a nested hierarchy. For example, if you choose geography as primary, Vehicle would have derived classes LandVehicle, WaterVehicle, etc., and those would each have further derived classes, one per power source type. E.g., LandVehicle would have derived classes GasPoweredLandVehicle, PedalPoweredLandVehicle, NuclearPoweredLandVehicle, etc.; WaterVehicle would have a similar set of derived classes, etc. This requires you to write roughly N×M different derived classes, which means things don’t grow gracefully when you increment N or M, but it gives you the advantage over the bridge that you can have N×M different algorithms and data structures. It also gives you fine granular control: the user cannot select nonsensical combinations, such as pedal powered space vehicles, because the user can select only those combinations that a programmer has decided are reasonable. Unfortunately, nested generalization doesn’t solve the problem of passing any gas powered vehicle via a common base class, since there is no common base class above the secondary hierarchy, e.g., there is no GasPoweredVehicle base class. And finally, it’s not obvious how to share code between all vehicles that use the same power source, e.g., between all gas powered vehicles.
  • With multiple inheritance, you have two distinct hierarchies, just like the bridge, but you remove the Engine* from the bridge and instead create roughly N×M derived classes below both the hierarchy of geographies and the hierarchy of power sources. It’s not as simple as this, since you’ll need to change the concept of the Engine classes. In particular, you’ll want to rename the classes in that hierarchy from, for example, GasPoweredEngine to GasPoweredVehicle; plus you’ll need to make corresponding changes to the methods in the hierarchy. In any case, class GasPoweredLandVehicle will multiply inherit from GasPoweredVehicle and LandVehicle, and similarly with GasPoweredWaterVehicle, NuclearPoweredWaterVehicle, etc. Like nested generalization, you have to write roughly N×M classes, which doesn’t grow gracefully, but it does give you fine granular control over both which algorithm and data structures to use in the various derived classes as well as which combinations are deemed “reasonable,” meaning you simply don’t create nonsensical choices like PedalPoweredSpaceVehicle. It solves a problem shared by both bridge and nested generalization, namely it allows a user to pass any gas powered vehicle using a common base class. Finally it provides a solution to the code-sharing problem, a solution that is at least as good as that of the bridge solution: it lets all gas powered vehicles share common code when that is desired. We say this is “at least as good as the solution from the bridge” since, unlike the bridge, the derived classes can share common code within gas powered vehicles, but can also, unlike with the bridge, override and replace that code in cases where the shared code is not ideal.
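To make the first of these three choices concrete, here is a minimal sketch of the bridge design. The class names follow the text above, but the member functions (powerSource(), geography(), describe()) and the non-owning Engine* member are assumptions made for this illustration only.

```cpp
#include <string>

// One hierarchy for power sources...
class Engine {
public:
  virtual ~Engine() = default;
  virtual std::string powerSource() const = 0;
};

class GasPoweredEngine : public Engine {
public:
  std::string powerSource() const override { return "gas"; }
};

class PedalPoweredEngine : public Engine {
public:
  std::string powerSource() const override { return "pedal"; }
};

// ...and a separate hierarchy for geographies.
class Vehicle {
public:
  explicit Vehicle(Engine& engine) : engine_(&engine) {}
  virtual ~Vehicle() = default;
  virtual std::string geography() const = 0;
  std::string describe() const
  {
    return engine_->powerSource() + "-powered " + geography() + " vehicle";
  }
private:
  Engine* engine_;  // the "bridge"; not owned in this sketch
};

class LandVehicle : public Vehicle {
public:
  using Vehicle::Vehicle;
  std::string geography() const override { return "land"; }
};

class SpaceVehicle : public Vehicle {
public:
  using Vehicle::Vehicle;
  std::string geography() const override { return "space"; }
};
```

Only N+M derived classes are written, and vehicles and engines are paired at run-time. Note that nothing in this sketch stops a caller from constructing a SpaceVehicle with a PedalPoweredEngine; rejecting that combination would take an extra run-time check.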

The most important point: there is no universally “best” answer. Perhaps you were hoping I would tell you to always use one or the other of the above choices. I’d be happy to do that except for one minor detail: it would be a lie. If exactly one of the above was always best, then one size would fit all, and we know it does not.

So here’s what you have to do: T H I N K. You’ll have to make a decision. I’ll give you some guidelines, but ultimately you will have to decide what is best (or perhaps “least bad”) for your situation.

Is there a simple way to visualize all these tradeoffs?

Here are some of the “goodness criteria,” that is, qualities you might want. In this description, N is the number of geographies and M is the number of power sources:

  • Grow Gracefully: Does the size of the code-base grow gracefully when you add a new geography or power source? If you add a new geography (going from N to N+1), do you need to add one new chunk of code (best), M new chunks of code (worst), or something in between?
  • Low Code Bulk: Is there a reasonably small amount of code bulk? This usually is proportional to ongoing maintenance cost: the more code, the more cost, all other things being equal. It is also usually related to the “Grow Gracefully” criterion: in addition to the code bulk of the framework proper, best case there would be N+M chunks of code, worst case there would be N×M chunks of code.
  • Fine Grained Control: Do you have fine granular control over the algorithms and data structures? For example, do you have the option of having a different algorithm and/or data structure for any of the N×M possibilities, or are you stuck with using the same algorithm and/or data structure for all, say, gas powered vehicles?
  • Static Detect Bad Combos: Can you statically (“at compile time”) detect and prevent invalid combinations? For example, suppose for the moment that there are no pedal-powered space vehicles. If someone tries to create a pedal-powered space vehicle, can that be detected at compile time (good), or do we need to detect it at run-time?
  • Polymorphic on Both Sides: Does it let users treat either base class polymorphically? In other words, can you create some user code f() that takes any and all, say, land vehicles (where you can add a new kind of land vehicle without requiring any changes to f()), and also create some other user code g() that takes any and all, say, gas powered vehicles (where you can add a new kind of gas powered vehicle without requiring any changes to g())?
  • Share Common Code: Does it let new combinations share common code from either side? For example, when you create a new kind of gas powered land vehicle, can that new class choose to optionally share code that is common to many gas-powered vehicles and choose to optionally share code that is common to many land vehicles?
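The “Polymorphic on Both Sides” and “Share Common Code” criteria can be sketched with multiple inheritance as follows. This is a simplified illustration, not a complete design: the member functions terrain() and refuel() are hypothetical, and the common Vehicle base class is deliberately left out so the sketch stays free of the diamond issue covered later in this section.

```cpp
#include <string>

class LandVehicle {
public:
  virtual ~LandVehicle() = default;
  // Code shared by all land vehicles unless a derived class overrides it.
  virtual std::string terrain() const { return "land"; }
};

class GasPoweredVehicle {
public:
  virtual ~GasPoweredVehicle() = default;
  // Code shared by all gas powered vehicles unless overridden.
  virtual std::string refuel() const { return "fill tank"; }
};

// One of the (roughly) N×M combination classes.
class GasPoweredLandVehicle : public LandVehicle, public GasPoweredVehicle {
public:
  // Reuses LandVehicle::terrain() as-is, replaces the shared refuel() code.
  std::string refuel() const override { return "fill tank at a roadside pump"; }
};

// f() works with any land vehicle; g() works with any gas powered vehicle.
std::string f(const LandVehicle& v)       { return v.terrain(); }
std::string g(const GasPoweredVehicle& v) { return v.refuel(); }
```

A GasPoweredLandVehicle object can be passed to both f() and g(), and neither function needs to change when another combination class is added.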

This matrix shows technologies as rows and “goodness criteria” as columns. SMILE! means the row’s technology meets the column’s goodness criterion; “-” means it does not.

  Technology             Grow Gracefully?  Low Code Bulk?       Fine Grained Control?  Static Detect Bad Combos?  Polymorphic on Both Sides?  Share Common Code?
  Bridge                 SMILE!            SMILE! (N+M chunks)  -                      -                          -                           SMILE!
  Nested generalization  -                 - (N×M chunks)       SMILE!                 SMILE!                     -                           -
  Multiple inheritance   -                 - (N×M chunks)       SMILE!                 SMILE!                     SMILE!                      SMILE!

Important: do not be naive. Do not simply add up the number of SMILE!s, choosing based on most good or least bad. THINK!!

  1. The first step is to think about whether your particular situation has other design options, that is, additional rows.
    • Recall that the “bridge” row is really a pair of rows: it has an asymmetry that could go in either direction. In other words, one could put an Engine* in Vehicle or a Vehicle* in Engine (or both, or some other way to pair them up, such as a small object that contains just a Vehicle* and an Engine*).
    • Similar comments for the nested generalization row: it is actually a pair of rows because it also has an asymmetry, and that asymmetry gives you an extra option: you could first decompose by geography (land, water, etc.) or first by power source (gas, nuclear, etc.). These two orders yield two distinct designs with distinct tradeoffs.
  2. The second step in using the above matrix is to think about which column is most important for your particular situation. This will let you give a “weight” or “importance” to each column.
    • For example, in your particular situation, the amount of code that must get written (second column) may be more or less important than the fine grained control over algorithms/data structures. Do not get caught up trying to figure out which column is more important in some abstract, generic, one-size-fits-all view of the world, because one size does not fit all!!
    • Question: is code bulk (and therefore maintenance cost) more or less important than fine grained control? Answer: Yes, code bulk (and therefore maintenance cost) is either more or less important than fine grained control. That’s a joke; lighten up.
    • But this part isn’t a joke: don’t trust anyone who thinks they know whether code bulk (and therefore maintenance cost) is always more or always less important than fine grained control. There’s no way to know until you look at all the requirements and constraints on your particular situation! Far too many programmers think they know the answer before they are familiar with the situation. That’s worse than dumb; it is unprofessional and dangerous. Their one-size-fits-all answer will sometimes be right. Their one-size-fits-all answer might have been right in every case they have ever seen in their limited range of experience. But if their past success blinds them from asking the tough questions in the future, they are a danger to your project and should get a thonk on the noggin (“thonk” and “noggin” are highly technical terms).

Your ultimate choice will be made by finding out which approach is best for your situation. One size does not fit all — do not expect the answer in one project to be the same as the answer in another project. Your past successes can become, if you are not careful, the seeds of your future failure. Just because “it” was best on your previous project does not mean “it” will be best on your next project.

Can you give another example to illustrate the above disciplines?

This second example is only slightly different from the previous since it is more obviously symmetric. This symmetry tilts the scales slightly toward the multiple inheritance solution, but one of the others still might be best in some situations.

In this example, we have only two categories of vehicles: land vehicles and water vehicles. Then somebody points out that we need amphibious vehicles. Now we get to the good part: the questions.

  1. Do we even need a distinct AmphibiousVehicle class? Is it also viable to use one of the other classes with a “bit” indicating the vehicle can be both in water and on land? Just because “the real world” has amphibious vehicles doesn’t mean we need to mimic that in software.
  2. Will the users of LandVehicle need to use a LandVehicle& that refers to an AmphibiousVehicle object? Will they need to call methods on the LandVehicle& and expect the actual implementation of those methods to be specific to (“overridden in”) AmphibiousVehicle?
  3. Ditto for water vehicles: will the users want a WaterVehicle& that might refer to an AmphibiousVehicle object, and in particular to call methods on that reference and expect the implementation will get overridden by AmphibiousVehicle?

If we get three “yes” answers, multiple inheritance is probably the right choice. To be sure, you should ask the other questions as well, e.g., the grow-gracefully issue, the granularity of control issues, etc.

What is the “dreaded diamond”?

The “dreaded diamond” refers to a class structure in which a particular class appears more than once in a class’s inheritance hierarchy. For example,

class Base {
public:
  // ...
protected:
  int data_;
};

class Der1 : public Base { /*...*/ };

class Der2 : public Base { /*...*/ };

class Join : public Der1, public Der2 {
public:
  void method()
  {
     data_ = 1;  // Bad: this is ambiguous; see below
  }
};

int main()
{
  Join* j = new Join();
  Base* b = j;   // Bad: this is ambiguous; see below
}

Forgive the ASCII-art, but the inheritance hierarchy looks something like this:

                         Base
                         /  \
                        /    \
                       /      \
                    Der1      Der2
                       \      /
                        \    /
                         \  /
                         Join

Before we explain why the dreaded diamond is dreaded, it is important to note that C++ provides techniques to deal with each of the “dreads.” In other words, this structure is often called the dreaded diamond, but it really isn’t dreaded; it’s more just something to be aware of.

The key is to realize that Base is inherited twice, which means any data members declared in Base, such as data_ above, will appear twice within a Join object. This can create ambiguities: which data_ did you want to change? For the same reason the conversion from Join* to Base*, or from Join& to Base&, is ambiguous: which Base class subobject did you want?

C++ lets you resolve the ambiguities. For example, instead of saying data_ = 1 you could say Der2::data_ = 1, or you could convert from Join* to a Der1* and then to a Base*. However please, Please, PLEASE think before you do that. That is almost always not the best solution. The best solution is typically to tell the C++ compiler that only one Base subobject should appear within a Join object, and that is described next.
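For completeness, here is a small sketch of what the explicit disambiguation looks like, using the same Base / Der1 / Der2 / Join names as above. The accessor functions der1Copy() and der2Copy(), the helpers baseViaDer1() and baseViaDer2(), and the default initializer on data_ are additions made for this illustration.

```cpp
class Base {
protected:
  int data_ = 0;
};

class Der1 : public Base {
public:
  int der1Copy() const { return data_; }  // the data_ inherited via Der1
};

class Der2 : public Base {
public:
  int der2Copy() const { return data_; }  // the data_ inherited via Der2
};

class Join : public Der1, public Der2 {
public:
  void method()
  {
    Der2::data_ = 1;  // OK: explicitly picks the copy inherited via Der2
  }
};

// Converting in two steps picks a specific Base subobject.
Base* baseViaDer1(Join* j) { Der1* d1 = j; return d1; }
Base* baseViaDer2(Join* j) { Der2* d2 = j; return d2; }
```

After method() runs, only the copy reached through Der2 has changed, and the two helper functions return different addresses for the same Join object. That is exactly the kind of surprise that makes this approach something to think twice about.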

Where in a hierarchy should I use virtual inheritance?

Just below the top of the diamond, not at the join-class.

To avoid the duplicated base class subobject that occurs with the “dreaded diamond”, you should use the virtual keyword in the inheritance part of the classes that derive directly from the top of the diamond:

class Base {
public:
  // ...
protected:
  int data_;
};

class Der1 : public virtual Base {  // "virtual" here is the key
public:
  // ...
};

class Der2 : public virtual Base {  // "virtual" here is the key
public:
  // ...
};

class Join : public Der1, public Der2 {
public:
  void method()
  {
     data_ = 1;  // Good: this is now unambiguous
  }
};

int main()
{
  Join* j = new Join();
  Base* b = j;   // Good: this is now unambiguous
}

Because of the virtual keyword in the base-class portion of Der1 and Der2, an instance of Join will have only a single Base subobject. This eliminates the ambiguities. This is usually better than using full qualification as described in the previous FAQ.

For emphasis, the virtual keyword goes in the base-class lists of Der1 and Der2, that is, on the inheritance links just above them. It doesn’t help to put the virtual keyword in the Join class itself. In other words, you have to know that a join class will exist when you are creating classes Der1 and Der2.

                         Base
                         /  \
                        /    \
               virtual /      \ virtual
                    Der1      Der2
                       \      /
                        \    /
                         \  /
                         Join
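One way to check where the keyword belongs is to count how many times the Base constructor runs for each placement. The sketch below is an illustration only: the counter baseCtorCalls, the helper baseSubobjectsIn(), and the class names BadJoin, VDer1, VDer2, and GoodJoin are all invented for this purpose.

```cpp
int baseCtorCalls = 0;  // incremented once per Base subobject constructed

class Base {
public:
  Base() { ++baseCtorCalls; }
};

// Wrong place: "virtual" only at the join class.
class Der1 : public Base {};
class Der2 : public Base {};
class BadJoin : public virtual Der1, public virtual Der2 {};

// Right place: "virtual" just below the top of the diamond.
class VDer1 : public virtual Base {};
class VDer2 : public virtual Base {};
class GoodJoin : public VDer1, public VDer2 {};

// Constructs one T and reports how many Base subobjects it contained.
template <class T>
int baseSubobjectsIn()
{
  baseCtorCalls = 0;
  T object;
  (void)object;
  return baseCtorCalls;
}
```

With these definitions, baseSubobjectsIn<BadJoin>() returns 2 and baseSubobjectsIn<GoodJoin>() returns 1.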

What does it mean to “delegate to a sister class” via virtual inheritance?

Consider the following example:

class Base {
public:
  virtual void foo() = 0;
  virtual void bar() = 0;
};

class Der1 : public virtual Base {
public:
  virtual void foo();
};

void Der1::foo()
{ bar(); }

class Der2 : public virtual Base {
public:
  virtual void bar();
};

class Join : public Der1, public Der2 {
public:
  // ...
};

int main()
{
  Join* p1 = new Join();
  Der1* p2 = p1;
  Base* p3 = p1;

  p1->foo();
  p2->foo();
  p3->foo();
}

Believe it or not, when Der1::foo() calls this->bar(), it ends up calling Der2::bar(). Yes, that’s right: a class that Der1 knows nothing about will supply the override of a virtual function invoked by Der1::foo(). This “cross delegation” can be a powerful technique for customizing the behavior of polymorphic classes.
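To watch the cross delegation happen, here is a variation on the example above in which foo() and bar() return strings instead of void, so the call path is visible. The string return type and the virtual destructor are changes made for this sketch; the class structure is otherwise the same.

```cpp
#include <string>

class Base {
public:
  virtual ~Base() = default;
  virtual std::string foo() = 0;
  virtual std::string bar() = 0;
};

class Der1 : public virtual Base {
public:
  // Der1 knows nothing about Der2, yet bar() may land there.
  std::string foo() override { return "Der1::foo -> " + bar(); }
};

class Der2 : public virtual Base {
public:
  std::string bar() override { return "Der2::bar"; }
};

// Join supplies no code of its own; it only brings the two sisters together.
class Join : public Der1, public Der2 {
};
```

Calling foo() on a Join object, whether through a Join, Der1, or Base reference, yields "Der1::foo -> Der2::bar".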

What special considerations do I need to know about when I use virtual inheritance?

Generally, virtual base classes are most suitable when the classes that derive from the virtual base, and especially the virtual base itself, are pure abstract classes. This means the classes above the “join class” have very little if any data.

Note: even if the virtual base itself is a pure abstract class with no member data, you still probably don’t want to remove the virtual inheritance within classes Der1 and Der2. If you did remove it, you could use fully qualified names to resolve any ambiguities that arise, and you might even be able to squeeze out a few cycles in some cases. However, the object’s address would be somewhat ambiguous (there would be two Base class subobjects in the Join object), so simple things like trying to find out whether two pointers point at the same instance might be tricky. Just be careful. Very careful.
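The pointer-identity problem can be sketched as follows, assuming the virtual keyword has been removed from Der1 and Der2. The helper names (baseViaDer1, baseViaDer2, samePointer, sameCompleteObject) are hypothetical. The last helper relies on dynamic_cast to void*, which yields the address of the most derived object and requires a polymorphic class.

```cpp
class Base {
public:
  virtual ~Base() = default;
  virtual void foo() = 0;
};

// No "virtual" here, so a Join object contains two Base subobjects.
class Der1 : public Base { public: void foo() override {} };
class Der2 : public Base { public: void foo() override {} };

class Join : public Der1, public Der2 {
public:
  void foo() override {}
};

const Base* baseViaDer1(const Join& j) { return static_cast<const Der1*>(&j); }
const Base* baseViaDer2(const Join& j) { return static_cast<const Der2*>(&j); }

// Naive comparison: looks at the Base subobject addresses only.
bool samePointer(const Base* a, const Base* b) { return a == b; }

// Compares the addresses of the complete (most derived) objects instead.
bool sameCompleteObject(const Base* a, const Base* b)
{
  return dynamic_cast<const void*>(a) == dynamic_cast<const void*>(b);
}
```

For a single Join object j, samePointer(baseViaDer1(j), baseViaDer2(j)) is false even though both pointers refer to the same instance, while sameCompleteObject() reports true.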

What special considerations do I need to know about when I inherit from a class that uses virtual inheritance?

The initialization list of the most-derived class’s ctor directly invokes the virtual base class’s ctor.

Because a virtual base class subobject occurs only once in an instance, there are special rules to make sure the virtual base class’s constructor and destructor get called exactly once per instance. The C++ rules say that virtual base classes are constructed before all non-virtual base classes. The thing you as a programmer need to know is this: constructors for virtual base classes anywhere in your class’s inheritance hierarchy are called by the “most derived” class’s constructor.

Practically speaking, this means that when you create a concrete class that has a virtual base class, you must be prepared to pass whatever parameters are required to call the virtual base class’s constructor. And, of course, if there are several virtual base classes anywhere in your class’s ancestry, you must be prepared to call all their constructors. This might mean that the most-derived class’s constructor needs more parameters than you might otherwise think.

However, if the author of the virtual base class followed the guideline in the previous FAQ, then the virtual base class’s constructor probably takes no parameters since it doesn’t have any data to initialize. This means (fortunately!) the authors of the concrete classes that inherit eventually from the virtual base class do not need to worry about taking extra parameters to pass to the virtual base class’s ctor.
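Here is a minimal sketch of the rule for the less fortunate case where the virtual base class does take a constructor parameter. The int id parameter and the id() accessor are assumptions made for this illustration.

```cpp
class Base {
public:
  explicit Base(int id) : id_(id) {}
  virtual ~Base() = default;
  int id() const { return id_; }
private:
  int id_;
};

class Der1 : public virtual Base {
public:
  Der1() : Base(1) {}  // used only when Der1 itself is the most derived class
};

class Der2 : public virtual Base {
public:
  Der2() : Base(2) {}  // used only when Der2 itself is the most derived class
};

class Join : public Der1, public Der2 {
public:
  // The most derived class must initialize the virtual base directly;
  // the Base(1) and Base(2) initializers above are ignored for a Join.
  Join() : Base(3), Der1(), Der2() {}
};
```

A Join object reports id() == 3, whereas a standalone Der1 object reports id() == 1. If Join omitted Base(3), the code would not compile, because Base has no default constructor.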

What special considerations do I need to know about when I use a class that uses virtual inheritance?

No C-style downcasts; use dynamic_cast instead.

(Rest to be written.)
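Until the rest is written, here is a small sketch of the short answer above, reusing the Base / Der1 / Der2 / Join names from earlier FAQs. The asJoin() helper and the value() member are hypothetical.

```cpp
class Base {
public:
  virtual ~Base() = default;  // dynamic_cast needs a polymorphic class
};

class Der1 : public virtual Base {};
class Der2 : public virtual Base {};

class Join : public Der1, public Der2 {
public:
  int value() const { return 42; }
};

// Downcast from the virtual base. A static_cast<const Join*>(b) would not
// compile here, because Base is a virtual base class of Join.
const Join* asJoin(const Base* b)
{
  return dynamic_cast<const Join*>(b);  // null if *b is not part of a Join
}
```

asJoin() returns the original address when the Base pointer refers to a Join object, and a null pointer when it refers to, say, a standalone Der1.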

One more time: what is the exact order of constructors in a multiple and/or virtual inheritance situation?

The very first constructors to be executed are those of the virtual base classes anywhere in the hierarchy. They are executed in the order they appear in a depth-first left-to-right traversal of the graph of base classes, where left to right refers to the order of appearance of base class names.

After all virtual base class constructors are finished, the construction order is generally from base class to derived class. The details are easiest to understand if you imagine that the very first thing the compiler does in the derived class’s ctor is to make a hidden call to the ctors of its non-virtual base classes (hint: that’s the way many compilers actually do it). So if class D inherits multiply from B1 and B2, the constructor for B1 executes first, then the constructor for B2, then the constructor for D. This rule is applied recursively; for example, if B1 inherits from B1a and B1b, and B2 inherits from B2a and B2b, then the final order is B1a, B1b, B1, B2a, B2b, B2, D.

Note that the order B1 and then B2 (or B1a then B1b) is determined by the order that the base classes appear in the declaration of the class, not in the order that the initializer appears in the derived class’s initialization list.
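The ordering rules can be observed with a small tracing sketch. The class names D, B1, B2, B1a, B1b, B2a, and B2b follow the text above; the virtual base V (attached to B2), the global order string, and the note() and constructionOrder() helpers are assumptions added for this illustration.

```cpp
#include <string>

std::string order;  // records constructor calls for this sketch

void note(const char* name)
{
  if (!order.empty()) order += ' ';
  order += name;
}

struct V   { V()   { note("V"); } };
struct B1a { B1a() { note("B1a"); } };
struct B1b { B1b() { note("B1b"); } };
struct B2a { B2a() { note("B2a"); } };
struct B2b { B2b() { note("B2b"); } };

struct B1 : B1a, B1b            { B1() { note("B1"); } };
struct B2 : B2a, B2b, virtual V { B2() { note("B2"); } };  // V is a virtual base
struct D  : B1, B2              { D()  { note("D"); } };

// Builds one D and returns the order in which constructors ran.
std::string constructionOrder()
{
  order.clear();
  D d;
  (void)d;
  return order;
}
```

constructionOrder() returns "V B1a B1b B1 B2a B2b B2 D": the virtual base comes first even though it is declared last in B2’s base list, and the rest follows the declaration order of the base classes.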

What is the exact order of destructors in a multiple and/or virtual inheritance situation?

Short answer: the exact opposite of the constructor order.

Long answer: suppose the “most derived” class is D, meaning the actual object that was originally created was of class D, and that D inherits multiply (and non-virtually) from B1 and B2. The destructor for most-derived class D runs first, followed by the dtors for its non-virtual base classes in reverse declaration-order. Thus the destructor order will be D, B2, B1. This rule is applied recursively; for example, if B1 inherits from B1a and B1b, and B2 inherits from B2a and B2b, the final order is D, B2, B2b, B2a, B1, B1b, B1a.

After all this is finished, virtual base classes that appear anywhere in the hierarchy are handled. The destructors for these virtual base classes are executed in the reverse order they appear in a depth-first left-to-right traversal of the graph of base classes, where left to right refers to the order of appearance of base class names. For instance, if the virtual base classes in that traversal order are V1, V1, V1, V2, V1, V2, V2, V1, V3, V1, V2, the unique ones are V1, V2, V3, and the final-final order is D, B2, B2b, B2a, B1, B1b, B1a, V3, V2, V1.
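The same kind of tracing sketch shows the destructor order. As before, the single virtual base V (attached to B2), the global order string, and the note() and destructionOrder() helpers are assumptions added for this illustration; the other class names follow the text above.

```cpp
#include <string>

std::string order;  // records destructor calls for this sketch

void note(const char* name)
{
  if (!order.empty()) order += ' ';
  order += name;
}

struct V   { ~V()   { note("V"); } };
struct B1a { ~B1a() { note("B1a"); } };
struct B1b { ~B1b() { note("B1b"); } };
struct B2a { ~B2a() { note("B2a"); } };
struct B2b { ~B2b() { note("B2b"); } };

struct B1 : B1a, B1b            { ~B1() { note("B1"); } };
struct B2 : B2a, B2b, virtual V { ~B2() { note("B2"); } };  // V is a virtual base
struct D  : B1, B2              { ~D()  { note("D"); } };

// Creates and destroys one D, then returns the order in which destructors ran.
std::string destructionOrder()
{
  order.clear();
  {
    D d;
    (void)d;
  }  // d is destroyed here
  return order;
}
```

destructionOrder() returns "D B2 B2b B2a B1 B1b B1a V", the exact reverse of the construction order for the same hierarchy, with the virtual base destroyed last.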

Remember to make your base class’s destructor virtual, at least in the normal case. If you don’t thoroughly understand the rules for why you make your base class’s destructor virtual, then either learn the rationale or just trust me and make it virtual.