Okay, you’ve ended up in a situation where you’re either using dynamic_cast* to find out the actual type of a pointer before doing what you really want, or checking some enum in a base class to see which derived class it really is so that you can perform type-specific operations on it.

* or some other form of run-time type information

You know something is wrong with this, but you can’t think of a satisfactory solution. You’ve already considered adding another virtual function on the base class, but you’re worried about how many virtual functions are already there, or the impact that this might have on consumers of the base class. Good on you for recognizing that. Admitting you have a problem is the first step to solving it 🙂 In this tutorial, I’ll talk about one solution to this problem I’ve had some success with: the double-dispatch Visitor pattern. With it, you can trim down those long if-else if blocks, separate responsibility into manageable pieces, and even stabilize your interface better.

How did you get into this mess?

At first, a pointer to a base class made sense; you didn’t need to know the actual derived class. So you decided to expose a single collection of base class pointers to your clients like so:

struct Animal { virtual std::string Noise() const = 0; virtual ~Animal() = default; };

using AnimalCollection = std::vector<Animal*>;

As you added your first few classes, your assumptions were validated; you never needed to know the actual type.

struct Cat : public Animal { std::string Noise() const override{ return "meow"; } }; struct Dog : public Animal { std::string Noise() const override{ return "woof"; } };

But requirements change. One day the client came to you and said "I’m trying to model a person that is afraid of dogs, so they run away when they see one. But they love cats, so they try to pet them when they see them." Dammit. Now your assumptions are wrong. You do need to know the type. And you’re under pressure to meet a deadline. Then you thought "Well there’s only two types of animals, this isn’t so bad." So you wrote code like this:

void Person::ReactTo(Animal* _animal){ if (dynamic_cast<Dog*>(_animal)) RunAwayFrom(_animal); else if (dynamic_cast<Cat*>(_animal)) TryToPet(_animal); }

"I’ll come up with something better later," you told yourself as you moved onto something more fun. Then the client said that if the Animal was a Horse , they wanted to try to ride it. "Okay, that’s doable I guess…" you thought to yourself. So you updated your Person::ReactTo code:

void Person::ReactTo(Animal* _animal){ if (dynamic_cast<Dog*>(_animal)) RunAwayFrom(_animal); else if (dynamic_cast<Cat*>(_animal)) TryToPet(_animal); else if (dynamic_cast<Horse*>(_animal)) TryToRide(_animal); }

At this point you didn’t like working with your own code. We’ve all been there. Nonetheless, the trend continued for some time until you found yourself with a mess like this:

void Person::ReactTo(Animal* _animal){
    if (dynamic_cast<Dog*>(_animal) || dynamic_cast<Gerbil*>(_animal))
    {
        if (dynamic_cast<Dog*>(_animal) && dynamic_cast<Dog*>(_animal)->GetBreed() == DogBreed::Daschund) // Dachshunds are the exception
            TryToPet(_animal);
        else
            RunAwayFrom(_animal);
    }
    else if (dynamic_cast<Cat*>(_animal) || dynamic_cast<Pig*>(_animal))
        TryToPet(_animal);
    else if (dynamic_cast<Horse*>(_animal))
        TryToRide(_animal);
    else if (dynamic_cast<Lizard*>(_animal))
        TryToFeed(_animal);
    else if (dynamic_cast<Mole*>(_animal))
        Attack(_animal);
    // etc.
}

"This list is getting pretty long," you thought to yourself one day. "All these dynamic casts seem wrong, and they’re kind of slow." Because it was slow and unwieldy, your team was finally given some leeway to refactor. Then they went and committed this sin (which I’m going to call "manual RTTI"):

enum Animal_Type {DOG = 0, CAT, PIG, LIZARD, HORSE, /*...*/};

struct Animal {
    //...
    virtual Animal_Type GetType() const = 0;
};

…and then each derived class:

Animal_Type Cat::GetType() const {return Animal_Type::CAT;} Animal_Type Dog::GetType() const {return Animal_Type::DOG;} // etc.

(If your team elected to have GetType() return a std::string instead of an enum, you have my deepest sympathies.) It is so, so tempting to do this. On the surface, the code gets cleaner and faster:

void Person::ReactTo(Animal* _animal){
    if (_animal->GetType() == DOG || _animal->GetType() == GERBIL)
    {
        if (_animal->GetType() == DOG && static_cast<Dog*>(_animal)->GetBreed() == DogBreed::Daschund) // Dachshunds are the exception to the phobia
            TryToPet(_animal);
        else
            RunAwayFrom(_animal);
    }
    else if (_animal->GetType() == CAT || _animal->GetType() == PIG)
        TryToPet(_animal);
    else if (_animal->GetType() == HORSE)
        TryToRide(_animal);
    else if (_animal->GetType() == LIZARD)
        TryToFeed(_animal);
    else if (_animal->GetType() == MOLE)
        Attack(_animal);
    // etc.
}

I’ve seen this happen over and over again as inheritance hierarchies evolve. Do not be seduced by this pattern because it will cause more problems in the long run!

Why is it so bad?

Let’s ignore the fact that your team basically reimplemented the typeid() operator. It’s a little-known aspect of the language that the typeid operator provides the same functionality as your new enum .

// if (_animal->GetType() == CAT)
if (typeid(*_animal) == typeid(Cat))
    TryToPet(_animal);

Even though typeid() is still RTTI, it is more efficient than a dynamic_cast in practice. In my own experiments it’s about 3x faster. But like I said, let’s ignore that. The hand-rolled virtual GetType() (manual RTTI) approach is still about 30% faster than the typeid() approach, so don’t feel too bad. (Timings: http://coliru.stacked-crooked.com/a/146255f5da7329a6)
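One caveat if you do swap a dynamic_cast for typeid: the two are not quite interchangeable. typeid compares the exact dynamic type, while dynamic_cast also succeeds for anything derived from the target type. Here’s a minimal sketch of the difference; Kitten and the two helper functions are hypothetical names invented for illustration and are not part of the example above.

```cpp
#include <cassert>
#include <typeinfo>

struct Animal { virtual ~Animal() = default; };
struct Cat : Animal {};
struct Kitten : Cat {};  // hypothetical subclass, for illustration only

// True only when the dynamic type is exactly Cat.
bool IsExactlyCat(const Animal* _animal) {
    assert(_animal != nullptr);  // typeid(*p) on a null pointer throws std::bad_typeid
    return typeid(*_animal) == typeid(Cat);
}

// True when the dynamic type is Cat or anything derived from Cat.
bool IsSomeKindOfCat(const Animal* _animal) {
    return dynamic_cast<const Cat*>(_animal) != nullptr;
}
```

So a Kitten passes the dynamic_cast check but fails the typeid check, which matters if your hierarchy is more than one level deep.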

The manual RTTI approach is unsatisfactory because it trades a bit of speed for a lot of maintenance. Now you have two problems:

1. You still have a long list of if…else if.
2. Every time you add a new type, you are affecting the interface of the base class, and therefore every other type in the chain, because they all depend on the Animal_Type enum that you’re appending to. This will result in your client being surprised (not in a good way) to find that they need to recompile just because you added a new derived class (that the client doesn’t care about). They’re upset because you violated one of the core tenets of C++: "Don’t pay for what you don’t use."

You also still have a performance issue thanks to the arbitrary ordering of if...else if statements, which means iterating over many Animals is a bit slower than you’d like it to be; your branch predictor cannot help you here.

A little more advanced

Hopefully at this point you realized that something more needed to be done about your class design to support better separation of responsibilities. Or perhaps you doubled-down on this nightmare and realized that you could create a table of Animal_Type enumerations to function pointers to speed things up a bit and clean up the Person::ReactTo code like so:

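using function_type = void(Person::*)(Animal*);
using ReactionHash = std::unordered_map<Animal_Type, function_type>;

const ReactionHash& Person::GetReactionFunctions() const{
    // built once; returning a reference to a temporary map would dangle
    static const ReactionHash reactions{
        {CAT, &Person::TryToPet},
        {DOG, &Person::RunAwayFrom},
        {PIG, &Person::TryToPet},
        // etc.
    };
    return reactions;
}

void Person::ReactTo(Animal* _animal){
    const auto& reactionFunctions = GetReactionFunctions();
    // at() because operator[] is not available on a const map
    (this->*reactionFunctions.at(_animal->GetType()))(_animal);
}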

Congratulations, you just implemented your own virtual table. Poorly.

- You only moved the complexity of ReactTo to another function
  - GetReactionFunctions now has a long list of function pointers
- Adding another derived class still affects the base class
- You’ve sacrificed quite a bit of granular control.
  - Note that we now forget to check that a dog is a Daschund. We simply run away. You’ll need to add more complexity to model this scenario (another function).
  - Modeling a "do nothing" scenario requires that you either add an empty function or perform a check to see if any function exists.
- You may not have received any performance gain at all, thanks to the overhead of hashing and lookup.
- You are paying a little extra in memory to store your lookup table

A Single-Dispatch Visitor

In the manual RTTI scenario above, we’re using the virtual GetType() function to discover run time type information, and then using that type information to decide which logic to execute. Let’s cut out the middle-man and stop at "use a virtual function". At the point of a virtual function call, the derived class already knows it is being called. We can leverage this knowledge to clean things up a bit. (There will still be some major flaws with this approach, or else I wouldn’t need to talk about double dispatch later.) In a single dispatch visitor, we create a new type to represent Person::ReactTo:

struct ReactionVisitor {
    explicit ReactionVisitor(Person* _person) : person_{_person} {}
    Person* person_ = nullptr; // person doing the reacting
};

And then we add a new virtual void Visit() function to our Animal class (getting rid of our redundant GetType method)

struct Animal { virtual std::string Noise() const = 0; virtual ~Animal() = default; virtual void Visit(ReactionVisitor& _visitor) = 0; };

And now we implement Visit on derived classes like so:

void Dog::Visit(ReactionVisitor& _visitor){
    Person* personWhoIsReacting = _visitor.person_;
    if (my_breed == DogBreed::Daschund)
        personWhoIsReacting->TryToPet(this);
    else
        personWhoIsReacting->RunAwayFrom(this);
}

Person::ReactTo now looks like this

void Person::ReactTo(Animal* _animal){ ReactionVisitor visitor{this}; _animal->Visit(visitor); }
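Putting the single-dispatch pieces together, a compilable sketch might look like the following. It is trimmed to two animals, drops Noise() and the breed check, and adds a hypothetical lastReaction_ member to Person purely so the outcome can be inspected; none of that is required by the pattern itself.

```cpp
#include <cassert>
#include <string>

struct Animal;

struct Person {
    void TryToPet(Animal*)    { lastReaction_ = "pet"; }
    void RunAwayFrom(Animal*) { lastReaction_ = "run"; }
    void ReactTo(Animal* _animal);
    std::string lastReaction_;  // hypothetical: records the last reaction for inspection
};

struct ReactionVisitor {
    explicit ReactionVisitor(Person* _person) : person_{_person} {}
    Person* person_ = nullptr;  // person doing the reacting
};

struct Animal {
    virtual ~Animal() = default;
    virtual void Visit(ReactionVisitor& _visitor) = 0;
};

// Each animal decides how the person reacts to it (the flaw discussed next).
struct Cat : Animal {
    void Visit(ReactionVisitor& _visitor) override { _visitor.person_->TryToPet(this); }
};

struct Dog : Animal {
    void Visit(ReactionVisitor& _visitor) override { _visitor.person_->RunAwayFrom(this); }
};

void Person::ReactTo(Animal* _animal) {
    assert(_animal != nullptr);
    ReactionVisitor visitor{this};
    _animal->Visit(visitor);  // the one and only virtual dispatch
}
```

Note that there is no cast and no if…else if chain anywhere; the virtual call picks the right branch.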

On one hand this is great: we’ve recaptured our ability to write nuanced code on a per-animal basis. We’ve also completely removed any derived class information from the base class. On top of this, we have successfully separated all the per-animal reaction logic, so the code is easier to read. It’s fairly easy to add a new derived Animal class and enforce that Person has to react to it in some way. If we want to allow for a default action, then we can implement the virtual function on the base Animal class itself. Performance-wise, it’s the same as our manual RTTI approach. (Timings: http://coliru.stacked-crooked.com/a/7fa5e3fca9149bd8, no statistically significant difference)

There’s one major problem with the single dispatch approach: why should the Dog class dictate how a Person reacts to it? We have leaked implementation details of the Person class and therefore have violated encapsulation.

A secondary concern is this: what if the Person class has other behaviors it wants to implement? Are we really going to add a new virtual method on the base class for each of them? (Note: "yes" may be a valid answer depending on your situation.) We’ve separated responsibility, but at the cost of being extremely invasive to the Animal classes.

Is it possible for us to get the benefits of single dispatch visitation (fast performance, separation of responsibility) while maintaining encapsulation and without being invasive? Yes! With double dispatch.

A Double-Dispatch Visitor

"All problems in computer science can be solved by another level of indirection." – David Wheeler

Hang on, things are about to get a little weird. At present, we have created what I’ve called a "single dispatch" visitor. Classes that derive from Animal override a virtual void Visit method that receives a ReactionVisitor in order for us to separate the animal-specific logic. The problem is that the animals themselves are now responsible for Person-specific logic. By also giving our ReactionVisitor a virtual Visit method, we can address this shortcoming.

struct AnimalVisitor {
    virtual void Visit(Animal*) = 0;
};

struct ReactionVisitor : AnimalVisitor {
    void Visit(Animal*) override {
        // ????
    }
    Person* person_ = nullptr; // person doing the reacting
};

This doesn’t appear very useful…yet. We still want:

1. Encapsulation, while maintaining
2. Separation of responsibilities

Starting with #1, let’s make ReactionVisitor a friend of our Person class:

struct Person { /*...*/ friend ReactionVisitor; };

This is one of the use cases where a friend class makes perfect sense. Logically, ReactionVisitor is part of the same class as Person, so it should have access to Person’s private members. Making it a friend (vs a nested class) allows us to cleanly separate the interfaces, de-clutter the Person class, and as a bonus leaves the ReactionVisitor directly testable. This achieves encapsulation, but so far our AnimalVisitor::Visit method only knows about the base Animal class, which doesn’t really help with separation of responsibility; we’re back to square one. So let’s do the obvious thing and make AnimalVisitor know about each Animal type.

struct AnimalVisitor { virtual void Visit(Cat*) = 0; virtual void Visit(Dog*) = 0; /*...*/ }; struct ReactionVisitor : public AnimalVisitor { void Visit(Cat*) override; void Visit(Dog*) override; Person* person_ = nullptr; };

Ignoring any potential issues with this approach, how do we make it work? That is, how do we ultimately call ReactionVisitor::Visit(Cat*) ? First, we reclaim the ReactTo logic from the derived classes and give it to ReactionVisitor instead:

struct ReactionVisitor : public AnimalVisitor {
    void Visit(Dog* _dog) override{
        if (_dog->GetBreed() == DogBreed::Daschund)
            person_->TryToPet(_dog);
        else
            person_->RunAwayFrom(_dog);
    }
    //... (other overridden Visit methods)
    Person* person_ = nullptr; // person doing the reacting
};

In this way we preserve separation of responsibility without leaking implementation details for Person. Next, the entire Animal::Visit hierarchy is refactored so that derived classes call through to the visitor’s Visit method with the this pointer (I’ll explain why shortly):

void Cat::Visit(AnimalVisitor* _visitor){ // overridden virtual method
    _visitor->Visit(this);
}

void Dog::Visit(AnimalVisitor* _visitor){ // overridden virtual method
    _visitor->Visit(this);
}

Finally, we can leave Person::ReactTo almost untouched; the only change is that it now passes the visitor by pointer:

void Person::ReactTo(Animal* _animal){ ReactionVisitor visitor{this}; _animal->Visit(&visitor); }

How does double dispatch work?

…I warned you things would get weird. It can be hard to wrap your head around without visualization. We have 2 virtual dispatch calls:

From Person::ReactTo , we call Animal::Visit , which will dispatch to the appropriate overridden Visit ( Cat::Visit , Dog::Visit , etc.)

From the overridden Cat::Visit(AnimalVisitor*) , we call AnimalVisitor::Visit , which will dispatch to the appropriate overridden AnimalVisitor::Visit method ( ReactionVisitor::Visit(Cat*) )
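If the two hops are hard to picture, it can help to watch them happen. Below is a small self-contained sketch of the same machinery with a hypothetical TraceVisitor (and a hypothetical TraceVisits helper) that simply records which Visit overload ended up running. Person and ReactionVisitor are left out to keep it short.

```cpp
#include <cassert>
#include <string>

struct Cat;
struct Dog;

struct AnimalVisitor {
    virtual ~AnimalVisitor() = default;
    virtual void Visit(Cat*) = 0;
    virtual void Visit(Dog*) = 0;
};

struct Animal {
    virtual ~Animal() = default;
    virtual void Visit(AnimalVisitor* _visitor) = 0;
};

struct Cat : Animal {
    // Inside Cat, `this` has static type Cat*, so overload resolution
    // selects AnimalVisitor::Visit(Cat*) at compile time.
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

struct Dog : Animal {
    // Same text as Cat::Visit, but here `this` is a Dog*.
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

// Hypothetical visitor that records which overload was reached.
struct TraceVisitor : AnimalVisitor {
    void Visit(Cat*) override { trace_ += "Visit(Cat*);"; }
    void Visit(Dog*) override { trace_ += "Visit(Dog*);"; }
    std::string trace_;
};

std::string TraceVisits(Animal* _first, Animal* _second) {
    assert(_first != nullptr && _second != nullptr);
    TraceVisitor visitor;
    // Dispatch 1: virtual call on Animal selects Cat::Visit or Dog::Visit.
    // Dispatch 2: inside that override, the virtual call on AnimalVisitor
    //             selects TraceVisitor::Visit(Cat*) or TraceVisitor::Visit(Dog*).
    _first->Visit(&visitor);
    _second->Visit(&visitor);
    return visitor.trace_;
}
```

This is also why each derived class must pass this itself: only inside Cat::Visit does the compiler know the pointer is a Cat*, which is what picks the Visit(Cat*) overload.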

Benefits of Double Dispatch over the other approaches

1. Still fast
   - 2 virtual function calls, no dynamic_casts or long if…else if blocks
   - overall time is still the same as both manual RTTI and single dispatch (Timings)
2. Encapsulation
   - The Person class still controls all implementation details for how it will react to any specific animal type
3. Separation of responsibility
   - The logic for reacting to any individual animal is cleanly separated from the logic for reacting to any other animal
   - This improves readability, and if the logic for reacting to any individual animal became sufficiently complex, we could put it into its own translation unit (.cpp file)
4. More stable class interfaces for the Animal hierarchy
   - Adding a new animal to the hierarchy doesn’t affect any base or derived class
   - Animal::Visit() accepts an interface now, which makes it a pathway for ALL type-specific behavior.

Reusing the Visitor machinery

Point #4 above is what makes this pattern extra powerful. No longer do you need to add intrusive new virtual functions to the base class to support type-specific behavior. Clients who don’t use any type-specific machinery also don’t have to rebuild their code every time a new derived class is added.

What are some of the situations where you could reuse double dispatch? In a real system, it wouldn’t just be Person::ReactTo that might require knowing the actual type behind the base Animal* pointer.

- You can implement filtering with double-dispatch. E.g., "give me only the Cats from a vector<Animal*>"
- You can expose your vector<Animal*> to Python as a heterogeneous list of [Cat, Dog, etc.] via a visitor that constructs appropriate PyObjects for each type
- You can co-opt it to make serialization of your generic containers easier. In the next blog post I will show a demonstration of using this machinery to implement I/O for a vector<Animal*>



Why is dynamic_cast so slow in the first place?

dynamic_cast requires run time type information (RTTI) to determine if a base class pointer really points to a derived class. Or, at least that’s how you might think of it. The reality is that inheritance hierarchies can be pretty complicated, with multiple inheritance, virtual inheritance, private inheritance, etc. A dynamic_cast is a flexible tool that allows casting up, down, and sideways in that hierarchy. While how it works is implementation-defined, it’s generally slow.

In a nutshell, implementors hijack the virtual function table of polymorphic classes to store additional information about the inheritance hierarchy. Satisfying a call to dynamic_cast requires a number of indirect calls that also create instances of intermediary classes, and these intermediary classes store all kinds of bookkeeping information. Even the typeid approach still has to produce instances of type_info to be compared for equality.

"Manual RTTI", single-, and double-dispatch are able to perform faster because they only deal in virtual function calls, which don’t need to jump through as many hoops as full-blown RTTI. Instead, when you call a function on a base class pointer, there’s a single indirect call to the appropriate derived class’ function via the virtual table. This is made possible by hidden machinery in a derived class’ constructor that adjusts the base class’ reference to the virtual table when you create an instance.

While virtual function calls are faster than dynamic casting, the cost is still non-zero, which is why C++ has so many advocates decrying the use of dynamic polymorphism in favor of static polymorphism (templates).
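To make "down and sideways" concrete, here is a tiny sketch using a made-up hierarchy (Walker, Swimmer, Duck and the two helper functions are hypothetical names, unrelated to the Animal example). The sideways cast is the interesting one: Walker and Swimmer know nothing about each other, so the only way to answer the question is to inspect the complete object at run time, which is part of why dynamic_cast has more work to do than a plain virtual call.

```cpp
#include <cassert>

struct Walker  { virtual ~Walker() = default; };
struct Swimmer { virtual ~Swimmer() = default; };
struct Duck : Walker, Swimmer {};  // walks and swims
struct Mole : Walker {};           // only walks

// Down: from a base pointer to a derived pointer.
// Yields nullptr when the object is not actually a Duck.
Duck* AsDuck(Walker* _walker) {
    assert(_walker != nullptr);  // a null input would look the same as a failed cast
    return dynamic_cast<Duck*>(_walker);
}

// Sideways: from one base to an unrelated sibling base.
// This only succeeds if the complete object also derives from Swimmer.
Swimmer* AsSwimmer(Walker* _walker) {
    assert(_walker != nullptr);
    return dynamic_cast<Swimmer*>(_walker);
}
```

Casting up (Duck* to Walker*) needs no dynamic_cast at all; the implicit conversion is resolved at compile time.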

This article from "Jeris" walks through Microsoft Visual Studio 2010’s implementation of dynamic_cast from a debugger’s standpoint. Quarkslab also investigated the relationships between Microsoft’s classes used to store type information. Finally, there’s an interesting discussion here about challenges the Itanium team faced w.r.t. RTTI.

Fixing issues with our double dispatch implementation

The implementation I’ve shown still has some problems. Let’s examine them and see how we might mitigate them. But first, let’s review our current double dispatch implementation.

Review of our current double dispatch impl

// AnimalVisitor.h
struct Cat;
struct Dog;
// ... (other forward declarations)
struct AnimalVisitor {
    virtual void Visit(Cat*) = 0;
    virtual void Visit(Dog*) = 0;
    /*...*/
};

// Animal.h
struct AnimalVisitor;
struct Animal {
    virtual void Visit(AnimalVisitor* _visitor) = 0;
    // ...
};

// Dog.h
#include "Animal.h"
#include "AnimalVisitor.h"
struct Dog : Animal {
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};
// Cat.h etc follow Dog.h impl

// ReactionVisitor.h
#include "AnimalVisitor.h"
#include "Cat.h"
#include "Dog.h"
struct Person;
struct ReactionVisitor : public AnimalVisitor {
    explicit ReactionVisitor(Person* _person) : person_{_person} {}
    void Visit(Cat*) override{/*...*/}
    void Visit(Dog*) override{/*...*/}
    Person* person_ = nullptr;
};

// Person.h
#include "Animal.h"
#include "ReactionVisitor.h"
struct Person {
    /*...*/
    void ReactTo(Animal* _animal){
        ReactionVisitor visitor{this};
        _animal->Visit(&visitor);
    }
};

Issue #1: Each new derived Animal must override Visit() in the same way

It can be easy to forget to override the Visit() method, and even when it’s done it is a tedious copy-paste task. Leaving it this way is safe if your team is disciplined, has keen-eyed code reviewers, and solid unit test coverage. Or, you could take a Mixin/CRTP approach to idiot-proof things with extremely minimal runtime overhead. The idea is that all classes derive from a common base that automatically provides the Visit(AnimalVisitor*) functionality. Let’s see it in action:

struct Animal {
    virtual void Visit(AnimalVisitor* _visitor) = 0;
    // ...
};

template<class T>
struct VisitableAnimal : Animal {
    void Visit(AnimalVisitor* _visitor) override {
        _visitor->Visit(static_cast<T*>(this));
    }
};

struct Cat : VisitableAnimal<Cat> { };
struct Dog : VisitableAnimal<Dog> { };
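To check that the CRTP base really does route each animal to the right overload, here is a self-contained sketch built around the snippet above. NameVisitor and NameOf are hypothetical additions used only to observe the result.

```cpp
#include <cassert>
#include <string>

struct Cat;
struct Dog;

struct AnimalVisitor {
    virtual ~AnimalVisitor() = default;
    virtual void Visit(Cat*) = 0;
    virtual void Visit(Dog*) = 0;
};

struct Animal {
    virtual ~Animal() = default;
    virtual void Visit(AnimalVisitor* _visitor) = 0;
};

// Written once; every animal inherits a correctly typed Visit().
template<class T>
struct VisitableAnimal : Animal {
    void Visit(AnimalVisitor* _visitor) override {
        // T is the most-derived type, so this downcast is known to be valid.
        _visitor->Visit(static_cast<T*>(this));
    }
};

struct Cat : VisitableAnimal<Cat> {};
struct Dog : VisitableAnimal<Dog> {};

// Hypothetical visitor that reports which overload was selected.
struct NameVisitor : AnimalVisitor {
    void Visit(Cat*) override { name_ = "Cat"; }
    void Visit(Dog*) override { name_ = "Dog"; }
    std::string name_;
};

std::string NameOf(Animal* _animal) {
    assert(_animal != nullptr);
    NameVisitor visitor;
    _animal->Visit(&visitor);
    return visitor.name_;
}
```

One thing the mixin cannot catch is a copy-paste slip such as struct Dog : VisitableAnimal<Cat>, so the template argument still deserves a glance in code review.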

Issue #2: It’s easy to add new Visitors, painful to add new derived Animal classes

Adding a new Animal requires that we update the AnimalVisitor base class:

struct AnimalVisitor { virtual void Visit(Cat*) = 0; virtual void Visit(Dog*) = 0; /*...*/ virtual void Visit(NewAnimal*) = 0; };

This affects every derived AnimalVisitor class. Personally, I believe this is a blessing in disguise; you now have a very easy way to find all the functionality that needs to be updated to handle the addition of a new Animal. However, not all visitors need to know about every derived Animal all the time. For example, if I simply want a visitor that acts as a Filter for Cat objects, I don’t want to implement an empty Visit(Dog*) . What can be done?

There are a couple of mitigating steps we can take to insulate visitors from new derived classes.

Derive from a "do-nothing" base Visitor

This approach also uses templates, like we did above with VisitableAnimal. The idea is to create a derived class that will automatically "do nothing" for each derived Animal, allowing your derived visitor to opt in to doing something. In C++17, the addition of variadic using declarations makes this fairly easy:

// DefaultDoNothingAnimalVisitor.h
#include "AnimalVisitor.h"

template<class T>
struct SingleDoNothingAnimalVisitor : virtual AnimalVisitor {
    using AnimalVisitor::Visit;
    void Visit(T*) override{}
};

template<class... T>
struct MultipleDoNothingAnimalVisitor : public SingleDoNothingAnimalVisitor<T>... {
    using SingleDoNothingAnimalVisitor<T>::Visit...;
};

// strong typedef
struct DoNothingAnimalVisitor : public MultipleDoNothingAnimalVisitor<Cat, Dog /*, ...*/> {};

Now we can write our CatFilter visitor without worrying about the rest of the hierarchy:

struct CatFilter : DoNothingAnimalVisitor { using DoNothingAnimalVisitor::Visit; void Visit(Cat* _cat) override { cats_.push_back(_cat); } std::vector<Cat*> cats_; };

(Here’s a working demo) Adding a new derived class will now touch the MultipleDoNothingAnimalVisitor base class, but no other code needs to change.
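If the variadic machinery is more than you need (or you are not on C++17 yet), the same opt-in effect can be had by writing the empty overrides by hand. The sketch below swaps the template-based base for a plain hand-written DoNothingAnimalVisitor, which is a simplification and not the implementation shown above; OnlyCats is a hypothetical helper that runs the filter over a collection.

```cpp
#include <cassert>
#include <vector>

struct Cat;
struct Dog;

struct AnimalVisitor {
    virtual ~AnimalVisitor() = default;
    virtual void Visit(Cat*) = 0;
    virtual void Visit(Dog*) = 0;
};

struct Animal {
    virtual ~Animal() = default;
    virtual void Visit(AnimalVisitor* _visitor) = 0;
};

struct Cat : Animal {
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

struct Dog : Animal {
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

// Hand-written stand-in for the variadic version: one empty override per animal.
struct DoNothingAnimalVisitor : AnimalVisitor {
    void Visit(Cat*) override {}
    void Visit(Dog*) override {}
};

struct CatFilter : DoNothingAnimalVisitor {
    using DoNothingAnimalVisitor::Visit;  // keep the inherited overloads visible
    void Visit(Cat* _cat) override { cats_.push_back(_cat); }
    std::vector<Cat*> cats_;
};

// Hypothetical helper: "give me only the Cats from a vector<Animal*>".
std::vector<Cat*> OnlyCats(const std::vector<Animal*>& _animals) {
    CatFilter filter;
    for (Animal* animal : _animals) {
        assert(animal != nullptr);
        animal->Visit(&filter);
    }
    return filter.cats_;
}
```

The trade-off is that every new animal means one more empty override to type here, whereas the variadic version only needs the type added to a list.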

Dispatch to a Visit(Animal*) default

This approach has us adding an additional Visit method to the base AnimalVisitor to allow dispatch to a generic Animal* catch-all:

struct AnimalVisitor {
    virtual void Visit(Animal*) = 0;
    // ... (other Visit methods)
};

struct DefaultAnimalVisitor : AnimalVisitor {
    void Visit(Animal*) override{}
    void Visit(Cat* _cat) override{ Visit(static_cast<Animal*>(_cat)); }
    void Visit(Dog* _dog) override{ Visit(static_cast<Animal*>(_dog)); }
    // ...
};

This allows you to write the inverse of a CatFilter — a visitor that will visit everything except for Cats :

struct DefaultAnimalVisitor : AnimalVisitor {
    void Visit(Animal*) override{}
    void Visit(Cat* _cat) override{ Visit(static_cast<Animal*>(_cat)); }
    void Visit(Dog* _dog) override{ Visit(static_cast<Animal*>(_dog)); }
    // ...
};

struct AllButCatFilter : DefaultAnimalVisitor {
    using DefaultAnimalVisitor::Visit;
    void Visit(Animal* _animal) override { animals_.push_back(_animal); }
    void Visit(Cat*) override{/*intentionally blank*/}
    std::vector<Animal*> animals_;
};

(Here’s a working demo)

A mixture of the above

The double dispatch pattern is flexible enough that you could mix a "do-nothing" approach and a "dispatch to a default" approach into a "dispatch into a fallback that does nothing by default" kind of approach. Or neither. The point is that we can insulate our derived visitors in a number of ways.
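As a rough end-to-end sketch of the fallback idea, the following wires a Visit(Animal*) catch-all into a two-animal hierarchy. The base DefaultAnimalVisitor does nothing in its fallback, and AllButCatFilter overrides that fallback while silencing Cats. EverythingButCats is a hypothetical helper added only to exercise it.

```cpp
#include <cassert>
#include <vector>

struct Animal;
struct Cat;
struct Dog;

struct AnimalVisitor {
    virtual ~AnimalVisitor() = default;
    virtual void Visit(Animal*) = 0;  // generic catch-all
    virtual void Visit(Cat*) = 0;
    virtual void Visit(Dog*) = 0;
};

struct Animal {
    virtual ~Animal() = default;
    virtual void Visit(AnimalVisitor* _visitor) = 0;
};

struct Cat : Animal {
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

struct Dog : Animal {
    void Visit(AnimalVisitor* _visitor) override { _visitor->Visit(this); }
};

// Every typed overload forwards to the fallback, which does nothing by default.
struct DefaultAnimalVisitor : AnimalVisitor {
    void Visit(Animal*) override {}
    void Visit(Cat* _cat) override { Visit(static_cast<Animal*>(_cat)); }
    void Visit(Dog* _dog) override { Visit(static_cast<Animal*>(_dog)); }
};

struct AllButCatFilter : DefaultAnimalVisitor {
    using DefaultAnimalVisitor::Visit;
    void Visit(Animal* _animal) override { animals_.push_back(_animal); }
    void Visit(Cat*) override { /* intentionally blank */ }
    std::vector<Animal*> animals_;
};

// Hypothetical helper: collect every animal that is not a Cat.
std::vector<Animal*> EverythingButCats(const std::vector<Animal*>& _animals) {
    AllButCatFilter filter;
    for (Animal* animal : _animals) {
        assert(animal != nullptr);
        animal->Visit(&filter);
    }
    return filter.animals_;
}
```

A Dog takes three hops here (Dog::Visit, then DefaultAnimalVisitor::Visit(Dog*), then the overridden Visit(Animal*)), so the convenience of a fallback costs one extra virtual call.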

Issue #3: There’s a lot of boilerplate if you only care about visiting a single type

What’s easier: writing a new class to inherit from AnimalVisitor, then capturing whatever state you need as member variables, and finally overriding the correct method(s) to get what you want, or writing a single dynamic_cast? It’s hard to say that dynamic_cast is "wrong". It may well be more readable and fast enough for your needs. The only danger here is that when the code inevitably needs to support another type, will the next programmer simply copy + paste the dynamic_cast, or will they recognize when a refactor is needed? In my experience, the copy + paste route is taken too often, and this isn’t caught until code review time (or later), at which point the developer will be annoyed by your badgering to suddenly implement all the machinery needed for a double dispatch implementation. So your code review comment becomes a tech debt item with low priority that is eventually forgotten (until the next copy + paste).

Issue #4: It’s more difficult to understand than the manual RTTI or dynamic_cast approach

Let’s face it, for as fancy and useful as a double dispatch visitor is, if your fellow C++ developers are primarily novices (that’s not to say they aren’t intelligent or extremely proficient in other languages or the business domain), it will be difficult for them to maintain this pattern. One of the hard truths in software development is that shiny new tools are often a liability simply because your team lacks the expertise to support them. This limitation can be overcome by training (formal and informal), but that’s often time and money employers and employees are not willing to spend. If you’re reading this article at all, you are likely on the more expert side of the spectrum, so you’re probably all-too-familiar with push-back from team members and management when you propose anything new that will simplify everyone’s life. Approach these roadblocks as opportunities to prove your value and teamwork by championing the training yourself! It will take work on your side to create demos and before/after examples of the codebase where you highlight all the improvements the new tool offers. You may even discover something better along the way, or that what you thought would be a big improvement has a major flaw. Either way, it’s also an opportunity for you to grow into a more senior role yourself.

TL;DR

The double dispatch pattern is a flexible design pattern that confers a number of benefits:

- separation of responsibility for type-specific logic
- better performance than a dynamic_cast-and-check or an enum / string comparison
- new functionality can be added without touching any class headers

(It’s also been battle-tested for about as long as polymorphism has been around. Dr. Dobb’s has had an article on this exact pattern since 1998, and I wouldn’t be surprised to find artifacts of it from long before.)

If and when your class hierarchy is finalized, you can begin decommissioning the visitor pattern in favor of an even faster std::variant approach.

Happy Coding!