Keep simple data structures simple! There’s no need for artificial pseudo-encapsulation when all you have is a bunch of data.

I recently came across a class that looked similar to this:

    class Unit {
    public:
      Unit(std::string name_, unsigned points_, int x_, int y_)
        : name{name_}, points{points_}, x{x_}, y{y_}
      {}
      Unit(std::string name_)
        : name{name_}, points{0}, x{0}, y{0}
      {}
      Unit()
        : name{""}, points{0}, x{0}, y{0}
      {}

      void setName(std::string const& n) { name = n; }
      std::string const& getName() const { return name; }

      void setPoints(unsigned p) { points = p; }
      unsigned getPoints() const { return points; }

      void setX(int x_) { x = x_; }
      int getX() const { return x; }

      void setY(int y_) { y = y_; }
      int getY() const { return y; }

    private:
      std::string name;
      unsigned points;
      int x;
      int y;
    };

Let’s have a closer look because this structure could be made much simpler.

Free access to everything

If we look at the getters and setters, we see that they are just a bunch of boilerplate. Books about object-oriented programming often talk at length about encapsulation. They encourage us to use getters and setters for every data member.

However, encapsulation means that there is some data that should be protected against free access. Usually, that’s because there is some logic that ties some of the data together. In such a case, access functions do checks and some data might be changed only together.
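For contrast, here is a minimal sketch of a class where access functions earn their keep. The `Health` class and its rule that the current value never exceeds the maximum are hypothetical, invented only to show what "logic that ties data together" can look like:

```cpp
#include <algorithm>

// Hypothetical example: `current` must never exceed `maximum`.
// That invariant ties the two members together, so they stay private.
class Health {
public:
    explicit Health(unsigned maximum_) : maximum{maximum_}, current{maximum_} {}

    unsigned getCurrent() const { return current; }
    unsigned getMaximum() const { return maximum; }

    // Healing is clamped so that current <= maximum holds after every call.
    void heal(unsigned amount) {
        current += std::min(amount, maximum - current);
    }

    // Damage is clamped so that the unsigned value cannot wrap around.
    void damage(unsigned amount) {
        current -= std::min(amount, current);
    }

private:
    unsigned maximum;
    unsigned current;
};
```

With public members, any caller could set `current` above `maximum`. Here the member functions are the only way in, and each one preserves the rule.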

But C++ is not a purely object-oriented language. In some cases, we have structures that are just a simple bunch of data and nothing more. It’s best to not hide that fact behind a pseudo-class but make it obvious by using a struct with public data members. The effect is the same: everyone has unlimited access to everything.

What if the logic is elsewhere?

Sometimes, classes like this one just seem to be plain data containers, and the logic is hidden elsewhere. In the case of domain objects, this is called an Anemic Domain Model and is usually considered an antipattern. The usual solution is to refactor the code and move the logic into the class, so that it is colocated with the data.

Whether we do so or leave the logic separated from the data, it should be a conscious decision. If we decide to leave data and logic separated, we should probably write that decision down. In that case, we’re back to the earlier conclusion: instead of the class, use a struct with public data.

Even if we decide to move the logic into the class, there are rare cases where the actual encapsulation is provided outside the class. One example is the detail class in the “pimpl idiom”: nobody but the containing class and the pimpl itself will ever have access, so there’s no point in adding all those getters and setters.
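The pimpl case could be sketched roughly as follows. The `Game` class, its members, and the split into header and source parts are assumptions made for illustration, and `std::make_unique` requires C++14:

```cpp
#include <memory>
#include <string>

// --- would normally live in the header ---
class Game {
public:
    Game();
    ~Game();  // defined below, where Impl is a complete type

    void addScore(unsigned points);
    unsigned score() const;

private:
    struct Impl;                 // only Game ever sees this type
    std::unique_ptr<Impl> impl;
};

// --- would normally live in the source file ---
// A plain struct with public members: the encapsulation is already
// provided by Game, so getters and setters here would add nothing.
struct Game::Impl {
    std::string title{"untitled"};
    unsigned score{0};
};

Game::Game() : impl{std::make_unique<Impl>()} {}
Game::~Game() = default;

void Game::addScore(unsigned points) { impl->score += points; }
unsigned Game::score() const { return impl->score; }
```

Clients of `Game` cannot even name `Game::Impl`, so its public members are unreachable from outside.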

Constructors

Constructors are usually needed to create an object in a consistent state and establish invariants. In the case of plain data structures, there are no invariants and no consistency that could be maintained. The constructors in the example above exist only to spare us from default-constructing an object and then immediately setting each member via its setter.

If you look closely, there’s even a potential for bugs in there: any std::string is implicitly convertible to Unit, because the single-argument constructor is not explicit. Things like that can lead to a lot of debugging fun and head-scratching.
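A reduced sketch of that pitfall, keeping only the single-argument constructor from the class above. The functions `pointsOf` and `surprise` are hypothetical helpers added for the demonstration:

```cpp
#include <string>

// Reduced version of the original class: only what the conversion needs.
class Unit {
public:
    Unit(std::string name_) : name{name_}, points{0} {}  // not explicit
    unsigned getPoints() const { return points; }

private:
    std::string name;
    unsigned points;
};

// Hypothetical function that is meant to be called with a real Unit.
unsigned pointsOf(Unit const& unit) { return unit.getPoints(); }

// Compiles without complaint: the string is silently converted
// into a temporary Unit with zero points.
unsigned surprise() {
    std::string label = "not a unit";
    return pointsOf(label);
}
```

Marking the constructor `explicit` would turn the call inside `surprise()` into a compile error.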

Since C++11, we have the feature of in-class initializers. In cases like this one, they can be used instead of constructors. All the constructors above are covered by that approach. With that, the 53 lines of code in the example can be boiled down to 6 lines:

    struct Unit {
      std::string name{ "" };
      unsigned points{ 0 };
      int x{ 0 };
      int y{ 0 };
    };

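Initialization looks as it did before if you used uniform initialization. Note that the first two lines are now aggregate initialization, which works for a struct with in-class initializers only from C++14 on: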

    Unit a{"Alice"};
    Unit b{"Bob", 43, 1, 2};
    Unit c;

What if there is logic for one of the members?

A name probably shouldn’t be an empty string or contain special characters. Does that mean we have to throw it all away and make a proper class out of the Unit again? Probably not. Often we have logic in one place to validate and sanitize strings and similar things. Data that enters our program or library has to pass that point, and later we just assume that the data is valid.

If that is too close to the Anemic Domain Model, we still don’t have to encapsulate everything in our Unit class again. Instead, we can use a custom type that contains the logic instead of std::string. After all, a std::string is an arbitrary bunch of characters. If we need something different, a std::string may be convenient but it’s the wrong choice. Our custom type might well have a proper constructor, so it can’t be default constructed as an empty string.
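Such a custom type might look like the following sketch. The exact validation rule (non-empty, only letters, digits, and spaces), the exception type, and the `str()` accessor are assumptions; the real rules depend on the domain:

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Sketch of a name type that carries its own validation.
// There is no default constructor, so an empty name cannot exist.
class PlayerName {
public:
    explicit PlayerName(std::string value_) : value{std::move(value_)} {
        if (value.empty()) {
            throw std::invalid_argument{"player name must not be empty"};
        }
        // Assumed rule: anything but letters, digits, and spaces is "special".
        for (unsigned char c : value) {
            if (!std::isalnum(c) && c != ' ') {
                throw std::invalid_argument{"player name contains special characters"};
            }
        }
    }

    std::string const& str() const { return value; }

private:
    std::string value;
};
```

Here the encapsulation is justified: the class has an invariant, and the constructor is the only place where it has to be checked.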

What if some of the data belongs together?

If we look at the class yet again, we can pretty much assume that x and y are some sort of coordinates. They probably belong together, so shouldn’t we have a method that sets both together? And maybe the constructors made sense, as they allowed setting either both or none?

No, that’s not a solution. It may remedy a few of the symptoms, but we would still have the “Data Clump” code smell. Those two variables belong together, so they deserve their own structure or class.
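A minimal sketch of such a structure, assuming the two coordinates have no invariant of their own. The member names and the `Marker` holder are hypothetical, and the brace initialization in the usage note requires C++14:

```cpp
// The two coordinates travel together, so they get their own type.
struct Point {
    int x{0};
    int y{0};
};

// Hypothetical holder, reduced to the part that matters here.
struct Marker {
    Point location;  // both coordinates are replaced together by assigning a Point
};
```

Assigning `marker.location = Point{3, 4};` updates both values in one step, so no special setter for "both or none" is needed.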

Conclusion

In the end, our Unit looks like this:

    struct Unit {
      PlayerName name;
      unsigned points{ 0 };
      Point location{ {0,0} };
    };

It is small, it is simple. And the fact that it’s a struct with a few public members clearly sends the right message: it’s just a bundle of data.