Empty classes in C++

Every now and then I look at an established software standard and contemplate how it could have been so much better if it weren't for one or two little details. Programming languages are a particular sticking point; I have yet to find the "perfect programming language". And by that I mean a language which is perfect given its design considerations.

Take a look at C++. It's a very well designed language in many ways; it is almost entirely backwards compatible with C, and yet immensely more powerful, because it provides inheritance, encapsulation, and templates. The last of these is arguably the most powerful, but I certainly wouldn't do away with the others if given the choice.

Templates are basically a magical way of generating type-safe code. They save the programmer from writing the same classes over and over again (list of X, list of Y, list of Z) and they do so in a way that is run-time efficient, because most of the details are sorted out at compile time. They are amazingly powerful, with a variety of uses, including performing complex calculations at compile time (rather than at run time).
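To make the "calculations at compile time" point concrete, here is a minimal sketch of the classic factorial template. It assumes a C++11 or later compiler (for constexpr and static_assert), and the names Factorial, factorial_of_ten and check_factorial are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Recursive template: Factorial<N>::value is worked out by the compiler.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};

// Specialization that terminates the recursion.
template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

// Verified during compilation; nothing is calculated at run time.
static_assert(Factorial<5>::value == 120, "5! is 120");

// Hypothetical helper that exposes the precomputed constant at run time.
std::uint64_t factorial_of_ten() {
    return Factorial<10>::value;
}

// Hypothetical run-time check of the same constant.
void check_factorial() {
    assert(factorial_of_ten() == 3628800u);
}
```

The function body should boil down to returning a constant, since the multiplication chain was already evaluated by the compiler.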

Templates fit right in with the design goals of C++ - to provide a language which is low-level in many respects, but which provides high-level facilities as well, and which gives you raw efficiency at the cost of somewhat complex code. Speed and memory cost are what it's all about. If you are careful, a well designed C++ program will need very little more memory (or processing power) than an equivalent C program - and it's generally accepted that C is only a step or two away from assembly.

That's the thing - C++ doesn't force you any further away from the processor than C does. But it gives you the option of going further, and it does so in a way which generally leads to highly optimized implementations. Contrary to popular opinion, templates aid in this goal if they are well written - because templates can perform compile time optimization.

I won't go into the gory details of what you can do with templates here. Others have done that elsewhere. I understand what templates are capable of, and I appreciate their power, but I have one bone to pick.

It is: C++ doesn't allow truly empty classes. More specifically, the size of a class cannot be 0. Even if you declare a class with no members, it will generally come back as being 1 byte in size.

The reason for this? For some reason the C++ committee decided that two distinct objects of the same type (with a common base type, to be precise) must have different addresses in memory. That is, if you have two pointers to such objects you can tell whether they refer to "the same" object by comparing the pointers.
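A small sketch of what this rule looks like in practice. The class name Empty and the helper functions are hypothetical; the only claims made are the ones the language guarantees (non-zero size, distinct addresses for distinct objects), not the exact size a given compiler reports.

```cpp
#include <cassert>

// A class with no data members and no virtual functions.
struct Empty {};

// The size is never 0, even though there is nothing to store.
static_assert(sizeof(Empty) >= 1, "an empty class still has non-zero size");

// Two distinct objects of the same type never share an address.
bool distinct_addresses() {
    Empty a;
    Empty b;
    const Empty* pa = &a;
    const Empty* pb = &b;
    return pa != pb;
}

// Array elements are distinct objects too, so the array needs real storage.
bool array_elements_distinct() {
    Empty items[4];
    return &items[0] != &items[1] && sizeof(items) == 4 * sizeof(Empty);
}

// Hypothetical run-time check of both properties.
void check_empty_class() {
    assert(distinct_addresses());
    assert(array_elements_distinct());
}
```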

The result is that empty classes take up real space in memory. When passed as arguments to functions, they take up stack space. They increase the size of other objects, which has all sorts of consequences for performance in terms of both memory consumption and speed. An object which contains only empty objects can itself be many bytes in size.
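As a rough illustration of how the cost adds up, the sketch below puts empty objects inside other objects. All type names are invented for the example, and the assertions state only lower bounds; the real sizes depend on the compiler's padding and alignment choices.

```cpp
#include <cassert>
#include <cstddef>

// Three hypothetical empty types.
struct TagA {};
struct TagB {};
struct TagC {};

// An object made of nothing but empty members: each member still needs
// its own address, so the whole is at least three bytes.
struct OnlyEmpties {
    TagA a;
    TagB b;
    TagC c;
};

// An empty member next to real data: the extra byte is usually rounded
// up by alignment padding, so the growth tends to be more than one byte.
struct IntPlusEmpty {
    int value;
    TagA tag;
};

static_assert(sizeof(OnlyEmpties) >= 3, "one byte or more per empty member");
static_assert(sizeof(IntPlusEmpty) > sizeof(int), "the empty member adds size");

// Hypothetical helper: bytes added to an int by one empty member.
std::size_t overhead_of_empty_member() {
    return sizeof(IntPlusEmpty) - sizeof(int);
}

// Hypothetical run-time check.
void check_overhead() {
    assert(overhead_of_empty_member() >= 1);
}
```

On common platforms with a 4-byte int I would expect IntPlusEmpty to come out at 8 bytes, i.e. the "empty" member doubles the size, though that figure is an expectation about typical ABIs rather than a guarantee.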

So what, you might think? There's an easy way to avoid being slugged by the memory usage of empty classes - just don't use the suckers.

The problem with that reasoning is that empty classes are particularly useful as template parameters, in certain situations. Consider the Allocator classes used as a template parameter to the standard library list and vector classes. An instance of the allocator class becomes a member of the collection class itself (or of one of its direct or indirect members) and is used by the collection to allocate memory when required. But in many cases, the allocator needs no instance data - it allocates memory from the program heap. That means it has no data members and is therefore an empty class - which potentially takes up memory needlessly.
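Here is a sketch of the situation, using a toy container rather than the real standard library layout. NaiveBuffer and its members are hypothetical; it simply stores the allocator as an ordinary data member, which is the straightforward way to write such a class.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical toy container (not how any real std::vector is laid out).
template <typename T, typename Alloc = std::allocator<T>>
struct NaiveBuffer {
    T* data = nullptr;
    std::size_t size = 0;
    Alloc alloc;  // stored as a plain member, so it needs its own storage
};

// Even if the allocator carries no state, the container grows.
static_assert(sizeof(NaiveBuffer<int>) > sizeof(int*) + sizeof(std::size_t),
              "the allocator member makes the container bigger");

// Hypothetical helper: bytes spent on the allocator member and its padding.
std::size_t allocator_overhead() {
    return sizeof(NaiveBuffer<int>) - (sizeof(int*) + sizeof(std::size_t));
}

// Hypothetical run-time check.
void check_allocator_overhead() {
    assert(allocator_overhead() >= 1);
}
```

On a typical 64-bit platform I would expect the overhead to be a full 8 bytes per container once padding is counted, but treat that number as an assumption about the ABI.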

There is of course the empty base class optimization, which certainly goes some way to solving the problem, but it is tricky - particularly with templates when it may not be clear which classes are potentially empty.
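For comparison, a minimal sketch of the empty base class optimization. Every name here (HeapAllocator, PlainBuffer, MemberBuffer, CompactBuffer) is hypothetical, and the equality assertion reflects what mainstream compilers do for a single empty base rather than something I would promise for every implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stateless allocator: no data members, so the class is empty.
struct HeapAllocator {
    void* allocate(std::size_t bytes) { return ::operator new(bytes); }
    void deallocate(void* p) { ::operator delete(p); }
};

// Reference layout: the two data members and no allocator at all.
struct PlainBuffer {
    int* data;
    std::size_t size;
};

// Allocator held as a member: it must occupy at least one byte.
template <typename Alloc>
struct MemberBuffer {
    int* data;
    std::size_t size;
    Alloc alloc;
};

// Allocator held as a private base: an empty base may take no space.
template <typename Alloc>
struct CompactBuffer : private Alloc {
    int* data;
    std::size_t size;
    Alloc& allocator() { return *this; }
};

static_assert(sizeof(MemberBuffer<HeapAllocator>) > sizeof(PlainBuffer),
              "a member allocator costs space");
static_assert(sizeof(CompactBuffer<HeapAllocator>) == sizeof(PlainBuffer),
              "an empty base allocator is expected to cost nothing");

// Hypothetical helper comparing the two layouts.
bool ebo_saves_space() {
    return sizeof(CompactBuffer<HeapAllocator>) < sizeof(MemberBuffer<HeapAllocator>);
}

// Hypothetical run-time check.
void check_ebo() {
    assert(ebo_saves_space());
}
```

The trickiness shows up as soon as Alloc is an arbitrary template parameter: inheriting from it fails to compile if it is not a class type (a function pointer, say) or if it is marked final, and the author of the container cannot know in advance which case applies.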

It's annoying, and potentially degrades the efficiency of a program which uses templates - something that is never really supposed to be an issue. All for the unjustifiable requirement that objects of a common base must keep that base at a unique address.

- davpage@davmac.org.