Surrounded By Bugs

A response to "Spooky action at a distance" by Drew DeVault.

As Abraham Maslow said in 1966, "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."

Wikipedia, "Law of the Instrument"

Our familiarity with particular tools, and the ways in which they work, predisposes us in our judgement of others. This is true also with programming languages; one who is familiar with a particular language, but not another, might tend to judge the latter unfavourably based on a perceived lack of functionality or features found in the former. Of course, it might turn out that such a lack is not really important, because there is another way to achieve the same result without that feature; what we should really focus on is exactly that: the end result, not the feature.

Drew DeVault, in his blog post "Spooky action at a distance", makes the opposite error: he takes a particular feature found in other languages, specifically operator overloading, and claims that it leads to difficulty in understanding (various aspects of) the relevant code:

The performance characteristics, consequences for debugging, and places to look for bugs are considerably different than the code would suggest on the surface

Yes, in a language with operator overloading, an expression involving an operator may effectively resolve to a function call. DeVault calls this "spooky action" and refers to some (otherwise undefined) "distance" between an operator and its behaviour (hence "at a distance", from his title).
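To make "may effectively resolve to a function call" concrete, here is a minimal sketch in C++. The Vec2 type and the add and same_result functions are hypothetical names introduced purely for illustration; they do not come from DeVault's post.

```cpp
// Hypothetical 2D vector type, introduced only for illustration.
struct Vec2 {
    int x;
    int y;
};

// A user-defined operator+ is an ordinary function with a special name.
Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// The same operation written as a conventionally named function.
Vec2 add(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// Three spellings of the same call; returns true if they all agree.
bool same_result() {
    Vec2 a{1, 2};
    Vec2 b{3, 4};
    Vec2 s1 = a + b;            // operator syntax
    Vec2 s2 = operator+(a, b);  // explicit function-call syntax
    Vec2 s3 = add(a, b);        // named function
    return s1.x == s2.x && s2.x == s3.x &&
           s1.y == s2.y && s2.y == s3.y;
}
```

Seen this way, the "distance" between a + b and its behaviour is the same as the distance between add(a, b) and its behaviour: in both cases you look up a function definition.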

DeVault's hammer, then, is called "C". And if another language offers greater capability for abstraction than C does, that is somehow "spooky"; code written that way is a bent nail, so to speak.

Let's look at his follow-up example about strings:

Also consider if x and y are strings: maybe “+” means concatenation? Concatenation often means allocation, which is a pretty important side-effect to consider. Are you going to thrash the garbage collector by doing this? Is there a garbage collector, or is this going to leak? Again, using C as an example, this case would be explicit:

I wonder about the point of the question "is there a garbage collector, or is this going to leak?" - does DeVault really think that the presence or absence of a garbage collector can be implicit in a one-line code sample? Presumably he also does not really believe that the lack of a garbage collector would necessitate a leak, although that is implied by the unfortunate phrasing. Ironically, the C code he then provides for concatenating strings does leak - there's no deallocation performed at all (nor is there any checking for allocation failure, potentially causing undefined behaviour when the following lines execute).

Taking C++, we could write the string concatenation example as:

std::string newstring = x + y;

Now look again at the questions DeVault posed. First, does the "+" mean concatenation? It's true that this is not certain from this one line of code alone, since it depends on the types of x and y, but there is a good chance it does, and we can tell by looking at the surrounding code, which we need to do anyway in order to truly understand what this code is doing (and why), regardless of what language it is written in. I'll add that if it does turn out to be difficult to determine the types of the operands from inspecting the immediately surrounding code, this is probably an indication of badly written (or badly documented) code*.
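As a rough illustration of how the surrounding code settles the question, consider the sketch below. The function names sum and join are hypothetical, chosen only for this example; the point is that the parameter declarations sit a line or two away from the "x + y" expression and tell us which "+" is meant.

```cpp
#include <string>

// Hypothetical helper: with int operands, "+" is built-in addition.
int sum(int x, int y) {
    return x + y;
}

// Hypothetical helper: with std::string operands, the same expression
// concatenates, and may allocate on the heap to hold the result.
std::string join(const std::string& x, const std::string& y) {
    std::string newstring = x + y;
    return newstring;
}
```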

Any C++ systems programmer, with only a modest amount of experience, would also almost certainly know that string concatenation may involve heap allocation. There's no garbage collector (although C++ allows for one, it is optional, and I'm not aware of any implementations that provide one). True, there's still no check for allocation failure, though here it would throw an exception and most likely lead to (defined) imminent program termination instead of undefined behaviour. (Yes, the C code would most likely also terminate the program immediately if the allocation failed, but technically this is not guaranteed; a C programmer should know not to assume that undefined behaviour in a C program will behave in any particular way, however confident they are about how the compiler should translate their code.)

So, we reduced the several-line C example to a single line, which is straightforward to read and understand, and for which we do in fact have ready answers to the questions posed by DeVault (who seems to be taking the tack that the supposed difficulty of answering these questions contributes to a case against operator overloading).

Importantly, there's also no memory leak, unlike in the C code, since the string destructor will perform any necessary deallocation. Would the destructor call (occurring when the string goes out of scope) also count as "spooky action at a distance"? I guess that it should, according to DeVault's definition, although that is a bit too fuzzy to be sure. Is this "spooky action" problematic? No, it's downright helpful. It's also not really spooky, since as C++ programmers, we expect it.
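The destructor behaviour described above can be made visible with a small sketch. Buffer, live_buffers and live_after_scope are hypothetical names invented for this example; Buffer only mimics, in a simplified way, the kind of ownership that std::string has over its storage, and is not how any particular standard library implements it.

```cpp
#include <cstddef>

// Hypothetical counter, used only to make destructor calls observable.
int live_buffers = 0;

// Hypothetical owning type: acquires a heap buffer in its constructor
// and releases it in its destructor.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]) { ++live_buffers; }
    ~Buffer() {
        delete[] data_;
        --live_buffers;
    }
    // Copying is disabled so that each buffer has exactly one owner.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

private:
    char* data_;
};

// Returns the number of live buffers once the inner scope has ended.
int live_after_scope() {
    {
        Buffer b(64);  // allocation happens here
    }                  // destructor runs here and frees the buffer
    return live_buffers;
}
```

No deallocation call appears at the point of use, yet nothing leaks: the cleanup is tied to the closing brace of the scope.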

It's true that C's limitations often force code to be written in such a way that low-level details are exposed, and that this can make it easier to follow control flow, since everything is explicit. In particular, lack of user-defined operator overloading, combined with lack of function overloading, means that types often become explicit when variables are used (the argument to strlen is, presumably, a string). But it's easy to argue - and I do - that this doesn't really matter. Abstractions such as operator overloading exist for a reason; in many cases they aid in code comprehension, and they don't really obscure details (such as allocation) that DeVault suggests they do.

As a counter-example to DeVault's first point, consider:

x + foo()

This is a very brief line of C code, but now we can't say whether it performs allocation, nor talk about performance characteristics and so forth, without looking at other parts of the code.
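Here is one way that could play out, as a sketch only. It is written so that it compiles as C++, but it uses nothing beyond the C allocation functions and built-in integer addition. The body of foo, the allocations counter and the compute wrapper are all assumptions made up for the example; the original expression says nothing about what foo does, which is exactly the point.

```cpp
#include <cstdlib>

// Hypothetical counter so that the hidden allocation can be observed.
int allocations = 0;

// Hypothetical foo(): nothing at the call site reveals that it allocates.
int foo() {
    void* scratch = std::malloc(16);
    if (scratch == nullptr) {
        return 0;  // allocation failed; report a fallback value
    }
    ++allocations;
    std::free(scratch);
    return 42;
}

// The "+" here is plain integer addition, with no overloading involved,
// yet evaluating the expression still performs a heap allocation.
int compute(int x) {
    return x + foo();
}
```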

We got to the heart of the matter earlier on: you don't need to understand everything about what a line of code does by looking at that line in isolation. Indeed, it's hard to see how a regular function call (in C or any other language) doesn't also qualify as "spooky action at a distance", unless you take the stance that, since it is a function call, we know that it goes off somewhere else in the code, whereas for an "x + y" expression we don't. But then you're also wielding C as your hammer: the only reason you think that an operator doesn't involve a call to a function is because you're used to a language where it doesn't.