Code formatting in C++ Part Two


In the mid nineties I worked at Dr. Solomon's on their Anti-Virus toolkit. I spent some time in the virus labs working with live viruses (which I am told is the correct pluralisation). In those days viruses were mostly DOS based and attached themselves either to an exe image or a disk boot sector. Dr. Solomon's had their own scripting language for describing how a particular virus was identified, and then how it should be removed. This was significant as viruses already had plenty of sophistication with encryption and poly-morphism (each infection looked different). So the guys in the lab would write code in this scripting language on a regular basis.

One thing I noticed as I did my stint in the labs was that the guys who worked there all the time didn't use any indentation. While the script was not really procedural, it did have sections and block scopes, and yet these were never being highlighted in the textual layout of the code. There was nothing in the language that prevented this, so when I wrote my own scripts I indented as I thought best and happily showed my code to the head of the lab. His appraisal?:

We don't use indentation here

I was shocked! Why would you deliberately hide the structure of the code when there was virtually no overhead in bringing it out?
Of course I knew that different people have different ideas about code formatting, but I hadn't come across such an extreme case before.

As my career progressed I learned more and more that the subject of code formatting was very delicate. Developers may grudgingly adopt a "house style" for the sake of consistency (or, increasingly commonly, just adopt the style of the source file they are editing at the time). But ask them to change what they think is best and you'll be lucky to walk away with all your teeth!

Despite this I did pay attention to my own formatting style. Rather than stick to what I'd always done, if I saw a new style I questioned myself on whether there was anything about it that gave it an advantage. If so I adopted the style. For example, when I started out I used the common style of placing spaces on the outside of parentheses, like so:

if (condition==expected)
{
    doSomething (argument);
}

Note the space before the opening (.

Then I saw someone who put the spaces on the inside:

if( condition==expected )
{
    doSomething( argument );
}

This looked really strange and I wondered why he was so keen to depart from the norm. But after a while I realised that, for me at least, I find the second version easier to read. The difference was only slight, of course (or so I thought), but I found that if I was looking at a screenful of code and needed to home in on the interesting bits, having the spaces on the inside of the parentheses helped those parts of the code to come out of the screen at me. Logically, the parentheses belong to the function or keyword preceding it, whereas the arguments or expressions passed in where external and varied independently - so the use of whitespace captured that relationship.
At least that's how I see it.

After a few more years I began to think about whether any sort of objective metrics could be extracted on what aspects of code formatting style enhanced readability - independently of an individual's "preferred" (ie, existing, often ground in) style.

If I could find any such metrics or recommendations they would, I surmised, need to satisfy the following requirements:

They would derive from objective sources that are ideally not connected with software development
They wouldn't necessarily follow my own existing style (ok this isn't a requirement - but if it diverges from my own style it's a good hint)
Other people, picked at random and asked to give the style a try, would come to appreciate it - even if they objected at first

The first requirement is investigated in more detail in the first part of this series - the Speed Reading perspective.

In this article we'll look at the other two.

The best laid code of keyboards and men

Armed with the ideas I'd derived from Speed Reading I decided to tackle the issue of objectively good code formatting styles. This is not to say that it is perfect or that it as truly objective in an absolute sense. But I do believe it has some value. Not least for solving the problem of how to format function signatures consistently.

Before we look at the specifics, I'll address the second and third requirements from the previous section.

They wouldn't necessarily follow my own existing style

This is the case. Although I continue to prefer my spaces-inside-the-parentheses style, and this is compatible, and I'd already had a preference for alignment and columns, the realisation of my ideas took some of that further, as well as into unexpected directions that took some getting used to.

Other people, picked at random and asked to give the style a try, would come to appreciate it - even if they objected at first

As stated early on, the numbers may not be statistically significant, but I have asked a number of developers with difference backgrounds to give the style a fair try. An immediate problem here is that they may have given this style a fairer try than other contenders. A proper study would have introduced control styles too. Nonetheless I found the results illuminating.

Pretty much without exception (at time of writing) everyone who tried it followed the same pattern:

  1. Immediate reaction: "Ugh! That's horrible! Ok, I'll try it, but then I'm going straight back to my old style"
  2. Day 1: Much the same reaction, some regressions, but generally following the style fairly easily, despite personal feelings.
  3. Day 2: "Actually I'm starting to like it!"
  4. Day 2-3: "This is awesome, I'm going to use this style in all my code now"
  5. Day 3+: "I can't stop myself reformatting all my old code to this new style!"
  6. ...
  7. Year 3+: "meh"

We'll come back to the Year 3 effect later. Other than that the general progression is promising, to say the least. However it's by no means conclusive. In addition to the weaknesses already outlined it doesn't really tell us how effective it is (i.e. whether it has a net positive impact on productivity, beyond the initial "feel good" phase). For this I don't have any hard numbers. What I do have is my own feeling, and that of those that tried it, that code readability and navigability improved greatly.

By this point you're probably wondering if I'll ever get to describe the style itself at all. In that case you shall be pleased to know that the next thing I'll cover is just that.

In the next article. See you then!


Please submit or upvote, here - or follow through to comment on Reddit