Code formatting in C++ Part Three

11th November 2009 at 21:30

In this article I am going to present my recommendation for a C++ code formatting style (although it applies to most free-format languages, especially those that are C/C++-like).

I have covered the background to most of my choices in some detail (some would say too much - but I invoke Blaise Pascal here) in the first two articles of this series, the rather consistently named:

Code formatting in C++ Part One

Code formatting in C++ Part Two

Since the style I am about to present is a little unusual in places, and arbitrary in others, I encourage you to take a look at the previous articles if you have not already done so. Also, where arbitrary-looking numbers are used, follow the spirit of the rule rather than the letter (or, in this case, number).

Page width

The proposals below refer often to page width, and by this I mean the number of characters that you would normally expect to be visible on one line while reading and writing code in an editor. For example, it used to be common to keep within 80 characters (or fewer) due to text mode screen sizes. These days windows can easily be sized to much greater character widths, but I would still recommend adopting a page width of between 80 and 100 characters. It is not a hard limit, although it is more important in some areas than others. Personally I still try to stick to 80.
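To make the page width concrete, here is a minimal sketch of a helper that reports which lines of a source text run past a chosen width. The function name and its parameters are hypothetical, invented for this illustration, and it counts bytes rather than displayed columns, so tabs and multi-byte characters are only approximated.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

////////////////////////////////////////////////////////////////////////////////
// Returns the 1-based numbers of the lines in source that are longer than
// pageWidth. Length is measured in bytes, which is only an approximation of
// the displayed width when tabs or multi-byte characters are present.
std::vector<std::size_t> LinesOverPageWidth
(
    const std::string&    source,
    std::size_t           pageWidth
)
{
    std::vector<std::size_t>    offenders;
    std::istringstream          lines( source );
    std::string                 line;
    std::size_t                 lineNumber = 0;

    while( std::getline( lines, line ) )
    {
        ++lineNumber;
        if( line.size() > pageWidth )
            offenders.push_back( lineNumber );
    }
    return offenders;
}
```

Since the page width is not a hard limit, a check like this is better treated as a prompt to look at a line again than as a rule to enforce.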

Proposal 1: Formatting variable declaration blocks

const char*    txt = "hello";
int            i = 7;
std::string    txt2 = "world";
std::vector<std::string>            v;
std::map<std::string, std::string>  m;

Variable declarations should be grouped together where possible (without violating the principle of locality - i.e., keeping them close to first use) in "islands" of no more than 16 lines at a time. If there are more than 16 variable declarations in a group then use single lines of whitespace to break them up. Try to keep variables whose type names are of similar length in the same block.

Within each block, align the variable names as much as possible. Where there is a large variation in type name length, sub-group longer names and shorter names together and align variable names in sub-group blocks instead (note how the vector and map, above, are separated out this way).

This proposal applies both in function body code and within class declarations (and at global scope, if you must).
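As a sketch of how this might look inside a class declaration, the struct below groups its members into two sub-blocks by type-name length. The struct and its member names are hypothetical, chosen only to mirror the example above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical struct, used only to illustrate the layout.
struct Session
{
    // Short type names: variable names aligned in one column.
    const char*    label = "hello";
    int            retries = 7;
    std::string    user = "world";

    // Long type names: sub-grouped and aligned separately.
    std::vector<std::string>            history;
    std::map<std::string, std::string>  settings;
};
```

The blank line between the two sub-blocks does the same job here as it does in a function body: each island can be scanned as two columns, types on the left and names on the right.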

Proposal 2: Formatting function signatures

Function signatures come in two forms, and we make a distinction here. The first form is the prototype, usually found in a header file, if at all. The second form is part of the definition and is followed by the function body (if applying this to a language without the separate prototype stage, the first form does not exist, of course). We shall start with that:

Function definition signatures

////////////////////////////////////////////////////////////////////////////////
void ClassName::MethodName
(
    char*          txt,
    int            i,
    std::string    txt2,
    std::vector<std::string>            v,
    std::map<std::string, std::string>  m
)
const
{
    // ... method body
}

The example here is for a method of a class, but the formatting would be the same for a free function.

The return type and function or method name (along with any modifier prefixes - e.g. static, or namespace prefixes) appear on their own line.
Next are the parentheses - each of which appears on its own line - indented to the same level as the preceding line.
Within the parentheses, each on its own line, are the arguments - formatted according to Proposal 1.
Any post-fix modifiers (just const, in this case) appear on their own line, followed by the function or method body.

This is almost certainly the most controversial proposal, and I will give my additional rationale in the next article.

If a comment block does not already precede the signature, use a line of forward slashes for about a page width (e.g. I run them up to the 80th column).

An additional point worth mentioning here is that this style lends itself well to being used with the Doxygen "inline comment" method of documenting function and method arguments.
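To illustrate the Doxygen point, here is a small sketch using the `/**< ... */` trailing-comment form, which the Doxygen manual describes for documenting parameters inline (including the optional `[in]` direction attribute). The function itself is hypothetical, and because a comment block already precedes the signature, the line of slashes is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

/// Counts the entries in names that start with prefix.
std::size_t CountMatches
(
    const std::vector<std::string>&  names,    /**< [in] strings to search */
    const std::string&               prefix    /**< [in] required prefix   */
)
{
    std::size_t    count = 0;

    for( const std::string& name : names )
    {
        // compare() returns 0 when the first prefix.size() characters match.
        if( name.compare( 0, prefix.size(), prefix ) == 0 )
            ++count;
    }
    return count;
}
```

With one argument per line there is always room to the right for the comment, which is what makes the two fit together so naturally.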

Function prototype signatures

void MethodName
    (   char*          txt,
        int            i,
        std::string    txt2,
        std::vector<std::string>            v,
        std::map<std::string, std::string>  m ) const;

If a separate prototype is required there are some differences to the formatting. This might seem a little odd but I'll provide the rationale in the following article.

First, the parentheses appear in-line with the arguments block, rather than on their own lines. Furthermore the whole block itself is indented with respect to the function name. Finally, any post-fix modifiers (which may include the pure virtual marker here) appear on the same line as the closing parenthesis.

Note that no line of comment characters precedes the signature. Ideally functions and methods would be fully documented at the implementation site and the documentation extracted from comments using a tool such as Doxygen. There are reasons to consider keeping the prototypes clear of too many comments, but obviously you can put them here if you are sure that is best for you.
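The sketch below puts the two forms side by side: prototypes inside a class declaration, followed by the matching definitions. The class and its members are hypothetical; the point is only where the prefix modifier (static) and the post-fix modifier (const) end up in each form.

```cpp
#include <string>

// Hypothetical class, used only to illustrate the prototype layout.
class Greeter
{
public:
    // Prefix modifier (static) stays on the first line with the name.
    static std::string Join
        (   const std::string&    left,
            const std::string&    right );

    // Postfix modifier (const) shares the line with the closing parenthesis.
    std::string Greet
        (   const std::string&    name ) const;

    std::string    greeting;
};

////////////////////////////////////////////////////////////////////////////////
std::string Greeter::Join
(
    const std::string&    left,
    const std::string&    right
)
{
    return left + " " + right;
}

////////////////////////////////////////////////////////////////////////////////
std::string Greeter::Greet
(
    const std::string&    name
)
const
{
    return Join( greeting, name );
}
```

Note that static appears only on the prototype, since C++ does not allow it to be repeated on an out-of-class definition.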

Proposal 3: Function calls

If a function call fits within a normal page width then write it on one line. Longer calls should be split across several lines, following one of these two examples:

LongMethodCall1( "some text",
                 aString,
                 anInt,
                 anotherArgument );

std::string returnVal = ReallyLongMethodNameCall
        ( "some text",
          aString,
          anInt,
          anotherArgument );

In both cases the arguments are aligned, one per line, with the parentheses on the first and last argument lines. The first argument should share a line with the function or method name itself, unless that would push the argument list so far across that any of the arguments end up beyond the page width.
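Here is a compilable sketch of the three cases, assuming a page width of 80. The function names are hypothetical and deliberately contrived so that each call lands in a different case.

```cpp
#include <string>

////////////////////////////////////////////////////////////////////////////////
// Hypothetical helper; only the layout of the calls in Demo matters.
std::string Join
(
    const std::string&    first,
    const std::string&    second,
    int                   count,
    const std::string&    last
)
{
    return first + "/" + second + "/" + std::to_string( count ) + "/" + last;
}

////////////////////////////////////////////////////////////////////////////////
// Same behaviour as Join, but with a name long enough to force a split.
std::string JoinWithAReallyLongFunctionNameThatLeavesLittleRoom
(
    const std::string&    first,
    const std::string&    second,
    int                   count,
    const std::string&    last
)
{
    return Join( first, second, count, last );
}

////////////////////////////////////////////////////////////////////////////////
std::string Demo
(
    const std::string&    aString,
    int                   anInt,
    const std::string&    anotherArgument
)
{
    // Fits within the page width: keep it on one line.
    std::string    shortCall = Join( "a", aString, anInt, anotherArgument );

    // Too long for one line: first argument shares the line with the name.
    std::string    longCall = Join( "some text that makes the call too long",
                                    aString,
                                    anInt,
                                    anotherArgument );

    // Name leaves too little room: the argument block moves to the next line.
    std::string    longest = JoinWithAReallyLongFunctionNameThatLeavesLittleRoom
            ( "some text",
              aString,
              anInt,
              anotherArgument );

    return shortCall + "|" + longCall + "|" + longest;
}
```

In the last call the function name alone already reaches the 80th column, so starting the argument list on the same line would put every argument beyond the page width.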

General Principles

The proposals above are deliberately narrow in focus, concentrating on those areas that are often left out of standards, or not sufficiently described, and where using an ad-hoc approach is often less than satisfactory. However there are some simple emergent themes that can be carried through to other areas of code:

Types and identifier names are separated into columns through alignment.
Code is kept within a page width where possible. This is especially significant for code that you need to refer to at a glance, such as function prototypes - and is often where it is most overlooked!
For "structural" code, such as function signatures, consistency is especially important, even in places where it seems unnecessary (e.g. splitting short function signatures across multiple lines - even empty constructors). For implementation code the choice of when to split can be guided by the page width.

I have made some recommendations in these proposals that are not specifically backed by the discussion in the previous articles. I will attempt to cover these in the next, and final, article in this series.
