
Coding standards and conventions


Please note: students will not, in general, lose marks for not following these conventions (with the exception of the use of profanity). These conventions are here strictly so that students are presented with a common format of coding, and so that students can clearly see the various parts of the sample code.

If someone on the instructional staff is showing other examples of C++ code not specifically prepared for this course, there is no need to alter it to satisfy this convention, so long as it is internally consistent with its own conventions.

You, as a teaching assistant, who has had much experience with C and C++ and Java, may wonder why we require a coding standard such as

    for ( std::size_t k{0}; k < capacity; ++k ) {
         array[k] = 0.0;
    }

instead of the more familiar

    for ( int k = 0; k < size; ++k ) {
         array[k] = 0.0;
    }

This is because later, when we introduce classes, a for loop will use the initializer format, so we present the same format for all. Additionally, the use of std::size_t helps the student understand the underlying representation and helps the student make the transition to professional developer, as opposed to someone who cobbles code together by the seat of their pants.

The same goes for declaring all functions before defining any of them. In any professional programming environment, this is absolutely necessary, for as projects become larger and larger, ordering the function definitions without prior function declarations can become increasingly difficult and time-consuming.


The coding conventions for the course include:

File layout

No line of text should be longer than 80 characters, with a tab character assumed to be equal to eight spaces, although generally, in course code, tabs are four spaces, and on repl.it, tabs are two spaces.

All sample code provided to or shown to students must have the following format:

// Include statements

// Class declarations
class X;

// Function declarations
void f();
int main();

// Class definitions
class X {
    // ...
};

// Main definition
int main() {
    return 0;
}

// Member function and other function definitions

No short cuts are ever to be taken, such as using the function or class definition as the simultaneous declaration. While this works great for toy examples, this is not acceptable for any industry standards. We are trying to teach students good programming practice, not late-night hacking.

While technically, main() does not have to return zero under the newest C++ standards, always end main with an explicit return 0; statement. This is so that if students migrate to C, they continue to follow this practice. Besides, it's an oddity that exactly one function has a return type of int but doesn't have to return anything.

Now, why are class declarations and function declarations so important? After all, for anything you've ever programmed so far, you didn't have to do any such thing, and everything you did seemed to work.

Suppose, however, you have classes A and B, and functions f and g, and a member variable of class A refers to class B—and here is the important point—a member variable of class B refers to class A.

You cannot define class A first, as the compiler does not know about class B, and you cannot define class B first, as it requires class A:

// This causes an error: the compiler does not know what 'B' is
//  - 'B' has not yet been declared
class A {
    private:
        B *p_b;
};

class B {
    private:
        A *p_a;
};

One 'hack' is to declare class B before class A is defined:

// Hackish solution, only declare what must be declared
//  - Seems a little childlike, eh?
class B;

class A {
    private:
        B *p_b;
};

class B {
    private:
        A *p_a;
};

This, however, now means that subsequent changes and the introduction of new classes may require you to "fix" other compilation problems. It is simpler to declare all classes first:

// Class declarations
class A;
class B;

// Class definitions
class A {
    private:
        B *p_b;
};

class B {
    private:
        A *p_a;
};

Additionally, as projects become larger and larger, class definitions become larger and larger, and therefore anyone reading the code may find it difficult to quickly view all classes in a particular file. By having a list of class declarations, the reader can quickly understand all the classes that will be defined in a particular file.

Again, we're trying to teach good programming habits, so that they become second nature, instead of learning 'tricks' that sometimes work and sometimes don't.

The same goes with functions: suppose that function f(...) calls function g() and vice versa. You have the same dilemma, so you may realize, "Oh, I've got to declare this first...":

// This declaration is necessary as 'f(...)' calls 'g(...)',
// and without this declaration, the compiler doesn't know what
// to do about the identifier 'g' it comes across.
int g( int n );

int f( int n ) {
    n /= 2;
    std::cout << n << std::endl;

    if ( n == 1 ) {
        return 1;
    } else if ( n%2 == 0 ) {
        return f(n);
    } else {
        return g(n);
    }
}

int g( int n ) {
    n = 3*n + 1;
    std::cout << n << std::endl;

    if ( n%2 == 0 ) {
        return f(n);
    } else {
        return g(n);
    }
}

Once again, it makes much more sense to just declare all functions first, and then define the functions. Again, this too helps anyone reading a file, as they need look only one place to get a brief summary of all the functions that will be defined in a file, including the identifiers, the types of the arguments, and the return types:

// Function declarations
int f( int n );
int g( int n );

// Function definitions
//    ...as above...

Later, when students start programming in C, this becomes an issue with structures and typedef. Without declarations, a linked list node definition like this does not work:

typedef struct {
    double  value;
    node_t *p_next;
} node_t;

You cannot use node_t in the structure definition because the type definition has not yet taken place. To fix this, you could give the structure a name:

typedef struct node {
    double       value;
    struct node *p_next;
} node_t;

but now everywhere else you use node_t except in the structure definition itself, where you must use struct node. The solution is to declare the type first, and then use it:

// Structure declarations
typedef struct node node_t;

// Structure definitions
struct node {
    double  value;
    node_t *p_next;
};

Hopefully this long-winded explanation helps you understand the reasons behind some of the standards we are presenting. A lot of these students are new to programming, and if we can teach them good programming habits from the start, they won't have to unlearn the bad habits.

By the way, if you are astute, you may also have noticed that a class definition does not require any other class definition, as the definition only includes the member variables and member functions. All these need are types, and all types have already been declared. (This holds as long as other classes appear only through pointers or references, as in the examples above; a member variable that is itself an instance of another class requires that class to be defined first.) It is only when you define the member functions that you may actually call a member function of a different class, but that's okay, because all the classes have been defined before any member function of any one class is defined.

Class definitions

This is a habit I, too, must break, but in a class definition, the public member functions and public member variables should come first, as this is what any reader would be primarily interested in when looking at a class. It is only the developer who needs to know the protected and private member functions and variables.

Private and protected member variables should have an underscore appended to their names. This allows, for example, parameters to take on the same names, but without the underscore. It also is one easy way to differentiate local variables from member variables.
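As a minimal sketch of this suffix convention, consider the following class. The class Point and its member functions are hypothetical and not part of the course code; the parameters x and y are used to initialize the member variables x_ and y_:

```cpp
// Class declarations
class Point;

// Class definitions
class Point {
    public:
        Point( double x, double y );

        // Queries (const)
        double x() const;
        double y() const;

        // Modifiers
        void translate( double dx, double dy );

    private:
        // The trailing underscore marks these as member variables
        double x_;
        double y_;
};

// Member function definitions
Point::Point( double x, double y ):
x_{ x },
y_{ y } {
    // Empty constructor body
}

double Point::x() const {
    return x_;
}

double Point::y() const {
    return y_;
}

void Point::translate( double dx, double dy ) {
    x_ += dx;
    y_ += dy;
}
```

Inside any member function, an identifier ending in an underscore is immediately recognizable as a member variable, while dx and dy are clearly parameters.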

Important: this convention of appending an underscore has sometimes been misinterpreted as prefixing the private member variables with an underscore. It is important you don't do this, as identifiers beginning with an underscore are reserved for the implementation in many contexts (all such identifiers in the global namespace, and any identifier beginning with an underscore followed by a capital letter), and consequently, the compiler or standard library may actually use a particular _name as a macro, which would cause no end of problems in trying to determine what is happening.

Thus, a class definition looks as follows:

class Class_name {
    public:
        // Constructors, copy and move constructors, destructors,
        // and assignment and move operators
        Class_name( ... );
       ~Class_name();
        Class_name( Class_name const &original );
        Class_name( Class_name &&original );
        Class_name &operator=( Class_name const &rhs );
        Class_name &operator=( Class_name &&rhs );

        // Queries (const)
        typename f1( ... ) const;
        typename f2( ... ) const;
        typename f3( ... ) const;
        typename f4( ... ) const;

        // Modifiers
        typename f5( ... );
        typename f6( ... );
        typename f7( ... );
        typename f8( ... );

        // etc.
    private:
        typename f9( ... );
        typename f10( ... );
        typename f11( ... );

    // List any friendships here
};

Example 1

As an example of a class that you should be able to, as a teaching assistant, understand, and that makes use of the tools seen here, please see

This mini-project defines the constructor, copy constructor, move constructor, destructor, assignment operator and move operator. The test file shows conditions under which each of these six functions is called. The output on my computer is:

Constructor called:         0x7ffc60d19a00
[0, 0, 0, 0, 0]: constructor

Copy constructor called:    0x7ffc60d19a20
[0, 0, 0, 0, 0]: copy constructor

ones( 5 ) called...
Constructor called:         0x7ffc60d199c0
Move constructor called:    0x7ffc60d19a40
Destructor called:          0x7ffc60d199c0
[1, 1, 1, 1, 0]: move constructor

Assignment operator called: 0x7ffc60d19a00
Copy constructor called:    0x7ffc60d199d0
Destructor called:          0x7ffc60d199d0
[1, 1, 1, 1, 0]: assignment operator

ones( 5 ) called...
Constructor called:         0x7ffc60d199c0
Move constructor called:    0x7ffc60d19a60
Destructor called:          0x7ffc60d199c0
Move operator called:       0x7ffc60d19a00
Destructor called:          0x7ffc60d19a60
[1, 1, 1, 1, 0]: move operator

Destructor called:          0x7ffc60d19a40
Destructor called:          0x7ffc60d19a20
Destructor called:          0x7ffc60d19a00

Here you can access this entire test case at repl.it.

Example 2

The Hofstadter Female (F) and Male (M) sequences give an example of mutual recursion, where each sequence is defined in terms of the other. This is an example where function declarations are necessary, as the compiler must be aware of the declaration of the other when implementing either of the functions. See this example on repl.it on mutual recursion.
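A minimal sketch of these two sequences follows, using the standard definitions F(0) = 1, M(0) = 0, F(n) = n - M(F(n - 1)) and M(n) = n - F(M(n - 1)). The function names hofstadter_f and hofstadter_m are chosen here for illustration and are not necessarily those used in the repl.it example:

```cpp
#include <cstddef>

// Function declarations
std::size_t hofstadter_f( std::size_t n );
std::size_t hofstadter_m( std::size_t n );

// Function definitions
std::size_t hofstadter_f( std::size_t n ) {
    if ( n == 0 ) {
        return 1;
    }

    // This call to 'hofstadter_m' compiles only because the
    // function has already been declared above
    return n - hofstadter_m( hofstadter_f( n - 1 ) );
}

std::size_t hofstadter_m( std::size_t n ) {
    if ( n == 0 ) {
        return 0;
    }

    return n - hofstadter_f( hofstadter_m( n - 1 ) );
}
```

Because both functions are declared before either is defined, the two definitions can appear in either order.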

Example 3

Here is a simple sender-receiver pair that includes a mailbox that can hold one additional message. In this case, also, both classes should be declared before they are defined, as each uses the other. Similarly, both classes must be defined before any member functions are defined.
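The following is a simplified sketch of such a pair, and not the course's actual example: the mailbox is omitted, and the class names Sender and Receiver together with all member names are assumptions made for illustration. Each class holds a pointer to the other, and each calls a member function of the other:

```cpp
// Class declarations
class Sender;
class Receiver;

// Class definitions
class Sender {
    public:
        Sender();

        // Queries (const)
        int acknowledged() const;

        // Modifiers
        void connect( Receiver *p_receiver );
        void send( int message );
        void acknowledge();

    private:
        Receiver *p_receiver_;
        int       acknowledged_;
};

class Receiver {
    public:
        Receiver( Sender *p_sender );

        // Queries (const)
        int last_message() const;

        // Modifiers
        void receive( int message );

    private:
        Sender *p_sender_;
        int     last_message_;
};

// Member function definitions
Sender::Sender():
p_receiver_{ nullptr },
acknowledged_{ 0 } {
    // Empty constructor body
}

int Sender::acknowledged() const {
    return acknowledged_;
}

void Sender::connect( Receiver *p_receiver ) {
    p_receiver_ = p_receiver;
}

void Sender::send( int message ) {
    // Calling a member function of 'Receiver' requires the
    // class definition of 'Receiver', which appears above
    if ( p_receiver_ != nullptr ) {
        p_receiver_->receive( message );
    }
}

void Sender::acknowledge() {
    ++acknowledged_;
}

Receiver::Receiver( Sender *p_sender ):
p_sender_{ p_sender },
last_message_{ 0 } {
    // Empty constructor body
}

int Receiver::last_message() const {
    return last_message_;
}

void Receiver::receive( int message ) {
    last_message_ = message;
    p_sender_->acknowledge();
}
```

The class definitions need only the class declarations, as each refers to the other through a pointer, while the member function definitions need both class definitions.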

Include directives

Try not to use the name include statement, as all preprocessor commands are directives. Thus, it should be referred to as an include directive.

A space should separate the directive #include and the opening angled bracket or opening doublequote. Thus, of the following two pairs, the first is preferred:

#include <iostream>
#include "filename.hpp"

#include<iostream>
#include"filename.hpp"

With doublequotes, the second works, but looks awkward, and thus, to be consistent, both will use the space.

Naming of symbols

To be consistent, " is a doublequote, ' is a singlequote, # is the pound symbol, & is the ampersand, ( and ) are opening and closing parentheses or round parentheses, [ and ] are opening and closing brackets or square brackets, { and } are opening and closing braces or curly braces. The adjective can be used to emphasize what you are referring to.

Global variables

Will not be used. Of course, they can be used if you are demonstrating, for example, how M_PI is defined in the cmath library.

The using statement

Will not be used. All identifiers in the std namespace will have std:: prefixed at all times. Please note, while the constants in the cmath library and the assert macro in the cassert library are identifiers in the standard library, they are not in the std namespace.

The assert macro

The identifier assert in the cassert library is a macro and not a function. Thus, for clarity, refer to any use of the macro as an assert statement or, if you are willing to describe what a macro is, as the assert macro, but try not to call it a call to the assert function. The students don't have to know this, but it's just easier.
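As a small illustration, the sketch below uses an assert statement to check a precondition. The function average and its parameters are hypothetical; note that assert is written without the std:: prefix, as a macro does not belong to any namespace:

```cpp
#include <cassert>
#include <cstddef>

// Function declarations
double average( double const a_data[], std::size_t const capacity );

// Function definitions
double average( double const a_data[], std::size_t const capacity ) {
    // An assert statement: execution stops here if the condition
    // is false (unless NDEBUG is defined)
    assert( capacity > 0 );

    double sum{ 0.0 };

    for ( std::size_t k{0}; k < capacity; ++k ) {
        sum += a_data[k];
    }

    return sum/capacity;
}
```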

Naming convention for identifiers

This course will use snake case for its naming convention.

The identifiers of all local variables that are declared const will be all uppercase letters with words separated by an underscore.

The identifiers of all other local variables, parameters, function names, member variable names, and member function names will be lower case with words separated by an underscore.

All private or protected member variables will be suffixed with a single underscore.

All class identifiers will have the first word capitalized and words separated by an underscore; e.g., Singly_linked_list.

If the last sequence of characters in an identifier are digits, it is up to the programmer to decide whether or not to separate the digits from any preceding letters by an underscore.

The underscore, an underscore followed by a sequence of digits, and the letters l, o and O by themselves will never be used as identifiers.
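The naming rules above can be seen together in the following sketch. The class Running_average and all of its members are hypothetical names chosen only to show a class identifier, member function names, suffixed member variables, a parameter, and a const local variable:

```cpp
#include <cstddef>

// Class declarations
class Running_average;

// Class definitions
class Running_average {
    public:
        Running_average();

        // Queries (const)
        double average() const;

        // Modifiers
        void add_value( double new_value );

    private:
        double      sum_;
        std::size_t value_count_;
};

// Member function definitions
Running_average::Running_average():
sum_{ 0.0 },
value_count_{ 0 } {
    // Empty constructor body
}

double Running_average::average() const {
    // A const local variable: all uppercase letters
    double const DEFAULT_AVERAGE{ 0.0 };

    if ( value_count_ == 0 ) {
        return DEFAULT_AVERAGE;
    }

    return sum_/value_count_;
}

void Running_average::add_value( double new_value ) {
    sum_ += new_value;
    ++value_count_;
}
```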

Profanity, slurs, etc. will never be acceptable, and any use of profanity may be shown to the entire class. Use of profanity may be reported to the Associate Dean of Undergraduate Studies for further investigation.

Parameters

Where reasonable, parameters should be declared const if there is no intention to change that parameter within the function body.
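For instance, none of the three parameters in the sketch below is modified, so all three are declared const. The function bounded is a hypothetical example and not part of the course code:

```cpp
// Function declarations
double bounded( double const x, double const lower, double const upper );

// Function definitions
double bounded( double const x, double const lower, double const upper ) {
    // Any accidental assignment to 'x', 'lower' or 'upper' in
    // this body would be reported by the compiler
    if ( x < lower ) {
        return lower;
    } else if ( upper < x ) {
        return upper;
    } else {
        return x;
    }
}
```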

Control statements

The body of every control statement must be surrounded by braces. The opening brace is to appear after the closing parenthesis of the control condition (e.g., if ( ... ) {, for ( ... ) {, while ( ... ) {, do {, and switch ( ... ) {) and the closing brace is to be lined up vertically with the first character of the control keyword (if, for, while, do or switch).

    for ( int k{0}; k < 10; ++k ) {
        sum += k;
    }

    if ( sum < 100 ) {
        sum = 100;
    }

The following are unacceptable:

    for ( int k{0}; k < 10; ++k )
        sum += k;

    if ( sum < 100 )
        sum = 100;

    if ( sum < 100 ) sum = 100;

The initialization statement of a for loop will use initialization braces and not assignment if the looping variable is being declared. Thus, given the following two, the first must be used:

    for ( int k{0}; k < 10; ++k ) {
        sum += k;
    }

    for ( int k = 0; k < 10; ++k ) {
        sum += k;
    }

Control statements will be separated from other code by at least one blank line. Thus, given the following two, the first must be used:

    double sum{ 0.0 };

    // Add the first ten numbers
    for ( int k{0}; k < 10; ++k ) {
        sum += k;
    }

    // Ensure that 'sum' is no less than 100.
    if ( sum < 100.0 ) {
        sum = 100.0;
    }

versus

    double sum{ 0.0 };
    // Add the first ten numbers
    for ( int k{0}; k < 10; ++k ) {
        sum += k;
    }
    // Ensure that 'sum' is no less than 100.
    if ( sum < 100.0 ) {
        sum = 100.0;
    }

For a conditional statement, there will be spaces between the if and the opening parenthesis, between that parenthesis and the condition, between the condition and the closing parenthesis, and between that parenthesis and the opening brace. The body of a conditional statement will always appear in braces. The closing brace will always be indented to the same level as the if keyword.

        if▢(▢condition▢)▢{
        ⋮   // Conditional body
        }

If a conditional has an else or one or more else-if clauses, spaces will also be inserted appropriately:

        if▢(▢condition-1▢)▢{
        ⋮   // First consequent body
        }▢else▢if▢(▢condition-2▢)▢{
        ⋮   // Second consequent body
        }▢else▢{
        ⋮   // Complementary alternative body
        }

The same holds true for while, do-while and for loops, where there will be a space after every semi-colon in a for loop:

        while▢(▢condition▢)▢{
        ⋮   // Loop body
        }

        do▢{
        ⋮   // Loop body
        }▢while▢(▢condition▢);

        for▢(▢initialization;▢condition;▢increment▢)▢{
        ⋮   // Loop body
        }

Opening braces {

Under no circumstances will an opening brace appear on a separate line. This does nothing for clarity, and it simply spreads out the code over more lines, making it more difficult to view relevant code on a single screen. Like Python, it is the indentation of the conditional or repetition statement body that separates it from the surrounding statements.

The opening brace of the body of a conditional or repetition statement or a function must immediately follow the closing parenthesis with one space between the two. The opening brace of a class definition must follow the class identifier.

Local variable declarations

Local variable declarations must initialize the local variables. If the initial value is irrelevant, use {}; however, if the initial value is intended to be a specific value, even if it is the default value, the initial value will be specified.

    int n{};
    std::cout << "Enter an integer: ";
    std::cin >> n;

and

    double sum{ 0.0 };

Local variable declarations will be separated from other statements by one blank line except in the following circumstance: when only one local variable is being declared, and the next line is immediately associated with that local variable's initial value.

    double x{};
    std::cout << "Enter a number: ";
    std::cin >> x;

and

    double x{};
    double y{};

    std::cout << "Enter a number: ";
    std::cin >> x;

Only one local variable will be declared per declaration statement. Thus, of the following two, the first will be used.

    double x{};
    double y{};

    double x{}, y{};

Floating-point literals

Floating-point literals that happen to be integers must still have a .0 appended to them so that the reader can immediately note that it is a floating-point literal, and does not have to interpret it as being an integer that will be converted to the corresponding floating-point number. Thus, given the following two initializations, the second will be used.

    double sum{ 0 };

    double sum{ 0.0 };

Binary operators

Spaces will be placed around all binary operators with the exception of *, / and %, as all three of these have equal precedence, and all three have greater precedence than binary + and -.

Omitting the spaces around the three operators *, / and % is permitted, but it is not required.

Apart from arithmetic statements, parentheses must be used when combining different binary operators. The only time that parentheses may be omitted is when it is the same binary operator; for example,

    if ( (x < y) || (y + 5 < x) || (x*x < y + 5) || (x*y < 10) ) {
        // Consequent body
    }

If, however, it were a combination of logical && and || operators, parentheses must be used to indicate the order of operations, even if precedence alone would give the same behavior without them.

As an aside, if this condition was too long, it should be split into two lines along the operators:

    if ( (x < y) || (x > y + 5) || (x*x < y + 5)
                    || (x*y < 10) ) {
        // Consequent body
    }
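To illustrate the requirement to parenthesize a mixture of && and ||, here is a small sketch. The function is_leap_year is a hypothetical example; the inner parentheses do not change the result, but they spare the reader from recalling that && has higher precedence than ||:

```cpp
// Function declarations
bool is_leap_year( int const year );

// Function definitions
bool is_leap_year( int const year ) {
    // Parentheses make the grouping of '&&' and '||' explicit
    if ( ((year%4 == 0) && (year%100 != 0)) || (year%400 == 0) ) {
        return true;
    } else {
        return false;
    }
}
```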

< versus >

In general, given a choice of

    if ( x < y ) {
        // Consequent body
    }

and

    if ( y > x ) {
        // Consequent body
    }

prefer the one that uses < or <=, as the anticipated smaller value is on the left (as the first operand), and the larger object on the right (as the second operand).

One exception to this may be a comparison with zero, such as

    if ( x > 0 ) {
        // Consequent body if 'x' is positive
    }

Unary operators

No space will be included between a unary operator and its operand. This includes the declaration of a pointer. Thus, of the following, the first must be used:

    double *p_x{ nullptr };

    double * p_x{ nullptr };

The conditional operator

Spaces will be placed around the ? and : of the conditional operator (if ever used). Any operand of a conditional operator that is more than a literal, a local variable, a parameter, a function call, or a unary operator and its operand will be surrounded with parentheses. Thus, of the following two pairs, the first will be used:

    return ( x < 0 ) ? -x : x;
    return ( x < y ) ? (y - x) : (x - y);

    return x<0?-x:x;
    return x < y ? y - x : x - y;

Pointer parameters and local variables

All local variables and parameters that are intended to be used as pointers and not arrays must be prefixed with a p_.

All array names will either be prefixed with an a_ or given a name that suggests multiple entries or has the word array in the name.
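A short sketch of both prefixes follows. The functions sum_of_entries and set_to_zero are hypothetical; the first parameter of sum_of_entries is intended to be used as an array, while the parameter of set_to_zero is intended to be used as a pointer to a single value:

```cpp
#include <cstddef>

// Function declarations
double sum_of_entries( double const a_data[], std::size_t const capacity );
void set_to_zero( double *p_value );

// Function definitions
double sum_of_entries( double const a_data[], std::size_t const capacity ) {
    double sum{ 0.0 };

    for ( std::size_t k{0}; k < capacity; ++k ) {
        sum += a_data[k];
    }

    return sum;
}

void set_to_zero( double *p_value ) {
    *p_value = 0.0;
}
```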

Array capacity versus size

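The number of entries allocated for an array is its capacity, while the number of entries currently in use is its size. The Standard Template Library uses capacity in this sense (for example, std::vector has both capacity() and size() member functions), and thus, we will use the same word. In many cases where a naming convention is used in the STL, we should adopt similar names in this course so that when a student goes to use the STL, it becomes an easy and obvious transition.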
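The difference between the two words can be seen with std::vector, which has both a size() and a capacity() member function. The helper function has_spare_capacity below is a hypothetical name used only for this sketch:

```cpp
#include <cstddef>
#include <vector>

// Function declarations
bool has_spare_capacity( std::vector<double> const &data );

// Function definitions
bool has_spare_capacity( std::vector<double> const &data ) {
    // size() is the number of entries currently stored, while
    // capacity() is the number that can be stored before the
    // vector must allocate more memory
    return data.size() < data.capacity();
}
```

For example, after calling reserve( 10 ) on an empty vector, its size is still zero, but its capacity is at least ten.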

The sizeof operator

The sizeof keyword represents an operator and not a function. There are special cases where the sizeof operator can be used without parentheses, but you will always use parentheses with this operator.

Do not call it the sizeof function. If a student asks, tell them it is a unary operator that is a keyword and not a symbol and that it generally requires parentheses.
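As a brief sketch, the hypothetical function entry_count below applies the sizeof operator, always with parentheses, to determine the capacity of a local array:

```cpp
#include <cstddef>

// Function declarations
std::size_t entry_count();

// Function definitions
std::size_t entry_count() {
    double a_data[5]{};

    // The number of bytes occupied by the array divided by the
    // number of bytes occupied by one entry
    return sizeof( a_data )/sizeof( a_data[0] );
}
```

This works only where a_data is declared as an array; if the array is passed to a function, the parameter is a pointer, and sizeof yields the size of that pointer.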

Statements spanning multiple lines

If a statement spans multiple lines, the next line should begin with an operator, unless the entire line is surrounded by parentheses, in which case, it should use indentation similar to control statements.

If a line begins with an operator, it should line up appropriately with operators in the previous line.

     std::cout << "This is some text that is being printed..."
               << "and it continues here..." << std::endl;

     sum += a + b + c + d + e + f + g*(
        h + i + j + k + m + n + n
     ) + p + q + r;

     sum += a + b + c + d + e + f + g
              + h + i + j + k + m + n + n;

Assignment and automatic assignment operators

Always read the statement

     x = 3 + 4*(y - z);

as x is assigned the value of three plus four times quantity y minus z, or x is assigned the value of the right-hand side. Never read this as x equals .... It is not a statement of equality, it is a statement of assignment, and some programming languages use a much more intelligent choice of symbol for assignment, e.g.,

     x <- 3 + 4*(y - z);

The assignment and automatic assignment operators will always appear on a line of their own, and not part of a condition in a control statement or as a part of a larger statement. Thus, of the following the first must be used.

     ++k;
     array[k];
     data[m];
     ++m;

     a = f();

     if ( a != 0 ) {
          std::cout << a << std::endl;
     }

versus

     array[++k];
     data[m++];

     if ( (a = f()) != 0 ) {
          std::cout << a << std::endl;
     }

The one exception is the very common

    a = b = c = d = 0;

In all coding, this author uses ++k and not k++ because the second requires the compiler to save the previous value of k even if it is never used as a return value of this operation. Try out your compiler some time and see whether or not the following change increases the executable size by one or two instructions (it certainly does in Java, and in a recent version of the non-optimized GNU G++ compiler):

    int sum{ 0 };

    for ( int k{0}; k <= 100; ++k ) {
        sum += k;
    }

versus

    int sum{ 0 };

    for ( int k{0}; k <= 100; k++ ) {
        sum += k;
    }

Note if you optimize this, a good compiler may simply replace all of this with the single initialization:

    int sum{ 5050 };

Comments

A comment is only allowed on the same line as a statement if the comment succinctly describes the purpose of the statement and is not meant to line up with any other statements.

Otherwise, a comment will be on its own line and prior to the statement, block of statements, or control statement being described.

Only // comments will be used, because if this is the case, then if it becomes necessary to comment out an entire section of code, it can always be successfully accomplished using the /* */ style of comments.

Constructor initialization

The initialization of member variables in the constructor must be done through the member initializer list. The colon will appear as the last character on the line containing the constructor name and parameters, and the member variables will be listed in the same order they appear in the class definition.

If the constructor body does nothing, the constructor body must include a comment that indicates this, as shown below.

Class_name::Class_name( ... ):
member_1_{ ... },
member_2_{ ... },
member_3_{ ... },
member_4_{ ... },
member_5_{ ... } {
    // Empty constructor body
}

Please note that if one member variable must be initialized before it is used in the initialization of a second, the first must appear above the second in the class definition. Of the following two, only the first will work:

class Array {
    public:
        Array( std::size_t n );

    private:
        std::size_t capacity_;
        double     *array_;
};

Array::Array( std::size_t n ):
capacity_{n},
array_{ new double[capacity_]{} } {
    for ( std::size_t k{0}; k < capacity_; ++k ) {
        array_[k] = k;
    }
}

This does not work, because the memory for the array will be allocated before the member variable capacity_ is initialized:

class Array {
    public:
        Array( std::size_t n );

    private:
        // Wrong order
        double     *array_;
        std::size_t capacity_;
};

// Even though these are in the "right" order here,
// the compiler will initialize them in the order
// that the member variables appear in the class definition!
Array::Array( std::size_t n ):
capacity_{n},
array_{ new double[capacity_]{} } {
    for ( std::size_t k{0}; k < capacity_; ++k ) {
        array_[k] = k;
    }
}

Also note, in the constructor, there may be two options, to use the parameter or to use the member variable. If you look at the above code, the constructor definition could have just as easily been:

Array::Array( std::size_t n ):
capacity_{n},
array_{ new double[capacity_]{} } {
    for ( std::size_t k{0}; k < n; ++k ) {
        array_[k] = k;
    }
}

because, after all, n and capacity_ are the same, right? Well, n is a parameter to the constructor, while capacity_ is the member variable representing the capacity of the array. The second is linked to the array, while the first is only linked to the constructor. It would be bizarre if everywhere in a class, a for loop goes to capacity_, but in the constructor it goes up to n.

Additionally, what happens if a change to the constructor is made:

Array::Array( std::size_t n ):
capacity_{ std::max<std::size_t>( 2, n ) },
array_{ new double[capacity_]{} } {
    for ( std::size_t k{0}; k < n; ++k ) {
        array_[k] = k;
    }
}

Now the code no longer executes correctly, as the capacity and the condition of the for loop are no longer the same.
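A sketch of the class with the loop bounded by the member variable is shown below. The destructor, the deleted copy operations and the two query member functions are additions made here so that the sketch is complete; they do not appear in the fragments above. The explicit template argument in std::max<std::size_t> is needed because the literal 2 is an int while n is a std::size_t:

```cpp
#include <algorithm>
#include <cstddef>

// Class declarations
class Array;

// Class definitions
class Array {
    public:
        Array( std::size_t n );
       ~Array();
        Array( Array const &original ) = delete;
        Array &operator=( Array const &rhs ) = delete;

        // Queries (const)
        std::size_t capacity() const;
        double get( std::size_t const k ) const;

    private:
        // 'capacity_' must appear before 'array_', as it is
        // used in the initialization of 'array_'
        std::size_t capacity_;
        double     *array_;
};

// Member function definitions
Array::Array( std::size_t n ):
capacity_{ std::max<std::size_t>( 2, n ) },
array_{ new double[capacity_]{} } {
    // Loop up to the member variable, not the parameter
    for ( std::size_t k{0}; k < capacity_; ++k ) {
        array_[k] = k;
    }
}

Array::~Array() {
    delete[] array_;
    array_ = nullptr;
    capacity_ = 0;
}

std::size_t Array::capacity() const {
    return capacity_;
}

double Array::get( std::size_t const k ) const {
    return array_[k];
}
```

With this version, a call such as Array{ 0 } still initializes both entries of the array, as the loop follows capacity_ and not n.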

Positive and negative

For this class, a positive integer or float is any value greater than zero, and a negative integer or float is any value less than zero.

Rather than using non-negative or non-positive to refer to the positive numbers including zero or the negative numbers including zero, respectively, this author tends to make a statement such as positive or zero for clarity.

While floating-point numbers do have signed zeros, they are still considered zero, as positive zero equals negative zero when using ==.
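The behavior of the two zeros can be checked with a short sketch. The functions is_zero and is_negative_zero are hypothetical names; std::signbit from the cmath library reports the sign bit, which is the only way shown here to tell the two zeros apart:

```cpp
#include <cmath>

// Function declarations
bool is_zero( double const x );
bool is_negative_zero( double const x );

// Function definitions
bool is_zero( double const x ) {
    // Both +0.0 and -0.0 compare equal to 0.0
    return x == 0.0;
}

bool is_negative_zero( double const x ) {
    // The comparison alone cannot distinguish the two zeros,
    // so the sign bit must also be examined
    return (x == 0.0) && std::signbit( x );
}
```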

Printing

If source code is being copied to a document in a word-processing application, the source code should be formatted using single line spacing and a monospaced typeface such as Consolas (preferable) or Courier New.

Note: Consolas is preferable to Courier New, as Consolas has a slashed-zero (to differentiate it from a capital-O) and the 1, lower-case l, and capital-I are significantly differentiated.

References

From Steve Oualline in Practical C Programming,

There are fifteen precedence rules in C (&& comes before || comes before ?:). The practical programmer reduces these to two:

  1. Multiplication and division come before addition and subtraction.
  2. Put parentheses around everything else.

This is an excellent read, although this class does not implement all the rules. For example, in an embedded system, there is to be one and only one return statement in any function.

JPL Institutional Coding Standard for the C Programming Language.

While I detest any attempt to restrict anything useful to "ten", a number that has any real significance only because we have ten fingers, The Power of 10: Rules for Developing Safety-Critical Code is nevertheless worth reading. No doubt, had we had twelve fingers, we would be using Base 12, and the title of this document would still be "The Power of 10: Rules for Developing...", only in Base 12, the digits "10" represent a dozen, so this author would have expanded these ten to be twelve rules.

Additions to this document come from the MTE 121 document "Proper Formatting and Commenting of Type C/C++ Code" authored by Carol Hulls.