C++ coding guidelines
Contents |
Introduction
This note gives a set of guidelines and recommendations for coding in C++ for the SCTau experiment.
There are several reasons for maintaining and following a set of programming guidelines. First, by following some rules, one can avoid some common errors and pitfalls in C++ programming, and thus have more reliable code. But even more important: a computer program should not only tell the machine what to do, but it should also tell other people what you want the machine to do. This is obviously important any time when you have many people working on a given piece of software, and such considerations would naturally lead to code that is easy to read and understand. Think of writing SCTau code as another form of publication, and take the same care as you would writing up an analysis for colleagues.
This document is derived from the ATLAS C++ coding guidelines [1]
This note is not intended to be a fixed set of rigid rules. Rather, it should evolve as experience warrants.
Naming
This section contains guidelines on how to name objects in a program.
Naming of files
Each class should have one header file, ending with ".h", and one implementation file, ending with ".cpp". [source-naming] Some exceptions: Small classes used as helpers for another class should generally not go in their own file, but should instead be placed with the larger class. Sometimes several very closely related classes may be grouped together in a single file; in that case, the files should be named after whichever is the "primary" class. A number of related small helper classes (not associated with a particular larger class) may be grouped together in a single file, which should be given a descriptive name. An example of the latter could be a set of classes used as exceptions for a package.
For classes in a namespace, the namespace should not be included in the file name. For example, the header for Trk::Track should be called Track.h.
Implementation (.cpp) files that would be empty may be omitted.
The use of the ".h" suffix for headers is a practice following many other HEP experiments; however, it is unfortunate since language-sensitive editors will then default to using "C" rather than "C++" mode for these files. For emacs, it can help to put a line like this at the start of the file:
// This file is really -*- C++ -*-.
Meaningful names
Choose names based on pronounceable English words, common abbreviations, or acronyms widely used in the experiment, except for loop iteration variables. [meaningful-names] For example, nameLength is better than nLn.
Use names that are English and self-descriptive. Abbreviations and/or acronyms used should be of common use within the community.
Do not create very similar names. [no-similar-names] In particular, avoid names that differ only in case. For example, track / Track; c1 / cl; XO / X0.
Required naming conventions:
Generally speaking, you should try to match the conventions used by whatever package you're working on. But please try to always follow these rules:
- Use prefix m_ for private/protected data members of classes. [data-member-naming]
Use a lowercase letter after the prefix m_.
- Do not start any other names with m_. [m-prefix-reserved]
- Do not start names with an underscore. Do not use names that contain anywhere a double underscore. [system-reserved-names]
Such names are reserved for the use of the compiler and system libraries.
The precise rule is that names that contain a double underscore or which start with an underscore followed by an uppercase letter are reserved anywhere, and all other names starting with an underscore are reserved in the global namespace. However, it's good practice to just avoid all names starting with an underscore.
Recommended naming conventions
If there is no already-established naming convention for the package you're working on, the following guidelines are recommended as being generally consistent with SCTau usage.
- Use prefix s_ for private/protected static data members of classes. [static-members]
Use a lowercase letter after the prefix s_.
- The choice of namespace names should be agreed to by the communities concerned. [namespace-naming]
Don't proliferate namespaces. If the community developing the code has a namespace defined already, use it rather than defining a new one. Examples include Trk:: for tracking and InDet:: for inner detector.
- Use namespaces to avoid name conflicts between classes. [use-namespaces]
A name clash occurs when a name is defined in more than one place. For example, two different class libraries could give two different classes the same name. If you try to use many class libraries at the same time, there is a fair chance that you will be unable to compile and link the program because of name clashes. To solve the problem you can use a namespace.
New code should preferably be put in a namespace, unless typical SCTau usage is otherwise. For example, SCTau classes related to the calorimeter conventionally start with "Calo" rather than being in a C++ namespace.
- Start class and enum types with an uppercase letter. [class-naming]
class Track; enum State { green, yellow, red };
- Typedef names should start with an uppercase letter if they are public and treated as classes. [typedef-naming]
typedef vector<MCParticleKinematics*> TrackVector;
- Alternatively, a typedef name may start with a lower-case letter and end with _t. [typedef-naming-2]
This form should be reserved for type names which are not treated as classes (e.g., a name for a fundamental type) or names which are private to a class.
typedef unsigned int mycounter_t;
- Start names of variables, members, and functions with a lowercase letter. [variable-and-function-naming]
double energy; void extrapolate();
Names starting with s_ and m_ should have a lowercase letter following the underscore.
Exceptions may be made for the case where the name is following standard physics or mathematical notation that would require an uppercase letter; for example, uppercase E for energy.
- In names that consist of more than one word, write the words together, and start each word that follows the first one with an uppercase letter. [compound-names]
class OuterTrackerDigit; double depositedEnergy; void findTrack();
Some SCTau packages may also use the convention that names are entirely lowercase and separated by underscores. When modifying an existing package, you should try to be consistent with the existing naming convention.
- All package names in the release must be unique, independent of the package's location in the hierarchy. [unique-package-names]
If there is a package, say "A/B/C", already existing, another package may not have the name "D/E/C" because that "C" has already been used. This is required for proper functioning of the build system.
- Underscores should be avoided in package names. [no-underscores-in-package-names]
Underscores should be avoided unless they really help with readability and help in avoiding spelling mistakes. TRTTracker looks odd because of the double "T". Using underscores in package names will also add to confusion in the multiple-inclusion protection lines.
- Acronyms should be written as all uppercase. [uppercase-acronyms]
METReconstruction, not MetReconstruction MuonCSCValidation, not MuonCscValidation
Coding
This section contains a set of items regarding the "content" of the code. Organization of the code, control flow, object life cycle, conversions, object-oriented programming, error handling, parts of C++ to avoid, portability, are all examples of issues that are covered here.
The purpose of the following items is to highlight some useful ways to exploit the features of the programming language, and to identify some common or potential errors to avoid.
Organizing the code
- Header files must begin and end with multiple-inclusion protection. [header-guards]
#ifndef PACKAGE_CLASS_H #define PACKAGE_CLASS_H // The text of the header goes in here ... #endif // PACKAGE_CLASS_H
Header files are often included many times in a program. Because C++ does not allow multiple definitions of a class, it is necessary to prevent the compiler from reading the definitions more than once.
The include guard should include both the package name and class name, to ensure that is unique.
Don't start the include guard name with an underscore!
In some rare cases, a file may be intended to be included multiple times, and thus not have an include guard. Such files should be explicitly commented as such, and should usually have an extension other than ".h".
- Use forward declaration instead of including a header file, if this is sufficient. [forward-declarations]
class Line; class Point { public: // Distance from a line Number distance(const Line& line) const; };
Here it is sufficient to say that Line is a class, without giving details which are inside its header. This saves time in compilation and avoids an apparent dependency upon the Line header file.
Be careful, however: this does not work if Line is actually a typedef
.
- Each header file must contain the declaration for one class only, except for embedded or very tightly coupled classes or collections of small helper classes. [one-class-per-source]
This makes your source code files easier to read. This also improves the version control of the files; for example the file containing a stable class declaration can be committed and not changed any more.
Some exceptions: Small classes used as helpers for another class should generally not go in their own file, but should instead be placed with the larger class. Sometimes several very closely related classes may be grouped together in a single file; in that case, the files should be named after whichever is the "primary" class. A number of related small helper classes (not associated with a particular larger class) may be grouped together in a single file, which should be given a descriptive name. An example of the latter could be a set of classes used as exceptions for a package.
- Implementation files must hold the member function definitions for the class(es) declared in the corresponding header file. [implementation-file]
This is for the same reason as for the previous item.
- Ordering of #include statements. [include-ordering]
#include
directives should generally be listed according to dependency ordering, with the files that have the most dependencies coming first. This implies that the first #include in a ".cpp" file should be the corresponding ".h" file, followed by other #include
directives from the same package. These would then be followed by #include
directives for other packages, again ordered from most to least dependent. Finally, system #include
directives should come last.
// Example for CaloCell.cxx // First the corresponding header. #include "CaloEvent/CaloCell.h" // The headers from other SCTau packages, // from most to least dependent. #include "CaloDetDescr/CaloDetDescrElement.h" #include "SGTools/BaseInfo.h" // Headers from external packages. #include "CLHEP/Geometry/Vector3D.h" #include "CLHEP/Geometry/Point3D.h" // System headers. #include <cmath>
Ordering the #include
directives in this way gives the best chance of catching problems where headers fail to include other headers that they depend on.
Some old guides recommended testing on the C++ header guard around the #include directive. This advice is now obsolete and should be avoided.
// Obsolete --- don't do this anymore. #ifndef MYPACKAGE_MYHEADER_H # include "MyPackage/MyHeader.h" #endif
The rationale for this was to avoid having the preprocessor do redundant reads of the header file. However, current C++ compilers do this optimization on their own, so this serves only to clutter the source.
- Do not use "using" directives or declarations in headers or prior to an #include. [no-using-in-headers]
A using directive or declaration imports names from one namespace into another, often the global namespace.
This does, however, lead to pollution of the global namespace. This can be manageable if it's for a single source file; however, if the directive is in a header file, it can affect many different source files. In most cases, the author of these sources won't be expecting this.
Having using in a header can also hide errors. For example:
// In first header A.h: using namespace std; // In second header B.h: #include "A.h" // In source file B.cxx #include "B.h" ... vector<int> x; // Missing std!
Here, a reference to std::vector
in B.cxx is mistakenly written without the std::
qualifier. However, it works anyway because of the using directive in A.h. But imagine that later B.h is revised so that it no longer uses anything from A.h, so the #include of A.h is removed. Suddenly, the reference to vector in B.cxx no longer compiles. Now imagine there are several more layers of #include and potentially hundreds of affected source files. To try to prevent problems like this, headers should not use using outside of classes. (Within a class definition, using can have a different meaning that is not covered by this rule.)
For similar reasons, if you have a using directive or declaration in a ".cxx" file, it should come after all #include
directives. Otherwise, the using may serve to hide problems with missing namespace qualifications in the headers.
Starting with C++11, using can also be used in ways similar to typedef
. Such usage is not covered by this rule.
Control flow
- Do not change a loop variable inside a for loop block. [do-not-modify-for-variable]
When you write a for loop, it is highly confusing and error-prone to change the loop variable within the loop body rather than inside the expression executed after each iteration. It may also inhibit many of the loop optimizations that the compiler can perform.
- Prefer range-based for loops. [prefer-range-based-for]
C++11 introduced the "range-based for". Prefer using this to a loop with explicit iterators; that is, prefer:
std::vector<int> v = ...; for (int x : v) { doSomething (x); }
to
std::vector<int> v = ...; for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) { doSomething (*it); }
In some cases you can't make this replacement; for example, if you need to call methods on the iterator itself, or you need to manage multiple iterators within the loop. But most simple loops over STL ranges are more simply written with a range-based for.
- All switch statements must have a default clause. [switch-default]
In some cases the default clause can never be reached because there are case labels for all possible enum values in the switch statement, but by having such an unreachable default clause you show a potential reader that you know what you are doing.
You also provide for future changes. If an additional enum value is added, the switch statement should not just silently ignore the new value. Instead, it should in some way notify the programmer that the switch statement must be changed; for example, you could throw an exception
- Each clause of a switch statement must end with break. [switch-break]
// somewhere specified: enum Colors { GREEN, RED } // semaphore of type Colors switch(semaphore) { case GREEN: // statement break; case RED: // statement break; default: // unforeseen color; it is a bug // do some action to signal it }
If you must "fall through" from one switch clause to another (excluding the trivial case of a clause with no statements), this must be explicitly stated in a comment. This should, however, be a rare case.
switch (case) { case 1: doSomething(); /* FALLTHROUGH */ case 2: doSomethingMore(); break; ...
gcc7 will warn about such constructs unless you use a comment like in the example above. (C++17 will add a fallthrough attribute.)
- An if-statement which does not fit in one line must have braces around the conditional statement. [if-bracing]
This makes code much more readable and reliable, by clearly showing the flow paths.
The addition of a final else is particularly important in the case where you have if/else-if. To be safe, even single statements should be explicitly blocked by {}.
if (val == thresholdMin) { statement; } else if (val == thresholdMax) { statement; } else { statement; // handles all other (unforeseen) cases }
- Do not use goto. [no-goto]
Use break
or continue
instead.
This statement remains valid also in the case of nested loops, where the use of control variables can easily allow to break the loop, without using goto
.
goto
statements decrease readability and maintainability and make testing difficult by increasing the complexity of the code.
If goto
statements must be used, it's better to use them for forward branching than backwards, and the functions involved should be kept short.