CPP_STYLE_GUIDE.md
This guide describes the conventions for formatting C/C++ code. Consistent code formatting (following coding conventions) is essential for collaboration on large projects, because it makes it significantly easier and faster to read someone else's code.
The point is that you don't write a program for a computer ("But it works!") or for yourself ("Just ask and I'll tell you how it works"), but for all developers who are reading, modifying, and debugging it, and their needs should be respected.
Please be aware that use of imperative, object-oriented C++ (version C++14) is assumed. This means:
Currently, a small subset of code in the source tree doesn't conform to the standards outlined below (partly because some of it was written more than 15 years ago). Nevertheless, this standard must be followed when writing new code.
To automatically format C++ files, use the ya style command. It is based on clang-format with the correct config (located at devtools/ya/handlers/style/config), which can be used separately if your editor uses clang-format directly.
A name should reflect the essence of the data, the type, or the action that it names. Only commonly-used abbreviations are allowed in names. Conventional single-letter names (i, j, k) are only allowed for counters and iterators. Structures are also classes, and everything related to classes also applies to structures (unless explicitly stated otherwise).
Local and global variables begin with a lowercase letter.
Function names begin with an uppercase letter.
Function pointers, like ordinary variables, begin with a lowercase letter: auto localFunction = [&]() { ... }
Function arguments begin with a lowercase letter.
Class members begin with an uppercase letter.
Class methods begin with an uppercase letter.
Class names and type definitions (typedefs) are preceded by the prefix T, followed by the name of the class beginning with an uppercase letter. The names of virtual interfaces start with 'I'.
All global constants and defines are fully capitalized.
Tokens in complex names of variables and functions are differentiated by capitalizing the first letter of the token (without inserting an underscore). Tokens in fully capitalized names of constants are separated by underscores.
Using the underscore as the first character of a name is prohibited.
Hungarian notation is prohibited.
class TClass {
public:
    int Size;
    int GetSize() {
        return Size;
    }
};
TClass object;
int GetValue();
void SetValue(int val);
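A slightly fuller sketch of the naming rules above. Every name in it (MAX_QUEUE_SIZE, IQueue, TBoundedQueue, FillQueue) is hypothetical, and std::string is used only to keep the example self-contained:

```cpp
#include <cstddef>
#include <string>

// Global constant: fully capitalized, tokens separated by underscores.
const std::size_t MAX_QUEUE_SIZE = 16;

// Interface: prefix I.
class IQueue {
public:
    virtual ~IQueue() = default;
    virtual bool Push(const std::string& item) = 0;
    virtual std::size_t GetSize() const = 0;
};

// Class: prefix T; methods and members start with an uppercase letter.
class TBoundedQueue: public IQueue {
public:
    bool Push(const std::string& item) override {
        if (ItemCount == MAX_QUEUE_SIZE) {
            return false;
        }
        LastItem = item;
        ++ItemCount;
        return true;
    }

    std::size_t GetSize() const override {
        return ItemCount;
    }

private:
    std::size_t ItemCount = 0;
    std::string LastItem;
};

// Function: uppercase first letter; arguments and locals: lowercase first letter.
std::size_t FillQueue(IQueue& queue, std::size_t requestedCount) {
    std::size_t pushedCount = 0;
    for (std::size_t i = 0; i < requestedCount; ++i) {
        if (!queue.Push("item")) {
            break;
        }
        ++pushedCount;
    }
    return pushedCount;
}
```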
Exception: The names of functions, classes, and so on that mimic or extend functions of standard libraries (libc, stl, etc.) should follow the library's naming convention. Examples are TVector, fget, autoarray, sprintf, equivalents of the main function. These classes and functions are usually located in /util.
If you do have to use a macro, you need to make sure that it is unique (for example, it should match the hierarchy of directories in the path to this file). If the macro is intended to be used as part of your library's API, then the macro must have the Y_ prefix (for example, see the macros in util/system/compiler.h).
Global enums should be named using the same rules as for classes, but with a capital letter E. The members of these enums should be named using all capital letters, just as for global constants, which is what they actually are. Names must have a prefix formed by the first letters of the enum.
enum EFetchType {
    FT_SIMPLE,
    FT_RELFORM_DEBUG,
    FT_ATR_DEBUG,
    FT_SELECTED,
    FT_ATR_SELECTED,
    FT_LARGE_YA
};
For enum members of a class, follow the same rules as for other members of a class, since this is the same thing as constant members:
class TIndicator {
public:
    enum EStatus {
        Created,
        Running,
        Suspended,
        Aborted,
        Failed,
        Finished
    };
    ...
};
Unnamed enums are allowed only in class members:
class TFile {
public:
    enum {
        Invalid = -1
    };
    ...
};
C++11 enums should be formed in the same way as enum class members, since they have a similar scope.
enum class EStatus {
    Created,
    Running,
    Suspended,
    Aborted,
    Failed,
    Finished
};
Do not create your own constructions for converting enums to TString and back. Use GENERATE_ENUM_SERIALIZATION.
Instead of the last field with the number of fields in the enum, you can use GENERATE_ENUM_SERIALIZATION_WITH_HEADER.
Try to make sure that the program makes sense in English, meaning it resembles a cohesive and meaningful English text:
For counter variables, do not use the names DocNum, NumDoc, DocsCount, DocsNum, and CountDoc because they are ungrammatical and ambiguous. For the number of elements (such as documents), you can use NumDocs or DocCount. For a function that explicitly counts this number for a long time, CountDocs() is acceptable.
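As an illustration of the noun-versus-verb distinction, here is a minimal sketch. TDoc, its Deleted flag, and CountLiveDocs are made up for this example:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical document record.
struct TDoc {
    bool Deleted = false;
};

// Walks the whole collection, so the name is a verb phrase: "count live docs".
std::size_t CountLiveDocs(const std::vector<TDoc>& docs) {
    std::size_t liveDocCount = 0; // a quantity, so the name is a noun phrase
    for (const TDoc& doc : docs) {
        if (!doc.Deleted) {
            ++liveDocCount;
        }
    }
    return liveDocCount;
}
```

Read aloud, "count live docs" is an action and "live doc count" is a number, which is the distinction the rule asks for.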
Don't use tabs in a text editor. This is the only way to ensure that your program is readable on any device. Make sure that your text editor has an option to replace the tab character with spaces. For example, in the TextPad editor, select the "Convert new tabs to spaces" option.
Our standard indent is 4 spaces. The indent should be filled with spaces, even if you use the Tab key.
For block statements, use the 1TBS style:
if (something) { // K&R style
    One();
    Two();
} else {
    Three();
    Four();
}
for (int i = 0; i < N; ++i) { // K&R style
    // do something...
}
Multi-line conditions are an exception (if the condition doesn't fit on one line, split it into several), and they are written like this:
if (a && b && c &&
    d && e)
{
    Op();
}
For functions and methods, you can use either of two styles:
Func1(a, b, c)
{
}
or
Func1(a, b, c) {
}
The style of the curly brackets must be consistent within the same file.
Short blocks
Single-line bodies of statements and inline functions must begin on a new line. Bodies placed on the same line as the statement or function header make debugging difficult.
if (something)
    A();
The subordinate statement must not be empty. Not allowed:
for (int i = 0; i < 100; i++);
The reason is that this looks like a typo that wasn't caught at compile time.
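When all the work of a loop fits into its header, one way to satisfy the rule is to move the step into an explicit body. The sketch below uses a hypothetical helper, SkipSpaces, for illustration:

```cpp
#include <cstddef>

// Returns the index of the first character in text that is not a space.
std::size_t SkipSpaces(const char* text) {
    std::size_t pos = 0;
    // Not allowed: for (; text[pos] == ' '; ++pos);
    while (text[pos] == ' ') {
        ++pos;
    }
    return pos;
}
```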
Statements
Don't use more than one statement per line.
Blank lines
We recommend leaving blank lines between separate logical blocks of code. This greatly improves readability.
All operator symbols, with the exception of unary operators and the member access operator for structures, should have a space on both sides:
a = b;
x += 3;
z = a / 6;
This includes the assignment operator. In other words, write:
if (!x.a || ~(b->c - e::d) == 0)
    z = 0;
void F() throw () {
}
struct T {
    void F() const throw () {
    }
};
Do not put a space after a function name, after the opening parenthesis, or before the closing parenthesis:
Func(a, b, c);
Do put a space between the keyword and the opening parenthesis:
if ()
for ()
while ()
The spaces inside brackets should look like this:
Func(a, b, c);
(a + b)
Inside a range-based for:
for (auto& x : c) {
}
Asymmetric spaces are not allowed.
When instantiating templates, use angle brackets without spaces.
vector<vector<int>> matrix;
There shouldn't be any spaces at the end of a line. Use the options in your text editor to control this.
Settings in text editors
TextPad: "Strip trailing spaces from lines when saving".
Vim
augroup vimrc
" Automatically delete trailing DOS-returns and whitespace on file open and
" write.
autocmd BufRead,BufWritePre,FileWritePre * silent! %s/[\r \t]\+$//
augroup END
Emacs
(add-hook 'c-mode-common-hook
  (lambda () (add-to-list 'write-file-functions 'delete-trailing-whitespace)))
Single-line lambdas are allowed only in one case: to define the function where it is used. However, the lambda function itself should not violate the other rules of the style guide:
Sort(a.begin(), a.end(), [](int x, int y) -> bool {return x < y;}); //OK
Sort(a.begin(), a.end(), [](int x, int y) -> bool {int z = x - y; return z < 0;}); //not OK - you can't have 2 statements on the same line
In all other cases, they should be formatted as follows:
auto f = [](int x, int y) -> bool { //K&R style, the same as for for/if/while
    return x < y;
};
// you can also use 'auto&& f' if the lambda function is "heavy"
Sort(a.begin(), a.end(), f);
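A self-contained sketch of both forms, using std::sort in place of Sort so that it compiles outside the source tree. SortedDescending and SortedByAbsoluteValue are hypothetical helpers:

```cpp
#include <algorithm>
#include <vector>

std::vector<int> SortedDescending(std::vector<int> values) {
    // Single-statement lambda defined where it is used: one line is fine.
    std::sort(values.begin(), values.end(), [](int x, int y) -> bool { return x > y; });
    return values;
}

std::vector<int> SortedByAbsoluteValue(std::vector<int> values) {
    // More than one statement: the lambda is formatted as a block.
    auto lessByAbs = [](int x, int y) -> bool {
        const int absX = x < 0 ? -x : x;
        const int absY = y < 0 ? -y : y;
        return absX < absY;
    };
    std::sort(values.begin(), values.end(), lessByAbs);
    return values;
}
```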
The preferred format is "one declaration per line." It is allowed to declare multiple variables of the same type on the same line. It is not allowed to mix arrays, pointers, references, and simple types. Do not use line wrapping in the declaration.
int level; // preferred
int size;
int level, size; // allowed
int level,
    size; // prohibited: line wrapping
int level, array[16], *pValue; // prohibited: mixed types
A structure can only contain public members. You don't need to specify public for it. If the structure contains anything other than members, a constructor, and a destructor, we recommend that you declare it as a class instead.
The scope labels start from the same column where the class declaration begins. Specifying scopes is mandatory, including the first private scope.
Members and methods can't be in the same section of scopes. They should be separated by re-specifying the scope. There should be a minimal number of scope labels, reduced to the fewest possible by changing the order of the parts of the class declaration.
Within one scope:
A public scope with methods must precede protected and private scopes with methods.
Class data members should be placed at the beginning or at the end of the class description. Class type descriptions can precede data descriptions.
class TClass {
private:
    int Member; // comments about Member
public:
    TClass();
    TClass(const TClass& other);
    ~TClass();
};
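One possible layout that follows these rules with data members placed at the end of the class. TCounter and everything in it is hypothetical, and this is only a sketch of one acceptable ordering:

```cpp
class TCounter {
public:
    explicit TCounter(int limit)
        : Limit(limit)
        , Value(0)
    {
    }

    void Increment() {
        if (!IsSaturated()) {
            ++Value;
        }
    }

    int GetValue() const {
        return Value;
    }

private:
    // Private methods get their own section, separate from data.
    bool IsSaturated() const {
        return Value >= Limit;
    }

private:
    int Limit; // upper bound for Value
    int Value; // current count
};
```

Three labels are the minimum here: public methods, private methods, and private data each need their own section because methods and members may not share one.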
The word template should start a separate line.
Constructors should be formatted as follows:
TClass::TClass()
    : FieldA(1)
    , FieldB("value")
    , FieldC(true)
{
    // some more
    // code here
}
One of the following variations is allowed:
struct T {
    int X = 0;
    double Y = 1.0;
};
or
struct T {
    int X;
    double Y;
    T() //this implementation can also be in a .cpp file
        : X(0)
        , Y(1.0)
    {
    }
};
The reason is that if you mix two types of initialization, it's much easier to forget to initialize some member of the class, since the initialization code is "spread out" (possibly across multiple source files).
Namespaces should be formatted like classes, except for the name. Namespaces must begin with a capital letter N:
namespace NStl {
    namespace NPrivate {
        //the namespace nesting level is restricted to two
    }
    class TVectorType {
    };
}
Only constexpr from C++11 is allowed, because constexpr for non-constant methods is not supported by MSVC.
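For reference, a C++11-style constexpr function is essentially limited to a single return statement, so loops are usually expressed through recursion and the conditional operator. A minimal sketch with a hypothetical Factorial function:

```cpp
// C++11-style constexpr: the body is a single return statement.
constexpr unsigned long long Factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * Factorial(n - 1);
}

// Evaluated by the compiler, not at run time.
static_assert(Factorial(5) == 120, "Factorial must be usable in constant expressions");
```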
Variable templates can be used because MSVC 2015 Update 3 or newer is in use and supports that.
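A short sketch of a variable template. PI and CircleArea are hypothetical names chosen for this example; the constant is capitalized according to the rule for global constants:

```cpp
// A variable template: one constant, available at any arithmetic precision.
template <class T>
constexpr T PI = T(3.1415926535897932385L);

template <class T>
T CircleArea(T radius) {
    return PI<T> * radius * radius;
}
```

CircleArea(1.0) uses the double value of the constant, while CircleArea(2) uses the integer one (3), so the choice of T matters at the call site.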
auto f() -> decltype() {}
or
auto f() {}
Only use it where it is truly necessary.
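One case where the trailing form can be justified is a return type that is written in terms of the parameters. The sketch below uses a hypothetical Sum function and is an illustration, not a recommendation to use this form widely:

```cpp
#include <type_traits>

// The return type is spelled via the parameters, which are only
// in scope after the parameter list, hence the trailing form.
template <class TLeft, class TRight>
auto Sum(const TLeft& left, const TRight& right) -> decltype(left + right) {
    return left + right;
}

static_assert(std::is_same<decltype(Sum(1, 2.5)), double>::value, "int + double yields double");
```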
Always use nullptr.
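One practical reason is overload resolution: the literal 0 is an int first, while nullptr can only become a pointer. A minimal sketch with hypothetical names (EOverload, Pick):

```cpp
enum class EOverload {
    Integer,
    Pointer
};

// Overload for integer arguments.
EOverload Pick(int) {
    return EOverload::Integer;
}

// Overload for pointer arguments.
EOverload Pick(const char*) {
    return EOverload::Pointer;
}
```

Pick(0) selects the integer overload even when a null pointer was intended, whereas Pick(nullptr) selects the pointer overload.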
In new code, prefer 'using' (as a more general mechanism), except when this is not possible. There are cases when the combination of 'using' + templates with an unknown number of parameters + function type leads to compilation errors in MSVC:
template <class R, class... Args>
struct T {
    using TSignature = R (Args...);
};
In this case, you should use typedef.
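To illustrate why 'using' is the more general mechanism, the sketch below declares the same alias both ways and then an alias template, which typedef cannot express directly. All alias names are hypothetical, and standard containers are used only to keep the example self-contained:

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <vector>

typedef std::vector<int> TOldStyleIds; // typedef form
using TIds = std::vector<int>;         // equivalent 'using' form

static_assert(std::is_same<TOldStyleIds, TIds>::value, "both aliases name the same type");

// An alias template: there is no direct typedef equivalent.
template <class TValue>
using TByName = std::map<std::string, TValue>;
```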
In derived classes, use override without virtual.
class A {
    virtual ui32 f(ui32 k) const {
        return k;
    }
};
class B: public A {
    ui32 f(ui32 k) const override {
        return k;
    }
};
Comments are for explaining the code where they are located. Do not use comments to remove an unnecessary function or block, especially if this is the old version of a function you corrected. Simply delete any unnecessary parts of the code. You can always go to the VCS (svn, git, hg, etc.) to retrieve the deleted section if you suddenly realize how useful it was. The main harm from commenting out previous versions of the code, instead of removing them, is that VCS diff won't work correctly.
Comments should be written in English with correct spelling and grammar.
It is useful to explain the purpose of each class member in the class description. The MSVC editor displays this line in the tooltip in "smart editing" mode.
Doxygen-style comments are encouraged.
To make it easier to search for your TODO comments in the code, use one of two formats:
// Presumably a temporary comment with notes to yourself:
// TODO (username): fix me later
// Comment with the ticket:
// TODO (ticket_number): fix me later
Capital letters are not allowed in file names. File extensions for C++: "cpp", "h".
Indents in the preprocessor are also 4 spaces. Keep the hash in the first position.
#ifdef Z
#    include
#elif Z
#    define
#    if
#        define
#    endif
#else
#    include
#endif
With a preprocessing conditional in the middle of the file, we start in the first position.
void A() {
    int x;
#ifdef TEST_func_A // ifndef + else = schizophrenia
    x = 0;
#else
    x = 1;
#endif
}
The include files should not be interdependent, meaning an include file must be compilable by itself as a separate compilation unit. If the include file contains references to types that are not described in it:
The using namespace declaration is not allowed inside include files.
Include files should be specified in the order of less general to more general (regardless of whether it's in cpp or another include), so that a more specific file is included before a more general file. This order allows you to once again check the independence of the other included header files. For example, for the library/cpp/json/some_program/some_class.cpp file, the order of inclusion is:
#include "some_class.h"
#include "other_class.h"
#include "other_class2.h"
// library/cpp/json
#include <library/cpp/json/json_reader.h>
// library
#include <library/cpp/string_utils/base64/base64.h>
#include <library/cpp/threading/local_executor/local_executor.h>
#include <contrib/libs/rapidjson/include/rapidjson/reader.h>
#include <util/folder/dirut.h>
#include <util/system/yassert.h>
#include <cmath>
#include <cstdio>
#include <ctime>
#include <Windows.h>
Thus, all non-local names (from other directories) are written in angle brackets. Within each group, alphabetical sorting is preferable.
To include files just once:
#pragma once
..
..
Errors should be handled using exceptions:
#include <util/generic/yexception.h>
class TSomeException: public yexception {
    ....
};
...
if (errorHappened) {
    ythrow TSomeException(numericCode) << "error happened (" << usefulDescription << ")";
}
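The same pattern can be tried outside the source tree with standard-library stand-ins. In the sketch below, std::runtime_error replaces yexception and a plain throw replaces ythrow; TParseError and ParseDigit are hypothetical names:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type; std::runtime_error stands in for yexception.
class TParseError: public std::runtime_error {
public:
    explicit TParseError(const std::string& message)
        : std::runtime_error(message)
    {
    }
};

// Reports failure by throwing instead of returning an error code.
int ParseDigit(char symbol) {
    if (symbol < '0' || symbol > '9') {
        throw TParseError(std::string("not a digit: '") + symbol + "'");
    }
    return symbol - '0';
}
```

The caller only handles the error where it can do something about it, and the happy path stays free of status checks.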
Error handling using return codes like
if (Func1() == ERROR1) {
    return MY_ERROR1;
}
if (Func2() == ERROR2) {
    return MY_ERROR2;
}
is prohibited everywhere except in specially stipulated cases:
To test various kinds of compile-time invariants (for example, sizeof(int) == 4), use static_assert. To test run-time invariants, instead of assert(), use the Y_ASSERT() macro, since it is better integrated into Visual Studio.
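A minimal sketch of the two kinds of checks. Standard assert() appears here only so that the example compiles without util; in the source tree, Y_ASSERT() would take its place. ElementAt is a hypothetical function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Compile-time invariant: checked by the compiler, costs nothing at run time.
static_assert(sizeof(std::uint32_t) == 4, "std::uint32_t must be 4 bytes wide");

// Run-time invariant: the index must be inside the array.
// assert() is a stand-in for Y_ASSERT() in this self-contained sketch.
int ElementAt(const int* data, std::size_t size, std::size_t index) {
    assert(index < size);
    return data[index];
}
```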
Calling platform-dependent system functions is allowed only in /util. In order to use specific system primitives, use the cross-platform wrappers from /util. If the necessary wrapper does not exist, you can write one (preferably using OOP) and add it to util (don't forget the code review).
The /contrib folder contains libraries and programs from 3rd parties. Obviously, they use their own style of writing code. If there is a need to add something to contrib that isn't there yet, create a ticket for discussion and decision making.