The C/C++ Preprocessor

The C/C++ preprocessor is a text-replacement stage that transforms the source code into a single stream of text, which is then passed to the compiler proper. Its capabilities include file inclusion, conditional compilation, text macro replacement, error emission, stringizing, and token concatenation.

All preprocessor directives begin with #, and only white-space characters may appear before a preprocessor directive on a line.

Preprocessor directives are not C++ statements, so they do not end in a semicolon (;).


The output from the C preprocessor looks much like the input, except that all preprocessing directive lines have been replaced with blank lines and all comments with spaces. Long runs of blank lines are discarded.

The ISO standard specifies that it is implementation defined whether a preprocessor preserves whitespace between tokens, or replaces it with e.g. a single space. In GNU CPP, whitespace between tokens is collapsed to become a single space, with the exception that the first token on a non-directive line is preceded with sufficient spaces that it appears in the same column in the preprocessed output that it appeared in the original source file.

The #define Preprocessor Directive

The #define preprocessor directive creates symbolic constants. The symbolic constant is called a macro and the general form of the directive is:

#define <macro-name> <replacement-text>

When this line appears in a file, all subsequent occurrences of macro-name in that file will be replaced by replacement-text before the program is compiled. For example:

#include <iostream>
using namespace std;

#define PI 3.14159

int main () {
   cout << "Value of PI :" << PI << endl;

   return 0;
}

Inclusion

Files are included verbatim through the #include FILENAME directive.


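__has_include(FILENAME) (C++17) is a preprocessor expression, usable in #if and #elif conditions, that tests whether a file can be found for inclusion, as in

#if __has_include(<optional>)
    #include <optional>
    #define has_optional 1
    template<class T>
    using optional_t = std::optional<T>;
#else
    #define has_optional 0
#endif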

Conditionals

Several directives can be used to compile selected portions of a program's source code. This process is called conditional compilation.

The conditional preprocessor construct is much like the if selection structure in C and C++. Consider the following preprocessor code:

#ifndef NULL
   #define NULL 0
#endif

Conditional compilation is often used for debugging output. A single macro can turn the debugging code on or off, as follows:

#ifdef DEBUG
   cerr << "Variable x = " << x << endl;
#endif

This causes the cerr statement to be compiled into the program only if the symbolic constant DEBUG has been defined before the #ifdef DEBUG directive. You can also use #if 0 to comment out a portion of the program, as follows:

#if 0
   code prevented from compiling
#endif

Stringification

The # operator converts a macro argument in the replacement text into a string literal surrounded by quotes. In the example below, MKSTR(HELLO C++) expands to "HELLO C++".

#include <iostream>
using namespace std;

#define MKSTR( x ) #x

int main () {

   cout << MKSTR(HELLO C++) << endl;

   return 0;
}

Concatenation

The ## operator is used to concatenate two tokens. Here is an example

#define CONCAT( x, y )  x ## y

When such a macro appears in the program, its arguments are pasted together into a single token that replaces the macro invocation. In the program below, concat(x, y) is replaced by the identifier xy, so the program prints 100.

#include <iostream>
using namespace std;

#define concat(a, b) a ## b
int main() {
   int xy = 100;

   cout << concat(x, y);
   return 0;
}

Function-Like Macros

You can also use #define to define a macro that takes arguments, as follows:

#include <iostream>
using namespace std;

#define MIN(a,b) (((a)<(b)) ? (a) : (b))

int main () {
   int i, j;

   i = 100;
   j = 30;

   cout <<"The minimum is " << MIN(i, j) << endl;

   return 0;
}

Predefined C++ Macros

C++ provides a number of predefined macros:

__LINE__ Expands to the current line number of the source file, as an integer constant.
__FILE__ Expands to the name of the current source file, as a string literal.
__DATE__ Expands to a string literal of the form "Mmm dd yyyy" (for example "Jan  1 2024") giving the date on which the source file was translated.
__TIME__ Expands to a string literal of the form "hh:mm:ss" giving the time at which the source file was translated.

The following example prints all of the above macros:

#include <iostream>
using namespace std;

int main () {
   cout << "Value of __LINE__ : " << __LINE__ << endl;
   cout << "Value of __FILE__ : " << __FILE__ << endl;
   cout << "Value of __DATE__ : " << __DATE__ << endl;
   cout << "Value of __TIME__ : " << __TIME__ << endl;

   return 0;
}

GNU Preprocessor

The C preprocessor, often known as cpp, is a macro processor that is used automatically by the C compiler to transform your program before compilation. It is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs.

The C preprocessor is intended to be used only with C, C++, and Objective-C source code.

Calling the GNU C/C++ Preprocessor Alone (cpp in GCC)

(From https://gcc.gnu.org/onlinedocs/cpp/)

Most often when you use the C preprocessor you do not have to invoke it explicitly: the C compiler does so automatically. However, the preprocessor is sometimes useful on its own. You can invoke the preprocessor either with the cpp command, or via gcc -E. In GCC, the preprocessor is actually integrated with the compiler rather than a separate program, and both of these commands invoke GCC and tell it to stop after the preprocessing phase.

The cpp options listed below are also accepted by gcc and have the same meaning. Likewise the cpp command accepts all the usual gcc driver options, although those pertaining to compilation phases after preprocessing are ignored.

Only options specific to preprocessing behavior are documented here. Refer to the GCC manual for full documentation of other driver options.


The cpp command expects two file names as arguments, infile and outfile. The preprocessor reads infile together with any other files it specifies with #include. All the output generated by the combined input files is written in outfile.

Either infile or outfile may be -, which as infile means to read from standard input and as outfile means to write to standard output. If either file is omitted, it means the same as if - had been specified for that file. You can also use the -o outfile option to specify the output file.


Unless otherwise noted, or the option ends in =, all options which take an argument may have that argument appear either immediately after the option, or with a space between option and argument: -Ifoo and -I foo have the same effect.

Many options have multi-letter names; therefore multiple single-letter options may not be grouped: -dM is very different from -d -M.

GNU Preprocessor Options (not exhaustive)

-D name
Predefine name as a macro, with definition 1.
-D name=definition

The contents of definition are tokenized and processed as if they appeared during translation phase three in a #define directive. In particular, the definition is truncated by embedded newline characters.

If you are invoking the preprocessor from a shell or shell-like program you may need to use the shell's quoting syntax to protect characters such as spaces that have a meaning in the shell syntax.

If you wish to define a function-like macro on the command line, write its argument list with surrounding parentheses before the equals sign (if any). Parentheses are meaningful to most shells, so you should quote the option. With sh and csh, -D'name(args…)=definition' works.

-D and -U options are processed in the order they are given on the command line. All -imacros file and -include file options are processed after all -D and -U options.

-U name
Cancel any previous definition of name, either built in or provided with a -D option.
-M

Instead of outputting the result of preprocessing, output a rule suitable for make describing the dependencies of the main source file. The preprocessor outputs one make rule containing the object file name for that source file, a colon, and the names of all the included files, including those coming from -include or -imacros command-line options.

Unless specified explicitly (with -MT or -MQ), the object file name consists of the name of the source file with any suffix replaced with object file suffix and with any leading directory parts removed. If there are many included files then the rule is split into several lines using '\'-newline. The rule has no commands.

This option does not suppress the preprocessor's debug output, such as -dM. To avoid mixing such debug output with the dependency rules you should explicitly specify the dependency output file with -MF, or use an environment variable like DEPENDENCIES_OUTPUT (see Environment Variables). Debug output is still sent to the regular output stream as normal.

Passing -M to the driver implies -E, and suppresses warnings with an implicit -w.

-MM

Like -M but do not mention header files that are found in system header directories, nor header files that are included, directly or indirectly, from such a header.

This implies that the choice of angle brackets or double quotes in an '#include' directive does not in itself determine whether that header appears in -MM dependency output.

The Clang C/C++ Preprocessor Alone (clang -E)

(From https://clang.llvm.org/docs/ClangCommandLineReference.html)

Clang Preprocessor Options (not exhaustive)

-C or --comments
Include comments in preprocessed output
-CC or --comments-in-macros
Include comments from within macros in preprocessed output
-Dmacro=value, --define-macro arg, or --define-macro=arg
Define macro to value (or 1 if value omitted)
-H, or --trace-includes
Show header includes and nesting depth
-P, or --no-line-commands
Disable linemarker output in -E mode
-Umacro, --undefine-macro arg, --undefine-macro=arg
Undefine the macro named macro