Separate Compilation in C++
Separate compilation is a way to cut down build time by dividing your code into modules and recompiling only the modules whose source files have changed.
Let's assume we have written some classes and functions and placed the declarations in a header file (FILE.h) and the definitions of the functions and members in a definition file (FILE.cpp).
Ordinarily, when the header file containing only the declarations is included (#include "FILE.h") in the main file (say, the file defining the function int main()), the preprocessor copies the declarations from the header into the main file.
With separate compilation, once the preprocessing phase is over, the compiler compiles the source code in each .cpp file on its own and translates it into an object file (.o). At this point the compiler doesn't mind missing definitions (of functions, including class member functions), so the object files can refer to symbols that are not yet defined. Hence the compiler can compile each source file as long as it is well formed.
Then, during the linking stage, the linker combines the object files, and it is at this stage that an error is signalled for missing or duplicate definitions. If the function definition is present in another object file, the linker resolves the calls made from the main file to that definition and the functions can be used.
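To make the compile/link split concrete, here is a minimal single-file sketch. The "file" boundaries are only comments, and the names add and sum_of_three are hypothetical, chosen for illustration. It shows that the compiler accepts a call to a function it has only seen declared; in a real multi-file build, the linker would be the one to connect the call to the definition.

```cpp
// Single-file sketch; the "file" markers below are comments only.

// ---- FILE.h (declaration only) ----
int add(int a, int b);

// ---- main-side code: sees only the declaration above ----
// The compiler accepts this call without knowing add's body;
// in a real multi-file build the linker would resolve it later.
int sum_of_three(int a, int b, int c) {
    return add(add(a, b), c);
}

// ---- FILE.cpp (the definition) ----
int add(int a, int b) {
    return a + b;
}
```

If the definition of add were removed, each part would still compile to an object file, and the failure would only show up as an undefined-reference error at link time.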
When separate compilation is not used, the main file may include both declarations and definitions from other files:
#include "FILE.h"
#include "FILE.cpp"
int main() { ... }
or FILE.h may itself contain an include directive (#include "FILE.cpp") after its functions and classes have been declared, or FILE.h may contain its own definitions, with no FILE.cpp providing them.
Without separate compilation, the preprocessor merges all the files pulled in through #include "FILENAME" directives into one file, then the compiler compiles everything in it into a single object file, and finally the linker produces the executable. Any change, however small, therefore means recompiling everything.
For templates, things work differently.
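One common reason, sketched here under assumptions: the compiler generates code for a template only when it is instantiated, and to do that it generally needs to see the template's full definition. Template definitions are therefore usually placed in the header itself. The header name max_of.h and the function names below are hypothetical.

```cpp
// ---- max_of.h (hypothetical header) ----
// Declaration and definition live together, so every file that
// includes the header can instantiate the template.
template <typename T>
T max_of(T a, T b) {
    return (a < b) ? b : a;
}

// ---- some .cpp file that includes max_of.h ----
int larger_int(int a, int b) {
    return max_of(a, b);      // instantiates max_of<int> here
}

double larger_double(double a, double b) {
    return max_of(a, b);      // instantiates max_of<double> here
}
```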
A Common Scenario
To perform separate compilation, you need to write your declarations and your definitions (or implementations) in separate files. Typically, they bear the same name and differ only in their extension, for instance ClassA.h and ClassA.cpp.
Then you compile each implementation file into a matching object file.
Now, for an implementation file to compile, it must contain an include directive for its header. Thus, at the top of ClassA.cpp we should write something like:
#include "ClassA.h"
With many compilers, g++ among them, the switch for compiling without linking is -c. So you would compile only like this:
g++ -c ClassA.cpp
or
g++ -c ClassA.cpp -o ClassA.o
Now, after you have compiled all the object files that reference one another, including one (and only one) containing the global function main(), you link them into one executable, let's name it MyApp, like this:
g++ ClassA.o ClassB.o MyApp.main.o -o MyApp
Now, if you omit the -o option, your executable will get a default name not of your choosing. With g++ this default is a.out.
To sum up, these are the steps:
- write your declarations and your implementations in different files, typically *.h and *.cpp respectively; include the declaration file in each implementation file
- compile each implementation file separately (g++ -c FILE.cpp)
- compile one *.cpp defining the global main() function
- finally, link all the object files together
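The steps above can be sketched in a single listing. This is only an illustration: the three parts would normally be saved as separate files, the member names are hypothetical, the include guard is a common practice not discussed above, and run_app stands in for main() so that the sketch stays one translation unit.

```cpp
// ---- ClassA.h: declarations ----
#ifndef CLASSA_H
#define CLASSA_H
class ClassA {
public:
    explicit ClassA(int start);
    int increment();          // declared here, defined in ClassA.cpp
private:
    int value_;
};
#endif

// ---- ClassA.cpp: implementations ----
// #include "ClassA.h"        // needed once this is a separate file
ClassA::ClassA(int start) : value_(start) {}
int ClassA::increment() { return ++value_; }

// ---- MyApp.main.cpp: the code that uses ClassA ----
// #include "ClassA.h"        // needed once this is a separate file
int run_app() {               // stands in for main()
    ClassA a(40);
    a.increment();            // value_ becomes 41
    return a.increment();     // value_ becomes 42, which is returned
}

// Build, with the three parts saved as real files:
//   g++ -c ClassA.cpp
//   g++ -c MyApp.main.cpp
//   g++ ClassA.o MyApp.main.o -o MyApp
```

After a change to ClassA.cpp alone, only the first compile command and the link step need to be repeated.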
Separate Compilation of Variables
Sometimes code in one file may need to use a variable defined in another file.
To obtain a declaration that is not also a definition, we add the extern keyword and must not provide an explicit initializer. An extern that has an initializer is a definition. It is an error to provide an initializer on an extern inside a function.
Variables must be defined exactly once but can be declared several times. To use a variable in more than one file requires declarations that are separate from the variable's definition. To use the same variable in multiple files, we must define that variable in only one of them and declare it in the other files that use it.
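A short sketch of the difference between declaring and defining a variable, again with the file boundaries shown only as comments. The names shared.h, counter, limit and next_count are hypothetical.

```cpp
// ---- shared.h (hypothetical header) ----
extern int counter;           // declaration: no storage is allocated
extern int counter;           // repeating a declaration is allowed

// ---- counter.cpp: the one and only definition ----
int counter = 0;
// extern int limit = 10;     // would also be a definition, because it
//                            // has an initializer

// ---- user.cpp: includes shared.h and uses the variable ----
int next_count() {
    return ++counter;         // refers to the single counter defined above
}
```

Writing int counter = 0; in a second source file as well would compile, but the linker would then report a duplicate definition.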
When you have global variables, you declare them in a header, so that each source file that includes the header knows about them, but you define each one only once, in one of your source files.
Example: Using extern int A tells the compiler that an object of type int called A exists somewhere. The compiler doesn't need to know where; it just needs to know the type and name so it knows how to use it. Once all the source files are compiled, the linker will resolve all of the references to A to the one definition that it finds in one of the compiled source files. For the definition of A to work, it needs to have external linkage: it must be defined outside of any function, at file scope, and without the static keyword.
Example
Header (header.h):
extern int A;
void printA();
Source 1 (main):
#include "header.h"
int A;
int main() {
    A = 3;
    printA();
}
Source 2:
#include <iostream>
#include "header.h"
void printA() {
    std::cout << A << std::endl;
}
const
When we split a program into multiple files, every file that uses a const variable must have access to its initializer. To see the initializer, the variable must be defined in every file that wants to use the variable's value. To support this usage, yet avoid multiple definitions of the same variable, const variables are by default local to a file. When we define const variables with the same name in multiple files, it is as if we had written definitions for separate variables in each file.
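As a small illustration of why the initializer has to be visible, the sketch below uses a const as an array bound, which the compiler must know at compile time. The names config.h, bufSize, buffer and buffer_bytes are hypothetical.

```cpp
// ---- config.h (hypothetical header, included by several .cpp files) ----
const int bufSize = 512;      // local to each file that includes it,
                              // so no duplicate-definition error at link time

// ---- any .cpp file that includes config.h ----
char buffer[bufSize];         // the array bound needs the value at compile time

int buffer_bytes() {
    return static_cast<int>(sizeof(buffer));
}
```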
If we have a const variable that we want to share across multiple files but whose initializer is not a constant expression, we want the const object to behave like other (nonconst) variables. We define the const in one file and declare it in the other files that use it, using extern on both its definition and its declarations.
Example
file1.cc:
extern const int value = func();
file2.h:
extern const int value;