Linking C++ and C Object Files Together

Most C++ toolchains also include a C compiler, so there is no need to compile C code as C++; you can simply compile C as C. If your project consists of a mixture of C and C++, you can link the C and C++ object files together into the final executable, provided the C++ code sees the C functions with C linkage (see "Linking with C Code" later in this section). This ease of incorporating C code in a C++ program comes in handy when you encounter a useful library or legacy code that was written in C. Free functions and classes work fine together: a class member function can call a free function, and a free function can make use of objects.

Shifting Paradigms

One of the dangers of mixing C and C++ is that your program may start to lose its object-oriented properties. For example, if your object-oriented web browser is implemented with a procedural networking library, the program will be mixing these two paradigms. Given the importance and quantity of networking tasks in such an application, you might consider writing an object-oriented wrapper around the procedural library. A typical design pattern that can be used for this is called the façade.

For example, imagine that you are writing a web browser in C++, but you are using a networking library that has a C-style API and contains the functions declared in the following code. Note that the HostHandle and ConnectionHandle data structures have been omitted for brevity.

// networklib.h
#include "HostHandle.h"
#include "ConnectionHandle.h"

// Get the host record for a particular Internet host given
// its hostname (e.g., www.host.com):
HostHandle* lookupHostByName(const char* hostName);
// Free the given HostHandle:
void freeHostHandle(HostHandle* host);

// Connects to the given host.
ConnectionHandle* connectToHost(HostHandle* host);
// Closes the given connection.
void closeConnection(ConnectionHandle* connection);

// Retrieves a web page from an already-opened connection.
char* retrieveWebPage(ConnectionHandle* connection, const char* page);
// Frees the memory pointed to by page.
void freeWebPage(char* page);

The networklib.h interface is fairly simple and straightforward. However, it is not object-oriented, and a C++ programmer who uses such a library is bound to feel icky, to use a technical term. This library isn't organized into a cohesive class. Of course, the authors of the library could have written a better interface, but as the user of a library, you have to accept what you are given. Writing a wrapper is your opportunity to customize the interface.

Before you build an object-oriented wrapper for this library, take a look at how the library is used as-is. The following code uses the networklib library to retrieve the web page at www.example.com/index.html:

HostHandle* myHost { lookupHostByName("www.example.com") };
ConnectionHandle* myConnection { connectToHost(myHost) };
char* result { retrieveWebPage(myConnection, "/index.html") };

println("The result is:\n{}", result);

freeWebPage(result); result = nullptr;
closeConnection(myConnection); myConnection = nullptr;
freeHostHandle(myHost); myHost = nullptr;

A possible way to make the library more object-oriented is to provide a single abstraction that recognizes the commonality between looking up a host, connecting to the host, and retrieving a web page. A good object-oriented wrapper hides the needless complexity of the HostHandle and ConnectionHandle types.

The new class should capture the common use case for the library. The previous example shows the most frequently used pattern: first a host is looked up, then a connection is established, and finally a page is retrieved. It is also likely that subsequent pages will be retrieved from the same host, so a good design will accommodate that mode of use as well.

To start, the HostRecord class wraps the functionality of looking up a host. It's an RAII (Resource Acquisition Is Initialization) class: its constructor uses lookupHostByName() to perform the lookup, and its std::unique_ptr data member uses a custom deleter to automatically free the retrieved HostHandle by calling freeHostHandle(). Here is the code:

export class HostRecord final
{
public:
  // Looks up the host record for the given host.
  explicit HostRecord(const std::string& host)
  : m_hostHandle { lookupHostByName(host.c_str()), freeHostHandle }
  { }
  // Returns the underlying handle.
  HostHandle* get() const noexcept { return m_hostHandle.get(); }
private:
  std::unique_ptr<HostHandle, decltype(&freeHostHandle)> m_hostHandle;
};
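
To see the custom deleter at work without the real library, the following minimal sketch replaces the networklib functions with stand-ins. The HostHandle definition, the g_liveHandles counter, and the liveHandlesInsideScope() helper are assumptions made only for this illustration; they are not part of networklib.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the networklib API, defined here only so that
// the sketch compiles on its own. The real HostHandle type is not shown.
struct HostHandle { std::string name; };

int g_liveHandles { 0 };  // Counts handles that have not been freed yet.

HostHandle* lookupHostByName(const char* hostName)
{
  ++g_liveHandles;
  return new HostHandle { hostName };
}

void freeHostHandle(HostHandle* host)
{
  --g_liveHandles;
  delete host;
}

// Same shape as the HostRecord class above, without the module export.
class HostRecord final
{
public:
  explicit HostRecord(const std::string& host)
    : m_hostHandle { lookupHostByName(host.c_str()), freeHostHandle }
  { }
  HostHandle* get() const noexcept { return m_hostHandle.get(); }
private:
  std::unique_ptr<HostHandle, decltype(&freeHostHandle)> m_hostHandle;
};

// Returns the number of live handles observed while a HostRecord exists.
// The record is destroyed, and its handle freed, when the function returns.
int liveHandlesInsideScope()
{
  HostRecord record { "www.example.com" };
  return g_liveHandles;
}
```

With these stand-ins, the counter is 1 while the HostRecord is alive and drops back to 0 once it goes out of scope, without any explicit call to freeHostHandle() in client code.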

Next, the WebHost class builds on HostRecord. It creates a connection to a given host and supports retrieving web pages. It is also an RAII class: when a WebHost object is destroyed, it automatically closes the connection to the host. The getPage() member function calls retrieveWebPage() and immediately stores the result in a std::unique_ptr whose custom deleter is freeWebPage(). Here is the code:

export class WebHost final
{
public:
  // Connects to the given host.
  explicit WebHost(const std::string& host);
  // Obtains the given page from this host.
  std::string getPage(const std::string& page);
private:
  std::unique_ptr<ConnectionHandle, decltype(&closeConnection)> m_connection
    { nullptr, closeConnection };
};

WebHost::WebHost(const std::string& host)
{
  HostRecord hostRecord { host };
  if (hostRecord.get()) {
    m_connection = { connectToHost(hostRecord.get()), closeConnection };
  }
}

std::string WebHost::getPage(const std::string& page)
{
  std::string resultAsString;
  if (m_connection) {
    std::unique_ptr<char[], decltype(&freeWebPage)> result {
      retrieveWebPage(m_connection.get(), page.c_str()),
      freeWebPage };
    // Guard against a failed retrieval: constructing or assigning a
    // std::string from a null char* is undefined behavior.
    if (result) {
      resultAsString = result.get();
    }
  }
  return resultAsString;
}

The WebHost class encapsulates the behavior of a host and provides useful functionality without unnecessary calls and data structures. Its implementation makes extensive use of the networklib library without exposing any of its workings to the user. The constructor creates a HostRecord RAII object for the specified host and uses it to set up a connection, which is stored in the m_connection data member for later use. The HostRecord object is automatically destroyed at the end of the constructor. The WebHost destructor destroys m_connection, which closes the connection. The getPage() member function uses retrieveWebPage() to retrieve a web page, converts it to a std::string, frees the original buffer with freeWebPage(), and returns the std::string.

The WebHost class makes the common case easy for the client programmer. Here is an example:

WebHost myHost { "www.example.com" };
string result { myHost.getPage("/index.html") };
println("The result is:\n{}", result);

Note: Networking-savvy readers may note that keeping a connection open to a host indefinitely is considered bad practice and doesn't adhere to the HTTP specification. You should not do this in production-quality code. However, for this example, I've chosen elegance over etiquette.

As you can see, the WebHost class provides an object-oriented wrapper around the C-style library. By providing an abstraction, you can change the underlying implementation without affecting client code, and you can provide additional features. These features can include connection reference counting, automatically closing connections after a specific time to adhere to the HTTP specification, automatically reopening the connection on the next getPage() call, and so on.
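
As one illustration of such an additional feature, the following sketch reopens a closed connection on the next getPage() call. It is not an extension of the WebHost code above: ReconnectingWebHost, openConnection(), fetchPage(), and g_openConnections are hypothetical names, and the C functions are simplified stand-ins so that the example compiles by itself.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for a C networking API. They are not
// the networklib functions shown earlier.
struct ConnectionHandle { std::string host; };

int g_openConnections { 0 };  // Number of connections currently open.

ConnectionHandle* openConnection(const char* hostName)
{
  ++g_openConnections;
  return new ConnectionHandle { hostName };
}

void closeConnection(ConnectionHandle* connection)
{
  --g_openConnections;
  delete connection;
}

std::string fetchPage(ConnectionHandle* connection, const char* page)
{
  return connection->host + page;  // Fake page content.
}

class ReconnectingWebHost final
{
public:
  // Only remembers the host name; no connection is opened yet.
  explicit ReconnectingWebHost(std::string host)
    : m_host { std::move(host) } { }

  // Closes the connection, for example after an idle timeout.
  void disconnect() { m_connection.reset(); }

  // Obtains the given page, reopening the connection first if necessary.
  std::string getPage(const std::string& page)
  {
    if (!m_connection) {
      m_connection.reset(openConnection(m_host.c_str()));
    }
    if (!m_connection) { return {}; }
    return fetchPage(m_connection.get(), page.c_str());
  }
private:
  std::string m_host;
  std::unique_ptr<ConnectionHandle, decltype(&closeConnection)> m_connection
    { nullptr, closeConnection };
};
```

Client code still only calls getPage(). Whether the connection was open, closed, or never opened is an implementation detail hidden behind the wrapper.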

Linking with C Code

The previous example assumed that you had the raw C code to work with. The example took advantage of the fact that most C code will successfully compile with a C++ compiler. If you only have compiled C code, perhaps in the form of a library, you can still use it in your C++ program, but you need to take a few extra steps.

Before you can start using compiled C code in your C++ programs, you first need to know about a concept called name mangling. To implement function overloading, the compiler flattens C++ names, which can involve namespaces, classes, and parameter types, into plain symbols that the linker can handle. For example, in a C++ program, it is legal to declare the following overloads:

void myFunc(double);
void myFunc(int);
void myFunc(int, int);

However, this would mean that the linker would see several different functions, all called myFunc, and would not know which one you want to call. Therefore, all C++ compilers perform an operation that is referred to as name mangling and is the logical equivalent of generating names, as follows:

myFunc_double
myFunc_int
myFunc_int_int

To avoid conflicts with other names you might have defined, the compiler might generate names that are reserved as identifiers, for example, names beginning with double underscores or names beginning with an underscore followed by an uppercase letter. Alternatively, some compilers generate names that have characters that are legal to the linker but not legal in C++ source code. For example, Microsoft VC++ generates names as follows:

?myFunc@@YAXN@Z
?myFunc@@YAXH@Z
?myFunc@@YAXHH@Z

This encoding is complex and often vendor specific. The C++ standard does not specify how function overloading should be implemented on a given platform, so there is no standard for name mangling algorithms.

In C, function overloading is not supported (the compiler complains about conflicting declarations). So, names generated by the C compiler are quite simple, for example, _myFunc or just myFunc, depending on the platform.

Now, if you compile a simple program with the C++ compiler, even if it has only one instance of the myFunc name, it still generates a request to link to a mangled name. However, when you link with the C library, it cannot find the desired mangled name, and the linker complains. Therefore, it is necessary to tell the C++ compiler to not mangle that name. This is done by using the extern "C" qualification both in the header file (to instruct the client code to create a name compatible with C) and, if your library source is in C++, at the definition site (to instruct the library code to generate a name compatible with C).

Here is the syntax of extern "C":

extern "C" declaration1();
extern "C" declaration2();

extern "C" {
  declaration1();
  declaration2();
}
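
As a concrete, hedged illustration of both forms, the following sketch defines a few functions in C++ and gives some of them C linkage at the definition site. All function names here are hypothetical. Functions with C++ linkage can be overloaded because each overload gets its own mangled name; functions with C linkage are emitted under a single C-compatible name, so two of them in the same program cannot share a name.

```cpp
#include <string>

// C++ linkage: overloads are allowed because each gets its own mangled name.
std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }

// C linkage, single-declaration form: the symbol is emitted with a
// C-compatible name, so code compiled as C can link against addInts().
extern "C" int addInts(int a, int b) { return a + b; }

// C linkage, block form: every declaration inside the braces gets C linkage.
extern "C" {
  int negateInt(int value) { return -value; }
  int doubleInt(int value) { return 2 * value; }
}

// Calls from C++ look the same regardless of the linkage of the callee.
int combine(int a, int b)
{
  return doubleInt(negateInt(addInts(a, b)));
}
```

Calling code does not change; only the symbol name that the compiler emits and that the linker looks for differs.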

The C++ standard requires support only for the "C" and "C++" language strings, but it allows implementations to support others. So in principle, the following could be supported by a compiler:

extern "C" void myFunc(int i);
extern "Fortran" Matrix* matrixInvert(Matrix* M);
extern "Pascal" void someLegacySubroutine(int n);
extern "Ada" bool aimMissileDefense(double angle);

In practice, many compilers support only "C" and "C++". Check your compiler's documentation to find out which language designators it supports.

As an example, the following code specifies the function prototype for cFunction() as an external C function:

extern "C" {
  void cFunction(int i);
}

int main()
{
  cFunction(8); // Calls the C function.
}

The actual definition for cFunction() is provided in a compiled binary file attached in the link phase. The extern "C" linkage specification informs the compiler that cFunction() has C linkage, so the call refers to the unmangled name that the C compiler generated.

A more common pattern is to use extern "C" at the header level. For example, if you are using a graphics library written in C, it probably came with a .h file for you to include. The author of this header file should condition it on whether it is being compiled for C or C++. A C++ compiler predefines the macro __cplusplus when compiling C++ code; the macro is not defined for C compilations. It can be used to condition a header file as follows:

#ifdef __cplusplus
extern "C" {
#endif
void drawCircle(void);
void drawSquare(void);
#ifdef __cplusplus
} // matches extern "C"
#endif

This means that drawCircle() and drawSquare() are functions that are in a library compiled by the C compiler. Using this technique, the same header file can be used in both C and C++ clients.
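
The following sketch puts such a header together with client code in a single file so that it can be compiled by itself. The file name shapes.h, the include guard, the parameterless signatures, and the shapesDrawn() helper are assumptions made for illustration. The stand-in definitions take the place of the compiled C library, which in a real project would be supplied at link time.

```cpp
// ---- shapes.h (hypothetical header shipped with a C graphics library) ----
#ifndef SHAPES_H
#define SHAPES_H

#ifdef __cplusplus
extern "C" {
#endif

void drawCircle(void);
void drawSquare(void);
int shapesDrawn(void);  // Hypothetical helper that makes the calls observable.

#ifdef __cplusplus
} // matches extern "C"
#endif

#endif // SHAPES_H

// ---- Stand-in for the compiled C library ----
// Defined here, with C linkage, only so that the sketch links on its own.
namespace { int shapeCount { 0 }; }

extern "C" void drawCircle(void) { ++shapeCount; }
extern "C" void drawSquare(void) { ++shapeCount; }
extern "C" int shapesDrawn(void) { return shapeCount; }

// ---- C++ client code ----
// Calls the C functions exactly as it would call any C++ free function.
int drawOneOfEach()
{
  drawCircle();
  drawSquare();
  return shapesDrawn();
}
```

A C client that includes the same header never sees the extern "C" lines, because __cplusplus is not defined when it is compiled.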

Whether you are including C code in your C++ program or linking against a compiled C library, remember that even though C++ is almost a superset of C, they are different languages with different design goals. Adapting C code to work in C++ is quite common, but providing an object-oriented C++ wrapper around procedural C code is often much better.