C++ Dynamic Link Libraries : Part 2 (Explicit Linking)

Overview

Welcome to the second installment of my DLL walkthrough! Today I’ll be focusing on Explicit Linking, which is a way of asking the OS to load a DLL for you during program execution. This is useful for several reasons: it lets you load DLLs whose names you don’t know until runtime, DLLs that you don’t have the lib and header files for, and DLLs that aren’t required for your program to function.

We’ll be using Visual Studio 2012 and coding in C++ for this walkthrough. Like the previous post in this series, we’ll create a DLL that contains the template components that come from not clicking the Empty Project checkbox and we’ll be adding a basic Win32 Console Application to the solution. The solution explorer window should look like this.

explicitSolutionSetup

We are going to leave the DLL project in the solution so that we can change some of its characteristics. Having the project present in the solution won’t affect the DLLTestApp code that we write or how the test app performs. Everything works exactly as it would if the project weren’t present.

 

Explicit Linking

There are 3 functions that we’re going to use to Explicitly Link our DLL. They are all declared in windows.h, so we’ll include that in our DLLTestApp.cpp file.

#include <windows.h>

The 3 functions are

  • LoadLibrary – This function loads the DLL by name and returns a handle
  • GetProcAddress – This finds an exported function or variable by name within the DLL
  • FreeLibrary – This unloads the DLL when we’re finished with it

Here’s some example code that uses these functions:

#include "stdafx.h"
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
  HINSTANCE hInstance = LoadLibrary(L"DLL.dll");

  int(*fnDLLFuncAddress)(void) = (int(*)(void))GetProcAddress(hInstance, "fnDLL");

  int result = fnDLLFuncAddress();

  FreeLibrary(hInstance);

  return 0;
}

This compiles and, when run, loads DLL.dll (if the OS can find it), tries to find the address of the fnDLL function, and then tries to execute it. Unlike the Implicit Linking tutorial, we haven’t added a reference to the DLL to the test app project and we aren’t relying on having a header or a lib file for the DLL that we’ll be accessing. Let’s dissect this program line by line.

line 6

  • LoadLibrary is documented as returning an HMODULE, but storing the result in an HINSTANCE works just as well and you’ll see both in the wild. HMODULE is just a typedef for HINSTANCE, and both are handles to the loaded DLL. If the DLL loads successfully then the handle will be non-NULL, otherwise it will be set to NULL (in classic C fashion).
  • LoadLibrary takes a string that is the name of the DLL, and you may have noticed that it takes an L string. An L preceding a string literal makes it a wide string literal whose elements are wchar_t, a type that is wider than char (2 bytes with Visual C++, 4 on many other platforms). LoadLibrary is really a macro: in a Unicode build, which is the Visual Studio default, it expands to LoadLibraryW, and that is why the wide literal is needed. Wide strings are often confused with Unicode because Windows stores Unicode text as UTF-16 in wchar_t strings. The two are distinct though: Unicode is a character set with several encodings, while wchar_t is just a datatype primitive that can hold any value, not only encoded characters. For instance it could be used as a 16-bit counter instead of an 8-bit char counter, and the value at each increment wouldn’t be meaningful as Unicode in the context of the counter.
  • The string that is passed into LoadLibrary can be the name of the DLL or it can be a path to the DLL. It can be a relative path like HINSTANCE hInstance = LoadLibrary(L"../DLL.dll"); or it can be a fully qualified path like HINSTANCE hInstance2 = LoadLibrary(L"C:/Users/barngoggles/Documents/ExplicitDLLTut/DLL.dll");.
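To make the narrow versus wide literal distinction concrete, here is a small sketch in plain standard C++ (no Windows headers needed). The helper name wideLiteralBytes is made up for illustration, and the byte count it reports depends on the compiler, since the size of wchar_t is implementation-defined.

```cpp
#include <cstddef>
#include <type_traits>

// A plain literal is an array of char; an L-prefixed literal is an array of wchar_t.
// "DLL.dll" is 7 characters plus the terminating zero, so both arrays have 8 elements.
static_assert(std::is_same<decltype("DLL.dll"), const char (&)[8]>::value,
              "narrow literal: 8 elements of type char");
static_assert(std::is_same<decltype(L"DLL.dll"), const wchar_t (&)[8]>::value,
              "wide literal: 8 elements of type wchar_t");

// Hypothetical helper: number of bytes a wide literal occupies, terminator included.
template <std::size_t N>
std::size_t wideLiteralBytes(const wchar_t (&)[N]) {
  return N * sizeof(wchar_t);
}
```

With Visual C++, where wchar_t is 2 bytes, wideLiteralBytes(L"DLL.dll") would come out as 16. The element count is the same as for the narrow literal; only the element width differs.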

line 8

  • We are declaring a variable on the left hand side of the equals that is named fnDLLFuncAddress which is a pointer to a function that returns an int and takes no parameters (the void in the parentheses means that it takes no params). The syntax is a little funny if you’re not familiar with it and it’s a little annoying to type out every time if you are using many function pointers. A common practice is to create a typedef of the function pointer and use that:
    typedef int fnDLL(void);
    fnDLL* procAddress = (fnDLL*) GetProcAddress(hInstance, "fnDLL");
  • On the right hand side of the equals sign and just to the left of GetProcAddress is code that casts the result of GetProcAddress to our function pointer type. Since this is a C-style cast between pointer types, the compiler accepts it without checking anything, and nothing fails at the point of the cast at runtime either. This is bad news if you make a mistake in your function signature, because calling through a pointer of the wrong type is undefined behavior. Unless you are doing something subversive or academic, it’s best to avoid cases that could produce undefined behavior.
  • GetProcAddress takes a valid handle and a string, and looks that string up among the names the DLL exports. If it can’t find an export with the name supplied, or if the DLL handle is NULL, then GetProcAddress returns NULL.
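The cast-then-check pattern from these bullets can be sketched without a real DLL. In the sketch below, fakeGetProcAddress is a made-up stand-in for GetProcAddress, GenericProc plays the role that FARPROC plays in windows.h, and the body of fnDLL is hypothetical. The point is only the shape of the code: look the name up, cast to the type you believe the function has, and never call through a null pointer.

```cpp
#include <cstring>

// Generic "some function" pointer, standing in for FARPROC.
typedef void (*GenericProc)();

// Stand-in for the function the DLL would export (hypothetical body).
int fnDLL(void) { return 42; }

// Made-up substitute for GetProcAddress: knows exactly one name,
// returns a null pointer for anything else.
GenericProc fakeGetProcAddress(const char* name) {
  if (std::strcmp(name, "fnDLL") == 0)
    return reinterpret_cast<GenericProc>(&fnDLL);
  return nullptr;
}

// The pointer type we believe the export has. Nothing verifies this belief.
typedef int (*FnDLLPtr)(void);

// Looks the name up, casts the result, and only calls a non-null pointer.
// Returns fallback when the name is not found.
int callByName(const char* name, int fallback) {
  FnDLLPtr proc = reinterpret_cast<FnDLLPtr>(fakeGetProcAddress(name));
  if (proc == nullptr) return fallback;
  return proc();
}
```

The null check guards against a missing name, but nothing here (or in the real API) can guard against FnDLLPtr describing the wrong signature. That part is still on you.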

line 10

  • On this line we call the function that we’ve just gotten a pointer to. It may seem intuitive that we would want to dereference the function pointer before calling it, but that’s actually unnecessary. Still, int result = (*procAddress)(); is totally valid code, and it’s valid for a weird reason. The C standard says that dereferencing a function pointer yields a function designator, which is immediately converted back into a pointer to that function. Yep, it’s a no-change operation, which means int result = (***********procAddress)(); is also totally valid.
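A tiny sketch of that no-change behavior, in plain C++ with no DLL involved. The function answer and the three call helpers are names invented for this example.

```cpp
typedef int (*IntFn)(void);

// Hypothetical stand-in for a function obtained from a DLL.
int answer(void) { return 7; }

int callPlain(IntFn p) { return p(); }               // call straight through the pointer
int callDeref(IntFn p) { return (*p)(); }            // dereference once, then call
int callStars(IntFn p) { return (***********p)(); }  // dereference many times, then call
```

All three helpers end up calling the same function, so callPlain(&answer), callDeref(&answer) and callStars(&answer) return the same value.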

line 12

  • This is where we call the FreeLibrary function on the DLL handle to attempt to unload the DLL from the process’s memory space. An interesting characteristic is that FreeLibrary may not actually unload the DLL if the process has loaded it more than once. Each LoadLibrary call on an already loaded DLL increments a per-process reference count, each FreeLibrary call decrements it, and the DLL is only unloaded once the count reaches zero. It would be reasonable to assume that there will only ever be a few handles to a DLL and that this counter would not need to contain very large numbers. Curious to see if there was a design error in the maximum capacity that the counter could hold, I tried to find the max number. The max must be greater than what can be held in 30 bits since I ran out of memory before I saw any strange behavior. There may still be a design error in incrementing and decrementing the counter since it may not be atomic, but I haven’t done any multi-threaded testing of this supposition.
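Here is a rough sketch of how the reference counting could be observed. The function name stillLoadedAfterOneFree is made up, LoadLibraryW is the wide-character function that LoadLibrary maps to in a Unicode build, and the count comments assume nothing else in the process has already loaded the DLL. The #else branch is only there so the snippet compiles on non-Windows compilers.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Loads the same DLL twice, frees it once, and reports whether it is still
// loaded at that point. Returns false if the DLL could not be loaded at all.
bool stillLoadedAfterOneFree(const wchar_t* dllName) {
#ifdef _WIN32
  HMODULE first = LoadLibraryW(dllName);    // reference count: 1
  if (first == NULL) return false;
  HMODULE second = LoadLibraryW(dllName);   // same module, reference count: 2
  if (second == NULL) { FreeLibrary(first); return false; }

  FreeLibrary(second);                      // back to 1, the DLL stays loaded
  bool stillLoaded = (GetModuleHandleW(dllName) != NULL);

  FreeLibrary(first);                       // 0, the DLL can now be unloaded
  return stillLoaded;
#else
  (void)dllName;  // LoadLibrary is Windows-only; this branch just keeps the sketch compilable
  return false;
#endif
}
```

Called as stillLoadedAfterOneFree(L"DLL.dll") from the test app, with DLL.dll somewhere the OS can find it, I would expect this to return true, because the second FreeLibrary is the one that actually lets the DLL go.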

So that’s all the code that’s necessary to load and call the DLL. There’s a problem here though, and you may have caught wind of it earlier when I said the program tries to find the address of the function with the name supplied to GetProcAddress. In C++, function names get decorated with extra symbols (or mangled if you prefer) so that the exported name can encode things like namespaces, classes, and the parameter types of overloaded functions. That means that what we thought was the function name, “fnDLL”, is not actually the exported name anymore. This is only a problem when you are Explicitly Linking to DLLs, and not when you are Implicitly Linking to them. To make sure the name doesn’t get decorated we need to go to our DLL code and add the extern keyword. Our DLL header code looks like this:

#ifdef DLL_EXPORTS
#define DLL_API __declspec(dllexport)
#else
#define DLL_API __declspec(dllimport)
#endif

// This class is exported from the DLL.dll
class DLL_API CDLL {
public:
  CDLL(void);
  // TODO: add your methods here.
};

extern DLL_API int nDLL;

DLL_API int fnDLL(void);

There is already one extern keyword in there; it precedes the nDLL variable declaration. For the sake of example I’m going to put extern in front of the class declaration as well (but we’ll see that it doesn’t actually change anything). If we place another extern in front of the fnDLL function declaration then we should be good, right? Well, let’s see. We can view the names of things within the DLL using a program called Dependency Walker.

Just for reference, here is what the DLL was exporting when the extern keyword was removed entirely from the code. Notice the question marks in front of the recognizable names, and the @ symbols and other characters after them.

withoutExternDependencyWalker

Once we’ve added in the extern keyword to each of the three declaration lines in the header (lines 8, 14, and 16) Dependency Walker shows us that the names are still decorated. In fact it looks exactly the same! So what are we missing to get unmangled names?

withExternDependencyWalker

We are missing the string that tells the compiler which language linkage to use, in this case “C”, since C doesn’t decorate exported names. The fixed code looks like this.

// This class is exported from the DLL.dll
extern "C" class DLL_API CDLL {
public:
  CDLL(void);
  // TODO: add your methods here.
};

extern "C" DLL_API int nDLL;

extern "C" DLL_API int fnDLL(void);

After compiling the DLL, we get the following Dependency Walker output.

withExternCDependencyWalker

You might notice that the E column now has C for the last two lines instead of C++, and the names on those lines are no longer mangled.
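One way to see why C++ needs decoration and C linkage doesn’t is overloading. The sketch below uses invented function names (describe and describeC) and has nothing DLL-specific in it. It only shows that two C++ functions may share a source-level name, which forces the compiler to tell them apart in the symbol name.

```cpp
// Two C++ overloads: same name in the source, so the compiler has to emit two
// different decorated symbol names that encode the parameter types.
int describe(int)    { return 1; }
int describe(double) { return 2; }

// C linkage: emitted under a C-style name that does not encode parameter types.
// A second extern "C" function named describeC with different parameters
// would be rejected by the compiler.
extern "C" int describeC(int) { return 3; }
```

Calling describe(1) picks the int overload and describe(1.0) picks the double one. With C linkage there is only ever one describeC, so the plain name is enough to find it.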

If we run our test app against this new DLL that uses extern "C" then we will be able to find the function fnDLL and call it. Alternatively, we could just change the name we look for in the test app to the decorated name, “?fnDLL@@YAHXZ”, and then GetProcAddress would find it and we’d be able to call it. This last method is useful when you are working with DLLs that you don’t have the code for and can’t recompile with extern "C". You may notice that the names of the class and its constructor are still decorated. That’s because the compiler can’t export those as C names, since those types don’t exist within the C language.
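The idea of trying the plain name first and the decorated name second can be sketched as follows. So that it compiles without a real DLL, fakeCppDllResolver is a made-up stand-in that behaves like a DLL built without extern "C"; in the test app the resolver would be something that calls GetProcAddress(hInstance, name). GenericProc plays the role of FARPROC, and all of the helper names are assumptions of this sketch. The decorated name is the one quoted above, so it is specific to the Microsoft compiler and to this exact signature.

```cpp
#include <cstddef>
#include <cstring>

// Generic function pointer, standing in for FARPROC.
typedef void (*GenericProc)();

// Tries each candidate name in order; returns the first non-null hit, else null.
// resolve is anything callable as resolve(const char*) that returns a GenericProc.
template <typename Resolver>
GenericProc findFirstExport(Resolver resolve, const char* const* names, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    GenericProc proc = resolve(names[i]);
    if (proc != nullptr) return proc;
  }
  return nullptr;
}

// Hypothetical body standing in for the DLL's fnDLL.
int fakeFnDLL(void) { return 42; }

// Made-up resolver that only knows the decorated name.
GenericProc fakeCppDllResolver(const char* name) {
  if (std::strcmp(name, "?fnDLL@@YAHXZ") == 0)
    return reinterpret_cast<GenericProc>(&fakeFnDLL);
  return nullptr;
}

// Looks for the plain name, then the decorated one, and calls whatever it finds.
// Returns notFound when neither name resolves.
template <typename Resolver>
int callFnDLL(Resolver resolve, int notFound) {
  const char* const names[] = { "fnDLL", "?fnDLL@@YAHXZ" };
  GenericProc proc = findFirstExport(resolve, names, sizeof(names) / sizeof(names[0]));
  if (proc == nullptr) return notFound;
  return reinterpret_cast<int (*)(void)>(proc)();
}
```

With the fake resolver, callFnDLL(&fakeCppDllResolver, -1) misses on "fnDLL", hits on the decorated name, and returns the value from fakeFnDLL.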

Those decorations are actually meaningful and Dependency Walker has a way of interpreting their meaning for you (provided that the DLL was compiled using the Microsoft compiler that comes with Visual Studio). I’ve recompiled without extern "C" and have opened the DLL in Dependency Walker. If you right click in the lower right window where all of the DLL’s components are shown, there’s an Undecorate C++ Functions option.

withoutExternDependencyWalker_rightClick

Clicking it reveals the signatures of the DLL’s components.

withoutExternDependencyWalker_undecorate

Those signatures are pretty nice to have since now you know how to cast the function pointers from GetProcAddress. If the DLL was compiled with extern "C" then the signatures wouldn’t be available. Here’s what the undecoration of the C exported DLL looks like.

withExternCDependencyWalker_undecorate

The C++ class and constructor are revealed but the function and variable types are still unknown. Perhaps C type exporting is a way to obfuscate the nature of components in your DLL? Only superficially. Someone can always decompile your DLL and see what’s really going on.

At this point you know enough to go out and start Explicitly Linking to DLLs. I hope this has been informative and remember, FreeLibrary() your DLLs when you’re done with them.
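If remembering to call FreeLibrary on every path sounds error prone, one option is to let a destructor do it. The sketch below is one possible shape for that, not the only one. ScopedLibrary and tryCallFnDLL are names invented here, LoadLibraryW is the wide-character function that LoadLibrary maps to in a Unicode build, and the #else branch exists only so the snippet compiles on non-Windows compilers.

```cpp
#ifdef _WIN32
#include <windows.h>

// Owns one LoadLibrary reference and gives it back in the destructor.
class ScopedLibrary {
public:
  explicit ScopedLibrary(const wchar_t* name) : handle_(LoadLibraryW(name)) {}
  ~ScopedLibrary() { if (handle_ != NULL) FreeLibrary(handle_); }
  bool loaded() const { return handle_ != NULL; }
  HMODULE get() const { return handle_; }
private:
  ScopedLibrary(const ScopedLibrary&);             // non-copyable: declared, never defined
  ScopedLibrary& operator=(const ScopedLibrary&);
  HMODULE handle_;
};
#endif

// Loads DLL.dll, looks up fnDLL and calls it.
// Returns true and writes to *result only if every step worked.
bool tryCallFnDLL(int* result) {
#ifdef _WIN32
  ScopedLibrary dll(L"DLL.dll");
  if (!dll.loaded()) return false;        // DLL not found or failed to load

  typedef int (*FnDLLPtr)(void);
  FnDLLPtr fn = reinterpret_cast<FnDLLPtr>(GetProcAddress(dll.get(), "fnDLL"));
  if (fn == NULL) return false;           // the destructor still frees the DLL

  *result = fn();
  return true;                            // the destructor frees the DLL here too
#else
  (void)result;  // LoadLibrary is Windows-only; this branch just keeps the sketch compilable
  return false;
#endif
}
```

The early returns are the reason to bother: whichever way the function exits, the DLL is released exactly once.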

 

Addendum

  • Different compilers have different name mangling schemes. If you’re not using the Microsoft compiler then Dependency Walker won’t be able to unmangle the name. You can always look up the conventions if you have a hunch about the compiler. Here’s the explanation for the G++ mangling.
  • A common misconception is that C doesn’t have namespaces. It actually does, but unlike in C++ they aren’t explicitly created by users. C has four kinds of name space: labels, tags (struct, union, and enum names), the members of each struct or union, and ordinary identifiers (which include functions, variables, and typedef names). Since C has no overloading, classes, or user-defined namespaces, a function’s plain name is enough to identify it, so function names don’t get mangled when they are exported from a DLL.
  • When I recompiled the DLL without extern "C" I got the following error (withoutExternCompileError). This was because I had the variable nDLL declared in the header and also defined in the cpp. Once extern is removed, the line in the header is itself a definition, so the cpp file ends up defining nDLL a second time. Without extern, the variable should be declared and defined in one place. I deleted the definition (line 9 above) and added it to the declaration in the header so that the header contained DLL_API int nDLL=0;.