I'm now involved in two projects in which I do a hefty bit of coding. In particular, they both involve the creation and use of dynamic libraries in their respective languages. Although I've done both of these tasks numerous times, I do them so infrequently that I easily forget the compilation flags and such. So, for at least my own reference, I'll do the following in this post:
- Describe the basics of libraries.
- Write and compile a dynamic shared library.
- Write and compile a program that uses the functionality of the library.
Let's first review some terminology. I sometimes get my terms mixed up so it's often good to step back and see what's going on.
Library
A library is a repository of functions or classes used to provide additional functionality to independent programs. Code compiled into a library can potentially be used by many programs such that those programs don't need the source code of that functionality in order to use it. For example, instead of having to rewrite the code for printf over and over again, it's stored in the standard C library and declared in the header stdio.h. In fact, you can find that header file in /usr/include/stdio.h. The library that it links to is /usr/lib/libc.dylib. (More on linking later.)
Static Library
Libraries come in two flavors: static and dynamic. The two types are primarily distinguished by the time at which the functionality of the library is loaded into a program. A program referencing a static library will copy all of the referenced routines, functions, and variables of that library into the program at compile-time. The filename of a static library usually contains lib as a prefix and .a as a suffix. There are several pros and cons to using a static library instead of a dynamic library:
- Pros:
  - Since the library is loaded at compile-time, you can be sure that all functionality is present within the program when distributed and that it's up to date.
  - The entire program can be contained within a single executable.
  - In some cases there is a performance improvement since the program doesn't have to communicate back and forth between files.
- Cons:
  - No way to change or update the functionality of a program after compile-time unless you recompile the entire program. (This can take a while with large programs.)
  - Statically linking libraries increases the size of the executable since all of the appropriate library functionality is present within the file.
  - Library functionality can't be shared between programs.
Dynamic Library
Also known as a dynamic-link library, the referenced functionality of a dynamic library is loaded into the target program at run-time as opposed to compile-time. The library is sometimes shipped with the program and placed in a common library repository, such as /usr/lib/ or /usr/local/lib/, although other times one may have to install the library separately. The filename of a dynamic library usually contains lib as a prefix and .so as a suffix. (On the Mac OS X platform, the suffix is .dylib instead.) There are several pros and cons to using a dynamic library instead of a static library:
- Pros:
  - Since the library is a separate file, one can update the functionality of a program without having to update the entire program.
  - Multiple programs can use the same library.
- Cons:
  - Dependency issues mean that you may have to install multiple libraries before you can use a desired program.
  - Licensing issues: you can't use a library that is incompatible with your chosen software license. (Example: proprietary software can use LGPL-licensed libraries but not GPL-licensed libraries.)
In the end, choosing between compiling your code as a static or dynamic library depends on its functionality and how you implement that functionality.
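To make "loaded at run-time" a bit more concrete, here is a minimal sketch that asks the system's dynamic loader for a function by name while the program is already running. Note that this uses the POSIX dlopen/dlsym interface, which is a different (explicit) mechanism from the -l style linking used later in this post, and the helper name call_int_function is made up for illustration.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: looks up a function of type int f(int) by name at
   run time and calls it.
   path: library file to load, or NULL to search the running program and
         the libraries it already has loaded.
   Returns 0 on success and stores the result in *out; returns -1 if the
   library or the symbol cannot be found. */
int call_int_function(const char *path, const char *name, int arg, int *out)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL)
        return -1;

    int (*fn)(int);
    /* Idiom from the POSIX rationale for storing dlsym's void * result
       in a function pointer. */
    *(void **)(&fn) = dlsym(handle, name);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(arg);
    dlclose(handle);
    return 0;
}
```

With a library like the one built later in this post, a call such as call_int_function("./lib/libfoolib.dylib", "mysquare", 3, &result) would be the intended use. On Linux systems with an older glibc you may also need to add -ldl when linking.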
Shared Libraries
Concerning both static and dynamic libraries, there is one more layer of distinction. A shared library is one where the loaded functionality is allocated in a certain physical page of RAM. Programs --- one or several at a time --- that use this functionality address this same page as opposed to creating their own physical page copy of the loaded library for their own, personal use. Essentially, a shared library acts like an executable and is a common occurrence in modern operating systems.
Linking to Libraries
Finally, a link is a reference from an executable to a library. A linker is a standard program that establishes this link during compile-time. Specifically, in the static library case, the linker will take as input the program object files and the required static libraries and output an executable. (Or possibly another library.) In the dynamic library case, only undefined symbols and addresses to dynamic libraries are loaded into the output executable. On UNIX-based systems, the linker program is ld. However, it's usually automatically used when compiling a program with a common compiler, such as gcc or gfortran.
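As a rough illustration of what the linker has to resolve, consider the sketch below. The names lib_square and sum_of_squares are hypothetical. The caller only ever sees a prototype; if the definition at the bottom lived in a separate library instead of in this file, the caller's object file would list lib_square as an undefined symbol until the linker matched it up.

```c
/* Prototype only: this is all the calling code knows about the function.
   lib_square is a hypothetical name used for illustration. */
int lib_square(int x);

/* Caller code. If this were compiled on its own with -c, the object file
   would record lib_square as an undefined symbol for the linker. */
int sum_of_squares(int a, int b)
{
    return lib_square(a) + lib_square(b);
}

/* Stand-in definition so this sketch links as a single file. In a real
   project it would be compiled into a separate library. */
int lib_square(int x)
{
    return x * x;
}
```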
An Example Program
We will now apply all of these ideas and create a sample dynamic shared library and sample program that will reference this library. Although this will be a small project, we'll use a standard method of organizing the project's main components: the source files, include files, libraries, and binaries. That is, we create a project directory that looks something like this:
- project-foo/
  - Makefile --- instructions for how to compile the library and program
  - bin/ --- contains all output executables
  - include/ --- contains all header files
  - lib/ --- contains all output libraries
  - src/ --- contains all source files
Let's begin with a simple library, foolib, containing only one function. The function will take an integer as input and return its square. (Such a simple operation, of course, need not be written in a function but the point of this post is to show how to use a function in a shared library.)
The header:
/* foolib.h */
int mysquare(int x);
The source:
/* foolib.c */
#include <stdio.h>
#include "foolib.h"

/* Return the square of x, announcing that the library was called. */
int mysquare(int x)
{
    printf("--- Using foolib's mysquare(%d) function...\n", x);
    return x * x;
}
Following the directory hierarchy we set up earlier, we place foolib.h in the include/ directory and foolib.c in the src/ directory. Theoretically, you could dump all of your files into a single directory. However, we (1) want to practice good coding and (2) want to convince ourselves that we are indeed calling the library and not compiling the source code of foolib into our main executable.
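In the spirit of practicing good coding, one common refinement to a header is an include guard. The following is only a sketch and not part of the original project: the macro name FOOLIB_H is an assumption, and the function body at the bottom is a stand-in (without the printf call) so that the snippet compiles as a single file. For a lone prototype the guard isn't strictly required, but it becomes important once the header also defines types.

```c
/* foolib.h, sketched with an include guard (FOOLIB_H is an assumed name) */
#ifndef FOOLIB_H
#define FOOLIB_H

/* Returns the square of x. */
int mysquare(int x);

#endif /* FOOLIB_H */

/* Stand-in for src/foolib.c so that this sketch links on its own. */
int mysquare(int x)
{
    return x * x;
}
```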
Now, for the main file:
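/* foo.c */
#include <stdio.h>
#include "foolib.h"

int main(int argc, char **argv)
{
    int x, xx;

    printf("Enter an integer: ");
    /* scanf returns the number of items read; stop on bad input */
    if (scanf("%d", &x) != 1)
        return 1;
    xx = mysquare(x);   /* provided by the foolib library */
    printf("%d * %d = %d\n", x, x, xx);
    return 0;
}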
As we can see, the main file obtains the names of the library's functions from the corresponding header, foolib.h. In fact, you can find the stdio.h header in /usr/include/ and the standard C library it refers to, libc.dylib (or libc.so on Linux systems), in /usr/lib/. (Note: this shows that a program can interface with a single library through multiple headers. The standard C library is exposed through many header files so that a program can include only the functionality it needs from the very large and versatile library.)
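To illustrate the "one library, many headers" point, here is a small sketch. The helper format_abs is hypothetical; it pulls declarations from three different headers, yet every function it calls is implemented in the same standard C library.

```c
#include <stdio.h>   /* snprintf */
#include <stdlib.h>  /* abs */
#include <string.h>  /* strlen */

/* Hypothetical helper: writes |value| as decimal text into buf and returns
   the number of characters written (not counting the terminator), or -1
   if buf is too small. (abs is undefined for INT_MIN; ignored here.) */
int format_abs(int value, char *buf, size_t size)
{
    int written = snprintf(buf, size, "%d", abs(value));
    if (written < 0 || (size_t)written >= size)
        return -1;
    return (int)strlen(buf);
}
```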
Compiling
Now that we have all of the code written we need to compile both the library and the main program that will reference the library. To do so, we use the popular GNU compiler, gcc. Let's start with presenting the entire Makefile. Then, we'll look at each line step-by-step and describe what's going on:
# Makefile --- for an example dynamic library linking program
# (each command line below must begin with a tab character)
default:
	$(MAKE) library
	$(MAKE) source

library:
	gcc -fPIC -g -c -Wall -I./include \
		-o ./lib/foolib.o ./src/foolib.c
	gcc -fPIC -dynamiclib -o ./lib/libfoolib.dylib ./lib/foolib.o

source:
	gcc -L./lib -lfoolib -I./include ./src/foo.c
Let's take a closer look at the first command that is run:
Step #1 --- Creating an Object File (Pre-Linking)
gcc -fPIC -g -c -Wall -I./include -o ./lib/foolib.o ./src/foolib.c
- -fPIC --- Tells the compiler to create position-independent code suitable for dynamic linking. Code is position independent if it can execute correctly regardless of where it is loaded into RAM. This is essential for shared libraries so that, as mentioned above, each program that uses a library doesn't have to allocate a copy of that library in memory.
- -g --- Include debugging information in the output file so that a debugger can map the machine code back to the source.
- -c --- Tells the compiler to compile the source but not run the linker. Since at this step we are creating a library rather than a finished executable, we only want the object file.
- -Wall --- Enables warnings at compile-time about constructions that are commonly considered questionable. Like -g, it is an aid during development rather than something the library needs in order to work.
- -I dir --- Add the directory dir to the list of directories to be searched for header files. Since the header file for the library code is not in the same directory as the source code, and since we're compiling from the project root folder, we must specify this or else the compiler won't be able to find the header foolib.h.
- -o filename.o --- Name the output file filename.o. If -o is not specified when compiling with -c, the object file is named after the source file (here, foolib.o) and written to the current directory. When linking an executable, the default output is the ambiguously named a.out.
In short, this command outputs an object file named foolib.o. An object file is essentially compiled code that hasn't yet been run through a linker. It's a binary that's mostly made up of machine code but often contains data that the code might use at runtime, relocation information, comments, program symbols, and debugging information. The next command will take this object file and turn it into a dynamic library:
Step #2 --- Creating a Dynamic Library From an Object
gcc -fPIC -dynamiclib -o ./lib/libfoolib.dylib ./lib/foolib.o
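- -fPIC --- See above. (Note: I'm not sure if it's strictly necessary to specify the -fPIC flag again after the creation of an object file, since foolib.o was already compiled with it. The GCC manual does recommend passing the same -fPIC option when linking with -shared as was used for compilation, so repeating it seems to be the safe choice. Can someone out there comment on this?)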
- -dynamiclib (-shared on Linux systems) --- With this option, gcc will create a dynamic library instead of an executable when linking.
- -o libfilename.dylib --- Name the output file libfilename.dylib. Note that the library needs the prefix lib and the suffix .dylib so that the -l flag used in the next step can find it. (On Linux systems this suffix is .so instead.)
You should find in the project's lib/ directory a file named libfoolib.dylib. At this point, you could potentially distribute this file to anyone who has a similar system running on their machine and they can take advantage of the library's functionality. For example, since I compiled this library on my Intel MacBook Pro, anyone with the same or similar computer can link to my silly little squaring function without having to compile the library for themselves.
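If you plan to distribute a library like this, it's worth thinking about which functions users should be able to call. As a rough sketch (the functions multiply and mycube are hypothetical additions, not part of foolib as written above), marking a helper as static keeps it private to the library's source file, while the other functions remain available to programs that link against the library:

```c
/* Internal helper: 'static' gives it internal linkage, so it is not
   visible to programs that link against the library. */
static int multiply(int a, int b)
{
    return a * b;
}

/* Functions with external linkage: these are what a program linking
   against the library could call, and what the header would declare.
   mycube is a hypothetical addition for illustration. */
int mysquare(int x)
{
    return multiply(x, x);
}

int mycube(int x)
{
    return multiply(mysquare(x), x);
}
```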
Now for the final step: linking the library to the main program.
Step #3 --- Linking the Library to the Main Program
gcc -L./lib -lfoolib -I./include ./src/foo.c
- -L dir --- Adds dir to the list of directories the compiler will look through for necessary libraries.
- -llibrary --- Tells the compiler to search for the library named library when linking. Note that there isn't a space between the flag -l and the name of the library; this is the recommended way to write it. The linker searches through the standard list of directories and those specified by the -L flag for a file named liblibrary.a or a dynamic library named liblibrary.dylib (liblibrary.so on Linux).
- -I dir --- Adds dir to the list of directories the compiler will look through for the necessary headers, including those associated with custom libraries.
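Finally, you will get an executable! (Since the command above has no -o flag, gcc names it a.out; add -o foo to get the name used here.) Here's what the output looks like on my computer: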
cswiercz@bellarmine:~/project-foo$ ./foo
Enter an integer: 2
--- Using foolib's mysquare(2) function...
2 * 2 = 4
cswiercz@bellarmine:~/project-foo$
You can check that the program is in fact referencing the library without compiling the library's source code into the main program by first compiling the library and then moving the header, foolib.h, to /usr/local/include/ and moving the library, libfoolib.dylib, to /usr/local/lib/. Then, recompile the main program but remove the -L and -I flags. That should be enough to convince you that foolib.c is only involved in the compilation of the library.
With this framework, we are now ready to make a much more substantial library! In fact, I'm going to use this information right now on two projects I'm working on: one that involves compiling a dynamic library containing routines for setting up and running hidden Markov models on a Windows machine and another that involves integrating a Fortran-based program, Clawpack, into Sage. Fun!
Let me know if you found this guide helpful. I searched around a bit for this information. It's nice to finally have it all on one page.