Learn C Linking: From Object Files to Process Execution
Goal: Build a precise mental model of what happens after the compiler finishes. You will understand how object files encode code and data, how linkers resolve symbols and apply relocations, and how loaders map executable segments into memory and transfer control to your program. By the end, you will diagnose linker errors quickly, reason about binary size/performance trade-offs, and build runtime-extensible systems with dynamic loading.
Why C Linking Matters
Linking and loading are the moment your abstract C program becomes a concrete process. Every crash dump, performance regression, and “undefined reference” error you debug ultimately traces back to how symbols were resolved and how code was relocated in memory.
Real-world impact:
- Build systems: Understanding how libraries are searched and resolved prevents fragile build flags and link-order bugs.
- Performance: Static vs. dynamic linking changes startup time, memory sharing, and update cadence.
- Security: Knowing PLT/GOT and relocation mechanics explains exploitation paths and mitigations like RELRO.
- Operations: ABI compatibility and symbol visibility determine whether rolling updates succeed or fail.
Traditional View          Systems View
┌──────────────────┐      ┌────────────────────────────────────┐
│ gcc main.c       │      │ cc1 → as → ld → loader → process   │
│ "it works"       │      │ symbols, relocations, ABI, PLT/GOT │
└──────────────────┘      └────────────────────────────────────┘
Concrete example: You add a library update, and a production binary crashes only on older machines. The root cause is often not in your C code; it is in ABI drift, symbol versioning, or relocation assumptions. Linking knowledge turns that failure into a tractable debugging session.
Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
- Comfortable C syntax (functions, structs, pointers, arrays)
- File I/O (reading binary files with fread, fseek)
- Basic command-line usage
Helpful But Not Required
- Assembly basics (call, jmp, stack frames)
- ELF or Mach-O familiarity
- Debugger experience (GDB/LLDB)
Self-Assessment Questions
- Can you explain the difference between a declaration and a definition in C?
- Can you compile a file into an object file and inspect it with nm or objdump?
- Do you know why gcc -c and gcc file.o -o file do different things?
Development Environment Setup
- GCC or Clang
- objdump, readelf (Linux) or otool (macOS)
- GDB or LLDB
Time Investment
- Project 1: 1–2 weeks
- Project 2: 2–4 days
- Project 3: 1–2 weeks
- Project 4: 2–4 days
Important Reality Check
This topic is mechanical and detail-heavy. Expect to read raw bytes and consult tool output often. The payoff is lasting clarity when debugging real systems.
Definitions & Mental Models
Object file: A structured container with code (.text), data (.data, .bss), symbols, and relocation entries. It is not executable yet.
Symbol: A named addressable entity (function or global variable). Linkers match undefined references to definitions.
Relocation: A patch that tells the linker or loader where to place the final address of a referenced symbol.
Static linking: Library code is copied into the executable at link time.
Dynamic linking: The executable records dependencies and resolves them at runtime via the dynamic loader.
PIC (Position-Independent Code): Machine code that can run at any address, enabling shared libraries.
PLT/GOT: Indirection tables that allow code to call shared library functions whose addresses are unknown until runtime.
OS-Specific Notes (Linux vs macOS)
Binary format:
- Linux uses ELF (readelf, objdump).
- macOS uses Mach-O (otool, nm, dyldinfo).
Key tool equivalents:
- readelf -h -S file → otool -h -l file
- readelf -s file → nm -a file
- readelf -r file → otool -rV file
- objdump -d file → otool -tV file
Dynamic loader:
- Linux: ld-linux.so (via ld.so), environment variables like LD_LIBRARY_PATH, LD_PRELOAD.
- macOS: dyld, environment variables like DYLD_LIBRARY_PATH, DYLD_INSERT_LIBRARIES.
Library naming:
- Linux shared libraries: libfoo.so
- macOS shared libraries: libfoo.dylib
Note on examples: Most commands below show Linux tooling. On macOS, use the equivalents above, and expect slightly different output formatting and relocation terminology.
Core Concept Analysis
1) The Post-Compilation Pipeline
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ main.c │ → │ main.o │ → │ a.out │ → │ process │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
cc1 as + symbols ld loader
The compiler produces machine code and metadata, but it does not resolve external references. The linker is a graph resolver: it matches references to definitions, applies relocations, and builds the final executable image. The loader then maps segments into memory and performs final relocations for dynamic libraries.
2) Object File Anatomy (ELF/Mach-O)
An object file is a map of sections (code/data) and metadata (symbol tables, relocation tables, string tables). The linker reads this structure and decides what to keep, combine, or discard.
ELF File Layout (Simplified)
┌────────────────────────────┐
│ ELF Header │
├────────────────────────────┤
│ Section Headers │
│ ├─ .text (code) │
│ ├─ .data (init data) │
│ ├─ .bss (zero data) │
│ ├─ .symtab / .strtab │
│ └─ .rela.* (relocations) │
└────────────────────────────┘
3) Symbols and Resolution Rules
- Strong symbols: primary definitions (only one allowed).
- Weak symbols: optional defaults (overridden by strong).
- Undefined symbols: references that must be resolved by the linker.
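Link order matters because GNU ld-style linkers process their inputs left to right in a single pass. An archive member is extracted only if it defines a symbol that is still undefined when the archive is reached, so a library listed before the object file that needs it contributes nothing and the reference stays unresolved.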
4) Relocations and Address Patching
Relocations are placeholders that become concrete addresses later.
call printf → call 0x401030
^ relocation ^ patched by linker/loader
Relocation types differ (absolute, PC-relative) and are critical to position-independent code.
5) Static vs Dynamic Linking
- Static: larger binaries, fewer runtime dependencies, reproducible deployments.
- Dynamic: smaller binaries, shared memory, faster updates, ABI compatibility risks.
6) PIC, GOT, and PLT
PIC uses indirection tables so shared libraries can be loaded anywhere:
call printf@plt
│
├─ PLT stub jumps to GOT entry
└─ GOT entry filled by dynamic loader on first call
This is the core of lazy binding and the reason shared libraries can be updated independently.
7) Loader Responsibilities
The loader:
- Maps segments into memory
- Applies relocations for shared objects
- Initializes the process stack (argc, argv, envp)
- Transfers control to _start
8) ABI Stability and Symbol Visibility
Shared libraries require stable ABIs. Symbol visibility (default, hidden) and versioning ensure compatibility. Mistakes here break runtime loading and are hard to debug.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Object files | Sections, symbols, relocation tables, and headers |
| Linking rules | Strong vs weak, link order, archive extraction |
| Relocations | Absolute vs PC-relative, patching call sites |
| Dynamic linking | PIC, GOT/PLT, lazy binding, loader role |
| ABI stability | Versioning, visibility, compatibility risk |
Deep Dive Reading by Concept
| Concept | Book & Chapter | Why It Helps |
|---|---|---|
| Linkers + object files | Computer Systems: A Programmer’s Perspective — Ch. 7 | Practical linker model and ELF basics |
| Process loading | Computer Systems: A Programmer’s Perspective — Ch. 8 | Program execution and loader behavior |
| Dynamic loading | Advanced Programming in the UNIX Environment — Ch. 13 | dlopen, dlsym, and runtime loading |
| C symbol rules | Effective C, 2nd Edition — Ch. 9 | Declarations vs definitions, linkage |
| Low-level execution | Low-Level Programming — Ch. 10-12 | PIC, relocation, and ABI details |
Quick Start (First 48 Hours)
Day 1
- Compile and inspect: gcc -c hello.c then readelf -h -S hello.o
- Run nm hello.o and explain each symbol classification
- Skim CS:APP Ch. 7 sections on linking
Day 2
- Build a tiny static library with ar rcs libx.a and link it
- Use objdump -d to find call printf@plt in a dynamic binary
- Write one-page notes: “How does a symbol become an address?”
Recommended Learning Paths
1) C-first path: Project 1 → Project 2 → Project 3 → Project 4
2) Tooling-first path: Project 1 → Project 3 → Project 2 → Project 4
3) Practical plugin path: Project 4 → Project 1 → Project 2 → Project 3
Project List
These projects are designed to be done in order, as they build upon each other to create a complete mental model of the linking and loading process.
Project 1: Build an ELF/Mach-O Inspector
- File: LEARN_C_LINKING_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Go, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Systems Programming / Binary Formats
- Software or Tool: A C compiler (GCC/Clang)
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A command-line tool that reads an object file (.o) or executable and prints its essential metadata: the file headers, the list of sections (like .text, .data, .bss), and the symbols defined or required by the file. Think of it as a simplified readelf or objdump.
Why it teaches the fundamentals: You cannot understand linking without first understanding the data structure a linker operates on: the object file. This project forces you to confront the binary layout, byte by byte. You’ll stop seeing executables as opaque blobs and start seeing them as structured data.
Core challenges you’ll face:
- Parsing the main file header → maps to understanding the file’s architecture, type, and entry point
- Locating and reading the section header table → maps to learning how the file is divided into code, data, etc.
- Finding the symbol table and string table → maps to figuring out how symbol names are stored and referenced
- Handling different endianness and word sizes (32/64-bit) → maps to writing portable and robust parsing code
Key Concepts:
- ELF File Format: man 5 elf on Linux is the canonical source.
- Struct-based Parsing: “The C Programming Language” (K&R) Ch. 6 on structures.
- File I/O: fopen, fread, fseek are your primary tools.
Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Solid C programming skills, including pointers, structs, and file I/O.
Real world outcome: A tool that gives you insight into any compiled program on your system.
$ ./my_readelf my_program.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 ...
Class: ELF64
Type: REL (Relocatable file)
Machine: AMD x86-64
Section Headers:
[Nr] Name Type Address Offset Size
[ 1] .text PROGBITS 0000000000000000 00000040 0000005a
[ 2] .data PROGBITS 0000000000000000 0000009c 00000004
[ 3] .symtab SYMTAB 0000000000000000 00000a30 000001b0
Symbol Table '.symtab':
Num: Value Size Type Bind Name
8: 000000000000001a 42 FUNC GLOBAL my_function
9: 0000000000000000 4 OBJECT GLOBAL my_global_var
10: 0000000000000000 0 NOTYPE GLOBAL printf (UNDEFINED)
macOS equivalents:
# If you build a Mach-O variant, use:
otool -h -l my_program.o
otool -tV my_program.o
nm -a my_program.o
macOS expected outcome (example):
$ otool -h -l my_program.o
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x00 1 5 488 0x0000
$ nm -a my_program.o | head -n 5
0000000000000000 T _my_function
U _printf
0000000000000000 D _my_global_var
The Core Question You’re Answering: How does an object file describe code, data, and symbols well enough for a linker to assemble a full executable?
Concepts You Must Understand First:
- Binary file layout and struct parsing (K&R Ch. 6)
- ELF headers and section tables (CS:APP Ch. 7)
- Symbol tables and string tables (CS:APP Ch. 7)
- Endianness and word size (CS:APP Ch. 2)
Questions to Guide Your Design:
- Which fields in the ELF header decide whether the file is 32-bit or 64-bit?
- How do you locate the section header string table to name sections?
- Why do symbol names live in a different table than symbols?
- How will you handle unknown or custom sections gracefully?
Thinking Exercise:
Imagine a file with two symbols named foo, one in .text and one in .data. How would your tool display and distinguish them? What fields make them different?
The Interview Questions They’ll Ask:
- What is the difference between a section header and a program header?
- How does the linker use .symtab vs .dynsym?
- Why are .bss sections stored as size-only metadata?
- What does it mean for a symbol to be undefined in a relocatable object?
Hints in Layers:
Layer 1 - Parse the ELF header:
Elf64_Ehdr hdr;
if (fread(&hdr, 1, sizeof(hdr), fp) != sizeof(hdr)) { /* error */ }
if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0) { /* not ELF */ }
Layer 2 - Read section headers:
if (fseek(fp, (long)hdr.e_shoff, SEEK_SET) != 0) { /* error */ }
Elf64_Shdr *shdrs = calloc(hdr.e_shnum, sizeof(Elf64_Shdr));
if (!shdrs || fread(shdrs, sizeof(Elf64_Shdr), hdr.e_shnum, fp) != hdr.e_shnum) { /* error */ }
Layer 3 - Resolve section names:
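if (hdr.e_shstrndx >= hdr.e_shnum) { /* error: index out of range */ }
Elf64_Shdr shstr = shdrs[hdr.e_shstrndx];  /* section header string table */
char *names = malloc(shstr.sh_size);
if (!names || fseek(fp, (long)shstr.sh_offset, SEEK_SET) != 0) { /* error */ }
if (fread(names, 1, shstr.sh_size, fp) != shstr.sh_size) { /* error */ }
/* The name of section i is the string at names + shdrs[i].sh_name */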
Layer 4 - Parse symbols:
// Find SHT_SYMTAB section, then read Elf64_Sym entries
Books That Will Help:
| Book | Chapters | What You’ll Learn |
|---|---|---|
| CS:APP | 7 | Object file structure and symbols |
| K&R | 6 | Structs, file I/O patterns |
| Low-Level Programming | 10 | ELF anatomy and sections |
Common Pitfalls & Debugging:
Problem: “I get garbage section names”
- Why: Using the wrong string table (you need e_shstrndx).
- Fix: Load the section header string table first and use it for section names.
- Quick test: Compare your names with readelf -S output.
Problem: “My tool crashes on 32-bit files”
- Why: Assuming Elf64_* structures for all input.
- Fix: Read e_ident[EI_CLASS] and choose Elf32_* or Elf64_*.
- Quick test: Run against /lib32/ld-linux.so.2 on Linux.
Project 2: A Practical Study of Symbols and Linking
- File: LEARN_C_LINKING_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: N/A
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 1: Beginner
- Knowledge Area: C Programming / Linker Theory
- Software or Tool: GCC/Clang, and your tool from Project 1.
- Main Book: “Linkers and Loaders” by John R. Levine
What you’ll build: Not a single tool, but a series of small, targeted C programs that demonstrate specific linker behaviors. You will create experiments to prove how symbol resolution, weak symbols, and static linking work.
Why it teaches symbols: This project moves from parsing the data structures to understanding the rules the linker applies to them. You’ll create scenarios that intentionally cause linker errors or trigger specific behaviors, forcing you to understand the “why”.
Core challenges you’ll face:
- Causing a “multiple definition” error → maps to understanding what a “strong” symbol is
- Using weak symbols to provide default implementations → maps to practical application of weak/strong symbol rules
- Investigating a static library (.a) file → maps to understanding that a static library is just an archive of .o files
- Seeing how the linker pulls only needed objects from a library → maps to understanding efficient linking
Key Concepts:
- Symbol Resolution Rules: “Computer Systems: A Programmer’s Perspective” Ch. 7
- Weak and Strong Symbols: The __attribute__((weak)) GCC extension.
- Static Libraries: The ar command (ar t my_library.a lists objects).
Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Project 1, basic command-line skills.
Real world outcome:
A deep, intuitive understanding of linker errors. You’ll never be confused by an “undefined reference” again. You’ll have a git repository with several small directories, each demonstrating a core linking concept with a README.md explaining the behavior.
macOS expected outcome (example):
$ nm -g libx.a
0000000000000000 T _a
U _b
The Core Question You’re Answering: What exact rule causes the linker to choose one definition over another, and why does link order matter?
Concepts You Must Understand First:
- Declarations vs definitions (Effective C Ch. 9)
- Strong/weak symbols (CS:APP Ch. 7)
- Archive libraries (.a) (CS:APP Ch. 7)
- One-pass symbol resolution (CS:APP Ch. 7)
Questions to Guide Your Design:
- How does the linker decide which .o file to extract from a .a archive?
- Why does gcc main.o -lfoo work but gcc -lfoo main.o sometimes fail?
- What is the practical use of weak symbols in real systems?
Thinking Exercise: Sketch a link command line that will fail due to order, then reorder it to succeed. Explain why the behavior changes.
The Interview Questions They’ll Ask:
- Why can two files define the same global symbol without a compile error but fail at link time?
- What is the difference between nm output symbols T, t, U, and W?
- How do static libraries avoid bloating executables?
Hints in Layers:
Layer 1 - Provoke a multiple-definition error:
gcc -c a.c b.c
gcc a.o b.o -o bad_link
Layer 2 - Use weak symbols:
__attribute__((weak)) void func(void) { puts("weak"); }
Layer 3 - Inspect a static library:
ar rcs libx.a a.o b.o
ar t libx.a
nm -g --defined-only libx.a
macOS equivalents:
libtool -static -o libx.a a.o b.o
ar t libx.a
nm -gU libx.a
Books That Will Help:
| Book | Chapters | What You’ll Learn |
|---|---|---|
| CS:APP | 7 | Symbol resolution rules |
| Effective C | 9 | Declarations, linkage, visibility |
| Low-Level Programming | 11 | Archive and link behavior |
Common Pitfalls & Debugging:
Problem: “Undefined reference” with library present
- Why: Library appears before the object that needs it.
- Fix: Put -lfoo after the object files that use it.
- Quick test: Compare gcc main.o -lfoo vs gcc -lfoo main.o.
Problem: “Multiple definition of symbol”
- Why: Two strong symbols with the same name.
- Fix: Make one static, or remove one definition.
- Quick test: Use nm to find duplicate T symbols.
Project 3: Relocation and the PLT/GOT
- File: LEARN_C_LINKING_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: N/A
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Assembly / Dynamic Linking
- Software or Tool: A debugger (GDB/LLDB) and a disassembler (objdump).
- Main Book: “Linkers and Loaders” by John R. Levine
What you’ll build: Another lab-based project. You’ll write simple C code that calls a shared library function (like printf). You will then disassemble the executable and trace its execution in a debugger to see the Procedure Linkage Table (PLT) and Global Offset Table (GOT) in action.
Why it teaches relocations and PIC: This project unravels the “magic” of dynamic linking. You will see the exact machine code mechanism that allows an executable to call a library function whose address isn’t known until runtime. It connects the concepts of object files directly to CPU execution.
Core challenges you’ll face:
- Generating readable assembly → maps to using objdump -d or GDB’s disassemble command
- Finding the PLT and GOT sections → maps to using your inspector or readelf -S
- Stepping through the PLT indirection in a debugger → maps to seeing the lazy binding process happen live
- Understanding how the GOT entry is patched on the first call → maps to witnessing the dynamic loader’s work
Key Concepts:
- Procedure Linkage Table: Indirect calls through the PLT stub.
- Lazy Binding: Each function’s symbol is resolved on its first call, not at startup.
- x86 Assembly: A basic understanding of call, jmp, and memory addressing is needed.
Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Project 1, basic GDB skills (setting breakpoints, stepping instructions (si), examining memory (x)).
Real world outcome:
A profound “aha!” moment. You will be able to look at a disassembled binary, see a call printf@plt, and know exactly what series of jumps and memory lookups the CPU will perform to find and execute printf.
macOS expected outcome (example):
$ otool -tV a.out | rg "__stubs|__la_symbol_ptr" -n
0000000100003f60 __TEXT,__stubs:
0000000100005000 __DATA,__la_symbol_ptr:
The Core Question You’re Answering: How can a program call a function whose address is unknown at link time, and why is the first call slower than the rest?
Concepts You Must Understand First:
- Relocations and PIC (CS:APP Ch. 7)
- PLT/GOT structure (Low-Level Programming Ch. 12)
- Debugger instruction stepping (The Art of Debugging with GDB Ch. 2-3)
Questions to Guide Your Design:
- What is stored in the GOT before the first call to printf?
- Why does the PLT stub push an index before jumping to the loader?
- How does LD_BIND_NOW=1 change what you observe?
Thinking Exercise:
Draw a flowchart of the first call to printf showing each jump and the GOT update. Then draw the second call and compare the paths.
The Interview Questions They’ll Ask:
- Explain the role of the PLT and GOT in dynamic linking.
- Why is lazy binding considered a performance optimization?
- How does -fPIC change the generated machine code?
- What is RELRO and how does it affect the GOT?
Hints in Layers:
Layer 1 - Build a minimal test case:
#include <stdio.h>
int main(void) { printf("Hello %d\n", 1); return 0; }  /* the %d argument keeps GCC from rewriting the call to puts */
Layer 2 - Find the PLT entry:
objdump -d a.out | rg "@plt"
readelf -S a.out | rg "\.plt|\.got"
macOS equivalents:
otool -tV a.out | rg "stub|__stubs|__la_symbol_ptr"
otool -l a.out | rg "__TEXT|__stubs|__la_symbol_ptr"
Layer 3 - Step through with GDB:
gdb ./a.out
(gdb) break main
(gdb) run
(gdb) disassemble 'printf@plt'
(gdb) si # step into call
macOS equivalents:
lldb ./a.out
(lldb) break set -n main
(lldb) run
(lldb) disassemble -n printf
(lldb) si
Books That Will Help:
| Book | Chapters | What You’ll Learn |
|---|---|---|
| CS:APP | 7.9 | Lazy binding and dynamic linking |
| Low-Level Programming | 12 | PLT/GOT internals |
| The Art of Debugging with GDB | 2-3 | Instruction-level debugging |
Common Pitfalls & Debugging:
Problem: “I can’t find @plt symbols”
- Why: The binary is stripped or built as PIE with different symbol names.
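- Fix: Compile with -fno-pie -no-pie -O0 for clarity.
- Quick test: readelf -r a.out | rg JUMP_SLO (each PLT entry has a matching JUMP_SLOT relocation; @plt names are synthesized by objdump and do not appear in readelf -s).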
Problem: “GOT doesn’t change after first call”
- Why: Binding might be eager (RELRO + BIND_NOW).
- Fix: Unset LD_BIND_NOW and rebuild without full RELRO (with GNU ld, for example, -Wl,-z,lazy -Wl,-z,norelro).
- Quick test: env | rg LD_BIND_NOW.
Project 4: The Dynamic Loader in Action (dlopen)
- File: LEARN_C_LINKING_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: N/A
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Systems Programming / C
- Software or Tool: libdl library
- Main Book: “Advanced Programming in the UNIX Environment” by Stevens & Rago
What you’ll build: A simple plugin-based program. The main application will load shared libraries (.so files) from a plugins/ directory at runtime, look for a specific function within them (e.g., run_plugin), and execute it.
Why it teaches loading: This project lets you drive the dynamic loader by hand. dlopen, dlsym, and dlclose are library functions (not system calls) that expose the same loading and symbol-lookup machinery ld.so runs at program startup, and you call them explicitly from your own code. It’s the key to building extensible applications.
Core challenges you’ll face:
- Compiling a shared library → maps to using the -fPIC and -shared flags correctly
- Loading a library at runtime → maps to using dlopen and handling errors
- Finding a symbol in the loaded library → maps to using dlsym and casting the returned void* to a function pointer
- Managing library handles and memory → maps to understanding dlclose and its implications
Key Concepts:
- Dynamic Loading API: man 3 dlopen.
- Function Pointers: “The C Programming Language” (K&R) Ch. 5.11.
- PIC/Shared Libraries: GCC documentation on -fPIC.
Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: Solid C skills, especially function pointers.
Real world outcome:
A working host program that can run new functionality by simply dropping new .so files into a directory, without recompiling the main application.
macOS expected outcome (example):
$ ./host
Hello from Plugin 1!
The Core Question You’re Answering: How does a running process discover and call code that did not exist at compile time?
Concepts You Must Understand First:
- Shared libraries and PIC (CS:APP Ch. 7)
- Function pointers (K&R Ch. 5)
- Dynamic loader API (APUE Ch. 13)
Questions to Guide Your Design:
- How will you define a stable plugin interface so different plugins remain compatible?
- How will you handle version mismatches or missing symbols safely?
- What should the host do if a plugin crashes or hangs?
Thinking Exercise: Design a minimal plugin interface with versioning. What symbol names and structs would you expose to keep compatibility over time?
The Interview Questions They’ll Ask:
- What’s the difference between link-time and runtime linking?
- Why must shared libraries be compiled with -fPIC?
- How do dlopen and dlsym relate to ld.so?
Hints in Layers:
Layer 1 - Build shared libraries:
gcc -fPIC -shared plugin1.c -o plugins/plugin1.so
gcc -fPIC -shared plugin2.c -o plugins/plugin2.so
macOS equivalents:
clang -fPIC -dynamiclib plugin1.c -o plugins/plugin1.dylib
clang -fPIC -dynamiclib plugin2.c -o plugins/plugin2.dylib
Layer 2 - Load and call:
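void *handle = dlopen("./plugins/plugin1.so", RTLD_LAZY);
if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
plugin_func_t run = (plugin_func_t)dlsym(handle, "run_plugin");
if (!run) { fprintf(stderr, "%s\n", dlerror()); return 1; }
run();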
macOS equivalents:
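void *handle = dlopen("./plugins/plugin1.dylib", RTLD_LAZY);
if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
plugin_func_t run = (plugin_func_t)dlsym(handle, "run_plugin");
if (!run) { fprintf(stderr, "%s\n", dlerror()); return 1; }
run();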
Layer 3 - Add a stable interface:
typedef struct {
int api_version;
const char *name;
void (*run)(void);
} plugin_v1_t;
Books That Will Help:
| Book | Chapters | What You’ll Learn |
|---|---|---|
| APUE | 13 | dlopen, dlsym, dlclose |
| CS:APP | 7 | Shared libraries and dynamic linking |
| K&R | 5.11 | Function pointers |
Common Pitfalls & Debugging:
Problem: “undefined reference to dlopen”
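- Why: Missing -ldl during linking.
- Fix: Compile with gcc main.c -ldl -o host.
- Quick test: ldd host | rg libdl. On glibc 2.34 and later, dlopen lives in libc itself, so -ldl is optional and ldd may not list libdl.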
Problem: “dlsym returns NULL”
- Why: Symbol name mismatch or hidden visibility.
- Fix: Ensure the function is exported and the name matches exactly.
- Quick test: nm -D plugin1.so | rg run_plugin.
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| ELF/Mach-O Inspector | Level 2: Intermediate | 1-2 weeks | ★★★☆☆ | ★★★☆☆ |
| Symbol/Linking Study | Level 1: Beginner | Weekend | ★★★☆☆ | ★★☆☆☆ |
| PLT/GOT Relocation Lab | Level 3: Advanced | 1-2 weeks | ★★★★★ | ★★★★☆ |
| Dynamic Loader (dlopen) | Level 2: Intermediate | Weekend | ★★★★☆ | ★★★★☆ |
Recommendation
It is essential to do these projects in order. Start with Project 1: Build an ELF/Mach-O Inspector. This provides the fundamental knowledge of the data structures involved. Without it, the other projects will feel abstract and magical.
Once you have your inspector, proceed to the Symbol Study and the PLT/GOT Lab. These hands-on analysis projects will connect the file format knowledge to the actual behavior of the linker and loader. Finally, the dlopen project will let you apply this knowledge to a practical, real-world programming pattern.
This path will take you from theory (file formats) to observation (debugging) to application (plugins), giving you a robust and complete understanding of the C linking and loading process.
Summary
| Project | Main Programming Language |
|---|---|
| Build an ELF/Mach-O Inspector | C |
| A Practical Study of Symbols and Linking | C |
| Relocation and the PLT/GOT | C |
| The Dynamic Loader in Action (dlopen) | C |