Learn C Linking: From Object Files to Process Execution

Goal: Build a precise mental model of what happens after the compiler finishes. You will understand how object files encode code and data, how linkers resolve symbols and apply relocations, and how loaders map executable segments into memory and transfer control to your program. By the end, you will diagnose linker errors quickly, reason about binary size/performance trade-offs, and build runtime-extensible systems with dynamic loading.


Why C Linking Matters

Linking and loading are the moment your abstract C program becomes a concrete process. Every “undefined reference” error, and many of the crash dumps and performance regressions you debug, trace back to how symbols were resolved and how code was relocated in memory.

Real-world impact:

  • Build systems: Understanding how libraries are searched and resolved prevents fragile build flags and link-order bugs.
  • Performance: Static vs. dynamic linking changes startup time, memory sharing, and update cadence.
  • Security: Knowing PLT/GOT and relocation mechanics explains exploitation paths and mitigations like RELRO.
  • Operations: ABI compatibility and symbol visibility determine whether rolling updates succeed or fail.

Traditional View                      Systems View
┌──────────────────┐                  ┌────────────────────────────────────┐
│  gcc main.c      │                  │  cc1 → as → ld → loader → process   │
│   "it works"     │                  │  symbols, relocations, ABI, PLT/GOT │
└──────────────────┘                  └────────────────────────────────────┘

Concrete example: You update a library, and a production binary crashes only on older machines. The root cause is often not in your C code but in ABI drift, symbol versioning, or relocation assumptions. Linking knowledge turns that failure into a tractable debugging session.


Prerequisites & Background Knowledge

Essential Prerequisites (Must Have)

  • Comfortable C syntax (functions, structs, pointers, arrays)
  • File I/O (reading binary files with fread, fseek)
  • Basic command-line usage

Helpful But Not Required

  • Assembly basics (call, jmp, stack frames)
  • ELF or Mach-O familiarity
  • Debugger experience (GDB/LLDB)

Self-Assessment Questions

  • Can you explain the difference between a declaration and a definition in C?
  • Can you compile a file into an object file and inspect it with nm or objdump?
  • Do you know why gcc -c and gcc file.o -o file do different things?

Development Environment Setup

  • GCC or Clang
  • objdump, readelf (Linux) or otool (macOS)
  • GDB or LLDB

Time Investment

  • Project 1: 1–2 weeks
  • Project 2: 2–4 days
  • Project 3: 1–2 weeks
  • Project 4: 2–4 days

Important Reality Check

This topic is mechanical and detail-heavy. Expect to read raw bytes and consult tool output often. The payoff is lasting clarity when debugging real systems.


Definitions & Mental Models

Object file: A structured container with code (.text), data (.data, .bss), symbols, and relocation entries. It is not executable yet.

Symbol: A named addressable entity (function or global variable). Linkers match undefined references to definitions.

Relocation: A record that tells the linker or loader which location to patch with the final address of a referenced symbol.

Static linking: Library code is copied into the executable at link time.

Dynamic linking: The executable records dependencies and resolves them at runtime via the dynamic loader.

PIC (Position-Independent Code): Machine code that can run at any address, enabling shared libraries.

PLT/GOT: Indirection tables that allow code to call shared library functions whose addresses are unknown until runtime.


OS-Specific Notes (Linux vs macOS)

Binary format:

  • Linux uses ELF (readelf, objdump).
  • macOS uses Mach-O (otool, nm, dyldinfo).

Key tool equivalents:

  • readelf -h -S file (Linux)  →  otool -h -l file (macOS)
  • readelf -s file (Linux)  →  nm -a file (macOS)
  • readelf -r file (Linux)  →  otool -rV file (macOS)
  • objdump -d file (Linux)  →  otool -tV file (macOS)

Dynamic loader:

  • Linux: ld-linux.so (via ld.so), environment variables like LD_LIBRARY_PATH, LD_PRELOAD.
  • macOS: dyld, environment variables like DYLD_LIBRARY_PATH, DYLD_INSERT_LIBRARIES.

Library naming:

  • Linux shared libraries: libfoo.so
  • macOS shared libraries: libfoo.dylib

Note on examples: Most commands below show Linux tooling. On macOS, use the equivalents above, and expect slightly different output formatting and relocation terminology.


Core Concept Analysis

1) The Post-Compilation Pipeline

┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
│  main.c  │  →   │  main.o  │  →   │  a.out   │  →   │  process │
└──────────┘      └──────────┘      └──────────┘      └──────────┘
     cc1            as + symbols        ld                loader

The compiler produces machine code and metadata, but it does not resolve external references. The linker is a graph resolver: it matches references to definitions, applies relocations, and builds the final executable image. The loader then maps segments into memory and performs final relocations for dynamic libraries.

2) Object File Anatomy (ELF/Mach-O)

An object file is a map of sections (code/data) and metadata (symbol tables, relocation tables, string tables). The linker reads this structure and decides what to keep, combine, or discard.

ELF File Layout (Simplified)
┌────────────────────────────┐
│ ELF Header                  │
├────────────────────────────┤
│ Section Headers             │
│  ├─ .text (code)            │
│  ├─ .data (init data)       │
│  ├─ .bss  (zero data)       │
│  ├─ .symtab / .strtab       │
│  └─ .rela.* (relocations)   │
└────────────────────────────┘

3) Symbols and Resolution Rules

  • Strong symbols: primary definitions (only one allowed).
  • Weak symbols: optional defaults (overridden by strong).
  • Undefined symbols: references that must be resolved by the linker.

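Link order matters because traditional Unix linkers such as GNU ld scan the command line left to right in a single pass and extract an archive member only if it defines a symbol that is still undefined at that point. If a static library appears before the object file that needs it, symbols can remain unresolved.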

4) Relocations and Address Patching

A relocation marks a placeholder in code or data that the linker or loader later overwrites with a concrete address.

call printf          →  call 0x401030
     ^ relocation          ^ patched by linker/loader

Relocation types differ (absolute, PC-relative) and are critical to position-independent code.

5) Static vs Dynamic Linking

  • Static: larger binaries, fewer runtime dependencies, reproducible deployments.
  • Dynamic: smaller binaries, shared memory, faster updates, ABI compatibility risks.

6) PIC, GOT, and PLT

PIC uses indirection tables so shared libraries can be loaded anywhere:

call printf@plt
   │
   ├─ PLT stub jumps to GOT entry
   └─ GOT entry filled by dynamic loader on first call

This is the core of lazy binding and the reason shared libraries can be updated independently.

7) Loader Responsibilities

The loader:

  • Maps segments into memory
  • Applies relocations for shared objects
  • Initializes the process stack (argc, argv, envp)
  • Transfers control to _start

8) ABI Stability and Symbol Visibility

Shared libraries require stable ABIs. Symbol visibility (default, hidden) and versioning ensure compatibility. Mistakes here break runtime loading and are hard to debug.


Concept Summary Table

Concept Cluster    What You Need to Internalize
Object files       Sections, symbols, relocation tables, and headers
Linking rules      Strong vs weak, link order, archive extraction
Relocations        Absolute vs PC-relative, patching call sites
Dynamic linking    PIC, GOT/PLT, lazy binding, loader role
ABI stability      Versioning, visibility, compatibility risk

Deep Dive Reading by Concept

Concept                   Book & Chapter                                           Why It Helps
Linkers + object files    Computer Systems: A Programmer’s Perspective, Ch. 7      Practical linker model and ELF basics
Process loading           Computer Systems: A Programmer’s Perspective, Ch. 8      Program execution and loader behavior
Dynamic loading           Advanced Programming in the UNIX Environment, Ch. 13     dlopen, dlsym, and runtime loading
C symbol rules            Effective C, 2nd Edition, Ch. 9                          Declarations vs definitions, linkage
Low-level execution       Low-Level Programming, Ch. 10-12                         PIC, relocation, and ABI details

Quick Start (First 48 Hours)

Day 1

  • Compile and inspect: gcc -c hello.c then readelf -h -S hello.o
  • Run nm hello.o and explain each symbol classification
  • Skim CS:APP Ch. 7 sections on linking

Day 2

  • Build a tiny static library with ar rcs libx.a and link it
  • Use objdump -d to find call printf@plt in a dynamic binary
  • Write one-page notes: “How does a symbol become an address?”

Learning paths

  1. C-first path: Project 1 → Project 2 → Project 3 → Project 4
  2. Tooling-first path: Project 1 → Project 3 → Project 2 → Project 4
  3. Practical plugin path: Project 4 → Project 1 → Project 2 → Project 3


Project List

These projects are designed to be done in order, as they build upon each other to create a complete mental model of the linking and loading process.


Project 1: Build an ELF/Mach-O Inspector

  • File: LEARN_C_LINKING_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python, Go, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Systems Programming / Binary Formats
  • Software or Tool: A C compiler (GCC/Clang)
  • Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron

What you’ll build: A command-line tool that reads an object file (.o) or executable and prints its essential metadata: the file headers, the list of sections (like .text, .data, .bss), and the symbols defined or required by the file. Think of it as a simplified readelf or objdump.

Why it teaches the fundamentals: You cannot understand linking without first understanding the data structure a linker operates on: the object file. This project forces you to confront the binary layout, byte by byte. You’ll stop seeing executables as opaque blobs and start seeing them as structured data.

Core challenges you’ll face:

  • Parsing the main file header → maps to understanding the file’s architecture, type, and entry point
  • Locating and reading the section header table → maps to learning how the file is divided into code, data, etc.
  • Finding the symbol table and string table → maps to figuring out how symbol names are stored and referenced
  • Handling different endianness and word sizes (32/64-bit) → maps to writing portable and robust parsing code

Key Concepts:

  • ELF File Format: man 5 elf on Linux is a convenient reference; the System V ABI (gABI) is the specification.
  • Struct-based Parsing: “The C Programming Language” (K&R) Ch. 6 on structures.
  • File I/O: fopen, fread, fseek are your primary tools.

  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Prerequisites: Solid C programming skills, including pointers, structs, and file I/O.

Real world outcome: A tool that gives you insight into any compiled program on your system.

$ ./my_readelf my_program.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 ...
  Class:   ELF64
  Type:    REL (Relocatable file)
  Machine: AMD x86-64

Section Headers:
  [Nr] Name      Type      Address          Offset   Size
  [ 1] .text     PROGBITS  0000000000000000 00000040 0000005a
  [ 2] .data     PROGBITS  0000000000000000 0000009c 00000004
  [ 3] .symtab   SYMTAB    0000000000000000 00000a30 000001b0

Symbol Table '.symtab':
   Num:    Value          Size Type    Bind   Name
     8: 000000000000001a    42 FUNC    GLOBAL my_function
     9: 0000000000000000     4 OBJECT  GLOBAL my_global_var
    10: 0000000000000000     0 NOTYPE  GLOBAL printf      (UNDEFINED)

macOS equivalents:

# If you build a Mach-O variant, use:
otool -h -l my_program.o
otool -tV my_program.o
nm -a my_program.o

macOS expected outcome (example):

$ otool -h -l my_program.o
Mach header
      magic  cputype  cpusubtype  caps  filetype  ncmds  sizeofcmds  flags
 0xfeedfacf 16777223          3  0x00         1      5        488  0x0000

$ nm -a my_program.o | head -n 5
0000000000000000 T _my_function
                 U _printf
0000000000000000 D _my_global_var

The Core Question You’re Answering: How does an object file describe code, data, and symbols well enough for a linker to assemble a full executable?

Concepts You Must Understand First:

  • Binary file layout and struct parsing (K&R Ch. 6)
  • ELF headers and section tables (CS:APP Ch. 7)
  • Symbol tables and string tables (CS:APP Ch. 7)
  • Endianness and word size (CS:APP Ch. 2)

Questions to Guide Your Design:

  1. Which fields in the ELF header decide whether the file is 32-bit or 64-bit?
  2. How do you locate the section header string table to name sections?
  3. Why do symbol names live in a different table than symbols?
  4. How will you handle unknown or custom sections gracefully?

Thinking Exercise: Imagine a file with two symbols named foo, one in .text and one in .data. How would your tool display and distinguish them? What fields make them different?

The Interview Questions They’ll Ask:

  1. What is the difference between a section header and a program header?
  2. How does the linker use .symtab vs .dynsym?
  3. Why are .bss sections stored as size-only metadata?
  4. What does it mean for a symbol to be undefined in a relocatable object?

Hints in Layers:

Layer 1 - Parse the ELF header:

Elf64_Ehdr hdr;
if (fread(&hdr, 1, sizeof(hdr), fp) != sizeof(hdr)) { /* error */ }
if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0) { /* not ELF */ }

Layer 2 - Read section headers:

if (fseek(fp, (long)hdr.e_shoff, SEEK_SET) != 0) { /* error */ }
Elf64_Shdr *shdrs = calloc(hdr.e_shnum, sizeof *shdrs);  /* one entry per section */
if (!shdrs || fread(shdrs, sizeof *shdrs, hdr.e_shnum, fp) != hdr.e_shnum) { /* error */ }

Layer 3 - Resolve section names:

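if (hdr.e_shstrndx >= hdr.e_shnum) { /* no usable section name table */ }
Elf64_Shdr shstr = shdrs[hdr.e_shstrndx];   /* the section that holds section names */
char *names = malloc(shstr.sh_size);
if (!names || fseek(fp, (long)shstr.sh_offset, SEEK_SET) != 0) { /* error */ }
if (fread(names, 1, shstr.sh_size, fp) != shstr.sh_size) { /* error */ }
/* Section i is named names + shdrs[i].sh_name */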

Layer 4 - Parse symbols:

// Find SHT_SYMTAB section, then read Elf64_Sym entries

Books That Will Help:

Book                     Chapters    What You’ll Learn
CS:APP                   7           Object file structure and symbols
K&R                      6           Structs, file I/O patterns
Low-Level Programming    10          ELF anatomy and sections

Common Pitfalls & Debugging:

Problem: “I get garbage section names”

  • Why: Using the wrong string table (you need e_shstrndx).
  • Fix: Load the section header string table first and use it for section names.
  • Quick test: Compare your names with readelf -S output.

Problem: “My tool crashes on 32-bit files”

  • Why: Assuming Elf64_* structures for all input.
  • Fix: Read e_ident[EI_CLASS] and choose Elf32_* or Elf64_*.
  • Quick test: Run against /lib32/ld-linux.so.2 on Linux.

Project 2: A Practical Study of Symbols and Linking

  • File: LEARN_C_LINKING_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: C Programming / Linker Theory
  • Software or Tool: GCC/Clang, and your tool from Project 1.
  • Main Book: “Linkers and Loaders” by John R. Levine

What you’ll build: Not a single tool, but a series of small, targeted C programs that demonstrate specific linker behaviors. You will create experiments to prove how symbol resolution, weak symbols, and static linking work.

Why it teaches symbols: This project moves from parsing the data structures to understanding the rules the linker applies to them. You’ll create scenarios that intentionally cause linker errors or trigger specific behaviors, forcing you to understand the “why”.

Core challenges you’ll face:

  • Causing a “multiple definition” error → maps to understanding what a “strong” symbol is
  • Using weak symbols to provide default implementations → maps to practical application of weak/strong symbol rules
  • Investigating a static library (.a) file → maps to understanding that a static library is just an archive of .o files
  • Seeing how the linker pulls only needed objects from a library → maps to understanding efficient linking

Key Concepts:

  • Symbol Resolution Rules: “Computer Systems: A Programmer’s Perspective” Ch. 7
  • Weak and Strong Symbols: The __attribute__((weak)) GCC extension.
  • Static Libraries: The ar command (ar t my_library.a lists objects).

  • Difficulty: Beginner
  • Time estimate: Weekend
  • Prerequisites: Project 1, basic command-line skills.

Real world outcome: A deep, intuitive understanding of linker errors. You’ll never be confused by an “undefined reference” again. You’ll have a git repository with several small directories, each demonstrating a core linking concept with a README.md explaining the behavior.

macOS expected outcome (example):

$ nm -g libx.a
0000000000000000 T _a
                 U _b

The Core Question You’re Answering: What exact rule causes the linker to choose one definition over another, and why does link order matter?

Concepts You Must Understand First:

  • Declarations vs definitions (Effective C Ch. 9)
  • Strong/weak symbols (CS:APP Ch. 7)
  • Archive libraries (.a) (CS:APP Ch. 7)
  • One-pass symbol resolution (CS:APP Ch. 7)

Questions to Guide Your Design:

  1. How does the linker decide which .o file to extract from a .a archive?
  2. Why does gcc main.o -lfoo work but gcc -lfoo main.o sometimes fail?
  3. What is the practical use of weak symbols in real systems?

Thinking Exercise: Sketch a link command line that will fail due to order, then reorder it to succeed. Explain why the behavior changes.

The Interview Questions They’ll Ask:

  1. Why can two files define the same global symbol without a compile error but fail at link time?
  2. What is the difference between nm output symbols T, t, U, and W?
  3. How do static libraries avoid bloating executables?

Hints in Layers:

Layer 1 - Provoke a multiple-definition error:

gcc -c a.c b.c
gcc a.o b.o -o bad_link

Layer 2 - Use weak symbols:

#include <stdio.h>
__attribute__((weak)) void func(void) { puts("weak"); }  /* used only if no strong func is linked */

Layer 3 - Inspect a static library:

ar rcs libx.a a.o b.o
ar t libx.a
nm -g --defined-only libx.a

macOS equivalents:

libtool -static -o libx.a a.o b.o
ar t libx.a
nm -gU libx.a

Books That Will Help:

Book                     Chapters    What You’ll Learn
CS:APP                   7           Symbol resolution rules
Effective C              9           Declarations, linkage, visibility
Low-Level Programming    11          Archive and link behavior

Common Pitfalls & Debugging:

Problem: “Undefined reference” with library present

  • Why: Library appears before the object that needs it.
  • Fix: Put -lfoo after the object files that use it.
  • Quick test: Compare gcc main.o -lfoo vs gcc -lfoo main.o.

Problem: “Multiple definition of symbol”

  • Why: Two strong symbols with the same name.
  • Fix: Make one static, or remove one definition.
  • Quick test: Use nm to find duplicate T symbols.

Project 3: Relocation and the PLT/GOT

  • File: LEARN_C_LINKING_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Assembly / Dynamic Linking
  • Software or Tool: A debugger (GDB/LLDB) and a disassembler (objdump).
  • Main Book: “Linkers and Loaders” by John R. Levine

What you’ll build: Another lab-based project. You’ll write simple C code that calls a shared library function (like printf). You will then disassemble the executable and trace its execution in a debugger to see the Procedure Linkage Table (PLT) and Global Offset Table (GOT) in action.

Why it teaches relocations and PIC: This project unravels the “magic” of dynamic linking. You will see the exact machine code mechanism that allows an executable to call a library function whose address isn’t known until runtime. It connects the concepts of object files directly to CPU execution.

Core challenges you’ll face:

  • Generating readable assembly → maps to using objdump -d or GDB’s disassemble command
  • Finding the PLT and GOT sections → maps to using your inspector or readelf -S
  • Stepping through the PLT indirection in a debugger → maps to seeing the lazy binding process happen live
  • Understanding how the GOT entry is patched on the first call → maps to witnessing the dynamic loader’s work

Key Concepts:

  • Procedure Linkage Table: Indirect calls through the PLT stub.
  • Lazy Binding: Each function’s address is resolved on its first call, not at startup.
  • x86 Assembly: A basic understanding of call, jmp, and memory addressing is needed.

  • Difficulty: Advanced
  • Time estimate: 1-2 weeks
  • Prerequisites: Project 1 and basic GDB skills: setting breakpoints, stepping instructions (si), examining memory (x).

Real world outcome: A profound “aha!” moment. You will be able to look at a disassembled binary, see a call printf@plt, and know exactly what series of jumps and memory lookups the CPU will perform to find and execute printf.

macOS expected outcome (example):

$ otool -tV a.out | rg "__stubs|__la_symbol_ptr" -n
0000000100003f60 __TEXT,__stubs:
0000000100005000 __DATA,__la_symbol_ptr:

The Core Question You’re Answering: How can a program call a function whose address is unknown at link time, and why is the first call slower than the rest?

Concepts You Must Understand First:

  • Relocations and PIC (CS:APP Ch. 7)
  • PLT/GOT structure (Low-Level Programming Ch. 12)
  • Debugger instruction stepping (The Art of Debugging with GDB Ch. 2-3)

Questions to Guide Your Design:

  1. What is stored in the GOT before the first call to printf?
  2. Why does the PLT stub push an index before jumping to the loader?
  3. How does LD_BIND_NOW=1 change what you observe?

Thinking Exercise: Draw a flowchart of the first call to printf showing each jump and the GOT update. Then draw the second call and compare the paths.

The Interview Questions They’ll Ask:

  1. Explain the role of the PLT and GOT in dynamic linking.
  2. Why is lazy binding considered a performance optimization?
  3. How does -fPIC change the generated machine code?
  4. What is RELRO and how does it affect the GOT?

Hints in Layers:

Layer 1 - Build a minimal test case:

#include <stdio.h>
int main(void) { printf("Hello %d\n", 42); return 0; }  /* the format argument keeps GCC from rewriting the call to puts */

Layer 2 - Find the PLT entry:

objdump -d a.out | rg "@plt"
readelf -S a.out | rg "\.plt|\.got"

macOS equivalents:

otool -tV a.out | rg "stub|__stubs|__la_symbol_ptr"
otool -l a.out | rg "__TEXT|__stubs|__la_symbol_ptr"

Layer 3 - Step through with GDB:

gdb ./a.out
(gdb) break main
(gdb) run
(gdb) disassemble printf@plt
(gdb) si  # step into call

macOS equivalents:

lldb ./a.out
(lldb) break set -n main
(lldb) run
(lldb) disassemble -n printf
(lldb) si

Books That Will Help:

Book                             Chapters    What You’ll Learn
CS:APP                           7.9         Lazy binding and dynamic linking
Low-Level Programming            12          PLT/GOT internals
The Art of Debugging with GDB    2-3         Instruction-level debugging

Common Pitfalls & Debugging:

Problem: “I can’t find @plt symbols”

  • Why: The binary is stripped or built as PIE with different symbol names.
  • Fix: Compile with -fno-pie -no-pie -O0 for clarity.
  • Quick test: readelf -S a.out | rg "plt".

Problem: “GOT doesn’t change after first call”

  • Why: Binding might be eager (RELRO + BIND_NOW).
  • Fix: Unset LD_BIND_NOW and rebuild without full RELRO (for example, link with -Wl,-z,lazy).
  • Quick test: env | rg LD_BIND_NOW.

Project 4: The Dynamic Loader in Action (dlopen)

  • File: LEARN_C_LINKING_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Systems Programming / C
  • Software or Tool: libdl library
  • Main Book: “Advanced Programming in the UNIX Environment” by Stevens & Rago

What you’ll build: A simple plugin-based program. The main application will load shared libraries (.so files) from a plugins/ directory at runtime, look for a specific function within them (e.g., run_plugin), and execute it.

Why it teaches loading: This project lets you play the role of the dynamic loader. You’ll call the loader’s public API (dlopen, dlsym, dlclose) explicitly in your own code. These are library functions, not system calls, and they drive the same machinery ld.so uses at program startup. It’s the key to building extensible applications.

Core challenges you’ll face:

  • Compiling a shared library → maps to using the -fPIC and -shared flags correctly
  • Loading a library at runtime → maps to using dlopen and handling errors
  • Finding a symbol in the loaded library → maps to using dlsym and casting the returned void* to a function pointer
  • Managing library handles and memory → maps to understanding dlclose and its implications

Key Concepts:

  • Dynamic Loading API: man 3 dlopen.
  • Function Pointers: “The C Programming Language” (K&R) Ch. 5.11.
  • PIC/Shared Libraries: GCC documentation on -fPIC.

  • Difficulty: Intermediate
  • Time estimate: Weekend
  • Prerequisites: Solid C skills, especially function pointers.

Real world outcome: A working host program that can run new functionality by simply dropping new .so files into a directory, without recompiling the main application.

macOS expected outcome (example):

$ ./host
Hello from Plugin 1!

The Core Question You’re Answering: How does a running process discover and call code that did not exist at compile time?

Concepts You Must Understand First:

  • Shared libraries and PIC (CS:APP Ch. 7)
  • Function pointers (K&R Ch. 5)
  • Dynamic loader API (APUE Ch. 13)

Questions to Guide Your Design:

  1. How will you define a stable plugin interface so different plugins remain compatible?
  2. How will you handle version mismatches or missing symbols safely?
  3. What should the host do if a plugin crashes or hangs?

Thinking Exercise: Design a minimal plugin interface with versioning. What symbol names and structs would you expose to keep compatibility over time?

The Interview Questions They’ll Ask:

  1. What’s the difference between link-time and runtime linking?
  2. Why must shared libraries be compiled with -fPIC?
  3. How do dlopen and dlsym relate to ld.so?

Hints in Layers:

Layer 1 - Build shared libraries:

gcc -fPIC -shared plugin1.c -o plugins/plugin1.so
gcc -fPIC -shared plugin2.c -o plugins/plugin2.so

macOS equivalents:

clang -fPIC -dynamiclib plugin1.c -o plugins/plugin1.dylib
clang -fPIC -dynamiclib plugin2.c -o plugins/plugin2.dylib

Layer 2 - Load and call:

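typedef void (*plugin_func_t)(void);  /* assumed plugin signature */
void *handle = dlopen("./plugins/plugin1.so", RTLD_LAZY);
if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
plugin_func_t run = (plugin_func_t)dlsym(handle, "run_plugin");
if (!run) { fprintf(stderr, "%s\n", dlerror()); dlclose(handle); return 1; }
run();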

macOS equivalents:

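typedef void (*plugin_func_t)(void);  /* assumed plugin signature */
void *handle = dlopen("./plugins/plugin1.dylib", RTLD_LAZY);
if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
plugin_func_t run = (plugin_func_t)dlsym(handle, "run_plugin");
if (!run) { fprintf(stderr, "%s\n", dlerror()); dlclose(handle); return 1; }
run();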

Layer 3 - Add a stable interface:

typedef struct {
    int api_version;
    const char *name;
    void (*run)(void);
} plugin_v1_t;

Books That Will Help:

Book      Chapters    What You’ll Learn
APUE      13          dlopen, dlsym, dlclose
CS:APP    7           Shared libraries and dynamic linking
K&R       5.11        Function pointers

Common Pitfalls & Debugging:

Problem: “undefined reference to dlopen”

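  • Why: Missing -ldl during linking. This applies to glibc older than 2.34; newer versions provide dlopen in libc itself.
  • Fix: Compile with gcc main.c -ldl -o host.
  • Quick test: ldd host | rg libdl (on glibc 2.34 or newer, no libdl line is expected).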

Problem: “dlsym returns NULL”

  • Why: Symbol name mismatch or hidden visibility.
  • Fix: Ensure the function is exported and the name matches exactly.
  • Quick test: nm -D plugin1.so | rg run_plugin.

Project Comparison Table

Project                    Difficulty               Time         Depth of Understanding    Fun Factor
ELF/Mach-O Inspector       Level 2: Intermediate    1-2 weeks    ★★★☆☆                     ★★★☆☆
Symbol/Linking Study       Level 1: Beginner        Weekend      ★★★☆☆                     ★★☆☆☆
PLT/GOT Relocation Lab     Level 3: Advanced        1-2 weeks    ★★★★★                     ★★★★☆
Dynamic Loader (dlopen)    Level 2: Intermediate    Weekend      ★★★★☆                     ★★★★☆

Recommendation

It is essential to do these projects in order. Start with Project 1: Build an ELF/Mach-O Inspector. This provides the fundamental knowledge of the data structures involved. Without it, the other projects will feel abstract and magical.

Once you have your inspector, proceed to the Symbol Study and the PLT/GOT Lab. These hands-on analysis projects will connect the file format knowledge to the actual behavior of the linker and loader. Finally, the dlopen project will let you apply this knowledge to a practical, real-world programming pattern.

This path will take you from theory (file formats) to observation (debugging) to application (plugins), giving you a robust and complete understanding of the C linking and loading process.

Summary

Project                                     Main Programming Language
Build an ELF/Mach-O Inspector               C
A Practical Study of Symbols and Linking    C
Relocation and the PLT/GOT                  C
The Dynamic Loader in Action (dlopen)       C