Project 6: Build a Minimal Dynamic Linker
Build a simplified dynamic linker that maps an ELF executable, resolves a minimal set of relocations, and transfers control to
main().
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 5: Expert |
| Time Estimate | 1 month+ |
| Main Programming Language | C (Alternatives: Rust, Zig) |
| Alternative Programming Languages | Rust, Zig |
| Coolness Level | Level 5: Pure Magic |
| Business Potential | Level 4: Infrastructure Model |
| Prerequisites | ELF internals, memory mapping, relocations |
| Key Topics | ELF headers, mapping PT_LOAD, relocations, symbol resolution |
1. Learning Objectives
By completing this project, you will:
- Parse ELF headers and program headers to map segments correctly.
- Load shared libraries and resolve a minimal subset of relocations.
- Build a simplified symbol resolution algorithm.
- Transfer control to program entry safely.
- Explain what happens between
execveandmain().
2. All Theory Needed (Per-Concept Breakdown)
2.1 ELF Headers, Program Headers, and Memory Mapping
Fundamentals
An ELF executable is a structured binary containing headers that describe how it should be loaded. The ELF header identifies the file type, architecture, and entry point. Program headers define memory segments that the loader must map into the process address space. Each PT_LOAD header describes a segment’s file offset, virtual address, size, and permissions. A minimal dynamic linker must parse these headers and map the segments correctly using mmap (or equivalent), honoring alignment and permissions.
Deep Dive into the concept
ELF has two main header tables: the section header table (used by linkers) and the program header table (used by loaders). The program header table is the loader’s blueprint. For each PT_LOAD entry, the loader maps the segment into memory at the specified virtual address (or at a base address plus offset if using PIE). The loader must align mappings to p_align and handle cases where p_memsz is larger than p_filesz, which indicates .bss regions that must be zero-initialized. If you map the segment with mmap directly from the file, you must handle the last page carefully: it may contain both file data and zeroed data.
Permissions are critical. Code segments should be read+execute, data segments read+write. If you map with incorrect permissions, the program will crash on execution or fail due to W^X policies. The loader also needs to set up the program break and stack, but a minimal loader can rely on the host process stack and focus on mapping only the executable and its dependencies.
For PIE executables, the loader chooses a base address, then adds the base to all virtual addresses. For non-PIE, addresses are fixed, and mapping at those addresses may require MAP_FIXED. This can be dangerous in a normal process, but for a minimal loader you can run in a separate process or use a reserved address range. A safe approach is to restrict to PIE executables and choose a base via mmap(NULL, ...) then relocate accordingly.
Understanding the layout of segments is crucial. If you map segments incorrectly, relocations will be applied to the wrong addresses, and the program will crash. Use readelf -l to compare with your mapping logic. A minimal loader should at least map the executable’s PT_LOAD segments and any required shared libraries. For a simplified project, you can support only dynamically linked PIE binaries with a small subset of relocations.
How this fits in this project
You will parse ELF headers and map PT_LOAD segments, establishing the memory image of the executable before relocations.
Definitions & key terms
- ELF header -> Top-level metadata about the ELF file.
- Program header -> Describes loadable segments.
PT_LOAD-> A loadable segment.p_filesz/p_memsz-> File vs memory size.
Mental model diagram (ASCII)
ELF file -> program headers -> mmap segments
PT_LOAD(text) -> RX
PT_LOAD(data) -> RW
How it works (step-by-step, with invariants and failure modes)
- Read ELF header and validate magic and class.
- Read program headers and iterate
PT_LOAD. - Map each segment at correct aligned address.
- Zero-fill extra memory for
.bss.
Invariants: mappings match alignment and permissions. Failure modes: wrong addresses, missing zero-fill, incorrect permissions.
Minimal concrete example
$ readelf -l ./hello_dynamic
# Use p_vaddr, p_offset, p_filesz, p_memsz to map segments
Common misconceptions
- “Sections matter at runtime.” -> Loader uses program headers, not sections.
- “Mapping the file is enough.” -> You must zero
.bssand handle alignment.
Check-your-understanding questions
- Why is
p_memszsometimes larger thanp_filesz? - Why is alignment important for
PT_LOADsegments? - Why are program headers more important than sections for loading?
Check-your-understanding answers
- It represents
.bsswhich is zero-initialized in memory. - The loader must align mappings to page boundaries for correct permissions.
- The loader uses program headers to map memory, sections are for link-time.
Real-world applications
- Runtime loaders in operating systems.
- Binary instrumentation tools.
Where you’ll apply it
- In this project: see Section 3.2 Functional Requirements and Section 4.1 High-Level Design.
- Also used in: P02-library-dependency-visualizer.
References
- System V ABI (ELF spec).
- “Linkers and Loaders” (Levine), Ch. 10.
Key insights The program header table is the loader’s contract: if you map segments correctly, the rest of the runtime work is possible.
Summary Mapping ELF segments is the first job of a loader. Do it wrong, and everything else fails.
Homework/Exercises to practice the concept
- Write a tool that prints all
PT_LOADsegments with sizes and permissions. - Compare
p_fileszvsp_memszfor a binary. - Identify which segment contains
.textvs.data.
Solutions to the homework/exercises
- Use
readelf -lor parse headers in C. .bsscontributes top_memszbut notp_filesz.- Use
readelf -Sand map sections to segments.
2.2 Relocations, GOT/PLT, and Dynamic Section
Fundamentals
Relocations are instructions to the loader on how to fix up addresses after a binary is loaded. The dynamic section points to relocation tables (DT_RELA, DT_JMPREL), the dynamic symbol table (DT_SYMTAB), and other metadata required to resolve symbols. The GOT and PLT provide indirection for global variables and function calls. A minimal loader needs to apply a subset of relocations to get the executable running.
Deep Dive into the concept
Relocations come in different types. On x86-64, common relocation types include R_X86_64_RELATIVE (adjust addresses by base), R_X86_64_GLOB_DAT (resolve global data), and R_X86_64_JUMP_SLOT (resolve function calls). For a minimal loader, you can start with RELATIVE relocations because they do not require symbol lookup; they simply add the base address to a location. Then you can add JUMP_SLOT to resolve function calls and GLOB_DAT for global variables.
The dynamic section (PT_DYNAMIC) contains tags that point to relocation tables (DT_RELA, DT_RELASZ, DT_JMPREL) and the dynamic symbol table (DT_SYMTAB) and string table (DT_STRTAB). You must parse these tags to locate the tables in memory. Each relocation entry references a symbol index; the symbol table entry provides the symbol name, which you must resolve by searching loaded libraries.
GOT and PLT are used to facilitate lazy binding. For simplicity, you can implement eager binding: resolve JUMP_SLOT entries at load time. This means you do not need to implement the PLT resolver trampoline. Instead, you can treat JUMP_SLOT like GLOB_DAT and write the resolved address directly.
Relocation processing must be done in the correct order: map libraries, then apply relocations. If you apply relocations before all dependencies are loaded, symbol lookup might fail. Your loader should build the dependency graph, map each library, then process relocations for the main executable and each library in load order.
How this fits in this project
You will implement a minimal relocation engine that handles RELATIVE, GLOB_DAT, and JUMP_SLOT for x86-64 ELF binaries.
Definitions & key terms
- Relocation -> Loader patch instruction for addresses.
- GOT/PLT -> Indirection tables for data and functions.
DT_RELA-> Relocation table with addends.DT_JMPREL-> Relocation table for PLT entries.
Mental model diagram (ASCII)
reloc table -> [offset, type, sym] -> resolve -> write address
How it works (step-by-step, with invariants and failure modes)
- Parse
.dynamicto find relocation tables and symbol table. - For each relocation entry:
- If RELATIVE: write base + addend.
- If GLOB_DAT/JUMP_SLOT: resolve symbol and write address.
- Continue until all relocations are applied.
Invariants: relocation addresses must be writable; symbols must resolve. Failure modes: missing symbols, wrong base, unsupported relocation type.
Minimal concrete example
$ readelf -r ./hello_dynamic
# Look for R_X86_64_RELATIVE and R_X86_64_JUMP_SLOT
Common misconceptions
- “Relocations are only for data.” -> Function calls also require relocations.
- “PLT must be implemented.” -> You can do eager binding for a minimal loader.
Check-your-understanding questions
- Why are RELATIVE relocations easier to implement?
- What does
DT_JMPRELpoint to? - Why must you load dependencies before resolving symbols?
Check-your-understanding answers
- They do not require symbol lookup, only base adjustment.
- The relocation table for PLT entries.
- Symbol resolution requires the libraries to be mapped and indexed.
Real-world applications
- Dynamic loaders and runtime linkers.
- Binary instrumentation tools.
Where you’ll apply it
- In this project: see Section 4.4 Algorithm Overview and Section 5.10 Phase 2.
- Also used in: P01-plugin-audio-effects-processor.
References
- System V ABI relocation tables.
- “Computer Systems: A Programmer’s Perspective” (Bryant/O’Hallaron), Ch. 7.
Key insights Relocations are the final step that makes mapped code executable; without them, even a mapped binary cannot run.
Summary Relocation handling turns a mapped ELF into a runnable program. A minimal loader can start with a small subset and still be functional.
Homework/Exercises to practice the concept
- Write a parser that lists relocation types in a binary.
- Identify which relocations are
RELATIVEvsJUMP_SLOT. - Modify a binary to remove a relocation and observe failure.
Solutions to the homework/exercises
- Use
readelf -r. - Look at the
Typecolumn in relocation output. - Removing relocations will cause crashes or unresolved symbols.
2.3 Symbol Resolution and Global Scope Rules
Fundamentals
Symbol resolution is the process of finding the address of a named function or variable in a set of loaded objects. The loader maintains a search order and looks for the first matching symbol definition. Resolution uses the dynamic symbol table (.dynsym) and string table. For a minimal loader, you can implement a simplified global scope search: search the main executable first, then each loaded library in order.
Deep Dive into the concept A symbol table entry includes the symbol name, type (function or object), binding (global, weak), and the value (address or offset). When resolving a symbol, the loader uses binding rules: a strong global definition overrides a weak one. If only a weak symbol is found, it can be used as a fallback. If no definition is found, relocation fails. A robust loader must implement these rules, but a minimal loader can simplify by choosing the first strong definition and treating weak as fallback.
The search order matters. The System V ABI defines a symbol lookup scope based on the “global scope” of loaded objects. Typically, this includes the main executable, then libraries loaded at startup, then libraries loaded later with RTLD_GLOBAL. For your loader, you can build a list of loaded objects in dependency order and search it linearly. This is inefficient but simple and sufficient for small binaries.
Symbol versioning complicates resolution, but you can ignore it in a minimal loader if you restrict to binaries that do not require versioned symbols. Alternatively, you can support the default symbol version and skip version checks. This is a common simplification in educational loaders.
Finally, resolution interacts with relocation types. For JUMP_SLOT, you resolve function symbols; for GLOB_DAT, you resolve data symbols. In both cases, you write the resolved address into the relocation target.
How this fits in this project
You will implement a simple symbol resolver that searches loaded libraries in order and resolves GLOB_DAT and JUMP_SLOT relocations.
Definitions & key terms
- Symbol table -> List of symbols exported by a binary.
- Weak symbol -> Fallback symbol used if no strong symbol exists.
- Global scope -> Ordered list of objects searched for symbols.
Mental model diagram (ASCII)
resolve("printf"):
main exe -> libA -> libc -> found
How it works (step-by-step, with invariants and failure modes)
- Build list of loaded objects (main + dependencies).
- For each relocation, look up symbol name.
- Search objects in order for a strong definition.
- If found, write address; if not, fail.
Invariants: symbol tables are parsed correctly. Failure modes: missing symbols, wrong scope order.
Minimal concrete example
void* resolve(const char* name) {
for (obj in objects) {
sym = find_symbol(obj, name);
if (sym) return sym->addr;
}
return NULL;
}
Common misconceptions
- “Symbols are resolved only once.” -> With lazy binding, they can be resolved later.
- “Weak symbols are ignored.” -> They are used if no strong symbol exists.
Check-your-understanding questions
- What is a weak symbol and when is it used?
- Why does search order matter?
- What happens if a symbol is missing?
Check-your-understanding answers
- A weak symbol is a fallback if no strong definition exists.
- The first match wins, which can change behavior.
- Relocation fails and the program cannot start.
Real-world applications
- Loader symbol resolution and LD_PRELOAD interposition.
Where you’ll apply it
- In this project: see Section 4.4 Algorithm Overview and Section 6.2 Critical Test Cases.
- Also used in: P03-ld-preload-interceptor.
References
- System V ABI symbol resolution rules.
- “Linkers and Loaders” (Levine), Ch. 9-10.
Key insights Symbol resolution is deterministic but order-dependent; understanding scope is crucial.
Summary A minimal resolver can be simple and still functional, as long as it respects ordering and binding rules.
Homework/Exercises to practice the concept
- Use
nm -Dto list symbols in libc. - Identify weak vs strong symbols.
- Write a small resolver that searches two symbol tables.
Solutions to the homework/exercises
nm -D /lib/x86_64-linux-gnu/libc.so.6 | head.- Weak symbols are marked with
W. - Use linear search on arrays of symbol entries.
2.4 Entry Point Transfer and Process Initialization
Fundamentals After mapping segments and applying relocations, the loader transfers control to the program’s entry point. Normally, the kernel jumps into the dynamic linker, which then sets up the process and calls the program’s entry point. For a minimal loader, you can call the entry point directly once relocations are done. This requires setting up the initial stack or running in a process where the stack is already valid.
Deep Dive into the concept
The ELF header defines e_entry, the virtual address of the entry point. In dynamically linked programs, this entry point typically points to the program’s _start function, which sets up the C runtime and eventually calls main. If you jump directly to e_entry without setting up the expected environment (argc/argv/envp/auxv), the program may crash. A minimal loader can instead call the program’s main if you can locate it, but that also requires setting up the environment. The simplest approach is to run the loader as a separate process and let it construct a minimal stack with argc=1 and argv[0] pointing to the executable name.
For an educational loader, you can limit the scope: support a simple test binary that does not rely on complex runtime features. You can also design a custom test program that uses a small startup stub expecting a minimal environment. This makes the project feasible while still teaching core loader mechanics.
How this fits in this project You will transfer control to the entry point of a small test binary after relocations, and observe it print “Hello, world!”.
Definitions & key terms
- Entry point -> Address where execution begins.
_start-> Default entry point that sets up the runtime.- Auxiliary vector (auxv) -> Kernel-provided runtime info.
Mental model diagram (ASCII)
map -> relocate -> jump to e_entry -> _start -> main
How it works (step-by-step, with invariants and failure modes)
- Locate
e_entryin ELF header. - Ensure all relocations applied.
- Build minimal stack (argc, argv, envp).
- Jump to entry point.
Invariants: entry point address is valid and mapped. Failure modes: invalid stack, missing runtime setup.
Minimal concrete example
typedef void (*entry_fn)(int, char**, char**);
entry_fn entry = (entry_fn)(base + ehdr.e_entry);
entry(1, argv, envp);
Common misconceptions
- “Jumping to entry point always works.” -> It requires correct stack setup.
- “
mainis the entry point.” ->mainis called by_start.
Check-your-understanding questions
- What does
_startdo before callingmain? - Why does the stack need to be set up?
- Why is
e_entrynot necessarilymain?
Check-your-understanding answers
- It sets up argc/argv/envp and initializes the runtime.
- The program expects arguments and environment on the stack.
e_entrypoints to_start, notmain.
Real-world applications
- Custom loaders and sandboxed runtimes.
Where you’ll apply it
- In this project: see Section 3.7 Real World Outcome and Section 5.10 Phase 3.
References
- ABI docs for process startup.
- “Linkers and Loaders” (Levine).
Key insights Relocations alone are not enough; the process environment must match expectations.
Summary
Transferring control is the final step, and it requires correct stack setup and awareness of _start.
Homework/Exercises to practice the concept
- Write a minimal
_startthat callsmain. - Use
objdump -dto find_startin a binary. - Create a test binary that ignores argv/envp and just prints.
Solutions to the homework/exercises
- Use inline assembly to call
mainwith a dummy stack. objdump -d ./hello | grep _start.- Compile with
-nostdlibfor a minimal program.
3. Project Specification
3.1 What You Will Build
A minimal dynamic linker myld that:
- Loads a PIE ELF executable and its dependencies.
- Maps
PT_LOADsegments. - Applies a limited set of relocations.
- Transfers control to the entry point.
3.2 Functional Requirements
- ELF parsing: Validate ELF headers and program headers.
- Memory mapping: Map
PT_LOADsegments with correct permissions. - Dependency loading: Load required shared libraries.
- Relocations: Support
RELATIVE,GLOB_DAT,JUMP_SLOT. - Entry transfer: Jump to entry point with minimal stack setup.
3.3 Non-Functional Requirements
- Safety: Reject unsupported ELF files cleanly.
- Determinism: Fixed load order and predictable behavior.
- Diagnostics: Clear logs of mapping and relocation.
3.4 Example Usage / Output
$ ./myld ./hello_dynamic
[loader] mapped PT_LOAD segments
[loader] loaded libc.so.6
[loader] resolved 37 relocations
[loader] transferring control...
Hello, world!
3.5 Data Formats / Schemas / Protocols
ELF structures
typedef struct {
unsigned char e_ident[16];
uint16_t e_type;
uint64_t e_entry;
uint64_t e_phoff;
uint16_t e_phnum;
} Elf64_Ehdr;
3.6 Edge Cases
- Non-PIE binaries (reject).
- Unsupported relocation types.
- Missing dependencies.
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
./myld ./tests/fixtures/hello_pie
3.7.2 Golden Path Demo (Deterministic)
- Loads
hello_pieand prints exactlyHello, world!.
3.7.3 CLI Transcript (Success + Failure)
$ ./myld ./tests/fixtures/hello_pie
[loader] mapped PT_LOAD segments
[loader] loaded libc.so.6
[loader] resolved 37 relocations
[loader] transferring control...
Hello, world!
[exit] code=0
$ ./myld ./tests/fixtures/hello_static
[error] unsupported file: not PIE
[exit] code=12
3.7.4 If CLI: Exit Codes
0: success12: unsupported ELF13: unresolved relocation
4. Solution Architecture
4.1 High-Level Design
myld
-> parse ELF
-> map segments
-> load deps
-> relocate
-> jump to entry
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| ELF parser | Read headers | Reject non-PIE |
| Mapper | mmap segments |
Correct permissions |
| Resolver | Symbol lookup | Simple linear search |
| Relocator | Apply relocations | Limited types |
4.3 Data Structures (No Full Code)
typedef struct {
char* path;
void* base;
Elf64_Dyn* dynamic;
} loaded_obj_t;
4.4 Algorithm Overview
Key Algorithm: Load and Relocate
- Map executable segments.
- Parse
.dynamicand load dependencies. - Build global symbol table.
- Apply relocations.
- Jump to entry point.
Complexity Analysis:
- Time: O(V + E + R) where R = relocations.
- Space: O(V).
5. Implementation Guide
5.1 Development Environment Setup
sudo apt-get install build-essential
5.2 Project Structure
myld/
|-- src/
| |-- main.c
| |-- elf.c
| |-- map.c
| |-- reloc.c
| `-- resolve.c
|-- tests/
| `-- fixtures/
|-- Makefile
`-- README.md
5.3 The Core Question You’re Answering
“What actually happens between execve and main()?”
5.4 Concepts You Must Understand First
- ELF headers and program headers.
- Dynamic section and relocations.
- Symbol resolution rules.
- Entry point transfer.
5.5 Questions to Guide Your Design
- Which relocations will you support first?
- How will you locate libc?
- How will you handle unsupported ELF files?
5.6 Thinking Exercise
List the minimum data structures you must parse to load an ELF executable. Which are mandatory?
5.7 The Interview Questions They’ll Ask
- “What is the role of the dynamic linker?”
- “Why do ELF binaries contain a dynamic section?”
- “How do you resolve external symbols?”
5.8 Hints in Layers
Hint 1: Start with RELATIVE relocations
Hint 2: Restrict to PIE binaries
Hint 3: Use readelf to validate your parser
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| ELF loading | Linkers and Loaders | Ch. 10 |
| Relocations | CSAPP | Ch. 7 |
5.10 Implementation Phases
Phase 1: Foundation (1-2 weeks)
- Parse ELF headers and map segments.
Phase 2: Core Functionality (2-3 weeks)
- Load dependencies and implement relocations.
Phase 3: Polish & Edge Cases (1-2 weeks)
- Entry point transfer and deterministic tests.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Binary type | PIE only vs all | PIE only | Simpler mapping |
| Relocations | RELATIVE only vs more | Add JUMP_SLOT | Enough to run simple programs |
| Symbol resolver | hash vs linear | Linear | Simplicity |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | ELF parsing | verify headers |
| Integration Tests | Hello binary | run hello_pie |
| Edge Case Tests | Unsupported files | static binary |
6.2 Critical Test Cases
- PIE binary loads and prints output.
- Unsupported relocation type -> error.
- Missing dependency -> clear error.
6.3 Test Data
hello_pie
hello_static
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong alignment | Segfault | Align to page size |
Missing .bss zero |
Garbage data | Zero p_memsz - p_filesz |
| Unsupported reloc | Crash | Add detection and error |
7.2 Debugging Strategies
- Compare mapping with
readelf -l. - Log every relocation type processed.
7.3 Performance Traps
- Symbol lookup with linear search can be slow for large binaries; acceptable for minimal loader.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add support for
R_X86_64_COPY. - Add simple PLT lazy binding.
8.2 Intermediate Extensions
- Support non-PIE executables.
- Implement symbol versioning.
8.3 Advanced Extensions
- Implement full
ld.sobehavior. - Add support for TLS relocations.
9. Real-World Connections
9.1 Industry Applications
- OS runtime loaders and sandboxed runtimes.
- Security tools that analyze loader behavior.
9.2 Related Open Source Projects
- musl dynamic linker.
- glibc
ld-linux.
9.3 Interview Relevance
- Demonstrates deep knowledge of program startup and linking.
10. Resources
10.1 Essential Reading
- “Linkers and Loaders” (Levine).
- System V ABI (ELF spec).
10.2 Video Resources
- ELF loader internals conference talks.
10.3 Tools & Documentation
readelf,objdump,nm.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain
PT_LOADmapping. - I can describe how relocations are applied.
- I can explain symbol resolution order.
11.2 Implementation
- My loader runs a PIE “hello” binary.
- Unsupported files are rejected cleanly.
- Relocation logs are deterministic.
11.3 Growth
- I can explain loader internals in an interview.
- I documented at least one tricky relocation bug.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Map PT_LOAD segments and apply RELATIVE relocations.
- Transfer control to entry point for a simple binary.
Full Completion:
- Resolve external symbols and run a dynamically linked binary.
- Clear error reporting for missing dependencies.
Excellence (Going Above & Beyond):
- Support symbol versioning and TLS relocations.
- Run a complex system binary under your loader.