Project 6: Minimal Dynamic Linker (Capstone)
Build a simplified dynamic linker that loads ELF binaries, resolves symbols, and runs main().
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1 month+ |
| Language | C (Linux) |
| Prerequisites | ELF parsing, virtual memory, Projects 1-2 |
| Key Topics | mmap, relocations, GOT/PLT, symbol resolution |
1. Learning Objectives
By completing this project, you will:
- Parse ELF headers and program headers.
- Load segments into memory with mmap.
- Resolve dynamic symbols and apply relocations.
- Transfer control to the program entry point.
2. Theoretical Foundation
2.1 Core Concepts
- ELF program headers: Describe loadable segments.
- Relocations: Fix up addresses at runtime for PIC.
- GOT/PLT: Indirection tables used for dynamic calls.
2.2 Why This Matters
Dynamic linking happens before your program even reaches main(). Building a minimal linker reveals exactly how shared libraries are made to work.
2.3 Historical Context / Background
System linkers like ld.so evolved with ELF to support shared code, lazy binding, and versioning. This capstone builds the smallest viable subset.
2.4 Common Misconceptions
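- “The kernel loads everything”: The kernel maps only the executable and the interpreter named in its PT_INTERP segment; the interpreter (the dynamic linker) then loads the shared libraries.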
- “Relocations are optional”: Without relocations, most shared objects cannot run.
3. Project Specification
3.1 What You Will Build
A minimal myld loader that can:
- parse an ELF executable,
- load a dependent shared library,
- resolve a small set of symbols,
- run the target program.
3.2 Functional Requirements
- Parse ELF headers and identify loadable segments.
- Map segments into memory at correct addresses.
- Resolve at least one dynamic symbol from a shared library.
- Apply relocations needed for execution.
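The first requirement can be sketched as a header check that runs before anything is mapped. This is a minimal sketch, assuming a little-endian x86-64 ELF64 target; the function name check_elf_header and its negative return codes are illustrative, not part of any existing API. The constants and types come from the Linux <elf.h> header.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 0 if buf starts with an ELF header this loader can handle,
 * or a negative code naming the first check that failed.
 * Assumption: the target is a little-endian x86-64 ELF64 file. */
int check_elf_header(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;                       /* file too short for a header */
    memcpy(&eh, buf, sizeof eh);         /* copy out to avoid unaligned reads */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -2;                       /* bad magic bytes */
    if (eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -3;                       /* not a 64-bit object */
    if (eh.e_ident[EI_DATA] != ELFDATA2LSB)
        return -4;                       /* not little-endian */
    if (eh.e_type != ET_EXEC && eh.e_type != ET_DYN)
        return -5;                       /* neither executable nor shared object */
    if (eh.e_machine != EM_X86_64)
        return -6;                       /* wrong architecture */
    if (eh.e_phentsize != sizeof(Elf64_Phdr))
        return -7;                       /* unexpected program header size */
    return 0;
}
```

Returning a distinct code per failed check makes it easy to print a precise error and exit cleanly on a malformed binary.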
3.3 Non-Functional Requirements
- Reliability: Exit cleanly on malformed binaries.
- Performance: Not critical; clarity matters.
- Safety: Validate all offsets and sizes.
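The safety requirement is easy to get subtly wrong, because a naive check such as off + size <= file_size can wrap around for hostile values. The sketch below shows one overflow-safe way to write it; range_in_file and table_in_file are hypothetical helper names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the byte range [off, off + size) lies inside a file of
 * file_size bytes. The sum off + size is never computed, so it cannot
 * wrap around. */
bool range_in_file(uint64_t off, uint64_t size, uint64_t file_size)
{
    return off <= file_size && size <= file_size - off;
}

/* True when a table of count entries, each entsize bytes long and
 * starting at off, fits in the file. The multiplication is checked
 * for overflow before it is performed. */
bool table_in_file(uint64_t off, uint64_t count, uint64_t entsize,
                   uint64_t file_size)
{
    if (entsize != 0 && count > UINT64_MAX / entsize)
        return false;
    return range_in_file(off, count * entsize, file_size);
}
```

A loader could call table_in_file with e_phoff, e_phnum, and e_phentsize before touching the program header table, and range_in_file with p_offset and p_filesz before mapping each segment.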
3.4 Example Usage / Output
$ ./myld ./hello_world
Hello, World!
3.5 Real World Outcome
Your loader runs a dynamically linked program without the system loader doing the work:
$ ./myld ./hello_world
Hello, World!
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ ELF binary │────▶│ myld loader │────▶│ program main │
└──────────────┘ └──────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| ELF parser | Read headers and tables | Support 64-bit ELF |
| Loader | mmap segments | Respect permissions |
| Resolver | Symbol lookup | Minimal symbol set |
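"Respect permissions" in the Loader row means translating a segment's p_flags into mmap protection bits. On Linux the two sets of constants do not share bit values (PF_X is 1, while PROT_READ is 1), so passing p_flags straight to mmap is a bug. A minimal sketch of the translation, with prot_from_pflags as an illustrative name:

```c
#include <elf.h>
#include <stdint.h>
#include <sys/mman.h>

/* Converts ELF segment flags (PF_R, PF_W, PF_X) into the PROT_* bits
 * expected by mmap and mprotect. */
int prot_from_pflags(uint32_t p_flags)
{
    int prot = PROT_NONE;

    if (p_flags & PF_R)
        prot |= PROT_READ;
    if (p_flags & PF_W)
        prot |= PROT_WRITE;
    if (p_flags & PF_X)
        prot |= PROT_EXEC;
    return prot;
}
```

If relocations must be written into a segment that is not writable, one option is to map it writable first and call mprotect with the final protection once relocation is done.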
4.3 Data Structures
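#include <stdint.h>

/* One PT_LOAD segment, copied from its program header. */
typedef struct {
    uint64_t vaddr;   /* p_vaddr: virtual address of the segment */
    uint64_t memsz;   /* p_memsz: size in memory; bytes past filesz are zero-filled */
    uint64_t filesz;  /* p_filesz: bytes backed by the file */
    uint64_t offset;  /* p_offset: file offset of the segment data */
    uint32_t flags;   /* p_flags: PF_R, PF_W, PF_X */
} segment_t;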
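Segment addresses and file offsets are rarely page-aligned, while mmap requires a page-aligned file offset and works in whole pages. The sketch below computes a page-aligned mapping request from a segment_t. It repeats the struct so the block compiles on its own; map_request_t and plan_mapping are hypothetical names, and base stands for the load bias (0 for a fixed-address executable).

```c
#include <stdint.h>

typedef struct {          /* same fields as segment_t above */
    uint64_t vaddr;
    uint64_t memsz;
    uint64_t filesz;
    uint64_t offset;
    uint32_t flags;
} segment_t;

typedef struct {          /* hypothetical helper type */
    uint64_t addr;        /* page-aligned start address of the mapping */
    uint64_t length;      /* bytes to map, a multiple of the page size */
    uint64_t offset;      /* page-aligned file offset to pass to mmap */
} map_request_t;

/* page_size must be a power of two, for example the value returned by
 * sysconf(_SC_PAGESIZE). */
map_request_t plan_mapping(const segment_t *seg, uint64_t base,
                           uint64_t page_size)
{
    uint64_t mask  = page_size - 1;
    uint64_t start = base + seg->vaddr;
    uint64_t end   = start + seg->memsz;
    map_request_t req;

    req.addr   = start & ~mask;                      /* round start down */
    req.length = ((end + mask) & ~mask) - req.addr;  /* round end up */
    req.offset = seg->offset & ~mask;                /* round offset down */
    return req;
}
```

Rounding both the address and the offset down keeps them consistent, because the ELF specification requires p_vaddr and p_offset to be congruent modulo p_align for loadable segments. The bytes between filesz and memsz (typically .bss) still have to be zeroed by the loader after mapping.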
4.4 Algorithm Overview
Key Algorithm: Minimal dynamic loading
- Read ELF headers.
- Map PT_LOAD segments.
- Load dependent library and symbols.
- Apply relocations.
- Jump to entry point.
Complexity Analysis:
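- Time: O(S + R) for S segments and R relocations, assuming constant-time symbol lookup; a linear scan over N symbols per lookup makes it O(S + R × N).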
- Space: O(S) for mapped segments.
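The "apply relocations" step can be illustrated with the simplest x86-64 case, R_X86_64_RELATIVE, whose value is the load bias plus the addend. This is a sketch under stated assumptions, not a complete relocation engine: apply_relative_relocs is a hypothetical name, and the image pointer is kept separate from load_bias so the function can be exercised on an ordinary buffer. In a real loader the image starts at the load bias itself.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Applies R_X86_64_RELATIVE entries to an image of image_size bytes.
 * r_offset is treated as an offset into image; the stored value is
 * load_bias + r_addend. Returns the number of relocations applied,
 * or -1 on an out-of-range offset or any other relocation type. */
long apply_relative_relocs(unsigned char *image, size_t image_size,
                           uint64_t load_bias,
                           const Elf64_Rela *rela, size_t count)
{
    long applied = 0;

    for (size_t i = 0; i < count; i++) {
        uint64_t off = rela[i].r_offset;
        uint64_t value;

        if (ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE)
            return -1;                   /* type not handled by this sketch */
        if (off > image_size || image_size - off < sizeof value)
            return -1;                   /* write would leave the image */

        value = load_bias + (uint64_t)rela[i].r_addend;
        memcpy(image + off, &value, sizeof value);
        applied++;
    }
    return applied;
}
```

Symbol-based types such as R_X86_64_GLOB_DAT and R_X86_64_JUMP_SLOT follow the same loop shape but store the resolved symbol address instead of the load bias plus addend.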
5. Implementation Guide
5.1 Development Environment Setup
gcc --version
readelf --version
5.2 Project Structure
project-root/
├── src/
│ ├── main.c
│ ├── elf.c
│ ├── loader.c
│ ├── reloc.c
│ └── resolver.c
└── Makefile
5.3 The Core Question You’re Answering
“What actually happens between execve and main for a dynamically linked program?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- ELF layout: program headers and the dynamic section
- Relocations: RELA records and symbol references
- Memory mapping: mmap, permissions, and alignment
5.5 Questions to Guide Your Design
- Which relocations are mandatory for your target?
- How will you resolve symbols across multiple libs?
- What is the minimal set to run printf?
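One possible answer to the multiple-libraries question is to search the loaded objects in load order and take the first definition found. The sketch below does this with a linear scan of each object's dynamic symbol table. The loaded_obj_t type and both function names are hypothetical, the string table is assumed to end with a NUL byte, and the symbol count is assumed to be known already (it is not stored in the dynamic section directly; a loader can take it from the DT_HASH table, for example).

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-object view of the tables a resolver needs. */
typedef struct {
    uint64_t         base;    /* load bias of this object */
    const Elf64_Sym *symtab;  /* contents of .dynsym */
    size_t           nsyms;   /* number of entries in symtab */
    const char      *strtab;  /* contents of .dynstr, NUL-terminated */
    size_t           strsz;   /* size of strtab in bytes */
} loaded_obj_t;

/* Returns the run-time address of name if this object defines it as a
 * global or weak symbol, otherwise 0. */
uint64_t lookup_in_object(const loaded_obj_t *obj, const char *name)
{
    for (size_t i = 1; i < obj->nsyms; i++) {     /* entry 0 is reserved */
        const Elf64_Sym *s = &obj->symtab[i];
        unsigned bind = ELF64_ST_BIND(s->st_info);

        if (s->st_shndx == SHN_UNDEF)             /* an import, not a definition */
            continue;
        if (s->st_name >= obj->strsz)             /* corrupt name offset */
            continue;
        if (bind != STB_GLOBAL && bind != STB_WEAK)
            continue;
        if (strcmp(obj->strtab + s->st_name, name) == 0)
            return obj->base + s->st_value;
    }
    return 0;
}

/* Searches the objects in load order; the first definition wins. */
uint64_t lookup_symbol(const loaded_obj_t *objs, size_t nobjs,
                       const char *name)
{
    for (size_t i = 0; i < nobjs; i++) {
        uint64_t addr = lookup_in_object(&objs[i], name);
        if (addr != 0)
            return addr;
    }
    return 0;
}
```

A linear scan is enough for a handful of symbols. Production linkers use the hash tables referenced by DT_HASH or DT_GNU_HASH instead.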
5.6 Thinking Exercise
Why must the loader run constructors (.init_array) before main()?
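One way to approach the exercise: code reached from main() may depend on state that constructors set up, so the loader has to call them first, in array order. The sketch below shows the calling loop only. The three-argument signature is an assumption that follows what glibc passes to .init_array entries; run_init_array, ctor_a, and ctor_b are illustrative names, and the two stand-in constructors exist only to make the order observable.

```c
#include <stddef.h>

/* Assumption: entries are called with (argc, argv, envp), as glibc does. */
typedef void (*init_fn)(int, char **, char **);

/* Calls each constructor in array order and returns how many ran. */
size_t run_init_array(const init_fn *fns, size_t count,
                      int argc, char **argv, char **envp)
{
    size_t ran = 0;

    for (size_t i = 0; i < count; i++) {
        if (fns[i] == NULL)
            continue;                /* defensive: skip empty slots */
        fns[i](argc, argv, envp);
        ran++;
    }
    return ran;
}

/* Stand-in constructors for demonstration. In a real loader the array
 * address and size come from DT_INIT_ARRAY and DT_INIT_ARRAYSZ. */
int init_log[4];
int init_log_len;

void ctor_a(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    init_log[init_log_len++] = 1;
}

void ctor_b(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    init_log[init_log_len++] = 2;
}
```

When several libraries are loaded, the constructors of a dependency should run before those of the objects that depend on it, so that a library is initialized before anything calls into it.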
5.7 The Interview Questions They’ll Ask
- What is the role of the dynamic linker?
- How do relocations enable PIC?
- What is the PLT/GOT used for?
5.8 Hints in Layers
Hint 1: Start with static
- Load a static binary first to simplify.
Hint 2: Minimal symbols
- Resolve only puts or printf initially.
Hint 3: Validate mapping
- Use readelf -l to confirm segment layout.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| ELF internals | “Practical Binary Analysis” | Ch. 2-4 |
| Dynamic linking | “Linkers and Loaders” | Ch. 10 |
| Memory mapping | TLPI | Ch. 49 |
5.10 Implementation Phases
Phase 1: Foundation (1-2 weeks)
Goals:
- Parse ELF and load PT_LOAD segments.
Tasks:
- Parse ELF headers.
- Map segments into memory.
Checkpoint: A static binary can run.
Phase 2: Core Functionality (2-3 weeks)
Goals:
- Resolve symbols and relocations.
Tasks:
- Parse .dynsym and .rela.
- Apply relocations.
Checkpoint: A dynamically linked hello program runs.
Phase 3: Polish & Edge Cases (1-2 weeks)
Goals:
- Handle multiple libs and constructors.
Tasks:
- Load dependent shared libraries.
- Run .init_array before entry point.
Checkpoint: Simple programs with libc run.
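Loading dependent libraries starts with finding their names: each DT_NEEDED entry in the dynamic section holds an offset into the dynamic string table. The sketch below collects those names. collect_needed is a hypothetical name, and the caller is assumed to have located the dynamic array and string table already.

```c
#include <elf.h>
#include <stddef.h>

/* Collects DT_NEEDED library names from a dynamic array that should end
 * with DT_NULL. Stores up to max name pointers in out and returns the
 * number of DT_NEEDED entries seen (which may exceed max), or -1 if a
 * name offset is out of range or no DT_NULL is found within ndyn
 * entries. */
long collect_needed(const Elf64_Dyn *dyn, size_t ndyn,
                    const char *strtab, size_t strsz,
                    const char **out, size_t max)
{
    long found = 0;

    for (size_t i = 0; i < ndyn; i++) {
        if (dyn[i].d_tag == DT_NULL)
            return found;                    /* normal end of the array */
        if (dyn[i].d_tag != DT_NEEDED)
            continue;
        if (dyn[i].d_un.d_val >= strsz)
            return -1;                       /* name offset outside strtab */
        if ((size_t)found < max)
            out[found] = strtab + dyn[i].d_un.d_val;
        found++;
    }
    return -1;                               /* missing DT_NULL terminator */
}
```

Each returned name (for example libc.so.6) still has to be turned into a path. A minimal loader might try a fixed list of directories rather than reimplementing the full search rules of the system loader.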
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| ELF class | 32-bit vs 64-bit | 64-bit | Modern systems |
| Reloc set | minimal vs full | minimal first | Scope control |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| ELF parsing | Validate headers | Compare with readelf |
| Mapping | Correct memory layout | Verify segment addresses |
| Execution | Program runs | ./hello_world |
6.2 Critical Test Cases
- Static hello binary runs.
- Dynamic hello binary runs with printf.
- Multiple dependencies load without crashes.
6.3 Test Data
hello_world (static and dynamic builds)
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong mmap flags | SIGSEGV | Match segment permissions |
| Missing relocations | Crash at call | Implement required relocation types |
| Bad symbol lookup | Unresolved functions | Verify dynsym parsing |
7.2 Debugging Strategies
- Use strace to compare with system loader behavior.
- Dump mapped addresses and compare to readelf -l.
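For the second strategy, a quick way to see what the loader actually mapped is to print the process's own mapping table. The sketch below copies the Linux-specific /proc/self/maps file to a stream; dump_self_maps is an illustrative name, and the approach assumes /proc is mounted.

```c
#include <stdio.h>
#include <string.h>

/* Copies /proc/self/maps (Linux-specific) to out. Returns the number of
 * complete lines copied, or -1 if the file cannot be opened. */
int dump_self_maps(FILE *out)
{
    FILE *in = fopen("/proc/self/maps", "r");
    char line[512];
    int lines = 0;

    if (in == NULL)
        return -1;
    while (fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        if (strchr(line, '\n') != NULL)   /* count only finished lines */
            lines++;
    }
    fclose(in);
    return lines;
}
```

Calling dump_self_maps(stderr) right after the mapping step shows each region's address range and permissions, which can be checked against the PT_LOAD entries that readelf -l prints for the same binary.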
7.3 Performance Traps
Not performance-critical; correctness first.
8. Extensions & Challenges
8.1 Beginner Extensions
- Print segment and relocation tables.
8.2 Intermediate Extensions
- Support lazy binding via PLT.
8.3 Advanced Extensions
- Implement symbol versioning checks.
9. Real-World Connections
9.1 Industry Applications
- Loader debugging: Understand crashes before main.
- Security: Loader behavior impacts exploit techniques.
9.2 Related Open Source Projects
- glibc ld.so: Reference implementation.
- musl ldso: Smaller, readable dynamic loader.
9.3 Interview Relevance
- Deep systems knowledge and ELF internals.
10. Resources
10.1 Essential Reading
- System V ABI - ELF specification.
- TLPI - Memory mappings and linking.
10.2 Video Resources
- Search: “build a dynamic linker”.
10.3 Tools & Documentation
- readelf and objdump.
- man mmap, man elf.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how ELF segments map to memory.
- I can describe how relocations work.
11.2 Implementation
- My loader runs a simple dynamic binary.
- I can resolve at least one external symbol.
11.3 Growth
- I can reason about loader errors in real systems.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Load a static binary and run it.
Full Completion:
- Run a dynamically linked program with printf.
Excellence (Going Above & Beyond):
- Load multiple libraries and handle constructors.
This guide was generated from SHARED_LIBRARIES_LEARNING_PROJECTS.md. For the complete learning path, see the parent directory README.