Project 6: Build a Minimal Dynamic Linker

Build a simplified dynamic linker that maps an ELF executable, resolves a minimal set of relocations, and transfers control to main().

Quick Reference

Attribute Value
Difficulty Level 5: Expert
Time Estimate 1 month+
Main Programming Language C (Alternatives: Rust, Zig)
Alternative Programming Languages Rust, Zig
Coolness Level Level 5: Pure Magic
Business Potential Level 4: Infrastructure Model
Prerequisites ELF internals, memory mapping, relocations
Key Topics ELF headers, mapping PT_LOAD, relocations, symbol resolution

1. Learning Objectives

By completing this project, you will:

  1. Parse ELF headers and program headers to map segments correctly.
  2. Load shared libraries and resolve a minimal subset of relocations.
  3. Build a simplified symbol resolution algorithm.
  4. Transfer control to program entry safely.
  5. Explain what happens between execve and main().

2. All Theory Needed (Per-Concept Breakdown)

2.1 ELF Headers, Program Headers, and Memory Mapping

Fundamentals An ELF executable is a structured binary containing headers that describe how it should be loaded. The ELF header identifies the file type, architecture, and entry point. Program headers define memory segments that the loader must map into the process address space. Each PT_LOAD header describes a segment’s file offset, virtual address, size, and permissions. A minimal dynamic linker must parse these headers and map the segments correctly using mmap (or equivalent), honoring alignment and permissions.

Deep Dive into the concept ELF has two main header tables: the section header table (used by linkers) and the program header table (used by loaders). The program header table is the loader’s blueprint. For each PT_LOAD entry, the loader maps the segment into memory at the specified virtual address (or at a base address plus offset if using PIE). The loader must align mappings to p_align and handle cases where p_memsz is larger than p_filesz, which indicates .bss regions that must be zero-initialized. If you map the segment with mmap directly from the file, you must handle the last page carefully: it may contain both file data and zeroed data.

Permissions are critical. Code segments should be read+execute, data segments read+write. If you map with incorrect permissions, the program will crash on execution or fail due to W^X policies. The loader also needs to set up the program break and stack, but a minimal loader can rely on the host process stack and focus on mapping only the executable and its dependencies.

For PIE executables, the loader chooses a base address, then adds the base to all virtual addresses. For non-PIE, addresses are fixed, and mapping at those addresses may require MAP_FIXED. This can be dangerous in a normal process, but for a minimal loader you can run in a separate process or use a reserved address range. A safe approach is to restrict to PIE executables and choose a base via mmap(NULL, ...) then relocate accordingly.

Understanding the layout of segments is crucial. If you map segments incorrectly, relocations will be applied to the wrong addresses, and the program will crash. Use readelf -l to compare with your mapping logic. A minimal loader should at least map the executable’s PT_LOAD segments and any required shared libraries. For a simplified project, you can support only dynamically linked PIE binaries with a small subset of relocations.

How this fits in this project You will parse ELF headers and map PT_LOAD segments, establishing the memory image of the executable before relocations.

Definitions & key terms

  • ELF header -> Top-level metadata about the ELF file.
  • Program header -> Describes loadable segments.
  • PT_LOAD -> A loadable segment.
  • p_filesz / p_memsz -> File vs memory size.

Mental model diagram (ASCII)

ELF file -> program headers -> mmap segments
   PT_LOAD(text)  -> RX
   PT_LOAD(data)  -> RW

How it works (step-by-step, with invariants and failure modes)

  1. Read ELF header and validate magic and class.
  2. Read program headers and iterate PT_LOAD.
  3. Map each segment at correct aligned address.
  4. Zero-fill extra memory for .bss.

Invariants: mappings match alignment and permissions. Failure modes: wrong addresses, missing zero-fill, incorrect permissions.

Minimal concrete example

$ readelf -l ./hello_dynamic
# Use p_vaddr, p_offset, p_filesz, p_memsz to map segments

Common misconceptions

  • “Sections matter at runtime.” -> Loader uses program headers, not sections.
  • “Mapping the file is enough.” -> You must zero .bss and handle alignment.

Check-your-understanding questions

  1. Why is p_memsz sometimes larger than p_filesz?
  2. Why is alignment important for PT_LOAD segments?
  3. Why are program headers more important than sections for loading?

Check-your-understanding answers

  1. It represents .bss which is zero-initialized in memory.
  2. The loader must align mappings to page boundaries for correct permissions.
  3. The loader uses program headers to map memory, sections are for link-time.

Real-world applications

  • Runtime loaders in operating systems.
  • Binary instrumentation tools.

Where you’ll apply it

References

  • System V ABI (ELF spec).
  • “Linkers and Loaders” (Levine), Ch. 10.

Key insights The program header table is the loader’s contract: if you map segments correctly, the rest of the runtime work is possible.

Summary Mapping ELF segments is the first job of a loader. Do it wrong, and everything else fails.

Homework/Exercises to practice the concept

  1. Write a tool that prints all PT_LOAD segments with sizes and permissions.
  2. Compare p_filesz vs p_memsz for a binary.
  3. Identify which segment contains .text vs .data.

Solutions to the homework/exercises

  1. Use readelf -l or parse headers in C.
  2. .bss contributes to p_memsz but not p_filesz.
  3. Use readelf -S and map sections to segments.

2.2 Relocations, GOT/PLT, and Dynamic Section

Fundamentals Relocations are instructions to the loader on how to fix up addresses after a binary is loaded. The dynamic section points to relocation tables (DT_RELA, DT_JMPREL), the dynamic symbol table (DT_SYMTAB), and other metadata required to resolve symbols. The GOT and PLT provide indirection for global variables and function calls. A minimal loader needs to apply a subset of relocations to get the executable running.

Deep Dive into the concept Relocations come in different types. On x86-64, common relocation types include R_X86_64_RELATIVE (adjust addresses by base), R_X86_64_GLOB_DAT (resolve global data), and R_X86_64_JUMP_SLOT (resolve function calls). For a minimal loader, you can start with RELATIVE relocations because they do not require symbol lookup; they simply add the base address to a location. Then you can add JUMP_SLOT to resolve function calls and GLOB_DAT for global variables.

The dynamic section (PT_DYNAMIC) contains tags that point to relocation tables (DT_RELA, DT_RELASZ, DT_JMPREL) and the dynamic symbol table (DT_SYMTAB) and string table (DT_STRTAB). You must parse these tags to locate the tables in memory. Each relocation entry references a symbol index; the symbol table entry provides the symbol name, which you must resolve by searching loaded libraries.

GOT and PLT are used to facilitate lazy binding. For simplicity, you can implement eager binding: resolve JUMP_SLOT entries at load time. This means you do not need to implement the PLT resolver trampoline. Instead, you can treat JUMP_SLOT like GLOB_DAT and write the resolved address directly.

Relocation processing must be done in the correct order: map libraries, then apply relocations. If you apply relocations before all dependencies are loaded, symbol lookup might fail. Your loader should build the dependency graph, map each library, then process relocations for the main executable and each library in load order.

How this fits in this project You will implement a minimal relocation engine that handles RELATIVE, GLOB_DAT, and JUMP_SLOT for x86-64 ELF binaries.

Definitions & key terms

  • Relocation -> Loader patch instruction for addresses.
  • GOT/PLT -> Indirection tables for data and functions.
  • DT_RELA -> Relocation table with addends.
  • DT_JMPREL -> Relocation table for PLT entries.

Mental model diagram (ASCII)

reloc table -> [offset, type, sym] -> resolve -> write address

How it works (step-by-step, with invariants and failure modes)

  1. Parse .dynamic to find relocation tables and symbol table.
  2. For each relocation entry:
    • If RELATIVE: write base + addend.
    • If GLOB_DAT/JUMP_SLOT: resolve symbol and write address.
  3. Continue until all relocations are applied.

Invariants: relocation addresses must be writable; symbols must resolve. Failure modes: missing symbols, wrong base, unsupported relocation type.

Minimal concrete example

$ readelf -r ./hello_dynamic
# Look for R_X86_64_RELATIVE and R_X86_64_JUMP_SLOT

Common misconceptions

  • “Relocations are only for data.” -> Function calls also require relocations.
  • “PLT must be implemented.” -> You can do eager binding for a minimal loader.

Check-your-understanding questions

  1. Why are RELATIVE relocations easier to implement?
  2. What does DT_JMPREL point to?
  3. Why must you load dependencies before resolving symbols?

Check-your-understanding answers

  1. They do not require symbol lookup, only base adjustment.
  2. The relocation table for PLT entries.
  3. Symbol resolution requires the libraries to be mapped and indexed.

Real-world applications

  • Dynamic loaders and runtime linkers.
  • Binary instrumentation tools.

Where you’ll apply it

References

  • System V ABI relocation tables.
  • “Computer Systems: A Programmer’s Perspective” (Bryant/O’Hallaron), Ch. 7.

Key insights Relocations are the final step that makes mapped code executable; without them, even a mapped binary cannot run.

Summary Relocation handling turns a mapped ELF into a runnable program. A minimal loader can start with a small subset and still be functional.

Homework/Exercises to practice the concept

  1. Write a parser that lists relocation types in a binary.
  2. Identify which relocations are RELATIVE vs JUMP_SLOT.
  3. Modify a binary to remove a relocation and observe failure.

Solutions to the homework/exercises

  1. Use readelf -r.
  2. Look at the Type column in relocation output.
  3. Removing relocations will cause crashes or unresolved symbols.

2.3 Symbol Resolution and Global Scope Rules

Fundamentals Symbol resolution is the process of finding the address of a named function or variable in a set of loaded objects. The loader maintains a search order and looks for the first matching symbol definition. Resolution uses the dynamic symbol table (.dynsym) and string table. For a minimal loader, you can implement a simplified global scope search: search the main executable first, then each loaded library in order.

Deep Dive into the concept A symbol table entry includes the symbol name, type (function or object), binding (global, weak), and the value (address or offset). When resolving a symbol, the loader uses binding rules: a strong global definition overrides a weak one. If only a weak symbol is found, it can be used as a fallback. If no definition is found, relocation fails. A robust loader must implement these rules, but a minimal loader can simplify by choosing the first strong definition and treating weak as fallback.

The search order matters. The System V ABI defines a symbol lookup scope based on the “global scope” of loaded objects. Typically, this includes the main executable, then libraries loaded at startup, then libraries loaded later with RTLD_GLOBAL. For your loader, you can build a list of loaded objects in dependency order and search it linearly. This is inefficient but simple and sufficient for small binaries.

Symbol versioning complicates resolution, but you can ignore it in a minimal loader if you restrict to binaries that do not require versioned symbols. Alternatively, you can support the default symbol version and skip version checks. This is a common simplification in educational loaders.

Finally, resolution interacts with relocation types. For JUMP_SLOT, you resolve function symbols; for GLOB_DAT, you resolve data symbols. In both cases, you write the resolved address into the relocation target.

How this fits in this project You will implement a simple symbol resolver that searches loaded libraries in order and resolves GLOB_DAT and JUMP_SLOT relocations.

Definitions & key terms

  • Symbol table -> List of symbols exported by a binary.
  • Weak symbol -> Fallback symbol used if no strong symbol exists.
  • Global scope -> Ordered list of objects searched for symbols.

Mental model diagram (ASCII)

resolve("printf"):
  main exe -> libA -> libc -> found

How it works (step-by-step, with invariants and failure modes)

  1. Build list of loaded objects (main + dependencies).
  2. For each relocation, look up symbol name.
  3. Search objects in order for a strong definition.
  4. If found, write address; if not, fail.

Invariants: symbol tables are parsed correctly. Failure modes: missing symbols, wrong scope order.

Minimal concrete example

void* resolve(const char* name) {
    for (obj in objects) {
        sym = find_symbol(obj, name);
        if (sym) return sym->addr;
    }
    return NULL;
}

Common misconceptions

  • “Symbols are resolved only once.” -> With lazy binding, they can be resolved later.
  • “Weak symbols are ignored.” -> They are used if no strong symbol exists.

Check-your-understanding questions

  1. What is a weak symbol and when is it used?
  2. Why does search order matter?
  3. What happens if a symbol is missing?

Check-your-understanding answers

  1. A weak symbol is a fallback if no strong definition exists.
  2. The first match wins, which can change behavior.
  3. Relocation fails and the program cannot start.

Real-world applications

  • Loader symbol resolution and LD_PRELOAD interposition.

Where you’ll apply it

  • In this project: see Section 4.4 Algorithm Overview and Section 6.2 Critical Test Cases.
  • Also used in: P03-ld-preload-interceptor.

References

  • System V ABI symbol resolution rules.
  • “Linkers and Loaders” (Levine), Ch. 9-10.

Key insights Symbol resolution is deterministic but order-dependent; understanding scope is crucial.

Summary A minimal resolver can be simple and still functional, as long as it respects ordering and binding rules.

Homework/Exercises to practice the concept

  1. Use nm -D to list symbols in libc.
  2. Identify weak vs strong symbols.
  3. Write a small resolver that searches two symbol tables.

Solutions to the homework/exercises

  1. nm -D /lib/x86_64-linux-gnu/libc.so.6 | head.
  2. Weak symbols are marked with W.
  3. Use linear search on arrays of symbol entries.

2.4 Entry Point Transfer and Process Initialization

Fundamentals After mapping segments and applying relocations, the loader transfers control to the program’s entry point. Normally, the kernel jumps into the dynamic linker, which then sets up the process and calls the program’s entry point. For a minimal loader, you can call the entry point directly once relocations are done. This requires setting up the initial stack or running in a process where the stack is already valid.

Deep Dive into the concept The ELF header defines e_entry, the virtual address of the entry point. In dynamically linked programs, this entry point typically points to the program’s _start function, which sets up the C runtime and eventually calls main. If you jump directly to e_entry without setting up the expected environment (argc/argv/envp/auxv), the program may crash. A minimal loader can instead call the program’s main if you can locate it, but that also requires setting up the environment. The simplest approach is to run the loader as a separate process and let it construct a minimal stack with argc=1 and argv[0] pointing to the executable name.

For an educational loader, you can limit the scope: support a simple test binary that does not rely on complex runtime features. You can also design a custom test program that uses a small startup stub expecting a minimal environment. This makes the project feasible while still teaching core loader mechanics.

How this fits in this project You will transfer control to the entry point of a small test binary after relocations, and observe it print “Hello, world!”.

Definitions & key terms

  • Entry point -> Address where execution begins.
  • _start -> Default entry point that sets up the runtime.
  • Auxiliary vector (auxv) -> Kernel-provided runtime info.

Mental model diagram (ASCII)

map -> relocate -> jump to e_entry -> _start -> main

How it works (step-by-step, with invariants and failure modes)

  1. Locate e_entry in ELF header.
  2. Ensure all relocations applied.
  3. Build minimal stack (argc, argv, envp).
  4. Jump to entry point.

Invariants: entry point address is valid and mapped. Failure modes: invalid stack, missing runtime setup.

Minimal concrete example

typedef void (*entry_fn)(int, char**, char**);
entry_fn entry = (entry_fn)(base + ehdr.e_entry);
entry(1, argv, envp);

Common misconceptions

  • “Jumping to entry point always works.” -> It requires correct stack setup.
  • main is the entry point.” -> main is called by _start.

Check-your-understanding questions

  1. What does _start do before calling main?
  2. Why does the stack need to be set up?
  3. Why is e_entry not necessarily main?

Check-your-understanding answers

  1. It sets up argc/argv/envp and initializes the runtime.
  2. The program expects arguments and environment on the stack.
  3. e_entry points to _start, not main.

Real-world applications

  • Custom loaders and sandboxed runtimes.

Where you’ll apply it

  • In this project: see Section 3.7 Real World Outcome and Section 5.10 Phase 3.

References

  • ABI docs for process startup.
  • “Linkers and Loaders” (Levine).

Key insights Relocations alone are not enough; the process environment must match expectations.

Summary Transferring control is the final step, and it requires correct stack setup and awareness of _start.

Homework/Exercises to practice the concept

  1. Write a minimal _start that calls main.
  2. Use objdump -d to find _start in a binary.
  3. Create a test binary that ignores argv/envp and just prints.

Solutions to the homework/exercises

  1. Use inline assembly to call main with a dummy stack.
  2. objdump -d ./hello | grep _start.
  3. Compile with -nostdlib for a minimal program.

3. Project Specification

3.1 What You Will Build

A minimal dynamic linker myld that:

  • Loads a PIE ELF executable and its dependencies.
  • Maps PT_LOAD segments.
  • Applies a limited set of relocations.
  • Transfers control to the entry point.

3.2 Functional Requirements

  1. ELF parsing: Validate ELF headers and program headers.
  2. Memory mapping: Map PT_LOAD segments with correct permissions.
  3. Dependency loading: Load required shared libraries.
  4. Relocations: Support RELATIVE, GLOB_DAT, JUMP_SLOT.
  5. Entry transfer: Jump to entry point with minimal stack setup.

3.3 Non-Functional Requirements

  • Safety: Reject unsupported ELF files cleanly.
  • Determinism: Fixed load order and predictable behavior.
  • Diagnostics: Clear logs of mapping and relocation.

3.4 Example Usage / Output

$ ./myld ./hello_dynamic
[loader] mapped PT_LOAD segments
[loader] loaded libc.so.6
[loader] resolved 37 relocations
[loader] transferring control...
Hello, world!

3.5 Data Formats / Schemas / Protocols

ELF structures

typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint16_t e_phnum;
} Elf64_Ehdr;

3.6 Edge Cases

  • Non-PIE binaries (reject).
  • Unsupported relocation types.
  • Missing dependencies.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

./myld ./tests/fixtures/hello_pie

3.7.2 Golden Path Demo (Deterministic)

  • Loads hello_pie and prints exactly Hello, world!.

3.7.3 CLI Transcript (Success + Failure)

$ ./myld ./tests/fixtures/hello_pie
[loader] mapped PT_LOAD segments
[loader] loaded libc.so.6
[loader] resolved 37 relocations
[loader] transferring control...
Hello, world!
[exit] code=0

$ ./myld ./tests/fixtures/hello_static
[error] unsupported file: not PIE
[exit] code=12

3.7.4 If CLI: Exit Codes

  • 0: success
  • 12: unsupported ELF
  • 13: unresolved relocation

4. Solution Architecture

4.1 High-Level Design

myld
  -> parse ELF
  -> map segments
  -> load deps
  -> relocate
  -> jump to entry

4.2 Key Components

Component Responsibility Key Decisions
ELF parser Read headers Reject non-PIE
Mapper mmap segments Correct permissions
Resolver Symbol lookup Simple linear search
Relocator Apply relocations Limited types

4.3 Data Structures (No Full Code)

typedef struct {
    char* path;
    void* base;
    Elf64_Dyn* dynamic;
} loaded_obj_t;

4.4 Algorithm Overview

Key Algorithm: Load and Relocate

  1. Map executable segments.
  2. Parse .dynamic and load dependencies.
  3. Build global symbol table.
  4. Apply relocations.
  5. Jump to entry point.

Complexity Analysis:

  • Time: O(V + E + R) where R = relocations.
  • Space: O(V).

5. Implementation Guide

5.1 Development Environment Setup

sudo apt-get install build-essential

5.2 Project Structure

myld/
|-- src/
|   |-- main.c
|   |-- elf.c
|   |-- map.c
|   |-- reloc.c
|   `-- resolve.c
|-- tests/
|   `-- fixtures/
|-- Makefile
`-- README.md

5.3 The Core Question You’re Answering

“What actually happens between execve and main()?”

5.4 Concepts You Must Understand First

  1. ELF headers and program headers.
  2. Dynamic section and relocations.
  3. Symbol resolution rules.
  4. Entry point transfer.

5.5 Questions to Guide Your Design

  1. Which relocations will you support first?
  2. How will you locate libc?
  3. How will you handle unsupported ELF files?

5.6 Thinking Exercise

List the minimum data structures you must parse to load an ELF executable. Which are mandatory?

5.7 The Interview Questions They’ll Ask

  1. “What is the role of the dynamic linker?”
  2. “Why do ELF binaries contain a dynamic section?”
  3. “How do you resolve external symbols?”

5.8 Hints in Layers

Hint 1: Start with RELATIVE relocations

Hint 2: Restrict to PIE binaries

Hint 3: Use readelf to validate your parser

5.9 Books That Will Help

Topic Book Chapter
ELF loading Linkers and Loaders Ch. 10
Relocations CSAPP Ch. 7

5.10 Implementation Phases

Phase 1: Foundation (1-2 weeks)

  • Parse ELF headers and map segments.

Phase 2: Core Functionality (2-3 weeks)

  • Load dependencies and implement relocations.

Phase 3: Polish & Edge Cases (1-2 weeks)

  • Entry point transfer and deterministic tests.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Binary type PIE only vs all PIE only Simpler mapping
Relocations RELATIVE only vs more Add JUMP_SLOT Enough to run simple programs
Symbol resolver hash vs linear Linear Simplicity

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit Tests ELF parsing verify headers
Integration Tests Hello binary run hello_pie
Edge Case Tests Unsupported files static binary

6.2 Critical Test Cases

  1. PIE binary loads and prints output.
  2. Unsupported relocation type -> error.
  3. Missing dependency -> clear error.

6.3 Test Data

hello_pie
hello_static

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Wrong alignment Segfault Align to page size
Missing .bss zero Garbage data Zero p_memsz - p_filesz
Unsupported reloc Crash Add detection and error

7.2 Debugging Strategies

  • Compare mapping with readelf -l.
  • Log every relocation type processed.

7.3 Performance Traps

  • Symbol lookup with linear search can be slow for large binaries; acceptable for minimal loader.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add support for R_X86_64_COPY.
  • Add simple PLT lazy binding.

8.2 Intermediate Extensions

  • Support non-PIE executables.
  • Implement symbol versioning.

8.3 Advanced Extensions

  • Implement full ld.so behavior.
  • Add support for TLS relocations.

9. Real-World Connections

9.1 Industry Applications

  • OS runtime loaders and sandboxed runtimes.
  • Security tools that analyze loader behavior.
  • musl dynamic linker.
  • glibc ld-linux.

9.3 Interview Relevance

  • Demonstrates deep knowledge of program startup and linking.

10. Resources

10.1 Essential Reading

  • “Linkers and Loaders” (Levine).
  • System V ABI (ELF spec).

10.2 Video Resources

  • ELF loader internals conference talks.

10.3 Tools & Documentation

  • readelf, objdump, nm.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain PT_LOAD mapping.
  • I can describe how relocations are applied.
  • I can explain symbol resolution order.

11.2 Implementation

  • My loader runs a PIE “hello” binary.
  • Unsupported files are rejected cleanly.
  • Relocation logs are deterministic.

11.3 Growth

  • I can explain loader internals in an interview.
  • I documented at least one tricky relocation bug.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Map PT_LOAD segments and apply RELATIVE relocations.
  • Transfer control to entry point for a simple binary.

Full Completion:

  • Resolve external symbols and run a dynamically linked binary.
  • Clear error reporting for missing dependencies.

Excellence (Going Above & Beyond):

  • Support symbol versioning and TLS relocations.
  • Run a complex system binary under your loader.