Project 1: Build an ELF/Mach-O Inspector

Quick Reference

Attribute              Value
Primary Language       C
Alternative Languages  Python, Go, Rust
Difficulty             Level 2: Intermediate
Time Estimate          1-2 weeks
Knowledge Area         Systems Programming / Binary Formats
Tooling                A C compiler (GCC/Clang)
Prerequisites          Solid C programming skills, including pointers, structs, and file I/O.

What You Will Build

A command-line tool that reads an object file (.o) or executable and prints its essential metadata: the file headers, the list of sections (like .text, .data, .bss), and the symbols defined or required by the file. Think of it as a simplified readelf or objdump.

Why It Matters

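Linkers, loaders, debuggers, and disassemblers all start by parsing object files the way this tool does. Knowing how sections and symbol tables are laid out on disk makes linker errors such as “undefined reference” far easier to diagnose.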

Core Challenges

  • Parsing the main file header → maps to understanding the file’s architecture, type, and entry point
  • Locating and reading the section header table → maps to learning how the file is divided into code, data, etc.
  • Finding the symbol table and string table → maps to figuring out how symbol names are stored and referenced
  • Handling different endianness and word sizes (32/64-bit) → maps to writing portable and robust parsing code
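The last challenge, endianness and word size, can be sketched as follows. This is a minimal illustration, not a definitive implementation: the byte offsets and values for the class and data-encoding fields follow man 5 elf, while the names parse_ident, elf_ident_info, and the read_u16/read_u32/read_u64 helpers are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Positions and values inside e_ident, as documented in man 5 elf. */
enum { IDX_CLASS = 4, IDX_DATA = 5, IDENT_SIZE = 16 };
enum { CLASS_32 = 1, CLASS_64 = 2 };
enum { DATA_LSB = 1, DATA_MSB = 2 };

/* Hypothetical summary of what the rest of a parser needs to know. */
typedef struct {
    int is_64bit;      /* 1 for ELF64, 0 for ELF32 */
    int is_big_endian; /* 1 for MSB-first files, 0 for LSB-first files */
} elf_ident_info;

/* Returns 0 on success, -1 if buf does not start with a valid ELF ident. */
int parse_ident(const unsigned char *buf, size_t len, elf_ident_info *out)
{
    assert(buf != NULL && out != NULL);
    /* The literal is split so that 'E' is not read as part of the \x escape. */
    if (len < IDENT_SIZE || memcmp(buf, "\x7f" "ELF", 4) != 0)
        return -1;
    if (buf[IDX_CLASS] != CLASS_32 && buf[IDX_CLASS] != CLASS_64)
        return -1;
    if (buf[IDX_DATA] != DATA_LSB && buf[IDX_DATA] != DATA_MSB)
        return -1;
    out->is_64bit = (buf[IDX_CLASS] == CLASS_64);
    out->is_big_endian = (buf[IDX_DATA] == DATA_MSB);
    return 0;
}

/* Assemble an n-byte integer one byte at a time, so the result does not
 * depend on the byte order of the machine running the tool. */
static uint64_t read_uint(const unsigned char *p, size_t n, int big_endian)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | p[big_endian ? i : n - 1 - i];
    return v;
}

uint16_t read_u16(const unsigned char *p, int big_endian)
{
    return (uint16_t)read_uint(p, 2, big_endian);
}

uint32_t read_u32(const unsigned char *p, int big_endian)
{
    return (uint32_t)read_uint(p, 4, big_endian);
}

uint64_t read_u64(const unsigned char *p, int big_endian)
{
    return read_uint(p, 8, big_endian);
}
```

A parser built this way would call parse_ident on the first 16 bytes of the file, then pass is_big_endian to every later field read. Reading byte by byte is slower than casting a buffer to a struct, but it works the same on any host.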

Key Concepts

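  • ELF File Format: man 5 elf on Linux is the most convenient reference; it documents the structures declared in <elf.h>. The System V ABI specification is the authoritative definition.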
  • Struct-based Parsing: “The C Programming Language” (K&R) Ch. 6 on structures.
  • File I/O: fopen, fread, fseek are your primary tools.
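The struct-based approach combined with fread and fseek might look like the sketch below. It assumes a 64-bit ELF file whose byte order matches the host, and it declares its own my_elf64_ehdr struct (a hypothetical name) that mirrors the Elf64_Ehdr layout from man 5 elf; on Linux you would normally include <elf.h> and use Elf64_Ehdr directly. The function names are also illustrative.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the Elf64_Ehdr layout described in man 5 elf (64 bytes). */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;     /* file offset of the section header table */
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize; /* size of one section header entry */
    uint16_t e_shnum;     /* number of section header entries */
    uint16_t e_shstrndx;
} my_elf64_ehdr;

/* Reads the file header from the start of fp.
 * Returns 0 on success, -1 on I/O failure or if this is not an ELF64 file. */
int read_elf64_header(FILE *fp, my_elf64_ehdr *hdr)
{
    assert(fp != NULL && hdr != NULL);
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if (fread(hdr, sizeof *hdr, 1, fp) != 1)
        return -1;
    if (memcmp(hdr->e_ident, "\x7f" "ELF", 4) != 0)
        return -1;
    if (hdr->e_ident[4] != 2) /* 2 is ELFCLASS64 */
        return -1;
    return 0;
}

/* Positions fp at the start of section header number `index`.
 * Returns 0 on success, -1 if the index or offset is out of range. */
int seek_to_section_header(FILE *fp, const my_elf64_ehdr *hdr, uint16_t index)
{
    assert(fp != NULL && hdr != NULL);
    if (index >= hdr->e_shnum)
        return -1;
    if (hdr->e_shoff > (uint64_t)LONG_MAX)
        return -1;
    /* The product is below 2^32, so this sum cannot wrap a uint64_t. */
    uint64_t off = hdr->e_shoff + (uint64_t)index * hdr->e_shentsize;
    if (off > (uint64_t)LONG_MAX)
        return -1;
    return fseek(fp, (long)off, SEEK_SET) == 0 ? 0 : -1;
}
```

To use it, open the file with fopen(path, "rb"), call read_elf64_header, then call seek_to_section_header for each index from 0 to e_shnum - 1 and fread one section header at a time. Reading a struct directly is the shortest path to working output; supporting files of the opposite byte order requires swapping each field after the read.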

Real-World Outcome

$ ./my_readelf my_program.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 ...
  Class:   ELF64
  Type:    REL (Relocatable file)
  Machine: AMD x86-64

Section Headers:
  [Nr] Name      Type      Address          Offset   Size
  [ 1] .text     PROGBITS  0000000000000000 00000040 0000005a
  [ 2] .data     PROGBITS  0000000000000000 0000009c 00000004
  [ 3] .symtab   SYMTAB    0000000000000000 00000a30 000001b0

Symbol Table '.symtab':
   Num:    Value          Size Type    Bind   Name
     8: 000000000000001a    42 FUNC    GLOBAL my_function
     9: 0000000000000000     4 OBJECT  GLOBAL my_global_var
    10: 0000000000000000     0 NOTYPE  GLOBAL printf      (UNDEFINED)
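
The Type, Bind, and Name columns above come from two small decoding steps, sketched here under stated assumptions. The bit layout of st_info, the numeric type and binding values, and section index 0 meaning "undefined" follow man 5 elf; the helper names (sym_bind, sym_type, strtab_name, and so on) are hypothetical. The string table is assumed to be already loaded into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* st_info packs the binding in the high 4 bits and the type in the low 4. */
unsigned sym_bind(unsigned char st_info) { return st_info >> 4; }
unsigned sym_type(unsigned char st_info) { return st_info & 0x0f; }

/* A symbol whose section index is 0 (SHN_UNDEF) must be supplied by
 * another object file or library, like printf in the output above. */
int sym_is_undefined(uint16_t st_shndx) { return st_shndx == 0; }

const char *sym_type_name(unsigned type)
{
    switch (type) {
    case 0:  return "NOTYPE";
    case 1:  return "OBJECT";
    case 2:  return "FUNC";
    case 3:  return "SECTION";
    case 4:  return "FILE";
    default: return "UNKNOWN";
    }
}

const char *sym_bind_name(unsigned bind)
{
    switch (bind) {
    case 0:  return "LOCAL";
    case 1:  return "GLOBAL";
    case 2:  return "WEAK";
    default: return "UNKNOWN";
    }
}

/* st_name is a byte offset into the string table, not a pointer.
 * Returns the NUL-terminated name, or NULL if the offset is out of range
 * or the name runs past the end of the table. */
const char *strtab_name(const char *strtab, size_t strtab_size,
                        uint32_t st_name)
{
    assert(strtab != NULL);
    if (st_name >= strtab_size)
        return NULL;
    if (memchr(strtab + st_name, '\0', strtab_size - st_name) == NULL)
        return NULL;
    return strtab + st_name;
}
```

For the my_function row, an st_info byte of 0x12 decodes to binding 1 (GLOBAL) and type 2 (FUNC). The bounds check in strtab_name matters because a corrupt file can carry any offset in st_name.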

Implementation Guide

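  1. Reproduce the simplest happy-path scenario: open one known-good ELF64 object file in binary mode and print its magic bytes.
  2. Build the smallest working version of the core feature: decode the file header, then walk the section header table and print each section's name, type, offset, and size.
  3. Add input validation and error handling: reject files with a bad magic number, and check every offset and size against the file length before reading.
  4. Add instrumentation/logging to confirm behavior: compare your output with readelf -h, -S, and -s on the same file.
  5. Refactor into clean modules with tests: separate header, section, and symbol parsing so that 32-bit, big-endian, and Mach-O support can be added incrementally.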
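
The input validation in step 3 can be illustrated with a bounds check on the section header table. This is a sketch under assumptions: the function names are hypothetical, and stream_size relies on fseek with SEEK_END, which works for regular files on POSIX systems but is not guaranteed by the C standard for every binary stream.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns the size of the stream in bytes, or -1 on failure.
 * Leaves the file position at the end; callers must fseek afterwards. */
long stream_size(FILE *fp)
{
    assert(fp != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    return ftell(fp);
}

/* Returns 1 if the byte range [offset, offset + size) lies inside a file
 * of file_size bytes. Written so that no addition can overflow. */
int range_in_file(uint64_t offset, uint64_t size, uint64_t file_size)
{
    return offset <= file_size && size <= file_size - offset;
}

/* Returns 1 if the whole section header table fits inside the file. */
int section_table_in_file(uint64_t e_shoff, uint16_t e_shentsize,
                          uint16_t e_shnum, uint64_t file_size)
{
    /* Two 16-bit factors: the product is below 2^32 and cannot overflow. */
    uint64_t total = (uint64_t)e_shentsize * e_shnum;
    return range_in_file(e_shoff, total, file_size);
}
```

The same range_in_file check applies before reading any section's contents, including the symbol table and string table. Comparing size against file_size - offset, rather than testing offset + size, avoids wraparound when a malformed header carries a huge offset.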

Milestones

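  • Milestone 1: Minimal working program that runs end-to-end and prints the file header of one object file.
  • Milestone 2: Correct outputs for typical inputs: headers, sections, and symbols for both object files and executables.
  • Milestone 3: Robust handling of edge cases such as truncated files, non-ELF inputs, and files with no symbol table.
  • Milestone 4: Clean structure and documented usage.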

Validation Checklist

  • Output matches the real-world outcome example
  • Handles invalid inputs safely
  • Provides clear errors and exit codes
  • Repeatable results across runs

References

  • Main guide: LEARN_C_LINKING_DEEP_DIVE.md
  • “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron