LEARN CSAPP COMPUTER SYSTEMS
Learn Computer Systems: A Programmer’s Perspective (CS:APP)
Goal: Master every concept from Bryant & O’Hallaron’s legendary textbook through hands-on projects that force you to understand how computers actually work—from bits to networking.
Why CS:APP Matters
CS:APP is considered one of the most important computer science textbooks ever written. It bridges the gap between high-level programming and the metal underneath. After completing these projects, you’ll understand:
- How integers and floating-point numbers are represented in binary
- How C code becomes assembly, and how assembly becomes machine code
- How processors fetch, decode, and execute instructions
- Why caches matter and how to write cache-friendly code
- How the operating system manages memory, processes, and I/O
- How linking combines object files into executables
- How network programming actually works at the socket level
- How to write correct concurrent programs
This isn’t just theory—you’ll build systems that demonstrate each concept.
Core Concept Analysis
CS:APP covers the entire systems stack. Here’s how the concepts map to chapters:
| Part | Chapters | Core Concepts |
|---|---|---|
| Program Structure & Execution | 1-6 | Data representation, machine code, processor architecture, memory hierarchy |
| Running Programs on a System | 7-9 | Linking, exceptional control flow, virtual memory |
| Interaction & Communication | 10-12 | System I/O, network programming, concurrent programming |
Fundamental Building Blocks
- Data Representation (Ch. 2): Two’s complement, IEEE floating-point, bit manipulation
- Machine-Level Programs (Ch. 3): x86-64 assembly, calling conventions, stack frames
- Processor Architecture (Ch. 4): Instruction pipelines, hazards, performance
- Program Optimization & Memory Hierarchy (Ch. 5-6): Performance optimization, caching, locality, memory-aware programming
- Linking (Ch. 7): Symbol resolution, relocation, shared libraries
- Exceptional Control Flow (Ch. 8): Processes, signals, nonlocal jumps
- Virtual Memory (Ch. 9): Address translation, page tables, malloc
- System I/O (Ch. 10): Unix I/O, files, RIO package
- Network Programming (Ch. 11): Sockets, HTTP, web servers
- Concurrency (Ch. 12): Threads, synchronization, parallelism
Project List
The following 17 projects are ordered to build understanding progressively, starting from bits and building up to full systems.
Project 1: Bit Manipulation Puzzle Solver (The Data Lab)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 1: Beginner
- Knowledge Area: Binary Representation / Bit Manipulation
- Software or Tool: GCC, Make
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 2
What you’ll build: A collection of functions that perform common operations (absolute value, conditional, floating-point comparison) using ONLY bitwise operators—no if statements, no loops, no comparisons.
Why it teaches data representation: You can’t hide behind abstractions. To implement isPositive(x) using only & | ^ ~ << >>, you must understand exactly how negative numbers work in two’s complement. You’ll discover that the sign bit is just another bit you can manipulate.
Core challenges you’ll face:
- Implementing conditionals without if/else → maps to understanding how comparison generates boolean values
- Extracting the sign bit correctly → maps to two’s complement representation
- Handling floating-point special cases → maps to IEEE 754 format (exponent, mantissa, sign)
- Working within operator count limits → maps to understanding bit-level equivalences
Key Concepts:
- Two’s Complement: CS:APP Chapter 2.2-2.3 - Bryant & O’Hallaron
- IEEE Floating Point: CS:APP Chapter 2.4 - Bryant & O’Hallaron
- Bit Manipulation Tricks: Hacker’s Delight Chapter 2 - Henry S. Warren Jr.
Difficulty: Beginner Time estimate: Weekend Prerequisites: Basic C programming (variables, functions, operators), understanding of binary numbers
Real world outcome:
$ ./btest
Score Rating Errors Function
1 1 0 bitXor
1 1 0 tmin
2 2 0 isTmax
2 2 0 allOddBits
2 2 0 negate
3 3 0 isAsciiDigit
3 3 0 conditional
3 3 0 isLessOrEqual
4 4 0 logicalNeg
4 4 0 howManyBits
4 4 0 floatScale2
4 4 0 floatFloat2Int
Total points: 33/33
Implementation Hints:
Think of these puzzles as constraints that force understanding:
For two’s complement negation:
- What is -x in terms of bit operations? Consider: flipping all bits and adding 1.
- But wait, you can't use +! How do you add 1 with only bit ops?
For conditional x ? y : z:
- You need to create a “mask” that’s all 1s or all 0s
- All 1s AND y = y, All 0s AND y = 0
- How do you convert any non-zero to all 1s? Think about negation and arithmetic shift
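One way the mask trick can look, as a sketch (assumes 32-bit ints and arithmetic right shift, which the lab environment provides):
int conditional(int x, int y, int z) {
    int mask = (x | (~x + 1)) >> 31;   /* all 1s when x != 0, all 0s when x == 0 */
    return (mask & y) | (~mask & z);
}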
For floating-point operations:
- Extract the three parts: sign (1 bit), exponent (8 bits), fraction (23 bits)
- Use masking: (x >> 23) & 0xFF gives you the exponent
- Handle special cases: infinity, NaN, denormalized values
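Field extraction in C, where x holds the 32-bit bit pattern (e.g. an unsigned):
unsigned sign = (x >> 31) & 0x1;     /* 1 bit  */
unsigned exp  = (x >> 23) & 0xFF;    /* 8 bits */
unsigned frac =  x & 0x7FFFFF;       /* 23 bits */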
Learning milestones:
- You solve basic bitwise puzzles → You understand bit operations as building blocks
- You implement conditional without branches → You grasp how CPUs can avoid branches
- You manipulate floating-point at bit level → You truly understand IEEE 754
Project 2: Binary Bomb Defuser (Reverse Engineering)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: x86-64 Assembly (reading), C (understanding)
- Alternative Programming Languages: N/A (this is about reading assembly)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Reverse Engineering / x86-64 Assembly
- Software or Tool: GDB, objdump, Ghidra
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 3
What you’ll build: You’ll defuse a “binary bomb”—an executable that expects 6 secret passwords. Enter the wrong one and it “explodes.” You must reverse-engineer the assembly code to discover each password without detonating the bomb.
Why it teaches machine-level programs: There’s no better way to learn assembly than reading real assembly with real stakes. You’ll learn calling conventions, control flow, data structures—all by necessity, not memorization. The bomb uses loops, conditionals, switch statements, recursion, arrays, linked lists, and function pointers.
Core challenges you’ll face:
- Reading x86-64 assembly fluently → maps to understanding registers, opcodes, addressing modes
- Tracing through recursive functions → maps to stack frame layout and calling conventions
- Understanding switch statement compilation → maps to jump tables
- Reverse-engineering linked list traversal → maps to pointer arithmetic in assembly
Resources for key challenges:
- Binary Bomb Tutorial - Step-by-step walkthrough of bomb lab phases
- Stanford CS107 Bomb Lab - Excellent hints and GDB tips
Key Concepts:
- x86-64 Registers and Calling Convention: CS:APP Chapter 3.4-3.7 - Bryant & O’Hallaron
- Control Flow in Assembly: CS:APP Chapter 3.6 - Bryant & O’Hallaron
- GDB Debugging: CS:APP Chapter 3.10.2 - Bryant & O’Hallaron
- Procedures and the Stack: CS:APP Chapter 3.7 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic C programming, willingness to learn GDB, patience
Real world outcome:
$ ./bomb
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Border relations with Canada have never been better.
Phase 1 defused. How about the next one?
1 2 4 8 16 32
That's number 2. Keep going!
1 311
Halfway there!
7 0
So you got that one. Try this one.
ionefg
Good work! On to the next...
4 3 2 1 6 5
Congratulations! You've defused the bomb!
Implementation Hints:
Essential GDB commands you’ll use constantly:
(gdb) break explode_bomb # Stop before explosion
(gdb) break phase_1 # Stop at phase 1 entry
(gdb) run # Start the bomb
(gdb) disas # Show assembly of current function
(gdb) stepi # Execute one instruction
(gdb) info registers # Show all register values
(gdb) x/s $rdi # Print string at address in rdi
(gdb) x/10x $rsp # Print 10 hex values at stack pointer
Phase analysis approach:
- Set a breakpoint at explode_bomb to catch detonation
- Set a breakpoint at phase_N to see what arguments are passed
- Disassemble the phase function
- Look for comparison instructions (cmp, test) - these reveal expected values
- For string phases, use x/s to read string constants
- For numeric phases, trace the arithmetic backward from comparisons
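A typical pattern you'll see after disas (illustrative only, not taken from any real bomb):
mov    0x8(%rsp),%eax      # load the value you typed (stored earlier by sscanf)
cmp    $0x7,%eax           # compare it with the expected constant
jne    <explode_bomb>      # wrong answer: boom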
Learning milestones:
- You defuse Phase 1 → You can read basic string comparisons in assembly
- You defuse Phases 2-3 → You understand loops and conditionals
- You defuse Phases 4-5 → You grasp recursion and array indexing
- You defuse Phase 6 → You can trace complex pointer manipulation
Project 3: Buffer Overflow Exploit Lab (Attack Lab)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: x86-64 Assembly (writing shellcode), C
- Alternative Programming Languages: N/A
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Security / Exploitation / Stack Layout
- Software or Tool: GDB, objdump, hex editors
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 3
What you’ll build: You’ll exploit buffer overflow vulnerabilities in two programs: one using code injection (writing your own shellcode to the stack and jumping to it), and one using Return-Oriented Programming (chaining existing code snippets to bypass modern protections).
Why it teaches stack discipline: Nothing teaches you how the stack really works like breaking it. You’ll understand exactly why return addresses are stored where they are, how stack frames are laid out, and why security researchers talk about NX bits and ASLR.
Core challenges you’ll face:
- Crafting precise stack overflows → maps to understanding buffer layout and stack frame structure
- Writing x86-64 shellcode → maps to encoding instructions, calling conventions
- Bypassing non-executable stack (ROP) → maps to understanding code reuse attacks
- Finding and chaining gadgets → maps to deep instruction-level understanding
Resources for key challenges:
- CTF Handbook - ROP - Excellent ROP explanation
- Georgia Tech ROP Tutorial - Step-by-step ROP guide
Key Concepts:
- Stack Buffer Overflows: CS:APP Chapter 3.10.3 - Bryant & O’Hallaron
- Return-Oriented Programming: “Return-Oriented Programming: Systems, Languages, and Applications” - Shacham
- x86-64 Encoding: CS:APP Chapter 3.10 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Complete Project 2 (Bomb Lab), solid GDB skills, understanding of stack layout
Real world outcome:
$ ./hex2raw < ctarget.l1.txt | ./ctarget -q
Cookie: 0x59b997fa
Type string:Touch1!: You called touch1()
Valid solution for level 1 with target ctarget
PASS: Would have posted the following:
user id bovik
course 15213-f15
lab attacklab
result 1:PASS:0xffffffff:ctarget:1:...
Implementation Hints:
Understanding the stack layout is critical:
High addresses
+------------------+
| return address | ← You overwrite this!
+------------------+
| saved %rbp |
+------------------+
| local variables |
| ... |
| buffer[0] | ← Your input starts here
+------------------+
Low addresses
For code injection (Phases 1-3):
- Find the buffer’s distance from the return address
- Craft shellcode that does what you need (e.g., call a function)
- Include the shellcode in your input, pad to reach return address
- Overwrite return address to point to your shellcode
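An illustrative injected-code fragment for a touch2-style level (the cookie matches the ctarget output above; the touch2 address 0x4017ec is a placeholder you'd read from your own disassembly):
# inject.s - assemble with gcc -c inject.s, then objdump -d inject.o to get the raw bytes
movq   $0x59b997fa, %rdi     # pass the cookie as the first argument
pushq  $0x4017ec             # placeholder address of touch2
ret                          # "return" into touch2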
For ROP (Phases 4-5):
- Find "gadgets": short instruction sequences ending in ret
- Chain gadgets by placing their addresses on the stack
- Each ret pops the next gadget address and jumps to it
- Use ROPgadget or manually search with objdump
Example gadget: mov %rax, %rdi ; ret lets you control the first argument.
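Putting it together, an illustrative stack layout for a two-gadget chain (gadget and function addresses are placeholders; high addresses at the top, as in the diagram above):
+----------------------------------------+
| address of touch2 (placeholder)        |  ← executed last
+----------------------------------------+
| address of gadget: movq %rax,%rdi; ret |
+----------------------------------------+
| 0x59b997fa (cookie, popped into %rax)  |
+----------------------------------------+
| address of gadget: popq %rax; ret      |  ← overwrites the return address
+----------------------------------------+
| ...padding to fill the buffer...       |  ← your input starts here
+----------------------------------------+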
Learning milestones:
- You complete Level 1 → You understand basic buffer overflow
- You inject shellcode (Level 2-3) → You can write and position executable code
- You complete ROP attacks (Level 4-5) → You understand modern exploitation
Project 4: Y86-64 Processor Simulator
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C (with HCL - Hardware Control Language)
- Alternative Programming Languages: Verilog, VHDL (for hardware-minded)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: CPU Architecture / Pipelining
- Software or Tool: Y86-64 Simulator, Tcl/Tk (for GUI)
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 4
What you’ll build: A pipelined processor simulator for the Y86-64 instruction set (a simplified x86-64). You’ll implement instruction fetch, decode, execute, memory, and writeback stages, handling hazards and forwarding.
Why it teaches processor architecture: Abstract diagrams of pipelines don’t stick until you’ve implemented one. When your simulator crashes because of a data hazard you forgot to forward, you’ll never forget what hazards are. When you optimize ncopy and watch CPE drop, you’ll understand the hardware-software interface.
Core challenges you’ll face:
- Implementing sequential (SEQ) processor → maps to understanding fetch-decode-execute cycle
- Converting to pipelined (PIPE) design → maps to understanding parallelism in hardware
- Handling data hazards → maps to forwarding and stalling
- Optimizing code for your pipeline → maps to hardware-software co-optimization
Resources for key challenges:
- CS:APP Y86-64 Simulator Guide - Official documentation
- CS:APP Architecture Lab - Download the lab materials
Key Concepts:
- Instruction Pipeline Design: CS:APP Chapter 4.5 - Bryant & O’Hallaron
- Pipeline Hazards: CS:APP Chapter 4.5.5 - Bryant & O’Hallaron
- Hardware Control Language (HCL): CS:APP Chapter 4.2 - Bryant & O’Hallaron
Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Strong understanding of assembly (Project 2), digital logic basics
Real world outcome:
$ ./ssim -t sdriver.yo
ISA Check Succeeds
CPI: 1.00
Cycles: 42 Instructions: 42
$ ./psim -t ncopy.yo
ISA Check Succeeds
CPI: 9.43
Cycles: 518 Elements: 63 CPE: 8.22
# After optimization
$ ./psim -t ncopy-opt.yo
ISA Check Succeeds
CPI: 6.21
Cycles: 391 Elements: 63 CPE: 6.21
Implementation Hints:
The Y86-64 instruction set is a subset of x86-64:
- halt, nop: Control instructions
- rrmovq, irmovq, rmmovq, mrmovq: Move instructions
- OPq: Integer operations (add, sub, and, xor)
- jXX, cmovXX: Conditional jumps and conditional moves
- call, ret, pushq, popq: Stack operations
HCL (Hardware Control Language) example:
# Determine what register to write to (destination)
word dstE = [
icode in { IRRMOVQ, IIRMOVQ, IOPQ } : rB;
icode in { IPUSHQ, IPOPQ, ICALL, IRET } : RSP;
1 : RNONE;
];
Optimization strategies for ncopy:
- Loop unrolling: Process multiple elements per iteration
- Reducing branches: Fewer mispredictions
- Using iaddq: If you add it to your processor
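For reference, 2x loop unrolling in C form (the Y86-64 version has the same shape; ncopy's positive-element counting is omitted here):
long i;
for (i = 0; i + 1 < len; i += 2) {     /* two elements per iteration */
    dst[i]     = src[i];
    dst[i + 1] = src[i + 1];
}
for (; i < len; i++)                   /* handle a leftover element */
    dst[i] = src[i];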
Learning milestones:
- SEQ processor works → You understand the basic execution cycle
- PIPE processor passes tests → You grasp pipelining fundamentals
- You add the iaddq instruction → You've modified processor hardware
- You optimize ncopy below 10 CPE → You understand the HW/SW interface
Project 5: Cache Simulator
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: C++, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Memory Hierarchy / Cache Design
- Software or Tool: Valgrind (for trace generation)
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 6
What you’ll build: A configurable cache simulator that processes memory traces and reports hits, misses, and evictions. Then you’ll optimize a matrix transpose function to minimize cache misses.
Why it teaches memory hierarchy: Cache behavior is invisible in high-level code but dominates performance. By simulating caches, you’ll see exactly why A[i][j] vs A[j][i] matters. When you optimize matrix transpose and watch miss rates plummet, you’ll internalize locality forever.
Core challenges you’ll face:
- Implementing cache indexing → maps to set index, tag, block offset
- Building LRU replacement → maps to understanding replacement policies
- Analyzing memory traces → maps to understanding access patterns
- Optimizing for cache blocks → maps to exploiting spatial locality
Resources for key challenges:
- Cache Simulator Guide - Modern C++ implementation for reference
- CS:APP Cache Lab - Official lab materials
Key Concepts:
- Cache Organization: CS:APP Chapter 6.4 - Bryant & O’Hallaron
- Locality of Reference: CS:APP Chapter 6.2 - Bryant & O’Hallaron
- Cache-Friendly Code: CS:APP Chapter 6.5 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: C programming, understanding of memory addresses
Real world outcome:
# Cache simulator output
$ ./csim -v -s 4 -E 1 -b 4 -t traces/yi.trace
L 10,1 miss
L 20,1 miss
L 22,1 hit
S 18,1 hit
L 110,1 miss eviction
L 210,1 miss eviction
M 12,1 miss eviction hit
hits:4 misses:5 evictions:3
# Matrix transpose optimization
$ ./test-trans -M 32 -N 32
Function 0 (2 total)
Step 1: Validating and calculating misses for func 0
func 0 (Transpose submission): hits:1710, misses:287, evictions:255
Implementation Hints:
Cache addressing breakdown (for address A):
| t bits (tag) | s bits (set index) | b bits (block offset) |
- Block offset: Which byte within the cache line
- Set index: Which set in the cache
- Tag: Identifies the specific memory block
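In C, pulling those fields out of a 64-bit address, where s and b come straight from the command-line options:
unsigned long set_index = (addr >> b) & ((1UL << s) - 1);   /* which set */
unsigned long tag       =  addr >> (s + b);                 /* identifies the block */
/* the block offset, addr & ((1UL << b) - 1), never affects hit/miss in this simulator */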
Data structure for your cache:
typedef struct {
int valid;
unsigned long tag;
int lru_counter; // For LRU replacement
} cache_line_t;
typedef struct {
cache_line_t *lines; // Array of lines (associativity E)
} cache_set_t;
typedef struct {
cache_set_t *sets; // Array of sets (2^s sets)
} cache_t;
For matrix transpose optimization:
- Use blocking/tiling: Process submatrices that fit in cache
- Block size should match cache line size
- Minimize conflict misses by careful addressing
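A blocked transpose skeleton (BSIZE is a tuning parameter; 8 is a reasonable starting point for 32x32 ints with 32-byte cache lines, but verify against your cache parameters):
for (int ii = 0; ii < N; ii += BSIZE)
    for (int jj = 0; jj < M; jj += BSIZE)
        for (int i = ii; i < ii + BSIZE && i < N; i++)
            for (int j = jj; j < jj + BSIZE && j < M; j++)
                B[j][i] = A[i][j];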
Learning milestones:
- Simulator passes all traces → You understand cache lookup mechanics
- You can predict hit/miss by hand → You’ve internalized cache behavior
- Transpose achieves <300 misses → You write cache-friendly code
Project 6: Dynamic Memory Allocator (Malloc Lab)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust (for a safe implementation)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 4: Expert
- Knowledge Area: Memory Management / Heap Allocation
- Software or Tool: GDB, Valgrind
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 9
What you’ll build: A complete malloc/free/realloc implementation that manages heap memory. You’ll implement free lists, coalescing, splitting, and optimize for both throughput and memory utilization.
Why it teaches virtual memory: malloc is where the rubber meets the road for memory management. You’ll understand fragmentation (both internal and external), why coalescing matters, and how real allocators balance speed vs. space. After this, you’ll never misuse malloc again.
Core challenges you’ll face:
- Implementing sbrk-based heap extension → maps to understanding the program break
- Managing free lists → maps to pointer manipulation and data structures
- Coalescing free blocks → maps to preventing external fragmentation
- Optimizing allocation strategy → maps to first-fit vs. best-fit tradeoffs
Resources for key challenges:
- Memory Allocators 101 - Clear beginner tutorial
- CS341 Malloc Guide - Comprehensive coverage
- Custom Malloc Implementation - Reference implementation
Key Concepts:
- Dynamic Memory Allocation: CS:APP Chapter 9.9 - Bryant & O’Hallaron
- Implicit Free Lists: CS:APP Chapter 9.9.6 - Bryant & O’Hallaron
- Explicit Free Lists: CS:APP Chapter 9.9.13 - Bryant & O’Hallaron
- Segregated Free Lists: CS:APP Chapter 9.9.14 - Bryant & O’Hallaron
Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Strong C pointer skills, understanding of heap vs. stack
Real world outcome:
$ ./mdriver -V
Team Name: your_team
Member 1 : Your Name:youremail@domain.com
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000556 10245
1 yes 99% 5848 0.000535 10926
2 yes 99% 6648 0.001099 6050
3 yes 100% 5380 0.000629 8552
4 yes 100% 14400 0.000564 25531
5 yes 95% 4800 0.001082 4436
6 yes 94% 4800 0.001005 4777
7 yes 55% 12000 0.004682 2563
8 yes 51% 24000 0.006420 3738
9 yes 31% 14401 0.029809 483
10 yes 45% 14401 0.001159 12424
Total 79% 112372 0.047540 2364
Perf index = 47 (util) + 40 (thru) = 87/100
Implementation Hints:
Block structure with boundary tags:
+------------------+
| Header (8 bytes) | ← Size + allocated bit
+------------------+
| Payload |
| ... |
+------------------+
| Padding | ← For alignment
+------------------+
| Footer (8 bytes) | ← Copy of header (for coalescing)
+------------------+
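Pointer arithmetic on this layout is usually wrapped in a few macros; a sketch assuming 8-byte headers/footers and 16-byte alignment (names follow the textbook's convention, sizes adapted to the layout above):
#define WSIZE 8                                /* header/footer size in bytes */
#define PACK(size, alloc)  ((size) | (alloc))  /* size field | allocated bit */
#define GET(p)        (*(unsigned long *)(p))
#define PUT(p, val)   (*(unsigned long *)(p) = (val))
#define GET_SIZE(p)   (GET(p) & ~0xFUL)        /* block sizes are multiples of 16 */
#define GET_ALLOC(p)  (GET(p) & 0x1)
#define HDRP(bp)      ((char *)(bp) - WSIZE)   /* bp points to the payload */
#define FTRP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)) - 2 * WSIZE)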
Coalescing cases:
- Previous and next both allocated → No coalescing
- Previous allocated, next free → Merge with next
- Previous free, next allocated → Merge with previous
- Both free → Merge all three
Key optimizations:
- Use explicit free lists (doubly-linked)
- Segregated free lists by size class
- Best-fit within size class, first-fit across classes
Learning milestones:
- Basic malloc/free works → You understand heap management
- Coalescing prevents fragmentation → You grasp memory efficiency
- Explicit free lists speed up allocation → You understand performance tradeoffs
- Score > 80 → You’ve built a production-quality allocator
Project 7: Unix Shell Implementation
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Processes / System Calls / Job Control
- Software or Tool: fork, exec, wait, signal handlers
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 8
What you’ll build: A functional Unix shell that can run commands, handle pipelines (ls | grep foo), manage background jobs (sleep 10 &), and handle signals (Ctrl+C, Ctrl+Z) correctly.
Why it teaches exceptional control flow: A shell exercises the entire process model: fork, exec, wait, signals, process groups. When you handle SIGCHLD to reap zombie processes, you’ll understand why Unix works the way it does. When job control finally works, you’ll deeply understand process states.
Core challenges you’ll face:
- Parsing command lines → maps to understanding shell syntax
- Creating child processes with fork → maps to process creation semantics
- Executing programs with exec → maps to program loading
- Handling signals safely → maps to async-signal-safe functions, race conditions
- Managing foreground/background jobs → maps to process groups, terminal control
Resources for key challenges:
- Write a Shell in C - Stephen Brennan’s excellent tutorial
- YoLinux Fork/Exec Tutorial - Process creation guide
Key Concepts:
- Process Control: CS:APP Chapter 8.4 - Bryant & O’Hallaron
- Signals: CS:APP Chapter 8.5 - Bryant & O’Hallaron
- Nonlocal Jumps: CS:APP Chapter 8.6 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic C, understanding of processes, willingness to read man pages
Real world outcome:
$ ./tsh
tsh> /bin/ls -l
total 72
-rw-r--r-- 1 user user 1024 Dec 20 10:00 Makefile
-rw-r--r-- 1 user user 8192 Dec 20 10:00 tsh.c
-rwxr-xr-x 1 user user 32768 Dec 20 10:00 tsh
tsh> /bin/sleep 10 &
[1] (12345) /bin/sleep 10 &
tsh> jobs
[1] (12345) Running /bin/sleep 10 &
tsh> fg %1
tsh> ^C
tsh> /bin/echo hello | /bin/cat
hello
tsh> quit
Implementation Hints:
Core shell loop:
while (1) {
print_prompt();
read_command_line();
parse_into_tokens();
if (builtin_command()) {
handle_builtin(); // cd, jobs, fg, bg, quit
} else {
pid = fork();
if (pid == 0) {
// Child: set up redirections, exec
execve(program, argv, environ);
} else {
// Parent: add to job list, wait or return
}
}
}
Why builtins must be handled by the shell itself:
- cd changes the directory, but a child's directory change doesn't affect the parent
- jobs accesses the shell's internal job list
- exit terminates the shell process itself
Signal handling subtleties:
- Block signals before fork, unblock after updating job list
- Use sigsuspend to wait for signals atomically
- Only call async-signal-safe functions in handlers
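A sketch of the block-before-fork pattern (addjob, the jobs list, and the FG constant are this shell's own bookkeeping; error checks omitted):
sigset_t mask, prev;
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prev);        /* SIGCHLD can't fire until the job is recorded */
if ((pid = fork()) == 0) {
    sigprocmask(SIG_SETMASK, &prev, NULL);   /* child: restore the signal mask */
    setpgid(0, 0);                           /* give the job its own process group */
    execve(argv[0], argv, environ);
}
addjob(jobs, pid, FG, cmdline);              /* parent: record the job... */
sigprocmask(SIG_SETMASK, &prev, NULL);       /* ...then allow SIGCHLD delivery */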
Learning milestones:
- Basic commands execute → You understand fork/exec
- Background jobs work → You grasp process groups
- Ctrl+C/Ctrl+Z work correctly → You’ve mastered signal handling
- Pipelines work → You understand I/O redirection and dup2
Project 8: ELF Linker and Loader
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Linking / Object Files / ELF Format
- Software or Tool: objdump, readelf, nm
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 7
What you’ll build: A simple static linker that takes multiple ELF relocatable object files (.o) and combines them into an executable. You’ll handle symbol resolution, relocation, and section merging.
Why it teaches linking: Most programmers treat the linker as a black box. After building one, you’ll understand why “undefined reference” errors happen, how static vs. dynamic linking differs, and what those mysterious .data, .text, .bss sections actually are.
Core challenges you’ll face:
- Parsing ELF headers and sections → maps to understanding object file format
- Resolving symbols across object files → maps to symbol table management
- Performing relocations → maps to address patching
- Generating a valid executable → maps to executable file format
Resources for key challenges:
- Build Your Own Linker - Step-by-step linker project
- ELF Format Guide - UC Irvine tutorial
Key Concepts:
- Static Linking: CS:APP Chapter 7.6 - Bryant & O’Hallaron
- Symbol Resolution: CS:APP Chapter 7.6 - Bryant & O’Hallaron
- Relocation: CS:APP Chapter 7.7 - Bryant & O’Hallaron
- ELF Format: “Linkers and Loaders” Chapter 3 - John R. Levine
Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Understanding of x86-64 assembly, file I/O, binary data parsing
Real world outcome:
$ ./myld -o hello main.o helper.o
Linking 2 object files...
Resolved 5 symbols
Applied 8 relocations
Generated executable: hello (4096 bytes)
$ ./hello
Hello from my custom linker!
$ readelf -h hello
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 ...
Class: ELF64
Type: EXEC (Executable file)
Entry point address: 0x401000
Implementation Hints:
ELF structure overview:
+------------------+
| ELF Header | ← Magic number, entry point, section table offset
+------------------+
| Program Headers | ← Segments for loading (for executables)
+------------------+
| .text | ← Executable code
+------------------+
| .rodata | ← Read-only data (string literals)
+------------------+
| .data | ← Initialized global variables
+------------------+
| .bss | ← Uninitialized globals (zero-filled)
+------------------+
| .symtab | ← Symbol table
+------------------+
| .strtab | ← String table (symbol names)
+------------------+
| .rela.text | ← Relocations for .text
+------------------+
| Section Headers | ← Describes each section
+------------------+
Linker algorithm:
- Read all input object files
- Merge like sections (all .text together, all .data together)
- Build global symbol table, resolve undefined references
- Calculate final addresses for each section
- Apply relocations (patch addresses in code/data)
- Write output executable
Relocation types (x86-64):
- R_X86_64_PC32: PC-relative 32-bit (for call/jmp)
- R_X86_64_32: Absolute 32-bit
- R_X86_64_64: Absolute 64-bit
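Applying an R_X86_64_PC32 relocation, sketched: the patched value is S + A - P (symbol address plus addend minus the address of the patched location). Here text_base, text_bytes, and sym_addr are this sketch's own names:
Elf64_Rela *r = &relas[k];
uint64_t P = text_base + r->r_offset;                  /* address of the 4 bytes being patched */
int32_t  value = (int32_t)(sym_addr + r->r_addend - P);
memcpy(text_bytes + r->r_offset, &value, sizeof(value));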
Learning milestones:
- You parse ELF sections correctly → You understand object file format
- Symbol resolution works → You grasp linking’s core problem
- Relocations produce correct addresses → You understand address patching
- Your executable runs → You’ve built a working linker
Project 9: Virtual Memory Simulator
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: C++, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Virtual Memory / Page Tables / TLB
- Software or Tool: Custom simulator
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 9
What you’ll build: A virtual memory system simulator that implements multi-level page tables, TLB caching, page replacement algorithms (LRU, FIFO, Clock), and handles page faults with demand paging.
Why it teaches virtual memory: Virtual memory is one of the most elegant abstractions in computing, but it’s invisible at the application level. By simulating it, you’ll understand address translation, why TLBs matter, and how the OS juggles physical memory. You’ll see why programs can use more memory than physically exists.
Core challenges you’ll face:
- Implementing multi-level page tables → maps to hierarchical address translation
- Building a TLB with LRU replacement → maps to caching address translations
- Handling page faults → maps to demand paging, page selection
- Implementing page replacement → maps to LRU/FIFO/Clock algorithms
Resources for key challenges:
- Virtual Memory Simulator - Reference implementation
- Hierarchical Page Table Simulation - Multi-level page table project
Key Concepts:
- Address Translation: CS:APP Chapter 9.6 - Bryant & O’Hallaron
- Multi-level Page Tables: CS:APP Chapter 9.6.3 - Bryant & O’Hallaron
- TLB: CS:APP Chapter 9.6.2 - Bryant & O’Hallaron
- Page Replacement: Operating Systems: Three Easy Pieces, Chapter 22
Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Understanding of memory addressing, data structures
Real world outcome:
$ ./vmsim -a lru -n 4 -p 4096 trace.txt
Number of frames: 4
Page size: 4096 bytes
Algorithm: LRU
Memory Access Trace:
Address 0x00001234: Page 1, TLB miss, Page hit
Address 0x00002345: Page 2, TLB miss, Page hit
Address 0x00003456: Page 3, TLB miss, Page hit
Address 0x00004567: Page 4, TLB miss, Page fault
Address 0x00001234: Page 1, TLB hit
Statistics:
Total accesses: 100000
TLB hits: 87234 (87.2%)
TLB misses: 12766 (12.8%)
Page hits: 99012 (99.0%)
Page faults: 988 (1.0%)
Pages written back: 234
Implementation Hints:
Address translation with multi-level page tables:
Virtual Address (48 bits used):
+-------+-------+-------+-------+-----------+
| VPN1 | VPN2 | VPN3 | VPN4 | Offset |
| 9 bits| 9 bits| 9 bits| 9 bits| 12 bits |
+-------+-------+-------+-------+-----------+
Page Table Walk:
1. CR3 → Level-1 page table base
2. PTE1 = PageTable1[VPN1] → Level-2 base
3. PTE2 = PageTable2[VPN2] → Level-3 base
4. PTE3 = PageTable3[VPN3] → Level-4 base
5. PTE4 = PageTable4[VPN4] → Physical frame number
6. Physical Address = Frame << 12 | Offset
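The same walk expressed as index extraction in C (48-bit virtual addresses, 4 KiB pages):
unsigned vpn1   = (vaddr >> 39) & 0x1FF;   /* 9 bits each */
unsigned vpn2   = (vaddr >> 30) & 0x1FF;
unsigned vpn3   = (vaddr >> 21) & 0x1FF;
unsigned vpn4   = (vaddr >> 12) & 0x1FF;
unsigned offset =  vaddr        & 0xFFF;   /* 12-bit offset within the 4 KiB page */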
TLB structure:
typedef struct {
unsigned long vpn;
unsigned long pfn;
int valid;
int dirty;
int lru_counter;
} tlb_entry_t;
Clock algorithm for page replacement:
- Pages arranged in circular buffer
- Each page has a “reference bit”
- Hand sweeps around, clearing reference bits
- Evict first page found with reference bit = 0
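A minimal clock-hand sketch (the frame_t type with a ref_bit field and the hand index are this simulator's own state, not standard names):
int clock_evict(frame_t *frames, int nframes, int *hand) {
    for (;;) {
        if (!frames[*hand].ref_bit) {       /* second chance already used: evict this frame */
            int victim = *hand;
            *hand = (*hand + 1) % nframes;
            return victim;
        }
        frames[*hand].ref_bit = 0;          /* clear the reference bit and keep sweeping */
        *hand = (*hand + 1) % nframes;
    }
}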
Learning milestones:
- Single-level page table works → You understand address translation basics
- Multi-level tables reduce memory → You see why hierarchy matters
- TLB dramatically improves performance → You appreciate caching
- Page replacement minimizes faults → You understand virtual memory management
Project 10: Robust I/O Library (RIO)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: System I/O / File Descriptors / Error Handling
- Software or Tool: Unix system calls (read, write, open)
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 10
What you’ll build: A robust I/O library that handles short counts, interrupted system calls, and buffered reading. This is the foundation for reliable network and file I/O.
Why it teaches system I/O: The Unix I/O interface is simple (read/write/open/close) but deceptive. Real programs must handle short counts, signals interrupting I/O, and buffering for efficiency. Building RIO teaches you to never trust a single read() call.
Core challenges you’ll face:
- Handling short counts → maps to understanding why read returns fewer bytes
- Handling EINTR → maps to signals can interrupt system calls
- Implementing buffered reading → maps to efficiency and line-at-a-time reading
- Designing robust APIs → maps to error handling philosophy
Key Concepts:
- Unix I/O: CS:APP Chapter 10.1-10.4 - Bryant & O’Hallaron
- Robust I/O: CS:APP Chapter 10.5 - Bryant & O’Hallaron
- Standard I/O vs. Unix I/O: CS:APP Chapter 10.10 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: Weekend Prerequisites: Basic C, understanding of file descriptors
Real world outcome:
// Using your RIO library
rio_t rio;
char buf[MAXLINE];
rio_readinitb(&rio, fd);
// Read exactly n bytes (handles short counts)
rio_readn(fd, buf, n);
// Read a line (handles buffering)
while (rio_readlineb(&rio, buf, MAXLINE) > 0) {
printf("%s", buf);
}
$ ./test-rio < /usr/share/dict/words
Processed 99171 lines
Bytes read: 938848
Buffered reads: 938848, Syscalls: 230
Efficiency: 4082 bytes/syscall
Implementation Hints:
The problem with raw read():
// This is WRONG for networks/pipes:
char buf[1024];
read(fd, buf, 1024); // Might return 10 bytes, not 1024!
Unbuffered robust read (rio_readn):
ssize_t rio_readn(int fd, void *usrbuf, size_t n) {
size_t nleft = n;
ssize_t nread;
char *bufp = usrbuf;
while (nleft > 0) {
if ((nread = read(fd, bufp, nleft)) < 0) {
if (errno == EINTR) // Interrupted by signal
nread = 0; // Call read() again
else
return -1;
} else if (nread == 0) { // EOF
break;
}
nleft -= nread;
bufp += nread;
}
return (n - nleft);
}
Buffered reading structure:
typedef struct {
int rio_fd; // File descriptor
int rio_cnt; // Unread bytes in buffer
char *rio_bufptr; // Next unread byte
char rio_buf[RIO_BUFSIZE]; // Internal buffer
} rio_t;
Learning milestones:
- rio_readn handles short counts → You understand robust reading
- rio_writen handles short counts → You understand robust writing
- Buffered reading works → You appreciate efficiency
- Line reading handles edge cases → You’ve built a production-quality library
Project 11: HTTP Web Server
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Network Programming / HTTP Protocol / Sockets
- Software or Tool: BSD sockets, HTTP/1.1
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 11
What you’ll build: A functional HTTP/1.1 web server that serves static files (HTML, images, CSS) and handles dynamic content via CGI. The server will parse HTTP requests, generate proper responses, and handle multiple clients.
Why it teaches network programming: Web servers are the canonical network application. Building one teaches socket programming, protocol parsing, and the client-server model. You’ll understand what happens between typing a URL and seeing a page.
Core challenges you’ll face:
- Creating listening sockets → maps to socket, bind, listen, accept
- Parsing HTTP requests → maps to protocol format, header parsing
- Serving static content → maps to MIME types, file I/O
- Running CGI programs → maps to fork, exec, environment variables
Key Concepts:
- Socket Interface: CS:APP Chapter 11.4 - Bryant & O’Hallaron
- HTTP Protocol: CS:APP Chapter 11.5 - Bryant & O’Hallaron
- CGI: CS:APP Chapter 11.5.4 - Bryant & O’Hallaron
- Web Server Architecture: CS:APP Chapter 11.6 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: RIO library (Project 10), understanding of processes
Real world outcome:
$ ./tiny 8080 &
[1] 12345
$ curl -v http://localhost:8080/index.html
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /index.html HTTP/1.1
> Host: localhost:8080
>
< HTTP/1.1 200 OK
< Server: Tiny Web Server
< Content-length: 1234
< Content-type: text/html
<
<!DOCTYPE html>
<html>
<head><title>Tiny Server</title></head>
<body><h1>Welcome to Tiny!</h1></body>
</html>
$ curl "http://localhost:8080/cgi-bin/adder?1&2"
The sum is: 1 + 2 = 3
Implementation Hints:
Server main loop:
int listenfd = open_listenfd(port);
while (1) {
connfd = accept(listenfd, ...);
handle_request(connfd);
close(connfd);
}
HTTP request format:
GET /index.html HTTP/1.1\r\n
Host: localhost:8080\r\n
User-Agent: curl/7.64.0\r\n
\r\n
HTTP response format:
HTTP/1.1 200 OK\r\n
Server: Tiny Web Server\r\n
Content-Type: text/html\r\n
Content-Length: 1234\r\n
\r\n
<html>...</html>
CGI execution:
- Parse query string from URL
- Fork a child process
- Set environment variables (QUERY_STRING, etc.)
- Redirect stdout to client socket
- Exec the CGI program
- Program output goes directly to client
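Those steps in code, roughly (serve_dynamic is this sketch's own name; error handling omitted, and the server is assumed to have already sent the status line and its own headers):
extern char **environ;
void serve_dynamic(int connfd, char *filename, char *cgiargs) {
    char *emptylist[] = { NULL };
    if (fork() == 0) {                        /* child runs the CGI program */
        setenv("QUERY_STRING", cgiargs, 1);   /* e.g. "1&2" for the adder example */
        dup2(connfd, STDOUT_FILENO);          /* CGI output flows straight to the client */
        execve(filename, emptylist, environ);
    }
    wait(NULL);                               /* parent reaps the child */
}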
Learning milestones:
- Server accepts connections → You understand socket basics
- Static files are served → You grasp HTTP request/response
- CGI programs run → You understand dynamic content
- Multiple MIME types work → You’ve built a real web server
Project 12: Concurrent Web Proxy
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Concurrency / Caching / Network Programming
- Software or Tool: pthreads, mutex, semaphores
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapters 11-12
What you’ll build: A multi-threaded caching web proxy that sits between browsers and servers. It handles multiple clients concurrently, caches responses using LRU policy, and must be free of race conditions and deadlocks.
Why it teaches concurrent programming: A proxy must handle many clients simultaneously while accessing shared caches. You’ll face the classic challenges: race conditions when updating the cache, reader-writer synchronization, and thread-safe logging. This is where theoretical concurrency becomes practical.
Core challenges you’ll face:
- Handling concurrent clients → maps to thread creation, thread pools
- Implementing thread-safe cache → maps to mutexes, reader-writer locks
- Avoiding deadlock → maps to lock ordering, careful design
- Parsing/forwarding HTTP → maps to protocol handling
Resources for key challenges:
- Concurrent Web Proxy Tutorial - Implementation guide
- CMU Proxy Lab - Reference implementation
Key Concepts:
- Threads: CS:APP Chapter 12.3 - Bryant & O’Hallaron
- Semaphores: CS:APP Chapter 12.5 - Bryant & O’Hallaron
- Thread Safety: CS:APP Chapter 12.7 - Bryant & O’Hallaron
- Deadlock: CS:APP Chapter 12.7.2 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Web server (Project 11), understanding of threads
Real world outcome:
$ ./proxy 8080 &
Proxy listening on port 8080
# Configure browser to use localhost:8080 as proxy
# Browse to various websites...
$ cat proxy.log
[Thread 1] GET http://example.com/ -> 200 (1234 bytes, cached)
[Thread 2] GET http://example.com/style.css -> 200 (567 bytes, cached)
[Thread 3] GET http://other.com/ -> 200 (2345 bytes, from server)
[Thread 1] GET http://example.com/ -> 200 (1234 bytes, cache hit!)
Cache statistics:
Total requests: 1000
Cache hits: 234 (23.4%)
Bytes saved: 2.3 MB
Implementation Hints:
Thread-per-connection model (pass the descriptor through a malloc'd pointer so the main loop can't overwrite it before the new thread reads it):
while (1) {
    int *connfdp = malloc(sizeof(int));
    *connfdp = accept(listenfd, ...);
    pthread_create(&tid, NULL, thread_handler, connfdp);
}
void *thread_handler(void *arg) {
    int connfd = *(int *)arg;
    free(arg);
    pthread_detach(pthread_self());
    handle_request(connfd);
    close(connfd);
    return NULL;
}
Cache with reader-writer lock:
typedef struct cache_entry {
    char *url;
    char *content;
    int size;
    struct cache_entry *next;
    struct cache_entry *prev;
} cache_entry_t;
pthread_rwlock_t cache_lock;
// Multiple readers can read simultaneously
char *cache_get(char *url) {
pthread_rwlock_rdlock(&cache_lock);
// Search cache...
pthread_rwlock_unlock(&cache_lock);
}
// Writers need exclusive access
void cache_insert(char *url, char *content) {
pthread_rwlock_wrlock(&cache_lock);
// Insert into cache, evict if necessary...
pthread_rwlock_unlock(&cache_lock);
}
LRU cache eviction:
- Use doubly-linked list with head and tail
- On access, move entry to front
- On eviction, remove from tail
- Track total size, evict until under limit
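The move-to-front half of that, sketched (head and tail are the cache's own global list pointers, using the cache_entry_t above):
void lru_touch(cache_entry_t *e) {
    if (e == head) return;                 /* already most recently used */
    e->prev->next = e->next;               /* unlink from its current spot */
    if (e->next) e->next->prev = e->prev;
    else         tail = e->prev;           /* it was the tail */
    e->prev = NULL;                        /* relink at the head */
    e->next = head;
    head->prev = e;
    head = e;
}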
Learning milestones:
- Sequential proxy works → You understand HTTP forwarding
- Concurrent handling works → You can create and manage threads
- Cache with synchronization → You’ve solved reader-writer problem
- No race conditions or deadlocks → You’ve mastered concurrent programming
Project 13: Thread Pool Implementation
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: Concurrency / Producer-Consumer / Work Queues
- Software or Tool: pthreads, condition variables
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 12
What you’ll build: A reusable thread pool library that maintains a fixed number of worker threads, accepts tasks through a work queue, and executes them concurrently. This pattern is used everywhere from web servers to database engines.
Why it teaches thread synchronization: Thread pools combine every concurrency primitive: mutexes for queue access, condition variables for blocking/waking, and careful shutdown semantics. Building one teaches you the producer-consumer pattern deeply.
Core challenges you’ll face:
- Implementing the work queue → maps to producer-consumer pattern
- Blocking when queue is empty → maps to condition variables
- Graceful shutdown → maps to thread termination, cleanup
- Avoiding thundering herd → maps to efficient signaling
Key Concepts:
- Thread Pools: CS:APP Chapter 12.6.4 - Bryant & O’Hallaron
- Producer-Consumer Pattern: CS:APP Chapter 12.5.2 - Bryant & O’Hallaron
- Condition Variables: CS:APP Chapter 12.5.3 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Understanding of threads, mutexes, condition variables
Real world outcome:
// Create a pool with 4 worker threads
threadpool_t *pool = threadpool_create(4);
// Submit 100 tasks
for (int i = 0; i < 100; i++) {
threadpool_submit(pool, compute_task, &data[i]);
}
// Wait for all tasks to complete
threadpool_wait(pool);
// Shutdown cleanly
threadpool_destroy(pool);
$ ./test-threadpool
Created thread pool with 4 workers
Submitted 100 tasks
[Worker 0] Completed task 1
[Worker 1] Completed task 2
[Worker 2] Completed task 3
...
All 100 tasks completed in 2.3 seconds
(Sequential would take ~25 seconds)
Speedup: 10.8x
Implementation Hints:
Core data structures:
typedef struct {
void (*function)(void *);
void *argument;
} task_t;
typedef struct {
task_t *tasks; // Circular buffer
int head, tail; // Queue indices
int size, capacity; // Queue state
pthread_mutex_t lock;
pthread_cond_t not_empty;
pthread_cond_t not_full;
pthread_t *threads; // Worker threads
int num_threads;
int shutdown; // Shutdown flag
} threadpool_t;
Worker thread function:
void *worker(void *arg) {
threadpool_t *pool = arg;
while (1) {
pthread_mutex_lock(&pool->lock);
// Wait for work or shutdown
while (pool->size == 0 && !pool->shutdown) {
pthread_cond_wait(&pool->not_empty, &pool->lock);
}
if (pool->shutdown && pool->size == 0) {
pthread_mutex_unlock(&pool->lock);
return NULL;
}
// Dequeue task
task_t task = pool->tasks[pool->head];
pool->head = (pool->head + 1) % pool->capacity;
pool->size--;
pthread_cond_signal(&pool->not_full);
pthread_mutex_unlock(&pool->lock);
// Execute task
task.function(task.argument);
}
}
Graceful shutdown:
- Set shutdown flag
- Broadcast to all waiting threads
- Join all threads
- Free resources
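A destroy sketch following those steps (field names from the threadpool_t above):
void threadpool_destroy(threadpool_t *pool) {
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->not_empty);   /* wake every sleeping worker */
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->num_threads; i++)
        pthread_join(pool->threads[i], NULL);   /* workers drain the queue, then exit */
    free(pool->tasks);
    free(pool->threads);
    free(pool);
}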
Learning milestones:
- Basic submit/execute works → You understand work queues
- Blocking works correctly → You’ve mastered condition variables
- Graceful shutdown → You can terminate threads safely
- High performance → You’ve built production-quality code
Project 14: Signal-Safe Printf
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Signals / Async-Signal-Safety / System Programming
- Software or Tool: Unix signals, write()
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 8
What you’ll build: A printf replacement that’s safe to call from signal handlers. The standard printf uses locks and heap allocation—both forbidden in signal handlers. Your version uses only async-signal-safe functions.
Why it teaches signal safety: Signal handlers are one of the trickiest parts of Unix programming. They can interrupt your program anywhere, including inside malloc or printf. Learning what’s safe in handlers is crucial for robust systems programming.
Core challenges you’ll face:
- Converting integers without sprintf → maps to understanding why sprintf isn’t safe
- Using only write() for output → maps to async-signal-safe function list
- Handling format specifiers → maps to parsing without malloc
- Being reentrant → maps to no global state
Key Concepts:
- Signal Handlers: CS:APP Chapter 8.5.3 - Bryant & O’Hallaron
- Async-Signal-Safe Functions: CS:APP Chapter 8.5.5 - Bryant & O’Hallaron
- Reentrant Functions: CS:APP Chapter 12.7.1 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: Weekend Prerequisites: Understanding of signals, low-level I/O
Real world outcome:
void sigint_handler(int sig) {
// UNSAFE: printf("Caught signal %d\n", sig);
// SAFE: Our signal-safe printf
sio_printf("Caught signal %d\n", sig);
_exit(0);
}
int main() {
signal(SIGINT, sigint_handler);
while (1) pause();
}
$ ./test-sio
Program running... (press Ctrl+C)
^CCaught signal 2
Implementation Hints:
Why printf is unsafe:
- Uses malloc for formatting (not reentrant)
- Uses internal locks (could deadlock if signal arrives while holding lock)
- Has static state (not reentrant)
Integer to string without sprintf:
void sio_ltoa(long v, char *s, int base) {
char digits[] = "0123456789abcdef";
int neg = (v < 0);
char buf[32];
int i = 0;
if (neg) v = -v;
do {
buf[i++] = digits[v % base];
v /= base;
} while (v > 0);
int j = 0;
if (neg) s[j++] = '-';
while (i > 0) s[j++] = buf[--i];
s[j] = '\0';
}
Signal-safe output:
ssize_t sio_puts(char s[]) {
return write(STDOUT_FILENO, s, strlen(s));
}
List of async-signal-safe functions (from POSIX):
- _exit, write, read, open, close
- signal, sigaction, sigemptyset, sigfillset
- fork, execve, waitpid
- NOT: printf, malloc, free, exit
Learning milestones:
- Integer output works → You can convert without sprintf
- String output works → You use only write()
- Format specifiers work → You’ve built a mini-printf
- No crashes in handlers → You write signal-safe code
Project 15: Performance Profiler
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: C++, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Performance / Profiling / CPU Counters
- Software or Tool: SIGPROF, rdtsc, perf_event_open
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapter 5
What you’ll build: A sampling profiler that periodically interrupts your program, records where it is (the instruction pointer and call stack), and generates a profile showing where time is spent.
Why it teaches performance optimization: Understanding performance requires measurement. By building a profiler, you’ll understand how tools like perf and gprof work. You’ll learn about CPU cycles, sampling, and statistical profiling.
Core challenges you’ll face:
- Periodic sampling with signals → maps to SIGPROF and setitimer
- Walking the call stack → maps to frame pointers, unwinding
- Measuring with cycle counters → maps to rdtsc instruction
- Generating useful output → maps to aggregation, reporting
Key Concepts:
- Performance Measurement: CS:APP Chapter 5.13 - Bryant & O’Hallaron
- CPU Cycle Counters: CS:APP Chapter 5.13.1 - Bryant & O’Hallaron
- Amdahl’s Law: CS:APP Chapter 5.12 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Understanding of signals, assembly, stack frames
Real world outcome:
$ ./profiler ./my-program
Running with sampling profiler...
Program completed.
Profile:
45.2% compute_matrix src/matrix.c:123
23.1% sort_data src/sort.c:45
12.4% read_input src/io.c:78
8.3% write_output src/io.c:156
5.2% main src/main.c:23
5.8% (other)
Total samples: 10000
Sample rate: 1000 Hz
Implementation Hints:
Sampling with SIGPROF:
void profile_handler(int sig, siginfo_t *info, void *ctx) {
ucontext_t *uctx = ctx;
void *ip = (void *)uctx->uc_mcontext.gregs[REG_RIP];
// Record this instruction pointer
record_sample(ip);
}
void start_profiling() {
struct sigaction sa = {
.sa_sigaction = profile_handler,
.sa_flags = SA_SIGINFO
};
sigaction(SIGPROF, &sa, NULL);
struct itimerval timer = {
.it_interval = {0, 1000}, // 1ms = 1000 Hz
.it_value = {0, 1000}
};
setitimer(ITIMER_PROF, &timer, NULL);
}
Stack walking (simplified, with frame pointers):
void walk_stack(void **buffer, int max) {
void **bp = __builtin_frame_address(0);
int i = 0;
while (bp && i < max) {
buffer[i++] = *(bp + 1); // Return address
bp = *bp; // Previous frame
}
}
Reading cycle counter:
static inline uint64_t rdtsc() {
unsigned int lo, hi;
__asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
return ((uint64_t)hi << 32) | lo;
}
Learning milestones:
- Sampling works → You understand timer signals
- IP recording works → You can identify hot functions
- Stack walking works → You can see call chains
- Output is useful → You’ve built a real profiler
Project 16: Memory Leak Detector
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: C++
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Memory Management / Debugging / Linking
- Software or Tool: LD_PRELOAD, malloc interposition
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapters 7 and 9
What you’ll build: A memory leak detector that intercepts malloc/free, tracks all allocations (with stack traces), and reports unfreed memory at program exit. Similar to Valgrind’s memcheck but simpler.
Why it teaches linking and memory: By using library interposition (LD_PRELOAD), you’ll understand how dynamic linking works. By tracking allocations, you’ll reinforce heap memory understanding. This combines linking and memory management beautifully.
Core challenges you’ll face:
- Interposing malloc/free → maps to LD_PRELOAD and dlsym
- Tracking allocations → maps to data structures without malloc (chicken-egg)
- Recording stack traces → maps to backtrace function
- Reporting at exit → maps to atexit handlers
Key Concepts:
- Library Interposition: CS:APP Chapter 7.13 - Bryant & O’Hallaron
- Dynamic Linking: CS:APP Chapter 7.11-7.12 - Bryant & O’Hallaron
- Memory Leaks: CS:APP Chapter 9.11 - Bryant & O’Hallaron
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Understanding of malloc (Project 6), dynamic linking
Real world outcome:
$ gcc -shared -fPIC -o mymalloc.so mymalloc.c -ldl
$ LD_PRELOAD=./mymalloc.so ./leaky-program
Program running...
Program completed.
=== MEMORY LEAK REPORT ===
Leaked 1024 bytes in 1 block(s)
at malloc (mymalloc.c:45)
at create_buffer (leaky.c:23)
at main (leaky.c:89)
Leaked 48 bytes in 12 block(s)
at malloc (mymalloc.c:45)
at strdup (string.c:123)
at process_line (leaky.c:67)
at main (leaky.c:95)
Total leaked: 1072 bytes in 13 blocks
Implementation Hints:
Library interposition with LD_PRELOAD:
#define _GNU_SOURCE
#include <dlfcn.h>
static void *(*real_malloc)(size_t) = NULL;
static void (*real_free)(void *) = NULL;
void *malloc(size_t size) {
if (!real_malloc)
real_malloc = dlsym(RTLD_NEXT, "malloc");
void *ptr = real_malloc(size);
// Track this allocation
track_alloc(ptr, size);
return ptr;
}
void free(void *ptr) {
if (!real_free)
real_free = dlsym(RTLD_NEXT, "free");
// Remove from tracking
untrack_alloc(ptr);
real_free(ptr);
}
The chicken-and-egg problem:
- Your tracking code can’t use malloc (infinite recursion!)
- Solutions:
- Use mmap directly for tracking data structures
- Pre-allocate a static array
- Use an "in_malloc" flag to prevent recursion (see the sketch below)
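A recursion-guard sketch extending the malloc wrapper above (thread-local so each thread guards itself):
static __thread int in_hook = 0;    /* nonzero while the tracker itself is running */
void *malloc(size_t size) {
    if (!real_malloc)
        real_malloc = dlsym(RTLD_NEXT, "malloc");
    void *ptr = real_malloc(size);
    if (!in_hook) {
        in_hook = 1;
        track_alloc(ptr, size);     /* safe even if tracking internally allocates */
        in_hook = 0;
    }
    return ptr;
}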
Recording stack traces:
#include <execinfo.h>
void record_backtrace(alloc_record_t *rec) {
void *buffer[16];
int n = backtrace(buffer, 16);
rec->trace = backtrace_symbols(buffer, n);
rec->trace_size = n;
}
Learning milestones:
- Interposition works → You understand LD_PRELOAD
- Tracking works → You can manage metadata
- Leak report is accurate → You’ve built a useful tool
- Stack traces are helpful → You can debug real leaks
Project 17: Debugger (ptrace-based)
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C
- Alternative Programming Languages: C++, Rust
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 5: Master
- Knowledge Area: Debugging / ptrace / Process Control
- Software or Tool: ptrace, ELF, DWARF
- Main Book: Computer Systems: A Programmer’s Perspective, 3rd Edition - Chapters 3, 7, 8
What you’ll build: A simple debugger like GDB that can set breakpoints, single-step through code, inspect registers and memory, and show source lines. This is the ultimate CS:APP project.
Why it teaches everything: A debugger touches almost every concept in CS:APP: process control (fork/exec/wait), signals (SIGTRAP), memory (reading/writing process memory), linking (parsing DWARF debug info), and machine code (understanding instructions).
Core challenges you’ll face:
- Controlling a child process → maps to ptrace PTRACE_TRACEME
- Setting breakpoints → maps to replacing instructions with int3
- Single-stepping → maps to PTRACE_SINGLESTEP
- Reading registers and memory → maps to PTRACE_PEEKDATA/GETREGS
- Mapping addresses to source lines → maps to DWARF debug info
Resources for key challenges:
- “Building a Debugger” by Sy Brand - Excellent step-by-step guide
Key Concepts:
- ptrace: Linux man pages (man 2 ptrace)
- Breakpoints: CS:APP Chapter 3.10.5 - Bryant & O’Hallaron
- DWARF Debug Info: “DWARF Debugging Standard” specification
- Process Control: CS:APP Chapter 8.4 - Bryant & O’Hallaron
Difficulty: Master Time estimate: 1 month+ Prerequisites: All previous projects, strong understanding of x86-64, ELF
Real world outcome:
$ ./mydb ./test-program
mydb> break main
Breakpoint 1 at 0x401150 (main.c:10)
mydb> run
Starting program: ./test-program
Breakpoint 1, main () at main.c:10
10 int x = 42;
mydb> step
11 int y = compute(x);
mydb> print x
$1 = 42
mydb> regs
rax: 0x0000000000000000
rbx: 0x0000000000000000
rcx: 0x00007ffff7e15a80
...
rip: 0x0000000000401158
mydb> continue
Program exited normally.
Implementation Hints:
Setting up tracing:
int main(int argc, char **argv) {
pid_t pid = fork();
if (pid == 0) {
// Child: become tracee
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execvp(argv[1], argv + 1);
} else {
// Parent: debugger
wait(NULL); // Wait for exec to complete
debugger_loop(pid);
}
}
Setting a breakpoint:
void set_breakpoint(pid_t pid, uintptr_t addr) {
// Read original byte
long data = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
uint8_t orig_byte = data & 0xFF;
// Save original byte for restoration
save_original(addr, orig_byte);
// Replace with int3 (0xCC)
long modified = (data & ~0xFF) | 0xCC;
ptrace(PTRACE_POKEDATA, pid, addr, modified);
}
Reading registers:
struct user_regs_struct regs;
ptrace(PTRACE_GETREGS, pid, NULL, &regs);
printf("rip: %llx\n", regs.rip);
printf("rsp: %llx\n", regs.rsp);
Single-stepping:
ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL);
wait(NULL); // Wait for SIGTRAP
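Both a breakpoint hit and a completed single step show up as the tracee stopping with SIGTRAP, so the debugger loop keeps alternating ptrace requests with waits. A small sketch (wait_for_trap is a hypothetical helper name):

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 1 if the tracee stopped, 0 if it exited. */
int wait_for_trap(pid_t pid) {
    int status;
    waitpid(pid, &status, 0);
    if (WIFEXITED(status)) {
        printf("Program exited normally.\n");
        return 0;
    }
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return 1;        /* breakpoint hit or single step finished */
    return 1;            /* stopped on some other signal; inspect and decide */
}
```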
Learning milestones:
- Can run and stop program → You understand basic ptrace
- Breakpoints work → You understand int3 and instruction patching
- Single-step works → You control execution precisely
- Register/memory inspection → You can examine program state
- Source-level debugging → You parse DWARF info
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor | CS:APP Chapter |
|---|---|---|---|---|---|
| 1. Bit Manipulation Puzzles | ⭐ | Weekend | ⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 2 |
| 2. Binary Bomb Defuser | ⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Ch. 3 |
| 3. Buffer Overflow Exploits | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Ch. 3 |
| 4. Y86-64 Processor Sim | ⭐⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 4 |
| 5. Cache Simulator | ⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 6 |
| 6. Dynamic Memory Allocator | ⭐⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 9 |
| 7. Unix Shell | ⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 8 |
| 8. ELF Linker | ⭐⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 7 |
| 9. Virtual Memory Sim | ⭐⭐⭐ | 2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 9 |
| 10. Robust I/O Library | ⭐⭐ | Weekend | ⭐⭐⭐ | ⭐⭐ | Ch. 10 |
| 11. HTTP Web Server | ⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 11 |
| 12. Concurrent Web Proxy | ⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 11-12 |
| 13. Thread Pool | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ | Ch. 12 |
| 14. Signal-Safe Printf | ⭐⭐⭐ | Weekend | ⭐⭐⭐⭐ | ⭐⭐ | Ch. 8 |
| 15. Performance Profiler | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 5 |
| 16. Memory Leak Detector | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Ch. 7, 9 |
| 17. Debugger | ⭐⭐⭐⭐⭐ | 1 month+ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Ch. 3, 7, 8 |
Recommended Learning Path
Phase 1: Foundation (4-6 weeks)
Start with data representation and machine code:
- Project 1: Bit Manipulation Puzzles - Understand binary representation
- Project 2: Binary Bomb - Learn x86-64 assembly and GDB
- Project 3: Buffer Overflow Exploits (Attack Lab) - Master stack layout and exploitation
After Phase 1, you’ll think in binary and read assembly fluently.
Phase 2: Hardware Understanding (4-6 weeks)
Learn how processors and memory hierarchy work:
- Project 4: Y86-64 Processor - Understand CPU pipeline
- Project 5: Cache Simulator - Master memory hierarchy
After Phase 2, you’ll understand the hardware-software interface.
Phase 3: System Software (6-8 weeks)
Learn how the OS manages programs:
- Project 6: Malloc Implementation - Master heap memory
- Project 7: Unix Shell - Understand processes and signals
- Project 8: ELF Linker - Understand how programs are built
- Project 9: Virtual Memory Simulator - Understand address translation
After Phase 3, you’ll understand operating system fundamentals.
Phase 4: I/O and Networking (3-4 weeks)
Build networked systems:
- Project 10: RIO Library - Handle system I/O robustly
- Project 11: HTTP Web Server - Understand network programming
After Phase 4, you can build real networked applications.
Phase 5: Concurrency (4-6 weeks)
Master parallel programming:
- Project 12: Concurrent Web Proxy - Apply threading to real systems
- Project 13: Thread Pool - Build reusable concurrency primitives
- Project 14: Signal-Safe Printf - Understand async-signal safety
After Phase 5, you can write correct concurrent code.
Phase 6: Advanced Topics (4+ weeks)
Build developer tools:
- Project 15: Performance Profiler - Understand measurement
- Project 16: Memory Leak Detector - Combine linking and memory
- Project 17: Debugger - The capstone of the 17 core projects
After Phase 6, you understand computer systems deeply.
Final Capstone Project: Operating System Kernel
- File: LEARN_CSAPP_COMPUTER_SYSTEMS.md
- Main Programming Language: C, x86-64 Assembly
- Alternative Programming Languages: Rust
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 5: Master
- Knowledge Area: All of CS:APP
- Software or Tool: QEMU, GRUB, GDB
- Main Book: Operating Systems: Three Easy Pieces + CS:APP
What you’ll build: A minimal operating system kernel that boots on x86-64, manages memory with paging, runs multiple processes with preemptive scheduling, handles system calls, and implements a simple file system.
Why it’s the ultimate capstone: An OS kernel uses every concept from CS:APP:
- Chapter 2: Data representation (page table entries, bitfields)
- Chapter 3: Assembly (bootstrap, interrupt handlers)
- Chapter 4: Processor details (privilege levels, interrupts)
- Chapter 5-6: Performance and caching (kernel optimization)
- Chapter 7: Linking (kernel loading, module loading)
- Chapter 8: Exceptional control flow (interrupts, context switching)
- Chapter 9: Virtual memory (paging, page fault handling)
- Chapter 10: I/O (disk drivers, console)
- Chapter 12: Concurrency (locks, scheduling)
Core challenges you’ll face:
- Booting in protected/long mode → maps to processor modes, GDT/IDT
- Implementing paging → maps to page tables, address translation (see the sketch after this list)
- Writing interrupt handlers → maps to exceptional control flow
- Context switching → maps to saving/restoring registers
- System call interface → maps to user/kernel boundary
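To make the paging challenge concrete, here is a minimal sketch (standard x86-64 4-level paging with 4 KiB pages) of how a virtual address splits into the indices your page-table walk will use; the walk itself is up to your kernel:

```c
#include <stdint.h>
#include <stdio.h>

/* 9 index bits per level above a 12-bit page offset */
#define PT_IDX(va, level) (((va) >> (12 + 9 * (level))) & 0x1FF)

int main(void) {
    uint64_t va = 0x00007f1234567abcULL;   /* example virtual address */
    printf("PML4=%llu PDPT=%llu PD=%llu PT=%llu offset=0x%llx\n",
           (unsigned long long)PT_IDX(va, 3),
           (unsigned long long)PT_IDX(va, 2),
           (unsigned long long)PT_IDX(va, 1),
           (unsigned long long)PT_IDX(va, 0),
           (unsigned long long)(va & 0xFFF));
    return 0;
}
```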
Resources:
- “Operating Systems: Three Easy Pieces” by Arpaci-Dusseau - Free online textbook
- OSDev Wiki - Comprehensive reference
- “xv6: A simple Unix-like teaching operating system” - MIT’s teaching OS
Key Concepts:
- Bootstrapping: OSDev Wiki + Intel manuals
- Paging: CS:APP Chapter 9 + OSTEP Chapters 18-20
- Interrupts: CS:APP Chapter 8 + Intel manuals
- Scheduling: OSTEP Chapters 7-9
- File Systems: OSTEP Chapters 39-42
Difficulty: Master+. Time estimate: 3-6 months. Prerequisites: All 17 projects above, plus a deep understanding of x86-64.
Real world outcome:
Booting MyOS v0.1...
Memory: 256 MB detected
Paging: Enabled (4-level page tables)
Interrupts: 256 handlers installed
Timer: 100 Hz tick
MyOS> ps
PID STATE NAME
1 RUNNING init
2 SLEEPING shell
3 RUNNING ps
MyOS> cat /hello.txt
Hello from MyOS!
MyOS> ./factorial 10
3628800
MyOS> shutdown
Goodbye!
Implementation Hints:
Suggested milestones:
- Bare metal hello world - Boot and print to the screen (see the sketch after these milestones)
- Protected/long mode - Set up GDT, switch to 64-bit
- Interrupts working - Handle keyboard, timer
- Physical memory manager - Track free frames
- Virtual memory - Enable paging
- User mode processes - Privilege separation
- System calls - User/kernel interface
- Simple scheduler - Round-robin
- File system - Read-only initially
- Shell - Interactive interface
The key insight: Build incrementally. Each milestone should be a working (if limited) system.
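For that first milestone, “print to the screen” typically means writing character/attribute pairs into the legacy VGA text buffer. A minimal freestanding sketch, assuming the bootloader has left physical address 0xB8000 accessible:

```c
#include <stdint.h>

/* Legacy VGA text mode: 80x25 cells at 0xB8000, each cell is one ASCII
 * byte plus one attribute byte. */
static volatile uint16_t *const vga = (volatile uint16_t *)0xB8000;

void kmain(void) {
    const char *msg = "Booting MyOS v0.1...";
    for (int i = 0; msg[i] != '\0'; i++)
        vga[i] = (uint16_t)msg[i] | (0x07 << 8);   /* light grey on black */
    for (;;)
        ;                                          /* nothing to return to */
}
```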
Learning milestones:
- Boots and prints → You understand bare metal
- Paging works → You’ve implemented virtual memory
- User processes run → You understand privilege levels
- Multiple processes → You can context switch
- File system works → You understand block storage
- It’s actually usable → You’ve built an OS!
Summary
| # | Project | Main Language |
|---|---|---|
| 1 | Bit Manipulation Puzzle Solver | C |
| 2 | Binary Bomb Defuser | x86-64 Assembly |
| 3 | Buffer Overflow Exploit Lab | x86-64 Assembly |
| 4 | Y86-64 Processor Simulator | C (with HCL) |
| 5 | Cache Simulator | C |
| 6 | Dynamic Memory Allocator | C |
| 7 | Unix Shell Implementation | C |
| 8 | ELF Linker and Loader | C |
| 9 | Virtual Memory Simulator | C |
| 10 | Robust I/O Library | C |
| 11 | HTTP Web Server | C |
| 12 | Concurrent Web Proxy | C |
| 13 | Thread Pool Implementation | C |
| 14 | Signal-Safe Printf | C |
| 15 | Performance Profiler | C |
| 16 | Memory Leak Detector | C |
| 17 | Debugger | C |
| Capstone | Operating System Kernel | C + Assembly |
Resources
Primary Textbook
- Computer Systems: A Programmer’s Perspective, 3rd Edition by Randal E. Bryant and David R. O’Hallaron
Official Lab Materials
- CS:APP Labs - Download official lab handouts
- Y86-64 Simulator Guide - Processor simulator documentation
Supplementary Books
- Operating Systems: Three Easy Pieces by Arpaci-Dusseau - Free online, excellent companion
- The Linux Programming Interface by Michael Kerrisk - Deep Unix system programming
- Hacker’s Delight by Henry S. Warren Jr. - Bit manipulation mastery
- Linkers and Loaders by John R. Levine - Linking in depth
Online Resources
- CMU 15-213 Course - Course materials and videos
- Exercism x86-64 Track - Assembly practice
- OSDev Wiki - Operating system development
Total Estimated Time: 6-12 months of dedicated study
After completion: You will understand computer systems at a level that 99% of programmers never reach. You’ll write better code, debug faster, and tackle any low-level challenge with confidence.