Project 4: x86-64 Calling Convention Crash Cart
Debug crashes by reading the machine: master stack frames, registers, and assembly patterns to trace any failure back to its source.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Language | C (Alternatives: Rust, Zig, C++) |
| Prerequisites | Comfort using a debugger (GDB/LLDB), Projects 1-2 recommended |
| Key Topics | x86-64 calling conventions, stack frames, ABI, assembly patterns, debugging |
| CS:APP Chapters | 3 |
Table of Contents
- Learning Objectives
- Deep Theoretical Foundation
- Project Specification
- Solution Architecture
- Implementation Guide
- Testing Strategy
- Common Pitfalls
- Extensions
- Real-World Connections
- Resources
- Self-Assessment Checklist
1. Learning Objectives
By completing this project, you will:
- Map assembly to C constructs: Read compiler-generated assembly and understand how it implements C control flow, function calls, and data structures
- Explain stack layout precisely: Given any crash, draw the stack frame showing saved registers, return addresses, local variables, and arguments
- Master the System V AMD64 ABI: Know which registers hold arguments, return values, and which are caller/callee-saved
- Recognize compiler patterns: Identify prologues, epilogues, loops, conditionals, and switch statements in disassembly
- Debug from machine state: Given a register dump and stack bytes, reconstruct what happened and why
- Identify vulnerability classes: Recognize buffer overflows, use-after-free, and other memory errors from assembly signatures
2. Deep Theoretical Foundation
2.1 x86-64 Register Conventions
The x86-64 architecture has 16 general-purpose 64-bit registers. The System V AMD64 ABI (used on Linux, macOS, and BSD) assigns specific purposes to each:
──────────────────────────────────────────────────────────────────
 x86-64 General Purpose Registers
──────────────────────────────────────────────────────────────────

 CALLER-SAVED (volatile) - Callee may trash these
 ────────────────────────────────────────────────
   %rax - Return value, syscall number
   %rcx - 4th argument (syscalls: destroyed)
   %rdx - 3rd argument, 2nd return value
   %rsi - 2nd argument
   %rdi - 1st argument
   %r8  - 5th argument
   %r9  - 6th argument
   %r10 - Temporary, syscall argument
   %r11 - Temporary (destroyed by syscalls)

 CALLEE-SAVED (non-volatile) - Must be preserved across calls
 ────────────────────────────────────────────────────────────
   %rbx - General purpose (preserved)
   %rbp - Frame pointer (optional, but conventional)
   %r12 - General purpose (preserved)
   %r13 - General purpose (preserved)
   %r14 - General purpose (preserved)
   %r15 - General purpose (preserved)

 SPECIAL PURPOSE
 ───────────────
   %rsp - Stack pointer (always preserved)
   %rip - Instruction pointer (not directly accessible)

──────────────────────────────────────────────────────────────────
Register Size Variants
Each 64-bit register has smaller addressable portions:
┌──────────────────────────────────────────────────────────────────┐
│63                               31               15       7     0│
├──────────────────────────────────────────────────────────────────┤
│                          %rax (64-bit)                           │
│                                ┌─────────────────────────────────┤
│                                │          %eax (32-bit)          │
│                                │                ┌────────────────┤
│                                │                │  %ax (16-bit)  │
│                                │                ├────────┬───────┤
│                                │                │  %ah   │  %al  │
└────────────────────────────────┴────────────────┴────────┴───────┘
Key insight: Writing to %eax ZEROS the upper 32 bits of %rax.
Writing to %ax, %ah, or %al preserves the upper bits.
Why This Matters for Debugging
When you see a crash, the first thing to check is the registers. The conventions above tell you what each one means:
- %rdi, %rsi, %rdx, %rcx, %r8, %r9 = What arguments were passed?
- %rax = What was (or would be) the return value?
- %rsp = Where is the stack? Is it aligned?
- %rbp = Can we walk the stack frames?
- %rip = What instruction crashed?
2.2 System V AMD64 ABI Calling Convention
The calling convention defines the contract between caller and callee:
Argument Passing Rules
Integer/Pointer Arguments (in order):
- %rdi - 1st argument
- %rsi - 2nd argument
- %rdx - 3rd argument
- %rcx - 4th argument
- %r8 - 5th argument
- %r9 - 6th argument
- Stack - 7th argument onwards (pushed right-to-left)
Floating-Point Arguments:
- %xmm0 through %xmm7 (up to 8 float/double arguments)
- Additional float arguments go on the stack
Return Values:
- Integer/pointer: %rax (and %rdx for 128-bit values)
- Floating-point: %xmm0 (and %xmm1 for pairs)
// Example: How arguments are passed
long example(long a, long b, long c, long d, long e, long f, long g, long h);
// %rdi %rsi %rdx %rcx %r8 %r9 stack stack
Stack Alignment Requirement
Critical Rule: The stack must be 16-byte aligned BEFORE the call instruction.
After call pushes the 8-byte return address, %rsp will be 8 mod 16.
The callee typically pushes %rbp (if using frame pointer) to realign.
Before call: %rsp % 16 == 0 (aligned)
After call: %rsp % 16 == 8 (misaligned due to return address)
After push: %rsp % 16 == 0 (realigned by saving %rbp)
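Why alignment matters: aligned SSE instructions such as movaps fault when their memory operand is not 16-byte aligned, and compilers emit them on the assumption that the ABI's stack alignment holds. Calling printf with a misaligned stack can crash inside libc!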
2.3 Stack Frame Layout
A complete stack frame for a function call:
High addresses
            ┌─────────────────────────────────────┐
            │ Caller's frame                      │
            ├─────────────────────────────────────┤
            │ Stack argument n (if any)           │ ← Arguments 7+
            │ ...                                 │
            │ Stack argument 7 (if any)           │
            ├─────────────────────────────────────┤
   +8(%rbp) │ Return address (8 bytes)            │ ← Pushed by call
            ├─────────────────────────────────────┤
     (%rbp) │ Saved %rbp (8 bytes)                │ ← Frame pointer
            ├─────────────────────────────────────┤
   -8(%rbp) │ Saved callee-saved registers        │ ← %rbx, %r12-%r15
            │ (as needed)                         │
            ├─────────────────────────────────────┤
            │ Local variables                     │
            │ (growing downward)                  │
            ├─────────────────────────────────────┤
            │ Padding for alignment               │ ← 16-byte align
            ├─────────────────────────────────────┤
     (%rsp) │ (Red zone: 128 bytes below)         │ ← Leaf functions only
            └─────────────────────────────────────┘
Low addresses
Reading a Stack Frame in GDB
(gdb) x/20gx $rsp
0x7fffffffddc0: 0x0000000000000001 0x00007fffffffdeb8
0x7fffffffddd0: 0x0000000000000000 0x0000000000401234
0x7fffffffdde0: 0x00007fffffffde00 0x00007ffff7a2d830
                ↑                  ↑
                Saved %rbp         Return address
(gdb) info frame
Stack level 0, frame at 0x7fffffffddf0:
rip = 0x401234 in main (example.c:15); saved rip = 0x7ffff7a2d830
called by frame at 0x7fffffffde50
source language c.
Arglist at 0x7fffffffdde0, args: argc=1, argv=0x7fffffffdeb8
Locals at 0x7fffffffdde0, Previous frame's sp is 0x7fffffffddf0
Saved registers:
rbp at 0x7fffffffdde0, rip at 0x7fffffffdde8
2.4 The Red Zone
On System V AMD64 (Linux, macOS), leaf functions (functions that don't call other functions) can use 128 bytes below %rsp without adjusting the stack pointer:
┌───────────────────────────────────────────────────────────────┐
│ The Red Zone (128 bytes)                                      │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  %rsp      → ┌──────────────────────────────────────────────┐ │
│              │ Red Zone: 128 bytes that leaf functions      │ │
│              │ can use WITHOUT subtracting from %rsp        │ │
│              │                                              │ │
│              │ - Preserved across signal handlers           │ │
│              │ - NOT preserved if function calls another    │ │
│              │ - Allows optimization: no stack setup        │ │
│              │                                              │ │
│  %rsp-128  → └──────────────────────────────────────────────┘ │
│                                                               │
│ Why it matters for debugging:                                 │
│ - If you crash in a leaf function, locals may be              │
│   in the red zone (below %rsp)                                │
│ - On Windows (different ABI), there is NO red zone            │
│                                                               │
└───────────────────────────────────────────────────────────────┘
Debugging implication: When examining a crash, don't just look at %rsp and above. For a leaf function, also examine the 128 bytes from %rsp - 128 up to %rsp, where its locals may live.
2.5 Arrays and Structs in Memory
Array Layout
Arrays are contiguous in memory:
int arr[4] = {10, 20, 30, 40};
Address:  arr[0]    arr[1]    arr[2]    arr[3]
        ┌─────────┬─────────┬─────────┬─────────┐
Memory: │   10    │   20    │   30    │   40    │
        └─────────┴─────────┴─────────┴─────────┘
Offset:   +0        +4        +8        +12
Assembly access: arr[i] → movl (%rax,%rcx,4), %edx
    %rax = base address, %rcx = index i, 4 = scale (sizeof(int)), %edx = destination
Struct Layout with Padding
struct example {
char a; // 1 byte
// 3 bytes padding
int b; // 4 bytes
char c; // 1 byte
// 7 bytes padding
long d; // 8 bytes
};
// Total size: 24 bytes (not 14!)
Offset:   0        1-3        4-7       8       9-15      16-23
      ┌────────┬──────────┬────────┬───────┬──────────┬────────┐
      │   a    │ padding  │   b    │   c   │ padding  │   d    │
      └────────┴──────────┴────────┴───────┴──────────┴────────┘
Bytes:    1         3         4        1        7          8      (total 24)
Alignment rules:
- Each field is aligned to its natural size (or max alignment)
- Struct size is padded to multiple of largest alignment
- char = 1-byte aligned
- short = 2-byte aligned
- int = 4-byte aligned
- long, pointer = 8-byte aligned
Struct Access in Assembly
struct point { int x; int y; };
struct point p;
int val = p.y;
# Assuming %rdi points to struct point
movl 4(%rdi), %eax # offset 4 = y field
Debugging tip: Know your struct layout! Use pahole or sizeof/offsetof to verify.
2.6 Common Instruction Patterns
Function Prologue (with frame pointer)
pushq %rbp # Save caller's frame pointer
movq %rsp, %rbp # Establish new frame pointer
subq $32, %rsp # Allocate 32 bytes for locals
pushq %rbx # Save callee-saved register (if used)
Function Prologue (without frame pointer, -fomit-frame-pointer)
subq $40, %rsp # Allocate locals + alignment
movq %rbx, 32(%rsp) # Save callee-saved in stack slot
Function Epilogue
popq %rbx # Restore callee-saved register (only if the prologue pushed it)
addq $32, %rsp # Deallocate locals
popq %rbp # Restore caller's frame pointer
retq # Pop return address, jump to it
Or with leave:
leave # Equivalent to: movq %rbp, %rsp; popq %rbp
retq
Loop Patterns
For loop:
for (int i = 0; i < n; i++) { ... }
movl $0, %eax # i = 0
.L2:
cmpl %edi, %eax # compare i with n
jge .L1 # if i >= n, exit loop
# ... loop body ...
addl $1, %eax # i++
jmp .L2 # continue loop
.L1:
While loop:
while (condition) { ... }
jmp .Ltest # Jump to condition test
.Lbody:
# ... loop body ...
.Ltest:
testl %eax, %eax # Test condition
jne .Lbody # If true, continue loop
Conditional Patterns
If-else:
if (a > b) { x = 1; } else { x = 0; }
cmpl %esi, %edi # Compare a (edi) with b (esi)
jle .Lelse # If a <= b, goto else
movl $1, %eax # x = 1
jmp .Ldone
.Lelse:
movl $0, %eax # x = 0
.Ldone:
Conditional move (branchless):
return a > b ? a : b; // max
cmpl %esi, %edi # Compare a with b
movl %esi, %eax # Assume b
cmovg %edi, %eax # If a > b, use a instead
Switch Statement with Jump Table
switch (x) {
case 0: return 10;
case 1: return 20;
case 2: return 30;
}
cmpl $2, %edi # Check if x > 2
ja .Ldefault # If so, default case
movl %edi, %edi # Zero-extend x (upper 32 bits of %rdi are unspecified for an int)
leaq .Ljumptable(%rip), %rax
movslq (%rax,%rdi,4), %rdx # Load offset from table
addq %rax, %rdx # Add base address
jmpq *%rdx # Indirect jump
.Ljumptable:
.long .Lcase0 - .Ljumptable
.long .Lcase1 - .Ljumptable
.long .Lcase2 - .Ljumptable
Function Call Pattern
result = foo(a, b, c);
movl $3, %edx # 3rd arg (c)
movl $2, %esi # 2nd arg (b)
movl $1, %edi # 1st arg (a)
call foo
movl %eax, result(%rip) # Save return value
2.7 Reading Crash Dumps
When a program crashes, you get a signal (usually SIGSEGV or SIGBUS). The debugger preserves the exact machine state at the crash point.
Key information in a crash:
- %rip - What instruction caused the crash?
- Signal type - What kind of error?
- SIGSEGV: Invalid memory access
- SIGBUS: Misaligned access, bad address
- SIGFPE: Arithmetic error (division by zero)
- SIGILL: Illegal instruction
- SIGABRT: abort() called
- Faulting address - What address was accessed?
- Register state - What values were in play?
- Stack trace - How did we get here?
Example GDB crash analysis session:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401234 in process_data (buf=0x7fffffffddc0, len=256) at crash.c:42
(gdb) info registers
rax 0x0 0
rbx 0x7fffffffdeb8 140737488346808
rcx 0x100 256
rdx 0x0 0
rsi 0x100 256
rdi 0x7fffffffddc0 140737488346560
rbp 0x7fffffffddf0 0x7fffffffddf0
rsp 0x7fffffffdda0 0x7fffffffdda0
rip 0x401234 0x401234 <process_data+52>
(gdb) x/i $rip
=> 0x401234 <process_data+52>: movb %al,(%rdx)
                                        ↑
                        Writing to address 0x0 (NULL pointer!)
(gdb) bt
#0 0x0000000000401234 in process_data (buf=0x7fffffffddc0, len=256) at crash.c:42
#1 0x0000000000401456 in main (argc=1, argv=0x7fffffffdeb8) at crash.c:60
Diagnosis: The instruction movb %al, (%rdx) tried to write to address 0x0 (the value in %rdx). This is a NULL pointer dereference.
3. Project Specification
3.1 What You Will Build
A "crash cart" toolkit consisting of:
- Crash-inducing programs: Small C programs that demonstrate specific failure modes
- Post-mortem report template: A standardized format for analyzing crashes
- Analysis tools/scripts: Helpers for extracting and formatting crash information
- Solution narratives: Complete explanations for each crash type
3.2 Functional Requirements
Part A: Crash Program Suite
Create programs that reliably trigger:
- Stack buffer overflow - Overwrite return address
- NULL pointer dereference - Read/write through NULL
- Use-after-free - Access freed memory
- Double free - Free the same pointer twice
- Stack overflow (recursion) - Infinite recursion
- Uninitialized variable - Use garbage value
- Off-by-one - Array bounds violation
- Format string vulnerability - printf with user data as format
- Integer overflow leading to crash - Arithmetic causes bad access
- Misaligned access (if applicable) - Access unaligned address
Part B: Post-Mortem Report Format
For each crash, produce a report containing:
═══════════════════════════════════════════════════════════════
                   CRASH POST-MORTEM REPORT
═══════════════════════════════════════════════════════════════
1. CRASH IDENTIFICATION
───────────────────────
   Program:       [executable name]
   Source file:   [source.c:line]
   Crash signal:  [SIGSEGV/SIGBUS/etc]
   Crash address: [0x...]
2. REGISTER STATE AT CRASH
──────────────────────────
   %rip = 0x... (Faulting instruction)
   %rsp = 0x... (Stack pointer)
   %rbp = 0x... (Frame pointer)
   %rax = 0x... [interpretation]
   %rdi = 0x... [interpretation]
   ...
3. FAULTING INSTRUCTION
───────────────────────
   Disassembly:   [instruction]
   Operation:     [what it was trying to do]
   Why it failed: [the specific reason]
4. STACK TRACE
──────────────
   #0 function_a at file.c:XX
   #1 function_b at file.c:YY
   ...
5. STACK FRAME ANALYSIS
───────────────────────
   [ASCII diagram of relevant stack frame(s)]
   Return address: 0x... → [function name + offset]
   Saved %rbp: 0x...
   Local variables:
   - var1 @ rbp-8: 0x...
   - var2 @ rbp-16: 0x...
   Arguments (if on stack):
   - arg7 @ rbp+16: 0x...
6. ROOT CAUSE
─────────────
   [Clear explanation of what went wrong at the C level]
7. VULNERABILITY CLASSIFICATION
───────────────────────────────
   CWE ID:      [if applicable]
   Category:    [buffer overflow / use-after-free / etc]
   Exploitable: [Yes/No/Maybe - brief explanation]
8. ASSEMBLY SIGNATURE
─────────────────────
   [Key assembly patterns that identify this bug class]
9. PREVENTION
─────────────
   [How to avoid this bug in the future]
═══════════════════════════════════════════════════════════════
Part C: GDB Automation Scripts
Create scripts that:
- Run program and capture crash state
- Extract register values to structured format
- Dump relevant stack bytes
- Generate disassembly of faulting function
- Produce initial report template
3.3 Example Crash Report
═══════════════════════════════════════════════════════════════
                   CRASH POST-MORTEM REPORT
═══════════════════════════════════════════════════════════════
1. CRASH IDENTIFICATION
───────────────────────
   Program:       ./buffer_overflow
   Source file:   buffer_overflow.c:15
   Crash signal:  SIGSEGV
   Crash address: 0x0000414141414141
2. REGISTER STATE AT CRASH
──────────────────────────
   %rip = 0x414141414141 (Invalid! This is ASCII 'AAAAAA')
   %rsp = 0x7fffffffddf8 (Stack pointer looks valid)
   %rbp = 0x4141414141414141 (Corrupted! Also 'AAAAAAAA')
   %rax = 0x0 (Return value was 0)
3. FAULTING INSTRUCTION
───────────────────────
   Disassembly:   Cannot disassemble - %rip points to invalid memory
   Operation:     Attempted to execute code at address 0x414141414141
   Why it failed: Address is not mapped; %rip was overwritten
4. STACK TRACE
──────────────
   Cannot unwind - frame pointer chain corrupted
   Last valid frame: vulnerable_function at buffer_overflow.c:12
5. STACK FRAME ANALYSIS
───────────────────────
   Before overflow:
   ┌───────────────────────────────────────┐
   │ rbp+8:  Return address → main+42      │
   │ rbp:    Saved %rbp → 0x7fffffffde00   │
   │ rbp-32: char buffer[32]               │
   └───────────────────────────────────────┘
   After overflow (46 bytes written to 32-byte buffer):
   ┌───────────────────────────────────────┐
   │ rbp+8:  0x0000414141414141 (AAAAAA)   │ ← Overwritten!
   │ rbp:    0x4141414141414141 (AAAAAAAA) │ ← Overwritten!
   │ rbp-32: "AAAA..." (overflow data)     │
   └───────────────────────────────────────┘
6. ROOT CAUSE
─────────────
   The function vulnerable_function() uses strcpy() to copy
   user-controlled input into a 32-byte stack buffer without
   length checking. Providing 46 bytes of input (plus the
   terminating NUL) overwrites:
   - 32 bytes of buffer (intended)
   - 8 bytes of saved %rbp (unintended)
   - 7 of the 8 bytes of return address (unintended)
   When the function returns (ret instruction), it pops
   0x0000414141414141 into %rip and attempts to execute there.
   Note: a longer input (for example 64 bytes) turns the return
   address into 0x4141414141414141, which is not a canonical
   x86-64 address. In that case ret itself faults, %rip still
   points at the ret instruction, and the bogus address is the
   value at the top of the stack (x/gx $rsp).
7. VULNERABILITY CLASSIFICATION
───────────────────────────────
   CWE ID:      CWE-121 (Stack-based Buffer Overflow)
   Category:    Buffer Overflow / Stack Smash
   Exploitable: Yes - attacker can redirect execution by
                controlling the overwritten return address
8. ASSEMBLY SIGNATURE
─────────────────────
   Look for:
   - call to strcpy/gets/sprintf without bounds checking
   - Buffer allocated on stack (subq $N, %rsp where N < input size)
   - No stack canary (missing __stack_chk_fail reference)
   Suspicious pattern:
   movq -40(%rbp), %rsi # user input pointer (saved argument)
   leaq -32(%rbp), %rdi # buffer address
   call strcpy # unbounded copy!
9. PREVENTION
─────────────
   - Use snprintf() or strlcpy() with an explicit length limit
     (strncpy() bounds the copy but may leave the buffer unterminated)
   - Enable stack canaries: -fstack-protector-strong
   - Enable ASLR and NX/DEP
   - Use static analyzers that catch unbounded copies
   - Prefer safe string APIs (snprintf, strncat)
═══════════════════════════════════════════════════════════════
4. Solution Architecture
4.1 Project Structure
crash-cart/
├── programs/                  # Crash-inducing programs
│   ├── 01_stack_buffer_overflow.c
│   ├── 02_null_pointer.c
│   ├── 03_use_after_free.c
│   ├── 04_double_free.c
│   ├── 05_stack_exhaustion.c
│   ├── 06_uninitialized.c
│   ├── 07_off_by_one.c
│   ├── 08_format_string.c
│   ├── 09_integer_overflow.c
│   └── 10_misaligned.c
├── reports/                   # Completed post-mortem reports
│   ├── 01_stack_buffer_overflow.md
│   ├── 02_null_pointer.md
│   └── ...
├── scripts/
│   ├── crash_analyze.sh       # Run program, capture crash
│   ├── gdb_commands.txt       # GDB automation commands
│   ├── parse_registers.py     # Extract register values
│   └── generate_report.py     # Create report template
├── templates/
│   └── report_template.md     # Empty report template
├── Makefile
└── README.md
4.2 GDB Automation
gdb_commands.txt - Commands to run on crash:
# Disable pagination for batch mode
set pagination off
set confirm off
# Run the program
run
# When it crashes, collect information
printf "\n=== REGISTERS ===\n"
info registers
printf "\n=== FAULTING INSTRUCTION ===\n"
x/1i $rip
printf "\n=== STACK TRACE ===\n"
bt
printf "\n=== STACK CONTENTS ===\n"
x/32gx $rsp
printf "\n=== CURRENT FRAME ===\n"
info frame
printf "\n=== DISASSEMBLY ===\n"
disassemble
printf "\n=== MEMORY MAPS ===\n"
info proc mappings
quit
Usage:
gdb -batch -x gdb_commands.txt ./crashme > crash_output.txt 2>&1
4.3 Essential GDB Commands Reference
───────────────────────────────────────────────────────────────────────
 GDB Commands for Crash Analysis
───────────────────────────────────────────────────────────────────────

 EXAMINING REGISTERS
 ───────────────────
   info registers         Show all general-purpose registers
   info all-registers     Show all registers including FP/vector
   p/x $rax               Print %rax in hex
   p/d $rdi               Print %rdi in decimal
   p (char*)$rsi          Interpret %rsi as string pointer

 EXAMINING MEMORY
 ────────────────
   x/Nfz address          Examine memory:
                            N = count
                            f = format (x=hex, d=decimal, s=string,
                                i=instruction, c=char)
                            z = size (b=byte, h=half, w=word, g=giant)

   x/32gx $rsp            32 giant (8-byte) words at stack pointer
   x/10i $rip             10 instructions at instruction pointer
   x/s $rdi               String at first argument
   x/20wx 0x7fff...       20 words (4-byte) at address

 STACK NAVIGATION
 ────────────────
   bt                     Backtrace (show call stack)
   bt full                Backtrace with local variables
   frame N                Select frame N
   up / down              Move up/down the call stack
   info frame             Detailed info about current frame
   info locals            Show local variables
   info args              Show function arguments

 DISASSEMBLY
 ───────────
   disassemble            Disassemble current function
   disas /r               With raw bytes
   disas /m               Mixed with source (if available)
   disas function_name    Disassemble specific function
   x/20i $rip-40          Instructions around crash point (may misdecode:
                          x86 instructions are variable-length)

 BREAKPOINTS & CONTROL
 ─────────────────────
   break *0x401234        Break at address
   break function         Break at function entry
   break file.c:42        Break at source line
   watch *0x7fff...       Break when memory changes
   catch signal SIGSEGV   Break on signal

 USEFUL SHORTCUTS
 ────────────────
   p/x $rbp+8             Calculate return address location
   x/a $rbp+8             Show return address
   x/s *(char**)($rsp)    Dereference string pointer on stack

───────────────────────────────────────────────────────────────────────
5. Implementation Guide
5.1 Development Environment Setup
# Required tools
# On Ubuntu/Debian:
sudo apt-get install gcc gdb build-essential
# On macOS:
xcode-select --install
brew install gdb # Note: macOS gdb requires code signing
# Verify
gcc --version
gdb --version
# Create project structure
mkdir -p crash-cart/{programs,reports,scripts,templates}
cd crash-cart
5.2 Makefile
CC = gcc
# Compile with debug symbols, no optimization, no stack protector
CFLAGS = -g -O0 -fno-stack-protector -fno-pie -no-pie
# Enable all warnings
CFLAGS += -Wall -Wextra
# Source files
SOURCES = $(wildcard programs/*.c)
TARGETS = $(SOURCES:.c=)
.PHONY: all clean analyze
all: $(TARGETS)
programs/%: programs/%.c
	$(CC) $(CFLAGS) -o $@ $<
# Run crash analysis on a specific program
analyze: programs/$(PROG)
	./scripts/crash_analyze.sh programs/$(PROG)
# Remove only built binaries and crash dumps, never the .c sources
# (a glob such as programs/01_* would also delete 01_stack_buffer_overflow.c)
clean:
	rm -f $(TARGETS) $(addsuffix .crash_dump.txt,$(TARGETS))
5.3 Implementation Phases
Phase 1: Foundation (Days 1-2)
Goals:
- Set up project structure
- Create first crash program (NULL pointer)
- Establish GDB workflow
- Create report template
Task 1.1: NULL Pointer Dereference
// programs/02_null_pointer.c
#include <stdio.h>
#include <stdlib.h>
struct data {
int value;
char *name;
};
struct data *get_data(int should_fail) {
if (should_fail) {
return NULL; // Simulate failed allocation or lookup
}
struct data *d = malloc(sizeof(struct data));
d->value = 42;
d->name = "valid";
return d;
}
void process(struct data *d) {
// Bug: No NULL check before dereference
printf("Value: %d\n", d->value); // Crash here if d is NULL
}
int main(int argc, char *argv[]) {
int fail = (argc > 1); // Fail if any argument given
struct data *d = get_data(fail);
process(d);
return 0;
}
Task 1.2: GDB Analysis Script
#!/bin/bash
# scripts/crash_analyze.sh
# Usage: crash_analyze.sh <program> [program args...]
PROG="$1"
if [ -z "$PROG" ]; then
    echo "Usage: $0 <program> [args...]"
    exit 1
fi
shift # the remaining "$@" are passed to the program under GDB
OUTPUT="${PROG}.crash_dump.txt"
cat > /tmp/gdb_crash.txt << 'EOF'
set pagination off
set confirm off
run
printf "\n========== SIGNAL INFO ==========\n"
# $_siginfo describes the signal that stopped the program
p $_siginfo.si_signo
p $_siginfo._sifields._sigfault.si_addr
printf "\n========== REGISTERS ==========\n"
info registers
printf "\n========== BACKTRACE ==========\n"
bt full
printf "\n========== STACK (16 quadwords) ==========\n"
x/16gx $rsp
printf "\n========== FRAME INFO ==========\n"
info frame
# Instruction dumps go last: an error (for example an unmapped $rip after
# a smashed return address) stops the rest of this command file.
printf "\n========== FAULTING INSTRUCTION ==========\n"
x/5i $rip
printf "\n========== DISASSEMBLY ==========\n"
disassemble
quit
EOF
echo "Analyzing crash in: $PROG"
echo "Output: $OUTPUT"
gdb -batch -x /tmp/gdb_crash.txt --args "$PROG" "$@" > "$OUTPUT" 2>&1
echo "Done. Check $OUTPUT for crash details."
Checkpoint: Run ./scripts/crash_analyze.sh programs/02_null_pointer fail (the program only crashes when it is given an argument) and verify that the dump contains the register values and crash info.
Phase 2: Crash Program Suite (Days 3-6)
Task 2.1: Stack Buffer Overflow
// programs/01_stack_buffer_overflow.c
#include <stdio.h>
#include <string.h>
void vulnerable_function(char *input) {
char buffer[32];
strcpy(buffer, input); // No bounds check!
printf("Copied: %s\n", buffer);
}
int main(int argc, char *argv[]) {
if (argc < 2) {
printf("Usage: %s <input>\n", argv[0]);
printf("Try: %s $(python3 -c \"print('A'*64)\")\n", argv[0]);
return 1;
}
vulnerable_function(argv[1]);
printf("Returned safely (this shouldn't print with overflow)\n");
return 0;
}
Task 2.2: Use-After-Free
// programs/03_use_after_free.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct user {
char name[32];
int id;
};
int main(void) {
struct user *u = malloc(sizeof(struct user));
strcpy(u->name, "Alice");
u->id = 1001;
printf("Before free: %s (id=%d)\n", u->name, u->id);
free(u); // Free the memory
// Some other allocation might reuse this memory
char *other = malloc(100);
memset(other, 'X', 100);
// Use after free - u points to freed (and now corrupted) memory
printf("After free: %s (id=%d)\n", u->name, u->id); // UB!
return 0;
}
Task 2.3: Double Free
// programs/04_double_free.c
#include <stdlib.h>
int main(void) {
char *ptr = malloc(100);
free(ptr);
free(ptr); // Double free!
return 0;
}
Task 2.4: Stack Exhaustion
// programs/05_stack_exhaustion.c
#include <stdio.h>
int recurse(int depth) {
char buffer[4096]; // 4KB per frame
buffer[0] = depth; // Use buffer to prevent optimization
printf("Depth: %d\n", depth);
return recurse(depth + 1); // Infinite recursion
}
int main(void) {
return recurse(0);
}
Task 2.5: Uninitialized Variable
// programs/06_uninitialized.c
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int *ptr; // Uninitialized pointer
// Some path doesn't initialize ptr
int condition = rand() % 2;
if (condition) {
ptr = malloc(sizeof(int));
*ptr = 42;
}
// If condition is 0, ptr is garbage
printf("Value: %d\n", *ptr); // May crash with garbage pointer
return 0;
}
Task 2.6: Off-by-One
// programs/07_off_by_one.c
#include <stdio.h>
#include <string.h>
int main(void) {
char buffer[16];
int important_value = 0x12345678;
// Off-by-one: writing 17 bytes including null terminator
// into 16-byte buffer, corrupting adjacent data
strcpy(buffer, "1234567890123456"); // 16 chars + null = 17 bytes
printf("Buffer: %s\n", buffer);
printf("Important: 0x%x (should be 0x12345678)\n", important_value);
// Depending on stack layout, this may overwrite important_value
// or saved registers
return 0;
}
Task 2.7: Format String
// programs/08_format_string.c
#include <stdio.h>
void vulnerable(char *user_input) {
printf(user_input); // Format string vulnerability!
printf("\n");
}
int main(int argc, char *argv[]) {
if (argc < 2) {
printf("Usage: %s <input>\n", argv[0]);
printf("Try: %s '%%x.%%x.%%x.%%x'\n", argv[0]);
printf("Or: %s '%%s' (may crash)\n", argv[0]);
return 1;
}
vulnerable(argv[1]);
return 0;
}
Task 2.8: Integer Overflow
// programs/09_integer_overflow.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(void) {
// Simulate size calculation overflow
uint32_t count = 0x40000001; // About 1 billion
uint32_t size = 4; // 4 bytes each
// Overflow: count * size = 0x100000004 but wraps to 4
uint32_t total = count * size;
printf("Count: %u, Size: %u, Total: %u\n", count, size, total);
// Allocate based on overflowed value
char *buf = malloc(total); // Only allocates 4 bytes!
// Try to use it as if it were large
for (uint32_t i = 0; i < count && i < 1000; i++) {
buf[i * size] = 'A'; // Out-of-bounds access!
}
return 0;
}
Task 2.9: Misaligned Access
// programs/10_misaligned.c
#include <stdio.h>
#include <stdint.h>
int main(void) {
char buffer[16];
// Create misaligned pointer for 8-byte access
uint64_t *misaligned = (uint64_t *)(buffer + 1);
// This may crash on strict-alignment architectures
// or cause performance penalty on x86-64
*misaligned = 0x123456789ABCDEF0ULL;
printf("Value: 0x%lx\n", *misaligned);
return 0;
}
Checkpoint: All programs compile with make. Each triggers its specific crash mode.
Phase 3: Report Writing (Days 7-10)
For each crash program:
- Run the crash analysis script
- Fill in the post-mortem template
- Include:
- Exact register values
- Stack frame diagram
- Root cause explanation
- Assembly patterns that identify the bug
Example workflow:
# 1. Compile with debug info, no protections
make PROG=01_stack_buffer_overflow
# 2. Trigger the crash
./programs/01_stack_buffer_overflow $(python3 -c "print('A'*64)")
# 3. Analyze with GDB
./scripts/crash_analyze.sh ./programs/01_stack_buffer_overflow "$(python3 -c "print('A'*64)")"
# 4. Open the dump and template, write the report
cat programs/01_stack_buffer_overflow.crash_dump.txt
vim reports/01_stack_buffer_overflow.md
Phase 4: Pattern Recognition (Days 11-14)
Goal: Practice recognizing crash types from assembly alone.
Exercise: Given only disassembly output, identify:
- What kind of bug this is
- Where the bug manifests
- What the fix should be
Practice scenarios:
Scenario A:
vulnerable:
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp
movq %rdi, -24(%rbp)
movq -24(%rbp), %rax
movq %rax, %rsi
leaq -16(%rbp), %rdi # <- 16-byte buffer
call strcpy # <- Unbounded copy!
leave
ret
Identify: Stack buffer overflow (16-byte buffer, strcpy has no limit)
Scenario B:
process:
pushq %rbp
movq %rsp, %rbp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl (%rax), %eax # <- Dereference without check
...
Identify: Missing NULL check before dereference
Scenario C:
danger:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rdi
call printf # <- User input as format!
Identify: Format string vulnerability (first arg to printf is user-controlled)
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | How to Test |
|---|---|---|
| Crash Reproduction | Each program crashes as expected | Run and verify signal received |
| Report Completeness | Reports contain all required sections | Checklist verification |
| Assembly Accuracy | Disassembly matches source behavior | Manual verification |
| Pattern Recognition | Can identify bug from asm alone | Blind test with new programs |
6.2 Crash Reproduction Tests
#!/bin/bash
# scripts/test_crashes.sh
test_crash() {
local prog="$1"
local args="$2"
local expected_signal="$3"
echo -n "Testing $prog... "
# Run and capture exit code
timeout 5 "$prog" $args >/dev/null 2>&1
local exit_code=$?
# Signals result in exit code = 128 + signal number
# SIGSEGV = 11, so exit 139
# SIGABRT = 6, so exit 134
case $expected_signal in
SIGSEGV) expected_exit=139 ;;
SIGABRT) expected_exit=134 ;;
SIGBUS) expected_exit=135 ;; # 128 + 7 on Linux (SIGBUS is 10 on macOS, giving 138)
SIGFPE) expected_exit=136 ;;
*) expected_exit=1 ;;
esac
if [ $exit_code -eq $expected_exit ]; then
echo "PASS (exit $exit_code)"
return 0
else
echo "FAIL (expected exit $expected_exit, got $exit_code)"
return 1
fi
}
# Run tests
test_crash "./programs/01_stack_buffer_overflow" "$(python3 -c "print('A'*64)")" "SIGSEGV"
test_crash "./programs/02_null_pointer" "fail" "SIGSEGV"
test_crash "./programs/04_double_free" "" "SIGABRT"
test_crash "./programs/05_stack_exhaustion" "" "SIGSEGV"
# ... add more tests
6.3 Report Quality Checklist
For each report, verify:
- Crash signal correctly identified
- All registers documented
- Faulting instruction explained
- Stack trace included
- Stack frame diagram accurate
- Root cause clearly explained
- Vulnerability classification correct
- Assembly signature described
- Prevention measures listed
6.4 Blind Testing
Have someone else create a crashing program. Given only:
- The executable
- A way to trigger the crash
- GDB access
Write a complete post-mortem without seeing the source code.
7. Common Pitfalls
7.1 Analysis Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Trusting %rbp when it's corrupted | Backtrace makes no sense | Check whether %rbp looks like a valid stack address |
| Missing red zone locals | Can't find local variables | Leaf functions may keep locals in the 128 bytes below %rsp; examine memory starting at $rsp - 128 |
| Forgetting PIE | Addresses don't match objdump | Use info proc mappings to find base address |
| Wrong endianness in memory | Values look scrambled | x86-64 is little-endian |
| Optimized code confusion | Variables "optimized out" | Compile with -O0 -g for learning |
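The endianness pitfall is easiest to see with a concrete value. The sketch below reassembles a 64-bit value from the bytes as they sit in memory, lowest address first, which is the order a byte-wise GDB dump such as `x/8xb` shows them. `decode_le64` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reassemble a 64-bit value from 8 bytes in memory order (lowest address first). */
uint64_t decode_le64(const unsigned char bytes[8]) {
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++)
        value |= (uint64_t)bytes[i] << (8 * i);   /* byte i supplies bits 8i..8i+7 */
    return value;
}
```

For example, the bytes `00 e2 ff ff ff 7f 00 00` decode to the stack address 0x00007fffffffe200, even though they look reversed when read left to right.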
7.2 GDB Issues
Problem: GDB doesn't show source lines
Solution: Compile with -g and ensure source files are accessible
Problem: Can't set breakpoints
Solution: Check if binary has debug symbols: file <binary>
Problem: Stack trace shows "??" frames
Solution: The frame pointer may be omitted. Recompile with -fno-omit-frame-pointer so bt can follow the %rbp chain, or rely on DWARF unwinding
Problem: Can't examine memory after crash
Solution: The process is still "alive" in GDB. Use x/ commands normally.
7.3 Platform Differences
| Aspect | Linux | macOS |
|---|---|---|
| ABI | System V AMD64 | System V AMD64 |
| Stack protector | Default on in many distros | Default on |
| ASLR | Enabled by default | Enabled by default |
| Debugger | gdb (native) | lldb (native), gdb (needs signing) |
| Red zone | 128 bytes | 128 bytes |
macOS GDB Signing (if using gdb instead of lldb):
# Create a certificate and sign gdb
# See: https://sourceware.org/gdb/wiki/PermissionsDarwin
8. Extensions
8.1 Beginner Extensions
- Add more crash types: Null function pointer call, signed integer overflow
- Colorize reports: Use ANSI colors to highlight important values
- Create crash quizzes: Given a register dump, identify the bug type
8.2 Intermediate Extensions
- Stack canary analysis: Enable -fstack-protector and show how it detects overflow
- ASLR demonstration: Show address randomization across runs
- Heap corruption detection: Add ASAN/Valgrind analysis to reports
- Automated pattern matching: Script that suggests vulnerability type from crash dump
8.3 Advanced Extensions
- ROP gadget finder: Locate usable code sequences for return-oriented programming
- Crash reproduction from core dump: Analyze core files without live debugging
- Cross-architecture analysis: Apply same methodology to ARM64
- Integration with fuzzing: Use AFL/libFuzzer to generate crash-inducing inputs
- Symbolic execution preview: Connect crash to path constraints
9. Real-World Connections
9.1 Industry Applications
Security Research & Bug Bounties: Every serious vulnerability report requires crash analysis. The skills here directly apply to:
- Chrome/Firefox security bugs
- Linux kernel vulnerabilities
- IoT device exploitation
Incident Response: When production systems crash:
- Analyze core dumps from customer environments
- Determine root cause without source access
- Write post-mortems for engineering teams
Compiler Development: Understanding calling conventions is essential for:
- Implementing new language features
- Debugging code generation bugs
- Optimizing function calls
Embedded Systems: Resource-constrained systems often lack full debuggers:
- Must understand raw memory dumps
- Debug without symbols
- Analyze boot-time crashes
9.2 Related Tools
| Tool | Purpose |
|---|---|
| AddressSanitizer (ASAN) | Runtime memory error detection |
| Valgrind | Memory debugging and profiling |
| GDB | Primary interactive debugger |
| LLDB | LLVM debugger (macOS default) |
| radare2 | Reverse engineering framework |
| Ghidra | NSA's reverse engineering tool |
| pwndbg/GEF | GDB plugins for exploitation |
| Crash | Linux kernel crash dump analysis |
9.3 Interview Relevance
This project prepares you for questions like:
- "Walk me through what happens when a function is called"
- "How would you debug a segfault in production?"
- "Explain how a stack buffer overflow works"
- "What's the difference between caller-saved and callee-saved registers?"
- "How do you read a core dump?"
- "What protections exist against buffer overflows?"
10. Resources
10.1 Essential Reading
- CS:APP Chapter 3: "Machine-Level Representation of Programs"
- Sections 3.4 (Accessing Information) and 3.7 (Procedures) are critical
- System V AMD64 ABI Specification: The authoritative reference
- Intel 64 and IA-32 Architectures Software Developer's Manual: Volume 1 covers the hardware side of procedure calls (the stack, CALL, and RET); the register conventions themselves come from the ABI
10.2 GDB Resources
- GDB Manual: https://sourceware.org/gdb/current/onlinedocs/gdb/
- GDB Cheat Sheet: https://darkdust.net/files/GDB%20Cheat%20Sheet.pdf
- pwndbg: Enhanced GDB for debugging and exploitation
- GEF: GDB Enhanced Features
10.3 x86-64 Reference
- x86-64 Instruction Reference: https://www.felixcloutier.com/x86/
- Compiler Explorer: See how C compiles to assembly
- x86-64 Register Usage: Quick reference
10.4 Security Resources
- CWE (Common Weakness Enumeration): Vulnerability classification
- Smashing the Stack for Fun and Profit: Classic buffer overflow paper
- ROP Emporium: Practice return-oriented programming
10.5 Related Projects in This Series
- Previous: P2 (Bitwise Data Inspector), P3 (Data Lab Clone)
- Next: P5 (Bomb Lab Workflow) - Apply debugging skills to reverse engineering
- Related: P6 (Attack Lab) - Use crash analysis for exploitation
11. Self-Assessment Checklist
Before considering this project complete, verify:
Understanding
- I can list the 6 argument registers in order
- I can explain the difference between caller-saved and callee-saved registers
- I can draw a complete stack frame from memory
- I understand why the stack must be 16-byte aligned before call
- I can explain what the red zone is and when it's used
- I know how structs are laid out in memory including padding
- I can identify function prologues and epilogues in assembly
- I understand how loops and conditionals appear in assembly
Implementation
- All 10 crash programs compile and crash as expected
- GDB automation script captures complete crash information
- At least 5 complete post-mortem reports written
- Reports include accurate stack frame diagrams
- Reports correctly identify vulnerability classes
Skill Demonstration
- Given a crash dump, I can identify the faulting instruction
- I can trace back from crash address to the source-level bug
- I can recognize common vulnerability patterns from assembly
- I can use GDB commands fluently without reference
- I can explain crashes without guessing
Growth Indicators
- I debugged at least one "mystery" crash using only machine state
- I can read compiler-generated assembly and understand its purpose
- I catch myself thinking about stack layout when writing C code
- I understand why certain coding patterns are dangerous
12. Submission / Completion Criteria
Minimum Viable Completion:
- At least 5 crash programs implemented
- At least 3 complete post-mortem reports
- GDB automation script works
- Can explain stack frame layout accurately
Full Completion:
- All 10 crash programs implemented
- All 10 post-mortem reports complete
- Pattern recognition exercises completed
- Can analyze crashes without looking at source
Excellence (Going Above & Beyond):
- Cross-platform (Linux + macOS) analysis
- pwndbg/GEF integration
- Automated vulnerability classification
- Core dump analysis without live debugging
- Extended to ARM64 calling convention
13. Real World Outcome
When you complete this project, here's exactly what you'll see when running your crash analysis toolkit:
$ ./programs/01_stack_buffer_overflow $(python3 -c "print('A'*64)")
Copied: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
$ ./scripts/crash_analyze.sh ./programs/01_stack_buffer_overflow
================================================================================
CRASH CART ANALYSIS REPORT
================================================================================
CRASH IDENTIFICATION
--------------------------------------------------------------------------------
Program: ./programs/01_stack_buffer_overflow
Arguments: AAAA...(64 bytes)
Signal: SIGSEGV (Segmentation fault)
Fault address: 0x0000414141414141
REGISTER STATE AT CRASH
--------------------------------------------------------------------------------
%rax = 0x0000000000000000 (return value: 0)
%rbx = 0x00007fffffffdeb8 (callee-saved, preserved)
%rcx = 0x0000000000000041 ('A' character, from strcpy)
%rdx = 0x00007fffffffdf00 (string pointer)
%rsi = 0x00007fffffffe200 (original source string)
%rdi = 0x00007fffffffddc0 (destination buffer address)
%rbp = 0x4141414141414141 [CORRUPTED! Contains "AAAAAAAA"]
%rsp = 0x00007fffffffddf8 (valid stack pointer)
%rip = 0x0000414141414141 [INVALID! Contains "AAAAAA"]
Analysis: %rip and %rbp contain 0x41 bytes ('A'), indicating the return
address and saved frame pointer were overwritten by the overflow.
FAULTING INSTRUCTION
--------------------------------------------------------------------------------
Cannot disassemble at 0x0000414141414141 - address not mapped!
What happened:
The 'ret' instruction at end of vulnerable_function popped the
overwritten return address (0x0000414141414141) into %rip.
CPU then tried to fetch instruction at that invalid address.
STACK TRACE
--------------------------------------------------------------------------------
Cannot unwind - frame pointer chain is corrupted.
Last known good frame:
vulnerable_function() at buffer_overflow.c:12
Called from main() at buffer_overflow.c:20
STACK FRAME ANALYSIS
--------------------------------------------------------------------------------
BEFORE OVERFLOW (intended layout):
+------------------+---------------------------+
| Address | Content |
+------------------+---------------------------+
| rbp+8 | Return address -> main+42 |
| rbp | Saved %rbp -> 0x7fff... |
| rbp-32 | char buffer[32] |
+------------------+---------------------------+
AFTER OVERFLOW (64 bytes written):
+------------------+---------------------------+
| Address | Content |
+------------------+---------------------------+
| rbp+8 | 0x4141414141414141 (AAAA) | <- OVERWRITTEN!
| rbp | 0x4141414141414141 (AAAA) | <- OVERWRITTEN!
| rbp-32 | AAAAAAAAAAAAAAAAAAA... |
+------------------+---------------------------+
Stack bytes at crash:
0x7fffffffddc0: 41 41 41 41 41 41 41 41 AAAAAAAA
0x7fffffffddc8: 41 41 41 41 41 41 41 41 AAAAAAAA
0x7fffffffddd0: 41 41 41 41 41 41 41 41 AAAAAAAA
0x7fffffffddd8: 41 41 41 41 41 41 41 41 AAAAAAAA
0x7fffffffdde0: 41 41 41 41 41 41 41 41 <- saved %rbp (corrupted)
0x7fffffffdde8: 41 41 41 41 41 41 41 41 <- return address (corrupted)
ROOT CAUSE
--------------------------------------------------------------------------------
The function vulnerable_function() uses strcpy() to copy user input
into a 32-byte stack buffer without bounds checking.
When the input exceeds 32 bytes:
- Bytes 1-32: Fill the buffer (intended)
- Bytes 33-40: Overwrite saved %rbp
- Bytes 41-48: Overwrite return address
- Bytes 49+: Overwrite caller's stack frame
Upon function return, 'ret' pops the corrupted value instead of the
address in main(), and the CPU faults rather than resuming the caller.
VULNERABILITY CLASSIFICATION
--------------------------------------------------------------------------------
CWE ID: CWE-121 (Stack-based Buffer Overflow)
CVSS: High (can lead to arbitrary code execution)
Exploitable: YES
An attacker who controls the input can:
1. Redirect execution to arbitrary address (by crafting return address)
2. Execute shellcode (if placed on executable stack)
3. Perform ROP attack (by chaining gadget addresses)
ASSEMBLY SIGNATURE
--------------------------------------------------------------------------------
Vulnerable pattern in disassembly:
4011a0: lea -0x20(%rbp),%rdi # buffer at rbp-32 (only 32 bytes!)
4011a4: mov %rax,%rsi # user-controlled source pointer
4011a7: call 401030 <strcpy@plt> # UNBOUNDED COPY!
Red flags:
- No length check before strcpy
- No __stack_chk_fail reference (no stack canary)
- Buffer size (0x20 = 32) smaller than potential input
PREVENTION
--------------------------------------------------------------------------------
1. Use strncpy() or strlcpy() with explicit size limit:
strncpy(buffer, input, sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0';
2. Enable stack canaries: -fstack-protector-strong
3. Use safe string functions (snprintf, strlcpy)
4. Enable ASLR and NX/DEP (defense in depth)
5. Use static analysis tools (cppcheck, Coverity)
================================================================================
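When %rip or a return address is full of input bytes, it is worth checking whether the value is even a canonical address. The sketch below assumes 48-bit virtual addresses (4-level paging, the common configuration); `is_canonical_48` is a hypothetical helper. A value such as 0x0000414141414141 is canonical, so ret can load it into %rip and the fault happens on the instruction fetch. A return address of 0x4141414141414141 is not canonical; on real hardware the ret itself typically faults, leaving %rip at the ret instruction and the corrupted value at the top of the stack.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical only if
 * bits 63..47 are all zeros or all ones.
 */
bool is_canonical_48(uint64_t addr) {
    uint64_t top = addr >> 47;          /* the 17 bits that must agree */
    return top == 0 || top == 0x1ffff;
}
```

If a report shows %rip inside the vulnerable function while the word at (%rsp) holds the attacker's bytes, this check usually explains why.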
14. The Core Question You're Answering
"When a program crashes, how do I read the machine state (registers, stack, memory) and trace backwards to understand exactly what went wrong and why?"
This project transforms debugging from "something happened and I don't know what" into a systematic forensic process. You'll learn to read crash dumps like a detective reads a crime scene: every register value, stack byte, and instruction tells part of the story.
15. Concepts You Must Understand First
Before starting this project, ensure you understand these concepts:
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Function call mechanics (call/ret) | Core of stack frame understanding | CS:APP 3.7.2 |
| Stack grows downward | Essential for understanding buffer overflows | CS:APP 3.4.4 |
| What "push" and "pop" do | Stack manipulation | CS:APP 3.4.2 |
| Basic x86-64 registers | You'll interpret register dumps | CS:APP 3.4.1 |
| What a pointer is | Every address is a pointer | CS:APP 3.8 |
| How arrays work in C | Buffer overflows involve arrays | CS:APP 3.8.1 |
| Basic GDB usage (run, break, print) | Your primary tool | GDB manual |
16. Questions to Guide Your Design
Work through these questions BEFORE writing code:
- Crash Triggering: How do you reliably trigger each type of crash? What inputs or conditions are needed?
- GDB Automation: How do you script GDB to capture crash state non-interactively? What's the output format?
- Stack Walking: How do you walk the stack frame chain? What if %rbp is corrupted?
- Symbol Resolution: How do you map an address (like 0x401234) back to a function name and source line?
- Report Format: What information is essential in a post-mortem? What's nice-to-have?
- Cross-Platform: Will your toolkit work on both Linux (GDB) and macOS (LLDB)? How do they differ?
- Compiler Options: What compiler flags affect crash behavior? (-g, -O0, -fno-stack-protector, -fno-pie)
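For the stack-walking question, one possible approach is to walk the saved-%rbp chain over a captured snapshot of stack memory and stop as soon as a frame pointer looks implausible. This is a minimal sketch that assumes every function keeps a frame pointer; `walk_frames` and its parameters are hypothetical, and the sanity checks (8-byte alignment, inside the snapshot, strictly increasing) are heuristics rather than a complete unwinder.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the saved-%rbp chain inside a captured stack snapshot.
 * words[i] holds the 8-byte value at address base + 8*i.
 * Stores up to max_out return addresses in out[] and returns how many
 * were recovered before the chain ended or looked corrupted.
 */
size_t walk_frames(const uint64_t *words, size_t nwords, uint64_t base,
                   uint64_t rbp, uint64_t *out, size_t max_out) {
    size_t count = 0;
    while (count < max_out) {
        /* A plausible frame pointer is 8-byte aligned and inside the snapshot. */
        if (rbp % 8 != 0 || rbp < base)
            break;
        size_t idx = (size_t)((rbp - base) / 8);
        if (idx + 1 >= nwords)
            break;
        uint64_t saved_rbp = words[idx];   /* value at (%rbp)  */
        out[count++] = words[idx + 1];     /* value at 8(%rbp) */
        /* Callers live at higher addresses, so the chain must increase. */
        if (saved_rbp <= rbp)
            break;
        rbp = saved_rbp;
    }
    return count;
}
```

A frame pointer of 0x4141414141414141 fails the alignment check on the first iteration, so the walk returns zero frames instead of following garbage.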
17. Thinking Exercise
Before writing any code, work through this crash analysis by hand:
Given this register dump from a crash:
rax 0x0
rbx 0x7fffffffdeb8
rcx 0x40
rdx 0x7fffffffde00
rsi 0x7fffffffe200
rdi 0x0
rbp 0x7fffffffdde0
rsp 0x7fffffffddc0
rip 0x401234
And this stack contents:
0x7fffffffddc0: 0x0000000000000001
0x7fffffffddc8: 0x00007fffffffe200
0x7fffffffddd0: 0x0000000000000000
0x7fffffffddd8: 0x0000000000000040
0x7fffffffdde0: 0x00007fffffffde00 <- %rbp points here
0x7fffffffdde8: 0x00000000004011a0 <- return address
And the faulting instruction:
0x401234: movl (%rdi), %eax
Questions to answer:
- What operation did the instruction try to perform?
  - movl (%rdi), %eax means: load 4 bytes from address in %rdi into %eax
- What was in %rdi?
  - %rdi = 0x0 (NULL pointer!)
- Why did this crash?
  - Attempted to read from address 0x0, which is not mapped
- What type of bug is this?
  - NULL pointer dereference
- What was the function probably trying to do?
  - Access the first element of a struct or array passed via %rdi (first argument)
- What's at the return address (0x4011a0)?
  - The caller: check the disassembly to see which function called this one
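At the source level, an instruction like `movl (%rdi), %eax` is what a compiler commonly emits for reading the first 4-byte field through a pointer passed as the first argument. The sketch below is a guess at what such a function might look like; `struct record`, `record_id`, and `record_id_checked` are hypothetical and not taken from the crash programs.

```c
#include <stddef.h>

struct record {
    int id;        /* offset 0: read with a 4-byte load through the pointer */
    int flags;
};

/* Unchecked: dereferences r directly, so a NULL argument crashes with SIGSEGV. */
int record_id(const struct record *r) {
    return r->id;
}

/* Checked variant: reports failure instead of dereferencing NULL. */
int record_id_checked(const struct record *r, int *out) {
    if (r == NULL || out == NULL)
        return -1;
    *out = r->id;
    return 0;
}
```

Comparing the disassembly of the two functions shows the extra test-and-branch that the checked version adds before the load.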
18. The Interview Questions They'll Ask
After completing this project, you'll be ready for these common interview questions:
- "Walk me through what happens when a function is called."
  - Expected:
    - Arguments placed in %rdi, %rsi, %rdx, %rcx, %r8, %r9 (or stack)
    - call pushes return address and jumps
    - Callee pushes %rbp, sets up new frame
    - Local variables allocated on stack
  - Bonus: Mention 16-byte alignment requirement, red zone
- "How would you debug a segfault in a production system?"
  - Expected: Analyze core dump with GDB, check registers and stack, use bt for backtrace
  - Bonus: Mention AddressSanitizer, Valgrind for memory issues
- "Explain how a stack buffer overflow works."
  - Expected: Writing past buffer end overwrites saved %rbp and return address
  - Bonus: Explain how this leads to control flow hijacking, mention mitigations (canaries, ASLR)
- "What's the difference between caller-saved and callee-saved registers?"
  - Expected: Caller-saved (%rax, %rcx, %rdx, %rsi, %rdi, %r8-r11) may be trashed by callee; callee-saved (%rbx, %rbp, %r12-r15) must be preserved
  - Bonus: Explain why this matters for optimization
- "How do you read a core dump?"
  - Expected: Use GDB with gdb program core, examine registers, stack, memory
  - Bonus: Know how to enable core dumps (ulimit -c unlimited)
- "What protections exist against buffer overflows?"
  - Expected: Stack canaries, ASLR, NX/DEP, PIE
  - Bonus: Explain how each works and can be bypassed
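The argument-passing answer can be turned into a small lookup. This sketch covers only integer and pointer arguments (floating-point and aggregate arguments follow other rules) and assumes the callee runs the usual `push %rbp; mov %rsp,%rbp` prologue. `int_arg_register` and `stack_arg_rbp_offset` are hypothetical helper names.

```c
#include <stddef.h>

/* Integer/pointer argument registers in System V AMD64 order. */
static const char *const INT_ARG_REGS[6] = {
    "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
};

/*
 * Register holding the index-th (0-based) integer argument,
 * or NULL if that argument is passed on the stack.
 */
const char *int_arg_register(int index) {
    if (index < 0 || index >= 6)
        return NULL;
    return INT_ARG_REGS[index];
}

/*
 * Offset from %rbp of a stack-passed integer argument inside the callee,
 * assuming a standard frame-pointer prologue. Returns -1 for arguments
 * that travel in registers.
 */
int stack_arg_rbp_offset(int index) {
    if (index < 6)
        return -1;
    return 16 + 8 * (index - 6);   /* skip saved %rbp and the return address */
}
```

Under these assumptions the seventh integer argument (index 6) sits at 16(%rbp) and the eighth at 24(%rbp).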
19. Hints in Layers
If you're stuck, reveal hints one at a time:
Hint 1: GDB Batch Mode
You can script GDB to run non-interactively:
gdb -batch -x commands.txt ./program
Where commands.txt contains:
set pagination off
run
info registers
bt
x/20gx $rsp
quit
This captures crash state without user interaction.
Hint 2: Disabling Security Features for Learning
To see "classic" crashes without modern protections:
# Compile without stack protector
gcc -fno-stack-protector program.c
# Disable ASLR (Linux, temporarily)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
# Compile without PIE
gcc -no-pie -fno-pie program.c
# Enable core dumps
ulimit -c unlimited
Remember to re-enable protections after!
Hint 3: Reading the Stack
In GDB, examine stack memory:
x/20gx $rsp # 20 8-byte values starting at stack pointer
x/20wx $rsp # 20 4-byte values
x/s $rdi # String at address in %rdi
x/10i $rip # 10 instructions starting at instruction pointer
Key pattern: return address is at $rbp + 8 (if frame pointer used).
Hint 4: Identifying Crash Types
Quick identification guide:
| Symptom | Likely Cause |
|---|---|
| %rip = user-controlled bytes | Buffer overflow corrupted return address |
| %rip valid, accessing %rdi=0 | NULL pointer dereference |
| Crash in free() | Double free or use-after-free |
| Stack trace very deep | Infinite recursion |
| SIGFPE (not SIGSEGV) | Division by zero |
| Random crash location | Use-after-free, uninitialized pointer |
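The identification guide can be sketched as a first-guess classifier. Everything here is an assumption made for illustration: the `crash_facts` fields are things a script would have to extract from GDB output, and the thresholds (more than 10000 frames, fault address below 4096) are arbitrary. Treat the result as a hint to verify, not a verdict.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Facts a crash script could extract from GDB output (hypothetical fields). */
struct crash_facts {
    int signal;            /* terminating signal number */
    bool rip_mapped;       /* does %rip point into mapped memory? */
    uint64_t fault_addr;   /* address the faulting instruction touched */
    bool in_free;          /* does free() appear on the backtrace? */
    unsigned depth;        /* number of frames in the backtrace */
};

/* First-guess label following the symptom table; a heuristic only. */
const char *guess_cause(const struct crash_facts *c) {
    if (c->signal == SIGFPE)
        return "division by zero";
    if (c->in_free)
        return "double free or use-after-free";
    if (!c->rip_mapped)
        return "corrupted return address";
    if (c->depth > 10000)              /* arbitrary cutoff for "very deep" */
        return "infinite recursion";
    if (c->fault_addr < 4096)          /* access within the first page */
        return "NULL pointer dereference";
    return "use-after-free or uninitialized pointer";
}
```

The order of the checks matters: the most specific symptoms are tested first, and the catch-all label is returned only when nothing else matches.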
20. Books That Will Help
| Topic | Book | Chapter/Section |
|---|---|---|
| x86-64 registers | CS:APP 3rd Ed | Section 3.4 "Accessing Information" |
| Stack frames | CS:APP 3rd Ed | Section 3.7 "Procedures" |
| Calling conventions | CS:APP 3rd Ed | Section 3.7.3 "Data Transfer" |
| Local variables on stack | CS:APP 3rd Ed | Section 3.7.4 "Local Storage on the Stack" |
| Register conventions | CS:APP 3rd Ed | Section 3.7.5 "Local Storage in Registers" |
| Arrays and structs | CS:APP 3rd Ed | Sections 3.8-3.9 "Array Allocation and Access", "Heterogeneous Data Structures" |
| Buffer overflow attacks | CS:APP 3rd Ed | Section 3.10.3 "Thwarting Buffer Overflow Attacks" |
| Control flow patterns | CS:APP 3rd Ed | Section 3.6 "Control" |
| System V AMD64 ABI | Official ABI Document | Section 3.2 (Function Calling Sequence) |
| GDB Manual | GNU GDB Documentation | Chapters on examining data, stack |
| Practical debugging | "Debugging with GDB" | By Richard Stallman et al. |
| Security analysis | "Hacking: The Art of Exploitation" | By Jon Erickson, Ch. 2-3 |
This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.