Project 4: x86-64 Calling Convention Crash Cart

Debug crashes by reading the machine: master stack frames, registers, and assembly patterns to trace any failure back to its source.

Quick Reference

Attribute        Value
Difficulty       Advanced
Time Estimate    1-2 weeks
Language         C (Alternatives: Rust, Zig, C++)
Prerequisites    Comfort using a debugger (GDB/LLDB); Projects 1-2 recommended
Key Topics       x86-64 calling conventions, stack frames, ABI, assembly patterns, debugging
CS:APP Chapter   3

Table of Contents

  1. Learning Objectives
  2. Deep Theoretical Foundation
  3. Project Specification
  4. Solution Architecture
  5. Implementation Guide
  6. Testing Strategy
  7. Common Pitfalls
  8. Extensions
  9. Real-World Connections
  10. Resources
  11. Self-Assessment Checklist

1. Learning Objectives

By completing this project, you will:

  1. Map assembly to C constructs: Read compiler-generated assembly and understand how it implements C control flow, function calls, and data structures
  2. Explain stack layout precisely: Given any crash, draw the stack frame showing saved registers, return addresses, local variables, and arguments
  3. Master the System V AMD64 ABI: Know which registers hold arguments and return values, and which are caller-saved or callee-saved
  4. Recognize compiler patterns: Identify prologues, epilogues, loops, conditionals, and switch statements in disassembly
  5. Debug from machine state: Given a register dump and stack bytes, reconstruct what happened and why
  6. Identify vulnerability classes: Recognize buffer overflows, use-after-free, and other memory errors from assembly signatures

2. Deep Theoretical Foundation

2.1 x86-64 Register Conventions

The x86-64 architecture has 16 general-purpose 64-bit registers. The System V AMD64 ABI (used on Linux, macOS, and BSD) assigns specific purposes to each:

┌──────────────────────────────────────────────────────────────────────┐
│                   x86-64 General Purpose Registers                   │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   CALLER-SAVED (volatile) - Callee may trash these                   │
│   ────────────────────────────────────────────────                   │
│   %rax  - Return value, syscall number                               │
│   %rcx  - 4th argument (syscalls: destroyed)                         │
│   %rdx  - 3rd argument, 2nd return value                             │
│   %rsi  - 2nd argument                                               │
│   %rdi  - 1st argument                                               │
│   %r8   - 5th argument                                               │
│   %r9   - 6th argument                                               │
│   %r10  - Temporary, syscall argument                                │
│   %r11  - Temporary (destroyed by syscalls)                          │
│                                                                      │
│   CALLEE-SAVED (non-volatile) - Must be preserved across calls       │
│   ────────────────────────────────────────────────────────────       │
│   %rbx  - General purpose (preserved)                                │
│   %rbp  - Frame pointer (optional, but conventional)                 │
│   %r12  - General purpose (preserved)                                │
│   %r13  - General purpose (preserved)                                │
│   %r14  - General purpose (preserved)                                │
│   %r15  - General purpose (preserved)                                │
│                                                                      │
│   SPECIAL PURPOSE                                                    │
│   ───────────────                                                    │
│   %rsp  - Stack pointer (always preserved)                           │
│   %rip  - Instruction pointer (not directly accessible)              │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Register Size Variants

Each 64-bit register has smaller addressable portions:

┌────────────────────────────────────────────────────────────────────┐
│    63                              31              15       7     0│
├────────────────────────────────────────────────────────────────────┤
│                              %rax (64-bit)                         │
│                                   ├────────────────────────────────┤
│                                   │          %eax (32-bit)         │
│                                   │              ├─────────────────┤
│                                   │              │   %ax (16-bit)  │
│                                   │              │      ├──────────┤
│                                   │              │  %ah │   %al    │
└────────────────────────────────────────────────────────────────────┘

Key insight: Writing to %eax ZEROS the upper 32 bits of %rax.
             Writing to %ax, %ah, or %al preserves the upper bits.
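
The sub-register rule is easy to misremember when reading a register dump, so here is a minimal C model of it. The function names are illustrative, not part of any library; each takes the old 64-bit value of %rax and returns what the full register would hold after a write to the named sub-register.

```c
#include <stdint.h>

/* Model: write to %eax. The old contents are discarded and the
 * upper 32 bits of %rax become zero. */
uint64_t write_eax(uint64_t rax, uint32_t v) {
    (void)rax;
    return (uint64_t)v;
}

/* Model: write to %ax. Bits 63..16 of %rax are preserved. */
uint64_t write_ax(uint64_t rax, uint16_t v) {
    return (rax & ~UINT64_C(0xFFFF)) | v;
}

/* Model: write to %al. Bits 63..8 of %rax are preserved. */
uint64_t write_al(uint64_t rax, uint8_t v) {
    return (rax & ~UINT64_C(0xFF)) | v;
}

/* Model: write to %ah. Only bits 15..8 of %rax change. */
uint64_t write_ah(uint64_t rax, uint8_t v) {
    return (rax & ~UINT64_C(0xFF00)) | ((uint64_t)v << 8);
}
```

Starting from 0x1122334455667788, a write of 0xAABBCCDD to %eax leaves 0x00000000AABBCCDD, while a write of 0xAABB to %ax leaves 0x112233445566AABB. Stale upper bytes after an 8-bit or 16-bit write are a common source of confusing register values in crash dumps.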

Why This Matters for Debugging

When you see a crash, check the registers first. Each one answers a specific question:

  • %rdi, %rsi, %rdx, %rcx, %r8, %r9 = What arguments were passed?
  • %rax = What was (or would be) the return value?
  • %rsp = Where is the stack? Is it aligned?
  • %rbp = Can we walk the stack frames?
  • %rip = What instruction crashed?

2.2 System V AMD64 ABI Calling Convention

The calling convention defines the contract between caller and callee:

Argument Passing Rules

Integer/Pointer Arguments (in order):

  1. %rdi - 1st argument
  2. %rsi - 2nd argument
  3. %rdx - 3rd argument
  4. %rcx - 4th argument
  5. %r8 - 5th argument
  6. %r9 - 6th argument
  7. Stack - 7th argument onwards (pushed right-to-left)

Floating-Point Arguments:

  • %xmm0 through %xmm7 (up to 8 float/double arguments)
  • Additional float arguments go on the stack

Return Values:

  • Integer/pointer: %rax (and %rdx for 128-bit values)
  • Floating-point: %xmm0 (and %xmm1 for pairs)
// Example: How arguments are passed
long example(long a, long b, long c, long d, long e, long f, long g, long h);
//              %rdi   %rsi   %rdx   %rcx   %r8    %r9   stack  stack
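
As a quick lookup aid, the integer-argument rule can be sketched as a small C helper. The function names are hypothetical, and the sketch covers only INTEGER-class arguments (integers and pointers) in a function that uses a standard frame-pointer prologue with 8-byte stack slots; floating-point and aggregate arguments follow other rules.

```c
#include <assert.h>

/* Where the n-th (1-based) integer/pointer argument lives on entry. */
const char *int_arg_location(int n) {
    static const char *const regs[] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };
    assert(n >= 1);
    return n <= 6 ? regs[n - 1] : "stack";
}

/* For n >= 7: offset of the argument from the callee's %rbp,
 * assuming `pushq %rbp; movq %rsp, %rbp` has run.
 * 0(%rbp) is the saved %rbp, 8(%rbp) the return address,
 * so the 7th argument sits at 16(%rbp). */
long stack_arg_rbp_offset(int n) {
    assert(n >= 7);
    return 16 + 8L * (n - 7);
}
```

For the `example` prototype above, `int_arg_location(4)` yields "%rcx", and arguments `g` and `h` would be found at 16(%rbp) and 24(%rbp) under the stated assumptions.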

Stack Alignment Requirement

Critical Rule: The stack must be 16-byte aligned BEFORE the call instruction.

After call pushes the 8-byte return address, %rsp will be 8 mod 16. The callee typically pushes %rbp (if using frame pointer) to realign.

Before call:  %rsp % 16 == 0   (aligned)
After call:   %rsp % 16 == 8   (misaligned due to return address)
After push:   %rsp % 16 == 0   (realigned by saving %rbp)

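Why alignment matters: Aligned SSE instructions such as movaps fault when their memory operand is not 16-byte aligned, and compilers emit them for stack slots on the assumption that the ABI alignment holds. Calling printf with a misaligned stack can therefore crash!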
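
The alignment arithmetic can be checked mechanically. The sketch below uses hypothetical helper names and models only the effect of each step on the numeric value of %rsp.

```c
#include <stdbool.h>
#include <stdint.h>

bool is_16_byte_aligned(uint64_t rsp) {
    return (rsp & 0xF) == 0;
}

/* %rsp after `call` pushes the 8-byte return address. */
uint64_t rsp_after_call(uint64_t rsp) { return rsp - 8; }

/* %rsp after one 8-byte push such as `pushq %rbp`. */
uint64_t rsp_after_push(uint64_t rsp) { return rsp - 8; }

/* Extra bytes a callee must subtract, beyond `locals`, so that %rsp
 * is 16-byte aligned again before it makes its own calls. Assumes
 * the callee was entered from a correctly aligned call site and has
 * pushed `pushes` 8-byte registers. */
uint64_t frame_padding(unsigned pushes, uint64_t locals) {
    uint64_t used = 8 + 8ull * pushes + locals; /* ret addr + pushes + locals */
    return (16 - used % 16) % 16;
}
```

With one push and 32 bytes of locals no padding is needed; with no pushes and 32 bytes of locals the callee needs 8 more bytes, which is why a frame-pointer-free function often starts with `subq $40, %rsp` rather than `subq $32, %rsp`.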

2.3 Stack Frame Layout

A complete stack frame for a function call:

High addresses
                    ┌──────────────────────────────────────┐
                    │          Caller's frame              │
                    ├──────────────────────────────────────┤
                    │     Stack argument n (if any)        │  ← Arguments 7+
                    │              ...                     │
                    │     Stack argument 7 (if any)        │
                    ├──────────────────────────────────────┤
         +8(%rbp)   │      Return address (8 bytes)        │  ← Pushed by call
                    ├──────────────────────────────────────┤
            (%rbp)  │      Saved %rbp (8 bytes)            │  ← Frame pointer
                    ├──────────────────────────────────────┤
         -8(%rbp)   │      Saved callee-saved registers    │  ← %rbx, %r12-15
                    │      (as needed)                     │
                    ├──────────────────────────────────────┤
                    │      Local variables                 │
                    │      (growing downward)              │
                    ├──────────────────────────────────────┤
                    │      Padding for alignment           │  ← 16-byte align
                    ├──────────────────────────────────────┤
            (%rsp)  │      (Red zone: 128 bytes below)     │  ← Leaf functions only
                    └──────────────────────────────────────┘
Low addresses

Reading a Stack Frame in GDB

(gdb) x/6gx $rsp
0x7fffffffddc0: 0x0000000000000001      0x00007fffffffdeb8
0x7fffffffddd0: 0x0000000000000000      0x0000000000401234
0x7fffffffdde0: 0x00007fffffffde00      0x00007ffff7a2d830
                ↑                       ↑
                Saved %rbp              Return address

(gdb) info frame
Stack level 0, frame at 0x7fffffffddf0:
 rip = 0x401234 in main (example.c:15); saved rip = 0x7ffff7a2d830
 called by frame at 0x7fffffffde50
 source language c.
 Arglist at 0x7fffffffdde0, args: argc=1, argv=0x7fffffffdeb8
 Locals at 0x7fffffffdde0, Previous frame's sp is 0x7fffffffddf0
 Saved registers:
  rbp at 0x7fffffffdde0, rip at 0x7fffffffdde8
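
Walking frames by hand follows one rule: the word at (%rbp) is the caller's saved %rbp and the word at 8(%rbp) is the return address. The sketch below applies that rule to a captured slice of stack memory, such as the output of an `x/gx` dump. The `stack_snapshot` type and function names are hypothetical, and the walk only works for code compiled with frame pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* words[i] holds the 8 bytes found at address base + 8*i. */
struct stack_snapshot {
    uint64_t base;
    const uint64_t *words;
    size_t count;
};

/* Read one 8-byte word; returns 0 on success, -1 if the address is
 * misaligned or outside the snapshot. */
static int read_word(const struct stack_snapshot *s, uint64_t addr,
                     uint64_t *out) {
    if (addr < s->base || (addr - s->base) % 8 != 0) return -1;
    size_t idx = (size_t)((addr - s->base) / 8);
    if (idx >= s->count) return -1;
    *out = s->words[idx];
    return 0;
}

/* Follow the saved-%rbp chain starting at rbp, storing each frame's
 * return address. Stops when the chain leaves the snapshot or stops
 * moving toward higher addresses. Returns the number stored. */
size_t walk_frames(const struct stack_snapshot *s, uint64_t rbp,
                   uint64_t *ret_addrs, size_t max) {
    size_t n = 0;
    assert(s != NULL && ret_addrs != NULL);
    while (n < max) {
        uint64_t saved_rbp, ret;
        if (read_word(s, rbp, &saved_rbp) != 0) break;
        if (read_word(s, rbp + 8, &ret) != 0) break;
        ret_addrs[n++] = ret;
        if (saved_rbp <= rbp) break; /* a valid chain only moves upward */
        rbp = saved_rbp;
    }
    return n;
}
```

The "must move upward" check is a cheap sanity test: after a stack smash the saved %rbp is usually garbage, and the walk stops instead of chasing it.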

2.4 The Red Zone

On System V AMD64 (Linux, macOS), leaf functions (functions that don't call other functions) can use 128 bytes below %rsp without adjusting the stack pointer:

┌──────────────────────────────────────────────────────────────┐
│                    The Red Zone (128 bytes)                  │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│   %rsp → ┌───────────────────────────────────────────────┐   │
│          │                                               │   │
│          │   Red Zone: 128 bytes that leaf functions     │   │
│          │   can use WITHOUT subtracting from %rsp       │   │
│          │                                               │   │
│          │   - Preserved across signal handlers          │   │
│          │   - NOT preserved if function calls another   │   │
│          │   - Allows optimization: no stack setup       │   │
│          │                                               │   │
│   %rsp-128 ───────────────────────────────────────────────   │
│                                                              │
│   Why it matters for debugging:                              │
│   - If you crash in a leaf function, locals may be           │
│     in the red zone (below %rsp)                             │
│   - On Windows (different ABI), there is NO red zone         │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Debugging implication: When examining a crash, don't just look at %rsp and above. In a leaf function, locals may live in the 128 bytes below %rsp (addresses %rsp-128 through %rsp-1).
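
When a faulting or suspicious address sits just below %rsp, a one-line range check tells you whether it falls in the red zone. The helper name and macro below are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RED_ZONE_SIZE 128u

/* True if addr lies in [rsp - 128, rsp), the System V red zone. */
bool in_red_zone(uint64_t rsp, uint64_t addr) {
    assert(rsp >= RED_ZONE_SIZE);
    return addr < rsp && addr >= rsp - RED_ZONE_SIZE;
}
```

In GDB the same region can be dumped with `x/16gx $rsp-128`.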

2.5 Arrays and Structs in Memory

Array Layout

Arrays are contiguous in memory:

int arr[4] = {10, 20, 30, 40};
Address:      arr[0]    arr[1]    arr[2]    arr[3]
              ┌─────────┬─────────┬─────────┬─────────┐
Memory:       │   10    │   20    │   30    │   40    │
              └─────────┴─────────┴─────────┴─────────┘
Offset:       +0        +4        +8        +12

Assembly access: arr[i] → movl (%rax,%rcx,4), %edx
                 base = %rax, index = %rcx, scale = 4 (sizeof(int)), destination = %edx
                 (AT&T syntax: the source operand comes first, the destination last)
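
Every x86-64 memory operand of the form disp(base,index,scale) resolves to base + index*scale + disp. The sketch below, with a hypothetical function name, computes that address so you can verify by hand which element an instruction touched.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of the AT&T operand disp(base,index,scale). */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int64_t disp) {
    /* The hardware only encodes these four scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)disp;
}
```

For example, with %rax = 0x1000 and %rcx = 3, the operand (%rax,%rcx,4) addresses 0x100C, which is arr[3] of an int array starting at 0x1000. A negative displacement such as -32(%rbp) works the same way through unsigned wraparound.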

Struct Layout with Padding

struct example {
    char   a;      // 1 byte
    // 3 bytes padding
    int    b;      // 4 bytes
    char   c;      // 1 byte
    // 7 bytes padding
    long   d;      // 8 bytes
};
// Total size: 24 bytes (not 14!)
Offset:  0        1-3        4        8       9-15       16       24
         ┌────────┬──────────┬────────┬───────┬──────────┬────────┐
         │   a    │ padding  │   b    │   c   │ padding  │   d    │
         └────────┴──────────┴────────┴───────┴──────────┴────────┘
              1       3          4        1        7          8

Alignment rules:

  • Each field starts at an offset that is a multiple of its alignment requirement (for scalar types, its size)
  • Struct size is padded to a multiple of the largest alignment among its fields
  • char = 1-byte aligned
  • short = 2-byte aligned
  • int = 4-byte aligned
  • long, pointer = 8-byte aligned
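
These rules are mechanical enough to compute. The following sketch, using hypothetical names, takes each field's size and alignment and returns the offsets and total size a System V compiler would be expected to produce for a plain (non-packed) struct.

```c
#include <assert.h>
#include <stddef.h>

/* Round off up to the next multiple of align. */
static size_t align_up(size_t off, size_t align) {
    return (off + align - 1) / align * align;
}

/* Fill offsets[i] for n fields with the given sizes and alignments.
 * Returns the total struct size including tail padding. */
size_t struct_layout(const size_t *sizes, const size_t *aligns,
                     size_t n, size_t *offsets) {
    size_t off = 0, max_align = 1;
    assert(sizes != NULL && aligns != NULL && offsets != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(aligns[i] > 0);
        off = align_up(off, aligns[i]);   /* insert padding before field */
        offsets[i] = off;
        off += sizes[i];
        if (aligns[i] > max_align) max_align = aligns[i];
    }
    return align_up(off, max_align);      /* tail padding */
}
```

Feeding it the fields of `struct example` (sizes and alignments 1, 4, 1, 8) gives offsets 0, 4, 8, 16 and a total size of 24, matching the diagram. On a real build, `offsetof` and `sizeof` remain the authoritative check.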

Struct Access in Assembly

struct point { int x; int y; };
struct point p;
int val = p.y;
# Assuming %rdi points to struct point
movl    4(%rdi), %eax    # offset 4 = y field

Debugging tip: Know your struct layout! Use pahole or sizeof/offsetof to verify.

2.6 Common Instruction Patterns

Function Prologue (with frame pointer)

pushq   %rbp              # Save caller's frame pointer
movq    %rsp, %rbp        # Establish new frame pointer
pushq   %rbx              # Save callee-saved register (if used)
subq    $24, %rsp         # Allocate locals; 8 + 24 keeps %rsp 16-byte aligned

Function Prologue (without frame pointer, -fomit-frame-pointer)

subq    $40, %rsp         # Allocate locals + alignment
movq    %rbx, 32(%rsp)    # Save callee-saved in stack slot

Function Epilogue

addq    $24, %rsp         # Deallocate locals
popq    %rbx              # Restore callee-saved register
popq    %rbp              # Restore caller's frame pointer
retq                      # Pop return address, jump to it

Or with leave:

movq    -8(%rbp), %rbx    # Restore callee-saved register from its slot
leave                     # Equivalent to: movq %rbp, %rsp; popq %rbp
retq

Loop Patterns

For loop:

for (int i = 0; i < n; i++) { ... }
        movl    $0, %eax          # i = 0
.L2:
        cmpl    %edi, %eax        # compare i with n
        jge     .L1               # if i >= n, exit loop
        # ... loop body ...
        addl    $1, %eax          # i++
        jmp     .L2               # continue loop
.L1:

While loop:

while (condition) { ... }
        jmp     .Ltest            # Jump to condition test
.Lbody:
        # ... loop body ...
.Ltest:
        testl   %eax, %eax        # Test condition
        jne     .Lbody            # If true, continue loop

Conditional Patterns

If-else:

if (a > b) { x = 1; } else { x = 0; }
        cmpl    %esi, %edi        # Compare a (edi) with b (esi)
        jle     .Lelse            # If a <= b, goto else
        movl    $1, %eax          # x = 1
        jmp     .Ldone
.Lelse:
        movl    $0, %eax          # x = 0
.Ldone:

Conditional move (branchless):

return a > b ? a : b;  // max
        cmpl    %esi, %edi        # Compare a with b
        movl    %esi, %eax        # Assume b
        cmovg   %edi, %eax        # If a > b, use a instead

Switch Statement with Jump Table

switch (x) {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
}
        cmpl    $2, %edi          # Unsigned compare of x with 2
        ja      .Ldefault         # x > 2, or x < 0 seen as unsigned: default case
        movl    %edi, %edi        # Zero-extend x so %rdi is a valid index
        leaq    .Ljumptable(%rip), %rax
        movslq  (%rax,%rdi,4), %rdx   # Load offset from table
        addq    %rax, %rdx            # Add base address
        jmpq    *%rdx                 # Indirect jump

.Ljumptable:
        .long   .Lcase0 - .Ljumptable
        .long   .Lcase1 - .Ljumptable
        .long   .Lcase2 - .Ljumptable
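
The same dispatch can be modelled in C with a table of function pointers standing in for the jump table. This is a sketch with hypothetical names; `SWITCH_DEFAULT` is an assumed value, since the source switch has no default case.

```c
/* Assumed result for out-of-range x; not taken from the original switch. */
#define SWITCH_DEFAULT (-1)

static int case0(void) { return 10; }
static int case1(void) { return 20; }
static int case2(void) { return 30; }

int switch_via_table(int x) {
    static int (*const handlers[])(void) = { case0, case1, case2 };

    /* A single unsigned comparison rejects both x > 2 and x < 0,
     * which is what `cmpl $2, %edi; ja .Ldefault` does. */
    if ((unsigned)x > 2u)
        return SWITCH_DEFAULT;

    /* Indexed indirect call, the C analogue of `jmpq *%rdx`. */
    return handlers[(unsigned)x]();
}
```

The unsigned compare is the detail to watch for in disassembly: seeing `ja` rather than `jg` right before an indirect jump is a strong hint that you are looking at a switch.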

Function Call Pattern

result = foo(a, b, c);
        movl    $3, %edx          # 3rd arg (c)
        movl    $2, %esi          # 2nd arg (b)
        movl    $1, %edi          # 1st arg (a)
        call    foo
        movl    %eax, result(%rip) # Save return value

2.7 Reading Crash Dumps

When a program crashes, you get a signal (usually SIGSEGV or SIGBUS). The debugger preserves the exact machine state at the crash point.

Key information in a crash:

  1. %rip - What instruction caused the crash?
  2. Signal type - What kind of error?
    • SIGSEGV: Invalid memory access
    • SIGBUS: Misaligned access, bad address
    • SIGFPE: Arithmetic error (division by zero)
    • SIGILL: Illegal instruction
    • SIGABRT: abort() called
  3. Faulting address - What address was accessed?
  4. Register state - What values were in play?
  5. Stack trace - How did we get here?
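
The signal-to-meaning mapping in item 2 can be captured in a small lookup for use in report tooling. The function name is hypothetical; the signal constants come from `<signal.h>` (SIGBUS is POSIX rather than ISO C).

```c
#include <assert.h>
#include <signal.h>

/* Short description of the crash signals listed above. */
const char *describe_signal(int sig) {
    assert(sig > 0);
    switch (sig) {
    case SIGSEGV: return "invalid memory access";
    case SIGBUS:  return "misaligned access or bad address";
    case SIGFPE:  return "arithmetic error (e.g. division by zero)";
    case SIGILL:  return "illegal instruction";
    case SIGABRT: return "abort() called";
    default:      return "other signal";
    }
}
```

Despite its name, SIGFPE is what integer division by zero raises on x86-64, so it shows up in crashes that involve no floating-point code at all.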

Example GDB crash analysis session:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401234 in process_data (buf=0x7fffffffddc0, len=256) at crash.c:42

(gdb) info registers
rax            0x0                 0
rbx            0x7fffffffdeb8      140737488346808
rcx            0x100               256
rdx            0x0                 0
rsi            0x100               256
rdi            0x7fffffffddc0      140737488346560
rbp            0x7fffffffddf0      0x7fffffffddf0
rsp            0x7fffffffdda0      0x7fffffffdda0
rip            0x401234            0x401234 <process_data+52>

(gdb) x/i $rip
=> 0x401234 <process_data+52>:  movb   %al,(%rdx)
                                        ↑
                                        Writing to address 0x0 (NULL pointer!)

(gdb) bt
#0  0x0000000000401234 in process_data (buf=0x7fffffffddc0, len=256) at crash.c:42
#1  0x0000000000401456 in main (argc=1, argv=0x7fffffffdeb8) at crash.c:60

Diagnosis: The instruction movb %al, (%rdx) tried to write to address 0x0 (the value in %rdx). This is a NULL pointer dereference.
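
A first-pass triage of the faulting address can be automated. The sketch below uses hypothetical names and two stated assumptions: 4 KiB pages with the page at address 0 unmapped, and 48-bit virtual addressing, where bits 63..47 of a valid (canonical) address are all equal.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is canonical under 48-bit virtual addressing:
 * the top 17 bits must be all zeros or all ones. */
bool is_canonical_48(uint64_t addr) {
    uint64_t top = addr >> 47;
    return top == 0 || top == 0x1FFFF;
}

enum fault_kind { FAULT_NULL_PAGE, FAULT_NON_CANONICAL, FAULT_OTHER };

/* Heuristic classification of a faulting address. */
enum fault_kind classify_fault_address(uint64_t addr) {
    if (addr < 4096)
        return FAULT_NULL_PAGE;       /* NULL or NULL plus a small offset */
    if (!is_canonical_48(addr))
        return FAULT_NON_CANONICAL;   /* almost always a corrupted pointer */
    return FAULT_OTHER;               /* needs the memory map to judge */
}
```

The address 0x0 from the session above classifies as FAULT_NULL_PAGE. A small nonzero address such as 0x18 lands in the same class and usually means a field access through a NULL struct pointer.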


3. Project Specification

3.1 What You Will Build

A "crash cart" toolkit consisting of:

  1. Crash-inducing programs: Small C programs that demonstrate specific failure modes
  2. Post-mortem report template: A standardized format for analyzing crashes
  3. Analysis tools/scripts: Helpers for extracting and formatting crash information
  4. Solution narratives: Complete explanations for each crash type

3.2 Functional Requirements

Part A: Crash Program Suite

Create programs that reliably trigger:

  1. Stack buffer overflow - Overwrite return address
  2. NULL pointer dereference - Read/write through NULL
  3. Use-after-free - Access freed memory
  4. Double free - Free the same pointer twice
  5. Stack overflow (recursion) - Infinite recursion
  6. Uninitialized variable - Use garbage value
  7. Off-by-one - Array bounds violation
  8. Format string vulnerability - printf with user data as format
  9. Integer overflow leading to crash - Arithmetic causes bad access
  10. Misaligned access (if applicable) - Access unaligned address
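
To confirm that a crash program triggers reliably, it can help to run it in a child process and inspect how the child died. The sketch below assumes a POSIX system; the function names are hypothetical, and `crash_null_write` is a minimal stand-in for crash #2.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Crash #2: write through NULL. `volatile` keeps the compiler from
 * optimizing the (undefined) access away. */
void crash_null_write(void) {
    volatile int *p = NULL;
    *p = 42;
}

/* Control case that returns normally. */
void no_crash(void) {
}

/* Run fn in a child process. Returns the number of the signal that
 * killed the child, 0 if it exited normally, or -1 on fork/wait failure. */
int run_and_get_signal(void (*fn)(void)) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux build, `run_and_get_signal(crash_null_write)` is expected to report SIGSEGV. Sanitizers and hardening flags can change the outcome, so record the compiler flags alongside each result.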

Part B: Post-Mortem Report Format

For each crash, produce a report containing:

═══════════════════════════════════════════════════════════════
                    CRASH POST-MORTEM REPORT
═══════════════════════════════════════════════════════════════

1. CRASH IDENTIFICATION
   ─────────────────────
   Program:       [executable name]
   Source file:   [source.c:line]
   Crash signal:  [SIGSEGV/SIGBUS/etc]
   Crash address: [0x...]

2. REGISTER STATE AT CRASH
   ─────────────────────────
   %rip = 0x...    (Faulting instruction)
   %rsp = 0x...    (Stack pointer)
   %rbp = 0x...    (Frame pointer)
   %rax = 0x...    [interpretation]
   %rdi = 0x...    [interpretation]
   ...

3. FAULTING INSTRUCTION
   ─────────────────────
   Disassembly:  [instruction]
   Operation:    [what it was trying to do]
   Why it failed: [the specific reason]

4. STACK TRACE
   ────────────
   #0 function_a at file.c:XX
   #1 function_b at file.c:YY
   ...

5. STACK FRAME ANALYSIS
   ─────────────────────
   [ASCII diagram of relevant stack frame(s)]

   Return address: 0x... → [function name + offset]
   Saved %rbp:     0x...
   Local variables:
     - var1 @ rbp-8:  0x...
     - var2 @ rbp-16: 0x...
   Arguments (if on stack):
     - arg7 @ rbp+16: 0x...

6. ROOT CAUSE
   ───────────
   [Clear explanation of what went wrong at the C level]

7. VULNERABILITY CLASSIFICATION
   ─────────────────────────────
   CWE ID:      [if applicable]
   Category:    [buffer overflow / use-after-free / etc]
   Exploitable: [Yes/No/Maybe - brief explanation]

8. ASSEMBLY SIGNATURE
   ───────────────────
   [Key assembly patterns that identify this bug class]

9. PREVENTION
   ───────────
   [How to avoid this bug in the future]

═══════════════════════════════════════════════════════════════

Part C: GDB Automation Scripts

Create scripts that:

  1. Run program and capture crash state
  2. Extract register values to structured format
  3. Dump relevant stack bytes
  4. Generate disassembly of faulting function
  5. Produce initial report template
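
Step 2 comes down to parsing lines of `info registers` output such as `rax            0x0                 0`. A minimal sketch of that parsing step in C follows. The function name is hypothetical, and it assumes the "name, hex value, interpretation" layout shown in the GDB session earlier.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Parse one `info registers` line into a register name and value.
 * name must have room for 16 bytes. Returns 0 on success and -1 if
 * the line does not look like "<name> 0x<hex> ...". */
int parse_register_line(const char *line, char name[16], uint64_t *value) {
    if (line == NULL || name == NULL || value == NULL)
        return -1;
    /* The literal "0x" in the format rejects section headers and
     * other non-register lines in the captured output. */
    if (sscanf(line, "%15s 0x%" SCNx64, name, value) != 2)
        return -1;
    return 0;
}
```

Called on `rip            0x401234            0x401234 <process_data+52>` it yields the name "rip" and the value 0x401234. A header line such as `=== DISASSEMBLY ===` is rejected.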

3.3 Example Crash Report

═══════════════════════════════════════════════════════════════
                    CRASH POST-MORTEM REPORT
═══════════════════════════════════════════════════════════════

1. CRASH IDENTIFICATION
   ─────────────────────
   Program:       ./buffer_overflow
   Source file:   buffer_overflow.c:15
   Crash signal:  SIGSEGV
   Crash address: 0x0000414141414141

2. REGISTER STATE AT CRASH
   ─────────────────────────
   %rip = 0x414141414141  (Invalid! This is ASCII 'AAAAAA')
   %rsp = 0x7fffffffddf0  (Stack pointer looks valid)
   %rbp = 0x4141414141414141  (Corrupted! Also 'AAAAAAAA')
   %rax = 0x0             (Return value was 0)

3. FAULTING INSTRUCTION
   ─────────────────────
   Disassembly:  Cannot disassemble - %rip points to invalid memory
   Operation:    Attempted to execute code at address 0x414141414141
   Why it failed: Address is not mapped; %rip was overwritten

4. STACK TRACE
   ────────────
   Cannot unwind - frame pointer chain corrupted
   Last valid frame: vulnerable_function at buffer_overflow.c:12

5. STACK FRAME ANALYSIS
   ─────────────────────
   Before overflow:
   ┌───────────────────────────────────────┐
   │  rbp+8:  Return address → main+42     │
   │  rbp:    Saved %rbp → 0x7fffffffde00  │
   │  rbp-32: char buffer[32]              │
   └───────────────────────────────────────┘

   After overflow (46 bytes of input copied into 32-byte buffer):
   ┌───────────────────────────────────────┐
   │  rbp+8:  0x0000414141414141 (AAAAAA)  │ ← Overwritten!
   │  rbp:    0x4141414141414141 (AAAAAAAA)│ ← Overwritten!
   │  rbp-32: "AAAA..." (overflow data)    │
   └───────────────────────────────────────┘

6. ROOT CAUSE
   ───────────
   The function vulnerable_function() uses strcpy() to copy
   user-controlled input into a 32-byte stack buffer without
   length checking. Providing 46 bytes of input makes strcpy()
   write 47 bytes:
   - 32 bytes fill the buffer (intended)
   - 8 bytes overwrite the saved %rbp (unintended)
   - 6 bytes plus the terminating NUL overwrite the low 7 bytes
     of the return address (unintended); its top byte was
     already 0x00

   When the function returns (ret instruction), it pops
   0x0000414141414141 into %rip and attempts to execute there.
   (An input long enough to fill all 8 bytes would produce the
   non-canonical address 0x4141414141414141; ret itself would
   then fault, with %rip still inside vulnerable_function.)

7. VULNERABILITY CLASSIFICATION
   ─────────────────────────────
   CWE ID:      CWE-121 (Stack-based Buffer Overflow)
   Category:    Buffer Overflow / Stack Smash
   Exploitable: Yes - attacker can redirect execution by
                controlling the overwritten return address

8. ASSEMBLY SIGNATURE
   ───────────────────
   Look for:
   - call to strcpy/gets/sprintf without bounds checking
   - Buffer allocated on stack (subq $N, %rsp where N < input size)
   - No stack canary (missing __stack_chk_fail reference)

   Suspicious pattern:
     leaq    -32(%rbp), %rdi    # destination: buffer address
     movq    -40(%rbp), %rsi    # source: user input pointer
     call    strcpy             # unbounded copy!

9. PREVENTION
   ───────────
   - Use a bounded copy with an explicit length limit: strlcpy()
     where available, or strncpy() plus manual NUL termination
   - Enable stack canaries: -fstack-protector-strong
   - Enable ASLR and NX/DEP
   - Use static analyzers that catch unbounded copies
   - Prefer bounded string APIs (snprintf, strncat)

═══════════════════════════════════════════════════════════════
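
Spotting that a register holds ASCII text rather than an address is a key step in the report above. A small decoder can do it for you; the function name is hypothetical, and the sketch reads bytes in little-endian order, the order in which they sat in memory.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the leading printable bytes of v (lowest byte first) into out
 * and NUL-terminate it. out must hold at least 9 bytes. Returns the
 * number of printable bytes found before the first non-printable one. */
size_t decode_ascii_bytes(uint64_t v, char out[9]) {
    size_t n = 0;
    while (n < 8) {
        unsigned char byte = (unsigned char)(v >> (8 * n));
        if (!isprint(byte))
            break;
        out[n++] = (char)byte;
    }
    out[n] = '\0';
    return n;
}
```

Applied to the %rip value 0x414141414141 it returns 6 and the string "AAAAAA". Applied to a genuine stack address such as 0x7fffffffddf8 it returns 0, because the lowest byte is not printable. A result of 4 or more is a reasonable hint that input data has reached a pointer or return address.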

4. Solution Architecture

4.1 Project Structure

crash-cart/
├── programs/                    # Crash-inducing programs
│   ├── 01_stack_buffer_overflow.c
│   ├── 02_null_pointer.c
│   ├── 03_use_after_free.c
│   ├── 04_double_free.c
│   ├── 05_stack_exhaustion.c
│   ├── 06_uninitialized.c
│   ├── 07_off_by_one.c
│   ├── 08_format_string.c
│   ├── 09_integer_overflow.c
│   └── 10_misaligned.c
├── reports/                     # Completed post-mortem reports
│   ├── 01_stack_buffer_overflow.md
│   ├── 02_null_pointer.md
│   └── ...
├── scripts/
│   ├── crash_analyze.sh         # Run program, capture crash
│   ├── gdb_commands.txt         # GDB automation commands
│   ├── parse_registers.py       # Extract register values
│   └── generate_report.py       # Create report template
├── templates/
│   └── report_template.md       # Empty report template
├── Makefile
└── README.md

4.2 GDB Automation

gdb_commands.txt - Commands to run on crash:

# Disable pagination for batch mode
set pagination off
set confirm off

# Run the program
run

# When it crashes, collect information
printf "\n=== REGISTERS ===\n"
info registers

printf "\n=== FAULTING INSTRUCTION ===\n"
x/1i $rip

printf "\n=== STACK TRACE ===\n"
bt

printf "\n=== STACK CONTENTS ===\n"
x/32gx $rsp

printf "\n=== CURRENT FRAME ===\n"
info frame

printf "\n=== DISASSEMBLY ===\n"
disassemble

printf "\n=== MEMORY MAPS ===\n"
info proc mappings

quit

Usage:

gdb -batch -x gdb_commands.txt ./crashme > crash_output.txt 2>&1

4.3 Essential GDB Commands Reference

┌──────────────────────────────────────────────────────────────────────────┐
│                   GDB Commands for Crash Analysis                        │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│ EXAMINING REGISTERS                                                      │
│ ───────────────────                                                      │
│   info registers         Show all general-purpose registers              │
│   info all-registers     Show all registers including FP/vector          │
│   p/x $rax               Print %rax in hex                               │
│   p/d $rdi               Print %rdi in decimal                           │
│   p (char*)$rsi          Interpret %rsi as string pointer                │
│                                                                          │
│ EXAMINING MEMORY                                                         │
│ ────────────────                                                         │
│   x/Nfz address          Examine memory:                                 │
│                          N = count                                       │
│                          f = format (x=hex, d=decimal, s=string,         │
│                                      i=instruction, c=char)              │
│                          z = size (b=byte, h=half, w=word, g=giant)      │
│                                                                          │
│   x/32gx $rsp            32 giant (8-byte) words at stack pointer        │
│   x/10i $rip             10 instructions at instruction pointer          │
│   x/s $rdi               String at first argument                        │
│   x/20wx 0x7fff...       20 words (4-byte) at address                    │
│                                                                          │
│ STACK NAVIGATION                                                         │
│ ────────────────                                                         │
│   bt                     Backtrace (show call stack)                     │
│   bt full                Backtrace with local variables                  │
│   frame N                Select frame N                                  │
│   up / down              Move up/down the call stack                     │
│   info frame             Detailed info about current frame               │
│   info locals            Show local variables                            │
│   info args              Show function arguments                         │
│                                                                          │
│ DISASSEMBLY                                                              │
│ ───────────                                                              │
│   disassemble            Disassemble current function                    │
│   disas /r               With raw bytes                                  │
│   disas /m               Mixed with source (if available)                │
│   disas function_name    Disassemble specific function                   │
│   x/20i $rip-40          Instructions around crash point                 │
│                                                                          │
│ BREAKPOINTS & CONTROL                                                    │
│ ─────────────────────                                                    │
│   break *0x401234        Break at address                                │
│   break function         Break at function entry                         │
│   break file.c:42        Break at source line                            │
│   watch *0x7fff...       Break when memory changes                       │
│   catch signal SIGSEGV   Break on signal                                 │
│                                                                          │
│ USEFUL SHORTCUTS                                                         │
│ ────────────────                                                         │
│   p/x $rbp+8             Calculate return address location               │
│   x/a $rbp+8             Show return address                             │
│   x/s *(char**)($rsp)    Dereference string pointer on stack             │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

5. Implementation Guide

5.1 Development Environment Setup

# Required tools
# On Ubuntu/Debian:
sudo apt-get install gcc gdb build-essential

# On macOS:
xcode-select --install
brew install gdb  # Note: macOS gdb requires code signing

# Verify
gcc --version
gdb --version

# Create project structure
mkdir -p crash-cart/{programs,reports,scripts,templates}
cd crash-cart

5.2 Makefile

CC = gcc
# Compile with debug symbols, no optimization, no stack protector
CFLAGS = -g -O0 -fno-stack-protector -fno-pie -no-pie
# Enable all warnings
CFLAGS += -Wall -Wextra

# Source files
SOURCES = $(wildcard programs/*.c)
TARGETS = $(SOURCES:.c=)

.PHONY: all clean analyze

all: $(TARGETS)

programs/%: programs/%.c
	$(CC) $(CFLAGS) -o $@ $<

# Run crash analysis on a specific program
analyze: programs/$(PROG)
	./scripts/crash_analyze.sh programs/$(PROG)

# Remove only the built binaries and generated dumps; patterns such as
# programs/01_* would also delete the .c sources.
clean:
	rm -f $(TARGETS) programs/*.crash_dump.txt

5.3 Implementation Phases

Phase 1: Foundation (Days 1-2)

Goals:

  • Set up project structure
  • Create first crash program (NULL pointer)
  • Establish GDB workflow
  • Create report template

Task 1.1: NULL Pointer Dereference

// programs/02_null_pointer.c
#include <stdio.h>
#include <stdlib.h>

struct data {
    int value;
    char *name;
};

struct data *get_data(int should_fail) {
    if (should_fail) {
        return NULL;  // Simulate failed allocation or lookup
    }
    struct data *d = malloc(sizeof(struct data));
    d->value = 42;
    d->name = "valid";
    return d;
}

void process(struct data *d) {
    // Bug: No NULL check before dereference
    printf("Value: %d\n", d->value);  // Crash here if d is NULL
}

int main(int argc, char *argv[]) {
    int fail = (argc > 1);  // Fail if any argument given
    struct data *d = get_data(fail);
    process(d);
    return 0;
}

Task 1.2: GDB Analysis Script

#!/bin/bash
# scripts/crash_analyze.sh
# Usage: crash_analyze.sh <program> [args...]

PROG="$1"
if [ -z "$PROG" ]; then
    echo "Usage: $0 <program> [args...]"
    exit 1
fi
shift  # Remaining arguments are passed to the program under test

OUTPUT="${PROG}.crash_dump.txt"
CMDS="$(mktemp)"
trap 'rm -f "$CMDS"' EXIT

# GDB stops executing a command file at the first error, so the commands
# most likely to fail on a corrupted %rip (x/i, disassemble) come last.
cat > "$CMDS" << 'EOF'
set pagination off
set confirm off
run
printf "\n========== SIGNAL INFO ==========\n"
info program
printf "\n========== REGISTERS ==========\n"
info registers
printf "\n========== BACKTRACE ==========\n"
bt full
printf "\n========== STACK (16 quadwords) ==========\n"
x/16gx $rsp
printf "\n========== FRAME INFO ==========\n"
info frame
printf "\n========== FAULTING INSTRUCTION ==========\n"
x/5i $rip
printf "\n========== DISASSEMBLY ==========\n"
disassemble
quit
EOF

echo "Analyzing crash in: $PROG $*"
echo "Output: $OUTPUT"

gdb -batch -x "$CMDS" --args "$PROG" "$@" > "$OUTPUT" 2>&1

echo "Done. Check $OUTPUT for crash details."

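Checkpoint: Run ./scripts/crash_analyze.sh programs/02_null_pointer fail and verify that the output file contains the register dump and crash details. 02_null_pointer only takes the failing path when it is given at least one argument.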

Phase 2: Crash Program Suite (Days 3-6)

Task 2.1: Stack Buffer Overflow

// programs/01_stack_buffer_overflow.c
#include <stdio.h>
#include <string.h>

void vulnerable_function(char *input) {
    char buffer[32];
    strcpy(buffer, input);  // No bounds check!
    printf("Copied: %s\n", buffer);
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        printf("Try: %s $(python3 -c \"print('A'*64)\")\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    printf("Returned safely (this shouldn't print with overflow)\n");
    return 0;
}

Task 2.2: Use-After-Free

// programs/03_use_after_free.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct user {
    char name[32];
    int id;
};

int main(void) {
    struct user *u = malloc(sizeof(struct user));
    strcpy(u->name, "Alice");
    u->id = 1001;

    printf("Before free: %s (id=%d)\n", u->name, u->id);

    free(u);  // Free the memory

    // Some other allocation might reuse this memory
    char *other = malloc(100);
    memset(other, 'X', 100);

    // Use after free - u points to freed (and now corrupted) memory
    printf("After free: %s (id=%d)\n", u->name, u->id);  // UB!

    return 0;
}

Task 2.3: Double Free

// programs/04_double_free.c
#include <stdlib.h>

int main(void) {
    char *ptr = malloc(100);
    free(ptr);
    free(ptr);  // Double free!
    return 0;
}

Task 2.4: Stack Exhaustion

// programs/05_stack_exhaustion.c
#include <stdio.h>

int recurse(int depth) {
    char buffer[4096];  // 4KB per frame
    buffer[0] = depth;  // Use buffer to prevent optimization
    printf("Depth: %d\n", depth);
    return recurse(depth + 1);  // Infinite recursion
}

int main(void) {
    return recurse(0);
}

Task 2.5: Uninitialized Variable

// programs/06_uninitialized.c
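#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    int *ptr;  // Uninitialized pointer

    // Seed the generator; without this, rand() returns the same sequence
    // on every run, so the same branch would always be taken
    srand((unsigned)time(NULL));

    // Some path doesn't initialize ptr
    int condition = rand() % 2;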
    if (condition) {
        ptr = malloc(sizeof(int));
        *ptr = 42;
    }
    // If condition is 0, ptr is garbage

    printf("Value: %d\n", *ptr);  // May crash with garbage pointer
    return 0;
}

Task 2.6: Off-by-One

// programs/07_off_by_one.c
#include <stdio.h>
#include <string.h>

int main(void) {
    char buffer[16];
    int important_value = 0x12345678;

    // Off-by-one: writing 17 bytes including null terminator
    // into 16-byte buffer, corrupting adjacent data
    strcpy(buffer, "1234567890123456");  // 16 chars + null = 17 bytes

    printf("Buffer: %s\n", buffer);
    printf("Important: 0x%x (should be 0x12345678)\n", important_value);

    // Depending on stack layout, this may overwrite important_value
    // or saved registers
    return 0;
}

Task 2.7: Format String

// programs/08_format_string.c
#include <stdio.h>

void vulnerable(char *user_input) {
    printf(user_input);  // Format string vulnerability!
    printf("\n");
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        printf("Try: %s '%%x.%%x.%%x.%%x'\n", argv[0]);
        printf("Or:  %s '%%s' (may crash)\n", argv[0]);
        return 1;
    }
    vulnerable(argv[1]);
    return 0;
}

Task 2.8: Integer Overflow

// programs/09_integer_overflow.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void) {
    // Simulate size calculation overflow
    uint32_t count = 0x40000001;  // About 1 billion
    uint32_t size = 4;            // 4 bytes each

    // Overflow: count * size = 0x100000004 but wraps to 4
    uint32_t total = count * size;

    printf("Count: %u, Size: %u, Total: %u\n", count, size, total);

    // Allocate based on overflowed value
    char *buf = malloc(total);  // Only allocates 4 bytes!

    // Try to use it as if it were large
    for (uint32_t i = 0; i < count && i < 1000; i++) {
        buf[i * size] = 'A';  // Out-of-bounds access!
    }

    return 0;
}

Task 2.9: Misaligned Access

// programs/10_misaligned.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    char buffer[16];

    // Create misaligned pointer for 8-byte access
    uint64_t *misaligned = (uint64_t *)(buffer + 1);

    // This may crash on strict-alignment architectures
    // or cause performance penalty on x86-64
    *misaligned = 0x123456789ABCDEF0ULL;

    // PRIx64 is the portable conversion for uint64_t (%lx is wrong on
    // platforms where uint64_t is unsigned long long)
    printf("Value: 0x%" PRIx64 "\n", *misaligned);

    return 0;
}

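Checkpoint: All programs compile with make, and each one exhibits its bug when run. Not every program is guaranteed to crash: the use-after-free, uninitialized-variable, off-by-one, integer-overflow, and misaligned-access programs invoke undefined behavior whose visible effect depends on the allocator, stack layout, and platform, so record what you actually observe.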

Phase 3: Report Writing (Days 7-10)

For each crash program:

  1. Run the crash analysis script
  2. Fill in the post-mortem template
  3. Include:
    • Exact register values
    • Stack frame diagram
    • Root cause explanation
    • Assembly patterns that identify the bug

Example workflow:

# 1. Compile with debug info, no protections
make PROG=01_stack_buffer_overflow

# 2. Trigger the crash
./programs/01_stack_buffer_overflow $(python3 -c "print('A'*64)")

# 3. Analyze with GDB (pass the same input that triggers the crash)
./scripts/crash_analyze.sh ./programs/01_stack_buffer_overflow "$(python3 -c "print('A'*64)")"

# 4. Open the dump and template, write the report
cat programs/01_stack_buffer_overflow.crash_dump.txt
vim reports/01_stack_buffer_overflow.md

Phase 4: Pattern Recognition (Days 11-14)

Goal: Practice recognizing crash types from assembly alone.

Exercise: Given only disassembly output, identify:

  1. What kind of bug this is
  2. Where the bug manifests
  3. What the fix should be

Practice scenarios:

Scenario A:

vulnerable:
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $32, %rsp
    movq    %rdi, -24(%rbp)
    movq    -24(%rbp), %rax
    movq    %rax, %rsi
    leaq    -16(%rbp), %rdi    # <- 16-byte buffer
    call    strcpy              # <- Unbounded copy!
    leave
    ret

Identify: Stack buffer overflow (16-byte buffer, strcpy has no limit)

Scenario B:

process:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    %rdi, -8(%rbp)
    movq    -8(%rbp), %rax
    movl    (%rax), %eax       # <- Dereference without check
    ...

Identify: Missing NULL check before dereference

Scenario C:

danger:
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $16, %rsp
    movq    %rdi, -8(%rbp)
    movq    -8(%rbp), %rdi
    call    printf             # <- User input as format!

Identify: Format string vulnerability (first arg to printf is user-controlled)


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | How to Test |
|----------|---------|-------------|
| Crash Reproduction | Each program crashes as expected | Run and verify signal received |
| Report Completeness | Reports contain all required sections | Checklist verification |
| Assembly Accuracy | Disassembly matches source behavior | Manual verification |
| Pattern Recognition | Can identify bug from asm alone | Blind test with new programs |

6.2 Crash Reproduction Tests

#!/bin/bash
# scripts/test_crashes.sh

test_crash() {
    local prog="$1"
    local args="$2"
    local expected_signal="$3"

    echo -n "Testing $prog... "

    # Run and capture exit code
    timeout 5 "$prog" $args >/dev/null 2>&1
    local exit_code=$?

    # Signals result in exit code = 128 + signal number
    # SIGSEGV = 11, so exit 139
    # SIGABRT = 6, so exit 134

    case $expected_signal in
        SIGSEGV) expected_exit=139 ;;
        SIGABRT) expected_exit=134 ;;
        SIGBUS)  expected_exit=135 ;;  # SIGBUS = 7 on Linux x86-64 (10 on macOS, giving 138)
        SIGFPE)  expected_exit=136 ;;
        *)       expected_exit=1 ;;
    esac

    if [ $exit_code -eq $expected_exit ]; then
        echo "PASS (exit $exit_code)"
        return 0
    else
        echo "FAIL (expected exit $expected_exit, got $exit_code)"
        return 1
    fi
}

# Run tests
test_crash "./programs/01_stack_buffer_overflow" "$(python3 -c "print('A'*64)")" "SIGSEGV"
test_crash "./programs/02_null_pointer" "fail" "SIGSEGV"
test_crash "./programs/04_double_free" "" "SIGABRT"
test_crash "./programs/05_stack_exhaustion" "" "SIGSEGV"
# ... add more tests

6.3 Report Quality Checklist

For each report, verify:

  • Crash signal correctly identified
  • All registers documented
  • Faulting instruction explained
  • Stack trace included
  • Stack frame diagram accurate
  • Root cause clearly explained
  • Vulnerability classification correct
  • Assembly signature described
  • Prevention measures listed

6.4 Blind Testing

Have someone else create a crashing program. Given only:

  • The executable
  • A way to trigger the crash
  • GDB access

Write a complete post-mortem without seeing the source code.


7. Common Pitfalls

7.1 Analysis Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Trusting %rbp when it's corrupted | Backtrace makes no sense | Check whether %rbp looks like a valid stack address |
| Missing red zone locals | Can't find local variables | In leaf functions, check the 128 bytes below %rsp (the red zone) |
| Forgetting PIE | Addresses don't match objdump | Use info proc mappings to find the base address |
| Wrong endianness in memory | Values look scrambled | x86-64 is little-endian: the least significant byte is stored first |
| Optimized code confusion | Variables "optimized out" | Compile with -O0 -g for learning |

7.2 GDB Issues

Problem: GDB doesn't show source lines
Solution: Compile with -g and ensure the source files are accessible

Problem: Can't set breakpoints
Solution: Check whether the binary has debug symbols: file <binary>

Problem: Stack trace shows "??" frames
Solution: The frame pointer may have been omitted. Recompile with -fno-omit-frame-pointer, or rely on DWARF unwinding

Problem: Can't examine memory after crash
Solution: The process is still "alive" in GDB after the signal is reported. Use x/ commands normally.

7.3 Platform Differences

| Aspect | Linux | macOS |
|--------|-------|-------|
| ABI | System V AMD64 | System V AMD64 |
| Stack protector | Default on in many distros | Default on |
| ASLR | Enabled by default | Enabled by default |
| Debugger | gdb (native) | lldb (native), gdb (needs signing) |
| Red zone | 128 bytes | 128 bytes |

macOS GDB Signing (if using gdb instead of lldb):

# Create a certificate and sign gdb
# See: https://sourceware.org/gdb/wiki/PermissionsDarwin

8. Extensions

8.1 Beginner Extensions

  • Add more crash types: Null function pointer call, signed integer overflow
  • Colorize reports: Use ANSI colors to highlight important values
  • Create crash quizzes: Given a register dump, identify the bug type

8.2 Intermediate Extensions

  • Stack canary analysis: Enable -fstack-protector and show how it detects overflow
  • ASLR demonstration: Show address randomization across runs
  • Heap corruption detection: Add ASAN/Valgrind analysis to reports
  • Automated pattern matching: Script that suggests vulnerability type from crash dump

8.3 Advanced Extensions

  • ROP gadget finder: Locate usable code sequences for return-oriented programming
  • Crash reproduction from core dump: Analyze core files without live debugging
  • Cross-architecture analysis: Apply same methodology to ARM64
  • Integration with fuzzing: Use AFL/libFuzzer to generate crash-inducing inputs
  • Symbolic execution preview: Connect crash to path constraints

9. Real-World Connections

9.1 Industry Applications

Security Research & Bug Bounties: Every serious vulnerability report requires crash analysis. The skills here directly apply to:

  • Chrome/Firefox security bugs
  • Linux kernel vulnerabilities
  • IoT device exploitation

Incident Response: When production systems crash:

  • Analyze core dumps from customer environments
  • Determine root cause without source access
  • Write post-mortems for engineering teams

Compiler Development: Understanding calling conventions is essential for:

  • Implementing new language features
  • Debugging code generation bugs
  • Optimizing function calls

Embedded Systems: Resource-constrained systems often lack full debuggers:

  • Must understand raw memory dumps
  • Debug without symbols
  • Analyze boot-time crashes

9.2 Related Tools

| Tool | Purpose |
|------|---------|
| AddressSanitizer (ASAN) | Runtime memory error detection |
| Valgrind | Memory debugging and profiling |
| GDB | Primary interactive debugger |
| LLDB | LLVM debugger (macOS default) |
| radare2 | Reverse engineering framework |
| Ghidra | NSA's reverse engineering tool |
| pwndbg/GEF | GDB plugins for exploitation |
| Crash | Linux kernel crash dump analysis |

9.3 Interview Relevance

This project prepares you for questions like:

  1. "Walk me through what happens when a function is called"
  2. "How would you debug a segfault in production?"
  3. "Explain how a stack buffer overflow works"
  4. "What's the difference between caller-saved and callee-saved registers?"
  5. "How do you read a core dump?"
  6. "What protections exist against buffer overflows?"

10. Resources

10.1 Essential Reading

  • CS:APP Chapter 3: "Machine-Level Representation of Programs"
    • Sections 3.4 (Accessing Information) and 3.7 (Procedures) are critical
  • System V AMD64 ABI Specification: The authoritative reference
  • Intel 64 and IA-32 Architectures Software Developer's Manual: Volume 1 covers calling conventions

10.2 GDB Resources

10.3 x86-64 Reference

10.4 Security Resources

  • CWE (Common Weakness Enumeration): Vulnerability classification
  • Smashing the Stack for Fun and Profit: Classic buffer overflow paper
  • ROP Emporium: Practice return-oriented programming
  • Previous: P2 (Bitwise Data Inspector), P3 (Data Lab Clone)
  • Next: P5 (Bomb Lab Workflow) - Apply debugging skills to reverse engineering
  • Related: P6 (Attack Lab) - Use crash analysis for exploitation

11. Self-Assessment Checklist

Before considering this project complete, verify:

Understanding

  • I can list the 6 argument registers in order
  • I can explain the difference between caller-saved and callee-saved registers
  • I can draw a complete stack frame from memory
  • I understand why the stack must be 16-byte aligned before call
  • I can explain what the red zone is and when it's used
  • I know how structs are laid out in memory including padding
  • I can identify function prologues and epilogues in assembly
  • I understand how loops and conditionals appear in assembly

Implementation

  • All 10 crash programs compile and crash as expected
  • GDB automation script captures complete crash information
  • At least 5 complete post-mortem reports written
  • Reports include accurate stack frame diagrams
  • Reports correctly identify vulnerability classes

Skill Demonstration

  • Given a crash dump, I can identify the faulting instruction
  • I can trace back from crash address to the source-level bug
  • I can recognize common vulnerability patterns from assembly
  • I can use GDB commands fluently without reference
  • I can explain crashes without guessing

Growth Indicators

  • I debugged at least one "mystery" crash using only machine state
  • I can read compiler-generated assembly and understand its purpose
  • I catch myself thinking about stack layout when writing C code
  • I understand why certain coding patterns are dangerous

12. Submission / Completion Criteria

Minimum Viable Completion:

  • At least 5 crash programs implemented
  • At least 3 complete post-mortem reports
  • GDB automation script works
  • Can explain stack frame layout accurately

Full Completion:

  • All 10 crash programs implemented
  • All 10 post-mortem reports complete
  • Pattern recognition exercises completed
  • Can analyze crashes without looking at source

Excellence (Going Above & Beyond):

  • Cross-platform (Linux + macOS) analysis
  • pwndbg/GEF integration
  • Automated vulnerability classification
  • Core dump analysis without live debugging
  • Extended to ARM64 calling convention

13. Real World Outcome

When you complete this project, here's what a finished report from your crash analysis toolkit can look like:

$ ./programs/01_stack_buffer_overflow $(python3 -c "print('A'*64)")
Copied: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)

$ ./scripts/crash_analyze.sh ./programs/01_stack_buffer_overflow

================================================================================
                    CRASH CART ANALYSIS REPORT
================================================================================

CRASH IDENTIFICATION
--------------------------------------------------------------------------------
  Program:       ./programs/01_stack_buffer_overflow
  Arguments:     AAAA...(64 bytes)
  Signal:        SIGSEGV (Segmentation fault)
  Fault address: 0x0000414141414141

REGISTER STATE AT CRASH
--------------------------------------------------------------------------------
  %rax = 0x0000000000000000   (return value: 0)
  %rbx = 0x00007fffffffdeb8   (callee-saved, preserved)
  %rcx = 0x0000000000000041   ('A' character, from strcpy)
  %rdx = 0x00007fffffffdf00   (string pointer)
  %rsi = 0x00007fffffffe200   (original source string)
  %rdi = 0x00007fffffffddc0   (destination buffer address)
  %rbp = 0x4141414141414141   [CORRUPTED! Contains "AAAAAAAA"]
  %rsp = 0x00007fffffffddf8   (valid stack pointer)
  %rip = 0x0000414141414141   [INVALID! Contains "AAAAAA"]

  Analysis: %rip and %rbp contain 0x41 bytes ('A'), indicating the return
  address and saved frame pointer were overwritten by the overflow.

FAULTING INSTRUCTION
--------------------------------------------------------------------------------
  Cannot disassemble at 0x0000414141414141 - address not mapped!

  What happened:
    The 'ret' instruction at end of vulnerable_function popped the
    overwritten return address (0x0000414141414141) into %rip.
    CPU then tried to fetch instruction at that invalid address.

STACK TRACE
--------------------------------------------------------------------------------
  Cannot unwind - frame pointer chain is corrupted.

  Last known good frame:
    vulnerable_function() at 01_stack_buffer_overflow.c:9
    Called from main() at 01_stack_buffer_overflow.c:17

STACK FRAME ANALYSIS
--------------------------------------------------------------------------------

  BEFORE OVERFLOW (intended layout):
  +------------------+---------------------------+
  | Address          | Content                   |
  +------------------+---------------------------+
  | rbp+8            | Return address -> main+42 |
  | rbp              | Saved %rbp -> 0x7fff...   |
  | rbp-32           | char buffer[32]           |
  +------------------+---------------------------+

  AFTER OVERFLOW (64 bytes written):
  +------------------+---------------------------+
  | Address          | Content                   |
  +------------------+---------------------------+
  | rbp+8            | 0x4141414141414141 (AAAA) | <- OVERWRITTEN!
  | rbp              | 0x4141414141414141 (AAAA) | <- OVERWRITTEN!
  | rbp-32           | AAAAAAAAAAAAAAAAAAA...    |
  +------------------+---------------------------+

  Stack bytes at crash:
  0x7fffffffddc0: 41 41 41 41 41 41 41 41  AAAAAAAA
  0x7fffffffddc8: 41 41 41 41 41 41 41 41  AAAAAAAA
  0x7fffffffddd0: 41 41 41 41 41 41 41 41  AAAAAAAA
  0x7fffffffddd8: 41 41 41 41 41 41 41 41  AAAAAAAA
  0x7fffffffdde0: 41 41 41 41 41 41 41 41  <- saved %rbp (corrupted)
  0x7fffffffdde8: 41 41 41 41 41 41 41 41  <- return address (corrupted)

ROOT CAUSE
--------------------------------------------------------------------------------
  The function vulnerable_function() uses strcpy() to copy user input
  into a 32-byte stack buffer without bounds checking.

  When the input exceeds 32 bytes:
    - Bytes 1-32:  Fill the buffer (intended)
    - Bytes 33-40: Overwrite saved %rbp
    - Bytes 41-48: Overwrite return address
    - Bytes 49+:   Overwrite caller's stack frame

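  Upon function return, 'ret' pops the corrupted return address from the
  stack. With 64 'A' bytes that value is 0x4141414141414141, which is a
  non-canonical address on x86-64, so the CPU typically faults on the
  'ret' itself and GDB shows %rip still pointing at it. With this frame
  layout, an input of exactly 46 bytes leaves the top two bytes of the
  return address zero and produces the %rip = 0x0000414141414141 shown
  above.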

VULNERABILITY CLASSIFICATION
--------------------------------------------------------------------------------
  CWE ID:      CWE-121 (Stack-based Buffer Overflow)
  CVSS:        High (can lead to arbitrary code execution)
  Exploitable: YES

  An attacker who controls the input can:
    1. Redirect execution to arbitrary address (by crafting return address)
    2. Execute shellcode (if placed on executable stack)
    3. Perform ROP attack (by chaining gadget addresses)

ASSEMBLY SIGNATURE
--------------------------------------------------------------------------------
  Vulnerable pattern in disassembly:

    4011a0:  lea    -0x20(%rbp),%rdi    # buffer at rbp-32 (only 32 bytes!)
    4011a4:  mov    %rax,%rsi           # user-controlled source (input pointer)
    4011a7:  call   401030 <strcpy@plt>  # UNBOUNDED COPY!

  Red flags:
    - No length check before strcpy
    - No __stack_chk_fail reference (no stack canary)
    - Buffer size (0x20 = 32) smaller than potential input

PREVENTION
--------------------------------------------------------------------------------
  1. Use strncpy() or strlcpy() with explicit size limit:
       strncpy(buffer, input, sizeof(buffer) - 1);
       buffer[sizeof(buffer) - 1] = '\0';

  2. Enable stack canaries: -fstack-protector-strong

  3. Use safe string functions (snprintf, strlcpy)

  4. Enable ASLR and NX/DEP (defense in depth)

  5. Use static analysis tools (cppcheck, Coverity)

================================================================================

14. The Core Question You're Answering

"When a program crashes, how do I read the machine state (registers, stack, memory) and trace backwards to understand exactly what went wrong and why?"

This project transforms debugging from "something happened and I don't know what" into a systematic forensic process. You'll learn to read crash dumps like a detective reads a crime scene: every register value, stack byte, and instruction tells part of the story.


15. Concepts You Must Understand First

Before starting this project, ensure you understand these concepts:

| Concept | Why It Matters | Where to Learn |
|---------|----------------|----------------|
| Function call mechanics (call/ret) | Core of stack frame understanding | CS:APP 3.7.2 |
| Stack grows downward | Essential for understanding buffer overflows | CS:APP 3.4.4 |
| What "push" and "pop" do | Stack manipulation | CS:APP 3.4.2 |
| Basic x86-64 registers | You'll interpret register dumps | CS:APP 3.4.1 |
| What a pointer is | Every address is a pointer | CS:APP 3.8 |
| How arrays work in C | Buffer overflows involve arrays | CS:APP 3.8.1 |
| Basic GDB usage (run, break, print) | Your primary tool | GDB manual |

16. Questions to Guide Your Design

Work through these questions BEFORE writing code:

  1. Crash Triggering: How do you reliably trigger each type of crash? What inputs or conditions are needed?

  2. GDB Automation: How do you script GDB to capture crash state non-interactively? What's the output format?

  3. Stack Walking: How do you walk the stack frame chain? What if %rbp is corrupted?

  4. Symbol Resolution: How do you map an address (like 0x401234) back to a function name and source line?

  5. Report Format: What information is essential in a post-mortem? What's nice-to-have?

  6. Cross-Platform: Will your toolkit work on both Linux (GDB) and macOS (LLDB)? How do they differ?

  7. Compiler Options: What compiler flags affect crash behavior? (-g, -O0, -fno-stack-protector, -fno-pie)


17. Thinking Exercise

Before writing any code, work through this crash analysis by hand:

Given this register dump from a crash:

rax    0x0
rbx    0x7fffffffdeb8
rcx    0x40
rdx    0x7fffffffde00
rsi    0x7fffffffe200
rdi    0x0
rbp    0x7fffffffdde0
rsp    0x7fffffffddc0
rip    0x401234

And this stack contents:

0x7fffffffddc0: 0x0000000000000001
0x7fffffffddc8: 0x00007fffffffe200
0x7fffffffddd0: 0x0000000000000000
0x7fffffffddd8: 0x0000000000000040
0x7fffffffdde0: 0x00007fffffffde00  <- %rbp points here
0x7fffffffdde8: 0x00000000004011a0  <- return address

And the faulting instruction:

0x401234: movl (%rdi), %eax

Questions to answer:

  1. What operation did the instruction try to perform?
    • movl (%rdi), %eax means: load 4 bytes from address in %rdi into %eax
  2. What was in %rdi?
    • %rdi = 0x0 (NULL pointer!)
  3. Why did this crash?
    • Attempted to read from address 0x0, which is not mapped
  4. What type of bug is this?
    • NULL pointer dereference
  5. What was the function probably trying to do?
    • Access the first element of a struct or array passed via %rdi (first argument)
  6. What's at the return address (0x4011a0)?
    • The caller. Check the disassembly to see which function called this one

18. The Interview Questions They'll Ask

After completing this project, you'll be ready for these common interview questions:

  1. "Walk me through what happens when a function is called."
    • Expected:
      1. Arguments placed in %rdi, %rsi, %rdx, %rcx, %r8, %r9 (or stack)
      2. call pushes return address and jumps
      3. Callee pushes %rbp, sets up new frame
      4. Local variables allocated on stack
    • Bonus: Mention 16-byte alignment requirement, red zone
  2. "How would you debug a segfault in a production system?"
    • Expected: Analyze core dump with GDB, check registers and stack, use bt for backtrace
    • Bonus: Mention AddressSanitizer, Valgrind for memory issues
  3. "Explain how a stack buffer overflow works."
    • Expected: Writing past buffer end overwrites saved %rbp and return address
    • Bonus: Explain how this leads to control flow hijacking, mention mitigations (canaries, ASLR)
  4. "What's the difference between caller-saved and callee-saved registers?"
    • Expected: Caller-saved (%rax, %rcx, %rdx, %rsi, %rdi, %r8-r11) may be trashed by callee; callee-saved (%rbx, %rbp, %r12-r15) must be preserved
    • Bonus: Explain why this matters for optimization
  5. "How do you read a core dump?"
    • Expected: Use GDB with gdb program core, examine registers, stack, memory
    • Bonus: Know how to enable core dumps (ulimit -c unlimited)
  6. "What protections exist against buffer overflows?"
    • Expected: Stack canaries, ASLR, NX/DEP, PIE
    • Bonus: Explain how each works and can be bypassed
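For the first question, it helps to have a concrete function in mind. The sketch below uses a hypothetical `sum8` with eight integer arguments, so that two of them spill to the stack. The register and stack annotations follow the System V AMD64 ABI; `__attribute__((noinline))` is a GCC/Clang extension used to keep a real `call` in the caller.

```c
/* Hypothetical eight-argument function, used only to show where the
 * System V AMD64 ABI places integer arguments. */
__attribute__((noinline))
long sum8(long a,   /* 1st: %rdi */
          long b,   /* 2nd: %rsi */
          long c,   /* 3rd: %rdx */
          long d,   /* 4th: %rcx */
          long e,   /* 5th: %r8  */
          long f,   /* 6th: %r9  */
          long g,   /* 7th: stack, at 8(%rsp) on entry  */
          long h)   /* 8th: stack, at 16(%rsp) on entry */
{
    /* On entry, (%rsp) holds the return address pushed by call, so the
     * stack-passed arguments start one 8-byte slot above it. */
    return a + b + c + d + e + f + g + h;   /* result is returned in %rax */
}
```

Compiling a caller with `gcc -O1 -S` and reading the instructions before `call sum8` should show six register moves and two stack stores or pushes; the exact instruction selection varies by compiler and version.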

19. Hints in Layers

If you're stuck, reveal hints one at a time:

Hint 1: GDB Batch Mode

You can script GDB to run non-interactively:

gdb -batch -x commands.txt ./program

Where commands.txt contains:

set pagination off
run
info registers
bt
x/20gx $rsp
quit

This captures crash state without user interaction.

Hint 2: Disabling Security Features for Learning

To see "classic" crashes without modern protections:

# Compile without stack protector
gcc -fno-stack-protector program.c

# Disable ASLR (Linux, temporarily)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

# Compile without PIE
gcc -no-pie -fno-pie program.c

# Enable core dumps
ulimit -c unlimited

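Remember to re-enable protections afterwards: write 2 (the usual Linux default) back to /proc/sys/kernel/randomize_va_space, or reboot, since a value written this way does not persist.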
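A deliberately vulnerable target to build with these flags might look like the sketch below. The names (`copy_name_unsafe`, `copy_name_bounded`, `NAME_BUF_SIZE`) are hypothetical. The unsafe version is meant for study at `-O0`; at higher optimization levels the compiler may restructure or remove the copy, so the classic crash is not guaranteed.

```c
#include <stdio.h>
#include <string.h>

enum { NAME_BUF_SIZE = 16 };   /* hypothetical buffer size for the demo */

/* Unsafe on purpose: strcpy does no bounds check, so an input of
 * NAME_BUF_SIZE bytes or more writes past buf, toward the saved %rbp
 * and the return address. Only call it with long input in a lab setup. */
__attribute__((noinline))
size_t copy_name_unsafe(const char *name)
{
    char buf[NAME_BUF_SIZE];
    strcpy(buf, name);
    return strlen(buf);
}

/* Bounded variant: snprintf truncates to fit and always NUL-terminates. */
size_t copy_name_bounded(const char *name)
{
    char buf[NAME_BUF_SIZE];
    snprintf(buf, sizeof buf, "%s", name);
    return strlen(buf);
}
```

Built with `-fno-stack-protector`, a long input to `copy_name_unsafe` usually crashes on `ret` with a corrupted return address. With the stack protector left on, the same input is normally caught by the canary check and the process aborts instead.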

Hint 3: Reading the Stack

In GDB, examine stack memory:

x/20gx $rsp    # 20 8-byte values starting at stack pointer
x/20wx $rsp    # 20 4-byte values
x/s $rdi       # String at address in %rdi
x/10i $rip     # 10 instructions starting at instruction pointer

Key pattern: when a frame pointer is used, the saved %rbp is at $rbp and the return address is at $rbp + 8, so x/2gx $rbp shows both.
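The same layout can be checked from inside a program. This sketch relies on two GCC/Clang builtins, `__builtin_frame_address` and `__builtin_return_address`, and assumes an x86-64 target with the conventional `push %rbp; mov %rsp, %rbp` prologue. The function name is hypothetical.

```c
/* Returns 1 when the 8-byte slot just above the frame pointer holds this
 * function's return address, i.e. when the "$rbp + 8" rule applies. */
__attribute__((noinline))
int return_address_is_above_frame_pointer(void)
{
    /* frame[0] is the caller's saved %rbp, frame[1] the return address. */
    void **frame = (void **)__builtin_frame_address(0);
    void *ret = __builtin_return_address(0);
    return frame[1] == ret;
}
```

Breaking inside this function in GDB and running `x/2gx $rbp` should show the same two values that `frame[0]` and `frame[1]` read.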

Hint 4: Identifying Crash Types

Quick identification guide:

| Symptom | Likely Cause |
|---------|--------------|
| %rip = user-controlled bytes | Buffer overflow corrupted return address |
| %rip valid, accessing %rdi=0 | NULL pointer dereference |
| Crash in free() | Double free or use-after-free |
| Stack trace very deep | Infinite recursion |
| SIGFPE (not SIGSEGV) | Division by zero |
| Random crash location | Use-after-free, uninitialized pointer |
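The first triage step, telling the fatal signal apart, can be automated with a small harness. This is a minimal sketch assuming a POSIX system: it runs a test function in a child process and reports the signal that killed it. All names (`fatal_signal_of`, `likely_cause`, and the sample functions) are hypothetical, and the cause strings are rough hints drawn from the symptoms above, not a diagnosis.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run fn in a child process. Returns the fatal signal number, 0 if the
 * child exited normally, or -1 if fork or waitpid failed. */
int fatal_signal_of(void (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0);               /* reached only if fn did not crash */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Rough first guess at the cause; always confirm with registers and stack. */
const char *likely_cause(int sig)
{
    switch (sig) {
    case 0:       return "no crash";
    case SIGSEGV: return "invalid memory access (NULL or stale pointer, "
                         "corrupted return address, stack exhaustion)";
    case SIGFPE:  return "integer division by zero";
    case SIGABRT: return "abort(), often a failed allocator or "
                         "stack-protector check";
    default:      return "unclassified";
    }
}

/* Sample crash functions for the harness. */
void exits_cleanly(void) { }

void calls_abort(void) { abort(); }

void divide_by_zero(void)
{
    volatile int zero = 0;
    volatile int result = 1 / zero;   /* on x86-64, idiv traps: SIGFPE */
    (void)result;
}
```

Calling `likely_cause(fatal_signal_of(divide_by_zero))` on x86-64 Linux should point at the division, after which the GDB commands from the earlier hints fill in the details.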


20. Books That Will Help

| Topic | Book | Chapter/Section |
|-------|------|-----------------|
| x86-64 registers | CS:APP 3rd Ed | Section 3.4 "Accessing Information" |
| Stack frames | CS:APP 3rd Ed | Section 3.7 "Procedures" |
| Calling conventions | CS:APP 3rd Ed | Section 3.7.3 "Data Transfer" |
| Local variables on stack | CS:APP 3rd Ed | Section 3.7.4 "Local Storage on the Stack" |
| Register conventions | CS:APP 3rd Ed | Section 3.7.5 "Local Storage in Registers" |
| Arrays and structs | CS:APP 3rd Ed | Sections 3.8-3.9 "Array Allocation and Access", "Heterogeneous Data Structures" |
| Buffer overflow attacks | CS:APP 3rd Ed | Sections 3.10.3-3.10.4 "Out-of-Bounds Memory References and Buffer Overflow", "Thwarting Buffer Overflow Attacks" |
| Control flow patterns | CS:APP 3rd Ed | Section 3.6 "Control" |
| System V AMD64 ABI | Official ABI Document | Section 3.2 "Function Calling Sequence" |
| GDB Manual | GNU GDB Documentation | Chapters on examining data, stack |
| Practical debugging | "Debugging with GDB" | By Richard Stallman et al. |
| Security analysis | "Hacking: The Art of Exploitation" | By Jon Erickson, Ch. 2-3 |

This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.