Project 6: Crackme Challenges

Expanded deep-dive guide for Project 6 from the Binary Analysis sprint.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 2-4 weeks |
| Main Programming Language | Assembly analysis, Python for keygens |
| Alternative Programming Languages | Any |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | 1. The “Resume Gold” |
| Knowledge Area | Reverse Engineering / Password Bypass |
| Software or Tool | Ghidra, GDB, crackmes.one |
| Main Book | “Reversing: Secrets of Reverse Engineering” by Eldad Eilam |

1. Learning Objectives

Build a working implementation with reproducible outputs.
Justify key design choices with binary-analysis principles.
Produce an evidence-backed report of findings and limitations.
Document hardening or next-step improvements.

2. All Theory Needed (Per-Concept Breakdown)

This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.

3. Project Specification

3.1 What You Will Build

Solve 10+ crackme challenges of increasing difficulty, learning patching, keygen writing, and anti-debugging bypass.

3.2 Functional Requirements

Accept the target binary/input and validate format assumptions.
Produce analyzable outputs (console report and/or artifacts).
Handle malformed inputs safely with explicit errors.

3.3 Non-Functional Requirements

Reproducibility: same input should produce equivalent findings.
Safety: unknown samples run only in isolated lab contexts.
Clarity: separate facts, hypotheses, and inferred conclusions.

3.4 Expanded Project Brief

File: P06-crackme-challenges.md

What you’ll build: Solve 10+ crackme challenges of increasing difficulty, learning patching, keygen writing, and anti-debugging bypass.

Why it teaches binary analysis: Crackmes are purpose-built learning tools. They teach you to find and understand password checks, then bypass them.

Core challenges you’ll face:

Finding the check → maps to string references, control flow
Understanding the algorithm → maps to decompilation, debugging
Patching vs keygen → maps to two approaches to bypass
Anti-debugging → maps to detection evasion

Resources for key challenges:

crackmes.one - Download challenges
crackme.re walkthroughs - Detailed solutions
Ghidra Crackme Tutorial

Key Concepts:

Patching: Tutorial #10 - The Levels of Patching
Keygen Writing: “Reversing” Ch. 5 - Eilam
Anti-Debugging Bypass: OpenRCE Anti-Reversing Database

Difficulty: Intermediate
Time estimate: 2-4 weeks
Prerequisites: Projects 4-5 (GDB, Ghidra)

Real World Outcome

Deliverables:

Analysis output or tooling scripts
Report with control/data flow notes

Validation checklist:

Parses sample binaries correctly
Findings are reproducible in debugger
No unsafe execution outside lab

```bash
# Approach 1: Patching
$ ./crackme
Enter password: wrong
Access Denied!

# Found the check: JNE (jump if not equal) to fail
# Patch JNE to JE (or NOP it out)
# -g 1 prints one byte per group so the two-byte pattern can match
$ xxd -g 1 crackme | grep "75 28"
00001234: 75 28    # JNE +0x28 (file offset 0x1234 = 4660)
$ printf '\x90\x90' | dd of=crackme bs=1 seek=4660 conv=notrunc
$ ./crackme
Enter password: anything
Access Granted!

# Approach 2: Keygen
# Found algorithm: password = (username XOR 0x55) + 0x1337
$ python3 keygen.py "admin"
Valid password for 'admin': 0xAB12CD34
```

#### Hints in Layers
Systematic approach:
1. Run the binary to understand expected behavior
2. Find strings ("Enter password", "Access Denied")
3. Find cross-references to those strings
4. Trace backwards to find the comparison
5. Understand what makes it pass
6. Either patch the jump or write a keygen

Patching levels:
1. **LAME**: NOP out the check entirely
2. **Better**: Invert the jump condition
3. **Good**: Patch the comparison to always succeed
4. **Best**: Understand algorithm, write keygen

Questions:
- What's the difference between `JE` and `JNE`?
- How do you find the password comparison in decompiled code?
- What are common string comparison functions?

**Learning milestones**:
1. **Solve easy crackmes** → Find obvious password checks
2. **Understand algorithms** → XOR, hashing, encoding
3. **Write keygens** → Reverse the algorithm
4. **Bypass protections** → Handle obfuscation

#### The Core Question You Are Answering

**How do you systematically reverse engineer authentication mechanisms, understand their underlying algorithms, and create tools to bypass or generate valid credentials—all without source code?**

This project teaches the complete reverse engineering workflow: from initial binary exploration to algorithm extraction to automated solution generation. You'll learn both the "quick and dirty" approach (patching) and the "deep understanding" approach (keygen writing).

#### Concepts You Must Understand First

##### 1. String References and Cross-References
Most crackmes leave clues in strings ("Correct!", "Wrong password"). Learning to trace from strings to code is your first reverse engineering skill.

**Guiding Questions**:
- Why do string references often lead directly to validation logic?
- How do you distinguish between format strings and actual password strings?
- What happens when strings are obfuscated or encrypted at runtime?

**Book Reference**: "Practical Binary Analysis" Ch. 5.4 - Finding Main Manually
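The string-to-code trail starts with locating the strings and their file offsets. The following is a minimal sketch, assuming printable ASCII strings of at least four bytes; `find_strings` is a hypothetical helper written for illustration, not part of Ghidra or any other tool:

```python
import re

def find_strings(data: bytes, min_len: int = 4):
    """Yield (file_offset, text) for each run of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Illustrative in-memory sample; a real run would read the crackme's bytes.
sample = b"\x00\x01Enter password\x00\xff\x02abc\x00Access Denied!\x00"
for offset, text in find_strings(sample):
    print(f"{offset:#06x}  {text}")
```

The offsets it reports are file offsets, so they still need to be mapped to virtual addresses before you look for cross-references in a disassembler.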

##### 2. Comparison Operations in Assembly
Password checks ultimately boil down to comparisons: `cmp`, `test`, `sub` followed by conditional jumps. Recognizing these patterns is essential.

**Guiding Questions**:
- What's the difference between `cmp rax, rbx` and `test rax, rax`?
- How do `je`, `jne`, `jz`, `jnz` relate to the zero flag?
- How does `cmp` differ from `sub`, given that both set the flags the same way? (Hint: only `sub` stores the result.)

**Book Reference**: "Low-Level Programming" Ch. 5 - Arithmetic and Logical Operations
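To make the zero-flag behavior concrete, here is a small model of how `cmp` and `test` decide ZF. It is a sketch that covers only the zero flag for 64-bit operands; the function names are illustrative and the other flags (CF, SF, OF) are deliberately ignored:

```python
MASK64 = (1 << 64) - 1

def zf_after_cmp(a: int, b: int) -> bool:
    """cmp a, b computes a - b, discards the result, and sets ZF if it is zero."""
    return ((a - b) & MASK64) == 0

def zf_after_test(a: int, b: int) -> bool:
    """test a, b computes a & b, discards the result, and sets ZF if it is zero."""
    return ((a & b) & MASK64) == 0

# je/jz are taken when ZF is set; jne/jnz are taken when ZF is clear.
print(zf_after_cmp(5, 5))    # cmp with equal operands
print(zf_after_test(0, 0))   # test rax, rax with rax == 0
print(zf_after_test(7, 7))   # test rax, rax with rax != 0
```

Because `je`/`jz` and `jne`/`jnz` are two spellings of the same opcodes, the only question at a branch is whether the preceding instruction left ZF set.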

##### 3. Control Flow Manipulation (Patching)
The simplest bypass is changing a conditional jump (`je` → `jne`) or removing checks entirely (NOP padding).

**Guiding Questions**:
- What's the opcode for `jne` vs `je`, and how do you swap them?
- Why is NOPing (0x90) preferred over zeroing bytes?
- How do you ensure patch size matches original instruction size?

**Book Reference**: "Hacking: The Art of Exploitation" Ch. 3 - Exploitation
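The two basic patches can be sketched on an in-memory copy of the binary. This assumes the short (one-byte displacement) jump encodings, where `je` is 0x74 and `jne` is 0x75; the helper names and the sample bytes are illustrative:

```python
JE_SHORT, JNE_SHORT = 0x74, 0x75   # short-form opcodes; they differ only in bit 0
NOP = 0x90

def invert_short_jump(image: bytearray, offset: int) -> None:
    """Flip je <-> jne in place, refusing to touch any other byte."""
    opcode = image[offset]
    if opcode not in (JE_SHORT, JNE_SHORT):
        raise ValueError(f"byte {opcode:#04x} at {offset:#x} is not a short je/jne")
    image[offset] = opcode ^ 0x01

def nop_out(image: bytearray, offset: int, length: int) -> None:
    """Overwrite `length` bytes with NOPs so later instructions keep their offsets."""
    image[offset:offset + length] = bytes([NOP]) * length

# Illustrative bytes: test eax, eax (85 C0) followed by jne +0x28 (75 28).
image = bytearray(b"\x85\xc0\x75\x28")
invert_short_jump(image, 2)
print(image.hex(" "))
```

Checking the original opcode before writing is a cheap guard against patching the wrong offset, and replacing bytes one-for-one keeps the file size and every later instruction address unchanged. In practice you would read the file into a `bytearray`, patch it, and write the result to a new file so the original stays intact.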

##### 4. Common Validation Algorithms
Crackmes use predictable patterns: XOR encoding, simple hashing (MD5/SHA), base64, character manipulation.

**Guiding Questions**:
- How do you recognize XOR in assembly (repeated `xor` with constants)?
- What does a SHA256 implementation look like in decompiled code?
- How do you distinguish encryption from simple obfuscation?

**Book Reference**: "Reversing: Secrets of Reverse Engineering" Ch. 5 - Applied Reverse Engineering
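XOR with a fixed constant is its own inverse, which is why the same routine can serve as both encoder and decoder. A minimal sketch, assuming a one-byte key; `xor_bytes` and the sample key are illustrative and not taken from any particular crackme:

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key."""
    return bytes(b ^ key for b in data)

encoded = xor_bytes(b"secret", 0x42)   # what a crackme might store in its data section
decoded = xor_bytes(encoded, 0x42)     # applying the same operation recovers the input
print(encoded.hex(), decoded)
```

If a crackme XORs your input with a constant and compares the result against a stored blob, XORing that blob with the same constant yields the expected input directly.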

##### 5. Keygen Development
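Once you understand the algorithm, you invert it: if validation checks `transform(input) == stored_value` and the transform is invertible, your keygen computes `input = inverse_transform(stored_value)`. When the transform is a one-way hash there is no inverse to compute, so the options narrow to searching for a matching input or patching the comparison.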

**Guiding Questions**:
- What algorithms are reversible (XOR, Caesar cipher) vs irreversible (SHA256)?
- How do you handle one-way hashes (hint: you can't reverse them)?
- When is it easier to brute force than to write a perfect keygen?

**Book Reference**: "Reversing: Secrets of Reverse Engineering" Ch. 5
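When the check is a one-way hash, searching the input space is the fallback. The sketch below assumes the crackme compares a SHA-256 hex digest and that the secret is short and lowercase; both are assumptions for illustration, and `brute_force_sha256` is a hypothetical helper:

```python
import hashlib
import itertools
import string

def brute_force_sha256(target_hex, alphabet=string.ascii_lowercase, max_len=4):
    """Try every candidate up to max_len characters; return the match or None."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.sha256(candidate.encode()).hexdigest() == target_hex:
                return candidate
    return None

# Stand-in for a digest recovered from the binary.
target = hashlib.sha256(b"key").hexdigest()
print(brute_force_sha256(target))
```

The search space grows as `len(alphabet) ** length`, so this is only practical for short secrets; beyond that, patching the comparison is usually the faster route.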

##### 6. Anti-Debugging Basics
Some crackmes detect debuggers using `ptrace` (Linux), timing checks, or `IsDebuggerPresent()` (Windows). You'll need to recognize and bypass these.

**Guiding Questions**:
- How does the `ptrace(PTRACE_TRACEME)` trick detect debuggers?
- What's a timing-based anti-debug check and how do you defeat it?
- Why do debuggers change program behavior even without breakpoints?

**Book Reference**: "Practical Malware Analysis" Ch. 15 - Anti-Debugging
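The logic of a timing check can be modeled in a few lines. This is a sketch of the idea only: real crackmes typically read a hardware or OS clock from native code, and the function name, threshold, and injectable `clock` parameter here are assumptions made for illustration:

```python
import time

def looks_debugged(work, threshold_s=0.5, clock=time.perf_counter):
    """Flag a debugger if `work` takes longer than the threshold to run."""
    start = clock()
    work()   # single-stepping or a breakpoint inside here inflates the elapsed time
    return (clock() - start) > threshold_s

def fake_clock(ticks):
    """Return a clock that replays fixed readings, standing in for a hooked time source."""
    readings = iter(ticks)
    return lambda: next(readings)

print(looks_debugged(lambda: None, clock=fake_clock([0.0, 2.0])))    # slow path
print(looks_debugged(lambda: None, clock=fake_clock([0.0, 0.001])))  # fast path
```

The injectable clock mirrors one way of defeating the check: if you control what the time source returns, the measured delta never crosses the threshold. Patching the comparison that follows the measurement is the other common option.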

##### 7. Binary Patching Tools and Techniques
You'll need to modify binaries with hex editors, `dd`, or specialized tools like `radare2` or Binary Ninja.

**Guiding Questions**:
- How do you find the file offset of a memory address in an ELF/PE binary?
- What's the difference between patching in-memory vs on-disk?
- How do you verify your patch didn't corrupt the binary?

**Book Reference**: "Practical Binary Analysis" Ch. 7 - Simple Code Injection
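For ELF files, the address-to-offset mapping comes from the `PT_LOAD` program headers: `file_offset = vaddr - p_vaddr + p_offset`. The sketch below assumes a little-endian ELF64 image and an address expressed as it appears in the file (not shifted by a runtime load base); `vaddr_to_file_offset` is a hypothetical helper:

```python
import struct

PT_LOAD = 1

def vaddr_to_file_offset(elf: bytes, vaddr: int) -> int:
    """Map a virtual address to a file offset using the PT_LOAD program headers."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        # ELF64 program header: type, flags, offset, vaddr, paddr, filesz, ...
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", elf, base
        )
        if p_type == PT_LOAD and p_vaddr <= vaddr < p_vaddr + p_filesz:
            return vaddr - p_vaddr + p_offset
    raise ValueError(f"{vaddr:#x} is not backed by file data in any PT_LOAD segment")
```

Disassemblers may display addresses relative to their own chosen image base for position-independent binaries, so check which base your tool uses before feeding an address into a helper like this.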

##### 8. Input Validation and User Input Flow
Understanding where user input enters (stdin, argv, environment variables) and how it's processed helps you trace to the validation logic.

**Guiding Questions**:
- How do you identify `scanf`, `fgets`, or `read` calls in disassembly?
- Where does command-line input (`argv`) appear in the program state?
- How do you trace tainted input through the program?

**Book Reference**: "Computer Systems: A Programmer's Perspective" Ch. 8.4 - Process Control

#### Questions to Guide Your Design

1. **Given a crackme that accepts a serial number, what's your systematic process to find the validation function?** Consider strings, imports, control flow, and data flow.

2. **When is patching preferable to writing a keygen, and vice versa?** Think about time investment, learning value, and reusability.

3. **How would you approach a crackme that generates a unique serial for each user's machine (HWID-based)?** Consider what machine identifiers it might use (MAC address, disk serial, CPU ID).

4. **What strategies help when the password check is heavily obfuscated (no strings, indirect jumps)?** Think about dynamic analysis, symbolic execution, and emulation.

5. **How do you build a test suite for your keygen to ensure it works for all inputs?** Consider edge cases, random testing, and comparing against the original binary.

6. **When a crackme uses a cryptographic hash (SHA256), what are your options since you can't reverse it?** Think about rainbow tables, brute force, or patching the comparison.

7. **How would you document your reverse engineering process so others can learn from your analysis?** Consider annotated disassembly, step-by-step walkthroughs, and algorithm explanations.

8. **What ethical and legal considerations apply to cracking software, even in a learning context?** Think about responsible disclosure, CTF vs commercial software, and intent.

#### Thinking Exercise

**Before attempting any crackmes, complete this exercise**:

1. **Manual Algorithm Reversal**: Here's a simple validation function in C:
   ```c
   #include <string.h>

   int validate(char *input) {
       int sum = 0;
       for (int i = 0; i < strlen(input); i++) {
           sum += input[i] ^ 0x42;   /* XOR each byte with 0x42, then accumulate */
       }
       return sum == 0x1337;         /* valid only when the total is 0x1337 */
   }
   ```

- Compile it (without optimization: gcc -O0)
- Disassemble it with objdump or load it in Ghidra
- Identify the loop structure in assembly
- Find the XOR operation and the constant 0x42
- Find the final comparison with 0x1337
- Write a keygen in Python that generates valid inputs
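One possible shape for the keygen in the last step, sketched under the assumption that the serial should contain only printable, non-space ASCII. `validate` is a Python port of the C function, and the greedy strategy is just one of many ways to reach the target sum:

```python
TARGET = 0x1337
XOR_KEY = 0x42

def validate(text: str) -> bool:
    """Python port of the C validate(): sum of (byte ^ 0x42) must equal 0x1337."""
    return sum(ord(ch) ^ XOR_KEY for ch in text) == TARGET

def keygen(target: int = TARGET) -> str:
    # Map each reachable contribution (c ^ 0x42) to its printable character.
    by_value = {c ^ XOR_KEY: chr(c) for c in range(0x21, 0x7F)}
    by_value.pop(0, None)   # a zero contribution would never make progress
    out, remaining = [], target
    while remaining > 0:
        # Greedy: take the largest contribution that still fits.
        value = max(v for v in by_value if v <= remaining)
        out.append(by_value[value])
        remaining -= value
    return "".join(out)

serial = keygen()
print(serial, validate(serial))
```

The loop always terminates because a contribution of 1 is available (`'C' ^ 0x42`). The generated serial may contain characters that need quoting in a shell, so quote it when passing it to the compiled binary.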

2. **Patch Practice**: Create a simple password checker:
```
#include <stdio.h>
#include <string.h>
int main() {
    char pass[32];
    printf("Password: ");
    scanf("%31s", pass);  /* width limit keeps input inside pass[32] */
    if (strcmp(pass, "secret") == 0) {
        printf("Correct!\n");
    } else {
        printf("Wrong!\n");
    }
}
```
- Compile it
- Find the strcmp call in assembly (use objdump -d or Ghidra)
- Note the conditional jump after the comparison
- Patch the binary three ways:
  - Method 1: Change jne to je (swap success/failure)
  - Method 2: NOP out the entire check
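  - Method 3: Change the comparison to cmp eax, eax (always equal; strcmp returns a 32-bit int in eax, and this encoding is the same 2-byte length as the test eax, eax that compilers typically emit here)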
- Verify each patch works

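3. **Trace User Input**: Take this program:
```c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    if (argc != 2) return 1;
    int key = atoi(argv[1]);
    key = (key * 13) + 37;
    key ^= 0xDEADBEEF;
    if (key == 0x12345678) {
        printf("Win!\n");
    }
}
```
- Trace argv[1] through each transformation
- Write the mathematical inverse: key = ((target ^ 0xDEADBEEF) - 37) / 13. Plain division is exact only when the numerator is a multiple of 13; because 32-bit arithmetic wraps around, the general form multiplies by the modular inverse of 13 modulo 2^32 instead
- Implement in Python and find the winning input
- Verify by running the original binary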
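The inverse can be sketched with modular arithmetic. This assumes the compiled program wraps on signed overflow (typical for an unoptimized x86-64 build, although the C standard leaves signed overflow undefined); `check` and `solve` are illustrative names:

```python
MASK32 = 0xFFFFFFFF
TARGET = 0x12345678

def check(key: int) -> bool:
    """Emulate the C program's transformations with 32-bit wraparound."""
    key = (key * 13 + 37) & MASK32
    key ^= 0xDEADBEEF
    return key == TARGET

def solve(target: int = TARGET) -> int:
    inv13 = pow(13, -1, 1 << 32)   # modular inverse of 13 mod 2**32 (Python 3.8+)
    key = (((target ^ 0xDEADBEEF) - 37) * inv13) & MASK32
    # atoi returns a signed int, so express the key as a signed 32-bit value.
    return key - (1 << 32) if key >= (1 << 31) else key

key = solve()
print(key, check(key))
```

Multiplication by 13 is invertible modulo 2^32 because 13 is odd; an even multiplier would lose information and could leave zero or several valid keys.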

4. **Anti-Debug Detection**: Create a program with ptrace anti-debugging:
```
#include <sys/ptrace.h>
#include <stdio.h>
int main() {
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1) {
        printf("Debugger detected!\n");
        return 1;
    }
    printf("Not debugging\n");
    // rest of program
}
```
- Try running it under GDB (it will detect the debugger)
- Bypass it by:
  - Method 1: Patching the ptrace call to always return 0
  - Method 2: Setting a breakpoint before ptrace and changing the return value
  - Method 3: Using LD_PRELOAD to hook ptrace

The Interview Questions They’ll Ask

“Walk me through your methodology for solving an unknown crackme from start to finish.” Expected: Run it → check strings → find validation → understand algorithm → patch or keygen → verify success.
“What’s the difference between je and jne at the opcode level, and how would you patch one to the other?” Expected: In the short form, je is 0x74 and jne is 0x75 (near forms: 0x0F 0x84 and 0x0F 0x85), so they differ by one bit. Patch by changing the opcode byte at that file offset.
“You find this assembly: xor eax, eax; test eax, eax; je 0x401234. What’s happening and is there a shortcut?” Expected: xor eax, eax zeroes eax, test sets zero flag, je always jumps. Shortcut: jmp 0x401234.
“How would you approach a crackme that checks username AND serial number together (no valid serial without the right username)?” Expected: Trace both inputs, find where they’re combined (concatenation, XOR), understand the relationship, write a keygen that takes username as input.
“Explain three different patching strategies and when you’d use each.” Expected: (1) Invert jump—quick but obvious; (2) NOP the check—clean; (3) Change comparison target—stealthy. Use based on goals (speed vs stealth).
“A crackme uses MD5(serial) == ‘abc123…’. Can you write a keygen? What are your options?” Expected: Can’t reverse MD5. Options: brute force (if short), rainbow table lookup, or patch the comparison.
“How do you identify a validation loop (character-by-character check) in disassembly?” Expected: Look for loop structures (counter increment, conditional jump back), array indexing, character-wise operations.
“What’s the ‘cyclic pattern’ technique and how is it useful in crackmes?” Expected: Generates unique substrings to identify buffer positions. Useful for finding offset to critical data in password buffers.
“You’ve reversed the algorithm but your keygen produces ‘valid’ serials that the program rejects. What went wrong?” Expected: Likely issues: integer overflow, endianness, off-by-one errors, missing constraints (e.g., serial must be printable ASCII).
“Describe the legal and ethical boundaries of reverse engineering copy protection.” Expected: Crackmes published for CTF or educational use are meant to be reversed. Commercial software varies by jurisdiction and license terms (DMCA, EU directives). Intent matters. Always use isolated VMs.

Books That Will Help

| Topic | Book | Chapter/Section |
|---|---|---|
| Reverse Engineering Fundamentals | “Reversing: Secrets of Reverse Engineering” | Ch. 1-3 (Foundations, RE Process) |
| Applied Crackme Solving | “Reversing: Secrets of Reverse Engineering” | Ch. 5 (Applied RE) |
| x86/x64 Comparison Operations | “Low-Level Programming” | Ch. 5.3 (Conditional Jumps) |
| Control Flow in Assembly | “Low-Level Programming” | Ch. 6 (Control Flow) |
| String Analysis | “Practical Binary Analysis” | Ch. 5.4 (Finding Functions) |
| Binary Patching Techniques | “Practical Binary Analysis” | Ch. 7 (Code Injection) |
| Debugger Usage (GDB) | “Hacking: The Art of Exploitation” | Ch. 2 (Programming) |
| Anti-Debugging Techniques | “Practical Malware Analysis” | Ch. 15 (Anti-Debugging) |
| Common Crypto Algorithms | “Serious Cryptography” | Ch. 1-6 (Hashing, Encryption) |
| Assembly Language Basics | “Computer Systems: A Programmer’s Perspective” | Ch. 3 (Machine-Level Representation) |
| Stack and Calling Conventions | “Computer Systems: A Programmer’s Perspective” | Ch. 3.7 (Procedures) |
| Tool Usage (Ghidra) | “Ghidra Software Reverse Engineering for Beginners” | Ch. 4-6 (Analysis Features) |
| Input Tracing | “Computer Systems: A Programmer’s Perspective” | Ch. 8.4 (Process Control) |
| Opcode Reference | “Low-Level Programming” | Appendix A (x86-64 Instruction Reference) |
| Hex Editing and Binary Structure | “Practical Binary Analysis” | Ch. 2 (Binary Formats) |

Common Pitfalls and Debugging

Problem 1: “Your interpretation does not match runtime behavior”

Why: Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
Fix: Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
Quick test: Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.

Problem 2: “Tool output is inconsistent across machines”

Why: ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
Fix: Pin tool versions, capture checksec/metadata, and document environment assumptions in your report.
Quick test: Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
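Comparing hashes of generated outputs can be as small as the following sketch. The helper name and chunk size are illustrative; any stable digest would serve the same purpose:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest of both the input binary and each report next to the tool versions makes it straightforward to tell whether a difference between machines comes from the sample, the tools, or the environment.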

Problem 3: “Analysis accidentally executes unsafe code”

Why: Dynamic workflows run binaries in host context without sufficient isolation.
Fix: Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
Quick test: Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.

Definition of Done

Core functionality works on reference inputs
Edge cases are tested and documented
Results are reproducible (same binary, same tools, same report output)
Analysis notes clearly separate observations, assumptions, and conclusions
Lab safety controls were applied for any dynamic execution

4. Solution Architecture

Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report

Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.

5. Implementation Phases

Phase 1: Foundation

Define input assumptions and format checks.
Produce a minimal golden output on one known sample.

Phase 2: Core Functionality

Implement full analysis pass for normal cases.
Add validation against an external ground-truth tool.

Phase 3: Hard Cases and Reporting

Add malformed/edge-case handling.
Finalize report template and reproducibility notes.

6. Testing Strategy

Unit-level checks for parser/decoder helpers.
Integration checks against known binaries/challenges.
Regression tests for previously failing cases.

7. Extensions & Challenges

Add automation for batch analysis and comparative reports.
Add confidence scoring for each major finding.
Add export formats suitable for CI/security pipelines.

8. Production Reflection

Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?