Project 17: radare2 Mastery
Expanded deep-dive guide for Project 17 from the Binary Analysis sprint.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 2-3 weeks |
| Main Programming Language | r2 commands, r2pipe (Python) |
| Alternative Programming Languages | JavaScript (r2js) |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | 1. The “Resume Gold” |
| Knowledge Area | Static Analysis / Command Line RE |
| Software or Tool | radare2, Cutter (GUI) |
| Main Book | “The radare2 Book” |
1. Learning Objectives
- Build a working implementation with reproducible outputs.
- Justify key design choices with binary-analysis principles.
- Produce an evidence-backed report of findings and limitations.
- Document hardening or next-step improvements.
2. All Theory Needed (Per-Concept Breakdown)
This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.
3. Project Specification
3.1 What You Will Build
Complete analysis of binaries using only radare2’s command-line interface, plus automation with r2pipe.
3.2 Functional Requirements
- Accept the target binary/input and validate format assumptions.
- Produce analyzable outputs (console report and/or artifacts).
- Handle malformed inputs safely with explicit errors.
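One way to meet the validation and explicit-error requirements is to check magic bytes before handing a file to radare2. The sketch below is illustrative only: `identify_format` and `UnsupportedInputError` are hypothetical names, and the table covers just three common header signatures.

```python
from pathlib import Path

# Magic-byte prefixes for a few common executable formats (not exhaustive).
MAGIC = {
    b"\x7fELF": "ELF",
    b"MZ": "PE/DOS (MZ)",
    b"\xcf\xfa\xed\xfe": "Mach-O (64-bit, little-endian)",
}


class UnsupportedInputError(ValueError):
    """Raised when the target cannot be identified as a known binary format."""


def identify_format(path):
    """Return a format label for `path`, or raise UnsupportedInputError."""
    target = Path(path)
    if not target.is_file():
        raise UnsupportedInputError(f"{path}: not a regular file")
    with target.open("rb") as fh:
        header = fh.read(4)
    for magic, label in MAGIC.items():
        if header.startswith(magic):
            return label
    raise UnsupportedInputError(f"{path}: unrecognized header {header!r}")
```

Failing early with a specific exception keeps malformed inputs out of the analysis stage and gives the report a clear reason for each rejected sample.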
3.3 Non-Functional Requirements
- Reproducibility: same input should produce equivalent findings.
- Safety: unknown samples run only in isolated lab contexts.
- Clarity: separate facts, hypotheses, and inferred conclusions.
3.4 Expanded Project Brief
- File: P17-radare2-mastery.md
What you’ll build: Complete analysis of binaries using only radare2’s command-line interface, plus automation with r2pipe.
Why it teaches binary analysis: radare2 is a powerful open-source RE framework, and its CLI makes every analysis step explicit, which forces you to think about what you’re doing.
Core challenges you’ll face:
- Command syntax → maps to steep learning curve
- Navigation → maps to moving through binaries
- Visual mode → maps to interactive disassembly
- Scripting → maps to r2pipe automation
Resources for key challenges:
Key Concepts:
- Command Structure: radare2 book
- Visual Mode: `V` and `VV` commands
- r2pipe: Python bindings documentation
Difficulty: Intermediate
Time estimate: 2-3 weeks
Prerequisites: Projects 1-4
Real World Outcome
Deliverables:
- Analysis output or tooling scripts
- Report with control/data flow notes
Validation checklist:
- Parses sample binaries correctly
- Findings are reproducible in debugger
- No unsafe execution outside lab

```bash
$ r2 ./crackme
[0x00401040]> aaa                 # Analyze all
[0x00401040]> afl                 # List functions
0x00401040    1 43           entry0
0x00401170    4 101          main
0x004011e0    3 67           sym.check_password

[0x00401040]> s main              # Seek to main
[0x00401170]> pdf                 # Print disassembly of the function
            ; CODE XREF from entry0
┌ 101: int main (int argc, char **argv);
│           0x00401170      push rbp
│           0x00401171      mov rbp, rsp
│           0x00401174      sub rsp, 0x40
│           …
│           0x004011a0      call sym.check_password
│       ┌─< 0x004011a5      test eax, eax
│       │   0x004011a7      je 0x4011b8
│       │   0x004011a9      lea rdi, str.Correct
│       │   0x004011b0      call sym.imp.puts

[0x00401170]> VV                  # Visual graph mode
[0x00401170]> s sym.check_password
[0x004011e0]> pdg                 # Decompile (requires the r2ghidra plugin)

int check_password(char *input) {
    return strcmp(input, "s3cr3t") == 0;
}

# r2pipe automation
$ python3
>>> import r2pipe
>>> r2 = r2pipe.open('./crackme')
>>> r2.cmd('aaa')
>>> functions = r2.cmdj('aflj')  # JSON output
>>> for f in functions:
...     print(f['name'], hex(f['offset']))
```
Hints in Layers
Essential r2 commands:
# Analysis
aaa # Analyze all
afl # List functions
axt addr # Xrefs to address
axf addr # Xrefs from address
iz # List strings
ii # List imports
# Navigation
s addr # Seek to address
s main # Seek to function
sf # Seek to next function
sb # Seek to the start of the current basic block
# Disassembly
pd 20 # Print 20 instructions
pdf # Print function disassembly
pdc # Pseudo-decompile (built in; pdg and pdd need the r2ghidra and r2dec plugins)
pdr # Recursive disassembly across the function graph
# Visual mode
V # Visual mode (press p to cycle views)
VV # Visual graph mode
V! # Visual panels mode
# Debugging
db addr # Set breakpoint
dc # Continue
ds # Step
dr # Show registers
doo # Reopen for debugging
# Patching
wa nop # Write assembly (nop)
wx 90 # Write hex bytes
Common workflows:
- `aaa; afl` - Analyze and list functions
- `iz; iz~password` - Find interesting strings
- `axt str.password` - Find references to a string
- `s ref; pdf` - Go to reference, disassemble
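The string-to-reference workflow above can also be scripted. The following is a minimal sketch rather than a definitive implementation: `find_string_users` is a hypothetical helper, and it assumes `izj` entries carry `string` and `vaddr` keys and `axtj` entries carry `from` and `fcn_name` keys, which may differ between radare2 versions.

```python
def find_string_users(r2, keyword):
    """Map each string containing `keyword` to the code locations that reference it.

    `r2` is any object with cmd()/cmdj() methods, such as the value
    returned by r2pipe.open().
    """
    r2.cmd("aaa")                          # xrefs only exist after analysis
    results = {}
    for entry in r2.cmdj("izj") or []:     # strings in data sections, as JSON
        text = entry.get("string", "")
        if keyword.lower() not in text.lower():
            continue
        # Temporary seek (@) keeps the session's current position unchanged.
        xrefs = r2.cmdj(f"axtj @ {entry['vaddr']}") or []
        results[text] = [
            (x.get("fcn_name", "?"), hex(x["from"])) for x in xrefs
        ]
    return results
```

With r2pipe installed, a call might look like `find_string_users(r2pipe.open("./crackme"), "password")`.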
Learning milestones:
- Basic navigation → Move around binaries
- Visual mode → Efficient analysis
- Find vulnerabilities → Locate interesting code
- Automate with r2pipe → Script your analysis
The Core Question You Are Answering
How do you efficiently analyze and reverse engineer binaries using only a command-line interface, and why is mastering text-based tools essential for professional reverse engineering work?
This project challenges you to think beyond GUI tools and understand reverse engineering at a fundamental level. When you can’t rely on visual cues and mouse clicks, you’re forced to understand the underlying concepts, develop systematic workflows, and build automation that scales to hundreds of binaries.
Concepts You Must Understand First
1. Command-Line Philosophy and UNIX Composability
- radare2 follows the UNIX philosophy: small, composable commands that do one thing well
- Understanding why `~` (internal grep), `|` (pipe to shell), and `@` (temporary seek) exist
- The power of combining simple commands to create complex analysis workflows
Guiding Questions:
- Why does radare2 use single-letter commands instead of descriptive names?
- How does the command prefix system (a=analysis, p=print, d=debug) help organize functionality?
- What’s the advantage of `pdf @ sym.main` vs seeking to main first?
Book References:
- “The radare2 Book” (online) - Chapter 1: Introduction, Chapter 4: Basic Usage
- “The Art of UNIX Programming” by Eric S. Raymond - Chapter 1: Philosophy
2. Binary Analysis State and Context
- Understanding the current seek position (like a cursor in your binary)
- How radare2 maintains analysis state (function boundaries, cross-references, types)
- The difference between ephemeral commands and persistent state changes
Guiding Questions:
- What’s the difference between `s main` and `@ main` in terms of state?
- How does `aaa` (analyze all) build the function database, and when should you use `aa` vs `aaa` vs `aaaa`?
- Why might you want to save a project (`Ps`) instead of re-analyzing each time?
Book References:
- “The radare2 Book” - Chapter 4: Basic Usage (Seeking and Navigation)
- “Practical Binary Analysis” by Dennis Andriesse - Chapter 5: Basic Binary Analysis
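To make the difference between a persistent seek and a temporary seek concrete, the sketch below records the current address before and after each kind of command. `show_seek_behaviour` is a hypothetical helper; it relies on the bare `s` command printing the current seek.

```python
def show_seek_behaviour(r2, target="main"):
    """Return (before, after_temporary, after_persistent) seek addresses."""
    before = r2.cmd("s").strip()           # bare `s` prints the current seek
    r2.cmd(f"pdf @ {target}")              # temporary seek: applies to this command only
    after_temporary = r2.cmd("s").strip()
    r2.cmd(f"s {target}")                  # persistent seek: moves the session cursor
    after_persistent = r2.cmd("s").strip()
    return before, after_temporary, after_persistent
```

On a real session the first two values should match, and the third should be the address of `target`. That is why `@` is usually the safer choice inside scripts.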
3. Visual Mode as Interactive Disassembly
- Visual mode (`V`) isn’t just pretty printing; it’s an interactive analysis workspace
- Understanding the different visual panels (hex, disassembly, graph, debugging)
- How visual mode keybindings map to command-line operations
Guiding Questions:
- What’s the relationship between pressing `p` in visual mode and the `pd` command?
- How does `VV` (visual graph mode) help you understand control flow better than linear disassembly?
- When would you use visual panel mode (`V!`) with multiple panes?
Book References:
- “The radare2 Book” - Chapter 6: Visual Mode
- “Reversing: Secrets of Reverse Engineering” by Eldad Eilam - Chapter 4: Reverse Engineering
4. Cross-References and Program Flow
- Cross-references (xrefs) are the roadmap of your binary—who calls what
- Understanding `axt` (xrefs to) vs `axf` (xrefs from) vs `ax` (list all)
- How to trace data flow and control flow through xref analysis
Guiding Questions:
- If you find an interesting string, how do you find all code that uses it?
- How do you determine if a function is called from multiple places or just one?
- What’s the difference between code xrefs and data xrefs?
Book References:
- “The radare2 Book” - Chapter 5: Analysis (Cross-References section)
- “Practical Binary Analysis” by Dennis Andriesse - Chapter 6: Disassembly and Binary Analysis
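The questions about code versus data xrefs and single versus multiple callers can be explored with a short script. This is a sketch under stated assumptions: both function names are hypothetical, and it assumes each `axtj` entry has `from`, `type`, and `fcn_name` keys with call references reported as `CALL`; check the output of your radare2 version before relying on it.

```python
from collections import defaultdict


def group_xrefs_by_type(r2, target):
    """Group cross-references to `target` by the `type` field reported by axtj."""
    groups = defaultdict(list)
    for x in r2.cmdj(f"axtj @ {target}") or []:
        kind = str(x.get("type", "UNKNOWN")).upper()
        groups[kind].append((x.get("fcn_name", "?"), hex(x["from"])))
    return dict(groups)


def is_called_from_multiple_places(r2, target):
    """True when more than one call-type xref points at `target`."""
    calls = group_xrefs_by_type(r2, target).get("CALL", [])
    return len(calls) > 1
```

Grouping by type first makes it easy to see whether an address is only read as data (for example a string) or is also a branch or call destination.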
5. r2pipe and Programmatic Analysis
- r2pipe lets you control radare2 from any programming language
- Understanding the JSON output mode (`j` suffix) for machine parsing
- Building analysis pipelines that scale to multiple binaries
Guiding Questions:
- Why would you use `r2.cmdj('aflj')` instead of parsing text output from `afl`?
- How can you build a script that finds all functions using dangerous functions like `strcpy`?
- What’s the advantage of r2pipe over scraping radare2 text output?
Book References:
- “The radare2 Book” - Chapter 15: r2pipe
- “Practical Binary Analysis” by Dennis Andriesse - Chapter 12: Principles of Dynamic Analysis
6. Binary Patching and Modification
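- Understanding the difference between `wa` (write assembly), `wx` (write hex), and `wao` (replace the opcode at the current seek, e.g. `wao nop`)
- How to patch binaries in-place (open with `r2 -w` or reopen with `oo+`), or stage writes with `e io.cache=true` and commit them with `wci` (`wc` lists the cached changes)
- The concept of reversible vs permanent patches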
Guiding Questions:
- How do you NOP out a conditional jump to bypass a check?
- What’s the difference between patching in-memory vs writing changes to disk?
- How do you ensure your patch doesn’t break relocations or other code?
Book References:
- “The radare2 Book” - Chapter 8: Writing and Patching
- “Hacking: The Art of Exploitation” by Jon Erickson - Chapter 5: Exploitation
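Instruction length matters when NOP-ing out a jump: the replacement must cover exactly the bytes of the original instruction. The sketch below builds the `wx` command from the size reported by `aoj`. It is x86-specific (0x90 is the single-byte NOP), `nop_patch_command` is a hypothetical helper, and it assumes `aoj` returns a list whose first entry has a `size` key.

```python
def nop_patch_command(r2, addr):
    """Build a `wx` command that overwrites the instruction at `addr` with NOPs.

    x86 only: one 0x90 byte is emitted per byte of the original instruction.
    The command is returned, not executed, so it can be reviewed first.
    """
    info = r2.cmdj(f"aoj @ {addr}") or []
    if not info:
        raise ValueError(f"no instruction decoded at {addr}")
    size = info[0]["size"]                 # instruction length in bytes
    return f"wx {'90' * size} @ {addr}"
```

To apply the result, the session must be writable (for example `r2pipe.open(path, flags=["-w"])` on a copy of the binary) or have `io.cache` enabled; then pass the string to `r2.cmd()` and re-check with `pd`. radare2's own `wao nop` performs a similar replacement at the current seek.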
7. Analysis Automation with r2 Scripts
- r2 scripts (`.r2` files) let you automate repetitive analysis tasks
- Understanding how to combine commands with `;` and create macros
- Building reusable analysis workflows
Guiding Questions:
- How do you create a script that automatically finds and patches anti-debugging checks?
- What’s the difference between running a script with `.` vs sourcing commands?
- How can you make your analysis reproducible for team members?
Book References:
- “The radare2 Book” - Chapter 14: Scripting
- “Practical Binary Analysis” by Dennis Andriesse - Chapter 9: Binary Instrumentation
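One way to make an analysis repeatable for teammates is to generate the `.r2` script from a list of targets and keep it in version control next to the report. The sketch below is illustrative: `write_r2_script` and `r2_command_line` are hypothetical helpers, and the script contents are just an example sequence.

```python
from pathlib import Path


def write_r2_script(path, targets):
    """Write a radare2 script that analyzes the binary and prints xrefs per target."""
    lines = ["aaa"]                                # analysis must come first
    for target in targets:
        lines.append(f"?e --- xrefs to {target}")  # ?e echoes a label into the output
        lines.append(f"axt @ {target}")            # temporary seek, no state change
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


def r2_command_line(script, binary):
    """Argument list for a non-interactive run: -i runs the script, -q quits afterwards."""
    return ["r2", "-q", "-i", str(script), str(binary)]
```

The returned argument list can be passed to `subprocess.run(..., capture_output=True, text=True)` so that every team member produces the same transcript from the same script.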
Questions to Guide Your Design
- Command Discovery: How will you learn and remember the hundreds of radare2 commands? Should you create personal cheat sheets, use `?` help extensively, or build muscle memory through repetition?
- Workflow Efficiency: What’s your standard workflow for analyzing a new binary? Do you start with `aaa`, then `afl`, then investigate interesting functions? Or do you prefer a different sequence?
- Visual vs Command-Line: When should you use visual mode vs staying in command-line mode? Is visual mode just for beginners, or does it offer unique insights?
- Scripting Strategy: Which analysis tasks should you automate with r2pipe vs do manually? At what point does scripting become more efficient than interactive analysis?
- Plugin Ecosystem: Should you rely on plugins like `r2ghidra` (decompiler) and `r2dec`, or stick to core radare2 functionality? How do plugins affect reproducibility?
- Collaborative Analysis: How do you share your radare2 analysis with team members? Do you save projects, export commands, or create scripts?
- Integration with Other Tools: How should radare2 fit into your overall RE workflow? Should it complement Ghidra/IDA, or can it be your primary tool?
- Learning Curve Management: radare2 is notoriously difficult to learn. How will you structure your learning to avoid frustration: start with small binaries, follow tutorials, or dive into complex samples?
Thinking Exercise
Exercise 1: Manual Command Reconstruction
Before using visual mode, analyze a simple crackme using only command-line mode:
- Open the binary: `r2 ./crackme`
- Run analysis: `aaa`
- List functions: `afl` - identify main and other interesting functions
- Seek to main: `s main`
- Print disassembly: `pdf`
- Find string references: `iz` then `axt str.password`
- Navigate to the xref: `s [address]`
- Trace the check logic without using visual mode
Reflection: Which commands did you use most? What was frustrating? How would you optimize this workflow?
Exercise 2: Visual Mode Mapping
In visual mode, press different keys and observe what happens:
- Enter visual mode: `V`
- Press `p` repeatedly - note each view (hex, disasm, debug, words, etc.)
- Press `?` - study the help screen
- In graph mode (`VV`), navigate with `hjkl` and tab through nodes
- Return to command mode with `q`, then recreate one visual operation using CLI commands

Reflection: Which visual mode do you prefer? Can you recreate visual graph mode insights using `pdf` and `agf`?
Exercise 3: r2pipe Automation Planning
Manually perform this analysis, then plan how to automate it:
Task: Find all functions that call dangerous functions (strcpy, gets, sprintf)
Manual steps:
r2 ./binary
aaa
afl
s sym.imp.strcpy
axt
# repeat for each dangerous function
Automation plan:
- What JSON commands will you need? (`aflj`, `axtj`)
- How will you iterate through dangerous functions?
- What output format will be most useful?
- Write pseudocode before writing Python
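Once your own pseudocode exists, it can be compared against a sketch like the one below. It is one possible shape, not the reference solution: `find_dangerous_callers` is a hypothetical name, and it assumes imported functions appear in `aflj` as `sym.imp.<name>` and that `axtj` entries include a `fcn_name` key.

```python
DANGEROUS = ("strcpy", "gets", "sprintf")


def find_dangerous_callers(r2, names=DANGEROUS):
    """Return {dangerous_function: sorted caller names} for functions present in the binary."""
    r2.cmd("aaa")
    known = {f.get("name") for f in r2.cmdj("aflj") or []}
    report = {}
    for name in names:
        symbol = f"sym.imp.{name}"
        if symbol not in known:            # not imported, nothing to look up
            continue
        xrefs = r2.cmdj(f"axtj @ {symbol}") or []
        # A set removes duplicates when one function calls the import several times.
        report[name] = sorted({x.get("fcn_name", "?") for x in xrefs})
    return report
```

A dictionary keyed by the dangerous function serializes directly to JSON, which is convenient when the same check runs over many binaries.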
Exercise 4: Binary Patching Practice
Find a simple crackme with a password check and practice patching:
- Locate the comparison: look for `cmp` or `test` before a conditional jump
- Understand the logic: does it jump if correct or if incorrect?
- Plan your patch: should you NOP the jump, change the condition, or modify the comparison?
- Apply the patch: use `wa` or `wx` (work on a copy opened with `r2 -w`, or enable `e io.cache=true` first)
- Verify in-memory: use `pd` to see your changes
- Test: run with `ood` (reopen in debug mode)
- Save permanently: with `io.cache` enabled, commit the cached writes with `wci`; in `-w` mode the writes have already gone to the file
Reflection: Did your first patch work? What did you learn about instruction lengths and side effects?
The Interview Questions They’ll Ask
Technical Understanding:
- Q: Explain the difference between `aa`, `aaa`, and `aaaa` in radare2. When would you use each?
  A: They perform progressively deeper analysis: `aa` runs the basic pass (function and basic-block discovery with their xrefs), `aaa` adds further passes such as function autonaming and argument recovery, and `aaaa` enables additional experimental analysis. Use `aa` for quick checks, `aaa` for normal analysis, and `aaaa` when comprehensive analysis is needed.
- Q: How would you find all calls to `strcpy` in a binary using radare2?
  A: Run `aaa` to analyze, `afl~strcpy` to check if it’s imported, `s sym.imp.strcpy` to seek to it, then `axt` to find all cross-references (calls) to strcpy. Or use r2pipe: `r2.cmdj('axtj @ sym.imp.strcpy')` for JSON output.
- Q: What’s the purpose of the `@` operator in radare2 commands?
  A: The `@` operator performs a temporary seek. For example, `pdf @ sym.main` prints the disassembly of main without changing your current seek position. It’s essential for scripting and avoiding state changes.
- Q: How do you patch a binary in radare2 and save the changes permanently?
  A: Use `wa` (write assembly) or `wx` (write hex bytes) to modify bytes. If the file is open read-only with `e io.cache=true`, the writes stay in the cache until you commit them with `wci`. If it was opened with `r2 -w` or reopened with `oo+` (read-write mode), the writes go straight to the file, so patch a copy to keep the original.
- Q: Explain the different visual modes in radare2 and when you’d use each.
  A: `V` enters visual hex/disassembly (press `p` to cycle views), `VV` shows the graph view (control flow), `V!` enters panel mode (multiple panes). Use hex view for raw bytes, disassembly for linear code, graph for understanding flow, and panels for debugging.
Practical Application:
- Q: You’re analyzing a stripped binary with no symbols. How would you find the main function in radare2?
  A: Run `aaa`, then `s entry0` to go to the entry point, `pdf` to see the code, and look for the call to `__libc_start_main`, which takes main as the first argument (in RDI on x64). Use the disassembly to trace the argument.
- Q: How would you use r2pipe to automatically analyze 100 binaries and find which ones have NX disabled?
  A: Write a Python script that opens each binary with `r2pipe.open()`, reads the binary info as JSON with `cmdj('iIj')` (the JSON form of `iI`), checks the `nx` field, and logs results.
- Q: A binary crashes when you run it. How do you use radare2 to investigate without executing it?
  A: Open without execution: `r2 ./binary` (not `r2 -d`), run `aaa` for static analysis, find likely crash points (maybe an `invalid` instruction or a null pointer dereference), and use `pdf` to understand context. For dynamic analysis, use `doo` (reopen in debug mode) and set breakpoints before the crash.
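The NX batch check described above might look like the following sketch. `nx_status` is a hypothetical helper; the opener is passed in as a parameter (normally `r2pipe.open`) so the logic can be exercised without radare2 installed, and it assumes `iIj` returns an object with an `nx` field.

```python
def nx_status(open_pipe, paths):
    """Return {path: nx_flag} where nx_flag is True, False, or None.

    `open_pipe` is a callable such as r2pipe.open. None means the file
    could not be opened or reported no `nx` field.
    """
    results = {}
    for path in paths:
        try:
            r2 = open_pipe(path)
        except Exception:                  # unreadable or malformed sample
            results[path] = None
            continue
        try:
            info = r2.cmdj("iIj") or {}    # binary info as JSON, no analysis needed
            results[path] = info.get("nx")
        finally:
            r2.quit()                      # always release the radare2 process
    return results
```

A typical call would be `nx_status(r2pipe.open, glob.glob("samples/*"))`, followed by filtering for entries whose value is `False`. Because `iIj` only reads headers, no `aaa` pass is required, which keeps a 100-binary run fast.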
Tool Comparison:
- Q: When would you choose radare2 over Ghidra or IDA Pro?
  A: radare2 excels in: automation via r2pipe, command-line environments (servers, CTFs), binary patching, custom analysis scripts, and open-source requirements. Ghidra is better for decompilation and collaborative projects. IDA has better disassembly quality and commercial support.
- Q: How do you use radare2’s JSON output mode, and why is it important?
  A: Append `j` to most commands: `aflj` (functions as JSON), `iIj` (binary info), `axtj` (xrefs). This is crucial for r2pipe scripting because parsing JSON is reliable, while parsing text output is fragile.
Books That Will Help
| Topic | Book | Chapters | Why It Helps |
|---|---|---|---|
| radare2 Fundamentals | “The radare2 Book” (online) | Ch 1-8: Introduction through Patching | Official documentation, comprehensive command reference, essential for learning the tool |
| Command-Line Philosophy | “The Art of UNIX Programming” by Eric S. Raymond | Ch 1: Philosophy, Ch 11: Interfaces | Understand why radare2 is designed the way it is - composable, text-based, scriptable |
| Binary Analysis Concepts | “Practical Binary Analysis” by Dennis Andriesse | Ch 5-6: Basic Binary Analysis, Disassembly | Context for what you’re analyzing - radare2 is the tool, this book explains the concepts |
| Disassembly Fundamentals | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Ch 3: Machine-Level Programming | Understanding what you’re seeing in `pdf` output - instruction encoding, calling conventions |
| Reverse Engineering Workflow | “Reversing: Secrets of Reverse Engineering” by Eldad Eilam | Ch 4-5: Reverse Engineering, Reversing Tools | Learn systematic RE approaches that you’ll implement in radare2 |
| r2pipe Programming | “The radare2 Book” | Ch 15: r2pipe | Learn to automate radare2 with Python, JavaScript, or other languages |
| Binary Patching | “Hacking: The Art of Exploitation” by Jon Erickson | Ch 5: Exploitation (patching sections) | Understand when and how to modify binaries using radare2’s write commands |
| x86-64 Assembly | “Low-Level Programming” by Igor Zhirkov | Ch 5-8: Assembly Programming | Read disassembly fluently - understand what `mov rdi, rsp` means in context |
| Control Flow Analysis | “Practical Binary Analysis” by Dennis Andriesse | Ch 6: Binary Analysis (CFG section) | Understand what `VV` graph mode is showing you - basic blocks, edges, loops |
| Dynamic Analysis Integration | “Practical Malware Analysis” by Sikorski & Honig | Ch 9: Dynamic Analysis | Learn when to use radare2’s debugger (`ood`, `dc`, `ds`) vs static analysis |
Common Pitfalls and Debugging
Problem 1: “Your interpretation does not match runtime behavior”
- Why: Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
- Fix: Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
- Quick test: Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.
Problem 2: “Tool output is inconsistent across machines”
- Why: ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
- Fix: Pin tool versions, capture `checksec`/metadata, and document environment assumptions in your report.
- Quick test: Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
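Comparing hashes only works if the output is serialized the same way every time. The sketch below hashes a canonical JSON form of the findings and stores the tool version next to the digest. Both function names are hypothetical, and the version string is assumed to come from something like the output of `r2 -v`.

```python
import hashlib
import json


def findings_digest(findings):
    """SHA-256 of the findings in canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(findings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def environment_record(findings, tool_version):
    """Bundle the digest with the tool version so mismatches can be traced to drift."""
    return {"tool_version": tool_version, "sha256": findings_digest(findings)}
```

Two machines that produce different digests with the same `tool_version` point to an input or configuration difference; different versions with different digests point to tool drift.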
Problem 3: “Analysis accidentally executes unsafe code”
- Why: Dynamic workflows run binaries in host context without sufficient isolation.
- Fix: Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
- Quick test: Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.
Definition of Done
- Core functionality works on reference inputs
- Edge cases are tested and documented
- Results are reproducible (same binary, same tools, same report output)
- Analysis notes clearly separate observations, assumptions, and conclusions
- Lab safety controls were applied for any dynamic execution
4. Solution Architecture
Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report
Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
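A minimal sketch of such a staged pipeline follows. The stage names and the `run_pipeline` helper are hypothetical; the point is only that each stage's output is written to disk as JSON before the next stage runs.

```python
import json
from pathlib import Path


def run_pipeline(sample, stages, out_dir):
    """Run (name, function) stages in order, saving each stage's output as JSON.

    Every stage receives the previous stage's dictionary and returns a new one.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    data = {"input": str(sample)}
    for index, (name, func) in enumerate(stages, start=1):
        data = func(data)
        artifact = out / f"{index:02d}_{name}.json"
        # Sorted keys keep artifacts stable between runs and easy to diff.
        artifact.write_text(json.dumps(data, indent=2, sort_keys=True))
    return data
```

Numbered artifact files show at a glance how far a failed run got, and a reviewer can diff the output of a single stage without re-running the whole analysis.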
5. Implementation Phases
Phase 1: Foundation
- Define input assumptions and format checks.
- Produce a minimal golden output on one known sample.
Phase 2: Core Functionality
- Implement full analysis pass for normal cases.
- Add validation against an external ground-truth tool.
Phase 3: Hard Cases and Reporting
- Add malformed/edge-case handling.
- Finalize report template and reproducibility notes.
6. Testing Strategy
- Unit-level checks for parser/decoder helpers.
- Integration checks against known binaries/challenges.
- Regression tests for previously failing cases.
7. Extensions & Challenges
- Add automation for batch analysis and comparative reports.
- Add confidence scoring for each major finding.
- Add export formats suitable for CI/security pipelines.
8. Production Reflection
Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?