Project 17: radare2 Mastery

Expanded deep-dive guide for Project 17 from the Binary Analysis sprint.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 2-3 weeks |
| Main Programming Language | r2 commands, r2pipe (Python) |
| Alternative Programming Languages | JavaScript (r2js) |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | 1. The “Resume Gold” |
| Knowledge Area | Static Analysis / Command Line RE |
| Software or Tool | radare2, Cutter (GUI) |
| Main Book | “The radare2 Book” |

1. Learning Objectives

  1. Build a working implementation with reproducible outputs.
  2. Justify key design choices with binary-analysis principles.
  3. Produce an evidence-backed report of findings and limitations.
  4. Document hardening or next-step improvements.

2. All Theory Needed (Per-Concept Breakdown)

This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.

3. Project Specification

3.1 What You Will Build

Complete analysis of binaries using only radare2’s command-line interface, plus automation with r2pipe.

3.2 Functional Requirements

  1. Accept the target binary/input and validate format assumptions.
  2. Produce analyzable outputs (console report and/or artifacts).
  3. Handle malformed inputs safely with explicit errors.
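
A minimal sketch of requirements 1 and 3, assuming the project only needs to recognize ELF, PE, and Mach-O inputs. The `detect_format` helper and its magic table are hypothetical names introduced for illustration; they are not part of radare2 or r2pipe.

```python
# Hypothetical pre-flight check: identify the container format from its magic
# bytes before handing the file to radare2. The table is illustrative and
# deliberately small; extend it for the formats your lab actually sees.
MAGIC_TABLE = {
    b"\x7fELF": "ELF",
    b"MZ": "PE",
    b"\xcf\xfa\xed\xfe": "Mach-O (64-bit, little-endian)",
    b"\xce\xfa\xed\xfe": "Mach-O (32-bit, little-endian)",
}


def detect_format(path):
    """Return a format label, or raise ValueError with an explicit reason."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if len(head) < 4:
        raise ValueError(f"{path}: too short to be a supported binary")
    for magic, label in MAGIC_TABLE.items():
        if head.startswith(magic):
            return label
    raise ValueError(f"{path}: unrecognized magic bytes {head.hex()}")
```

Failing early with a `ValueError` keeps malformed inputs out of the analysis stage and gives the report an explicit reason for every skipped file.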

3.3 Non-Functional Requirements

  • Reproducibility: same input should produce equivalent findings.
  • Safety: unknown samples run only in isolated lab contexts.
  • Clarity: separate facts, hypotheses, and inferred conclusions.

3.4 Expanded Project Brief

  • File: P17-radare2-mastery.md


What you’ll build: Complete analysis of binaries using only radare2’s command-line interface, plus automation with r2pipe.

Why it teaches binary analysis: radare2 is one of the most capable open-source RE frameworks, and its CLI makes you state every analysis step explicitly instead of clicking through it.

Core challenges you’ll face:

  • Command syntax → maps to steep learning curve
  • Navigation → maps to moving through binaries
  • Visual mode → maps to interactive disassembly
  • Scripting → maps to r2pipe automation

Resources for key challenges:

Key Concepts:

  • Command Structure: radare2 book
  • Visual Mode: V and VV commands
  • r2pipe: Python bindings documentation

  • Difficulty: Intermediate
  • Time estimate: 2-3 weeks
  • Prerequisites: Projects 1-4

Real World Outcome

Deliverables:

  • Analysis output or tooling scripts
  • Report with control/data flow notes

Validation checklist:

  • Parses sample binaries correctly
  • Findings are reproducible in debugger
  • No unsafe execution outside lab

```bash
$ r2 ./crackme
[0x00401040]> aaa                    # Analyze all
[0x00401040]> afl                    # List functions
0x00401040    1 43           entry0
0x00401170    4 101          main
0x004011e0    3 67           sym.check_password

[0x00401040]> s main                 # Seek to main
[0x00401170]> pdf                    # Print function disassembly
            ; CODE XREF from entry0
┌ 101: int main (int argc, char **argv);
│           0x00401170      push rbp
│           0x00401171      mov rbp, rsp
│           0x00401174      sub rsp, 0x40
│           …
│           0x004011a0      call sym.check_password
│       ┌─< 0x004011a5      test eax, eax
│       │   0x004011a7      je 0x4011b8
│       │   0x004011a9      lea rdi, str.Correct
│       │   0x004011b0      call sym.imp.puts

[0x00401170]> VV                     # Visual graph mode
[0x00401170]> s sym.check_password
[0x004011e0]> pdg                    # Decompile (requires the r2ghidra plugin)

int check_password(char *input) {
    return strcmp(input, "s3cr3t") == 0;
}

# r2pipe automation
$ python3
>>> import r2pipe
>>> r2 = r2pipe.open('./crackme')
>>> r2.cmd('aaa')
>>> functions = r2.cmdj('aflj')      # JSON output
>>> for f in functions:
...     print(f['name'], hex(f['offset']))
```

Hints in Layers

Essential r2 commands:

# Analysis
aaa              # Analyze all
afl              # List functions
axt addr         # Xrefs to address
axf addr         # Xrefs from address
iz               # List strings
ii               # List imports

# Navigation
s addr           # Seek to address
s main           # Seek to function
sf               # Seek to next function
sb               # Seek to start of current basic block

# Disassembly
pd 20            # Print 20 instructions
pdf              # Print function disassembly
pdc              # Built-in pseudo-decompiler (pdg needs the r2ghidra plugin)
pdr              # Recursive disassembly across the function graph

# Visual mode
V                # Visual mode (press p to cycle views)
VV               # Visual graph mode
V!               # Visual panel mode

# Debugging
db addr          # Set breakpoint
dc               # Continue
ds               # Step
dr               # Show registers
doo              # Reopen for debugging

# Patching
wa nop           # Write assembly (nop)
wx 90            # Write hex bytes

Common workflows:

  1. aaa; afl - Analyze and list functions
  2. iz; iz~password - Find interesting strings
  3. axt str.password - Find references to string
  4. s ref; pdf - Go to reference, disassemble
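
The four workflow steps above can be sketched as one r2pipe helper. This is an illustration under assumptions, not a canonical recipe: `find_string_users` is a hypothetical name, and the JSON keys it reads (`string`, `vaddr`, `from`, `fcn_name`) are the ones `izj` and `axtj` commonly emit but may differ between radare2 versions.

```python
def find_string_users(r2, needle):
    """Map each matching string to the code locations that reference it.

    `r2` is any object with a cmdj(command) method, such as the session
    returned by r2pipe.open(). Run 'aaa' first so that xrefs exist.
    """
    hits = {}
    # izj is the JSON form of iz; cmdj may return None on empty output.
    for entry in r2.cmdj("izj") or []:
        text = entry.get("string", "")
        if needle not in text:
            continue
        # axtj lists cross-references to the address given after '@'.
        refs = r2.cmdj(f"axtj @ {entry['vaddr']}") or []
        hits[text] = [
            (ref.get("fcn_name", "?"), hex(ref["from"])) for ref in refs
        ]
    return hits
```

With a live session this would be called as `find_string_users(r2pipe.open('./crackme'), 'password')` after `r2.cmd('aaa')`; each returned address is a candidate for `pdf @ <address>`.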

Learning milestones:

  1. Basic navigation → Move around binaries
  2. Visual mode → Efficient analysis
  3. Find vulnerabilities → Locate interesting code
  4. Automate with r2pipe → Script your analysis

The Core Question You Are Answering

How do you efficiently analyze and reverse engineer binaries using only a command-line interface, and why is mastering text-based tools essential for professional reverse engineering work?

This project challenges you to think beyond GUI tools and understand reverse engineering at a fundamental level. When you can’t rely on visual cues and mouse clicks, you’re forced to understand the underlying concepts, develop systematic workflows, and build automation that scales to hundreds of binaries.

Concepts You Must Understand First

1. Command-Line Philosophy and UNIX Composability

  • radare2 follows the UNIX philosophy: small, composable commands that do one thing well
  • Understanding why ~ (internal grep), | (pipe to shell), and @ (temporary seek) exist
  • The power of combining simple commands to create complex analysis workflows

Guiding Questions:

  • Why does radare2 use single-letter commands instead of descriptive names?
  • How does the command prefix system (a=analysis, p=print, d=debug) help organize functionality?
  • What’s the advantage of pdf @ sym.main vs seeking to main first?

Book References:

  • “The radare2 Book” (online) - Chapter 1: Introduction, Chapter 4: Basic Usage
  • “The Art of UNIX Programming” by Eric S. Raymond - Chapter 1: Philosophy

2. Binary Analysis State and Context

  • Understanding the current seek position (like a cursor in your binary)
  • How radare2 maintains analysis state (function boundaries, cross-references, types)
  • The difference between ephemeral commands and persistent state changes

Guiding Questions:

  • What’s the difference between s main and @ main in terms of state?
  • How does aaa (analyze all) build the function database, and when should you use aa vs aaa vs aaaa?
  • Why might you want to save a project (Ps) instead of re-analyzing each time?

Book References:

  • “The radare2 Book” - Chapter 4: Basic Usage (Seeking and Navigation)
  • “Practical Binary Analysis” by Dennis Andriesse - Chapter 5: Basic Binary Analysis

3. Visual Mode as Interactive Disassembly

  • Visual mode (V) isn’t just pretty printing—it’s an interactive analysis workspace
  • Understanding the different visual panels (hex, disassembly, graph, debugging)
  • How visual mode keybindings map to command-line operations

Guiding Questions:

  • What’s the relationship between pressing p in visual mode and the pd command?
  • How does VV (visual graph mode) help you understand control flow better than linear disassembly?
  • When would you use visual panel mode (V!) with multiple panes?

Book References:

  • “The radare2 Book” - Chapter 6: Visual Mode
  • “Reversing: Secrets of Reverse Engineering” by Eldad Eilam - Chapter 4: Reverse Engineering

4. Cross-References and Program Flow

  • Cross-references (xrefs) are the roadmap of your binary—who calls what
  • Understanding axt (xrefs to) vs axf (xrefs from) vs ax (list all)
  • How to trace data flow and control flow through xref analysis

Guiding Questions:

  • If you find an interesting string, how do you find all code that uses it?
  • How do you determine if a function is called from multiple places or just one?
  • What’s the difference between code xrefs and data xrefs?
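
To make the last two questions concrete, here is a small sketch that summarizes already-parsed `axtj` output. It assumes each entry carries a `type` field (for example `CALL` or `DATA`) and a `fcn_name` field; `summarize_xrefs` is a hypothetical helper and the field names should be checked against your radare2 version.

```python
from collections import defaultdict


def summarize_xrefs(refs):
    """Group xref entries by type and report which functions make calls."""
    by_type = defaultdict(list)
    for ref in refs or []:
        by_type[ref.get("type", "UNKNOWN")].append(ref)
    # Only CALL-type references count as callers; DATA refs are reads/loads.
    callers = sorted({ref.get("fcn_name", "?") for ref in by_type.get("CALL", [])})
    return {
        "counts": {kind: len(items) for kind, items in by_type.items()},
        "callers": callers,
        "single_caller": len(callers) == 1,
    }
```

A target with `single_caller` set is often a private helper, while many distinct callers suggest a utility routine worth naming early.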

Book References:

  • “The radare2 Book” - Chapter 5: Analysis (Cross-References section)
  • “Practical Binary Analysis” by Dennis Andriesse - Chapter 6: Disassembly and Binary Analysis

5. r2pipe and Programmatic Analysis

  • r2pipe lets you control radare2 from any programming language
  • Understanding the JSON output mode (j suffix) for machine parsing
  • Building analysis pipelines that scale to multiple binaries

Guiding Questions:

  • Why would you use r2.cmdj('aflj') instead of parsing text output from afl?
  • How can you build a script that finds all functions using dangerous functions like strcpy?
  • What’s the advantage of r2pipe over scraping radare2 text output?
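
As a hedged illustration of why JSON output is easier to consume than text, the sketch below ranks functions from a parsed `aflj` list. `largest_functions` is a hypothetical helper; the fallback from `offset` to `addr` is an assumption that the address key may be named differently in some radare2 builds.

```python
def largest_functions(functions, limit=5):
    """Return (name, address, size) tuples for the biggest functions."""
    rows = []
    for fn in functions or []:
        # Assumption: some builds report "addr" where others use "offset".
        address = fn.get("offset", fn.get("addr"))
        rows.append((fn.get("name", "?"), address, fn.get("size", 0)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

With a live session the input would come from `r2.cmdj('aflj')`. No column positions or whitespace rules are involved, which is what makes the text output of `afl` fragile to parse by comparison.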

Book References:

  • “The radare2 Book” - Chapter 15: r2pipe
  • “Practical Binary Analysis” by Dennis Andriesse - Chapter 12: Principles of Dynamic Analysis

6. Binary Patching and Modification

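  • Understanding the difference between wa (write assembly), wx (write hex), and wao (modify the current opcode, for example wao nop)
  • How to patch binaries in place: writes need a writable file (r2 -w or oo+), or a write cache (e io.cache=true) that you review with wc and commit with wci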
  • The concept of reversible vs permanent patches

Guiding Questions:

  • How do you NOP out a conditional jump to bypass a check?
  • What’s the difference between patching in-memory vs writing changes to disk?
  • How do you ensure your patch doesn’t break relocations or other code?
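
The first question can be sketched as a command generator. This assumes x86, where NOP is the single byte 0x90, and that you already know the instruction's length from the disassembly; `nop_patch_command` is a hypothetical helper that only builds the command string and does not touch the binary.

```python
def nop_patch_command(address, length):
    """Build a 'wx' command that overwrites `length` bytes with x86 NOPs.

    The patch must cover the whole instruction: a partial overwrite leaves
    stray operand bytes that decode as unrelated instructions.
    """
    if length <= 0:
        raise ValueError("instruction length must be positive")
    return f"wx {'90' * length} @ {address:#x}"
```

In the sample crackme listing, the `je` at 0x004011a7 is followed by an instruction at 0x004011a9, so it is two bytes long and `nop_patch_command(0x4011a7, 2)` yields `wx 9090 @ 0x4011a7`. Removing that jump makes execution always fall through to the `str.Correct` branch.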

Book References:

  • “The radare2 Book” - Chapter 8: Writing and Patching
  • “Hacking: The Art of Exploitation” by Jon Erickson - Chapter 5: Exploitation

7. Analysis Automation with r2 Scripts

  • r2 scripts (.r2 files) let you automate repetitive analysis tasks
  • Understanding how to combine commands with ; and create macros
  • Building reusable analysis workflows

Guiding Questions:

  • How do you create a script that automatically finds and patches anti-debugging checks?
  • What’s the difference between running a script inside a session with . script.r2 and loading it at startup with r2 -i script.r2?
  • How can you make your analysis reproducible for team members?
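
One way to approach the reproducibility question is to run a fixed command list non-interactively and keep that list under version control. The sketch below assumes `r2` is on `PATH` and relies on the `-q` (quiet, quit after commands) and `-c` (run command) options; `build_batch_argv` and `run_batch` are hypothetical helper names.

```python
import subprocess


def build_batch_argv(binary, commands):
    """Build the argv for a one-shot, non-interactive radare2 run."""
    if not commands:
        raise ValueError("at least one command is required")
    # Commands are chained with ';', exactly as at the interactive prompt.
    return ["r2", "-q", "-c", "; ".join(commands), binary]


def run_batch(binary, commands):
    """Run the commands and return radare2's stdout as text."""
    result = subprocess.run(
        build_batch_argv(binary, commands),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_batch('./crackme', ['aaa', 'aflj'])` would return the function list as a JSON string that a teammate can regenerate from the same inputs.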

Book References:

  • “The radare2 Book” - Chapter 14: Scripting
  • “Practical Binary Analysis” by Dennis Andriesse - Chapter 9: Binary Instrumentation

Questions to Guide Your Design

  1. Command Discovery: How will you learn and remember the hundreds of radare2 commands? Should you create personal cheat sheets, use ? help extensively, or build muscle memory through repetition?

  2. Workflow Efficiency: What’s your standard workflow for analyzing a new binary? Do you start with aaa, then afl, then investigate interesting functions? Or do you prefer a different sequence?

  3. Visual vs Command-Line: When should you use visual mode vs staying in command-line mode? Is visual mode just for beginners, or does it offer unique insights?

  4. Scripting Strategy: Which analysis tasks should you automate with r2pipe vs do manually? At what point does scripting become more efficient than interactive analysis?

  5. Plugin Ecosystem: Should you rely on plugins like r2ghidra (decompiler) and r2dec, or stick to core radare2 functionality? How do plugins affect reproducibility?

  6. Collaborative Analysis: How do you share your radare2 analysis with team members? Do you save projects, export commands, or create scripts?

  7. Integration with Other Tools: How should radare2 fit into your overall RE workflow? Should it complement Ghidra/IDA, or can it be your primary tool?

  8. Learning Curve Management: radare2 is notoriously difficult to learn. How will you structure your learning to avoid frustration—start with small binaries, follow tutorials, or dive into complex samples?

Thinking Exercise

Exercise 1: Manual Command Reconstruction Before using visual mode, analyze a simple crackme using only command-line mode:

  1. Open the binary: r2 ./crackme
  2. Run analysis: aaa
  3. List functions: afl - identify main and other interesting functions
  4. Seek to main: s main
  5. Print disassembly: pdf
  6. Find string references: iz then axt str.password
  7. Navigate to the xref: s [address]
  8. Trace the check logic without using visual mode

Reflection: Which commands did you use most? What was frustrating? How would you optimize this workflow?

Exercise 2: Visual Mode Mapping In visual mode, press different keys and observe what happens:

  1. Enter visual mode: V
  2. Press p repeatedly - note each view (hex, disasm, debug, words, etc.)
  3. Press ? - study the help screen
  4. In graph mode (VV), navigate with hjkl and tab through nodes
  5. Return to command mode with q, then recreate one visual operation using CLI commands

Reflection: Which visual mode do you prefer? Can you recreate visual graph mode insights using pdf and agf?

Exercise 3: r2pipe Automation Planning Manually perform this analysis, then plan how to automate it:

Task: Find all functions that call dangerous functions (strcpy, gets, sprintf)

Manual steps:

r2 ./binary
aaa
afl
s sym.imp.strcpy
axt
# repeat for each dangerous function

Automation plan:

  • What JSON commands will you need? (aflj, axtj)
  • How will you iterate through dangerous functions?
  • What output format will be most useful?
  • Write pseudocode before writing Python
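
One possible shape for the automation, offered as a sketch rather than a reference solution. It assumes imports are flagged as `sym.imp.<name>`, that `aaa` has already been run, and that `axtj` entries carry `type` and `fcn_name` fields; `find_dangerous_callers` is a hypothetical name.

```python
DANGEROUS = ("strcpy", "gets", "sprintf")


def find_dangerous_callers(r2, names=DANGEROUS):
    """Return {import_name: [calling functions]} for imports that are called.

    `r2` is any object with a cmdj(command) method, such as an r2pipe session.
    """
    report = {}
    for name in names:
        # cmdj may return None when the import does not exist in this binary.
        refs = r2.cmdj(f"axtj @ sym.imp.{name}") or []
        callers = sorted(
            {ref.get("fcn_name", "?") for ref in refs if ref.get("type") == "CALL"}
        )
        if callers:
            report[name] = callers
    return report
```

Returning a plain dictionary keeps the result easy to dump as JSON, which suits both a per-binary report and later batch comparison.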

Exercise 4: Binary Patching Practice Find a simple crackme with a password check and practice patching:

  1. Locate the comparison: look for cmp or test before a conditional jump
  2. Understand the logic: does it jump if correct or if incorrect?
  3. Plan your patch: should you NOP the jump, change the condition, or modify the comparison?
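  4. Apply the patch: open a copy of the binary writable (r2 -w, or oo+ in an open session), then use wa or wx
  5. Verify the bytes: use pd to see your changes
  6. Test: run with ood (reopen in debug mode)
  7. Save permanently: in write mode, wa and wx change the file directly; if you set e io.cache=true instead, review pending writes with wc and commit them with wci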

Reflection: Did your first patch work? What did you learn about instruction lengths and side effects?

The Interview Questions They’ll Ask

Technical Understanding:

  1. Q: Explain the difference between aa, aaa, and aaaa in radare2. When would you use each? A: They perform progressively deeper analysis: aa does basic analysis (functions, xrefs), aaa adds deeper analysis including strings and function arguments, aaaa is even more aggressive. Use aa for quick checks, aaa for normal analysis, and aaaa when comprehensive analysis is needed.

  2. Q: How would you find all calls to strcpy in a binary using radare2? A: Run aaa to analyze, ii~strcpy to check whether it is imported, then axt @ sym.imp.strcpy to list all cross-references (calls) to it without moving your seek. Or use r2pipe: r2.cmdj('axtj @ sym.imp.strcpy') for JSON output.

  3. Q: What’s the purpose of the @ operator in radare2 commands? A: The @ operator performs a temporary seek. For example, pdf @ sym.main prints the disassembly of main without changing your current seek position. It’s essential for scripting and avoiding state changes.

  4. Q: How do you patch a binary in radare2 and save the changes permanently? A: Open a copy of the file in write mode (r2 -w ./copy, or oo+ in an existing session), then use wa (write assembly) or wx (write hex bytes); in write mode the bytes go straight to the file. To stage patches first, set e io.cache=true, review them with wc, and commit them with wci.

  5. Q: Explain the different visual modes in radare2 and when you’d use each. A: V enters visual hex/disassembly (press p to cycle views), VV shows the graph view (control flow), V! enters panel mode (multiple panes). Use hex view for raw bytes, disassembly for linear code, graph for understanding flow, and panels for debugging.

Practical Application:

  1. Q: You’re analyzing a stripped binary with no symbols. How would you find the main function in radare2? A: Run aaa, then s entry0 to go to the entry point, pdf to see the code, look for the call to __libc_start_main which takes main as the first argument (in RDI on x64). Use the disassembly to trace the argument.

  2. Q: How would you use r2pipe to automatically analyze 100 binaries and find which ones have NX disabled? A: Write a Python script that opens each binary with r2pipe.open(), fetches the binary info as JSON with cmdj('iIj') (the JSON form of iI), checks the nx field, and logs results.

  3. Q: A binary crashes when you run it. How do you use radare2 to investigate without executing it? A: Open without execution: r2 ./binary (not r2 -d), run aaa for static analysis, find likely crash points (maybe invalid instruction or null pointer dereference), use pdf to understand context. For dynamic analysis, use doo (reopen in debug mode) and set breakpoints before the crash.
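
The batch NX check from the second answer could look roughly like this. It is a sketch under assumptions: the `nx` key is read from the `iIj` output, a nested `bin` object (as `ij` produces) is tolerated, and `nx_enabled` and `scan_nx` are hypothetical names. The session factory is passed in so the logic can be exercised without radare2 installed.

```python
def nx_enabled(info):
    """Return True/False for the NX flag, or None if it cannot be determined."""
    if not isinstance(info, dict):
        return None
    # Tolerate both a flat dict (iIj) and a nested {"bin": {...}} dict (ij).
    bin_info = info.get("bin", info)
    if not isinstance(bin_info, dict):
        return None
    value = bin_info.get("nx")
    return value if isinstance(value, bool) else None


def scan_nx(paths, open_session):
    """Map each path to its NX status using sessions from `open_session`."""
    results = {}
    for path in paths:
        session = open_session(path)
        try:
            results[path] = nx_enabled(session.cmdj("iIj"))
        finally:
            session.quit()  # always release the radare2 process
    return results
```

With r2pipe installed, `scan_nx(paths, r2pipe.open)` would run it for real. Entries that come back as `None` deserve a manual look rather than being counted as NX-disabled.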

Tool Comparison:

  1. Q: When would you choose radare2 over Ghidra or IDA Pro? A: radare2 excels in: automation via r2pipe, command-line environments (servers, CTFs), binary patching, custom analysis scripts, and open-source requirements. Ghidra is better for decompilation and collaborative projects. IDA has better disassembly quality and commercial support.

  2. Q: How do you use radare2’s JSON output mode, and why is it important? A: Append j to most commands: aflj (functions as JSON), iIj (binary info), axtj (xrefs). This is crucial for r2pipe scripting because parsing JSON is reliable, while parsing text output is fragile.

Books That Will Help

| Topic | Book | Chapters | Why It Helps |
|---|---|---|---|
| radare2 Fundamentals | “The radare2 Book” (online) | Ch 1-8: Introduction through Patching | Official documentation, comprehensive command reference, essential for learning the tool |
| Command-Line Philosophy | “The Art of UNIX Programming” by Eric S. Raymond | Ch 1: Philosophy, Ch 11: Interfaces | Understand why radare2 is designed the way it is - composable, text-based, scriptable |
| Binary Analysis Concepts | “Practical Binary Analysis” by Dennis Andriesse | Ch 5-6: Basic Binary Analysis, Disassembly | Context for what you’re analyzing - radare2 is the tool, this book explains the concepts |
| Disassembly Fundamentals | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Ch 3: Machine-Level Programming | Understanding what you’re seeing in pdf output - instruction encoding, calling conventions |
| Reverse Engineering Workflow | “Reversing: Secrets of Reverse Engineering” by Eldad Eilam | Ch 4-5: Reverse Engineering, Reversing Tools | Learn systematic RE approaches that you’ll implement in radare2 |
| r2pipe Programming | “The radare2 Book” | Ch 15: r2pipe | Learn to automate radare2 with Python, JavaScript, or other languages |
| Binary Patching | “Hacking: The Art of Exploitation” by Jon Erickson | Ch 5: Exploitation (patching sections) | Understand when and how to modify binaries using radare2’s write commands |
| x86-64 Assembly | “Low-Level Programming” by Igor Zhirkov | Ch 5-8: Assembly Programming | Read disassembly fluently - understand what mov rdi, rsp means in context |
| Control Flow Analysis | “Practical Binary Analysis” by Dennis Andriesse | Ch 6: Binary Analysis (CFG section) | Understand what VV graph mode is showing you - basic blocks, edges, loops |
| Dynamic Analysis Integration | “Practical Malware Analysis” by Sikorski & Honig | Ch 9: Dynamic Analysis | Learn when to use radare2’s debugger (ood, dc, ds) vs static analysis |


Common Pitfalls and Debugging

Problem 1: “Your interpretation does not match runtime behavior”

  • Why: Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
  • Fix: Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
  • Quick test: Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.

Problem 2: “Tool output is inconsistent across machines”

  • Why: ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
  • Fix: Pin tool versions, capture checksec/metadata, and document environment assumptions in your report.
  • Quick test: Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
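
For the quick test, comparing hashes only works if the report is serialized deterministically. A minimal sketch, assuming the report is a JSON-serializable dictionary; `report_digest` is a hypothetical helper.

```python
import hashlib
import json


def report_digest(report):
    """Return a SHA-256 hex digest of a report that ignores dict key order."""
    # sort_keys and fixed separators give one canonical byte string per report.
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs that produce the same findings then yield the same digest. Recording the tool version inside the report turns version drift into a visible digest change instead of a silent one.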

Problem 3: “Analysis accidentally executes unsafe code”

  • Why: Dynamic workflows run binaries in host context without sufficient isolation.
  • Fix: Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
  • Quick test: Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.

Definition of Done

  • Core functionality works on reference inputs
  • Edge cases are tested and documented
  • Results are reproducible (same binary, same tools, same report output)
  • Analysis notes clearly separate observations, assumptions, and conclusions
  • Lab safety controls were applied for any dynamic execution

4. Solution Architecture

Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report

Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
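
A minimal sketch of that idea, assuming every stage is a plain function from one JSON-serializable value to the next. The `run_pipeline` name and the numbered file layout are hypothetical choices for illustration.

```python
import json
from pathlib import Path


def run_pipeline(artifact, stages, out_dir):
    """Run (name, function) stages in order, saving each result as JSON."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    data = artifact
    for index, (name, stage) in enumerate(stages, start=1):
        data = stage(data)
        # One numbered file per stage makes intermediate results inspectable.
        target = out / f"{index:02d}_{name}.json"
        target.write_text(json.dumps(data, indent=2, sort_keys=True))
    return data
```

When a finding looks wrong, a reviewer can open the numbered files in order and see which stage introduced it, without re-running the analysis.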

5. Implementation Phases

Phase 1: Foundation

  • Define input assumptions and format checks.
  • Produce a minimal golden output on one known sample.

Phase 2: Core Functionality

  • Implement full analysis pass for normal cases.
  • Add validation against an external ground-truth tool.

Phase 3: Hard Cases and Reporting

  • Add malformed/edge-case handling.
  • Finalize report template and reproducibility notes.

6. Testing Strategy

  • Unit-level checks for parser/decoder helpers.
  • Integration checks against known binaries/challenges.
  • Regression tests for previously failing cases.

7. Extensions & Challenges

  • Add automation for batch analysis and comparative reports.
  • Add confidence scoring for each major finding.
  • Add export formats suitable for CI/security pipelines.

8. Production Reflection

Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?