# Project 13: Binary Diffing
Expanded deep-dive guide for Project 13 from the Binary Analysis sprint.
## Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1-2 weeks |
| Main Programming Language | Python |
| Alternative Programming Languages | Ghidra scripts |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | 2. The “Micro-SaaS / Pro Tool” |
| Knowledge Area | Patch Analysis / Vulnerability Research |
| Software or Tool | BinDiff, Diaphora, Ghidriff |
| Main Book | N/A (tool documentation) |
## 1. Learning Objectives
- Build a working implementation with reproducible outputs.
- Justify key design choices with binary-analysis principles.
- Produce an evidence-backed report of findings and limitations.
- Document hardening or next-step improvements.
## 2. All Theory Needed (Per-Concept Breakdown)
This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.
## 3. Project Specification
### 3.1 What You Will Build
You will compare two versions of a binary to identify exactly what changed, a core technique for understanding patches and finding 1-day vulnerabilities.
### 3.2 Functional Requirements
- Accept the target binary/input and validate format assumptions.
- Produce analyzable outputs (console report and/or artifacts).
- Handle malformed inputs safely with explicit errors.
### 3.3 Non-Functional Requirements
- Reproducibility: same input should produce equivalent findings.
- Safety: unknown samples run only in isolated lab contexts.
- Clarity: separate facts, hypotheses, and inferred conclusions.
### 3.4 Expanded Project Brief
- File: P13-binary-diffing.md
- Main Programming Language: Python
- Alternative Programming Languages: Ghidra scripts
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Patch Analysis / Vulnerability Research
- Software or Tool: BinDiff, Diaphora, Ghidriff
- Main Book: N/A (tool documentation)
What you’ll build: a workflow that compares two versions of a binary to find what changed, useful for understanding patches and finding 1-day vulnerabilities.
Why it teaches binary analysis: Comparing old and new versions reveals exactly what was fixed, helping you understand vulnerabilities.
Core challenges you’ll face:
- Function matching → maps to identifying same function across versions
- Diffing algorithms → maps to graph-based comparison
- Finding security patches → maps to what was the vulnerability?
- Interpreting results → maps to understanding the change
Resources for key challenges:
Key Concepts:
- Function Matching: BinDiff documentation
- Graph Isomorphism: Comparison algorithms
- Patch Tuesday Analysis: Security research blogs
- Difficulty: Intermediate
- Time estimate: 1-2 weeks
- Prerequisites: Project 5 (Ghidra)
#### Real World Outcome
Deliverables:
- Analysis output or tooling scripts
- Report with control/data flow notes
Validation checklist:
- Parses sample binaries correctly
- Findings are reproducible in debugger
- No unsafe execution outside lab
```bash
# Using ghidriff
$ ghidriff libpng-1.6.39.so libpng-1.6.40.so -o diff_report

# Output:
Modified Functions:
  png_read_IDAT_data (similarity: 0.87)
    - Added bounds check at 0x1234
    - New comparison: if (length > max_size)

  png_handle_chunk (similarity: 0.95)
    - Additional validation in switch statement

New Functions:
  png_check_chunk_length

Deleted Functions:
  (none)

Analysis:
  The patch adds a bounds check in png_read_IDAT_data
  This fixes CVE-2023-XXXX (buffer overflow)
  Vulnerable code: memcpy without size check
  Fixed code: size validated before copy
```
#### Hints in Layers
Binary diffing workflow:
1. Get old and new versions of binary
2. Export to BinDiff/Diaphora format
3. Run the diffing tool
4. Focus on:
- Modified functions with low similarity
- New validation/bounds check functions
- Changes near memory operations
Tools:
- **BinDiff**: Best for IDA Pro users
- **Diaphora**: Open source, works with IDA
- **Ghidriff**: Works with Ghidra, command-line
- **Ghidra Version Tracking**: Built-in
Identifying security patches:
- Look for new `if` statements (validation)
- Look for changes to buffer operations
- Look for new error handling
- Check functions near strings like "overflow", "bounds"
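The heuristics above can be turned into a first-pass triage script. Here is a minimal sketch; the per-function record fields (`name`, `similarity`, `added_instructions`) are illustrative assumptions, not the real output format of any particular tool, so adapt them to what your differ actually emits.

```python
# Hypothetical diff-record format; adapt to your differ's real output.
SECURITY_HINTS = ("cmp", "test", "ja", "jb", "jae", "jbe")  # new compares/branches

def triage(diff_results):
    """Rank modified functions by how likely they contain a security fix."""
    candidates = []
    for func in diff_results:
        score = 0
        # Heavily modified functions beat cosmetic recompilation noise
        if func["similarity"] < 0.90:
            score += 2
        # A new comparison plus conditional branch often means a bounds check
        mnemonics = [insn.split()[0] for insn in func["added_instructions"]]
        score += sum(1 for m in mnemonics if m in SECURITY_HINTS)
        if score:
            candidates.append((score, func["name"]))
    return [name for _, name in sorted(candidates, reverse=True)]

results = [
    {"name": "png_read_IDAT_data", "similarity": 0.87,
     "added_instructions": ["cmp rdx, rax", "ja error_path"]},
    {"name": "png_set_text", "similarity": 0.99, "added_instructions": []},
]
print(triage(results))  # ['png_read_IDAT_data']
```

The scoring weights are arbitrary starting points; the value of a script like this is filtering hundreds of changed functions down to a handful worth manual review.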
**Learning milestones**:
1. **Diff two versions** → Generate comparison report
2. **Identify changed functions** → Focus on modifications
3. **Find security patches** → Understand what was fixed
4. **Recreate vulnerability** → Test on old version
#### The Core Question You Are Answering
**"How do you identify what changed between two versions of a binary when you only have compiled code, and why is this the first step in finding 1-day vulnerabilities?"**
This project explores patch analysis: when a vendor releases a security update, the binary changes but source code is rarely available. You must reverse-engineer both versions, identify differences, understand what was fixed, and potentially discover the vulnerability before attackers do.
#### Concepts You Must Understand First
1. **Control Flow Graph (CFG) Isomorphism**
- A CFG represents a function's execution paths as a directed graph where nodes are basic blocks and edges are jumps/branches
- Graph isomorphism algorithms determine if two CFGs are structurally identical even if addresses differ
*Guiding Questions:*
- How does compiler optimization affect CFG structure without changing functionality?
- Why can't you simply compare binaries byte-by-byte?
- What makes two functions "similar" when their assembly differs but behavior is identical?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 6: Disassembly and Binary Analysis
- "Computer Systems: A Programmer's Perspective" by Bryant & O'Hallaron - Ch 3.6: Control Flow
2. **Basic Block Hashing and Function Fingerprinting**
- Basic blocks are instruction sequences with single entry/exit points
- Hashing creates unique fingerprints based on instruction semantics
*Guiding Questions:*
- How do you create a hash resilient to address changes but sensitive to instruction changes?
- What happens to basic block boundaries when a single instruction is added?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 5: Binary Analysis Fundamentals
3. **Structural vs. Semantic Diffing**
- Structural diffing compares code organization (CFG structure, basic block count)
- Semantic diffing analyzes what code actually does
*Guiding Questions:*
- How can functions be structurally different but semantically identical?
- What security patches show up in structural diff but not semantic diff?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 6: Advanced Binary Analysis
4. **Call Graph Analysis**
- Call graphs map relationships between functions
- Changes in call patterns often indicate security-relevant modifications
*Guiding Questions:*
- How does a new security check manifest in the call graph?
- Why are changes to error-handling call paths interesting for security?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 7: Advanced Static Analysis
5. **Patch Analysis Workflow**
- Systematic process: acquire binaries → analyze → diff → triage → focus on security changes
*Guiding Questions:*
- What function changes most likely indicate security fixes?
- How do you differentiate critical security patches from benign bug fixes?
*Book References:*
- "Hacking: The Art of Exploitation" by Jon Erickson - Ch 0x300: Exploitation
#### Questions to Guide Your Design
1. **What matching algorithm first?** Simple heuristics (function size, strings) or CFG isomorphism?
2. **How will you handle false positives?** What secondary checks confirm matches?
3. **Strategy for unmatched functions?** How do you analyze functions in only one version?
4. **How do you visualize results?** Command-line, side-by-side disassembly, HTML reports?
5. **What metadata to extract?** Beyond CFGs, what information helps disambiguate functions?
6. **Handling different compiler optimizations?** How do you compare -O0 vs -O2 binaries?
7. **Triaging strategy?** How do you prioritize which differences to investigate?
8. **Validating findings?** How do you prove a suspected vulnerability is real and reachable on the old version?
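Question 1 can be prototyped as a two-pass matcher: exact fingerprint matches first, then cheap heuristics (shared string references, similar size) for the leftovers. The function record fields below are assumptions for illustration, and the confidence scaling is arbitrary.

```python
# Two-pass function matcher sketch. Records: {name, hash, size, strings}.

def match_functions(old_funcs, new_funcs):
    matches, unmatched_old = [], []
    by_hash = {f["hash"]: f for f in new_funcs}
    leftovers = {f["name"]: f for f in new_funcs}
    # Pass 1: identical fingerprints are an exact match
    for f in old_funcs:
        hit = by_hash.get(f["hash"])
        if hit and hit["name"] in leftovers:
            matches.append((f["name"], hit["name"], 1.0))
            del leftovers[hit["name"]]
        else:
            unmatched_old.append(f)
    # Pass 2: heuristic match on shared strings and similar size
    for f in unmatched_old:
        best, best_score = None, 0.0
        for cand in leftovers.values():
            shared = len(set(f["strings"]) & set(cand["strings"]))
            size_ok = abs(f["size"] - cand["size"]) < 0.3 * max(f["size"], 1)
            score = shared + (1 if size_ok else 0)
            if score > best_score:
                best, best_score = cand, score
        if best:
            # Crude confidence estimate, scaled into [0, 1]
            matches.append((f["name"], best["name"], round(best_score / 4, 2)))
            del leftovers[best["name"]]
    return matches

old = [{"name": "sub_1000", "hash": "aa", "size": 120, "strings": ["IDAT"]}]
new = [{"name": "sub_4000", "hash": "bb", "size": 140, "strings": ["IDAT"]}]
print(match_functions(old, new))  # [('sub_1000', 'sub_4000', 0.5)]
```

Low-confidence matches from pass 2 are exactly the ones worth the secondary checks from question 2 (compare callers, callees, and constants).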
#### Thinking Exercise
**Manual binary diffing exercise:**
Compile two versions: Version 1 with `strcpy(buffer, input)` and Version 2 with bounds checking. Then:
- Disassemble both in Ghidra/IDA/radare2
- Draw CFGs for both versions
- Identify exact assembly differences
- Document: V1 has single basic block, V2 has diamond pattern with conditional
#### The Interview Questions They'll Ask
1. **"Explain BinDiff vs Diaphora vs Ghidriff."** - BinDiff: IDA integration. Diaphora: open-source. Ghidriff: Ghidra integration.
2. **"How would you diff stripped binaries?"** - Use structural features: prologues, CFG structure, string refs, API calls.
3. **"Function shows 85% similarity. Same function or false positive?"** - Check callers/callees, strings, constants.
4. **"Describe graph isomorphism problem."** - NP-intermediate—use heuristics for practical performance.
5. **"How do compiler optimizations affect diffing?"** - Compensate with normalized sequences, semantic equivalence.
6. **"Walk through Patch Tuesday analysis."** - Download → diff → filter security patterns → reverse-engineer.
7. **"Identify an added bounds check?"** - New comparison + conditional jump creating diamond CFG.
8. **"Optimizing large binary diffs?"** - Filter functions, use exact hashes, parallelize.
9. **"Detecting use-after-free patches?"** - NULL checks after free, pointers set to NULL.
10. **"Build differ from scratch?"** - Disassembly → CFG → fingerprinting → matching → reporting.
#### Books That Will Help
| Topic | Book | Chapters |
|-------|------|----------|
| **Binary Analysis** | "Practical Binary Analysis" by Dennis Andriesse | Ch 5-7 |
| **Control Flow** | "Computer Systems: A Programmer's Perspective" by Bryant & O'Hallaron | Ch 3.6-3.7 |
| **Assembly** | "Low-Level Programming" by Igor Zhirkov | Ch 4-5 |
| **Vulnerabilities** | "Hacking: The Art of Exploitation" by Jon Erickson | Ch 0x300 |
| **Static Analysis** | "Practical Malware Analysis" by Sikorski & Honig | Ch 5-6 |
---
#### Common Pitfalls and Debugging
**Problem 1: "Your interpretation does not match runtime behavior"**
- **Why:** Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
- **Fix:** Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
- **Quick test:** Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.
**Problem 2: "Tool output is inconsistent across machines"**
- **Why:** ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
- **Fix:** Pin tool versions, capture `checksec`/metadata, and document environment assumptions in your report.
- **Quick test:** Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
**Problem 3: "Analysis accidentally executes unsafe code"**
- **Why:** Dynamic workflows run binaries in host context without sufficient isolation.
- **Fix:** Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
- **Quick test:** Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.
#### Definition of Done
- [ ] Core functionality works on reference inputs
- [ ] Edge cases are tested and documented
- [ ] Results are reproducible (same binary, same tools, same report output)
- [ ] Analysis notes clearly separate observations, assumptions, and conclusions
- [ ] Lab safety controls were applied for any dynamic execution
## 4. Solution Architecture
```text
Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report
```

Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
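One way to realize that inspectability requirement is to have every stage write its result as a JSON artifact that can be read or diffed by hand. The stage payloads below are placeholders; the pattern is what matters.

```python
import json
import pathlib

# Staged-pipeline sketch: each stage persists a JSON artifact so any
# intermediate result can be inspected, diffed, or peer-reviewed.

def run_pipeline(out_dir="diff_artifacts"):
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    stages = {
        "01_parse.json": {"format": "ELF64", "functions_found": 2},
        "02_analysis.json": {"matched": 1, "modified": 1},
        "03_validation.json": {"confirmed_in_debugger": False},
    }
    for name, payload in stages.items():
        (out / name).write_text(json.dumps(payload, indent=2))
    return sorted(p.name for p in out.iterdir())

print(run_pipeline())  # ['01_parse.json', '02_analysis.json', '03_validation.json']
```

Numbering the artifacts by stage keeps the pipeline order visible on disk and makes it obvious where a bad run diverged.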
## 5. Implementation Phases
### Phase 1: Foundation
- Define input assumptions and format checks.
- Produce a minimal golden output on one known sample.
### Phase 2: Core Functionality
- Implement full analysis pass for normal cases.
- Add validation against an external ground-truth tool.
### Phase 3: Hard Cases and Reporting
- Add malformed/edge-case handling.
- Finalize report template and reproducibility notes.
## 6. Testing Strategy
- Unit-level checks for parser/decoder helpers.
- Integration checks against known binaries/challenges.
- Regression tests for previously failing cases.
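A regression test can be as simple as pinning a golden report for a known sample pair and failing loudly when a tool or code change alters the findings. `current_report` here is a placeholder for a real diff run over the reference binaries, and the report structure is illustrative.

```python
# Golden-report regression sketch; replace current_report with a real run.

GOLDEN = {"modified": ["png_read_IDAT_data"], "new": ["png_check_chunk_length"]}

def current_report():
    # Placeholder: in a real suite, re-run the differ on pinned binaries
    return {"modified": ["png_read_IDAT_data"], "new": ["png_check_chunk_length"]}

def test_known_sample_is_stable():
    report = current_report()
    assert report == GOLDEN, f"regression: {report} != {GOLDEN}"

test_known_sample_is_stable()
print("golden report unchanged")
```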
## 7. Extensions & Challenges
- Add automation for batch analysis and comparative reports.
- Add confidence scoring for each major finding.
- Add export formats suitable for CI/security pipelines.
## 8. Production Reflection
Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?