# Project 13: Binary Diffing
Expanded deep-dive guide for Project 13 from the Binary Analysis sprint.
## Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1-2 weeks |
| Main Programming Language | Python |
| Alternative Programming Languages | Ghidra scripts |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | 2. The “Micro-SaaS / Pro Tool” |
| Knowledge Area | Patch Analysis / Vulnerability Research |
| Software or Tool | BinDiff, Diaphora, Ghidriff |
| Main Book | N/A (tool documentation) |
## 1. Learning Objectives
- Build a working implementation with reproducible outputs.
- Justify key design choices with binary-analysis principles.
- Produce an evidence-backed report of findings and limitations.
- Document hardening or next-step improvements.
## 2. All Theory Needed (Per-Concept Breakdown)
This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.
## 3. Project Specification
### 3.1 What You Will Build
You will compare two versions of a binary to identify exactly what changed, a core technique for understanding patches and finding 1-day vulnerabilities.
### 3.2 Functional Requirements
- Accept the target binary/input and validate format assumptions.
- Produce analyzable outputs (console report and/or artifacts).
- Handle malformed inputs safely with explicit errors.
### 3.3 Non-Functional Requirements
- Reproducibility: same input should produce equivalent findings.
- Safety: unknown samples run only in isolated lab contexts.
- Clarity: separate facts, hypotheses, and inferred conclusions.
### 3.4 Expanded Project Brief
- File: P13-binary-diffing.md
- Main Programming Language: Python
- Alternative Programming Languages: Ghidra scripts
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Patch Analysis / Vulnerability Research
- Software or Tool: BinDiff, Diaphora, Ghidriff
- Main Book: N/A (tool documentation)
What you’ll build: a workflow that compares two versions of a binary to find what changed, useful for understanding patches and finding 1-day vulnerabilities.
Why it teaches binary analysis: Comparing old and new versions reveals exactly what was fixed, helping you understand vulnerabilities.
Core challenges you’ll face:
- Function matching → maps to identifying same function across versions
- Diffing algorithms → maps to graph-based comparison
- Finding security patches → maps to what was the vulnerability?
- Interpreting results → maps to understanding the change
Resources for key challenges:
Key Concepts:
- Function Matching: BinDiff documentation
- Graph Isomorphism: Comparison algorithms
- Patch Tuesday Analysis: Security research blogs
- Difficulty: Intermediate
- Time estimate: 1-2 weeks
- Prerequisites: Project 5 (Ghidra)
#### Real World Outcome
Deliverables:
- Analysis output or tooling scripts
- Report with control/data flow notes
Validation checklist:
- Parses sample binaries correctly
- Findings are reproducible in debugger
- No unsafe execution outside lab
```bash
# Using ghidriff
$ ghidriff libpng-1.6.39.so libpng-1.6.40.so -o diff_report

# Output:
Modified Functions:
  png_read_IDAT_data (similarity: 0.87)
    - Added bounds check at 0x1234
    - New comparison: if (length > max_size)

  png_handle_chunk (similarity: 0.95)
    - Additional validation in switch statement

New Functions:
  png_check_chunk_length

Deleted Functions:
  (none)

Analysis:
  The patch adds a bounds check in png_read_IDAT_data
  This fixes CVE-2023-XXXX (buffer overflow)
  Vulnerable code: memcpy without size check
  Fixed code: size validated before copy
```
#### Hints in Layers
Binary diffing workflow:
1. Get old and new versions of binary
2. Export to BinDiff/Diaphora format
3. Run the diffing tool
4. Focus on:
- Modified functions with low similarity
- New validation/bounds check functions
- Changes near memory operations
Tools:
- **BinDiff**: Best for IDA Pro users
- **Diaphora**: Open source, works with IDA
- **Ghidriff**: Works with Ghidra, command-line
- **Ghidra Version Tracking**: Built-in
Identifying security patches:
- Look for new `if` statements (validation)
- Look for changes to buffer operations
- Look for new error handling
- Check functions near strings like "overflow", "bounds"
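The heuristics above can be turned into a first-pass triage script. Here is a minimal sketch; the per-function record fields (`name`, `similarity`, `added_instructions`) are illustrative assumptions, not the real output format of any particular tool, so adapt them to what your differ actually emits.

```python
# Hypothetical diff-record format; adapt to your differ's real output.
SECURITY_HINTS = ("cmp", "test", "ja", "jb", "jae", "jbe")  # new compares/branches

def triage(diff_results):
    """Rank modified functions by how likely they contain a security fix."""
    candidates = []
    for func in diff_results:
        score = 0
        # Heavily modified functions beat cosmetic recompilation noise
        if func["similarity"] < 0.90:
            score += 2
        # A new comparison plus conditional branch often means a bounds check
        mnemonics = [insn.split()[0] for insn in func["added_instructions"]]
        score += sum(1 for m in mnemonics if m in SECURITY_HINTS)
        if score:
            candidates.append((score, func["name"]))
    return [name for _, name in sorted(candidates, reverse=True)]

results = [
    {"name": "png_read_IDAT_data", "similarity": 0.87,
     "added_instructions": ["cmp rdx, rax", "ja error_path"]},
    {"name": "png_set_text", "similarity": 0.99, "added_instructions": []},
]
print(triage(results))  # ['png_read_IDAT_data']
```

The scoring weights are arbitrary starting points; the value of a script like this is filtering hundreds of changed functions down to a handful worth manual review.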
**Learning milestones**:
1. **Diff two versions** → Generate comparison report
2. **Identify changed functions** → Focus on modifications
3. **Find security patches** → Understand what was fixed
4. **Recreate vulnerability** → Test on old version
#### The Core Question You Are Answering
**"How do you identify what changed between two versions of a binary when you only have compiled code, and why is this the first step in finding 1-day vulnerabilities?"**
This project explores patch analysis: when a vendor releases a security update, the binary changes but source code is rarely available. You must reverse-engineer both versions, identify differences, understand what was fixed, and potentially discover the vulnerability before attackers do.
#### Concepts You Must Understand First
1. **Control Flow Graph (CFG) Isomorphism**
- A CFG represents a function's execution paths as a directed graph where nodes are basic blocks and edges are jumps/branches
- Graph isomorphism algorithms determine if two CFGs are structurally identical even if addresses differ
*Guiding Questions:*
- How does compiler optimization affect CFG structure without changing functionality?
- Why can't you simply compare binaries byte-by-byte?
- What makes two functions "similar" when their assembly differs but behavior is identical?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 6: Disassembly and Binary Analysis
- "Computer Systems: A Programmer's Perspective" by Bryant & O'Hallaron - Ch 3.6: Control Flow
2. **Basic Block Hashing and Function Fingerprinting**
- Basic blocks are instruction sequences with single entry/exit points
- Hashing creates unique fingerprints based on instruction semantics
*Guiding Questions:*
- How do you create a hash resilient to address changes but sensitive to instruction changes?
- What happens to basic block boundaries when a single instruction is added?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 5: Binary Analysis Fundamentals
3. **Structural vs. Semantic Diffing**
- Structural diffing compares code organization (CFG structure, basic block count)
- Semantic diffing analyzes what code actually does
*Guiding Questions:*
- How can functions be structurally different but semantically identical?
- What security patches show up in structural diff but not semantic diff?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 6: Advanced Binary Analysis
4. **Call Graph Analysis**
- Call graphs map relationships between functions
- Changes in call patterns often indicate security-relevant modifications
*Guiding Questions:*
- How does a new security check manifest in the call graph?
- Why are changes to error-handling call paths interesting for security?
*Book References:*
- "Practical Binary Analysis" by Dennis Andriesse - Ch 7: Advanced Static Analysis
5. **Patch Analysis Workflow**
- Systematic process: acquire binaries → analyze → diff → triage → focus on security changes
*Guiding Questions:*
- What function changes most likely indicate security fixes?
- How do you differentiate critical security patches from benign bug fixes?
*Book References:*
- "Hacking: The Art of Exploitation" by Jon Erickson - Ch 0x300: Exploitation
#### Questions to Guide Your Design
1. **What matching algorithm first?** Simple heuristics (function size, strings) or CFG isomorphism?
2. **How will you handle false positives?** What secondary checks confirm matches?
3. **Strategy for unmatched functions?** How do you analyze functions in only one version?
4. **How do you visualize results?** Command-line, side-by-side disassembly, HTML reports?
5. **What metadata to extract?** Beyond CFGs, what information helps disambiguate functions?
6. **Handling different compiler optimizations?** How do you compare -O0 vs -O2 binaries?
7. **Triaging strategy?** How do you prioritize which differences to investigate?
8. **Validating findings?** How do you prove a suspected vulnerability is real and reachable on the old version?
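Question 1 can be prototyped as a two-pass matcher: exact fingerprint matches first, then cheap heuristics (shared string references, similar size) for the leftovers. The function record fields below are assumptions for illustration, and the confidence scaling is arbitrary.

```python
# Two-pass function matcher sketch. Records: {name, hash, size, strings}.

def match_functions(old_funcs, new_funcs):
    matches, unmatched_old = [], []
    by_hash = {f["hash"]: f for f in new_funcs}
    leftovers = {f["name"]: f for f in new_funcs}
    # Pass 1: identical fingerprints are an exact match
    for f in old_funcs:
        hit = by_hash.get(f["hash"])
        if hit and hit["name"] in leftovers:
            matches.append((f["name"], hit["name"], 1.0))
            del leftovers[hit["name"]]
        else:
            unmatched_old.append(f)
    # Pass 2: heuristic match on shared strings and similar size
    for f in unmatched_old:
        best, best_score = None, 0.0
        for cand in leftovers.values():
            shared = len(set(f["strings"]) & set(cand["strings"]))
            size_ok = abs(f["size"] - cand["size"]) < 0.3 * max(f["size"], 1)
            score = shared + (1 if size_ok else 0)
            if score > best_score:
                best, best_score = cand, score
        if best:
            # Crude confidence estimate, scaled into [0, 1]
            matches.append((f["name"], best["name"], round(best_score / 4, 2)))
            del leftovers[best["name"]]
    return matches

old = [{"name": "sub_1000", "hash": "aa", "size": 120, "strings": ["IDAT"]}]
new = [{"name": "sub_4000", "hash": "bb", "size": 140, "strings": ["IDAT"]}]
print(match_functions(old, new))  # [('sub_1000', 'sub_4000', 0.5)]
```

Low-confidence matches from pass 2 are exactly the ones worth the secondary checks from question 2 (compare callers, callees, and constants).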
#### Thinking Exercise
**Manual binary diffing exercise:**
Compile two versions: Version 1 with `strcpy(buffer, input)` and Version 2 with bounds checking. Then:
- Disassemble both in Ghidra/IDA/radare2
- Draw CFGs for both versions
- Identify exact assembly differences
- Document: V1 has single basic block, V2 has diamond pattern with conditional
#### The Interview Questions They'll Ask
1. **"Explain BinDiff vs Diaphora vs Ghidriff."** - BinDiff: IDA integration. Diaphora: open-source. Ghidriff: Ghidra integration.
2. **"How would you diff stripped binaries?"** - Use structural features: prologues, CFG structure, string refs, API calls.
3. **"Function shows 85% similarity. Same function or false positive?"** - Check callers/callees, strings, constants.
4. **"Describe graph isomorphism problem."** - NP-intermediate—use heuristics for practical performance.
5. **"How do compiler optimizations affect diffing?"** - Compensate with normalized sequences, semantic equivalence.
6. **"Walk through Patch Tuesday analysis."** - Download → diff → filter security patterns → reverse-engineer.
7. **"Identify an added bounds check?"** - New comparison + conditional jump creating diamond CFG.
8. **"Optimizing large binary diffs?"** - Filter functions, use exact hashes, parallelize.
9. **"Detecting use-after-free patches?"** - NULL checks after free, pointers set to NULL.
10. **"Build differ from scratch?"** - Disassembly → CFG → fingerprinting → matching → reporting.
#### Books That Will Help
| Topic | Book | Chapters |
|-------|------|----------|
| **Binary Analysis** | "Practical Binary Analysis" by Dennis Andriesse | Ch 5-7 |
| **Control Flow** | "Computer Systems: A Programmer's Perspective" by Bryant & O'Hallaron | Ch 3.6-3.7 |
| **Assembly** | "Low-Level Programming" by Igor Zhirkov | Ch 4-5 |
| **Vulnerabilities** | "Hacking: The Art of Exploitation" by Jon Erickson | Ch 0x300 |
| **Static Analysis** | "Practical Malware Analysis" by Sikorski & Honig | Ch 5-6 |
---
#### Common Pitfalls and Debugging
**Problem 1: "Your interpretation does not match runtime behavior"**
- **Why:** Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
- **Fix:** Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
- **Quick test:** Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.
**Problem 2: "Tool output is inconsistent across machines"**
- **Why:** ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
- **Fix:** Pin tool versions, capture `checksec`/metadata, and document environment assumptions in your report.
- **Quick test:** Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
**Problem 3: "Analysis accidentally executes unsafe code"**
- **Why:** Dynamic workflows run binaries in host context without sufficient isolation.
- **Fix:** Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
- **Quick test:** Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.
#### Definition of Done
- [ ] Core functionality works on reference inputs
- [ ] Edge cases are tested and documented
- [ ] Results are reproducible (same binary, same tools, same report output)
- [ ] Analysis notes clearly separate observations, assumptions, and conclusions
- [ ] Lab safety controls were applied for any dynamic execution
## 4. Solution Architecture
```text
Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report
```

Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
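One way to realize that inspectability requirement is to have every stage write its result as a JSON artifact that can be read or diffed by hand. The stage payloads below are placeholders; the pattern is what matters.

```python
import json
import pathlib

# Staged-pipeline sketch: each stage persists a JSON artifact so any
# intermediate result can be inspected, diffed, or peer-reviewed.

def run_pipeline(out_dir="diff_artifacts"):
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    stages = {
        "01_parse.json": {"format": "ELF64", "functions_found": 2},
        "02_analysis.json": {"matched": 1, "modified": 1},
        "03_validation.json": {"confirmed_in_debugger": False},
    }
    for name, payload in stages.items():
        (out / name).write_text(json.dumps(payload, indent=2))
    return sorted(p.name for p in out.iterdir())

print(run_pipeline())  # ['01_parse.json', '02_analysis.json', '03_validation.json']
```

Numbering the artifacts by stage keeps the pipeline order visible on disk and makes it obvious where a bad run diverged.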
## 5. Implementation Phases
### Phase 1: Foundation
- Define input assumptions and format checks.
- Produce a minimal golden output on one known sample.
### Phase 2: Core Functionality
- Implement full analysis pass for normal cases.
- Add validation against an external ground-truth tool.
### Phase 3: Hard Cases and Reporting
- Add malformed/edge-case handling.
- Finalize report template and reproducibility notes.
## 6. Testing Strategy
- Unit-level checks for parser/decoder helpers.
- Integration checks against known binaries/challenges.
- Regression tests for previously failing cases.
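A regression test can be as simple as pinning a golden report for a known sample pair and failing loudly when a tool or code change alters the findings. `current_report` here is a placeholder for a real diff run over the reference binaries, and the report structure is illustrative.

```python
# Golden-report regression sketch; replace current_report with a real run.

GOLDEN = {"modified": ["png_read_IDAT_data"], "new": ["png_check_chunk_length"]}

def current_report():
    # Placeholder: in a real suite, re-run the differ on pinned binaries
    return {"modified": ["png_read_IDAT_data"], "new": ["png_check_chunk_length"]}

def test_known_sample_is_stable():
    report = current_report()
    assert report == GOLDEN, f"regression: {report} != {GOLDEN}"

test_known_sample_is_stable()
print("golden report unchanged")
```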
## 7. Extensions & Challenges
- Add automation for batch analysis and comparative reports.
- Add confidence scoring for each major finding.
- Add export formats suitable for CI/security pipelines.
## 8. Production Reflection
Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?