Project 4: The Corruption - Using Watchpoints
Create a memory corruption bug and use LLDB watchpoints to catch the exact write.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 2 hours |
| Language | C (debug target), LLDB commands |
| Prerequisites | Projects 1-3 |
| Key Topics | Watchpoints, memory writes, stack vs global memory |
1. Learning Objectives
By completing this project, you will:
- Set watchpoints on variables and addresses.
- Identify the exact instruction that corrupts memory.
- Interpret watchpoint stop reports.
- Use conditional watchpoints for targeted debugging.
2. Theoretical Foundation
2.1 Core Concepts
- Watchpoints: Stops that trigger when a memory address is written or read, rather than when execution reaches a line. They are usually backed by hardware debug registers.
- Memory Corruption: Writes outside valid bounds can silently change unrelated variables.
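- Stack vs. Globals: Stack variables live in per-thread frames and exist only while their function is active; globals keep a fixed address for the life of the process. A watchpoint tracks an address, not a name, so it works for both.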
2.2 Why This Matters
Memory corruption bugs are expensive and hard to track. Watchpoints let you stop at the first illegal write, not the final crash.
2.3 Historical Context / Background
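Hardware debug registers enable watchpoints. On x86, DR0-DR3 hold the watched addresses, while DR6 and DR7 report status and control them, which is why only four hardware watchpoints can be active at once. LLDB exposes them through a user-friendly interface with conditions and commands.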
2.4 Common Misconceptions
- “Watchpoints are slow”: Hardware watchpoints are fast but limited in count.
- “Only for globals”: You can watch stack addresses, too.
3. Project Specification
3.1 What You Will Build
A C program with an off-by-one write that corrupts a nearby value. You will set a watchpoint on the corrupted variable to find the exact write.
3.2 Functional Requirements
- Trigger corruption: Program should modify an adjacent memory location.
- Set a watchpoint: On the corrupted variable.
- Catch the write: LLDB should stop on the corrupting instruction.
3.3 Non-Functional Requirements
- Deterministic bug: Corruption should happen every run.
- Small surface area: Keep the program tiny for clarity.
3.4 Example Usage / Output
$ clang -g -o corrupt corrupt.c
$ lldb ./corrupt
(lldb) b main
(lldb) run
(lldb) watchpoint set variable local_value
Watchpoint created: Watchpoint 1: addr = 0x7ffee7c0f5ec size = 4
(lldb) continue
Watchpoint 1 hit:
old value: 200
new value: 0
* thread #1, stop reason = watchpoint 1
frame #0: corrupt`buggy_function at corrupt.c:5
3.5 Real World Outcome
You will see LLDB stop at the exact line that overwrote local_value, before any crash. Example output:
Watchpoint 1 hit:
old value: 200
new value: 0
* thread #1, stop reason = watchpoint 1
frame #0: 0x0000000100003f48 corrupt`buggy_function at corrupt.c:5
4. Solution Architecture
4.1 High-Level Design
Program writes out of bounds -> Watchpoint triggers -> LLDB stops -> Backtrace to culprit
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| corrupt.c | Off-by-one bug | Simple pointer arithmetic |
| Watchpoint | Track memory writes | Watch the victim variable |
4.3 Data Structures
int global_value = 100;
int local_value = 200;
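One possible shape for the debug target is sketched below. It is not the guide's original source. The struct `frame_vars`, the driver `run_corruption`, and the loop-based bug are assumptions chosen so that the victim sits directly after the buffer on every run; separate locals would also work, but their order is up to the compiler.

```c
/* corrupt.c: a sketch of the debug target, not the guide's original source. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

int global_value = 100;

/* Assumption: a struct pins the buffer directly in front of the victim.
 * Separate locals would work too, but their order is up to the compiler. */
struct frame_vars {
    int buffer[2];
    int local_value;
};

/* BUG (intentional): "<=" runs one iteration too far, so the last store
 * lands on whatever follows the buffer. */
void buggy_function(int *buf, int count) {
    for (int i = 0; i <= count; i++) {
        buf[i] = 0;
    }
}

/* Hypothetical driver: triggers the bug once and returns the victim's value. */
int run_corruption(void) {
    struct frame_vars vars = { {1, 2}, 200 };

    /* Check the layout assumption: no padding between buffer and victim. */
    assert(offsetof(struct frame_vars, local_value) == sizeof vars.buffer);

    buggy_function(vars.buffer, 2);
    printf("global_value=%d local_value=%d\n", global_value, vars.local_value);
    return vars.local_value;
}
```

To use it as the debug target, add a main that calls run_corruption() and build with clang -g -O0. The out-of-bounds store is undefined behaviour by design, so the result is only predictable in unoptimized builds. With this layout the watched variable is the struct member, so the command would likely be watchpoint set variable vars.local_value, set while stopped inside run_corruption. If your LLDB version does not accept the member path, watchpoint set expression -s 4 -- &vars.local_value should watch the same address.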
4.4 Algorithm Overview
Key Algorithm: Watchpoint Debugging
- Set watchpoint on local_value.
- Run until watchpoint triggers.
- Inspect the current frame and backtrace.
- Identify the faulty write.
Complexity Analysis:
- Time: O(1) for each watchpoint trigger
- Space: O(1)
5. Implementation Guide
5.1 Development Environment Setup
clang -g -o corrupt corrupt.c
5.2 Project Structure
project-root/
├── corrupt.c
└── README.md
5.3 The Core Question You’re Answering
“How do I catch the first write that corrupts memory?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Pointer Arithmetic
- What does p + 1 mean for an int *?
- How does size affect offsets?
- Book Reference: “Effective C” Ch. 4
- Watchpoints
- Hardware vs software watchpoints.
- Limitations on number of active watchpoints.
- Stack Layout
- Why locals can be adjacent in memory.
- How compiler optimizations affect layout.
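The pointer-arithmetic question can be checked directly. The sketch below measures how many bytes p + 1 advances for an int *; the helper names `int_stride_in_bytes` and `print_stride` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: byte distance between p and p + 1 for an int pointer. */
size_t int_stride_in_bytes(void) {
    int pair[2] = {0, 0};
    int *p = &pair[0];

    /* Compare the two addresses as byte pointers to get a distance in bytes. */
    ptrdiff_t distance = (char *)(p + 1) - (char *)p;

    assert(distance > 0);
    return (size_t)distance;
}

/* Prints the stride so it can be compared with the addresses LLDB reports. */
void print_stride(void) {
    printf("p + 1 advances %zu bytes (sizeof(int) = %zu)\n",
           int_stride_in_bytes(), sizeof(int));
}
```

The stride equals sizeof(int), not 1, because pointer arithmetic is scaled by the pointed-to type. This is why an index that is off by one lands exactly on the next int-sized slot.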
5.5 Questions to Guide Your Design
Before implementing, think through these:
- Which variable will be corrupted, and why?
- What line should the watchpoint stop on?
- What does the backtrace tell you about how you got there?
5.6 Thinking Exercise
Predict the Corruption
Draw a small memory diagram and show how *(p + 1) overwrites the next int.
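As a way to check your diagram, the sketch below prints the addresses and values of two neighbouring ints before and after a write through p + 1. Both ints live in one array here, so the write is well defined; the function names are hypothetical. In the real bug the neighbour is an unrelated variable, and the same store becomes the corruption.

```c
#include <assert.h>
#include <stdio.h>

/* Writes through p + 1, i.e. one int past the element p points at. */
void write_next_int(int *p, int value) {
    *(p + 1) = value;
}

/* Hypothetical demo: cells[0] plays the intended target, cells[1] the
 * neighbour. Both live in one array, so the write is well defined here. */
int neighbour_after_write(void) {
    int cells[2] = {100, 200};
    int *p = &cells[0];

    printf("before: [%p] %d  [%p] %d\n",
           (void *)&cells[0], cells[0], (void *)&cells[1], cells[1]);
    write_next_int(p, 0);
    printf("after:  [%p] %d  [%p] %d\n",
           (void *)&cells[0], cells[0], (void *)&cells[1], cells[1]);

    assert(cells[0] == 100);   /* the element p points at is untouched */
    return cells[1];           /* the neighbour now holds the new value */
}
```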
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “What is a watchpoint, and how is it different from a breakpoint?”
- “How would you debug an off-by-one write?”
- “Why do watchpoints have limits?”
5.8 Hints in Layers
Hint 1: Use variable watchpoints
(lldb) watchpoint set variable local_value
Hint 2: Check the stack
(lldb) bt
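Hint 3: Add a condition. The condition is evaluated in the frame where the write happens, so refer to the watched address (taken from the watchpoint set output) rather than to a local that is only visible in main.
(lldb) watchpoint modify -c '*(int *)0x7ffee7c0f5ec == 0' 1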
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Pointer arithmetic | “Effective C” | Ch. 4 |
| Debugging memory bugs | “The Art of Debugging with GDB” | Ch. 7 |
| Stack layout | CS:APP | Ch. 3.7 |
5.10 Implementation Phases
Phase 1: Foundation (30 minutes)
Goals:
- Create the buggy program.
- Build with debug symbols.
Tasks:
- Write corrupt.c with an off-by-one write.
- Compile with -g.
Checkpoint: Program compiles and runs.
Phase 2: Core Functionality (40 minutes)
Goals:
- Set watchpoint and catch the write.
Tasks:
- Set watchpoint on local_value.
- Continue execution until watchpoint triggers.
Checkpoint: LLDB stops in buggy_function.
Phase 3: Polish & Edge Cases (30 minutes)
Goals:
- Use conditional watchpoints.
- Explain the memory layout.
Tasks:
- Add a conditional watchpoint.
- Use memory read to view nearby bytes.
Checkpoint: You can explain why the corruption happened.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Bug type | Off-by-one vs buffer overflow | Off-by-one | Predictable and small |
| Watchpoint target | Local vs global | Local | Shows stack corruption |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Watchpoint Trigger | Ensure it fires | watchpoint set variable |
| Stack Inspection | Validate culprit | bt, frame variable |
| Memory Read | Confirm corruption | memory read |
6.2 Critical Test Cases
- Watchpoint hits: LLDB stops on the write.
- Backtrace shows culprit: buggy_function is on top.
- Old/new values: LLDB shows change from 200 to 0.
6.3 Test Data
old value: 200
new value: 0
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Optimizations reorder locals | Watchpoint hits unexpected lines | Use -O0 -g |
| Too many watchpoints | LLDB refuses to set a new one | Remove unused watchpoints with watchpoint delete |
| Wrong variable scope | Watchpoint never triggers | Stop in the frame that owns the variable, then set the watchpoint |
7.2 Debugging Strategies
- Validate the address: watchpoint list shows the exact address.
- Inspect nearby memory: memory read --format x --count 8 &local_value.
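To cross-check what memory read shows, the same bytes can be printed from inside the program. The sketch below is one way to do that. `format_bytes` is a hypothetical helper, not part of LLDB or the guide's program, and it formats individual bytes rather than the words memory read may group them into.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats `count` bytes starting at `addr` as
 * space-separated two-digit hex into `out`. Returns the string length. */
size_t format_bytes(const void *addr, size_t count, char *out, size_t out_size) {
    const unsigned char *bytes = (const unsigned char *)addr;
    size_t used = 0;

    /* Up to three characters per byte, plus the terminator. */
    assert(out_size >= count * 3 + 1);
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        if (i > 0) {
            out[used++] = ' ';
        }
        used += (size_t)snprintf(out + used, out_size - used,
                                 "%02x", (unsigned)bytes[i]);
    }
    return used;
}
```

Called as format_bytes(&local_value, sizeof local_value, text, sizeof text), a little-endian machine with a 4-byte int would produce c8 00 00 00 for the value 200 before the corruption and 00 00 00 00 afterwards.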
7.3 Performance Traps
Hardware watchpoints are a scarce resource (four on x86), so delete the ones you no longer need.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a second corrupted variable and set multiple watchpoints.
- Use watchpoint set expression with an address.
8.2 Intermediate Extensions
- Debug a stack buffer overflow with watchpoints.
- Compare watchpoint behavior in optimized vs debug builds.
8.3 Advanced Extensions
- Use LLDB stop hooks to auto-print the stack on watchpoint hit.
- Write a script to manage watchpoints across multiple runs.
9. Real-World Connections
9.1 Industry Applications
- Memory corruption triage: Use watchpoints to find writes in legacy C/C++.
- Security debugging: Identify buffer overflows and write gadgets.
9.2 Related Open Source Projects
- LLVM Sanitizers: https://clang.llvm.org/docs/AddressSanitizer.html
- LLDB: https://lldb.llvm.org
9.3 Interview Relevance
- Understanding watchpoints and memory corruption is valuable for systems roles.
10. Resources
10.1 Essential Reading
- LLDB Watchpoint Docs - https://lldb.llvm.org/use/watchpoints.html
- Effective C by Robert C. Seacord - Ch. 4
10.2 Video Resources
- Watchpoint debugging demos - LLDB community videos
10.3 Tools & Documentation
- watchpoint command: https://lldb.llvm.org/use/command.html#watchpoint
10.4 Related Projects in This Series
- LLDB Python Scripting: automate repetitive watchpoint workflows.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how watchpoints work at the hardware level.
- I can explain why the off-by-one write corrupts memory.
- I can use conditional watchpoints.
11.2 Implementation
- Watchpoint triggers on the correct line.
- I can inspect old vs new values.
- I can identify the corrupting instruction.
11.3 Growth
- I can apply watchpoints to larger programs.
- I can combine watchpoints with stop hooks.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Create a corruption bug and catch it with a watchpoint.
- Identify the exact line and explain the cause.
Full Completion:
- Use a conditional watchpoint and document why it helps.
- Inspect raw memory around the corrupted variable.
Excellence (Going Above & Beyond):
- Compare watchpoints with AddressSanitizer output.
- Automate watchpoint setup with an LLDB script.
This guide was generated from LEARN_LLDB_DEEP_DIVE.md. For the complete learning path, see the parent directory.