Project 7: Build a Mini-Debugger (ptrace)
Implement a tiny debugger in C that can launch a process, set breakpoints, read registers, and single-step.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1-2 weeks |
| Language | C (Linux) |
| Prerequisites | Projects 1-5, Linux syscalls, basic assembly |
| Key Topics | ptrace, breakpoints, process control, registers |
1. Learning Objectives
By completing this project, you will:
- Use `ptrace` to start and control a child process.
- Implement software breakpoints by patching in `int 3`.
- Read and write registers with `PTRACE_GETREGS` and `PTRACE_SETREGS`.
- Build a tiny REPL that mirrors essential GDB behavior.
2. Theoretical Foundation
2.1 Core Concepts
- ptrace: Kernel API that allows one process to observe and control another.
- Software breakpoints: Replace instruction byte with `0xCC` (`int 3`) and restore later.
- Process control: `waitpid` informs the debugger when the tracee stops.
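As a rough illustration of the process-control point, the status word returned by `waitpid` can be decoded with the standard `W*` macros. The `stop_kind` enum and `classify_status` name are hypothetical, introduced only for this sketch.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical classification used only in this sketch. */
enum stop_kind {
    STOP_EXITED,   /* tracee exited normally */
    STOP_KILLED,   /* tracee was terminated by a signal */
    STOP_TRAP,     /* stopped with SIGTRAP: exec, breakpoint, or single-step */
    STOP_SIGNAL,   /* stopped by some other signal */
    STOP_UNKNOWN
};

/* Decode the status word that waitpid filled in for the tracee. */
enum stop_kind classify_status(int status)
{
    if (WIFEXITED(status))
        return STOP_EXITED;
    if (WIFSIGNALED(status))
        return STOP_KILLED;
    if (WIFSTOPPED(status))
        return WSTOPSIG(status) == SIGTRAP ? STOP_TRAP : STOP_SIGNAL;
    return STOP_UNKNOWN;
}
```

A debugger loop would typically call something like this after every `waitpid` and only touch the tracee's registers or memory when the result is one of the stopped states.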
2.2 Why This Matters
When you build a mini-debugger, you stop treating GDB as magic. You understand the OS-level mechanics that enable debugging.
2.3 Historical Context / Background
ptrace dates back to early UNIX systems and remains the foundation for tools such as GDB, strace, and rr.
2.4 Common Misconceptions
- “Breakpoints are metadata”: They are real instruction byte patches.
- “You need full symbols”: Basic debugging works with addresses alone.
3. Project Specification
3.1 What You Will Build
A minimal debugger executable that:
- launches a target program,
- sets breakpoints by address,
- continues and single-steps,
- prints register state.
3.2 Functional Requirements
- Launch a child process with `fork` + `exec` + `PTRACE_TRACEME`.
- Implement `break <addr>` to install a breakpoint.
- Implement `continue`, `step`, and `regs` commands.
- Restore instructions when resuming from a breakpoint.
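The launch requirement can be sketched as follows. This is a minimal version under stated assumptions: `launch_tracee` is a hypothetical helper name, the target is found through `PATH` by `execvp`, and no ptrace options are set, so the exec stop shows up as a plain `SIGTRAP`.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, mark the child as traced, exec the target.
 * Returns the tracee PID once it has stopped at exec, or -1 on failure. */
pid_t launch_tracee(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced by the parent, then replace the image. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }

    /* Parent: the kernel stops the child with SIGTRAP once exec succeeds. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return pid;

    if (WIFSTOPPED(status)) {
        /* Stopped for an unexpected reason: give up cleanly. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
    }
    return -1;  /* child exited (for example, exec failed) or odd stop */
}
```

On success the tracee is paused before its first instruction, which is the point where the REPL can start accepting `break` commands.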
3.3 Non-Functional Requirements
- Reliability: Breakpoints should be reversible.
- Performance: Acceptable for small debugging sessions.
- Usability: Simple REPL with clear output.
3.4 Example Usage / Output
mini_gdb> break 0x40100a
mini_gdb> continue
Stopped at breakpoint 1: 0x40100a
mini_gdb> regs
rax: 0x5
rip: 0x40100a
3.5 Real World Outcome
When you run your debugger, you can control a target:
$ ./mini_gdb ./my_program
mini_gdb> break 0x40100a
Breakpoint set at 0x40100a
mini_gdb> continue
Stopped at breakpoint 1: 0x40100a
mini_gdb> regs
rax: 0x5
rbx: 0x0
rip: 0x40100a
mini_gdb> step
Stopped at 0x40100b
mini_gdb> quit
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ mini_gdb │────▶│ ptrace + wait│────▶│ tracee proc │
└──────────────┘ └──────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| REPL | Parse commands | Simple string parser |
| Breakpoint manager | Patch and restore int 3 | Store original byte |
| ptrace interface | Read/write regs and memory | Use PTRACE_GETREGS/PTRACE_SETREGS for registers and PTRACE_PEEKDATA/PTRACE_POKEDATA for memory |
4.3 Data Structures
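#include <stddef.h>     /* size_t */
#include <sys/types.h>  /* pid_t */

struct Breakpoint {
    unsigned long addr;           /* address of the patched instruction */
    unsigned char original_byte;  /* byte that 0xCC replaced */
    int enabled;                  /* nonzero while 0xCC is installed */
};

struct DebuggerState {
    pid_t tracee;                 /* PID of the traced child */
    struct Breakpoint bps[32];    /* fixed-size breakpoint table */
    size_t bp_count;              /* number of slots in use */
};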
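One possible way to manage this table is a linear lookup plus an append that rejects duplicates. The helper names `bp_find` and `bp_add` and the `MAX_BREAKPOINTS` constant are assumptions for this sketch; the structs are repeated so the block compiles on its own.

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_BREAKPOINTS 32  /* matches the fixed array size above */

struct Breakpoint {
    unsigned long addr;
    unsigned char original_byte;
    int enabled;
};

struct DebuggerState {
    pid_t tracee;
    struct Breakpoint bps[MAX_BREAKPOINTS];
    size_t bp_count;
};

/* Return the breakpoint recorded at addr, or NULL if there is none. */
struct Breakpoint *bp_find(struct DebuggerState *st, unsigned long addr)
{
    for (size_t i = 0; i < st->bp_count; i++)
        if (st->bps[i].addr == addr)
            return &st->bps[i];
    return NULL;
}

/* Record a new, not yet installed breakpoint.
 * Returns its 1-based number, 0 if addr is already present, -1 if full. */
int bp_add(struct DebuggerState *st, unsigned long addr)
{
    if (bp_find(st, addr) != NULL)
        return 0;
    if (st->bp_count == MAX_BREAKPOINTS)
        return -1;

    struct Breakpoint *bp = &st->bps[st->bp_count++];
    bp->addr = addr;
    bp->original_byte = 0;
    bp->enabled = 0;
    return (int)st->bp_count;
}
```

Rejecting duplicates matters because installing a second breakpoint at the same address would read back `0xCC` and save it as the "original" byte.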
4.4 Algorithm Overview
Key Algorithm: Software breakpoint handling
- Read original byte at address.
- Write `0xCC` to set breakpoint.
- On stop, restore byte, adjust RIP, single-step, and re-insert breakpoint.
Complexity Analysis:
- Time: O(1) per breakpoint hit.
- Space: O(B) for number of breakpoints.
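The stop-handling step of the algorithm might look like the sketch below. It assumes x86-64 Linux (little-endian, `rip` in `struct user_regs_struct`) and that the caller has already seen a `SIGTRAP` stop from `waitpid`. The names `poke_byte` and `step_over_breakpoint` are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* Replace the byte at addr in the tracee (low byte of the word on x86-64). */
static int poke_byte(pid_t pid, unsigned long addr, unsigned char byte)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    long patched = (word & ~0xFFL) | byte;
    return ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)patched) < 0 ? -1 : 0;
}

/* Hypothetical helper: step past the breakpoint at addr and re-arm it.
 * Returns 0 when re-armed, 1 if the tracee ended during the step, -1 on error. */
int step_over_breakpoint(pid_t pid, unsigned long addr, unsigned char original_byte)
{
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) < 0)
        return -1;
    if (regs.rip != addr + 1)
        return -1;  /* not stopped just after this breakpoint's 0xCC */

    /* Restore the original instruction and rewind RIP onto it. */
    if (poke_byte(pid, addr, original_byte) < 0)
        return -1;
    regs.rip = addr;
    if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) < 0)
        return -1;

    /* Execute exactly the restored instruction. */
    if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
        return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return 1;

    /* Re-insert the breakpoint so the next pass through addr stops again. */
    return poke_byte(pid, addr, 0xCC);
}
```

After this returns 0, the debugger can issue `PTRACE_CONT` without immediately re-hitting the same breakpoint.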
5. Implementation Guide
5.1 Development Environment Setup
uname -a
man ptrace
5.2 Project Structure
project-root/
├── src/
│ ├── main.c
│ ├── repl.c
│ ├── breakpoints.c
│ └── ptrace_wrap.c
├── include/
│ └── debugger.h
└── Makefile
5.3 The Core Question You’re Answering
“How does a debugger actually control another process?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- ptrace lifecycle
  - `PTRACE_TRACEME`, `PTRACE_CONT`, `waitpid`
- Breakpoints
  - `int 3` opcode and RIP adjustment
- Registers
  - `struct user_regs_struct`
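To make the register item concrete, here is a small sketch of how `struct user_regs_struct` could feed a `regs` command on x86-64 Linux. The helper names `format_regs` and `print_regs` are hypothetical, and only three registers are shown to match the sample output in section 3.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Format a few x86-64 registers in the style of the sample output.
 * Returns what snprintf returns: the length the full text needs. */
int format_regs(const struct user_regs_struct *regs, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "rax: 0x%llx\nrbx: 0x%llx\nrip: 0x%llx\n",
                    (unsigned long long)regs->rax,
                    (unsigned long long)regs->rbx,
                    (unsigned long long)regs->rip);
}

/* Fetch the registers of a stopped tracee and print them to stdout.
 * Returns 0 on success, -1 if the registers cannot be read. */
int print_regs(pid_t pid)
{
    struct user_regs_struct regs;
    char buf[128];

    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) < 0)
        return -1;
    if (format_regs(&regs, buf, sizeof buf) < 0)
        return -1;
    fputs(buf, stdout);
    return 0;
}
```

Separating formatting from the `ptrace` call keeps the formatting testable without a live tracee.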
5.5 Questions to Guide Your Design
- How will you map breakpoints to original bytes?
- What happens if a breakpoint is hit twice?
- How do you avoid corrupting instruction streams?
5.6 Thinking Exercise
If the breakpoint is set on a multi-byte instruction, why is writing a single 0xCC still valid?
5.7 The Interview Questions They’ll Ask
- How does GDB set a breakpoint under the hood?
- What does `PTRACE_SINGLESTEP` do?
- Why must you decrement RIP after a breakpoint hit?
5.8 Hints in Layers
Hint 1: Tracee setup
- Child calls `PTRACE_TRACEME` then `exec`.
Hint 2: Breakpoint patching
- Read a word, change low byte to `0xCC`, write back.
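The word arithmetic behind Hint 2 can be isolated into two pure helpers. This is a sketch assuming a little-endian target such as x86-64, where the byte at the peeked address is the low byte of the returned word. The names `low_byte` and `with_low_byte` are hypothetical.

```c
/* Byte stored at the peeked address: the low byte on little-endian x86-64. */
unsigned char low_byte(long word)
{
    return (unsigned char)(word & 0xFF);
}

/* Same word with only its low byte replaced; the other bytes are untouched. */
long with_low_byte(long word, unsigned char byte)
{
    return (word & ~0xFFL) | byte;
}
```

For example, if the peeked word is `0xe5894855` (the bytes `55 48 89 e5`, a common function prologue), `low_byte` gives `0x55` to save as the original byte, and `with_low_byte(word, 0xCC)` gives `0xe58948cc` to write back with `PTRACE_POKEDATA`.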
Hint 3: Breakpoint hit
- Restore original byte, set RIP back one, single-step.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| ptrace | TLPI | Ch. 19 |
| Debugger internals | “The Linux Programming Interface” | Ch. 20 |
| Assembly and traps | CSAPP | Ch. 3 |
5.10 Implementation Phases
Phase 1: Foundation (2-3 days)
Goals:
- Start and stop a child process.
Tasks:
- Implement `fork` + `PTRACE_TRACEME`.
- Wait for initial stop and print PID.
Checkpoint: Child stops at exec and you can continue it.
Phase 2: Core Functionality (4-6 days)
Goals:
- Add breakpoints and register reads.
Tasks:
- Implement breakpoint set/clear.
- Implement `regs` command.
Checkpoint: Breakpoints stop execution at correct address.
Phase 3: Polish & Edge Cases (3-5 days)
Goals:
- Handle repeated hits and multiple breakpoints.
Tasks:
- Fix RIP adjustments.
- Support multiple breakpoints.
Checkpoint: No crashes after repeated break/continue.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Breakpoint storage | array vs hashmap | array | Simplicity for small count |
| Address input | hex only vs mixed | hex only | Avoid parsing ambiguity |
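A hex-only parser for the REPL could be sketched as below. The `cmd_kind` enum, `struct command`, and `parse_command` are hypothetical names, and the input line is assumed to have its trailing newline already removed.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum cmd_kind { CMD_BREAK, CMD_CONTINUE, CMD_STEP, CMD_REGS, CMD_QUIT, CMD_INVALID };

struct command {
    enum cmd_kind kind;
    unsigned long addr;  /* only meaningful for CMD_BREAK */
};

/* Parse one REPL line. Addresses are hex, with or without a 0x prefix. */
struct command parse_command(const char *line)
{
    struct command cmd = { CMD_INVALID, 0 };

    if (strcmp(line, "continue") == 0) {
        cmd.kind = CMD_CONTINUE;
    } else if (strcmp(line, "step") == 0) {
        cmd.kind = CMD_STEP;
    } else if (strcmp(line, "regs") == 0) {
        cmd.kind = CMD_REGS;
    } else if (strcmp(line, "quit") == 0) {
        cmd.kind = CMD_QUIT;
    } else if (strncmp(line, "break ", 6) == 0) {
        const char *arg = line + 6;
        char *end;

        /* Require a leading hex digit so signs and spaces are rejected. */
        if (!isxdigit((unsigned char)arg[0]))
            return cmd;

        errno = 0;
        unsigned long addr = strtoul(arg, &end, 16);
        if (errno == 0 && *end == '\0') {
            cmd.kind = CMD_BREAK;
            cmd.addr = addr;
        }
    }
    return cmd;
}
```

Passing base 16 to `strtoul` is what removes the ambiguity mentioned in the table: `40100a` and `0x40100a` both parse as the same address, and anything with trailing non-hex characters is rejected.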
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Process control | Verify launch/continue | Child runs after continue |
| Breakpoints | Verify stop | rip equals address |
| Registers | Validate reads | regs prints values |
6.2 Critical Test Cases
- Breakpoint hit pauses at correct address.
- `step` advances exactly one instruction.
- Multiple breakpoints do not conflict.
6.3 Test Data
Target program with a known function address
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong RIP after hit | Breakpoints skip or loop | Decrement RIP by 1 |
| Missing waitpid | Race conditions | Wait for stop after each action |
| Bad memory writes | Crashes | Restore original byte correctly |
7.2 Debugging Strategies
- Use `strace` on your debugger to validate ptrace calls.
- Print debug logs for each ptrace request.
7.3 Performance Traps
Single-stepping every instruction is slow; use it sparingly.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add `disassemble` by calling `objdump`.
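One way to approach this extension is to build an `objdump` command line for a small address window and run it with `popen` or `system`. The sketch below only builds the string. `build_objdump_cmd` is a hypothetical helper, and it assumes a non-PIE target so that runtime addresses match the addresses in the file.

```c
#include <stdio.h>
#include <string.h>

/* Build "objdump -d" for the range [start, start + nbytes) of the file at path.
 * Returns 0 on success, -1 if the path is unsafe to quote or buf is too small. */
int build_objdump_cmd(char *buf, size_t len, const char *path,
                      unsigned long start, unsigned long nbytes)
{
    /* The path is wrapped in single quotes, so it must not contain one. */
    if (strchr(path, '\'') != NULL)
        return -1;

    int n = snprintf(buf, len,
                     "objdump -d --start-address=0x%lx --stop-address=0x%lx '%s'",
                     start, start + nbytes, path);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A `disassemble` command could call this with the current `rip` and a window of a few dozen bytes, then stream the output of the resulting command to the user.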
8.2 Intermediate Extensions
- Add symbol resolution with `libelf`.
8.3 Advanced Extensions
- Add software watchpoints by single-stepping and checking the watched memory after each instruction.
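A software watchpoint in this style can be sketched as a loop that single-steps and compares one word of tracee memory after every instruction. `run_until_changed` is a hypothetical helper. It assumes the tracee is already stopped, and for brevity it does not forward signals that arrive during the steps.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: single-step until the word at addr changes.
 * Returns 1 if the value changed, 0 if the tracee ended first, -1 on error. */
int run_until_changed(pid_t pid, unsigned long addr)
{
    errno = 0;
    long before = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    if (before == -1 && errno != 0)
        return -1;

    for (;;) {
        /* Run exactly one instruction, then wait for the resulting stop. */
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
            return -1;
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (!WIFSTOPPED(status))
            return 0;

        /* Compare the watched word against the value seen at the start. */
        errno = 0;
        long now = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
        if (now == -1 && errno != 0)
            return -1;
        if (now != before)
            return 1;
    }
}
```

Each iteration costs several system calls per executed instruction, so this approach is only practical for short stretches of code.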
9. Real-World Connections
9.1 Industry Applications
- Debuggers: GDB, LLDB, rr all use the same fundamentals.
- Security: Exploit development and reverse engineering rely on ptrace.
9.2 Related Open Source Projects
- gdb: Full-featured debugger.
- rr: Record and replay debugging.
9.3 Interview Relevance
- Demonstrates OS internals knowledge and low-level tooling skills.
10. Resources
10.1 Essential Reading
- TLPI - Process tracing and signals.
- ptrace(2) man page.
10.2 Video Resources
- Search: “build a debugger ptrace”.
10.3 Tools & Documentation
- GDB: https://sourceware.org/gdb/
- man 2 ptrace
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how `ptrace` works.
- I can describe how a breakpoint is implemented.
11.2 Implementation
- My debugger can set and hit a breakpoint.
- My debugger can read registers.
11.3 Growth
- I can explain GDB internals at a high level.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Launch a target and single-step it.
Full Completion:
- Set and handle breakpoints correctly.
Excellence (Going Above & Beyond):
- Add symbol lookup and source line mapping.
This guide was generated from LEARN_GDB_DEEP_DIVE.md. For the complete learning path, see the parent directory README.