Project 12: Debugger

Build a minimal debugger using ptrace to set breakpoints and inspect registers.

Quick Reference

Attribute      Value
Difficulty     Master
Time Estimate  4-6 weeks
Language       C
Prerequisites  Processes, signals, ELF basics
Key Topics     ptrace, breakpoints, symbols

1. Learning Objectives

By completing this project, you will:

  1. Use ptrace to control a child process.
  2. Set and remove software breakpoints.
  3. Read registers and memory.
  4. Map addresses to symbols with ELF parsing.

2. Theoretical Foundation

2.1 Core Concepts

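  • ptrace: The kernel system call for debugging and tracing. A tracer uses it to stop, resume, and inspect another process.
  • Breakpoints: Overwrite the first byte of an instruction with int3 (opcode 0xCC), then restore the original byte before re-executing it.
  • Registers: Read the instruction pointer and general-purpose registers of the stopped process.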

2.2 Why This Matters

A debugger exercises process control, signals, and binary formats all at once. Building one reveals how breakpoints, single-stepping, and symbol lookup really work.

2.3 Historical Context / Background

gdb and other debuggers rely on ptrace-like mechanisms. Modern tools layer richer UI on top of these core primitives.

2.4 Common Misconceptions

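  • “Breakpoints are magic”: A software breakpoint is just a one-byte patch. The debugger overwrites the start of an instruction with int3 and restores the original byte later.
  • “You can read memory freely”: You can only read the memory of a process you are tracing, and the kernel decides whether you are allowed to trace it at all.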

3. Project Specification

3.1 What You Will Build

A debugger that supports:

  • Launching a target process
  • Setting breakpoints by address
  • Continuing and single-stepping
  • Inspecting registers
  • Optional symbol lookup for functions

3.2 Functional Requirements

  1. Start a target under tracing.
  2. Set breakpoints using int3.
  3. Continue and stop on breakpoint.
  4. Print register values on stop.

3.3 Non-Functional Requirements

  • Correctness: Restore the original instruction bytes whenever a breakpoint is removed or stepped over.
  • Reliability: Handle multiple breakpoints.
  • Usability: Clear prompt and commands.

3.4 Example Usage / Output

$ ./dbg ./app
(dbg) break main
(dbg) run
[stop] hit breakpoint at main
(dbg) regs
RIP=0x401050 RSP=0x7ffd...
(dbg) step
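
The regs line in the session above could be produced along these lines. This is a minimal sketch that assumes x86-64 Linux, where struct user_regs_struct from <sys/user.h> exposes rip and rsp. The helper names format_regs and print_regs are made up for illustration and are not part of the project specification.

```c
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Format the two registers shown in the sample session into buf.
 * Returns what snprintf returns: the length of the full line,
 * not counting the terminating NUL. (Hypothetical helper.) */
int format_regs(const struct user_regs_struct *r, char *buf, size_t n)
{
    return snprintf(buf, n, "RIP=0x%llx RSP=0x%llx",
                    (unsigned long long)r->rip,
                    (unsigned long long)r->rsp);
}

/* Fetch the registers of a stopped tracee and print them.
 * Returns 0 on success, -1 if PTRACE_GETREGS fails
 * (for example because pid is not a stopped tracee of the caller). */
int print_regs(pid_t pid)
{
    struct user_regs_struct regs;
    char line[64];

    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    format_regs(&regs, line, sizeof line);
    puts(line);
    return 0;
}
```

Keeping the formatting separate from the ptrace call means the output format can be checked without a live tracee.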

3.5 Real World Outcome

You can set breakpoints and inspect registers while a program runs. This is the core capability of a debugger like gdb, implemented in C.


4. Solution Architecture

4.1 High-Level Design

CLI -> ptrace control -> breakpoint table -> register/memory inspect
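
The CLI stage at the front of this pipeline can stay very small. The sketch below turns one input line into a command kind plus an optional argument. The verbs come from the example session and the phase tasks; the type and function names (CmdKind, Command, parse_command) and the buffer sizes are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

typedef enum {
    CMD_UNKNOWN, CMD_BREAK, CMD_RUN, CMD_CONTINUE, CMD_STEP, CMD_REGS
} CmdKind;

typedef struct {
    CmdKind kind;
    char arg[64];   /* "main" or "0x401050" for break; may be empty otherwise */
} Command;

/* Parse one line typed at the (dbg) prompt.
 * Unknown verbs, empty lines, and "break" without an argument
 * all come back as CMD_UNKNOWN. */
Command parse_command(const char *line)
{
    static const struct { const char *verb; CmdKind kind; int needs_arg; } table[] = {
        { "break",    CMD_BREAK,    1 },
        { "run",      CMD_RUN,      0 },
        { "continue", CMD_CONTINUE, 0 },
        { "step",     CMD_STEP,     0 },
        { "regs",     CMD_REGS,     0 },
    };
    Command cmd = { CMD_UNKNOWN, "" };
    char verb[16] = "";

    /* The field widths keep sscanf from writing past either buffer. */
    int fields = sscanf(line, "%15s %63s", verb, cmd.arg);
    if (fields < 1)
        return cmd;

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(verb, table[i].verb) == 0 &&
            (!table[i].needs_arg || fields == 2)) {
            cmd.kind = table[i].kind;
            break;
        }
    }
    return cmd;
}
```

A table of verbs keeps the REPL loop itself down to a switch on cmd.kind, and adding a command later means adding one table row.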

4.2 Key Components

Component        Responsibility           Key Decisions
Target launcher  fork + exec              Use PTRACE_TRACEME
Breakpoints      Insert/remove int3       Store original byte
Command loop     Parse debugger commands  Minimal REPL
Symbol loader    Map names to addresses   Parse ELF symbols
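
For the target launcher row, one possible shape is sketched below, assuming Linux. The function name launch_target and the exit codes 126 and 127 are illustrative choices. The child requests tracing with PTRACE_TRACEME and then calls exec; the parent waits for the stop that the kernel reports once the exec succeeds.

```c
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv[0] under tracing.
 * Returns the child's pid once it has stopped at exec,
 * or -1 if the fork, the trace request, or the exec failed. */
pid_t launch_target(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced by the parent, then replace the image.
         * A traced process receives SIGTRAP on a successful exec, which
         * stops it before the new program runs any of its own code. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    /* Parent: wait for the exec stop. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))        /* child exited instead of stopping */
        return -1;
    return pid;
}
```

Because the child is stopped when this returns, the debugger can insert breakpoints before the program has executed anything, then resume it with PTRACE_CONT.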

4.3 Data Structures

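#include <stdint.h>

typedef struct {
    void *addr;          /* address of the patched instruction in the tracee */
    uint8_t saved_byte;  /* original byte that int3 (0xCC) replaced */
} Breakpoint;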
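
To handle several breakpoints, the struct needs a container. The sketch below wraps it in a fixed-size table. BreakpointTable, MAX_BREAKPOINTS, and the bp_* functions are hypothetical names, and the cap of 32 is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BREAKPOINTS 32   /* arbitrary cap for this sketch */

typedef struct {
    void *addr;
    uint8_t saved_byte;
} Breakpoint;

typedef struct {
    Breakpoint items[MAX_BREAKPOINTS];
    size_t count;
} BreakpointTable;

/* Return the entry for addr, or NULL if no breakpoint is set there. */
Breakpoint *bp_find(BreakpointTable *t, const void *addr)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->items[i].addr == addr)
            return &t->items[i];
    return NULL;
}

/* Record a breakpoint. Returns NULL if the table is full or if addr
 * already has an entry: inserting twice would save 0xCC as the
 * "original" byte and lose the real one. */
Breakpoint *bp_add(BreakpointTable *t, void *addr, uint8_t saved_byte)
{
    if (t->count == MAX_BREAKPOINTS || bp_find(t, addr) != NULL)
        return NULL;
    t->items[t->count] = (Breakpoint){ addr, saved_byte };
    return &t->items[t->count++];
}

/* Forget a breakpoint and hand back its saved byte so the caller can
 * restore it in the tracee. Returns 0 on success, -1 if absent. */
int bp_remove(BreakpointTable *t, const void *addr, uint8_t *saved_byte)
{
    Breakpoint *bp = bp_find(t, addr);
    if (bp == NULL)
        return -1;
    *saved_byte = bp->saved_byte;
    *bp = t->items[--t->count];   /* swap-remove: order is not preserved */
    return 0;
}
```

The linear scan is fine for a handful of breakpoints. A hash table keyed by address would be the natural upgrade if the count grows.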

4.4 Algorithm Overview

Key Algorithm: Set breakpoint

  1. Read the word at the address with PTRACE_PEEKDATA.
  2. Save its low byte (on little-endian x86 this is the byte at the address itself).
  3. Replace the low byte with 0xCC (int3).
  4. Write the modified word back with PTRACE_POKEDATA.
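
The four steps above can be sketched as follows, assuming x86-64 Linux (little-endian, 64-bit long). The function names are made up for illustration. The byte manipulation is split from the ptrace calls so it can be checked on its own.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#define INT3 0xCC

/* Steps 2 and 3: save the low byte of word and return the word with
 * that byte replaced by int3. */
long word_with_int3(long word, uint8_t *saved)
{
    *saved = (uint8_t)(word & 0xFF);
    return (word & ~0xFFL) | INT3;
}

/* Inverse operation: put an arbitrary byte (normally the saved one)
 * back into the low byte of word. */
long word_with_byte(long word, uint8_t byte)
{
    return (word & ~0xFFL) | byte;
}

/* Steps 1 to 4 against a stopped tracee.
 * Returns 0 on success, -1 if either ptrace call fails. */
int set_breakpoint(pid_t pid, void *addr, uint8_t *saved)
{
    /* PEEKDATA returns the data itself, so -1 is a valid result;
     * errno is the only way to tell an error from a word of all ones. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;

    long patched = word_with_int3(word, saved);
    if (ptrace(PTRACE_POKEDATA, pid, addr, (void *)patched) == -1)
        return -1;
    return 0;
}
```

Removing a breakpoint is the same peek and poke with word_with_byte(word, saved) in place of word_with_int3.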

Complexity Analysis:

  • Breakpoint insert/remove: O(1) ptrace calls each, plus the lookup in the breakpoint table
  • Symbol lookup: O(n) for a linear scan of the symbol table

5. Implementation Guide

5.1 Development Environment Setup

cc -Wall -Wextra -O2 -g -o dbg src/dbg.c src/elf.c

5.2 Project Structure

dbg/
├── src/
│   ├── dbg.c
│   └── elf.c
├── tests/
│   └── test_dbg.sh
└── README.md

5.3 The Core Question You’re Answering

“How does a debugger interrupt and inspect a running process?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Signals and traps
    • How does SIGTRAP work?
  2. ptrace API
    • Difference between PTRACE_CONT and PTRACE_SINGLESTEP.
  3. ELF symbols
    • How do function names map to addresses?

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Will you support symbol names or addresses only?
  2. How will you manage multiple breakpoints?
  3. How will you handle stepping over a breakpoint?

5.6 Thinking Exercise

Breakpoint Step-Over

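When a breakpoint triggers, the instruction pointer is one byte past the int3, and the first byte of the original instruction is still overwritten. How do you restore and re-execute the original instruction, and how do you keep the breakpoint armed for the next hit?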
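
One common answer is sketched below, assuming x86-64 Linux and a tracee that has just stopped on the breakpoint. The function names are hypothetical. The sequence is: restore the original byte, rewind RIP by one, single-step the real instruction, then write int3 back.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* After int3 executes, RIP is one byte past it, so the breakpoint
 * sits at RIP - 1. */
uint64_t breakpoint_addr(uint64_t rip_after_trap)
{
    return rip_after_trap - 1;
}

/* Replace the low byte of a peeked word. */
long with_low_byte(long word, uint8_t byte)
{
    return (word & ~0xFFL) | byte;
}

/* Step a tracee past the breakpoint it just hit. On success the tracee
 * is stopped right after the original instruction and the breakpoint
 * is armed again. Returns 0 on success, -1 on a ptrace error or if the
 * tracee did not stop again (for example because it exited). */
int step_over_breakpoint(pid_t pid, uint8_t saved_byte)
{
    struct user_regs_struct regs;
    int status;

    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    uint64_t addr = breakpoint_addr(regs.rip);
    void *where = (void *)(uintptr_t)addr;

    /* 1. Put the original byte back. */
    errno = 0;
    long patched = ptrace(PTRACE_PEEKDATA, pid, where, NULL);
    if (patched == -1 && errno != 0)
        return -1;
    long original = with_low_byte(patched, saved_byte);
    if (ptrace(PTRACE_POKEDATA, pid, where, (void *)original) == -1)
        return -1;

    /* 2. Rewind RIP to the start of the restored instruction. */
    regs.rip = addr;
    if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) == -1)
        return -1;

    /* 3. Execute exactly that one instruction. */
    if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* 4. Re-arm: the word read in step 1 still has int3 in its low byte. */
    if (ptrace(PTRACE_POKEDATA, pid, where, (void *)patched) == -1)
        return -1;
    return 0;
}
```

This sketch does not check which signal caused the stop in step 3. A fuller version would confirm it was SIGTRAP and forward anything else to the tracee.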

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How does a software breakpoint work?”
  2. “What does ptrace do under the hood?”
  3. “How do you read registers of another process?”

5.8 Hints in Layers

Hint 1: Start with run/continue. Launch a child and just wait for it to exit.

Hint 2: Add a breakpoint at a fixed address. Hardcode an address before adding symbols.

Hint 3: Add symbol lookup. Parse the ELF symbol table to map names to addresses.

5.9 Books That Will Help

Topic            Book                     Chapter
Debugging tools  “The Art of Debugging”   Ch. 1-3
ELF format       “Linkers and Loaders”    ELF chapters

5.10 Implementation Phases

Phase 1: Foundation (7-10 days)

Goals:

  • Launch and control child

Tasks:

  1. Implement run, continue.
  2. Handle exit status.

Checkpoint: Child runs under debugger.
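
Handling the exit status mostly means decoding what waitpid reports. The sketch below sorts a status word into the cases a debugger loop cares about. StopKind, StopInfo, and classify_status are illustrative names, not part of the project specification.

```c
#include <signal.h>
#include <sys/wait.h>

typedef enum {
    STOP_EXITED,    /* tracee exited normally; code is the exit status */
    STOP_KILLED,    /* tracee was terminated by a signal; code is the signal */
    STOP_TRAP,      /* stopped by SIGTRAP: exec, breakpoint, or single-step */
    STOP_SIGNAL,    /* stopped by another signal; code is the signal */
    STOP_UNKNOWN
} StopKind;

typedef struct {
    StopKind kind;
    int code;
} StopInfo;

/* Decode the status word filled in by waitpid. */
StopInfo classify_status(int status)
{
    if (WIFEXITED(status))
        return (StopInfo){ STOP_EXITED, WEXITSTATUS(status) };
    if (WIFSIGNALED(status))
        return (StopInfo){ STOP_KILLED, WTERMSIG(status) };
    if (WIFSTOPPED(status)) {
        int sig = WSTOPSIG(status);
        return (StopInfo){ sig == SIGTRAP ? STOP_TRAP : STOP_SIGNAL, sig };
    }
    return (StopInfo){ STOP_UNKNOWN, 0 };
}
```

In Phase 1 only STOP_EXITED and STOP_KILLED end the session. STOP_TRAP becomes the interesting case once breakpoints exist in Phase 2.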

Phase 2: Core Functionality (10-14 days)

Goals:

  • Breakpoints and registers

Tasks:

  1. Implement breakpoint insert/remove.
  2. Print registers on stop.

Checkpoint: Breakpoint stops program.

Phase 3: Symbols and Polish (7-10 days)

Goals:

  • Resolve symbols

Tasks:

  1. Parse ELF symbols.
  2. Support break main.

Checkpoint: Breakpoints by name work.
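
A minimal sketch of the lookup behind break main follows. It assumes a 64-bit ELF file with an unstripped .symtab and uses the definitions from <elf.h>. The names read_file, range_ok, and elf_find_function are hypothetical. For position-independent executables the value found is an offset from the load base rather than a final runtime address, which this sketch does not handle.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read a whole file into a malloc'd buffer. Returns NULL on failure. */
static unsigned char *read_file(const char *path, size_t *size)
{
    FILE *f = fopen(path, "rb");
    unsigned char *buf = NULL;
    long len;

    if (f == NULL)
        return NULL;
    if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) > 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)len);
        if (buf != NULL && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
        *size = (size_t)len;
    }
    fclose(f);
    return buf;
}

/* True if [off, off + len) lies inside a file of the given size. */
static int range_ok(uint64_t off, uint64_t len, size_t size)
{
    return off <= size && len <= size - off;
}

/* Find a function symbol by name in .symtab. Stores its st_value in
 * *addr and returns 0, or returns -1 if the file is not a usable
 * ELF64 image or the name is not present. */
int elf_find_function(const char *path, const char *name, uint64_t *addr)
{
    size_t size = 0;
    unsigned char *img = read_file(path, &size);
    Elf64_Ehdr eh;
    int rc = -1;

    if (img == NULL)
        return -1;
    if (size < sizeof eh)
        goto out;
    memcpy(&eh, img, sizeof eh);          /* copy out: avoids alignment issues */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        !range_ok(eh.e_shoff, (uint64_t)eh.e_shnum * sizeof(Elf64_Shdr), size))
        goto out;

    for (unsigned i = 0; i < eh.e_shnum && rc != 0; i++) {
        Elf64_Shdr tab, str;
        memcpy(&tab, img + eh.e_shoff + i * sizeof tab, sizeof tab);
        if (tab.sh_type != SHT_SYMTAB || tab.sh_link >= eh.e_shnum)
            continue;
        /* sh_link of a symbol table names its string table section. */
        memcpy(&str, img + eh.e_shoff + tab.sh_link * sizeof str, sizeof str);
        if (!range_ok(tab.sh_offset, tab.sh_size, size) ||
            !range_ok(str.sh_offset, str.sh_size, size))
            continue;

        for (uint64_t off = 0; off + sizeof(Elf64_Sym) <= tab.sh_size;
             off += sizeof(Elf64_Sym)) {
            Elf64_Sym sym;
            memcpy(&sym, img + tab.sh_offset + off, sizeof sym);
            if (ELF64_ST_TYPE(sym.st_info) != STT_FUNC ||
                sym.st_name >= str.sh_size)
                continue;
            const char *s = (const char *)img + str.sh_offset + sym.st_name;
            size_t max = str.sh_size - sym.st_name;   /* bytes left in strtab */
            if (strlen(name) < max && strncmp(s, name, max) == 0) {
                *addr = sym.st_value;
                rc = 0;
                break;
            }
        }
    }
out:
    free(img);
    return rc;
}
```

Only STT_FUNC entries are matched because break takes a function name. Data symbols would need a different filter.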

5.11 Key Implementation Decisions

Decision         Options                      Recommendation  Rationale
Breakpoint type  Software vs hardware         Software        Simpler and portable
Symbol lookup    External tools vs ELF parse  ELF parse       Learn internals

6. Testing Strategy

6.1 Test Categories

Category           Purpose               Examples
Unit Tests         ELF parsing           Known binaries
Integration Tests  Breakpoint hit        Simple loop program
Edge Cases         Multiple breakpoints  Different functions

6.2 Critical Test Cases

  1. Breakpoint hit: Stops at expected location.
  2. Continue: Resumes execution correctly.
  3. Step: Advances a single instruction.

6.3 Test Data

int main(){ int x=0; x++; return 0; }

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                    Symptom                 Solution
Not restoring instruction  Crash                   Write the saved byte back over int3 before resuming
Wrong RIP adjustment       Crash or infinite loop  Set RIP back by 1 after the trap so it points at the restored instruction
Incorrect ELF parsing      Wrong symbols           Validate headers

7.2 Debugging Strategies

  • Use gdb to compare register values.
  • Print raw bytes at breakpoint address.

7.3 Performance Traps

Symbol resolution can be slow if every lookup rescans the ELF file; parse the symbol table once and cache the results in memory.
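
One way to build such a cache is to sort the loaded symbols by address once and then use binary search to map a stopped RIP back to a function. The Symbol layout and the symcache_* names below are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t addr;      /* start address (st_value) */
    uint64_t size;      /* length in bytes (st_size) */
    const char *name;
} Symbol;

/* qsort comparator: ascending by start address. */
static int by_addr(const void *a, const void *b)
{
    uint64_t x = ((const Symbol *)a)->addr;
    uint64_t y = ((const Symbol *)b)->addr;
    return (x > y) - (x < y);
}

/* Sort once, right after loading the symbol table. */
void symcache_sort(Symbol *syms, size_t n)
{
    qsort(syms, n, sizeof *syms, by_addr);
}

/* Return the symbol whose [addr, addr + size) range contains pc,
 * or NULL if pc falls outside every symbol. O(log n) per lookup. */
const Symbol *symcache_lookup(const Symbol *syms, size_t n, uint64_t pc)
{
    size_t lo = 0, hi = n;

    /* Find the first entry that starts after pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;                    /* pc is below every symbol */

    /* The candidate is the last entry that starts at or before pc. */
    const Symbol *s = &syms[lo - 1];
    return pc - s->addr < s->size ? s : NULL;
}
```

Name-to-address lookups (for break main) happen rarely and can stay linear. Address-to-name lookups happen on every stop, which is where the sorted array pays off.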


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add memory read command.
  • Add disasm using objdump output.

8.2 Intermediate Extensions

  • Add watchpoints (memory breakpoints).
  • Add backtrace support.

8.3 Advanced Extensions

  • Add DWARF line info.
  • Implement remote debugging stub.

9. Real-World Connections

9.1 Industry Applications

  • Debugging: Core skill for systems work.
  • Security: Understanding breakpoints and memory inspection.
  • gdb: Full-featured debugger.

9.3 Interview Relevance

Debugger internals show deep OS and binary knowledge.


10. Resources

10.1 Essential Reading

  • “The Art of Debugging” - Ch. 1-3
  • “Linkers and Loaders” - ELF chapters

10.2 Video Resources

  • Debugger internals lectures

10.3 Tools & Documentation

  • man 2 ptrace: Core debugging API
  • Unix Shell: Process control basics.
  • Compiler Frontend: Produces symbols to debug.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain software breakpoints.
  • I understand ptrace control flow.
  • I can map symbols to addresses.

11.2 Implementation

  • Breakpoints work reliably.
  • Registers are printed correctly.
  • Symbols resolve correctly.

11.3 Growth

  • I can add watchpoints.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Run, continue, breakpoint by address.

Full Completion:

  • Breakpoints by symbol and register inspection.

Excellence (Going Above & Beyond):

  • DWARF line info and backtraces.

This guide was generated from C_PROGRAMMING_COMPLETE_MASTERY.md. For the complete learning path, see the parent directory.