Project 4: The Corruption - Watchpoints

Catch the exact line that corrupts a variable by using GDB watchpoints.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 2 hours |
| Language | GDB commands (C target) |
| Prerequisites | Project 1, pointers, stack layout |
| Key Topics | watchpoints, memory layout, pointer arithmetic |

1. Learning Objectives

By completing this project, you will:

  1. Set watchpoints on variables and memory addresses.
  2. Identify the exact instruction that modifies data unexpectedly.
  3. Explain memory corruption using stack layout and pointer math.
  4. Differentiate between watch, rwatch, and awatch.

2. Theoretical Foundation

2.1 Core Concepts

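  • Watchpoints: Stop execution when a watched expression changes value (watch), is read (rwatch), or is either read or written (awatch).
  • Hardware vs Software Watchpoints: Hardware watchpoints use CPU debug registers, so they are fast but limited in number (four on x86). Software watchpoints work by single-stepping the program and rechecking the value, which is far slower but not limited by the hardware.
  • Pointer Arithmetic: p + 1 advances the address by sizeof(*p) bytes, so a write through an off-by-one pointer lands on whatever data sits next to the intended target.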
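
To make the pointer arithmetic point concrete, here is a minimal sketch that measures how many bytes a single p + 1 step covers for two element types. The helper names (step_bytes_int, step_bytes_double, check_pointer_steps) are invented for this illustration and are not part of the project code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes covered by one `p + 1` step on an int pointer. */
size_t step_bytes_int(void)
{
    int cells[2] = {0, 0};
    int *p = cells;
    /* Casting to char * counts raw bytes between the two addresses. */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Same measurement for a double pointer. */
size_t step_bytes_double(void)
{
    double cells[2] = {0.0, 0.0};
    double *p = cells;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* The step always equals the size of the pointed-to type. */
void check_pointer_steps(void)
{
    assert(step_bytes_int() == sizeof(int));
    assert(step_bytes_double() == sizeof(double));
}
```

The same source expression, p + 1, therefore moves a different number of bytes depending on the pointer type, which is why an off-by-one index can overwrite a whole neighboring variable.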

2.2 Why This Matters

Memory corruption bugs are often hard to reproduce and trace. Watchpoints act like a tripwire and tell you who wrote the bad value.

2.3 Historical Context / Background

On x86, hardware watchpoints are implemented with the CPU's debug registers, which were introduced with the 80386. GDB exposes them through the watch, rwatch, and awatch commands.

2.4 Common Misconceptions

  • “Watchpoints always work”: They can fail if the variable is optimized into a register.
  • “Only writes matter”: Reads are often suspicious too (use rwatch).

3. Project Specification

3.1 What You Will Build

A small C program where a global is accidentally used to overwrite a neighboring stack variable. You will use a watchpoint to locate the corrupting line.
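
One possible shape for the buggy code is sketched below. It is an assumption-heavy illustration, not the guide's actual corrupt.c: the struct Frame wrapper, BUF_LEN, and run_corruption_demo are hypothetical names, and the struct exists only to pin local_value directly after the buffer, which separate locals would not guarantee. The out-of-bounds write is the intentional bug.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 4

/* Assumed layout: the struct fixes local_value right after buffer. */
struct Frame {
    int buffer[BUF_LEN];
    int local_value;
};

/* BUG (intentional): `<=` walks one element past the buffer. */
void buggy_function(int *p)
{
    for (int i = 0; i <= BUF_LEN; i++) {
        p[i] = 0;   /* when i == BUF_LEN this store hits local_value */
    }
}

/* Returns local_value after the buggy call so the damage is observable. */
int run_corruption_demo(void)
{
    struct Frame frame = { {1, 2, 3, 4}, 200 };

    /* The sketch relies on there being no padding between the members. */
    assert(offsetof(struct Frame, local_value) == sizeof(int) * BUF_LEN);

    buggy_function((int *)&frame);   /* points at frame.buffer[0] */
    return frame.local_value;        /* 0 instead of the expected 200 */
}
```

If this is called from main and built with gcc -g -O0, a breakpoint on run_corruption_demo followed by watch frame.local_value and continue should eventually stop inside buggy_function. The first stop is likely to be the initialization to 200, so a second continue may be needed.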

3.2 Functional Requirements

  1. Compile with debug symbols and no optimizations.
  2. Identify a variable with incorrect value.
  3. Set a watchpoint before corruption occurs.
  4. Use the watchpoint stop to identify the culprit line.

3.3 Non-Functional Requirements

  • Reliability: The corruption should be reproducible.
  • Usability: Watchpoint should trigger quickly.
  • Performance: Avoid heavy loops that flood the watchpoint.

3.4 Example Usage / Output

(gdb) watch local_value
(gdb) continue
Hardware watchpoint 1: local_value
Old value = 200
New value = 0
buggy_function () at corrupt.c:8

3.5 Real World Outcome

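GDB stops right after the write that corrupts memory and reports the old and new values. Because the stop happens once the store has completed, the reported line is the corrupting statement or the one immediately after it: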

$ gcc -g -O0 -o corrupt corrupt.c
$ gdb ./corrupt
(gdb) break main
(gdb) run
(gdb) watch local_value
(gdb) continue
Hardware watchpoint 2: local_value
Old value = 200
New value = 0
buggy_function () at corrupt.c:8

4. Solution Architecture

4.1 High-Level Design

┌────────────┐     ┌──────────────┐     ┌──────────────┐
│ corrupt.c  │────▶│ watchpoint   │────▶│ culprit line │
└────────────┘     └──────────────┘     └──────────────┘

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| Target program | Create predictable corruption | Off-by-one pointer write |
| Watchpoint | Stop on modification | Use hardware watchpoint |
| Analysis | Explain corruption | Inspect addresses and stack |

4.3 Data Structures

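/* What you want to know once the watchpoint has fired. */
struct CorruptionEvidence {
    void *writer_ip;    /* program counter reported at the watchpoint stop */
    void *victim_addr;  /* address of the variable that was overwritten */
};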

4.4 Algorithm Overview

Key Algorithm: Watchpoint-driven search

  1. Identify the corrupted variable.
  2. Set a watchpoint on that variable.
  3. Continue until the watchpoint triggers.
  4. Read the reported location and backtrace to find the writer.

Complexity Analysis:

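  • Time: O(T), where T is the execution time up to the corrupting write. A hardware watchpoint adds almost no overhead; a software watchpoint single-steps the program, which multiplies T by a large constant.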
  • Space: O(1).

5. Implementation Guide

5.1 Development Environment Setup

gcc -g -O0 -o corrupt corrupt.c

5.2 Project Structure

project-root/
├── corrupt.c
└── corrupt

5.3 The Core Question You’re Answering

“Which exact line of code changed my variable to the wrong value?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Pointer arithmetic
    • How p + 1 computes addresses
  2. Stack layout
    • Why locals can be adjacent
  3. Watchpoint types
    • watch, rwatch, awatch
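
The adjacency idea can be checked in code before opening the debugger. The sketch below is illustrative only: struct Neighbors and both function names are hypothetical, and it uses struct members instead of separate locals because the compiler is free to order locals however it likes, while member order is fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a buffer followed directly by a victim variable. */
struct Neighbors {
    int buffer[4];
    int victim;
};

/* Returns 1 when the end of buffer is exactly where victim begins. */
int offsets_are_adjacent(void)
{
    size_t buffer_end = offsetof(struct Neighbors, buffer)
                      + sizeof(((struct Neighbors *)0)->buffer);
    return buffer_end == offsetof(struct Neighbors, victim);
}

/* Returns 1 when the one-past-the-end address of buffer equals &victim. */
int addresses_match(void)
{
    struct Neighbors n = { {0, 0, 0, 0}, 200 };
    uintptr_t one_past = (uintptr_t)(n.buffer + 4);  /* computed, never dereferenced */
    uintptr_t victim = (uintptr_t)&n.victim;
    return one_past == victim;
}
```

In a GDB session the equivalent check is to print both addresses and compare them by hand; a difference of exactly sizeof(buffer) bytes means the next element after the buffer is the victim.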

5.5 Questions to Guide Your Design

  1. When should the watchpoint be set so the variable exists in memory?
  2. What happens if the variable is in a register?
  3. How do you interpret the stop location?

5.6 Thinking Exercise

If the corruption happens in a different thread, how would you identify which thread wrote the value?

5.7 The Interview Questions They’ll Ask

  1. What is the difference between watch and rwatch?
  2. Why might a watchpoint not trigger?
  3. How do hardware watchpoints differ from software ones?

5.8 Hints in Layers

Hint 1: Set it early

  • break main, run, watch local_value

Hint 2: Inspect addresses

  • print &local_value, print &global_value

Hint 3: Disassemble

  • disassemble buggy_function

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Watchpoints | “The Art of Debugging with GDB” | Ch. 4 |
| Memory layout | CSAPP | Ch. 3 |
| Signals and traps | TLPI | Ch. 20 |

5.10 Implementation Phases

Phase 1: Foundation (30 minutes)

Goals:

  • Reproduce the corruption.

Tasks:

  1. Compile and run corrupt.c.
  2. Observe the incorrect value.

Checkpoint: You see local_value become 0.

Phase 2: Core Functionality (45 minutes)

Goals:

  • Use watchpoints to catch the writer.

Tasks:

  1. Set watch local_value.
  2. Continue until the watchpoint stops.

Checkpoint: You stop in buggy_function.

Phase 3: Polish & Edge Cases (30 minutes)

Goals:

  • Confirm root cause.

Tasks:

  1. Inspect addresses of variables.
  2. Disassemble and explain the write.

Checkpoint: You can describe the exact pointer mistake.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Watchpoint type | watch, rwatch, awatch | watch | Only writes matter here |
| Optimization | -O0, -O2 | -O0 | Avoid register-only locals |

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Corruption | Verify bug exists | After: global=100, local=0 |
| Watchpoint | Ensure stop | watch triggers in buggy_function |
| Analysis | Confirm cause | pointer math explains write |

6.2 Critical Test Cases

  1. Watchpoint triggers on the first bad write.
  2. Backtrace points to the corrupting function.
  3. Address math shows adjacent variables.

6.3 Test Data

local_value = 200 -> 0

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| Variable optimized out | Watchpoint fails | Use -O0 |
| Too many watchpoints | “Resource unavailable” | Delete unused watchpoints |
| Wrong variable | No trigger | Watch the actual corrupt victim |

7.2 Debugging Strategies

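  • Use info watchpoints to list active watchpoints and their hit counts, and delete to free hardware slots you no longer need.
  • Use watch -l (short for -location) to watch the memory address an expression refers to, so the watchpoint stays active after the variable goes out of scope.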

7.3 Performance Traps

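A watchpoint on a frequently written variable makes GDB take control on every change, and a software watchpoint single-steps the whole program, so either can slow execution significantly.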


8. Extensions & Challenges

8.1 Beginner Extensions

  • Use rwatch to catch unexpected reads.
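
A read-side bug to practice on could look like the following sketch. Everything in it is hypothetical (struct Account, ITEM_COUNT, sum_items, run_overread_demo); it mirrors the write corruption case but reads one element too far instead of writing, so the result silently includes a value it should never see.

```c
#include <assert.h>
#include <stddef.h>

#define ITEM_COUNT 3

/* Hypothetical layout: secret sits directly after the items array. */
struct Account {
    int items[ITEM_COUNT];
    int secret;             /* should never contribute to the sum */
};

/* BUG (intentional): `<=` reads one element past items. */
int sum_items(const int *p)
{
    int total = 0;
    for (int i = 0; i <= ITEM_COUNT; i++) {
        total += p[i];      /* when i == ITEM_COUNT this reads secret */
    }
    return total;
}

/* Returns the polluted sum so the stray read is observable. */
int run_overread_demo(void)
{
    struct Account account = { {1, 2, 3}, 1000 };

    /* The sketch relies on there being no padding between the members. */
    assert(offsetof(struct Account, secret) == sizeof(int) * ITEM_COUNT);

    return sum_items((const int *)&account);   /* 1006, not the intended 6 */
}
```

With the program stopped inside run_overread_demo, rwatch account.secret should stop in sum_items on the stray read. Read watchpoints need hardware support, so this only works where GDB can use a hardware watchpoint.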

8.2 Intermediate Extensions

  • Use conditional watchpoints on a value range.

8.3 Advanced Extensions

  • Track corruption across multiple threads.

9. Real-World Connections

9.1 Industry Applications

  • Heisenbugs: Watchpoints help with hard-to-reproduce memory issues.
  • Security: Detect unexpected writes to sensitive data.

9.2 Related Tools

  • Valgrind: An alternative memory debugging tool.
  • rr: Record-and-replay debugging for deterministic corruption analysis.

9.3 Interview Relevance

  • Shows knowledge of memory debugging and tools.

10. Resources

10.1 Essential Reading

  • GDB Manual - Watchpoints section.
  • CSAPP - Stack and memory layout.

10.2 Video Resources

  • Search: “gdb watchpoints memory corruption”.

10.3 Tools & Documentation

  • GDB: https://sourceware.org/gdb/

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how a watchpoint works.
  • I can explain pointer arithmetic in terms of bytes.

11.2 Implementation

  • Watchpoint stopped on the corrupting line.
  • I verified the address math.

11.3 Growth

  • I can apply this to a real corruption bug.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Watchpoint triggers at the corrupting line.

Full Completion:

  • Explain the memory layout that made the bug possible.

Excellence (Going Above & Beyond):

  • Build a second corruption case and debug it with rwatch.

This guide was generated from LEARN_GDB_DEEP_DIVE.md. For the complete learning path, see the parent directory README.