Project 14: Memory Debugger (Mini-Valgrind)

Build a library that interposes on malloc/free to detect memory leaks, use-after-free, double-free, and buffer overflows at runtime.


Quick Reference

Attribute Value
Language C (with inline assembly for some features)
Difficulty Level 5 (Wizard)
Time 3 Weeks
Key Concepts Memory management, interposition, debugging
Prerequisites P01-P13, deep understanding of malloc
Portfolio Value Exceptional - demonstrates systems mastery

Learning Objectives

By completing this project, you will:

  1. Master malloc/free interposition: Understand how to intercept and wrap memory allocation functions at runtime and link time

  2. Implement allocation tracking: Build data structures to track every allocation, including call site, size, and status

  3. Detect memory errors: Identify leaks (allocated but never freed), use-after-free (accessing freed memory), and double-free (freeing same address twice)

  4. Implement canary values: Use sentinel bytes around allocations to detect buffer underflows and overflows

  5. Understand shadow memory: Learn the concept of shadow memory used by tools like AddressSanitizer

  6. Create useful diagnostics: Generate reports showing where allocations occurred and where errors were detected

  7. Handle real-world complexity: Deal with thread safety, signal handlers, and libraries that allocate memory


Theoretical Foundation

Why Memory Debugging Matters

Memory errors are among the most dangerous bugs in C/C++:

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    MEMORY ERROR CONSEQUENCES                               │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  Memory Leaks:                                                             │
    │  ─────────────                                                             │
    │  • Long-running programs slowly consume all RAM                            │
    │  • System becomes unresponsive                                             │
    │  • OOM killer terminates processes                                         │
    │                                                                            │
    │  Use-After-Free:                                                           │
    │  ───────────────                                                           │
    │  • Dangling pointer dereference                                            │
    │  • Exploitable for code execution                                          │
    │  • CVE-2021-21224 (Chrome), CVE-2022-0847 (Dirty Pipe)                     │
    │                                                                            │
    │  Double-Free:                                                              │
    │  ────────────                                                              │
    │  • Heap corruption                                                         │
    │  • Arbitrary write primitive for exploitation                              │
    │  • Often leads to RCE                                                      │
    │                                                                            │
    │  Buffer Overflow (Heap):                                                   │
    │  ───────────────────────                                                   │
    │  • Corrupt heap metadata                                                   │
    │  • Corrupt adjacent allocations                                            │
    │  • Heartbleed (2014) was a heap buffer over-read                          │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

How Valgrind Works

Valgrind uses dynamic binary instrumentation:

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    VALGRIND ARCHITECTURE                                   │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  Your Program                           Valgrind                           │
    │  ┌────────────────┐                   ┌──────────────────────────────┐     │
    │  │  main()        │                   │  Synthetic CPU               │     │
    │  │    malloc(64)  │ ──────────────▶   │  ┌────────────────────────┐  │     │
    │  │    ...         │   Binary          │  │ Instruction decoder    │  │     │
    │  │    free(ptr)   │   instrumented    │  │ ┌──────────────────┐   │  │     │
    │  └────────────────┘   at runtime      │  │ │ Shadow memory    │   │  │     │
    │                                       │  │ │ Every byte of    │   │  │     │
    │                                       │  │ │ your program's   │   │  │     │
    │                                       │  │ │ memory has       │   │  │     │
    │                                       │  │ │ metadata shadow  │   │  │     │
    │                                       │  │ └──────────────────┘   │  │     │
    │                                       │  └────────────────────────┘  │     │
    │                                       └──────────────────────────────┘     │
    │                                                     │                      │
    │                                                     ▼                      │
    │                                       ┌──────────────────────────────┐     │
    │                                       │ Native CPU (10-50x slower)   │     │
    │                                       └──────────────────────────────┘     │
    │                                                                            │
    │  Every memory access is checked against shadow memory                      │
    │  Every allocation/free is tracked                                          │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Our Simpler Approach: Interposition

We will use a simpler but still effective approach - malloc interposition:

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    MALLOC INTERPOSITION                                    │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  LINK-TIME INTERPOSITION (--wrap):                                         │
    │  ────────────────────────────────                                          │
    │                                                                            │
    │  gcc -Wl,--wrap=malloc -Wl,--wrap=free program.c our_malloc.c              │
    │                                                                            │
    │  The linker redirects:                                                     │
    │  • Calls to "malloc" → "__wrap_malloc" (your wrapper)                      │
    │  • Your wrapper can call "__real_malloc" (the real malloc)                 │
    │                                                                            │
    │  void *__wrap_malloc(size_t size) {                                        │
    │      void *ptr = __real_malloc(size + overhead);                           │
    │      track_allocation(ptr, size, __builtin_return_address(0));             │
    │      return ptr;                                                           │
    │  }                                                                         │
    │                                                                            │
    │  RUNTIME INTERPOSITION (LD_PRELOAD):                                       │
    │  ─────────────────────────────────                                         │
    │                                                                            │
    │  LD_PRELOAD=./libmemdbg.so ./program                                       │
    │                                                                            │
    │  Our shared library is loaded first, so our malloc/free                    │
    │  symbols take precedence. Use dlsym(RTLD_NEXT, "malloc")                   │
    │  to get the real functions.                                                │
    │                                                                            │
    │  static void *(*real_malloc)(size_t) = NULL;                               │
    │  void *malloc(size_t size) {                                               │
    │      if (!real_malloc)                                                     │
    │          real_malloc = dlsym(RTLD_NEXT, "malloc");                         │
    │      void *ptr = real_malloc(size + overhead);                             │
    │      track_allocation(ptr, size);                                          │
    │      return ptr;                                                           │
    │  }                                                                         │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Allocation Tracking Data Structure

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    ALLOCATION RECORD                                       │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  For each allocation, we track:                                            │
    │                                                                            │
    │  typedef struct AllocationRecord {                                         │
    │      void *user_ptr;        // Pointer returned to user                    │
    │      void *internal_ptr;    // Pointer from real_malloc                    │
    │      size_t size;           // User-requested size                         │
    │      void *call_site;       // Return address of malloc call               │
    │      char call_site_info[128]; // Function name, file, line               │
    │      time_t alloc_time;     // When allocated                              │
    │      enum { ALLOCATED, FREED } status;                                     │
    │      void *free_site;       // Where it was freed (if freed)               │
    │      struct AllocationRecord *next;  // Hash chain                         │
    │  } AllocationRecord;                                                       │
    │                                                                            │
    │  Storage options:                                                          │
    │  ─────────────────                                                         │
    │  1. Hash table by address (fast lookup, used for free/access)              │
    │  2. Linked list of all records (for leak reporting)                        │
    │  3. Keep freed records for use-after-free detection                        │
    │                                                                            │
    │  MEMORY LAYOUT WITH CANARIES:                                              │
    │  ─────────────────────────────                                             │
    │                                                                            │
    │  real_malloc allocates:                                                    │
    │                                                                            │
    │  ┌──────────────┬────────────────────────────────────┬──────────────┐      │
    │  │ Front Canary │           User Data                │ Rear Canary  │      │
    │  │   8 bytes    │          (size bytes)              │   8 bytes    │      │
    │  │ 0xDEADBEEF.. │                                    │ 0xCAFEBABE.. │      │
    │  └──────────────┴────────────────────────────────────┴──────────────┘      │
    │  ↑                ↑                                                        │
    │  internal_ptr     user_ptr (what we return)                                │
    │                                                                            │
    │  On free: Check canaries are intact (no overflow/underflow)                │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Detecting Use-After-Free

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    USE-AFTER-FREE DETECTION                                │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  APPROACH 1: Quarantine (like AddressSanitizer)                            │
    │  ────────────────────────────────────────────────                          │
    │                                                                            │
    │  Don't actually free memory immediately. Keep it in a quarantine:          │
    │                                                                            │
    │  void free(void *ptr) {                                                    │
    │      mark_freed(ptr);           // Record it's freed                       │
    │      fill_with_poison(ptr);     // Fill with 0xDD                          │
    │      add_to_quarantine(ptr);    // Hold in quarantine                      │
    │                                                                            │
    │      if (quarantine_size > MAX_QUARANTINE) {                               │
    │          void *old = pop_oldest_from_quarantine();                         │
    │          real_free(old);        // Actually free oldest                    │
    │      }                                                                     │
    │  }                                                                         │
    │                                                                            │
    │  If program accesses 0xDDDDDDDD pattern, we know it's use-after-free.      │
    │  Downside: Higher memory usage.                                            │
    │                                                                            │
    │  APPROACH 2: Shadow Memory (simplified)                                    │
    │  ─────────────────────────────────────                                     │
    │                                                                            │
    │  Map every 8 bytes of application memory to 1 byte of shadow:              │
    │                                                                            │
    │  Application:  [████████][████████][████████]                              │
    │                    ↓          ↓          ↓                                 │
    │  Shadow:       [status1 ][status2 ][status3 ]                              │
    │                                                                            │
    │  Status byte meanings:                                                     │
    │  0x00 = Fully addressable                                                  │
    │  0xFA = Freed (heap)                                                       │
    │  0xFE = Not allocated                                                      │
    │  1-7  = Only first N bytes addressable (partial)                           │
    │                                                                            │
    │  This requires instrumenting every memory access (like ASan does).         │
    │  We'll use the simpler quarantine approach.                                │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Call Site Tracking

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    CALL SITE INFORMATION                                   │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  We want to report WHERE allocations/frees happened:                       │
    │                                                                            │
    │  "Memory leak: 64 bytes allocated at main.c:42 (in function parse_config)" │
    │                                                                            │
    │  METHOD 1: __builtin_return_address                                        │
    │  ─────────────────────────────────                                         │
    │                                                                            │
    │  void *__wrap_malloc(size_t size) {                                        │
    │      void *call_site = __builtin_return_address(0);                        │
    │      // call_site is the address in the caller                             │
    │      track(call_site, ...);                                                │
    │  }                                                                         │
    │                                                                            │
    │  Later, use addr2line or dladdr to resolve:                                │
    │                                                                            │
    │  Dl_info info;                                                             │
    │  if (dladdr(call_site, &info)) {                                           │
    │      printf("Called from %s in %s\n",                                      │
    │             info.dli_sname,     // Symbol name                             │
    │             info.dli_fname);    // File name                               │
    │  }                                                                         │
    │                                                                            │
    │  METHOD 2: Full Stack Trace (libunwind or backtrace)                       │
    │  ────────────────────────────────────────────────────                      │
    │                                                                            │
    │  #include <execinfo.h>                                                     │
    │                                                                            │
    │  void *stack[16];                                                          │
    │  int depth = backtrace(stack, 16);                                         │
    │  char **symbols = backtrace_symbols(stack, depth);                         │
    │                                                                            │
    │  for (int i = 0; i < depth; i++) {                                         │
    │      printf("  %s\n", symbols[i]);                                         │
    │  }                                                                         │
    │  free(symbols);                                                            │
    │                                                                            │
    │  Note: backtrace_symbols calls malloc! Must handle re-entrancy.            │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Handling Re-entrancy

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                    RE-ENTRANCY PROBLEM                                     │
    ├────────────────────────────────────────────────────────────────────────────┤
    │                                                                            │
    │  Problem: Our malloc wrapper needs to track allocations, which might       │
    │  require memory allocation itself!                                         │
    │                                                                            │
    │  void *malloc(size_t size) {                                               │
    │      AllocationRecord *rec = real_malloc(sizeof(*rec));  // Oops, malloc!  │
    │      ... causes infinite recursion                                         │
    │  }                                                                         │
    │                                                                            │
    │  SOLUTION 1: Thread-local recursion guard                                  │
    │  ─────────────────────────────────────────                                 │
    │                                                                            │
    │  static __thread int in_allocator = 0;                                     │
    │                                                                            │
    │  void *malloc(size_t size) {                                               │
    │      if (in_allocator) {                                                   │
    │          return real_malloc(size);  // Bypass tracking                     │
    │      }                                                                     │
    │      in_allocator = 1;                                                     │
    │      // ... do our tracking (may call malloc internally)                   │
    │      in_allocator = 0;                                                     │
    │      return ptr;                                                           │
    │  }                                                                         │
    │                                                                            │
    │  SOLUTION 2: Pre-allocated pool for tracking data                          │
    │  ────────────────────────────────────────────────                          │
    │                                                                            │
    │  At startup, allocate a large pool of AllocationRecord structures.         │
    │  Use that pool instead of malloc for internal allocations.                 │
    │                                                                            │
    │  SOLUTION 3: mmap for internal memory                                      │
    │  ────────────────────────────────────                                      │
    │                                                                            │
    │  Use mmap() directly instead of malloc() for internal needs.               │
    │  mmap doesn't go through our wrapper.                                      │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘

Project Specification

What You Will Build

A shared library (libmemdbg.so) that can be preloaded to detect memory errors:

$ LD_PRELOAD=./libmemdbg.so ./my_program
... program runs ...

=== MEMORY DEBUG REPORT ===
Total allocations: 1,234
Total bytes allocated: 567,890
Total frees: 1,230
Current allocations: 4

MEMORY LEAKS (4 allocations, 256 bytes):
  Leak 1: 64 bytes at 0x7f1234567890
    Allocated at: parse_config (config.c:142)
    Stack:
      main (main.c:28)
      load_config (config.c:58)
      parse_config (config.c:142)

  Leak 2: 128 bytes at 0x7f1234568000
    ...

ERRORS DETECTED:
  Double-free at: cleanup (main.c:45)
    Memory was already freed at: process_data (data.c:89)
    Originally allocated at: read_input (io.c:23)

Functional Requirements

Memory Error Detection:

// Errors we detect:
typedef enum {
    MEMERR_NONE = 0,
    MEMERR_LEAK,              // Allocated but never freed
    MEMERR_DOUBLE_FREE,       // free() called twice on same pointer
    MEMERR_INVALID_FREE,      // free() called on non-heap pointer
    MEMERR_USE_AFTER_FREE,    // Access after free (if detectable)
    MEMERR_BUFFER_OVERFLOW,   // Write past allocation end (canary check)
    MEMERR_BUFFER_UNDERFLOW,  // Write before allocation start (canary)
} MemoryError;

API (for programs that want to link directly):

// Initialize/shutdown (optional, auto-detected)
void memdbg_init(void);
void memdbg_shutdown(void);

// Manual leak check point
void memdbg_check_leaks(void);

// Mark expected allocations (suppress leak warnings)
void memdbg_expect_leak(void *ptr);

// Get current statistics
typedef struct {
    size_t total_allocs;
    size_t total_frees;
    size_t current_allocs;
    size_t current_bytes;
    size_t peak_bytes;
    size_t error_count;
} MemoryStats;
void memdbg_get_stats(MemoryStats *stats);

// Control verbosity
void memdbg_set_verbose(int level);

// Annotate allocations (for better reports)
void *memdbg_malloc_annotated(size_t size, const char *file, int line);
#define MALLOC(size) memdbg_malloc_annotated(size, __FILE__, __LINE__)

Non-Functional Requirements

  1. Low overhead: < 2x slowdown for normal programs
  2. Thread safety: Works with multi-threaded programs
  3. No false positives: Only report real errors
  4. Useful diagnostics: Reports include file, line, function where possible
  5. Works with real programs: Handle complex scenarios (dlopen, fork, etc.)
  6. Portable: Works on Linux and macOS (with conditional code)

Example Usage and Output

Program with bugs (leaky.c):

#include <stdlib.h>
#include <string.h>

void process(void) {
    char *buf = malloc(64);
    strcpy(buf, "Hello");
    // Forgot to free!
}

int main(void) {
    char *p1 = malloc(100);
    char *p2 = malloc(200);

    process();  // Leaks 64 bytes

    free(p1);
    free(p1);   // Double free!

    free(p2);
    return 0;
}

Running with your library:

$ gcc -g -o leaky leaky.c
$ LD_PRELOAD=./libmemdbg.so ./leaky

*** MEMORY ERROR: Double-free detected ***
  Pointer: 0x5555555592a0
  Second free called at:
    main (leaky.c:17)
  First free was at:
    main (leaky.c:16)
  Originally allocated at:
    main (leaky.c:11)

... program exits ...

========================================
        MEMORY DEBUGGING REPORT
========================================

Statistics:
  Total allocations:     4
  Total frees:          3 (1 double-free)
  Bytes allocated:      364
  Peak memory:          364

Memory Leaks Detected: 1

Leak #1: 64 bytes
  Address: 0x555555559310
  Allocated at:
    process (leaky.c:6)
  Allocation stack trace:
    main (leaky.c:13)
    process (leaky.c:6)

========================================

Solution Architecture

High-Level Design

    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                        MEMORY DEBUGGER ARCHITECTURE                         │
    ├─────────────────────────────────────────────────────────────────────────────┤
    │                                                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │                     INTERPOSITION LAYER                             │    │
    │  │                                                                     │    │
    │  │  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐        │    │
    │  │  │  malloc   │  │  calloc   │  │  realloc  │  │   free    │        │    │
    │  │  │  wrapper  │  │  wrapper  │  │  wrapper  │  │  wrapper  │        │    │
    │  │  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘        │    │
    │  │        │              │              │              │              │    │
    │  └────────┼──────────────┼──────────────┼──────────────┼──────────────┘    │
    │           │              │              │              │                    │
    │           ▼              ▼              ▼              ▼                    │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │                     TRACKING LAYER                                  │    │
    │  │                                                                     │    │
    │  │  ┌───────────────────────────────────────────────────────────────┐  │    │
    │  │  │                   Hash Table                                  │  │    │
    │  │  │  [ptr] -> AllocationRecord { size, callsite, status, ... }    │  │    │
    │  │  │                                                               │  │    │
    │  │  │  [0x100] -> { 64 bytes, main.c:42, ALLOCATED }                │  │    │
    │  │  │  [0x200] -> { 128 bytes, util.c:15, FREED }                   │  │    │
    │  │  └───────────────────────────────────────────────────────────────┘  │    │
    │  │                                                                     │    │
    │  │  ┌───────────────────┐  ┌───────────────────────────────────────┐   │    │
    │  │  │    Quarantine     │  │         Statistics                    │   │    │
    │  │  │  (freed blocks)   │  │  allocs, frees, bytes, errors         │   │    │
    │  │  └───────────────────┘  └───────────────────────────────────────┘   │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                               │                                             │
    │                               ▼                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │                     CANARY LAYER                                    │    │
    │  │                                                                     │    │
    │  │  ┌────────────┬─────────────────────────────┬────────────┐          │    │
    │  │  │  CANARY_L  │        User Data            │  CANARY_R  │          │    │
    │  │  │ 0xDEADBEEF │                             │ 0xCAFEBABE │          │    │
    │  │  └────────────┴─────────────────────────────┴────────────┘          │    │
    │  │                                                                     │    │
    │  │  On free: verify canaries match expected values                     │    │
    │  │  If not: BUFFER OVERFLOW/UNDERFLOW detected                         │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                               │                                             │
    │                               ▼                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │                     REPORTING LAYER                                 │    │
    │  │                                                                     │    │
    │  │  • Real-time error reporting (to stderr)                            │    │
    │  │  • atexit() summary report                                          │    │
    │  │  • Optional file logging                                            │    │
    │  │  • Stack trace resolution (addr2line, dladdr)                       │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘

Module Structure

memory-debugger/
├── include/
│   └── memdbg.h           # Public API header
├── src/
│   ├── memdbg.c           # Main interposition and entry points
│   ├── tracking.c         # Allocation tracking hash table
│   ├── tracking.h
│   ├── canary.c           # Canary implementation
│   ├── canary.h
│   ├── stacktrace.c       # Call site and stack trace capture
│   ├── stacktrace.h
│   ├── report.c           # Error and summary reporting
│   ├── report.h
│   ├── quarantine.c       # Freed memory quarantine
│   ├── quarantine.h
│   └── internal.h         # Internal utilities
├── tests/
│   ├── test_basic.c       # Basic malloc/free tracking
│   ├── test_leaks.c       # Leak detection
│   ├── test_double_free.c # Double-free detection
│   ├── test_overflow.c    # Buffer overflow detection
│   ├── test_threads.c     # Multi-threaded programs
│   └── run_tests.sh
├── examples/
│   ├── leaky_program.c
│   ├── double_free.c
│   └── buffer_overflow.c
├── Makefile
└── README.md

Key Data Structures

/* Canary values */
#define CANARY_LEFT  0xDEADBEEFCAFED00D
#define CANARY_RIGHT 0xCAFEBABE8BADF00D
#define CANARY_SIZE  8
#define POISON_BYTE  0xDD

/* Allocation record */
typedef struct AllocationRecord {
    void *user_ptr;              /* What we returned to user */
    void *internal_ptr;          /* What real_malloc returned */
    size_t requested_size;       /* User-requested size */
    size_t actual_size;          /* With canaries and alignment */

    void *alloc_site;            /* Return address of malloc */
    void *alloc_stack[16];       /* Full stack trace */
    int   alloc_stack_depth;

    void *free_site;             /* Where free was called */
    void *free_stack[16];
    int   free_stack_depth;

    time_t alloc_time;
    time_t free_time;

    enum { REC_ALLOCATED, REC_FREED } status;

    struct AllocationRecord *hash_next;  /* Hash chain */
    struct AllocationRecord *list_next;  /* All records list */
} AllocationRecord;

/* Hash table for fast lookup */
#define HASH_BUCKETS 4096
static AllocationRecord *hash_table[HASH_BUCKETS];
static AllocationRecord *all_records;  /* Linked list head */

/* Statistics */
typedef struct {
    size_t total_allocs;
    size_t total_frees;
    size_t current_allocs;
    size_t current_bytes;
    size_t peak_bytes;
    size_t double_frees;
    size_t invalid_frees;
    size_t overflows_detected;
} Stats;
static Stats global_stats;

/* Thread safety */
static pthread_mutex_t tracking_lock = PTHREAD_MUTEX_INITIALIZER;

Implementation Guide

Phase 1: Basic Interposition (Days 1-4)

Goals:

  • Get LD_PRELOAD working
  • Intercept malloc/free
  • Pass through to real functions

Tasks:

  1. Create the shared library skeleton:
/* memdbg.c */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Function pointers to real implementations */
static void *(*real_malloc)(size_t) = NULL;
static void (*real_free)(void *) = NULL;
static void *(*real_calloc)(size_t, size_t) = NULL;
static void *(*real_realloc)(void *, size_t) = NULL;

/* Thread-local recursion guard */
static __thread int in_hook = 0;

/* Initialization */
static void init_real_functions(void) {
    real_malloc = dlsym(RTLD_NEXT, "malloc");
    real_free = dlsym(RTLD_NEXT, "free");
    real_calloc = dlsym(RTLD_NEXT, "calloc");
    real_realloc = dlsym(RTLD_NEXT, "realloc");

    if (!real_malloc || !real_free) {
        fprintf(stderr, "memdbg: Failed to find real malloc/free\n");
        exit(1);
    }
}

/* Called when library is loaded */
__attribute__((constructor))
static void memdbg_init(void) {
    init_real_functions();
    fprintf(stderr, "[memdbg] Memory debugger initialized\n");
}

/* Called when library is unloaded */
__attribute__((destructor))
static void memdbg_fini(void) {
    fprintf(stderr, "[memdbg] Memory debugger shutting down\n");
    /* Print report here */
}
  1. Implement malloc wrapper:
void *malloc(size_t size) {
    /* Handle early calls before init */
    if (!real_malloc) {
        init_real_functions();
    }

    /* Recursion guard */
    if (in_hook) {
        return real_malloc(size);
    }
    in_hook = 1;

    void *ptr = real_malloc(size);

    fprintf(stderr, "[memdbg] malloc(%zu) = %p\n", size, ptr);

    in_hook = 0;
    return ptr;
}

void free(void *ptr) {
    if (!real_free) {
        init_real_functions();
    }

    if (in_hook) {
        real_free(ptr);
        return;
    }
    in_hook = 1;

    fprintf(stderr, "[memdbg] free(%p)\n", ptr);
    real_free(ptr);

    in_hook = 0;
}
  1. Build and test:
# Makefile
CC = gcc
CFLAGS = -Wall -Wextra -g -fPIC
LDFLAGS = -shared -ldl

libmemdbg.so: src/memdbg.c
	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^

test: libmemdbg.so
	LD_PRELOAD=./libmemdbg.so ls

Checkpoint: LD_PRELOAD works, malloc/free calls are logged.

Phase 2: Allocation Tracking (Days 5-9)

Goals:

  • Track all allocations in hash table
  • Detect double-free and invalid-free
  • Add canary protection

Tasks:

  1. Implement hash table:
/* tracking.c */
#include <stdint.h>
#include <string.h>

static inline size_t hash_ptr(void *ptr) {
    uintptr_t addr = (uintptr_t)ptr;
    return (addr >> 4) % HASH_BUCKETS;
}

AllocationRecord *find_record(void *user_ptr) {
    size_t bucket = hash_ptr(user_ptr);
    AllocationRecord *rec = hash_table[bucket];

    while (rec) {
        if (rec->user_ptr == user_ptr) {
            return rec;
        }
        rec = rec->hash_next;
    }
    return NULL;
}

AllocationRecord *create_record(void *user_ptr, void *internal_ptr,
                                 size_t size, void *call_site) {
    /* Use mmap to avoid malloc recursion */
    AllocationRecord *rec = mmap(NULL, sizeof(*rec),
                                  PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS,
                                  -1, 0);
    if (rec == MAP_FAILED) return NULL;

    memset(rec, 0, sizeof(*rec));
    rec->user_ptr = user_ptr;
    rec->internal_ptr = internal_ptr;
    rec->requested_size = size;
    rec->alloc_site = call_site;
    rec->status = REC_ALLOCATED;
    rec->alloc_time = time(NULL);

    /* Insert into hash table */
    size_t bucket = hash_ptr(user_ptr);
    rec->hash_next = hash_table[bucket];
    hash_table[bucket] = rec;

    /* Insert into all-records list */
    rec->list_next = all_records;
    all_records = rec;

    return rec;
}
  1. Implement canary protection:
/* canary.c */

/* Layout: [CANARY_L][user data][CANARY_R] */

void *wrap_with_canaries(void *internal_ptr, size_t user_size) {
    char *p = internal_ptr;

    /* Write left canary */
    memcpy(p, &CANARY_LEFT, CANARY_SIZE);

    /* User data starts after left canary */
    void *user_ptr = p + CANARY_SIZE;

    /* Write right canary after user data */
    memcpy(p + CANARY_SIZE + user_size, &CANARY_RIGHT, CANARY_SIZE);

    return user_ptr;
}

bool check_canaries(void *user_ptr, size_t user_size, bool *left_ok, bool *right_ok) {
    char *p = (char *)user_ptr - CANARY_SIZE;

    uint64_t left, right;
    memcpy(&left, p, CANARY_SIZE);
    memcpy(&right, (char *)user_ptr + user_size, CANARY_SIZE);

    *left_ok = (left == CANARY_LEFT);
    *right_ok = (right == CANARY_RIGHT);

    return *left_ok && *right_ok;
}

size_t total_allocation_size(size_t user_size) {
    return CANARY_SIZE + user_size + CANARY_SIZE;
}
  1. Update malloc/free with tracking:
void *malloc(size_t size) {
    if (!real_malloc) init_real_functions();
    if (in_hook) return real_malloc(size);
    in_hook = 1;

    pthread_mutex_lock(&tracking_lock);

    /* Allocate with room for canaries */
    size_t total = total_allocation_size(size);
    void *internal = real_malloc(total);

    if (!internal) {
        pthread_mutex_unlock(&tracking_lock);
        in_hook = 0;
        return NULL;
    }

    void *user = wrap_with_canaries(internal, size);
    void *call_site = __builtin_return_address(0);

    AllocationRecord *rec = create_record(user, internal, size, call_site);
    if (rec) {
        capture_stack_trace(rec->alloc_stack, &rec->alloc_stack_depth);
        global_stats.total_allocs++;
        global_stats.current_allocs++;
        global_stats.current_bytes += size;
        if (global_stats.current_bytes > global_stats.peak_bytes) {
            global_stats.peak_bytes = global_stats.current_bytes;
        }
    }

    pthread_mutex_unlock(&tracking_lock);
    in_hook = 0;
    return user;
}

void free(void *ptr) {
    if (!real_free) init_real_functions();
    if (ptr == NULL) return;
    if (in_hook) { real_free(ptr); return; }
    in_hook = 1;

    pthread_mutex_lock(&tracking_lock);

    AllocationRecord *rec = find_record(ptr);

    if (!rec) {
        report_error(MEMERR_INVALID_FREE, ptr, NULL);
        pthread_mutex_unlock(&tracking_lock);
        in_hook = 0;
        return;
    }

    if (rec->status == REC_FREED) {
        report_error(MEMERR_DOUBLE_FREE, ptr, rec);
        pthread_mutex_unlock(&tracking_lock);
        in_hook = 0;
        return;
    }

    /* Check canaries */
    bool left_ok, right_ok;
    if (!check_canaries(ptr, rec->requested_size, &left_ok, &right_ok)) {
        if (!left_ok) report_error(MEMERR_BUFFER_UNDERFLOW, ptr, rec);
        if (!right_ok) report_error(MEMERR_BUFFER_OVERFLOW, ptr, rec);
    }

    /* Mark as freed */
    rec->status = REC_FREED;
    rec->free_site = __builtin_return_address(0);
    capture_stack_trace(rec->free_stack, &rec->free_stack_depth);
    rec->free_time = time(NULL);

    global_stats.total_frees++;
    global_stats.current_allocs--;
    global_stats.current_bytes -= rec->requested_size;

    /* Fill with poison and add to quarantine */
    memset(ptr, POISON_BYTE, rec->requested_size);
    add_to_quarantine(rec);

    pthread_mutex_unlock(&tracking_lock);
    in_hook = 0;
}

Checkpoint: Double-free and overflow detection works.

Phase 3: Stack Traces and Reporting (Days 10-14)

Goals:

  • Capture meaningful stack traces
  • Resolve addresses to symbols
  • Generate useful reports

Tasks:

  1. Stack trace capture:
/* stacktrace.c */
#include <execinfo.h>
#include <dlfcn.h>

void capture_stack_trace(void **stack, int *depth) {
    *depth = backtrace(stack, 16);
}

void format_stack_trace(FILE *out, void **stack, int depth) {
    for (int i = 0; i < depth; i++) {
        Dl_info info;
        if (dladdr(stack[i], &info) && info.dli_sname) {
            fprintf(out, "    #%d %s + 0x%lx (%s)\n",
                    i,
                    info.dli_sname,
                    (unsigned long)((char *)stack[i] - (char *)info.dli_saddr),
                    info.dli_fname);
        } else {
            fprintf(out, "    #%d %p\n", i, stack[i]);
        }
    }
}
  1. Error reporting:
/* report.c */

static FILE *report_file = NULL;

void init_reporting(void) {
    const char *logfile = getenv("MEMDBG_LOG");
    if (logfile) {
        report_file = fopen(logfile, "w");
    }
    if (!report_file) {
        report_file = stderr;
    }
}

void report_error(MemoryError type, void *ptr, AllocationRecord *rec) {
    fprintf(report_file, "\n*** MEMORY ERROR: ");

    switch (type) {
        case MEMERR_DOUBLE_FREE:
            fprintf(report_file, "Double-free detected ***\n");
            fprintf(report_file, "  Pointer: %p\n", ptr);
            fprintf(report_file, "  Second free called at:\n");
            format_stack_trace(report_file,
                (void *[]){__builtin_return_address(0)}, 1);
            if (rec) {
                fprintf(report_file, "  First free was at:\n");
                format_stack_trace(report_file, rec->free_stack, rec->free_stack_depth);
                fprintf(report_file, "  Originally allocated at:\n");
                format_stack_trace(report_file, rec->alloc_stack, rec->alloc_stack_depth);
            }
            global_stats.double_frees++;
            break;

        case MEMERR_INVALID_FREE:
            fprintf(report_file, "Invalid free (not allocated) ***\n");
            fprintf(report_file, "  Pointer: %p\n", ptr);
            global_stats.invalid_frees++;
            break;

        case MEMERR_BUFFER_OVERFLOW:
            fprintf(report_file, "Buffer overflow detected ***\n");
            fprintf(report_file, "  Pointer: %p, Size: %zu\n",
                    ptr, rec ? rec->requested_size : 0);
            if (rec) {
                fprintf(report_file, "  Allocated at:\n");
                format_stack_trace(report_file, rec->alloc_stack, rec->alloc_stack_depth);
            }
            global_stats.overflows_detected++;
            break;

        default:
            fprintf(report_file, "Unknown error ***\n");
    }
    fprintf(report_file, "\n");
    fflush(report_file);
}
  1. Leak summary at exit:
void print_leak_summary(void) {
    fprintf(report_file, "\n========================================\n");
    fprintf(report_file, "        MEMORY DEBUGGING REPORT\n");
    fprintf(report_file, "========================================\n\n");

    fprintf(report_file, "Statistics:\n");
    fprintf(report_file, "  Total allocations:     %zu\n", global_stats.total_allocs);
    fprintf(report_file, "  Total frees:          %zu\n", global_stats.total_frees);
    fprintf(report_file, "  Peak memory:          %zu bytes\n", global_stats.peak_bytes);
    fprintf(report_file, "  Double-frees:         %zu\n", global_stats.double_frees);
    fprintf(report_file, "  Overflows:            %zu\n", global_stats.overflows_detected);

    /* Count leaks */
    int leak_count = 0;
    size_t leak_bytes = 0;
    AllocationRecord *rec = all_records;
    while (rec) {
        if (rec->status == REC_ALLOCATED) {
            leak_count++;
            leak_bytes += rec->requested_size;
        }
        rec = rec->list_next;
    }

    if (leak_count > 0) {
        fprintf(report_file, "\nMemory Leaks Detected: %d (%zu bytes)\n\n",
                leak_count, leak_bytes);

        rec = all_records;
        int n = 1;
        while (rec) {
            if (rec->status == REC_ALLOCATED) {
                fprintf(report_file, "Leak #%d: %zu bytes at %p\n",
                        n++, rec->requested_size, rec->user_ptr);
                fprintf(report_file, "  Allocated at:\n");
                format_stack_trace(report_file, rec->alloc_stack, rec->alloc_stack_depth);
                fprintf(report_file, "\n");
            }
            rec = rec->list_next;
        }
    } else {
        fprintf(report_file, "\nNo memory leaks detected.\n");
    }

    fprintf(report_file, "========================================\n");
}

Checkpoint: Full reports with stack traces work.

Phase 4: Quarantine and Hardening (Days 15-18)

Goals:

  • Implement quarantine for use-after-free detection
  • Handle realloc and calloc correctly
  • Thread safety testing

Tasks:

  1. Quarantine implementation:
/* quarantine.c */
#define QUARANTINE_SIZE (1024 * 1024)  /* 1MB quarantine */

static AllocationRecord *quarantine_head = NULL;
static AllocationRecord *quarantine_tail = NULL;
static size_t quarantine_bytes = 0;

void add_to_quarantine(AllocationRecord *rec) {
    /* Add to tail */
    rec->list_next = NULL;
    if (quarantine_tail) {
        quarantine_tail->list_next = rec;
    } else {
        quarantine_head = rec;
    }
    quarantine_tail = rec;
    quarantine_bytes += rec->requested_size;

    /* Drain oldest if over limit */
    while (quarantine_bytes > QUARANTINE_SIZE && quarantine_head) {
        AllocationRecord *old = quarantine_head;
        quarantine_head = old->list_next;
        if (!quarantine_head) quarantine_tail = NULL;
        quarantine_bytes -= old->requested_size;

        /* Actually free now */
        real_free(old->internal_ptr);
    }
}
  1. Implement realloc:
void *realloc(void *ptr, size_t size) {
    if (!real_realloc) init_real_functions();
    if (ptr == NULL) return malloc(size);
    if (size == 0) { free(ptr); return NULL; }
    if (in_hook) return real_realloc(ptr, size);
    in_hook = 1;

    pthread_mutex_lock(&tracking_lock);

    AllocationRecord *rec = find_record(ptr);
    if (!rec || rec->status == REC_FREED) {
        report_error(rec ? MEMERR_DOUBLE_FREE : MEMERR_INVALID_FREE, ptr, rec);
        pthread_mutex_unlock(&tracking_lock);
        in_hook = 0;
        return NULL;
    }

    /* Check canaries before realloc */
    bool left_ok, right_ok;
    check_canaries(ptr, rec->requested_size, &left_ok, &right_ok);
    if (!left_ok) report_error(MEMERR_BUFFER_UNDERFLOW, ptr, rec);
    if (!right_ok) report_error(MEMERR_BUFFER_OVERFLOW, ptr, rec);

    /* Allocate new block with canaries */
    size_t new_total = total_allocation_size(size);
    void *new_internal = real_malloc(new_total);
    if (!new_internal) {
        pthread_mutex_unlock(&tracking_lock);
        in_hook = 0;
        return NULL;
    }

    void *new_user = wrap_with_canaries(new_internal, size);

    /* Copy old data */
    size_t copy_size = rec->requested_size < size ? rec->requested_size : size;
    memcpy(new_user, ptr, copy_size);

    /* Update tracking */
    remove_from_hash(rec);
    rec->user_ptr = new_user;
    rec->internal_ptr = new_internal;
    global_stats.current_bytes -= rec->requested_size;
    global_stats.current_bytes += size;
    rec->requested_size = size;
    insert_into_hash(rec);

    /* Free old block */
    memset(ptr, POISON_BYTE, copy_size);
    real_free((char *)ptr - CANARY_SIZE);

    if (global_stats.current_bytes > global_stats.peak_bytes) {
        global_stats.peak_bytes = global_stats.current_bytes;
    }

    pthread_mutex_unlock(&tracking_lock);
    in_hook = 0;
    return new_user;
}

Checkpoint: realloc works correctly, quarantine catches some use-after-free.

Phase 5: Polish and Testing (Days 19-21)

Goals:

  • Comprehensive test suite
  • Performance measurement
  • Documentation

Tasks:

  1. Test various error scenarios:
/* test_double_free.c */
int main(void) {
    char *p = malloc(100);
    free(p);
    free(p);  /* Should report double-free */
    return 0;
}

/* test_overflow.c */
int main(void) {
    char *p = malloc(10);
    strcpy(p, "This is way too long");  /* Overflow */
    free(p);  /* Should detect overflow via canary */
    return 0;
}

/* test_leak.c */
int main(void) {
    for (int i = 0; i < 5; i++) {
        malloc(100);  /* Never freed */
    }
    return 0;
}

Testing Strategy

Test Categories

Category Purpose Examples
Basic Operations malloc/free work correctly Allocate, use, free
Leak Detection Unreleased memory reported Forget to free
Double-Free Caught and reported Free twice
Invalid-Free Caught and reported Free stack pointer
Overflow Canary detects Write past end
Underflow Canary detects Write before start
Threading No races Multiple threads allocating

Critical Test Cases

# Test 1: Basic leak detection
$ cat > test1.c << 'EOF'
#include <stdlib.h>
int main() {
    malloc(100);  // Leak!
    return 0;
}
EOF
$ gcc -g test1.c -o test1
$ LD_PRELOAD=./libmemdbg.so ./test1
# Should report: 1 leak, 100 bytes

# Test 2: Double-free
$ cat > test2.c << 'EOF'
#include <stdlib.h>
int main() {
    char *p = malloc(100);
    free(p);
    free(p);  // Double-free!
    return 0;
}
EOF
$ gcc -g test2.c -o test2
$ LD_PRELOAD=./libmemdbg.so ./test2
# Should report: Double-free detected

# Test 3: Buffer overflow
$ cat > test3.c << 'EOF'
#include <stdlib.h>
#include <string.h>
int main() {
    char *p = malloc(10);
    strcpy(p, "This string is too long");  // Overflow!
    free(p);
    return 0;
}
EOF
$ gcc -g test3.c -o test3
$ LD_PRELOAD=./libmemdbg.so ./test3
# Should report: Buffer overflow detected

Common Pitfalls & Debugging

Frequent Mistakes

Pitfall Symptom Solution
Infinite recursion Stack overflow on startup Use recursion guard
dlsym returns NULL Crash in init Check before using
Race conditions Crashes in threaded programs Use mutex around tracking
Corrupt output Garbled error messages Ensure in_hook prevents re-entry
Missing realloc Leaks with realloc() Implement all allocators

Debugging the Debugger

/* Use stderr with no buffering */
setbuf(stderr, NULL);

/* Trace entry/exit */
#ifdef DEBUG
#define TRACE_ENTER(func) fprintf(stderr, ">>> %s\n", func)
#define TRACE_EXIT(func)  fprintf(stderr, "<<< %s\n", func)
#else
#define TRACE_ENTER(func)
#define TRACE_EXIT(func)
#endif

void *malloc(size_t size) {
    TRACE_ENTER("malloc");
    // ...
    TRACE_EXIT("malloc");
    return ptr;
}

Extensions & Challenges

Beginner Extensions

  • Add MEMDBG_VERBOSE environment variable
  • Add MEMDBG_LOG=file.txt for file output
  • Add MEMDBG_ABORT_ON_ERROR for immediate crash

Intermediate Extensions

  • Add memory usage graphs (track over time)
  • Add allocation size histogram
  • Detect memory fragmentation
  • Add mmap/munmap tracking

Advanced Extensions

  • Implement simplified shadow memory
  • Add real-time leak detection (not just at exit)
  • Support fork() correctly
  • Create IDE plugin for visualization
  • Add memory profiling (hot allocation sites)

Real-World Connections

Professional Tools

Valgrind Memcheck: The gold standard, 10-50x slowdown but thorough

AddressSanitizer (ASan): Compile-time instrumentation, 2x slowdown

Electric Fence: Simple page-based overflow detection

Your tool fits between: Faster than Valgrind, simpler than ASan, educational value

Industry Impact

Famous memory bugs found by similar tools:

  • Heartbleed (buffer over-read)
  • Dirty Pipe (use-after-free)
  • Numerous Chrome/Firefox CVEs

Resources

Essential Reading

Topic Source Chapter/Section
Dynamic memory CS:APP 3rd Ed Section 9.9: Dynamic Memory Allocation
Linking & libraries CS:APP 3rd Ed Chapter 7: Linking
LD_PRELOAD technique “Library Interposition” Various online resources
Valgrind internals Valgrind documentation Manual and papers
  • Previous: P13 (Safe String Library) - Preventing overflows at source
  • Next: P15 (Struct Packing Analyzer) - Understanding memory layout
  • Related: P04 (Stack Frame Inspector) - Understanding call stacks

Self-Assessment Checklist

Understanding Verification

  • I can explain how LD_PRELOAD works
  • I understand why canaries detect overflows
  • I know how quarantine helps detect use-after-free
  • I can explain the re-entrancy problem and solution

Implementation Verification

  • malloc/free are correctly intercepted
  • Double-free is detected and reported
  • Buffer overflow corrupts canary and is detected
  • Leak report shows allocation sites

Quality Verification

  • Works with multi-threaded programs
  • No false positives with normal programs
  • Performance overhead is reasonable (< 3x)

Submission / Completion Criteria

Minimum Viable Completion:

  • malloc/free interposition working
  • Basic leak reporting at exit
  • Double-free detection

Full Completion:

  • All allocator functions wrapped
  • Canary-based overflow detection
  • Stack traces in reports
  • Thread-safe implementation
  • Comprehensive test suite

Excellence:

  • Quarantine for use-after-free
  • Comparable output to Valgrind
  • Performance benchmarks
  • Integration tests with real programs

The Core Question You’re Answering

“How do tools like Valgrind detect memory errors, and can we build a simpler version that still catches real bugs?”

This project teaches you that debugging tools are not magic - they use understandable techniques like interposition, canaries, and tracking. By building your own, you understand exactly how memory debugging works.


This guide was expanded from EXPERT_C_PROGRAMMING_DEEP_DIVE.md. For the complete learning path, see the project index.