Project 24: Memory Leak Detector

Build a runtime memory leak detector that interposes malloc/free using LD_PRELOAD, tracks all allocations with stack traces, and emits a comprehensive leak report on program exit.


Quick Reference

Attribute Value
Language C (alt: C++)
Difficulty Advanced
Time 1-2 weeks
Chapters 7, 9, 3
Coolness ★★★★★ Debug Magic
Portfolio Value Systems Tool Gold

Learning Objectives

By completing this project, you will:

  1. Master library interposition: Understand how LD_PRELOAD (Linux) and DYLD_INSERT_LIBRARIES (macOS) intercept function calls at runtime
  2. Navigate dynamic linking internals: Use dlsym() with RTLD_NEXT to call the real allocation functions
  3. Avoid allocator recursion: Bootstrap metadata storage without calling malloc from within your interposed malloc
  4. Capture stack traces: Use backtrace() and backtrace_symbols() to record call sites for leak attribution
  5. Design allocation tracking structures: Build hash tables or other data structures using only mmap-backed memory
  6. Understand symbol resolution: See how the dynamic linker resolves symbols and how interposition changes that
  7. Implement atexit handlers: Register cleanup code that runs on program exit to report leaks
  8. Compare with professional tools: Understand how Valgrind, AddressSanitizer, and similar tools work

Deep Theoretical Foundation

Why Memory Leak Detection Matters

Memory leaks are silent killers. Unlike crashes, they don’t immediately announce themselves:

┌────────────────────────────────────────────────────────────────────┐
│                    MEMORY LEAK LIFECYCLE                           │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Time →                                                             │
│                                                                     │
│  Program Start          Hours Later           Eventually            │
│  ┌─────────────┐       ┌─────────────┐       ┌─────────────┐       │
│  │             │       │ ████████    │       │█████████████│       │
│  │   Memory    │       │ ████████    │       │█████████████│       │
│  │   Usage     │       │ ████████    │       │█████████████│       │
│  │             │       │ ████████    │       │ OOM KILLER! │       │
│  │   Low       │       │  Growing    │       │   CRASH!    │       │
│  └─────────────┘       └─────────────┘       └─────────────┘       │
│                                                                     │
│  The Problem:                                                       │
│  - Application works fine in development                           │
│  - Leaks accumulate slowly over time                               │
│  - Production crash after days/weeks of uptime                     │
│  - No stack trace pointing to the leak!                            │
│                                                                     │
│  The Solution:                                                      │
│  - Track EVERY allocation and its call site                        │
│  - Match allocations with deallocations                            │
│  - Report unmatched allocations on exit                            │
│  - Include stack traces for attribution                            │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Real-world impact:

  • Firefox had a memory leak detection system that found thousands of bugs
  • Server applications can leak gigabytes before anyone notices
  • Embedded systems with limited memory need leak-free code
  • Long-running daemons accumulate leaks over months

Library Interposition: The Core Mechanism

Library interposition lets you replace library functions with your own implementations:

┌────────────────────────────────────────────────────────────────────┐
│                    LIBRARY INTERPOSITION                           │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  NORMAL EXECUTION (No Interposition):                              │
│  ────────────────────────────────────                              │
│                                                                     │
│    main.c                  libc.so                                 │
│    ┌─────────────┐        ┌─────────────┐                          │
│    │ p = malloc  │───────▶│   malloc()  │                          │
│    │    (100)    │        │  (libc impl)│                          │
│    └─────────────┘        └─────────────┘                          │
│                                                                     │
│                                                                     │
│  WITH INTERPOSITION (LD_PRELOAD):                                  │
│  ────────────────────────────────                                  │
│                                                                     │
│    main.c              libleakcheck.so        libc.so              │
│    ┌─────────────┐    ┌────────────────┐    ┌─────────────┐        │
│    │ p = malloc  │───▶│  YOUR malloc() │───▶│   malloc()  │        │
│    │    (100)    │    │  - Track alloc │    │  (real impl)│        │
│    └─────────────┘    │  - Record stack│    └─────────────┘        │
│                       │  - Call real   │                            │
│                       └────────────────┘                            │
│                                                                     │
│  Symbol Resolution Order:                                          │
│  1. LD_PRELOAD libraries (checked FIRST)                           │
│  2. Application's own symbols                                      │
│  3. Dependent libraries (libc, etc.)                               │
│                                                                     │
│  This is why LD_PRELOAD "wins" - the dynamic linker finds          │
│  your malloc() first and binds all calls to it!                    │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

How LD_PRELOAD Works Internally

┌────────────────────────────────────────────────────────────────────┐
│                    LD_PRELOAD MECHANISM                            │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  1. PROGRAM LAUNCH                                                 │
│     $ LD_PRELOAD=./libleakcheck.so ./my_program                   │
│                                                                     │
│  2. DYNAMIC LINKER (ld-linux.so) ACTIONS:                         │
│                                                                     │
│     a) Parse LD_PRELOAD environment variable                       │
│     b) Load libleakcheck.so FIRST into memory                     │
│     c) Add its symbols to the global symbol table                  │
│     d) Load my_program and its dependencies (libc)                │
│     e) Resolve symbols - libleakcheck.so's malloc() wins!         │
│                                                                     │
│  3. SYMBOL TABLE STATE:                                            │
│                                                                     │
│     ┌─────────────────────────────────────────────┐                │
│     │        DYNAMIC SYMBOL TABLE                  │                │
│     ├───────────────┬─────────────────────────────┤                │
│     │   Symbol      │   Resolved To               │                │
│     ├───────────────┼─────────────────────────────┤                │
│     │   malloc      │   libleakcheck.so:malloc   ← WINS!          │
│     │   free        │   libleakcheck.so:free     ← WINS!          │
│     │   printf      │   libc.so:printf           │                │
│     │   ...         │   ...                       │                │
│     └───────────────┴─────────────────────────────┘                │
│                                                                     │
│  4. AT RUNTIME:                                                    │
│     - Every malloc() call goes to YOUR implementation              │
│     - You can do anything, then call the real malloc()            │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Using dlsym() with RTLD_NEXT

To call the “real” malloc from your interposed malloc, use dlsym(RTLD_NEXT, "malloc"):

┌────────────────────────────────────────────────────────────────────┐
│                    dlsym() AND RTLD_NEXT                           │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  RTLD_NEXT: "Find the NEXT occurrence of this symbol"              │
│  ─────────────────────────────────────────────────────              │
│                                                                     │
│  Symbol Search Path:                                                │
│                                                                     │
│  ┌──────────────────┐    ┌──────────────────┐    ┌──────────────┐  │
│  │  libleakcheck.so │───▶│     libc.so      │───▶│     END     │  │
│  │   (YOUR code)    │    │   (real malloc)  │    │              │  │
│  │                  │    │                  │    │              │  │
│  │  malloc() HERE   │    │  malloc() HERE   │    │              │  │
│  └──────────────────┘    └──────────────────┘    └──────────────┘  │
│         ↑                       ↑                                   │
│         │                       │                                   │
│    Current position      RTLD_NEXT finds this!                     │
│                                                                     │
│                                                                     │
│  CODE PATTERN:                                                      │
│  ─────────────                                                      │
│                                                                     │
│  #define _GNU_SOURCE                                                │
│  #include <dlfcn.h>                                                 │
│                                                                     │
│  static void *(*real_malloc)(size_t) = NULL;                       │
│                                                                     │
│  void *malloc(size_t size) {                                        │
│      // Get real malloc on first call                              │
│      if (real_malloc == NULL) {                                    │
│          real_malloc = dlsym(RTLD_NEXT, "malloc");                 │
│          if (real_malloc == NULL) {                                │
│              // Fatal error - can't find real malloc!              │
│              _exit(1);                                              │
│          }                                                          │
│      }                                                              │
│                                                                     │
│      // Your tracking code here...                                 │
│                                                                     │
│      // Call the real malloc                                       │
│      return real_malloc(size);                                      │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  IMPORTANT: dlsym() itself may call malloc!                        │
│  This creates a bootstrap problem - see next section.              │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

The Recursion Problem and Bootstrap

The biggest challenge: your interposed malloc might be called BEFORE you’ve obtained the real malloc pointer:

┌────────────────────────────────────────────────────────────────────┐
│                    THE BOOTSTRAP PROBLEM                           │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DANGEROUS SCENARIO:                                               │
│  ───────────────────                                               │
│                                                                     │
│  1. Program starts, first malloc() call                            │
│  2. Your malloc() is invoked                                       │
│  3. You call dlsym(RTLD_NEXT, "malloc")                           │
│  4. dlsym() internally calls malloc() !!!                         │
│  5. Your malloc() is invoked AGAIN                                 │
│  6. You call dlsym() AGAIN                                         │
│  7. INFINITE RECURSION → STACK OVERFLOW                           │
│                                                                     │
│                                                                     │
│  CALL STACK (CRASH):                                               │
│                                                                     │
│    malloc()  ← your implementation                                 │
│    dlsym()   ← needs memory                                        │
│    malloc()  ← your implementation (recursion!)                   │
│    dlsym()   ← needs memory                                        │
│    malloc()  ← your implementation (recursion!)                   │
│    ...       ← STACK OVERFLOW                                      │
│                                                                     │
│                                                                     │
│  SOLUTIONS:                                                        │
│  ──────────                                                        │
│                                                                     │
│  1. STATIC INITIALIZATION (Preferred):                             │
│     - Use __attribute__((constructor)) to initialize early        │
│     - Happens before main(), usually before any malloc()          │
│                                                                     │
│  2. BOOTSTRAP ALLOCATOR:                                           │
│     - Detect recursion with a flag                                 │
│     - Return memory from a small static buffer during bootstrap   │
│                                                                     │
│  3. THREAD-LOCAL RECURSION GUARD:                                  │
│     - Track if current thread is inside malloc()                  │
│     - Skip tracking during recursion                               │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Bootstrap Allocator Pattern

/* Static buffer for bootstrap allocations */
#define BOOTSTRAP_SIZE 65536
static char bootstrap_buffer[BOOTSTRAP_SIZE];
static size_t bootstrap_used = 0;
static int initializing = 0;

/* Bootstrap allocator - used during dlsym() calls */
static void *bootstrap_malloc(size_t size) {
    /* Align to 16 bytes */
    size = (size + 15) & ~15;

    if (bootstrap_used + size > BOOTSTRAP_SIZE) {
        /* Out of bootstrap memory - fatal */
        write(STDERR_FILENO, "Bootstrap OOM\n", 14);
        _exit(1);
    }

    void *ptr = bootstrap_buffer + bootstrap_used;
    bootstrap_used += size;
    return ptr;
}

/* Check if pointer is from bootstrap buffer */
static int is_bootstrap_ptr(void *ptr) {
    return (ptr >= (void *)bootstrap_buffer &&
            ptr < (void *)(bootstrap_buffer + BOOTSTRAP_SIZE));
}

void *malloc(size_t size) {
    /* During initialization, use bootstrap allocator */
    if (initializing) {
        return bootstrap_malloc(size);
    }

    /* Initialize on first call */
    if (real_malloc == NULL) {
        initializing = 1;
        real_malloc = dlsym(RTLD_NEXT, "malloc");
        real_free = dlsym(RTLD_NEXT, "free");
        /* ... other initializations ... */
        initializing = 0;
    }

    /* Normal operation */
    void *ptr = real_malloc(size);
    track_allocation(ptr, size);
    return ptr;
}

void free(void *ptr) {
    /* Bootstrap allocations are never freed */
    if (is_bootstrap_ptr(ptr)) {
        return;  /* Memory is static, just ignore */
    }

    track_deallocation(ptr);
    real_free(ptr);
}

Stack Trace Capture

To attribute leaks to specific code locations, capture the call stack at allocation time:

┌────────────────────────────────────────────────────────────────────┐
│                    STACK TRACE CAPTURE                             │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  USING backtrace() AND backtrace_symbols():                        │
│  ───────────────────────────────────────────                       │
│                                                                     │
│  #include <execinfo.h>                                              │
│                                                                     │
│  #define MAX_STACK_FRAMES 32                                        │
│                                                                     │
│  typedef struct {                                                   │
│      void *frames[MAX_STACK_FRAMES];                               │
│      int num_frames;                                                │
│  } StackTrace;                                                      │
│                                                                     │
│  void capture_stack(StackTrace *trace) {                           │
│      trace->num_frames = backtrace(trace->frames, MAX_STACK_FRAMES);│
│  }                                                                  │
│                                                                     │
│  void print_stack(StackTrace *trace) {                             │
│      char **symbols = backtrace_symbols(trace->frames,             │
│                                          trace->num_frames);       │
│      if (symbols) {                                                 │
│          for (int i = 0; i < trace->num_frames; i++) {             │
│              fprintf(stderr, "  #%d %s\n", i, symbols[i]);         │
│          }                                                          │
│          free(symbols);  /* Note: must free! */                    │
│      }                                                              │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  EXAMPLE OUTPUT:                                                    │
│  ───────────────                                                    │
│                                                                     │
│  Leak of 1024 bytes at 0x7f8a1c000000                              │
│  Allocated at:                                                      │
│    #0 ./libleakcheck.so(malloc+0x42) [0x7f8b00000042]             │
│    #1 ./my_program(create_buffer+0x1a) [0x55555555521a]           │
│    #2 ./my_program(process_data+0x35) [0x555555555335]            │
│    #3 ./my_program(main+0x28) [0x555555555428]                    │
│    #4 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)     │
│                                                                     │
│                                                                     │
│  GETTING BETTER SYMBOLS:                                           │
│  ───────────────────────                                           │
│                                                                     │
│  Compile target program with:                                       │
│    -g          (debug symbols)                                      │
│    -rdynamic   (export dynamic symbols)                            │
│    -funwind-tables  (stack unwinding info)                         │
│                                                                     │
│  For production, use addr2line for file:line info:                 │
│    addr2line -e ./my_program 0x555555555335                        │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Performance Considerations for Stack Traces

┌────────────────────────────────────────────────────────────────────┐
│                    STACK TRACE OVERHEAD                            │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  backtrace() is EXPENSIVE:                                         │
│  - Must walk the entire stack                                      │
│  - ~1-5 microseconds per call                                      │
│  - Can be 100x slower than malloc itself                           │
│                                                                     │
│  OPTIMIZATION STRATEGIES:                                          │
│  ────────────────────────                                          │
│                                                                     │
│  1. SAMPLE STACK TRACES:                                           │
│     - Only capture every Nth allocation                            │
│     - Good for finding major leaks quickly                         │
│                                                                     │
│  2. DEFERRED SYMBOLIZATION:                                        │
│     - Store only raw addresses (void *frames[])                   │
│     - Convert to symbols only when printing                        │
│     - backtrace_symbols() is even more expensive!                  │
│                                                                     │
│  3. STACK HASH FOR DEDUPLICATION:                                  │
│     - Hash the stack trace addresses                               │
│     - Group allocations by identical call site                     │
│     - Report "42 allocations from same location" instead of 42x   │
│                                                                     │
│  4. LIMIT STACK DEPTH:                                             │
│     - 8-16 frames is usually enough                                │
│     - 32 frames is maximum useful depth                            │
│                                                                     │
│  5. USE FRAME POINTERS:                                            │
│     - Compile with -fno-omit-frame-pointer                         │
│     - Makes stack walking faster                                   │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Allocation Tracking Data Structures

You need to track every allocation - but you can’t use malloc for your tracking structures!

┌────────────────────────────────────────────────────────────────────┐
│                    ALLOCATION TRACKING                             │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  WHAT TO TRACK PER ALLOCATION:                                     │
│  ─────────────────────────────                                     │
│                                                                     │
│  typedef struct AllocationRecord {                                 │
│      void *ptr;                /* The allocated pointer */         │
│      size_t size;              /* Requested size */                │
│      StackTrace stack;         /* Call stack at malloc() */        │
│      uint64_t timestamp;       /* When allocated (optional) */     │
│      struct AllocationRecord *next;  /* For hash table chaining */ │
│  } AllocationRecord;                                               │
│                                                                     │
│                                                                     │
│  STORAGE OPTIONS:                                                   │
│  ────────────────                                                   │
│                                                                     │
│  Option 1: MMAP-BACKED ARENA                                       │
│  ┌─────────────────────────────────────────────────┐               │
│  │                                                  │               │
│  │  mmap(NULL, arena_size, PROT_READ|PROT_WRITE,  │               │
│  │       MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);       │               │
│  │                                                  │               │
│  │  - Bypasses malloc entirely                     │               │
│  │  - Grows in page-sized chunks                   │               │
│  │  - No fragmentation issues                      │               │
│  │  - Can unmap when done                          │               │
│  │                                                  │               │
│  └─────────────────────────────────────────────────┘               │
│                                                                     │
│  Option 2: STATIC ARRAY (Simple, Limited)                          │
│  ┌─────────────────────────────────────────────────┐               │
│  │                                                  │               │
│  │  static AllocationRecord records[MAX_RECORDS];  │               │
│  │                                                  │               │
│  │  - Very simple                                   │               │
│  │  - Limited capacity                             │               │
│  │  - Good for small programs / testing            │               │
│  │                                                  │               │
│  └─────────────────────────────────────────────────┘               │
│                                                                     │
│                                                                     │
│  HASH TABLE FOR O(1) LOOKUP:                                       │
│  ───────────────────────────                                       │
│                                                                     │
│  Hash by pointer value:                                            │
│                                                                     │
│  #define HASH_BUCKETS 65537  /* Prime number */                    │
│                                                                     │
│  static AllocationRecord *hash_table[HASH_BUCKETS];                │
│                                                                     │
│  size_t hash_ptr(void *ptr) {                                      │
│      /* Mix bits for better distribution */                        │
│      uintptr_t p = (uintptr_t)ptr;                                 │
│      p = ((p >> 16) ^ p) * 0x45d9f3b;                              │
│      p = ((p >> 16) ^ p) * 0x45d9f3b;                              │
│      p = (p >> 16) ^ p;                                            │
│      return p % HASH_BUCKETS;                                       │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  HASH TABLE VISUALIZATION:                                         │
│  ─────────────────────────                                         │
│                                                                     │
│  Index   Bucket Chain                                               │
│  ─────   ────────────                                               │
│    0  →  [0x7f8a...|1024B|stack1] → NULL                          │
│    1  →  NULL                                                       │
│    2  →  [0x7f8b...|256B|stack2] → [0x7f8c...|64B|stack3] → NULL  │
│    3  →  NULL                                                       │
│   ...                                                               │
│                                                                     │
│  On malloc(ptr): Insert at hash_table[hash_ptr(ptr)]               │
│  On free(ptr):   Remove from hash_table[hash_ptr(ptr)]             │
│  On exit:        Everything remaining is a LEAK!                   │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Memory Arena for Metadata

Since you can’t use malloc for your tracking data, use mmap directly:

┌────────────────────────────────────────────────────────────────────┐
│                    MMAP-BACKED ARENA                               │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ARENA STRUCTURE:                                                  │
│  ────────────────                                                  │
│                                                                     │
│  typedef struct Arena {                                            │
│      void *base;        /* Start of mmap'd region */              │
│      size_t capacity;   /* Total size */                          │
│      size_t used;       /* Current usage */                       │
│  } Arena;                                                          │
│                                                                     │
│  static Arena metadata_arena;                                       │
│                                                                     │
│  void arena_init(Arena *a, size_t initial_size) {                  │
│      a->base = mmap(NULL, initial_size,                           │
│                     PROT_READ | PROT_WRITE,                       │
│                     MAP_PRIVATE | MAP_ANONYMOUS,                  │
│                     -1, 0);                                        │
│      if (a->base == MAP_FAILED) {                                  │
│          /* Fatal error */                                         │
│          _exit(1);                                                  │
│      }                                                              │
│      a->capacity = initial_size;                                   │
│      a->used = 0;                                                   │
│  }                                                                  │
│                                                                     │
│  void *arena_alloc(Arena *a, size_t size) {                        │
│      /* Align to 16 bytes */                                       │
│      size = (size + 15) & ~15;                                     │
│                                                                     │
│      if (a->used + size > a->capacity) {                          │
│          /* Need to grow - remap with larger size */              │
│          /* ... or allocate new arena ... */                      │
│      }                                                              │
│                                                                     │
│      void *ptr = (char *)a->base + a->used;                       │
│      a->used += size;                                              │
│      return ptr;                                                    │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  MEMORY LAYOUT:                                                    │
│  ──────────────                                                    │
│                                                                     │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │                    METADATA ARENA (mmap'd)                 │    │
│  ├────────────────────────────────────────────────────────────┤    │
│  │ AllocationRecord │ AllocationRecord │ AllocationRecord │..│    │
│  │   ptr=0x7f8a..   │   ptr=0x7f8b..   │   ptr=0x7f8c..   │  │    │
│  │   size=1024      │   size=256       │   size=64        │  │    │
│  │   stack=...      │   stack=...      │   stack=...      │  │    │
│  └────────────────────────────────────────────────────────────┘    │
│   ↑                                                    ↑           │
│   base                                                 used        │
│                                                                     │
│  Note: Arena only grows, never shrinks                             │
│        (free'd AllocationRecords are just marked/removed)          │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Reporting Leaks on Exit

Use atexit() or __attribute__((destructor)) to run cleanup code:

┌────────────────────────────────────────────────────────────────────┐
│                    EXIT HANDLERS                                   │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  TWO OPTIONS FOR RUNNING CODE AT EXIT:                             │
│  ─────────────────────────────────────                             │
│                                                                     │
│  Option 1: atexit()                                                │
│  ──────────────────                                                │
│                                                                     │
│  void report_leaks(void) {                                         │
│      /* Called when program exits normally */                      │
│  }                                                                  │
│                                                                     │
│  __attribute__((constructor))                                       │
│  void init(void) {                                                  │
│      atexit(report_leaks);                                          │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  Option 2: __attribute__((destructor))                             │
│  ──────────────────────────────────────                            │
│                                                                     │
│  __attribute__((destructor))                                        │
│  void report_leaks(void) {                                          │
│      /* Called when shared library is unloaded */                  │
│      /* Works even if program doesn't call exit() properly */      │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  REPORT FORMAT:                                                    │
│  ──────────────                                                    │
│                                                                     │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │                                                               │  │
│  │  ==== MEMORY LEAK REPORT ====                                │  │
│  │                                                               │  │
│  │  SUMMARY:                                                    │  │
│  │    Total allocations:     1,234                              │  │
│  │    Total deallocations:   1,230                              │  │
│  │    Leaked allocations:        4                              │  │
│  │    Leaked bytes:          2,048                              │  │
│  │                                                               │  │
│  │  LEAKED ALLOCATIONS:                                         │  │
│  │                                                               │  │
│  │  Leak #1: 1024 bytes at 0x7f8a1c000000                      │  │
│  │    Allocated at:                                             │  │
│  │      #0 ./libleakcheck.so(malloc+0x42)                      │  │
│  │      #1 ./program(create_buffer+0x1a) [buffer.c:42]         │  │
│  │      #2 ./program(main+0x28) [main.c:15]                    │  │
│  │                                                               │  │
│  │  Leak #2: 512 bytes at 0x7f8a1c001000                       │  │
│  │    Allocated at:                                             │  │
│  │      #0 ./libleakcheck.so(malloc+0x42)                      │  │
│  │      #1 ./program(parse_config+0x55) [config.c:128]         │  │
│  │      #2 ./program(init+0x33) [init.c:45]                    │  │
│  │                                                               │  │
│  │  ... (2 more leaks)                                          │  │
│  │                                                               │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Thread Safety Considerations

In multi-threaded programs, multiple threads may call malloc simultaneously:

┌────────────────────────────────────────────────────────────────────┐
│                    THREAD SAFETY                                   │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  THE PROBLEM:                                                      │
│  ────────────                                                      │
│                                                                     │
│  Thread 1                     Thread 2                              │
│  ─────────                    ─────────                             │
│  malloc(100)                  malloc(200)                           │
│    │                            │                                   │
│    ├─→ record_alloc() ←────────┤                                   │
│    │      (RACE CONDITION!)     │                                   │
│    │                            │                                   │
│                                                                     │
│  Without synchronization:                                           │
│  - Hash table corruption                                           │
│  - Lost allocation records                                         │
│  - Crashes                                                          │
│                                                                     │
│                                                                     │
│  SOLUTION 1: GLOBAL MUTEX                                          │
│  ────────────────────────                                          │
│                                                                     │
│  static pthread_mutex_t track_mutex = PTHREAD_MUTEX_INITIALIZER;   │
│                                                                     │
│  void *malloc(size_t size) {                                        │
│      pthread_mutex_lock(&track_mutex);                              │
│      /* ... tracking code ... */                                   │
│      pthread_mutex_unlock(&track_mutex);                            │
│                                                                     │
│      return real_malloc(size);                                      │
│  }                                                                  │
│                                                                     │
│  Problem: High contention - serializes all allocations!            │
│                                                                     │
│                                                                     │
│  SOLUTION 2: THREAD-LOCAL TRACKING                                 │
│  ─────────────────────────────────                                 │
│                                                                     │
│  __thread AllocationRecord *thread_records = NULL;                 │
│  __thread int inside_malloc = 0;  /* Recursion guard */           │
│                                                                     │
│  Each thread has its own record list:                              │
│  - No locks needed for tracking                                    │
│  - Merge lists at exit for final report                            │
│  - Trade-off: more memory usage                                    │
│                                                                     │
│                                                                     │
│  SOLUTION 3: LOCK-FREE DATA STRUCTURES                             │
│  ─────────────────────────────────────                             │
│                                                                     │
│  Use atomic operations (compare-and-swap) for lock-free lists      │
│  - More complex implementation                                     │
│  - Better scalability                                              │
│  - What production tools use                                       │
│                                                                     │
│                                                                     │
│  RECURSION GUARD (Required for all solutions):                     │
│  ─────────────────────────────────────────────                     │
│                                                                     │
│  __thread int inside_malloc = 0;                                    │
│                                                                     │
│  void *malloc(size_t size) {                                        │
│      if (inside_malloc) {                                           │
│          /* Recursive call - skip tracking to avoid infinite loop */│
│          return real_malloc(size);                                  │
│      }                                                              │
│      inside_malloc = 1;                                             │
│      /* ... tracking code (may call malloc indirectly) ... */      │
│      void *ptr = real_malloc(size);                                 │
│      inside_malloc = 0;                                             │
│      return ptr;                                                    │
│  }                                                                  │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Comparison with Professional Tools

Understanding how your detector compares to production tools:

┌────────────────────────────────────────────────────────────────────┐
│                    PROFESSIONAL TOOLS COMPARISON                   │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  VALGRIND (Memcheck):                                              │
│  ────────────────────                                              │
│  - Binary instrumentation (no recompilation needed)                │
│  - Tracks every memory access                                      │
│  - Detects: leaks, use-after-free, uninit reads, buffer overflows │
│  - 10-50x slowdown (full emulation)                               │
│  - Most comprehensive                                               │
│                                                                     │
│  ADDRESS SANITIZER (ASan):                                         │
│  ─────────────────────────                                         │
│  - Compile-time instrumentation (-fsanitize=address)              │
│  - Shadow memory for tracking valid regions                        │
│  - Detects: use-after-free, buffer overflows, leaks               │
│  - 2-3x slowdown                                                   │
│  - Built into GCC and Clang                                        │
│                                                                     │
│  LEAK SANITIZER (LSan):                                            │
│  ──────────────────────                                            │
│  - Focused specifically on leak detection                          │
│  - Can run standalone or with ASan                                 │
│  - Fast: minimal runtime overhead until exit                       │
│  - Uses pointer scanning to find reachable memory                  │
│                                                                     │
│  MTRACE (glibc):                                                   │
│  ───────────────                                                   │
│  - Simple LD_PRELOAD-based tracking                               │
│  - Writes log file of alloc/free operations                       │
│  - Post-process with mtrace(1) to find leaks                      │
│  - Limited: no stack traces, single-threaded                       │
│                                                                     │
│  YOUR DETECTOR:                                                    │
│  ──────────────                                                    │
│  - LD_PRELOAD interposition (like mtrace)                         │
│  - Stack trace capture (like Valgrind/ASan)                       │
│  - Hash table tracking (like all of them)                          │
│  - Exit-time reporting (like LSan)                                 │
│                                                                     │
│                                                                     │
│  FEATURE COMPARISON:                                               │
│  ───────────────────                                               │
│                                                                     │
│  Feature              Valgrind  ASan  LSan  mtrace  YOUR DETECTOR  │
│  ───────────────────  ────────  ────  ────  ──────  ──────────────  │
│  Leak detection          ✓       ✓     ✓      ✓          ✓         │
│  Stack traces            ✓       ✓     ✓      ✗          ✓         │
│  Use-after-free          ✓       ✓     ✗      ✗         (ext)      │
│  Buffer overflow         ✓       ✓     ✗      ✗          ✗         │
│  No recompile needed     ✓       ✗     ✗      ✓          ✓         │
│  Low overhead            ✗       △     ✓      ✓          △         │
│  Thread-safe             ✓       ✓     ✓      ✗         (you!)    │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Project Specification

What You Will Build

A shared library (libleakcheck.so on Linux, libleakcheck.dylib on macOS) that:

  1. Interposes malloc/calloc/realloc/free to intercept all allocations
  2. Tracks every allocation with pointer, size, and stack trace
  3. Matches allocations with deallocations to find unfreed memory
  4. Reports leaks on exit with detailed information for debugging

Functional Requirements

┌────────────────────────────────────────────────────────────────────┐
│                    FUNCTIONAL REQUIREMENTS                         │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  INTERPOSED FUNCTIONS:                                             │
│  ─────────────────────                                             │
│                                                                     │
│  void *malloc(size_t size);                                        │
│    - Record allocation (ptr, size, stack trace)                   │
│    - Call real malloc and return result                           │
│                                                                     │
│  void *calloc(size_t nmemb, size_t size);                         │
│    - Record allocation (ptr, nmemb*size, stack trace)             │
│    - Call real calloc and return result                           │
│                                                                     │
│  void *realloc(void *ptr, size_t size);                           │
│    - Remove old record if ptr != NULL                             │
│    - Record new allocation                                         │
│    - Call real realloc and return result                          │
│                                                                     │
│  void free(void *ptr);                                             │
│    - Remove allocation record                                      │
│    - Call real free                                                │
│                                                                     │
│                                                                     │
│  TRACKING REQUIREMENTS:                                            │
│  ──────────────────────                                            │
│                                                                     │
│  For each allocation, store:                                       │
│  - Pointer address (void *)                                        │
│  - Requested size (size_t)                                         │
│  - Stack trace (8-16 frames)                                       │
│  - Timestamp (optional, for debugging)                             │
│                                                                     │
│                                                                     │
│  REPORTING REQUIREMENTS:                                           │
│  ───────────────────────                                           │
│                                                                     │
│  On program exit, print to stderr:                                 │
│  1. Summary statistics (total allocs, frees, leaks, bytes)        │
│  2. Each leaked allocation with:                                   │
│     - Address and size                                             │
│     - Full stack trace from allocation                            │
│  3. Sort by size (largest leaks first) or by call site            │
│                                                                     │
│                                                                     │
│  ERROR HANDLING:                                                   │
│  ───────────────                                                   │
│                                                                     │
│  - Handle double-free gracefully (warn but don't crash)           │
│  - Handle free of unknown pointer (warn but call real free)       │
│  - Handle out-of-metadata-memory (report and disable tracking)    │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Non-Functional Requirements

  • Performance: Less than 5x slowdown for typical applications
  • Memory overhead: Less than 100 bytes per tracked allocation
  • Thread safety: Must work correctly with multi-threaded programs
  • Portability: Linux primary, macOS secondary support
  • No false positives: Never report valid memory as leaked

Usage

# Build the library
$ gcc -shared -fPIC -o libleakcheck.so leakcheck.c -ldl -lpthread

# Run any program with leak checking
$ LD_PRELOAD=./libleakcheck.so ./my_program

# Or on macOS:
$ DYLD_INSERT_LIBRARIES=./libleakcheck.dylib ./my_program

Real World Outcome

When you complete this project, here’s exactly what you’ll see:

Running a Leaky Program

$ cat leaky_test.c
#include <stdlib.h>
#include <string.h>

char *create_buffer(int size) {
    return malloc(size);  // Caller must free!
}

void process_data(void) {
    char *buf1 = create_buffer(1024);
    char *buf2 = create_buffer(512);
    strcpy(buf1, "Hello");
    // Oops! Forgot to free buf1 and buf2!
}

int main(void) {
    process_data();

    char *buf3 = malloc(256);
    free(buf3);  // This one is properly freed

    char *buf4 = malloc(2048);
    // Oops! Forgot buf4 too!

    return 0;
}

$ gcc -g -rdynamic -o leaky_test leaky_test.c
$ LD_PRELOAD=./libleakcheck.so ./leaky_test

===============================================================================
                         MEMORY LEAK REPORT
===============================================================================

SUMMARY:
  Total allocations:      4
  Total deallocations:    1
  Leaked allocations:     3
  Leaked bytes:       3,584

-------------------------------------------------------------------------------
LEAKED ALLOCATIONS (sorted by size, largest first):
-------------------------------------------------------------------------------

Leak #1: 2048 bytes at 0x55a8b3c01040
  Allocated at:
    #0  ./libleakcheck.so(malloc+0x8a) [0x7f3c8a2008a]
    #1  ./leaky_test(main+0x52) [leaky_test.c:18]
    #2  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)
    #3  ./leaky_test(_start+0x2e)

Leak #2: 1024 bytes at 0x55a8b3c00820
  Allocated at:
    #0  ./libleakcheck.so(malloc+0x8a) [0x7f3c8a2008a]
    #1  ./leaky_test(create_buffer+0x1d) [leaky_test.c:5]
    #2  ./leaky_test(process_data+0x19) [leaky_test.c:9]
    #3  ./leaky_test(main+0xe) [leaky_test.c:14]
    #4  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)

Leak #3: 512 bytes at 0x55a8b3c00c30
  Allocated at:
    #0  ./libleakcheck.so(malloc+0x8a) [0x7f3c8a2008a]
    #1  ./leaky_test(create_buffer+0x1d) [leaky_test.c:5]
    #2  ./leaky_test(process_data+0x2a) [leaky_test.c:10]
    #3  ./leaky_test(main+0xe) [leaky_test.c:14]
    #4  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)

===============================================================================

Running a Clean Program

$ LD_PRELOAD=./libleakcheck.so ./clean_program

===============================================================================
                         MEMORY LEAK REPORT
===============================================================================

SUMMARY:
  Total allocations:    127
  Total deallocations:  127
  Leaked allocations:     0
  Leaked bytes:           0

No memory leaks detected!
===============================================================================

Debug Mode Output

$ LEAKCHECK_DEBUG=1 LD_PRELOAD=./libleakcheck.so ./my_program

[leakcheck] Initialized: tracking enabled
[leakcheck] malloc(1024) = 0x55a8b3c00820
[leakcheck] malloc(512) = 0x55a8b3c00c30
[leakcheck] malloc(256) = 0x55a8b3c00e40
[leakcheck] free(0x55a8b3c00e40)
[leakcheck] malloc(2048) = 0x55a8b3c01040
[leakcheck] Shutdown: generating leak report...
... (leak report follows) ...

Solution Architecture

High-Level Design

┌────────────────────────────────────────────────────────────────────┐
│                    LEAK DETECTOR ARCHITECTURE                      │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  TARGET APPLICATION                                                │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │  main()                                                       │  │
│  │    │                                                          │  │
│  │    ├──→ malloc(100) ────────────────────┐                    │  │
│  │    │                                     │                    │  │
│  │    ├──→ malloc(200) ─────────┐          │                    │  │
│  │    │                          │          │                    │  │
│  │    └──→ free(ptr) ─────┐     │          │                    │  │
│  └────────────────────────│─────│──────────│────────────────────┘  │
│                           │     │          │                        │
│                           ▼     ▼          ▼                        │
│  LIBLEAKCHECK.SO         ┌─────────────────────────────────────┐   │
│  ┌───────────────────────│     INTERPOSED FUNCTIONS           │   │
│  │                       │  ┌───────────────────────────────┐  │   │
│  │   malloc() ←──────────│──│ 1. Guard against recursion    │  │   │
│  │   calloc()            │  │ 2. Capture stack trace        │  │   │
│  │   realloc()           │  │ 3. Call real_malloc()         │  │   │
│  │   free() ←────────────│──│ 4. Record in hash table       │  │   │
│  │                       │  │ 5. Return pointer             │  │   │
│  │                       │  └───────────────────────────────┘  │   │
│  │                       └─────────────────────────────────────┘   │
│  │                                                                  │
│  │   TRACKING DATA STRUCTURES (mmap-backed)                        │
│  │   ┌────────────────────────────────────────────────────────┐    │
│  │   │                   HASH TABLE                            │    │
│  │   │  ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐    │    │
│  │   │  │  0  │  1  │  2  │  3  │ ... │     │     │ N-1 │    │    │
│  │   │  └──│──┴─────┴──│──┴─────┴─────┴─────┴─────┴─────┘    │    │
│  │   │     │           │                                       │    │
│  │   │     ▼           ▼                                       │    │
│  │   │  [Record]    [Record]──→[Record]──→NULL                │    │
│  │   │   ptr=0x...   ptr=0x...  ptr=0x...                     │    │
│  │   │   size=100    size=200   size=50                       │    │
│  │   │   stack=...   stack=...  stack=...                     │    │
│  │   └────────────────────────────────────────────────────────┘    │
│  │                                                                  │
│  │   DESTRUCTOR (__attribute__((destructor)))                      │
│  │   ┌────────────────────────────────────────────────────────┐    │
│  │   │  report_leaks()                                         │    │
│  │   │  - Walk hash table for remaining records               │    │
│  │   │  - Sort by size                                         │    │
│  │   │  - Print leak report with stack traces                 │    │
│  │   └────────────────────────────────────────────────────────┘    │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                     │
│  LIBC.SO                                                           │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │  real_malloc() ←── Called via dlsym(RTLD_NEXT, "malloc")    │  │
│  │  real_calloc()                                                │  │
│  │  real_realloc()                                               │  │
│  │  real_free()                                                  │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Data Structures

/* ==================== CONFIGURATION ==================== */

#define MAX_STACK_FRAMES 16       /* Stack trace depth */
#define HASH_BUCKETS     65537    /* Prime for good distribution */
#define ARENA_SIZE       (16 * 1024 * 1024)  /* 16MB metadata arena */

/* ==================== CORE STRUCTURES ==================== */

/* Stack trace storage */
typedef struct {
    void *frames[MAX_STACK_FRAMES];
    int num_frames;
} StackTrace;

/* Per-allocation record */
typedef struct AllocationRecord {
    void *ptr;                           /* Allocated pointer */
    size_t size;                         /* Requested size */
    StackTrace stack;                    /* Call stack at malloc */
    struct AllocationRecord *next;       /* Hash chain link */
} AllocationRecord;

/* Hash table for O(1) lookup */
static AllocationRecord *hash_table[HASH_BUCKETS];

/* Memory arena for AllocationRecords */
typedef struct {
    void *base;
    size_t capacity;
    size_t used;
} Arena;

static Arena metadata_arena;

/* Function pointers to real allocation functions */
static void *(*real_malloc)(size_t) = NULL;
static void *(*real_calloc)(size_t, size_t) = NULL;
static void *(*real_realloc)(void *, size_t) = NULL;
static void (*real_free)(void *) = NULL;

/* State tracking */
static __thread int inside_alloc = 0;    /* Recursion guard */
static int initialized = 0;
static pthread_mutex_t track_lock = PTHREAD_MUTEX_INITIALIZER;

/* Statistics */
static size_t total_allocations = 0;
static size_t total_deallocations = 0;
static size_t total_bytes_allocated = 0;
static size_t total_bytes_freed = 0;

Hash Table Operations

┌────────────────────────────────────────────────────────────────────┐
│                    HASH TABLE OPERATIONS                           │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  HASH FUNCTION:                                                    │
│  ──────────────                                                    │
│                                                                     │
│  size_t hash_ptr(void *ptr) {                                      │
│      uintptr_t p = (uintptr_t)ptr;                                 │
│      // Mix bits using multiply-shift                              │
│      p = ((p >> 16) ^ p) * 0x45d9f3b;                              │
│      p = ((p >> 16) ^ p) * 0x45d9f3b;                              │
│      p = (p >> 16) ^ p;                                            │
│      return p % HASH_BUCKETS;                                       │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  INSERT OPERATION (on malloc):                                     │
│  ─────────────────────────────                                     │
│                                                                     │
│  void track_alloc(void *ptr, size_t size, StackTrace *st) {       │
│      size_t idx = hash_ptr(ptr);                                   │
│                                                                     │
│      AllocationRecord *rec = arena_alloc(&metadata_arena,          │
│                                          sizeof(AllocationRecord));│
│      rec->ptr = ptr;                                               │
│      rec->size = size;                                             │
│      rec->stack = *st;                                             │
│                                                                     │
│      // Insert at head of bucket chain                             │
│      rec->next = hash_table[idx];                                  │
│      hash_table[idx] = rec;                                        │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  REMOVE OPERATION (on free):                                       │
│  ───────────────────────────                                       │
│                                                                     │
│  AllocationRecord *track_free(void *ptr) {                         │
│      size_t idx = hash_ptr(ptr);                                   │
│                                                                     │
│      AllocationRecord *prev = NULL;                                │
│      AllocationRecord *curr = hash_table[idx];                     │
│                                                                     │
│      while (curr) {                                                 │
│          if (curr->ptr == ptr) {                                   │
│              // Found it - remove from chain                       │
│              if (prev)                                              │
│                  prev->next = curr->next;                          │
│              else                                                   │
│                  hash_table[idx] = curr->next;                     │
│              return curr;  // Return record for stats              │
│          }                                                          │
│          prev = curr;                                               │
│          curr = curr->next;                                         │
│      }                                                              │
│      return NULL;  // Not found - wasn't tracked (or double-free) │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  VISUALIZATION:                                                    │
│  ──────────────                                                    │
│                                                                     │
│  hash_table[0] ──→ [0x7f8a..., 1024B] ──→ NULL                    │
│  hash_table[1] ──→ NULL                                            │
│  hash_table[2] ──→ [0x7f8b..., 256B] ──→ [0x7f8c..., 64B] ──→ NULL│
│  hash_table[3] ──→ [0x7f8d..., 512B] ──→ NULL                     │
│  ...                                                                │
│                                                                     │
│  After free(0x7f8b...):                                            │
│                                                                     │
│  hash_table[2] ──→ [0x7f8c..., 64B] ──→ NULL  (0x7f8b removed)    │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Interposition Flow

┌────────────────────────────────────────────────────────────────────┐
│                    MALLOC INTERPOSITION FLOW                       │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Application calls malloc(100)                                     │
│         │                                                          │
│         ▼                                                          │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │  YOUR malloc() [in libleakcheck.so]                         │    │
│  │                                                              │    │
│  │  1. Check recursion guard                                   │    │
│  │     ┌─────────────────────────────────────────┐            │    │
│  │     │ if (inside_alloc) {                     │            │    │
│  │     │     return real_malloc(size);           │            │    │
│  │     │ }                                        │            │    │
│  │     │ inside_alloc = 1;                       │            │    │
│  │     └─────────────────────────────────────────┘            │    │
│  │                                                              │    │
│  │  2. Capture stack trace                                     │    │
│  │     ┌─────────────────────────────────────────┐            │    │
│  │     │ StackTrace st;                          │            │    │
│  │     │ st.num_frames = backtrace(st.frames,    │            │    │
│  │     │                           MAX_FRAMES);  │            │    │
│  │     └─────────────────────────────────────────┘            │    │
│  │                                                              │    │
│  │  3. Call real malloc                                        │    │
│  │     ┌─────────────────────────────────────────┐            │    │
│  │     │ void *ptr = real_malloc(size);          │  ───────┐  │    │
│  │     └─────────────────────────────────────────┘         │  │    │
│  │                                                          │  │    │
│  │  4. Record allocation                                    │  │    │
│  │     ┌─────────────────────────────────────────┐         │  │    │
│  │     │ if (ptr) {                              │         │  │    │
│  │     │     pthread_mutex_lock(&track_lock);    │         │  │    │
│  │     │     track_alloc(ptr, size, &st);        │         │  │    │
│  │     │     total_allocations++;                │         │  │    │
│  │     │     total_bytes_allocated += size;      │         │  │    │
│  │     │     pthread_mutex_unlock(&track_lock);  │         │  │    │
│  │     │ }                                        │         │  │    │
│  │     └─────────────────────────────────────────┘         │  │    │
│  │                                                          │  │    │
│  │  5. Return pointer                                       │  │    │
│  │     ┌─────────────────────────────────────────┐         │  │    │
│  │     │ inside_alloc = 0;                       │         │  │    │
│  │     │ return ptr;                              │  ◄──────┘  │    │
│  │     └─────────────────────────────────────────┘            │    │
│  │                                                              │    │
│  └────────────────────────────────────────────────────────────┘    │
│         │                                                          │
│         ▼                                                          │
│  Application receives pointer                                      │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Implementation Guide

The Core Question You’re Answering

“When a program has a memory leak, how can I find exactly which line of code allocated the leaked memory without modifying the source code?”

This project teaches you that you can transparently intercept ANY library function call at runtime, perform your own bookkeeping, and then delegate to the original function. This is the foundation for debugging tools, profilers, and security monitoring. You’ll understand why Valgrind can find leaks without source code, how LD_PRELOAD enables powerful runtime tooling, and the intricate dance required to safely wrap functions that you yourself depend on.


Concepts You Must Understand First

Before starting this project, ensure you understand these concepts:

Concept Why It Matters Where to Learn
Dynamic linking and loading You’re intercepting the linker’s symbol resolution CS:APP 7.6-7.12
The dlsym() function You need it to call the original malloc man dlsym, CS:APP 7.11
Process memory layout Understanding where heap, stack, libraries live CS:APP 9.7
Stack frames and unwinding Stack traces require walking the call stack CS:APP 3.7, debugger docs
Thread-local storage Recursion guards need per-thread state TLPI Ch. 31
mmap() system call Allocating memory without malloc CS:APP 9.8, man mmap
Hash table implementation O(1) allocation lookup Any algorithms book
atexit/constructors/destructors Running code at library load/unload GCC docs

Self-Assessment Questions:

  1. What does LD_PRELOAD do and why does it affect symbol resolution?
  2. What is RTLD_NEXT and when would you use it?
  3. Why can’t you just call malloc() to allocate your tracking structures?
  4. How does backtrace() work (what does it read)?
  5. What happens if your interposed malloc calls a function that calls malloc?

Questions to Guide Your Design

Work through these questions BEFORE writing code:

  1. Bootstrap Problem: How will you get the real malloc pointer without causing infinite recursion?

  2. Metadata Storage: How will you allocate memory for your tracking structures without using malloc?

  3. Stack Trace Cost: backtrace() is expensive. Will you capture it for every allocation or sample?

  4. Thread Safety: How will you protect your hash table from concurrent access?

  5. Recursion Detection: How will you detect when your tracking code triggers another malloc?

  6. Exit Handling: How will you ensure your leak report runs even if the program crashes?

  7. Performance: What’s the acceptable overhead? 2x? 5x? 10x?

  8. realloc Handling: realloc can move data. How do you track the old and new pointers?


Thinking Exercise

Before writing any code, trace through this scenario by hand:

// Program being tested
int main() {
    char *a = malloc(100);  // #1
    char *b = malloc(200);  // #2
    free(a);                // #3
    char *c = malloc(300);  // #4
    // exit without freeing b or c
    return 0;
}

Exercise: Draw the state of your hash table after each numbered line:

  1. After malloc(100): What’s in the hash table? What’s in the allocation record?

  2. After malloc(200): Two entries now. Do they hash to the same bucket?

  3. After free(a): How do you find and remove the record?

  4. After malloc(300): What’s the final state?

  5. At exit: Walk through your leak reporting code. What will it print?

Verify your understanding by implementing and adding debug output.


Interview Questions They’ll Ask

After completing this project, you’ll be ready for these common interview questions:

  1. “How does LD_PRELOAD work?”
    • Expected: It prepends a shared library to the search path, so its symbols are found first during dynamic linking
    • Bonus: Explain symbol resolution order, show how to call original function with dlsym(RTLD_NEXT, …)
  2. “What’s the challenge of wrapping malloc with another malloc-dependent function?”
    • Expected: Recursion - if your wrapper calls something that calls malloc, infinite loop
    • Bonus: Describe bootstrap allocator, thread-local recursion guards, constructor-based initialization
  3. “How would you implement a memory leak detector?”
    • Expected: Track allocations in a hash table, match with frees, report unmatched on exit
    • Bonus: Discuss stack trace capture, mmap for metadata, handling realloc
  4. “Why can’t you just use malloc inside your interposed malloc?”
    • Expected: Your malloc IS the malloc, so you’d recurse infinitely
    • Bonus: Explain mmap-backed arena, static bootstrap buffer, recursion detection
  5. “How does Valgrind detect memory errors?”
    • Expected: Binary instrumentation - it interprets every instruction
    • Bonus: Compare with LD_PRELOAD approach (lighter but less coverage), AddressSanitizer (compile-time shadow memory)
  6. “How would you make your leak detector thread-safe?”
    • Expected: Mutex around hash table operations, thread-local recursion guards
    • Bonus: Discuss lock-free alternatives, per-thread tracking with merge at exit

Hints in Layers

If you’re stuck, reveal hints one at a time:

Hint 1: Basic Structure and Initialization

Start with the basic library structure:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <execinfo.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

/* Function pointers to real implementations */
static void *(*real_malloc)(size_t) = NULL;
static void (*real_free)(void *) = NULL;

/* Called when library is loaded */
__attribute__((constructor))
static void init(void) {
    real_malloc = dlsym(RTLD_NEXT, "malloc");
    real_free = dlsym(RTLD_NEXT, "free");

    if (!real_malloc || !real_free) {
        fprintf(stderr, "Error: dlsym failed\n");
        _exit(1);
    }

    /* Initialize your tracking structures here */
}

/* Called when library is unloaded */
__attribute__((destructor))
static void fini(void) {
    /* Print leak report here */
}

The constructor runs BEFORE main(), the destructor runs AFTER main() returns.

Hint 2: Handling the Bootstrap/Recursion Problem

Use a thread-local flag and a bootstrap buffer:

#define BOOTSTRAP_SIZE 65536
static char bootstrap_buffer[BOOTSTRAP_SIZE];
static size_t bootstrap_used = 0;
static __thread int inside_alloc = 0;

/* Simple allocator for bootstrap phase */
static void *bootstrap_alloc(size_t size) {
    size = (size + 15) & ~15;  /* Align to 16 bytes */
    if (bootstrap_used + size > BOOTSTRAP_SIZE) {
        write(2, "Bootstrap OOM\n", 14);
        _exit(1);
    }
    void *ptr = bootstrap_buffer + bootstrap_used;
    bootstrap_used += size;
    return ptr;
}

static int is_bootstrap_ptr(void *ptr) {
    return ptr >= (void *)bootstrap_buffer &&
           ptr < (void *)(bootstrap_buffer + BOOTSTRAP_SIZE);
}

void *malloc(size_t size) {
    /* During initialization or recursion, use bootstrap/real */
    if (!real_malloc || inside_alloc) {
        if (real_malloc)
            return real_malloc(size);
        return bootstrap_alloc(size);
    }

    inside_alloc = 1;

    /* Your tracking code here */
    void *ptr = real_malloc(size);
    /* ... track allocation ... */

    inside_alloc = 0;
    return ptr;
}

void free(void *ptr) {
    if (!ptr) return;
    if (is_bootstrap_ptr(ptr)) return;  /* Never free bootstrap memory */

    if (!real_free || inside_alloc) {
        if (real_free) real_free(ptr);
        return;
    }

    inside_alloc = 1;
    /* ... track deallocation ... */
    inside_alloc = 0;

    real_free(ptr);
}
Hint 3: Stack Trace Capture

Capture and store stack traces efficiently:

#define MAX_STACK_FRAMES 16

typedef struct {
    void *frames[MAX_STACK_FRAMES];
    int num_frames;
} StackTrace;

static void capture_stack(StackTrace *st) {
    /* backtrace() fills array with return addresses */
    st->num_frames = backtrace(st->frames, MAX_STACK_FRAMES);
}

static void print_stack(StackTrace *st, FILE *out) {
    /* backtrace_symbols() converts addresses to strings */
    char **symbols = backtrace_symbols(st->frames, st->num_frames);
    if (symbols) {
        for (int i = 2; i < st->num_frames; i++) {  /* Skip 2: malloc wrapper + capture_stack */
            fprintf(out, "    #%d %s\n", i - 2, symbols[i]);
        }
        /* Note: symbols must be freed! But we use real_free */
        real_free(symbols);
    }
}

For better symbol names, compile target programs with -g -rdynamic.

Hint 4: mmap-Backed Arena for Metadata

Allocate tracking structures without using malloc:

typedef struct Arena {
    void *base;
    size_t capacity;
    size_t used;
} Arena;

static Arena metadata_arena;

static void arena_init(Arena *a, size_t size) {
    a->base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a->base == MAP_FAILED) {
        write(2, "mmap failed\n", 12);
        _exit(1);
    }
    a->capacity = size;
    a->used = 0;
}

static void *arena_alloc(Arena *a, size_t size) {
    size = (size + 15) & ~15;  /* Align */

    if (a->used + size > a->capacity) {
        /* Could grow arena here with mremap() */
        return NULL;
    }

    void *ptr = (char *)a->base + a->used;
    a->used += size;
    return ptr;
}

/* Use in init(): */
/* arena_init(&metadata_arena, 16 * 1024 * 1024);  // 16MB */
Hint 5: Complete Hash Table Implementation

Putting it all together:

typedef struct AllocationRecord {
    void *ptr;
    size_t size;
    StackTrace stack;
    struct AllocationRecord *next;
} AllocationRecord;

#define HASH_BUCKETS 65537

static AllocationRecord *hash_table[HASH_BUCKETS];
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

static size_t hash_ptr(void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    p = ((p >> 16) ^ p) * 0x45d9f3b;
    p = ((p >> 16) ^ p) * 0x45d9f3b;
    p = (p >> 16) ^ p;
    return p % HASH_BUCKETS;
}

static void track_alloc(void *ptr, size_t size, StackTrace *st) {
    AllocationRecord *rec = arena_alloc(&metadata_arena, sizeof(*rec));
    if (!rec) return;  /* Out of metadata space */

    rec->ptr = ptr;
    rec->size = size;
    rec->stack = *st;

    size_t idx = hash_ptr(ptr);

    pthread_mutex_lock(&hash_lock);
    rec->next = hash_table[idx];
    hash_table[idx] = rec;
    pthread_mutex_unlock(&hash_lock);
}

static AllocationRecord *track_free(void *ptr) {
    size_t idx = hash_ptr(ptr);

    pthread_mutex_lock(&hash_lock);

    AllocationRecord *prev = NULL;
    AllocationRecord *curr = hash_table[idx];

    while (curr) {
        if (curr->ptr == ptr) {
            if (prev)
                prev->next = curr->next;
            else
                hash_table[idx] = curr->next;
            pthread_mutex_unlock(&hash_lock);
            return curr;
        }
        prev = curr;
        curr = curr->next;
    }

    pthread_mutex_unlock(&hash_lock);
    return NULL;  /* Not found */
}

/* In destructor, iterate all buckets to find leaks */
static void report_leaks(void) {
    size_t leak_count = 0;
    size_t leak_bytes = 0;

    for (size_t i = 0; i < HASH_BUCKETS; i++) {
        AllocationRecord *rec = hash_table[i];
        while (rec) {
            leak_count++;
            leak_bytes += rec->size;

            fprintf(stderr, "\nLeak: %zu bytes at %p\n", rec->size, rec->ptr);
            fprintf(stderr, "  Allocated at:\n");
            print_stack(&rec->stack, stderr);

            rec = rec->next;
        }
    }

    fprintf(stderr, "\n=== SUMMARY ===\n");
    fprintf(stderr, "Leaked: %zu allocations, %zu bytes\n", leak_count, leak_bytes);
}

Testing Strategy

Test Programs

Create test programs with known leaks:

/* test_simple_leak.c - Single intentional leak */
#include <stdlib.h>
int main() {
    void *p = malloc(1234);
    /* Intentionally not freed */
    return 0;
}
/* Expected: Report 1 leak of 1234 bytes */

/* test_no_leaks.c - All memory freed */
#include <stdlib.h>
int main() {
    void *p1 = malloc(100);
    void *p2 = malloc(200);
    void *p3 = malloc(300);
    free(p1);
    free(p2);
    free(p3);
    return 0;
}
/* Expected: No leaks reported */

/* test_multiple_leaks.c - Multiple leaks from same location */
#include <stdlib.h>
void leak_in_loop() {
    for (int i = 0; i < 10; i++) {
        malloc(100);  /* 10 leaks from same line */
    }
}
int main() {
    leak_in_loop();
    return 0;
}
/* Expected: 10 leaks, ideally grouped by call site */

/* test_realloc.c - Test realloc handling */
#include <stdlib.h>
int main() {
    void *p = malloc(100);
    p = realloc(p, 200);  /* Should track as single allocation */
    /* Leak the realloc'd memory */
    return 0;
}
/* Expected: 1 leak of 200 bytes */

/* test_threads.c - Multi-threaded allocations */
#include <stdlib.h>
#include <pthread.h>

void *thread_func(void *arg) {
    void *p = malloc(100);
    /* Leak */
    return NULL;
}

int main() {
    pthread_t threads[4];
    for (int i = 0; i < 4; i++) {
        pthread_create(&threads[i], NULL, thread_func, NULL);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
    return 0;
}
/* Expected: 4 leaks of 100 bytes each, no crashes */

Test Runner Script

#!/bin/bash

# Build the leak checker
gcc -shared -fPIC -g -O2 -o libleakcheck.so leakcheck.c -ldl -lpthread

# Build test programs with debug info
for test in test_*.c; do
    gcc -g -rdynamic -o "${test%.c}" "$test" -lpthread
done

# Run tests
run_test() {
    echo "=== Running $1 ==="
    LD_PRELOAD=./libleakcheck.so "./$1"
    echo ""
}

run_test test_simple_leak
run_test test_no_leaks
run_test test_multiple_leaks
run_test test_realloc
run_test test_threads

echo "=== All tests completed ==="

Validation Criteria

┌────────────────────────────────────────────────────────────────────┐
│                    VALIDATION CHECKLIST                            │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  CORRECTNESS:                                                      │
│  ─────────────                                                     │
│  [ ] Reports known leaks accurately                                │
│  [ ] No false positives (reports correct programs as having leaks) │
│  [ ] Handles malloc/calloc/realloc/free correctly                 │
│  [ ] realloc(NULL, size) treated as malloc                        │
│  [ ] realloc(ptr, 0) treated as free                              │
│  [ ] free(NULL) is a no-op                                         │
│  [ ] Stack traces point to actual allocation site                 │
│                                                                     │
│  ROBUSTNESS:                                                       │
│  ────────────                                                      │
│  [ ] Handles double-free gracefully (warn, don't crash)           │
│  [ ] Works with multi-threaded programs                           │
│  [ ] No recursion overflow from internal allocations              │
│  [ ] Handles programs that don't call malloc                      │
│  [ ] Works with programs that fork()                              │
│                                                                     │
│  USABILITY:                                                        │
│  ───────────                                                       │
│  [ ] Clear, readable output format                                 │
│  [ ] Leaks sorted by size or grouped by call site                 │
│  [ ] Summary statistics (total allocs, frees, leaks)              │
│  [ ] Works with any program without recompilation                  │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Common Pitfalls and Debugging

Pitfall 1: Infinite Recursion

┌────────────────────────────────────────────────────────────────────┐
│  SYMPTOM: Stack overflow or hang on first malloc                   │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  CAUSE: Your malloc calls something that calls malloc             │
│                                                                     │
│  Common triggers:                                                   │
│  - dlsym() during initialization                                   │
│  - backtrace_symbols() in every allocation                        │
│  - fprintf() for debug output                                      │
│  - pthread_mutex_lock() (can allocate internally!)                │
│                                                                     │
│  FIX: Use recursion guard + bootstrap allocator:                   │
│                                                                     │
│  static __thread int inside_alloc = 0;                             │
│                                                                     │
│  void *malloc(size_t size) {                                        │
│      if (inside_alloc)                                              │
│          return real_malloc(size);  /* Skip tracking */            │
│      inside_alloc = 1;                                              │
│      /* ... tracking code ... */                                   │
│      inside_alloc = 0;                                              │
│  }                                                                  │
│                                                                     │
│  VERIFICATION:                                                      │
│  - Add debug output BEFORE and AFTER tracking                      │
│  - Use write() instead of printf() for safe output                │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Pitfall 2: dlsym Returns NULL

┌────────────────────────────────────────────────────────────────────┐
│  SYMPTOM: dlsym(RTLD_NEXT, "malloc") returns NULL                 │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  CAUSES:                                                           │
│  1. Library not loaded with LD_PRELOAD                            │
│  2. Symbol name misspelled                                         │
│  3. Missing #define _GNU_SOURCE before includes                   │
│                                                                     │
│  FIX:                                                              │
│                                                                     │
│  #define _GNU_SOURCE  /* MUST be before ANY includes! */          │
│  #include <dlfcn.h>                                                 │
│  #include <stdio.h>                                                 │
│                                                                     │
│  /* Check and report errors: */                                    │
│  real_malloc = dlsym(RTLD_NEXT, "malloc");                         │
│  if (!real_malloc) {                                                │
│      fprintf(stderr, "dlsym error: %s\n", dlerror());              │
│      _exit(1);                                                      │
│  }                                                                  │
│                                                                     │
│  VERIFICATION:                                                      │
│  - Check LD_PRELOAD is set correctly                               │
│  - Try running: LD_DEBUG=bindings LD_PRELOAD=./lib.so ./program   │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Pitfall 3: Thread Safety Issues

┌────────────────────────────────────────────────────────────────────┐
│  SYMPTOM: Crashes or corrupted data with multi-threaded programs  │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  CAUSES:                                                           │
│  1. Hash table modified by multiple threads simultaneously        │
│  2. Global counters modified without atomics                       │
│  3. Recursion guard is not thread-local                           │
│                                                                     │
│  FIXES:                                                            │
│                                                                     │
│  /* Make recursion guard thread-local */                           │
│  static __thread int inside_alloc = 0;                             │
│                                                                     │
│  /* Protect hash table with mutex */                               │
│  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;          │
│                                                                     │
│  void track_alloc(...) {                                            │
│      pthread_mutex_lock(&lock);                                     │
│      /* ... modify hash table ... */                               │
│      pthread_mutex_unlock(&lock);                                   │
│  }                                                                  │
│                                                                     │
│  /* Or use atomic operations for counters */                       │
│  __atomic_add_fetch(&total_allocations, 1, __ATOMIC_RELAXED);     │
│                                                                     │
│  WARNING: pthread_mutex_lock may itself call malloc!               │
│  Ensure recursion guard is checked BEFORE acquiring lock.         │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Pitfall 4: Wrong Stack Trace

┌────────────────────────────────────────────────────────────────────┐
│  SYMPTOM: Stack traces are wrong or missing symbols               │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  CAUSES:                                                           │
│  1. Target program compiled without -g -rdynamic                  │
│  2. backtrace() called at wrong point (too many wrapper frames)   │
│  3. Frame pointers optimized away                                  │
│                                                                     │
│  FIXES:                                                            │
│                                                                     │
│  /* Compile target with: */                                        │
│  gcc -g -rdynamic -fno-omit-frame-pointer program.c               │
│                                                                     │
│  /* Skip wrapper frames when printing: */                          │
│  for (int i = 2; i < st->num_frames; i++) {  /* Skip 2 frames */  │
│      /* frame 0: capture_stack() */                                │
│      /* frame 1: your malloc() */                                  │
│      /* frame 2+: actual caller */                                 │
│      print_frame(st->frames[i]);                                   │
│  }                                                                  │
│                                                                     │
│  /* For production, use addr2line: */                              │
│  $ addr2line -e ./program -f 0x555555555335                        │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Pitfall 5: macOS Differences

┌────────────────────────────────────────────────────────────────────┐
│  SYMPTOM: Works on Linux, fails on macOS                          │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DIFFERENCES:                                                      │
│  1. Use DYLD_INSERT_LIBRARIES instead of LD_PRELOAD               │
│  2. Library extension is .dylib, not .so                          │
│  3. Need DYLD_FORCE_FLAT_NAMESPACE=1 for interposition            │
│  4. SIP may block interposition for system binaries               │
│                                                                     │
│  macOS USAGE:                                                      │
│                                                                     │
│  # Build                                                            │
│  gcc -dynamiclib -o libleakcheck.dylib leakcheck.c                │
│                                                                     │
│  # Run (need both environment variables)                           │
│  DYLD_INSERT_LIBRARIES=./libleakcheck.dylib \                     │
│  DYLD_FORCE_FLAT_NAMESPACE=1 \                                     │
│  ./my_program                                                       │
│                                                                     │
│  # Note: May not work with SIP-protected binaries                 │
│  # Test with your own compiled programs first                      │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Extensions and Challenges

Beginner Extensions

  • Allocation statistics: Track peak memory usage, allocation counts by size
  • Environment variable control: LEAKCHECK_ENABLED, LEAKCHECK_OUTPUT_FILE
  • Grouped output: Show “N allocations from same call site” instead of N separate entries
  • Timestamp tracking: Record when each allocation was made

Intermediate Extensions

  • Double-free detection: Detect and report when memory is freed twice
  • Use-after-free detection: Mark freed memory as poisoned, detect access
  • Allocation lifetime stats: Track how long memory lives before being freed
  • JSON output: Emit structured data for CI/CD integration
  • Sampling mode: Only track every Nth allocation for lower overhead

Advanced Extensions

  • Shadow memory: Like AddressSanitizer, track validity of each byte
  • Heap profiler: Track memory over time, generate flamegraph data
  • Lock-free tracking: Use atomic operations instead of mutexes
  • Per-thread arenas: Each thread gets its own tracking arena, merge at exit
  • Call graph analysis: Build allocation graph showing which functions allocate most

Detection Extensions

┌────────────────────────────────────────────────────────────────────┐
│                    ADVANCED DETECTION                              │
├────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DOUBLE-FREE DETECTION:                                            │
│  ──────────────────────                                            │
│                                                                     │
│  void free(void *ptr) {                                             │
│      AllocationRecord *rec = track_free(ptr);                      │
│      if (!rec && ptr != NULL) {                                    │
│          /* Not in our table - either double-free or not tracked */│
│          fprintf(stderr, "WARNING: free(%p) - not allocated!\n",   │
│                  ptr);                                              │
│          print_current_stack();  /* Where the bad free happened */ │
│          return;  /* Don't call real_free to avoid crash */        │
│      }                                                              │
│      real_free(ptr);                                                │
│  }                                                                  │
│                                                                     │
│                                                                     │
│  USE-AFTER-FREE DETECTION:                                         │
│  ─────────────────────────                                         │
│                                                                     │
│  On free(ptr):                                                      │
│    1. Fill memory with poison pattern (0xDEADBEEF)                │
│    2. Keep record with "freed" flag                                │
│    3. Delay actual free (quarantine)                               │
│                                                                     │
│  On malloc(size):                                                   │
│    1. Check if memory matches poison pattern                       │
│    2. If yes, likely use-after-free (someone wrote to freed mem)  │
│                                                                     │
│  This is a simplified version of what AddressSanitizer does!      │
│                                                                     │
└────────────────────────────────────────────────────────────────────┘

Books That Will Help

Topic Book Chapter/Section
Library interposition CS:APP 3rd Ed Chapter 7.13 “Library Interpositioning”
Dynamic linking CS:APP 3rd Ed Chapter 7.6-7.12 “Linking”
Heap and malloc CS:APP 3rd Ed Chapter 9.9 “Dynamic Memory Allocation”
Memory layout CS:APP 3rd Ed Chapter 9.7 “Memory Mapping”
Stack frames CS:APP 3rd Ed Chapter 3.7 “Procedures”
Shared libraries The Linux Programming Interface Chapter 41-42 “Shared Libraries”
Memory allocation The Linux Programming Interface Chapter 7 “Memory Allocation”
Thread safety The Linux Programming Interface Chapter 30-31 “Threads”
Dynamic loading Advanced Programming in the Unix Environment Chapter 7
Process memory Understanding the Linux Kernel Chapter 9

Self-Assessment Checklist

Understanding

  • I can explain how LD_PRELOAD changes symbol resolution
  • I understand why RTLD_NEXT is necessary to call the real malloc
  • I can describe the bootstrap problem and how to solve it
  • I understand why thread-local storage is needed for recursion guards
  • I can explain how backtrace() captures the call stack
  • I understand the trade-offs between different tracking data structures

Implementation

  • My library loads correctly with LD_PRELOAD
  • malloc/calloc/realloc/free are all interposed
  • The bootstrap allocator prevents initialization recursion
  • Stack traces are captured and printed correctly
  • The hash table correctly tracks allocations and frees
  • The leak report runs at program exit
  • Multi-threaded programs work correctly

Testing

  • Simple leak is detected with correct size and stack trace
  • Clean programs report no leaks (no false positives)
  • Multiple leaks from same location are handled
  • realloc() is tracked correctly
  • Multi-threaded allocations are tracked correctly
  • Performance overhead is acceptable (<5x)

Growth

  • I debugged at least one recursion issue
  • I understand how Valgrind and ASan compare to my approach
  • I can discuss the design trade-offs in an interview
  • I implemented at least one extension (double-free, etc.)

Submission / Completion Criteria

Minimum Viable Completion:

  • Library loads and interposes malloc/free
  • Tracks allocations with pointer and size
  • Reports leaks on program exit
  • Works for single-threaded programs

Full Completion:

  • All of the above plus calloc/realloc
  • Stack trace capture working
  • Thread-safe implementation
  • No false positives on clean programs
  • Clear, organized output format

Excellence (Going Above & Beyond):

  • Double-free detection
  • Grouped output by call site
  • JSON output option
  • Performance benchmarks
  • Comparison with Valgrind output
  • Use-after-free detection

This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.