Project 3: Memory Leak Detector

The Core Question: “Who owns this memory, and when does it become invalid?”

Project Overview

Attribute	Value
Difficulty	Intermediate
Time Estimate	1-2 weeks
Language	C
Prerequisites	Comfort with structs, linked lists, macros
Main Book	“Understanding and Using C Pointers” by Richard Reese

Learning Objectives

By completing this project, you will:

Understand object lifetime - When memory is “alive” vs “dead”
Master ownership semantics - Who is responsible for calling free()
Detect common memory bugs - Leaks, double-free, use-after-free
Build debugging infrastructure - Track allocations with file/line info
Think like a tool author - Understand how Valgrind/ASan work

Theoretical Foundation

The Three Pointer States

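For this project, it helps to treat every pointer in C as being in one of three states (an uninitialized pointer holds an indeterminate value and should be treated like a dangling one, so always initialize):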

┌────────────────────────────────────────────────────────────────┐
│ POINTER STATES                                                  │
├────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. NULL                                                         │
│     int *p = NULL;                                              │
│     - Explicitly points to nothing                              │
│     - Safe to check: if (p == NULL)                            │
│     - Dereferencing crashes (usually)                          │
│                                                                  │
│  2. VALID                                                        │
│     int *p = malloc(sizeof(int));                               │
│     - Points to allocated, live memory                          │
│     - Safe to dereference: *p = 42;                            │
│     - Owner must eventually free                                │
│                                                                  │
│  3. DANGLING (INVALID)                                          │
│     int *p = malloc(sizeof(int));                               │
│     free(p);                                                    │
│     // p is now DANGLING - still contains old address           │
│     // but memory is no longer valid                            │
│     *p = 42;  // UNDEFINED BEHAVIOR!                            │
│                                                                  │
└────────────────────────────────────────────────────────────────┘

The danger of dangling pointers: They look exactly like valid pointers. There’s no runtime check. The old address is still there—it’s just that the memory at that address is no longer yours.
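One common defensive habit is to reset a pointer to NULL immediately after freeing it, which turns a dangling pointer back into a checkable NULL pointer. The sketch below illustrates that idea under simple assumptions: the helper names free_and_null and demo_free_and_null are hypothetical, and the helper only protects the one variable you pass in, not other copies of the same address.

```c
#include <stdlib.h>

// Hypothetical helper: frees *pp and resets it to NULL, moving the
// caller's variable from the DANGLING state to the checkable NULL state.
// Other copies of the same address are NOT updated.
void free_and_null(int **pp) {
    if (pp == NULL) return;
    free(*pp);      // free(NULL) is a no-op, so a second call is harmless
    *pp = NULL;
}

// Returns 1 if the pointer is NULL after being released twice.
int demo_free_and_null(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) return 0;
    *p = 42;
    free_and_null(&p);  // p is now NULL, not dangling
    free_and_null(&p);  // safe: this ends up calling free(NULL)
    return p == NULL;
}
```

This does not replace the detector: any alias of the freed address still dangles, which is exactly the case the tracking registry is meant to catch.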

Object Lifetime

Every piece of dynamically allocated memory has a lifetime:

Timeline:
─────────────────────────────────────────────────────────────────>
        │                              │
        │  malloc()                    │  free()
        │  ↓                           │  ↓
        │  ╔══════════════════════════╗│
        │  ║    VALID LIFETIME        ║│
        │  ║    Safe to use pointer   ║│
        │  ╚══════════════════════════╝│
        │                              │
   ─────┼──────────────────────────────┼───────────────────────────>
     BEFORE                          AFTER
   (unallocated)                   (freed - DANGLING!)

Memory Bugs: The Rogues Gallery

Bug 1: Memory Leak

void leak() {
    int *p = malloc(1000);
    // Function returns without free(p)
    // Those 1000 bytes are now UNREACHABLE but still ALLOCATED
}

// Call leak() 1000 times → 1MB of leaked memory

Symptoms: Memory usage grows over time. Long-running programs eventually crash with OOM.

Bug 2: Double-Free

int *p = malloc(sizeof(int));
free(p);
free(p);  // BUG! p was already freed

// Consequences:
// - Heap metadata corruption
// - Security vulnerability (exploitation possible)
// - Crash (if lucky)

Bug 3: Use-After-Free

int *p = malloc(sizeof(int));
*p = 42;
free(p);

// Later...
printf("%d\n", *p);  // BUG! Reading freed memory

// Consequences:
// - Reads garbage (old value might still be there!)
// - Reads new data (if memory was reused)
// - Security vulnerability (attacker controls reused memory)

Bug 4: Invalid Free

int x = 42;
free(&x);  // BUG! Can't free stack memory

char *str = "hello";
free(str);  // BUG! Can't free string literals

Why Bugs Appear Later Than Mistakes

This is the critical insight that makes memory bugs so hard:

int *p = malloc(sizeof(int));
*p = 42;
free(p);

// p is now dangling, but...
// The memory at *p might still contain 42!
// The heap hasn't reused it yet.

printf("%d\n", *p);  // Might print 42! "Works!"

// Much later, in unrelated code:
int *q = malloc(sizeof(int));
*q = 100;
// q might get the same address as p was!

printf("%d\n", *p);  // NOW prints 100 (or crashes)

The bug (use-after-free) was committed when we used *p after free(p). But the symptom only appeared when the memory was reused.

Project Specification

What You’re Building

A wrapper around malloc/free that:

Tracks all allocations - Registry of (address, size, file, line)
Detects memory leaks - Report unfreed allocations at program exit
Catches double-free - Error if freeing already-freed pointer
Warns about use-after-free - Poison freed memory to make bugs visible
Provides debugging info - Show WHERE the allocation happened

Core API

// Macros that capture file/line
#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)

// Underlying functions
void* debug_malloc(size_t size, const char* file, int line);
void debug_free(void* ptr, const char* file, int line);

// Reporting
void leak_check_report(void);    // Call at program exit
size_t get_active_allocations(void);  // Currently allocated count
size_t get_total_allocated_bytes(void);

Sample Output

$ ./my_program

=== Memory Leak Detector Report ===
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
       Address: 0x600000004000
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
       Address: 0x600000004100
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
       Address: 0x600000004200

Total leaks: 3 allocations, 300 bytes

Solution Architecture

Module Design

leak_detector/
├── leak_detector.h     # Public API and macros
├── leak_detector.c     # Implementation
├── test_leak.c         # Memory leak test
├── test_double_free.c  # Double-free test
├── test_use_after_free.c # Use-after-free test
├── test_clean.c        # Clean program (no leaks)
└── Makefile

Data Structures

// Track one allocation
typedef struct Allocation {
    void *ptr;              // The allocated address
    size_t size;            // Size in bytes
    const char *file;       // Source file
    int line;               // Line number
    int freed;              // 0 = active, 1 = freed
    struct Allocation *next; // Linked list pointer
} Allocation;

// Global registry
static Allocation *registry_head = NULL;
static int initialized = 0;

Core Algorithm

debug_malloc(size, file, line):
Call real malloc(size)
Create Allocation record with (ptr, size, file, line, freed=0)
Add to registry
Return ptr

debug_free(ptr, file, line):
Find ptr in registry
If not found: WARNING - freeing untracked memory
If found and freed == 1: ERROR - double free!
Mark freed = 1
Fill memory with poison pattern (the byte 0xDE repeated, as in Phase 4)
Call real free(ptr)

leak_check_report():
Walk registry
For each entry where freed == 0: report as leak
Print summary statistics

Implementation Guide

Phase 1: Basic Tracking (3-4 hours)

Goal: Track allocations in a linked list.

// leak_detector.c

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

static Allocation *head = NULL;

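// The macros in leak_detector.h are function-like, so the bare names
// malloc and free (without parentheses) still refer to the libc
// functions. Capture them here so the detector can always reach them.
static void* (*real_malloc)(size_t) = malloc;
static void (*real_free)(void*) = free;

void init_leak_detector(void) {
    // Nothing to set up for the macro-based approach: real_malloc and
    // real_free are bound above.
    // (An LD_PRELOAD-style interposer would look them up with dlsym.)
}

void* debug_malloc(size_t size, const char* file, int line) {
    // Allocate the requested memory with the real malloc
    void *ptr = real_malloc(size);
    if (ptr == NULL) return NULL;

    // Create tracking record (a separate allocation that is not tracked)
    Allocation *record = real_malloc(sizeof(Allocation));
    if (record == NULL) {
        // Cannot track this block, so fail the allocation cleanly
        real_free(ptr);
        return NULL;
    }
    record->ptr = ptr;
    record->size = size;
    record->file = file;
    record->line = line;
    record->freed = 0;
    record->next = head;  // Prepend, so the newest record for an address is found first
    head = record;

    return ptr;
}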

Phase 2: Leak Detection (2-3 hours)

Goal: Report unfreed allocations at exit.

void leak_check_report(void) {
    int leak_count = 0;
    size_t leak_bytes = 0;

    printf("\n=== Memory Leak Detector Report ===\n");

    for (Allocation *a = head; a != NULL; a = a->next) {
        if (!a->freed) {
            printf("[LEAK] %zu bytes allocated at %s:%d never freed\n",
                   a->size, a->file, a->line);
            printf("       Address: %p\n", a->ptr);
            leak_count++;
            leak_bytes += a->size;
        }
    }

    if (leak_count == 0) {
        printf("No memory leaks detected!\n");
    } else {
        printf("\nTotal leaks: %d allocations, %zu bytes\n",
               leak_count, leak_bytes);
    }
}

// Automatically run at program exit (GCC/Clang extension;
// atexit(leak_check_report) is a portable alternative)
__attribute__((destructor))
void auto_leak_report(void) {
    leak_check_report();
}
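The Core API declares get_active_allocations and get_total_allocated_bytes, but the guide does not show them. One possible sketch walks the same list as the report. It assumes that "total allocated bytes" means bytes in blocks that are still live, which is an interpretation rather than something the specification states. The registry_push helper is hypothetical and exists only so the counters can be exercised in isolation.

```c
#include <stddef.h>

// Same record layout as in Phase 1
typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

static Allocation *head = NULL;  // registry head, as in Phase 1

// Hypothetical helper: prepend an already-filled record to the registry.
void registry_push(Allocation *record) {
    record->next = head;
    head = record;
}

// Number of tracked blocks that have not been freed yet.
size_t get_active_allocations(void) {
    size_t count = 0;
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (!a->freed) count++;
    }
    return count;
}

// Bytes held by live blocks (assumed meaning of "total allocated").
size_t get_total_allocated_bytes(void) {
    size_t bytes = 0;
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (!a->freed) bytes += a->size;
    }
    return bytes;
}
```

Walking the list keeps the counters trivially consistent with the report; running totals updated in debug_malloc and debug_free would be faster for large registries.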

Phase 3: Double-Free Detection (2-3 hours)

Goal: Catch attempts to free already-freed memory.

void debug_free(void* ptr, const char* file, int line) {
    if (ptr == NULL) return;  // free(NULL) is valid and does nothing

    // Find this pointer in our registry
    Allocation *found = NULL;
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (a->ptr == ptr) {
            found = a;
            break;
        }
    }

    if (found == NULL) {
        fprintf(stderr, "[WARNING] Freeing untracked pointer %p at %s:%d\n",
                ptr, file, line);
        fprintf(stderr, "          This may be a bug or memory not allocated via debug_malloc\n");
        free(ptr);
        return;
    }

    if (found->freed) {
        // Report on stderr: it is unbuffered, and abort() is not
        // required to flush stdout, so a printf here could be lost
        fprintf(stderr, "[ERROR] DOUBLE-FREE DETECTED!\n");
        fprintf(stderr, "  Pointer: %p\n", ptr);
        fprintf(stderr, "  Size: %zu bytes\n", found->size);
        fprintf(stderr, "  Originally allocated at: %s:%d\n", found->file, found->line);
        fprintf(stderr, "  First freed at: (tracking not implemented yet)\n");
        fprintf(stderr, "  Second free attempted at: %s:%d\n", file, line);
        fprintf(stderr, "\n*** Aborting to prevent heap corruption ***\n");
        abort();
    }

    found->freed = 1;
    free(ptr);
}

Phase 4: Use-After-Free Detection (2-3 hours)

Goal: Poison freed memory to make UAF bugs visible.

void debug_free(void* ptr, const char* file, int line) {
    // ... (previous checks) ...

    // Before freeing, fill with poison pattern
    // This makes use-after-free more likely to be noticed
    memset(ptr, 0xDE, found->size);

    found->freed = 1;
    free(ptr);
}

Why poisoning helps:

If code reads freed memory and gets 0xDEDEDEDE, it’s obviously wrong
Much easier to debug than reading “old valid data”
Pattern 0xDE is visible in debuggers
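The poisoning step can be tried in isolation before wiring it into debug_free. The sketch below uses hypothetical helper names (poison_block, is_poisoned, first_word) and checks the pattern while the block is still allocated, because reading it after free would itself be undefined behavior. Two caveats apply in practice: an optimizing compiler may remove a memset that is immediately followed by free, and the allocator may overwrite the first few bytes of a freed block with its own bookkeeping.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0xDE  // same byte value the guide uses

// Hypothetical helper: overwrite a block with the poison byte.
void poison_block(void *ptr, size_t size) {
    memset(ptr, POISON_BYTE, size);
}

// Returns 1 if every byte of the block holds the poison value.
int is_poisoned(const void *ptr, size_t size) {
    const unsigned char *bytes = ptr;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != POISON_BYTE) return 0;
    }
    return 1;
}

// Reads the first four bytes as a 32-bit value, the way buggy code
// reading a stale int or pointer field would see them.
uint32_t first_word(const void *ptr) {
    uint32_t value;
    memcpy(&value, ptr, sizeof value);
    return value;
}
```

A four-byte read from a poisoned block yields 0xDEDEDEDE regardless of byte order, which is what makes the pattern easy to spot in a debugger.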

Phase 5: Header File and Macros (1-2 hours)

// leak_detector.h

#ifndef LEAK_DETECTOR_H
#define LEAK_DETECTOR_H

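#include <stddef.h>
#include <stdlib.h>  // Must come before the macros below, so the libc
                     // declarations of malloc/free are not rewritten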

// Initialize (optional - auto-init on first malloc)
void init_leak_detector(void);

// Underlying functions
void* debug_malloc(size_t size, const char* file, int line);
void debug_free(void* ptr, const char* file, int line);

// Reporting
void leak_check_report(void);
size_t get_active_allocations(void);
size_t get_total_allocated_bytes(void);

// Macros to capture file/line automatically
#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)

#endif

Testing Strategy

Test 1: Memory Leak Detection

// test_leak.c
#include "leak_detector.h"

void create_user(void) {
    char *name = malloc(100);
    // Oops, forgot to free!
}

int main(void) {
    for (int i = 0; i < 3; i++) {
        create_user();
    }
    return 0;
}

Expected Output:

=== Memory Leak Detector Report ===
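[LEAK] 100 bytes allocated at test_leak.c:5 never freed
       Address: 0x... (varies per run)
[LEAK] 100 bytes allocated at test_leak.c:5 never freed
       Address: 0x... (varies per run)
[LEAK] 100 bytes allocated at test_leak.c:5 never freed
       Address: 0x... (varies per run)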

Total leaks: 3 allocations, 300 bytes

Test 2: Double-Free Detection

// test_double_free.c
#include "leak_detector.h"

int main(void) {
    int *p = malloc(sizeof(int));
    *p = 42;
    free(p);
    free(p);  // Should trigger error!
    return 0;
}

Test 3: Clean Program

// test_clean.c
#include <string.h>  // strcpy
#include "leak_detector.h"

int main(void) {
    int *p = malloc(sizeof(int));
    *p = 42;
    free(p);  // Properly freed!

    char *str = malloc(100);
    strcpy(str, "Hello");
    free(str);  // Properly freed!

    return 0;
}

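Expected Output (the two totals assume you have added the counters from Challenge 2; the Phase 2 code prints only the first two lines):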

=== Memory Leak Detector Report ===
No memory leaks detected!
Total allocations: 2
Total frees: 2

Common Pitfalls and Debugging Tips

Pitfall 1: Infinite Recursion

// WRONG - if leak_detector.c is compiled with the malloc macro visible,
// debug_malloc calls malloc, which expands back into debug_malloc...
// (The simplest fix is to keep the macros out of this file.)
void* debug_malloc(size_t size, const char* file, int line) {
    Allocation *record = malloc(sizeof(Allocation));  // RECURSION!
    // ...
}

// Solution 1: Use a flag to detect recursion
static int in_debug_malloc = 0;

void* debug_malloc(size_t size, const char* file, int line) {
    if (in_debug_malloc) {
        return real_malloc(size);  // Bypass tracking
    }
    in_debug_malloc = 1;
    void *ptr = NULL;
    // ... allocate ptr and do tracking ...
    in_debug_malloc = 0;
    return ptr;
}

// Solution 2: Allocate tracking records from separate pool
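Solution 2 can be sketched with a fixed-size static array, so that creating a tracking record never calls malloc at all. This is a minimal illustration under stated assumptions: POOL_CAPACITY, pool_alloc_record, and pool_records_used are hypothetical names, records are never returned to the pool, and the capacity limit is arbitrary.

```c
#include <stddef.h>

// Same record layout as in Phase 1
typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

#define POOL_CAPACITY 1024  // hypothetical fixed limit on tracked blocks

static Allocation pool[POOL_CAPACITY];  // lives in static storage, not the heap
static size_t pool_used = 0;

// Hands out the next unused record, or NULL once the pool is exhausted.
// No malloc call is involved, so recursion into debug_malloc is impossible.
Allocation *pool_alloc_record(void) {
    if (pool_used == POOL_CAPACITY) return NULL;
    return &pool[pool_used++];
}

// Number of records handed out so far.
size_t pool_records_used(void) {
    return pool_used;
}
```

When pool_alloc_record returns NULL, debug_malloc has to choose between failing the allocation and returning the block untracked; either is defensible as long as the report says that tracking was incomplete.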

Pitfall 2: Thread Safety

The simple linked list isn’t thread-safe. For multi-threaded programs:

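Use a mutex around registry operations
Or use per-thread registries in thread-local storage (but then a block freed by a different thread than the one that allocated it will not be found)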
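The mutex option might look like the following sketch, which assumes POSIX threads are available (build with -pthread). The function names registry_add and registry_find are hypothetical. Note that registry_find returns the record after dropping the lock, so a real implementation would need to keep the lock held while debug_free inspects and updates the record.

```c
#include <pthread.h>
#include <stddef.h>

// Same record layout as in Phase 1
typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

static Allocation *head = NULL;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

// Prepend a record while holding the lock, so two threads cannot
// both read the old head and lose one of the insertions.
void registry_add(Allocation *record) {
    pthread_mutex_lock(&registry_lock);
    record->next = head;
    head = record;
    pthread_mutex_unlock(&registry_lock);
}

// Look up a pointer while holding the lock; returns NULL if untracked.
Allocation *registry_find(void *ptr) {
    pthread_mutex_lock(&registry_lock);
    Allocation *a = head;
    while (a != NULL && a->ptr != ptr) {
        a = a->next;
    }
    pthread_mutex_unlock(&registry_lock);
    return a;
}
```

A single global lock serializes every allocation, which is acceptable for a debugging tool but would be a bottleneck in a production allocator.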

Pitfall 3: Missing Some Allocations

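Some code might call malloc before your detector initializes, and code compiled without the macros (library functions such as strdup, for example) bypasses it entirely:

Use constructor attribute: __attribute__((constructor)) to initialize before main runs
Or intercept at link time with the GNU linker's --wrap option (for example -Wl,--wrap=malloc)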
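The constructor attribute is a GCC/Clang extension that runs a function before main. A minimal sketch of using it for initialization follows; detector_startup and detector_is_ready are hypothetical names, and the body stands in for whatever setup your detector needs.

```c
// Set to 1 once the detector's startup code has run.
static int detector_ready = 0;

// GCC/Clang extension: this function runs before main() is entered,
// so the detector is initialized before any code in main allocates.
__attribute__((constructor))
void detector_startup(void) {
    detector_ready = 1;
}

// Lets other code check whether startup has already happened.
int detector_is_ready(void) {
    return detector_ready;
}
```

The order in which constructors from different object files and libraries run is not something to rely on here, so allocations made by other constructors may still be missed.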

Extensions and Challenges

Challenge 1: Track Free Location

typedef struct Allocation {
    // ... existing fields ...
    const char *free_file;  // Where it was freed
    int free_line;
} Allocation;
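With those two fields in place, the "First freed at" line in the double-free report can be filled in. The sketch below shows one way to do it; mark_freed and describe_first_free are hypothetical helpers, meant to be called from debug_free on a normal free and from the double-free branch respectively.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Phase 1 record extended with the location of the free
typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    const char *free_file;  // Where it was freed
    int free_line;
    struct Allocation *next;
} Allocation;

// Record where the block was released (call on the first, valid free).
void mark_freed(Allocation *rec, const char *file, int line) {
    rec->freed = 1;
    rec->free_file = file;
    rec->free_line = line;
}

// Formats the "First freed at" line of the double-free report into buf.
// Returns the value of snprintf (the length that was needed).
int describe_first_free(const Allocation *rec, char *buf, size_t buflen) {
    return snprintf(buf, buflen, "  First freed at: %s:%d",
                    rec->free_file, rec->free_line);
}
```

Storing the __FILE__ pointer directly is safe because string literals have static storage duration, so no copy is needed.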

Challenge 2: Memory Usage Statistics

typedef struct {
    size_t total_allocated;
    size_t total_freed;
    size_t peak_usage;
    int allocation_count;
    int free_count;
} MemoryStats;

MemoryStats get_memory_stats(void);
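One way to maintain these statistics is to update them from two hooks, one called on each successful allocation and one on each free. The hook names stats_on_alloc and stats_on_free are hypothetical; peak usage is taken here to mean the largest number of live bytes seen so far.

```c
#include <stddef.h>

typedef struct {
    size_t total_allocated;
    size_t total_freed;
    size_t peak_usage;
    int allocation_count;
    int free_count;
} MemoryStats;

static MemoryStats stats;  // static storage: all fields start at zero

// Call from debug_malloc after a successful allocation of `size` bytes.
void stats_on_alloc(size_t size) {
    stats.total_allocated += size;
    stats.allocation_count++;
    // Live bytes = everything allocated minus everything freed
    size_t live = stats.total_allocated - stats.total_freed;
    if (live > stats.peak_usage) {
        stats.peak_usage = live;
    }
}

// Call from debug_free with the size stored in the tracking record.
void stats_on_free(size_t size) {
    stats.total_freed += size;
    stats.free_count++;
}

// Returns a snapshot copy of the current statistics.
MemoryStats get_memory_stats(void) {
    return stats;
}
```

For example, allocating 100 and 50 bytes, freeing the 100, then allocating 20 leaves 70 bytes live with a recorded peak of 150.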

Challenge 3: Quarantine for UAF Detection

Instead of immediately freeing, keep freed memory in a “quarantine”:

#define QUARANTINE_SIZE 100

static void* quarantine[QUARANTINE_SIZE];
static int quarantine_index = 0;

void debug_free(void* ptr, ...) {
    // Add to quarantine instead of immediate free
    if (quarantine[quarantine_index] != NULL) {
        real_free(quarantine[quarantine_index]);
    }
    quarantine[quarantine_index] = ptr;
    quarantine_index = (quarantine_index + 1) % QUARANTINE_SIZE;
}

This delays reuse, making UAF bugs more likely to crash (and be detected).
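A self-contained variant of the quarantine ring is sketched below, with two additions that are assumptions rather than part of the challenge text: a counter of blocks actually released, and a quarantine_flush function to empty the ring before the exit report. The quarantine is deliberately tiny here so its behavior is easy to follow.

```c
#include <stdlib.h>

#define QUARANTINE_SIZE 4  // deliberately tiny for the example

static void *quarantine[QUARANTINE_SIZE];  // zero-initialized: all slots empty
static int quarantine_index = 0;
static int released_count = 0;             // blocks actually handed to free()

// Park ptr in the ring. The block that previously occupied the slot
// (parked QUARANTINE_SIZE calls ago) is released for real.
void quarantine_free(void *ptr) {
    if (ptr == NULL) return;
    if (quarantine[quarantine_index] != NULL) {
        free(quarantine[quarantine_index]);
        released_count++;
    }
    quarantine[quarantine_index] = ptr;
    quarantine_index = (quarantine_index + 1) % QUARANTINE_SIZE;
}

// Release everything still parked, e.g. just before the exit report.
void quarantine_flush(void) {
    for (int i = 0; i < QUARANTINE_SIZE; i++) {
        if (quarantine[i] != NULL) {
            free(quarantine[i]);
            quarantine[i] = NULL;
            released_count++;
        }
    }
}

// Number of blocks that have really been freed so far.
int quarantine_released(void) {
    return released_count;
}
```

The trade-off is memory: up to QUARANTINE_SIZE freed blocks stay allocated at any time, so a byte budget may be a better limit than a slot count when block sizes vary widely.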

Challenge 4: Stack Trace Capture

On Linux/macOS, capture the call stack at allocation time:

#include <execinfo.h>

typedef struct Allocation {
    // ... existing fields ...
    void* backtrace[10];
    int backtrace_size;
} Allocation;
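Filling those fields could look like the sketch below, which assumes a platform that provides execinfo.h (glibc and macOS do; some other C libraries do not). StackInfo, capture_stack, and print_stack are hypothetical names; in the detector the two fields would live inside Allocation as shown above. On glibc, the first call to backtrace may itself allocate memory, so the recursion guard from Pitfall 1 matters here.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 10

// Stand-in for the two backtrace fields added to Allocation
typedef struct {
    void *backtrace[MAX_FRAMES];
    int backtrace_size;
} StackInfo;

// Capture up to MAX_FRAMES return addresses of the current call stack.
void capture_stack(StackInfo *info) {
    info->backtrace_size = backtrace(info->backtrace, MAX_FRAMES);
}

// Print the captured frames to stderr. backtrace_symbols_fd writes
// straight to the file descriptor instead of returning a malloc'd array.
void print_stack(const StackInfo *info) {
    backtrace_symbols_fd(info->backtrace, info->backtrace_size, STDERR_FILENO);
}
```

Function names in the output are only as good as the symbol information available; with GCC on Linux, linking with -rdynamic usually improves them.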

Real-World Connections

Connection 1: How Valgrind Works

Valgrind uses similar techniques:

Intercepts malloc/free calls
Maintains shadow memory to track validity
Reports errors with stack traces

Your detector is a simplified version of Valgrind’s Memcheck.

Connection 2: AddressSanitizer

ASan (AddressSanitizer) uses compiler instrumentation:

Adds checks around every memory access
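Much faster than Valgrind (typically around 2x slowdown versus roughly 10-50x for Memcheck)
Catches some bugs Memcheck misses (stack and global buffer overflows), though it does not report uninitialized reads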

Connection 3: Production Memory Allocators

Production allocators like jemalloc and tcmalloc:

Track statistics
Detect some corruption
Profile memory usage

Interview Questions You Can Now Answer

“What is a memory leak? How do you detect them?”
- Memory allocated but never freed
- Detect by tracking allocations and checking at exit
“What is use-after-free? Why is it dangerous?”
- Accessing memory after it’s freed
- Dangerous because memory might be reused, leading to corruption or security bugs
“What is double-free? What can go wrong?”
- Calling free() twice on same pointer
- Corrupts heap allocator metadata, can lead to exploitation
“How does Valgrind detect memory errors?”
- Intercepts malloc/free
- Tracks every byte’s validity state
- Checks every memory access
“What’s the difference between a dangling pointer and a NULL pointer?”
- NULL: explicitly points to nothing, checkable
- Dangling: points to freed memory, looks valid but isn’t
“If you free memory, why can you sometimes still read from it?”
- Memory isn’t erased, just marked as available
- Old data remains until overwritten

Resources

Books

“Understanding and Using C Pointers” by Richard Reese - Ch. 2
“Computer Systems: A Programmer’s Perspective” - Ch. 9.9
“Effective C” by Robert Seacord - Ch. 6

Tools

Valgrind
AddressSanitizer
LeakSanitizer (LSan)

Self-Assessment Checklist

Explain why freeing memory doesn’t zero it out
Describe why use-after-free sometimes “works” and sometimes crashes
Think about every malloc in terms of “who frees this and when”
Implement a basic leak detector with file/line tracking
Explain how poisoning freed memory helps debugging
Use Valgrind to find memory errors

Final Milestone: You think about every malloc in terms of “who frees this and when”.