Project 3: Memory Leak Detector
The Core Question: "Who owns this memory, and when does it become invalid?"
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1-2 weeks |
| Language | C |
| Prerequisites | Comfort with structs, linked lists, macros |
| Main Book | "Understanding and Using C Pointers" by Richard Reese |
Learning Objectives
By completing this project, you will:
- Understand object lifetime - When memory is "alive" vs "dead"
- Master ownership semantics - Who is responsible for calling free()
- Detect common memory bugs - Leaks, double-free, use-after-free
- Build debugging infrastructure - Track allocations with file/line info
- Think like a tool author - Understand how Valgrind/ASan work
Theoretical Foundation
The Three Pointer States
For this project, treat every pointer in C as being in one of three states (an uninitialized pointer counts as invalid, just like a dangling one):
----------------------------------------------------------------
                         POINTER STATES
----------------------------------------------------------------

  1. NULL
       int *p = NULL;
       - Explicitly points to nothing
       - Safe to check: if (p == NULL)
       - Dereferencing crashes (usually)

  2. VALID
       int *p = malloc(sizeof(int));
       - Points to allocated, live memory
       - Safe to dereference: *p = 42;
       - Owner must eventually free

  3. DANGLING (INVALID)
       int *p = malloc(sizeof(int));
       free(p);
       // p is now DANGLING - still contains old address
       // but memory is no longer valid
       *p = 42; // UNDEFINED BEHAVIOR!

----------------------------------------------------------------
The danger of dangling pointers: They look exactly like valid pointers. There's no runtime check. The old address is still there; it's just that the memory at that address is no longer yours.
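One common convention for making a dangling pointer detectable is to reset it to NULL right after freeing it. The sketch below assumes a hypothetical SAFE_FREE macro and is_usable helper (neither is part of the project API). It only protects the one variable you pass in; any other copies of the address still dangle.

```c
#include <stdlib.h>

/* Hypothetical helper: free the memory, then reset the pointer variable
 * to NULL so later checks can tell it no longer points at live memory. */
#define SAFE_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Under this convention a non-NULL pointer is assumed to be live.
 * This is a convention, not a guarantee: aliases are not reset. */
static int is_usable(const int *p) {
    return p != NULL;
}
```

Calling SAFE_FREE twice on the same variable is harmless, because the second call becomes free(NULL), which does nothing.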
Object Lifetime
Every piece of dynamically allocated memory has a lifetime:
Timeline:
------------------------------------------------------------------>
        malloc()                         free()
           |                               |
           v                               v
           +-------------------------------+
           |        VALID LIFETIME         |
           |      Safe to use pointer      |
           +-------------------------------+
-----------+-------------------------------+---------------------->
  BEFORE                                       AFTER
  (unallocated)                                (freed - DANGLING!)
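A lifetime is easier to reason about when the code states who ends it. The following is a minimal sketch of one way to document ownership transfer; make_greeting is a hypothetical function used only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string, or NULL on allocation failure.
 * Ownership transfers to the caller, who must call free() on the
 * result exactly once. That free() ends the lifetime shown above. */
char *make_greeting(const char *name) {
    const char *prefix = "Hello, ";
    size_t len = strlen(prefix) + strlen(name) + 1; /* +1 for '\0' */
    char *out = malloc(len);
    if (out == NULL) {
        return NULL;
    }
    snprintf(out, len, "%s%s", prefix, name);
    return out;
}
```

The comment above the function is the ownership contract: the lifetime starts inside make_greeting and ends wherever the caller frees the result.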
Memory Bugs: The Rogues Gallery
Bug 1: Memory Leak
void leak() {
int *p = malloc(1000);
// Function returns without free(p)
// Those 1000 bytes are now UNREACHABLE but still ALLOCATED
}
// Call leak() 1000 times -> about 1 MB of leaked memory
Symptoms: Memory usage grows over time. Long-running programs eventually crash with OOM.
Bug 2: Double-Free
int *p = malloc(sizeof(int));
free(p);
free(p); // BUG! p was already freed
// Consequences:
// - Heap metadata corruption
// - Security vulnerability (exploitation possible)
// - Crash (if lucky)
Bug 3: Use-After-Free
int *p = malloc(sizeof(int));
*p = 42;
free(p);
// Later...
printf("%d\n", *p); // BUG! Reading freed memory
// Consequences:
// - Reads garbage (old value might still be there!)
// - Reads new data (if memory was reused)
// - Security vulnerability (attacker controls reused memory)
Bug 4: Invalid Free
int x = 42;
free(&x); // BUG! Can't free stack memory
char *str = "hello";
free(str); // BUG! Can't free string literals
Why Bugs Appear Later Than Mistakes
This is the critical insight that makes memory bugs so hard:
int *p = malloc(sizeof(int));
*p = 42;
free(p);
// p is now dangling, but...
// The memory at *p might still contain 42!
// The heap hasn't reused it yet.
printf("%d\n", *p); // Might print 42! "Works!"
// Much later, in unrelated code:
int *q = malloc(sizeof(int));
*q = 100;
// q might get the same address as p was!
printf("%d\n", *p); // NOW prints 100 (or crashes)
The bug (use-after-free) was committed when we used *p after free(p). But the symptom only appeared when the memory was reused.
Project Specification
What You're Building
A wrapper around malloc/free that:
- Tracks all allocations - Registry of (address, size, file, line)
- Detects memory leaks - Report unfreed allocations at program exit
- Catches double-free - Error if freeing already-freed pointer
- Warns about use-after-free - Poison freed memory to make bugs visible
- Provides debugging info - Show WHERE the allocation happened
Core API
// Macros that capture file/line
#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)
// Underlying functions
void* debug_malloc(size_t size, const char* file, int line);
void debug_free(void* ptr, const char* file, int line);
// Reporting
void leak_check_report(void); // Call at program exit
size_t get_active_allocations(void); // Currently allocated count
size_t get_total_allocated_bytes(void);
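The macros work because __FILE__ and __LINE__ are expanded where the macro is used, not where the wrapped function is defined. A small sketch of that mechanism, using a hypothetical CallSite type and HERE macro that are not part of the project API:

```c
/* Hypothetical record of where a call was made. */
typedef struct {
    const char *file;
    int line;
} CallSite;

static CallSite make_call_site(const char *file, int line) {
    CallSite site = { file, line };
    return site;
}

/* The preprocessor substitutes __FILE__ and __LINE__ at each use of
 * HERE(), so the values describe the caller's location. */
#define HERE() make_call_site(__FILE__, __LINE__)
```

If make_call_site read __LINE__ inside its own body instead, every record would show the same line, which is why the capture has to happen in a macro.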
Sample Output
$ ./my_program
=== Memory Leak Detector Report ===
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
Address: 0x600000004000
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
Address: 0x600000004100
[LEAK] 100 bytes allocated at test.c:7 (create_user) never freed
Address: 0x600000004200
Total leaks: 3 allocations, 300 bytes
Solution Architecture
Module Design
leak_detector/
|-- leak_detector.h          # Public API and macros
|-- leak_detector.c          # Implementation
|-- test_leak.c              # Memory leak test
|-- test_double_free.c       # Double-free test
|-- test_use_after_free.c    # Use-after-free test
|-- test_clean.c             # Clean program (no leaks)
`-- Makefile
Data Structures
// Track one allocation
typedef struct Allocation {
void *ptr; // The allocated address
size_t size; // Size in bytes
const char *file; // Source file
int line; // Line number
int freed; // 0 = active, 1 = freed
struct Allocation *next; // Linked list pointer
} Allocation;
// Global registry
static Allocation *registry_head = NULL;
static int initialized = 0;
Core Algorithm
debug_malloc(size, file, line):
1. Call real malloc(size)
2. Create Allocation record with (ptr, size, file, line, freed=0)
3. Add to registry
4. Return ptr
debug_free(ptr, file, line):
1. Find ptr in registry
2. If not found: WARNING - freeing untracked memory
3. If found and freed == 1: ERROR - double free!
4. Mark freed = 1
5. Fill memory with a poison byte pattern (for example 0xDE repeated over the whole block)
6. Call real free(ptr)
leak_check_report():
1. Walk registry
2. For each entry where freed == 0: report as leak
3. Print summary statistics
Implementation Guide
Phase 1: Basic Tracking (3-4 hours)
Goal: Track allocations in a linked list.
// leak_detector.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct Allocation {
void *ptr;
size_t size;
const char *file;
int line;
int freed;
struct Allocation *next;
} Allocation;
static Allocation *head = NULL;
// Store the real malloc before we redefine it
static void* (*real_malloc)(size_t) = NULL;
static void (*real_free)(void*) = NULL;
void init_leak_detector(void) {
// Save references to real functions
// (This is simplified - in practice you'd use dlsym)
}
void* debug_malloc(size_t size, const char* file, int line) {
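// Allocate the requested memory. This file must not see the malloc/free
// macros from leak_detector.h, so malloc here is the real malloc.
void *ptr = malloc(size);
if (ptr == NULL) return NULL;
// Create tracking record; if that fails, release the block and report failure
Allocation *record = malloc(sizeof(Allocation));
if (record == NULL) { free(ptr); return NULL; }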
record->ptr = ptr;
record->size = size;
record->file = file;
record->line = line;
record->freed = 0;
record->next = head;
head = record;
return ptr;
}
Phase 2: Leak Detection (2-3 hours)
Goal: Report unfreed allocations at exit.
void leak_check_report(void) {
int leak_count = 0;
size_t leak_bytes = 0;
printf("\n=== Memory Leak Detector Report ===\n");
for (Allocation *a = head; a != NULL; a = a->next) {
if (!a->freed) {
printf("[LEAK] %zu bytes allocated at %s:%d never freed\n",
a->size, a->file, a->line);
printf(" Address: %p\n", a->ptr);
leak_count++;
leak_bytes += a->size;
}
}
if (leak_count == 0) {
printf("No memory leaks detected!\n");
} else {
printf("\nTotal leaks: %d allocations, %zu bytes\n",
leak_count, leak_bytes);
}
}
// Automatically run at program exit (GCC/Clang extension; atexit() is the portable alternative)
__attribute__((destructor))
void auto_leak_report(void) {
leak_check_report();
}
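The Core API also declares get_active_allocations and get_total_allocated_bytes, which the phases do not implement. The following is one possible sketch, assuming the same Allocation list as above and assuming that "total allocated bytes" means bytes in allocations that are still active (the spec does not say whether it should be cumulative instead).

```c
#include <stddef.h>

typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

static Allocation *head = NULL;

/* Count records that have not been freed yet. */
size_t get_active_allocations(void) {
    size_t count = 0;
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (!a->freed) count++;
    }
    return count;
}

/* Assumption: sum only the sizes of still-active allocations. */
size_t get_total_allocated_bytes(void) {
    size_t bytes = 0;
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (!a->freed) bytes += a->size;
    }
    return bytes;
}
```

Both functions walk the whole list, so they are O(n); keeping running counters in debug_malloc and debug_free would avoid the walk.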
Phase 3: Double-Free Detection (2-3 hours)
Goal: Catch attempts to free already-freed memory.
void debug_free(void* ptr, const char* file, int line) {
if (ptr == NULL) return; // free(NULL) is valid and does nothing
// Find this pointer in our registry
Allocation *found = NULL;
for (Allocation *a = head; a != NULL; a = a->next) {
if (a->ptr == ptr) {
found = a;
break;
}
}
if (found == NULL) {
printf("[WARNING] Freeing untracked pointer %p at %s:%d\n",
ptr, file, line);
printf(" This may be a bug or memory not allocated via debug_malloc\n");
free(ptr);
return;
}
if (found->freed) {
printf("[ERROR] DOUBLE-FREE DETECTED!\n");
printf(" Pointer: %p\n", ptr);
printf(" Size: %zu bytes\n", found->size);
printf(" Originally allocated at: %s:%d\n", found->file, found->line);
printf(" First freed at: (tracking not implemented yet)\n");
printf(" Second free attempted at: %s:%d\n", file, line);
printf("\n*** Aborting to prevent heap corruption ***\n");
fflush(stdout); // abort() is not required to flush stdio buffers
abort();
}
found->freed = 1;
free(ptr);
}
Phase 4: Use-After-Free Detection (2-3 hours)
Goal: Poison freed memory to make UAF bugs visible.
void debug_free(void* ptr, const char* file, int line) {
// ... (previous checks) ...
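// Before freeing, fill with poison pattern
// This makes use-after-free more likely to be noticed
// (an optimizing compiler may remove a memset that directly precedes free;
// build the detector with -O0 if the poison does not show up)
memset(ptr, 0xDE, found->size);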
found->freed = 1;
free(ptr);
}
Why poisoning helps:
- If code reads freed memory and gets 0xDEDEDEDE, it's obviously wrong
- Much easier to debug than reading "old valid data"
- Pattern 0xDE is visible in debuggers
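To check for the pattern programmatically, a helper can scan a block for the poison byte. This is a sketch with hypothetical names (POISON_BYTE, poison, is_poisoned). It is only valid on memory the detector still owns, such as a block held in a quarantine, because reading memory after the real free() is itself undefined behavior.

```c
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0xDE /* same byte the Phase 4 memset uses */

/* Overwrite a block the detector still owns with the poison byte. */
static void poison(void *ptr, size_t size) {
    memset(ptr, POISON_BYTE, size);
}

/* Return 1 if every byte still holds the poison value, 0 otherwise.
 * A 0 result on a quarantined block means something wrote to it
 * after it was "freed". */
static int is_poisoned(const void *ptr, size_t size) {
    const unsigned char *bytes = ptr;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != POISON_BYTE) return 0;
    }
    return 1;
}
```

Checking is_poisoned just before a quarantined block is finally released turns a silent write-after-free into a reportable error.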
Phase 5: Header File and Macros (1-2 hours)
// leak_detector.h
#ifndef LEAK_DETECTOR_H
#define LEAK_DETECTOR_H
#include <stddef.h>
#include <stdlib.h> // Declare the real malloc/free before the macros below shadow them
// Initialize (optional - auto-init on first malloc)
void init_leak_detector(void);
// Underlying functions
void* debug_malloc(size_t size, const char* file, int line);
void debug_free(void* ptr, const char* file, int line);
// Reporting
void leak_check_report(void);
size_t get_active_allocations(void);
size_t get_total_allocated_bytes(void);
// Macros to capture file/line automatically
#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)
#endif
Testing Strategy
Test 1: Memory Leak Detection
// test_leak.c
#include "leak_detector.h"
void create_user(void) {
char *name = malloc(100);
// Oops, forgot to free!
}
int main(void) {
for (int i = 0; i < 3; i++) {
create_user();
}
return 0;
}
Expected Output:
=== Memory Leak Detector Report ===
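[LEAK] 100 bytes allocated at test_leak.c:5 never freed
 Address: 0x... (varies per run)
[LEAK] 100 bytes allocated at test_leak.c:5 never freed
 Address: 0x... (varies per run)
[LEAK] 100 bytes allocated at test_leak.c:5 never freed
 Address: 0x... (varies per run)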
Total leaks: 3 allocations, 300 bytes
Test 2: Double-Free Detection
// test_double_free.c
#include "leak_detector.h"
int main(void) {
int *p = malloc(sizeof(int));
*p = 42;
free(p);
free(p); // Should trigger error!
return 0;
}
Test 3: Clean Program
// test_clean.c
#include <string.h> // for strcpy
#include "leak_detector.h"
int main(void) {
int *p = malloc(sizeof(int));
*p = 42;
free(p); // Properly freed!
char *str = malloc(100);
strcpy(str, "Hello");
free(str); // Properly freed!
return 0;
}
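Expected Output (the two "Total" lines assume you also print allocation statistics; the Phase 2 report does not print them yet):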
=== Memory Leak Detector Report ===
No memory leaks detected!
Total allocations: 2
Total frees: 2
Common Pitfalls and Debugging Tips
Pitfall 1: Infinite Recursion
// WRONG if the malloc macro is visible in leak_detector.c (or malloc itself is interposed):
// debug_malloc calls malloc, which is debug_malloc again...
void* debug_malloc(size_t size, const char* file, int line) {
Allocation *record = malloc(sizeof(Allocation)); // RECURSION!
// ...
}
// Solution 1: Use a flag to detect recursion
static int in_debug_malloc = 0;
void* debug_malloc(size_t size, const char* file, int line) {
if (in_debug_malloc) {
return real_malloc(size); // Bypass tracking
}
in_debug_malloc = 1;
// ... do tracking ...
in_debug_malloc = 0;
}
// Solution 2: Allocate tracking records from separate pool
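Solution 2 can be sketched with a fixed-size static pool, so creating a tracking record never calls malloc at all. The names record_pool, MAX_RECORDS, and pool_alloc_record are hypothetical, and the capacity is an arbitrary assumption.

```c
#include <stddef.h>

typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

#define MAX_RECORDS 4096 /* hypothetical capacity; tune for your program */

static Allocation record_pool[MAX_RECORDS];
static size_t records_used = 0;

/* Hand out the next unused record, or NULL when the pool is exhausted.
 * No malloc call is involved, so this cannot recurse into debug_malloc. */
static Allocation *pool_alloc_record(void) {
    if (records_used >= MAX_RECORDS) {
        return NULL;
    }
    return &record_pool[records_used++];
}
```

The trade-off is a hard limit on tracked allocations. This sketch never recycles records, so a long-running program would need a free list of released records on top of it.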
Pitfall 2: Thread Safety
The simple linked list isn't thread-safe. For multi-threaded programs:
- Use a mutex around registry operations
- Or use thread-local storage
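The mutex option might look like the sketch below, assuming POSIX threads are available (build with -pthread). registry_add and registry_find are hypothetical helper names, not part of the project API.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    struct Allocation *next;
} Allocation;

static Allocation *head = NULL;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Link a caller-provided record into the list while holding the lock. */
void registry_add(Allocation *record) {
    pthread_mutex_lock(&registry_lock);
    record->next = head;
    head = record;
    pthread_mutex_unlock(&registry_lock);
}

/* Look up a pointer while holding the lock; returns NULL if untracked. */
Allocation *registry_find(void *ptr) {
    Allocation *found = NULL;
    pthread_mutex_lock(&registry_lock);
    for (Allocation *a = head; a != NULL; a = a->next) {
        if (a->ptr == ptr) {
            found = a;
            break;
        }
    }
    pthread_mutex_unlock(&registry_lock);
    return found;
}
```

This protects the list structure only. In debug_free, the freed check and the update of the freed flag should happen under the same lock acquisition, otherwise two threads freeing the same pointer can both pass the check.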
Pitfall 3: Missing Some Allocations
Some code might call malloc before your detector initializes:
- Use the constructor attribute: __attribute__((constructor))
- Or intercept at link time with the linker's --wrap flag
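The constructor attribute is a GCC/Clang extension that runs a function before main(). A minimal sketch follows; the function names are hypothetical, and note that constructors in other translation units may still run (and allocate) before this one.

```c
static int initialized = 0;

/* GCC/Clang extension: called automatically before main() starts. */
__attribute__((constructor))
static void leak_detector_startup(void) {
    initialized = 1;
}

/* Lets other code confirm that startup has already happened. */
int leak_detector_is_initialized(void) {
    return initialized;
}
```

With this in place, debug_malloc can assume the registry is ready by the time main() makes its first allocation.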
Extensions and Challenges
Challenge 1: Track Free Location
typedef struct Allocation {
// ... existing fields ...
const char *free_file; // Where it was freed
int free_line;
} Allocation;
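With those two fields, the double-free report can name the first free location. One way to record it, assuming a hypothetical record_free helper that debug_free would call once it has found the record:

```c
#include <stddef.h>

typedef struct Allocation {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    int freed;
    const char *free_file; /* Where it was freed */
    int free_line;
    struct Allocation *next;
} Allocation;

/* Record the first free of a block.
 * Returns 0 on success, or -1 if the block was already freed, in which
 * case free_file/free_line still describe the FIRST free. */
static int record_free(Allocation *a, const char *file, int line) {
    if (a->freed) {
        return -1;
    }
    a->freed = 1;
    a->free_file = file;
    a->free_line = line;
    return 0;
}
```

When record_free returns -1, debug_free can print a->free_file and a->free_line in place of the "(tracking not implemented yet)" message.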
Challenge 2: Memory Usage Statistics
typedef struct {
size_t total_allocated;
size_t total_freed;
size_t peak_usage;
int allocation_count;
int free_count;
} MemoryStats;
MemoryStats get_memory_stats(void);
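A possible way to keep these statistics current is to update them from debug_malloc and debug_free. The hooks stats_on_alloc and stats_on_free below are hypothetical names, and peak_usage is assumed to mean the highest number of bytes in use at any one time.

```c
#include <stddef.h>

typedef struct {
    size_t total_allocated;
    size_t total_freed;
    size_t peak_usage;
    int allocation_count;
    int free_count;
} MemoryStats;

static MemoryStats stats; /* static storage, so it starts zeroed */

/* Call after a successful allocation of `size` bytes. */
static void stats_on_alloc(size_t size) {
    stats.total_allocated += size;
    stats.allocation_count++;
    size_t in_use = stats.total_allocated - stats.total_freed;
    if (in_use > stats.peak_usage) {
        stats.peak_usage = in_use;
    }
}

/* Call when a tracked block of `size` bytes is freed. */
static void stats_on_free(size_t size) {
    stats.total_freed += size;
    stats.free_count++;
}

MemoryStats get_memory_stats(void) {
    return stats; /* returned by value: a snapshot */
}
```

The peak only needs checking on allocation, since freeing can never raise the number of bytes in use.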
Challenge 3: Quarantine for UAF Detection
Instead of immediately freeing, keep freed memory in a "quarantine":
#define QUARANTINE_SIZE 100
static void* quarantine[QUARANTINE_SIZE];
static int quarantine_index = 0;
void debug_free(void* ptr, ...) {
// Add to quarantine instead of immediate free
if (quarantine[quarantine_index] != NULL) {
real_free(quarantine[quarantine_index]);
}
quarantine[quarantine_index] = ptr;
quarantine_index = (quarantine_index + 1) % QUARANTINE_SIZE;
}
This delays reuse, making UAF bugs more likely to crash (and be detected).
Challenge 4: Stack Trace Capture
On Linux/macOS, capture the call stack at allocation time:
#include <execinfo.h>
typedef struct Allocation {
// ... existing fields ...
void* backtrace[10];
int backtrace_size;
} Allocation;
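Filling those fields could use backtrace() from execinfo.h, which is available with glibc and on macOS but is not standard C. The sketch below uses a separate hypothetical StackTrace struct to stay self-contained. On Linux, readable function names generally require linking with -rdynamic.

```c
#include <execinfo.h>

#define MAX_FRAMES 10

/* Hypothetical container; in the project these fields live in Allocation. */
typedef struct {
    void *frames[MAX_FRAMES];
    int frame_count;
} StackTrace;

/* Capture up to MAX_FRAMES return addresses of the current call stack. */
static void capture_stack(StackTrace *trace) {
    trace->frame_count = backtrace(trace->frames, MAX_FRAMES);
}

/* Write one line per frame to stderr (file descriptor 2).
 * backtrace_symbols_fd is used because, unlike backtrace_symbols,
 * it does not return a malloc'd array. */
static void print_stack(const StackTrace *trace) {
    backtrace_symbols_fd(trace->frames, trace->frame_count, 2);
}
```

One caveat to verify on your platform: backtrace() itself may allocate the first time it is called, which can interact with the recursion problem from Pitfall 1.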
Real-World Connections
Connection 1: How Valgrind Works
Valgrind uses similar techniques:
- Intercepts malloc/free calls
- Maintains shadow memory to track validity
- Reports errors with stack traces
Your detector is a simplified version of Valgrind's Memcheck.
Connection 2: AddressSanitizer
ASan (AddressSanitizer) uses compiler instrumentation:
- Adds checks around every memory access
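- Much faster than Valgrind (typically around 2x slowdown vs roughly 10-50x for Memcheck)
- Catches some bugs Memcheck misses (stack and global buffer overflows), though it does not detect reads of uninitialized memory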
Connection 3: Production Memory Allocators
Production allocators like jemalloc and tcmalloc:
- Track statistics
- Detect some corruption
- Profile memory usage
Interview Questions You Can Now Answer
- "What is a memory leak? How do you detect them?"
  - Memory allocated but never freed
  - Detect by tracking allocations and checking at exit
- "What is use-after-free? Why is it dangerous?"
  - Accessing memory after it's freed
  - Dangerous because memory might be reused, leading to corruption or security bugs
- "What is double-free? What can go wrong?"
  - Calling free() twice on same pointer
  - Corrupts heap allocator metadata, can lead to exploitation
- "How does Valgrind detect memory errors?"
  - Intercepts malloc/free
  - Tracks every byte's validity state
  - Checks every memory access
- "What's the difference between a dangling pointer and a NULL pointer?"
  - NULL: explicitly points to nothing, checkable
  - Dangling: points to freed memory, looks valid but isn't
- "If you free memory, why can you sometimes still read from it?"
  - Memory isn't erased, just marked as available
  - Old data remains until overwritten
Resources
Books
- "Understanding and Using C Pointers" by Richard Reese - Ch. 2
- "Computer Systems: A Programmer's Perspective" - Ch. 9.9
- "Effective C" by Robert Seacord - Ch. 6
Tools
- Valgrind
- AddressSanitizer
- LeakSanitizer (LSan)
Self-Assessment Checklist
- Explain why freeing memory doesn't zero it out
- Describe why use-after-free sometimes "works" and sometimes crashes
- Think about every malloc in terms of "who frees this and when"
- Implement a basic leak detector with file/line tracking
- Explain how poisoning freed memory helps debugging
- Use Valgrind to find memory errors
Final Milestone: You think about every malloc in terms of "who frees this and when".