Project 3: Memory Arena Allocator
Project 3: Memory Arena Allocator
Sprint: 2 - Data & Invariants Difficulty: Advanced Time Estimate: 1 week Prerequisites: Solid understanding of malloc/free, pointer arithmetic
Overview
What youโll build: A custom memory allocator that allocates from a fixed pool and frees everything at onceโthe pattern used in game engines, compilers, and parsers.
Why it teaches Data & Invariants: This is ownership crystallized into code. An arena says:
โI own all memory allocated from me. When I die, it all dies.โ
You must track:
- Whatโs allocated vs available
- Alignment requirements for different types
- When pointers into the arena become invalid
- The โhigh water markโ of usage
The Core Question Youโre Answering:
โIf I canโt call free() on individual allocations, how do I prevent memory leaks while still being efficient?โ
Learning Objectives
By the end of this project, you will be able to:
- Implement a bump allocator with proper alignment
- Enforce alignment requirements for different data types
- Design lifetime-based ownership models
- Understand why arenas are faster than malloc
- Handle allocation failures gracefully
- Integrate arena allocation with other data structures
Theoretical Foundation
Why Arenas?
Memory ownership doesnโt have to be about tracking individual allocations. By grouping allocations by lifetime, you can:
- Eliminate the entire class of โforgot to freeโ bugs
- Make deallocation O(1) instead of O(n)
- Prevent fragmentation entirely
Traditional malloc/free:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Each malloc() must be matched with free() โ
โ Individual tracking overhead โ
โ Fragmentation over time โ
โ Complex ownership tracking โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Arena allocation:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ All allocations from arena share same lifetime โ
โ Single free: arena_destroy() or arena_reset() โ
โ Zero fragmentation (bump allocation) โ
โ Simple ownership: arena owns everything โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Alignment Requirements
ALIGNMENT RULES (typical 64-bit system):
โโโโโโโโโโโโโโฌโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Type โ Size โ Must start at address... โ
โโโโโโโโโโโโโโผโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ char โ 1 byte โ Any address (1-aligned) โ
โ short โ 2 bytes โ Multiple of 2 (2-aligned) โ
โ int โ 4 bytes โ Multiple of 4 (4-aligned) โ
โ long โ 8 bytes โ Multiple of 8 (8-aligned) โ
โ double โ 8 bytes โ Multiple of 8 (8-aligned) โ
โ pointer โ 8 bytes โ Multiple of 8 (8-aligned) โ
โโโโโโโโโโโโโโดโโโโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Alignment formula:
aligned = (offset + align - 1) & ~(align - 1)
Example: Align offset 5 to 4-byte boundary
offset = 5
align = 4
(5 + 4 - 1) & ~(4 - 1) = 8 & ~3 = 8
The Invariants
INV1: 0 <= offset <= capacity
(Current position within valid range)
INV2: base is aligned to _Alignof(max_align_t)
(Base address is maximally aligned)
INV3: All returned pointers are in range [base, base+offset)
(Every allocation is within arena)
INV4: After reset(), offset == 0
(Reset returns to initial state)
INV5: After destroy(), all pointers are INVALID
(Use-after-free = undefined behavior)
Project Specification
Core API
// Create arena with specified capacity
Arena* arena_create(size_t capacity);
// Allocate memory from arena
void* arena_alloc(Arena* arena, size_t size, size_t alignment);
// Type-safe allocation macro
#define arena_alloc_type(arena, T) \
(T*)arena_alloc(arena, sizeof(T), _Alignof(T))
// Reset arena (invalidates all pointers, keeps capacity)
void arena_reset(Arena* arena);
// Destroy arena (free all memory)
void arena_destroy(Arena* arena);
// Statistics
size_t arena_used(Arena* arena);
size_t arena_available(Arena* arena);
Expected Output
$ ./arena_demo
=== Memory Arena Allocator Demo ===
[Test 1: Basic Allocation]
Creating arena with 1MB capacity...
Arena created: base=0x7f8a4c000000 size=1048576 offset=0
Allocating 100 integers (400 bytes)...
Arena stats: allocated=400 available=1048176
Allocated at: 0x7f8a4c000000 (aligned to 4 bytes)
Allocating 50 doubles (400 bytes)...
Arena stats: allocated=800 available=1047776
Allocated at: 0x7f8a4c000190 (aligned to 8 bytes)
[Test 2: Alignment Verification]
Testing alignment for different types:
char: 0x7f8a4c000000 (alignment=1) โ
short: 0x7f8a4c000002 (alignment=2) โ
int: 0x7f8a4c000004 (alignment=4) โ
double: 0x7f8a4c000010 (alignment=8) โ
[Test 3: Arena vs Malloc Benchmark]
Allocating 100,000 small objects:
malloc/free: 83.9ms
arena: 1.32ms
Speedup: 63.6x faster
[Memory Leak Check]
$ valgrind --leak-check=full ./arena_demo
==12345== All heap blocks were freed -- no leaks are possible
Solution Architecture
Data Structure
typedef struct Arena {
char *base; // Start of memory
size_t capacity; // Total size
size_t offset; // Current position
} Arena;
Memory Layout
Arena struct (on heap):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ base โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ capacity = 1024 โ โ
โ offset = 300 โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
Backing memory (from mmap or malloc):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ[USED: 300 bytes] [AVAILABLE: 724 bytes] โ
โ ^ โ
โ base + offset โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Implementation Guide
Phase 1: Basic Arena (30 minutes)
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
typedef struct Arena {
char *base;
size_t capacity;
size_t offset;
} Arena;
Arena* arena_create(size_t capacity) {
Arena* arena = malloc(sizeof(Arena));
if (!arena) return NULL;
arena->base = malloc(capacity);
if (!arena->base) {
free(arena);
return NULL;
}
arena->capacity = capacity;
arena->offset = 0;
return arena;
}
void arena_destroy(Arena* arena) {
if (arena) {
free(arena->base);
free(arena);
}
}
void arena_reset(Arena* arena) {
arena->offset = 0;
}
Phase 2: Aligned Allocation (30 minutes)
static size_t align_up(size_t value, size_t alignment) {
return (value + alignment - 1) & ~(alignment - 1);
}
void* arena_alloc(Arena* arena, size_t size, size_t alignment) {
// Calculate aligned offset
size_t aligned_offset = align_up(arena->offset, alignment);
// Check for overflow
if (aligned_offset + size > arena->capacity) {
return NULL; // Arena exhausted
}
// Calculate pointer
void* ptr = arena->base + aligned_offset;
// Update offset
arena->offset = aligned_offset + size;
return ptr;
}
Phase 3: Type-Safe Macros (15 minutes)
// Allocate single object
#define arena_new(arena, T) \
((T*)arena_alloc((arena), sizeof(T), _Alignof(T)))
// Allocate array
#define arena_array(arena, T, count) \
((T*)arena_alloc((arena), sizeof(T) * (count), _Alignof(T)))
// Usage:
Player* player = arena_new(arena, Player);
int* scores = arena_array(arena, int, 100);
Phase 4: Statistics and Debug (30 minutes)
size_t arena_used(Arena* arena) {
return arena->offset;
}
size_t arena_available(Arena* arena) {
return arena->capacity - arena->offset;
}
void arena_debug_print(Arena* arena) {
printf("Arena Debug:\n");
printf(" Base: %p\n", (void*)arena->base);
printf(" Capacity: %zu bytes\n", arena->capacity);
printf(" Used: %zu bytes (%.1f%%)\n",
arena->offset,
(double)arena->offset / arena->capacity * 100);
printf(" Available: %zu bytes\n", arena_available(arena));
}
Phase 5: Benchmark (30 minutes)
#include <time.h>
#define ITERATIONS 100000
#define ALLOC_SIZE 64
void benchmark_malloc() {
clock_t start = clock();
void* ptrs[100];
for (int i = 0; i < ITERATIONS; i++) {
for (int j = 0; j < 100; j++) {
ptrs[j] = malloc(ALLOC_SIZE);
}
for (int j = 0; j < 100; j++) {
free(ptrs[j]);
}
}
printf("malloc/free: %.3f seconds\n",
(double)(clock() - start) / CLOCKS_PER_SEC);
}
void benchmark_arena() {
Arena* arena = arena_create(ALLOC_SIZE * 100);
clock_t start = clock();
for (int i = 0; i < ITERATIONS; i++) {
for (int j = 0; j < 100; j++) {
arena_alloc(arena, ALLOC_SIZE, 8);
}
arena_reset(arena);
}
printf("arena: %.3f seconds\n",
(double)(clock() - start) / CLOCKS_PER_SEC);
arena_destroy(arena);
}
Common Pitfalls
Pitfall 1: Forgetting Alignment Padding
// WRONG: Ignoring alignment
void* arena_alloc_bad(Arena* arena, size_t size) {
void* ptr = arena->base + arena->offset;
arena->offset += size;
return ptr; // May be misaligned!
}
// CORRECT: Always align
void* arena_alloc(Arena* arena, size_t size, size_t alignment) {
size_t aligned = align_up(arena->offset, alignment);
// ...
}
Pitfall 2: Use-After-Reset
// DANGEROUS
int* data = arena_new(arena, int);
*data = 42;
arena_reset(arena); // data is now INVALID
printf("%d\n", *data); // UNDEFINED BEHAVIOR
// CORRECT: Don't use pointers after reset
arena_reset(arena);
data = NULL; // Clear your pointers
Pitfall 3: Integer Overflow in Size Check
// WRONG: Can overflow
if (arena->offset + size > arena->capacity)
// CORRECT: Prevent overflow
if (size > arena->capacity - arena->offset)
Interview Preparation
Common Questions
- โExplain how an arena allocator works.โ
- Pre-allocate large block
- Bump pointer on each allocation
- Reset or destroy to free all at once
- โWhy are arenas faster than malloc?โ
- No per-allocation bookkeeping
- No free list management
- No fragmentation handling
- O(1) allocation and reset
- โWhen would you NOT use an arena?โ
- When you need to free individual allocations
- When allocations have varying lifetimes
- When memory is very constrained
- โHow do you handle alignment?โ
(offset + align - 1) & ~(align - 1)- Round up to next aligned address
Self-Assessment Checklist
- Understand bump allocation concept
- Implement aligned allocation correctly
- Handle arena exhaustion gracefully
- Valgrind shows no leaks
- Benchmark shows speedup over malloc
Resources
| Topic | Book | Chapter |
|---|---|---|
| Alignment | โComputer Systems: A Programmerโs Perspectiveโ | Section 3.9.3 |
| Arena Design | โC Interfaces and Implementationsโ by David Hanson | Chapter 5 |
| Ownership | โEffective Cโ by Robert Seacord | Chapter 6 |
Next Project: P04: JSON Parser - Apply arena allocation to a real parsing problem.