Project 3: Memory Arena Allocator

Project 3: Memory Arena Allocator

Sprint: 2 - Data & Invariants Difficulty: Advanced Time Estimate: 1 week Prerequisites: Solid understanding of malloc/free, pointer arithmetic


Overview

What youโ€™ll build: A custom memory allocator that allocates from a fixed pool and frees everything at onceโ€”the pattern used in game engines, compilers, and parsers.

Why it teaches Data & Invariants: This is ownership crystallized into code. An arena says:

โ€œI own all memory allocated from me. When I die, it all dies.โ€

You must track:

  • Whatโ€™s allocated vs available
  • Alignment requirements for different types
  • When pointers into the arena become invalid
  • The โ€œhigh water markโ€ of usage

The Core Question Youโ€™re Answering:

โ€œIf I canโ€™t call free() on individual allocations, how do I prevent memory leaks while still being efficient?โ€


Learning Objectives

By the end of this project, you will be able to:

  1. Implement a bump allocator with proper alignment
  2. Enforce alignment requirements for different data types
  3. Design lifetime-based ownership models
  4. Understand why arenas are faster than malloc
  5. Handle allocation failures gracefully
  6. Integrate arena allocation with other data structures

Theoretical Foundation

Why Arenas?

Memory ownership doesnโ€™t have to be about tracking individual allocations. By grouping allocations by lifetime, you can:

  1. Eliminate the entire class of โ€œforgot to freeโ€ bugs
  2. Make deallocation O(1) instead of O(n)
  3. Prevent fragmentation entirely
Traditional malloc/free:
  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
  โ”‚  Each malloc() must be matched with free()              โ”‚
  โ”‚  Individual tracking overhead                           โ”‚
  โ”‚  Fragmentation over time                                โ”‚
  โ”‚  Complex ownership tracking                             โ”‚
  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Arena allocation:
  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
  โ”‚  All allocations from arena share same lifetime         โ”‚
  โ”‚  Single free: arena_destroy() or arena_reset()          โ”‚
  โ”‚  Zero fragmentation (bump allocation)                   โ”‚
  โ”‚  Simple ownership: arena owns everything                โ”‚
  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Alignment Requirements

ALIGNMENT RULES (typical 64-bit system):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Type       โ”‚ Size      โ”‚ Must start at address...   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ char       โ”‚ 1 byte    โ”‚ Any address (1-aligned)    โ”‚
โ”‚ short      โ”‚ 2 bytes   โ”‚ Multiple of 2 (2-aligned)  โ”‚
โ”‚ int        โ”‚ 4 bytes   โ”‚ Multiple of 4 (4-aligned)  โ”‚
โ”‚ long       โ”‚ 8 bytes   โ”‚ Multiple of 8 (8-aligned)  โ”‚
โ”‚ double     โ”‚ 8 bytes   โ”‚ Multiple of 8 (8-aligned)  โ”‚
โ”‚ pointer    โ”‚ 8 bytes   โ”‚ Multiple of 8 (8-aligned)  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Alignment formula:
  aligned = (offset + align - 1) & ~(align - 1)

Example: Align offset 5 to 4-byte boundary
  offset = 5
  align = 4
  (5 + 4 - 1) & ~(4 - 1) = 8 & ~3 = 8

The Invariants

INV1: 0 <= offset <= capacity
  (Current position within valid range)

INV2: base is aligned to _Alignof(max_align_t)
  (Base address is maximally aligned)

INV3: All returned pointers are in range [base, base+offset)
  (Every allocation is within arena)

INV4: After reset(), offset == 0
  (Reset returns to initial state)

INV5: After destroy(), all pointers are INVALID
  (Use-after-free = undefined behavior)

Project Specification

Core API

// Create arena with specified capacity
Arena* arena_create(size_t capacity);

// Allocate memory from arena
void* arena_alloc(Arena* arena, size_t size, size_t alignment);

// Type-safe allocation macro
#define arena_alloc_type(arena, T) \
    (T*)arena_alloc(arena, sizeof(T), _Alignof(T))

// Reset arena (invalidates all pointers, keeps capacity)
void arena_reset(Arena* arena);

// Destroy arena (free all memory)
void arena_destroy(Arena* arena);

// Statistics
size_t arena_used(Arena* arena);
size_t arena_available(Arena* arena);

Expected Output

$ ./arena_demo
=== Memory Arena Allocator Demo ===

[Test 1: Basic Allocation]
Creating arena with 1MB capacity...
Arena created: base=0x7f8a4c000000 size=1048576 offset=0

Allocating 100 integers (400 bytes)...
Arena stats: allocated=400 available=1048176
Allocated at: 0x7f8a4c000000 (aligned to 4 bytes)

Allocating 50 doubles (400 bytes)...
Arena stats: allocated=800 available=1047776
Allocated at: 0x7f8a4c000190 (aligned to 8 bytes)

[Test 2: Alignment Verification]
Testing alignment for different types:
  char:   0x7f8a4c000000 (alignment=1) โœ“
  short:  0x7f8a4c000002 (alignment=2) โœ“
  int:    0x7f8a4c000004 (alignment=4) โœ“
  double: 0x7f8a4c000010 (alignment=8) โœ“

[Test 3: Arena vs Malloc Benchmark]
Allocating 100,000 small objects:
  malloc/free:  83.9ms
  arena:        1.32ms
  Speedup: 63.6x faster

[Memory Leak Check]
$ valgrind --leak-check=full ./arena_demo
==12345== All heap blocks were freed -- no leaks are possible

Solution Architecture

Data Structure

typedef struct Arena {
    char *base;      // Start of memory
    size_t capacity; // Total size
    size_t offset;   // Current position
} Arena;

Memory Layout

Arena struct (on heap):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ base โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                      โ”‚
โ”‚ capacity = 1024                      โ”‚                      โ”‚
โ”‚ offset = 300                         โ”‚                      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”‚โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                       โ”‚
                                       โ–ผ
Backing memory (from mmap or malloc):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚[USED: 300 bytes]            [AVAILABLE: 724 bytes]          โ”‚
โ”‚                             ^                               โ”‚
โ”‚                         base + offset                       โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Implementation Guide

Phase 1: Basic Arena (30 minutes)

#include <stdlib.h>
#include <string.h>
#include <stdint.h>

typedef struct Arena {
    char *base;
    size_t capacity;
    size_t offset;
} Arena;

Arena* arena_create(size_t capacity) {
    Arena* arena = malloc(sizeof(Arena));
    if (!arena) return NULL;

    arena->base = malloc(capacity);
    if (!arena->base) {
        free(arena);
        return NULL;
    }

    arena->capacity = capacity;
    arena->offset = 0;
    return arena;
}

void arena_destroy(Arena* arena) {
    if (arena) {
        free(arena->base);
        free(arena);
    }
}

void arena_reset(Arena* arena) {
    arena->offset = 0;
}

Phase 2: Aligned Allocation (30 minutes)

static size_t align_up(size_t value, size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

void* arena_alloc(Arena* arena, size_t size, size_t alignment) {
    // Calculate aligned offset
    size_t aligned_offset = align_up(arena->offset, alignment);

    // Check for overflow
    if (aligned_offset + size > arena->capacity) {
        return NULL;  // Arena exhausted
    }

    // Calculate pointer
    void* ptr = arena->base + aligned_offset;

    // Update offset
    arena->offset = aligned_offset + size;

    return ptr;
}

Phase 3: Type-Safe Macros (15 minutes)

// Allocate single object
#define arena_new(arena, T) \
    ((T*)arena_alloc((arena), sizeof(T), _Alignof(T)))

// Allocate array
#define arena_array(arena, T, count) \
    ((T*)arena_alloc((arena), sizeof(T) * (count), _Alignof(T)))

// Usage:
Player* player = arena_new(arena, Player);
int* scores = arena_array(arena, int, 100);

Phase 4: Statistics and Debug (30 minutes)

size_t arena_used(Arena* arena) {
    return arena->offset;
}

size_t arena_available(Arena* arena) {
    return arena->capacity - arena->offset;
}

void arena_debug_print(Arena* arena) {
    printf("Arena Debug:\n");
    printf("  Base: %p\n", (void*)arena->base);
    printf("  Capacity: %zu bytes\n", arena->capacity);
    printf("  Used: %zu bytes (%.1f%%)\n",
           arena->offset,
           (double)arena->offset / arena->capacity * 100);
    printf("  Available: %zu bytes\n", arena_available(arena));
}

Phase 5: Benchmark (30 minutes)

#include <time.h>

#define ITERATIONS 100000
#define ALLOC_SIZE 64

void benchmark_malloc() {
    clock_t start = clock();
    void* ptrs[100];

    for (int i = 0; i < ITERATIONS; i++) {
        for (int j = 0; j < 100; j++) {
            ptrs[j] = malloc(ALLOC_SIZE);
        }
        for (int j = 0; j < 100; j++) {
            free(ptrs[j]);
        }
    }

    printf("malloc/free: %.3f seconds\n",
           (double)(clock() - start) / CLOCKS_PER_SEC);
}

void benchmark_arena() {
    Arena* arena = arena_create(ALLOC_SIZE * 100);
    clock_t start = clock();

    for (int i = 0; i < ITERATIONS; i++) {
        for (int j = 0; j < 100; j++) {
            arena_alloc(arena, ALLOC_SIZE, 8);
        }
        arena_reset(arena);
    }

    printf("arena: %.3f seconds\n",
           (double)(clock() - start) / CLOCKS_PER_SEC);
    arena_destroy(arena);
}

Common Pitfalls

Pitfall 1: Forgetting Alignment Padding

// WRONG: Ignoring alignment
void* arena_alloc_bad(Arena* arena, size_t size) {
    void* ptr = arena->base + arena->offset;
    arena->offset += size;
    return ptr;  // May be misaligned!
}

// CORRECT: Always align
void* arena_alloc(Arena* arena, size_t size, size_t alignment) {
    size_t aligned = align_up(arena->offset, alignment);
    // ...
}

Pitfall 2: Use-After-Reset

// DANGEROUS
int* data = arena_new(arena, int);
*data = 42;
arena_reset(arena);  // data is now INVALID
printf("%d\n", *data);  // UNDEFINED BEHAVIOR

// CORRECT: Don't use pointers after reset
arena_reset(arena);
data = NULL;  // Clear your pointers

Pitfall 3: Integer Overflow in Size Check

// WRONG: Can overflow
if (arena->offset + size > arena->capacity)

// CORRECT: Prevent overflow
if (size > arena->capacity - arena->offset)

Interview Preparation

Common Questions

  1. โ€œExplain how an arena allocator works.โ€
    • Pre-allocate large block
    • Bump pointer on each allocation
    • Reset or destroy to free all at once
  2. โ€œWhy are arenas faster than malloc?โ€
    • No per-allocation bookkeeping
    • No free list management
    • No fragmentation handling
    • O(1) allocation and reset
  3. โ€œWhen would you NOT use an arena?โ€
    • When you need to free individual allocations
    • When allocations have varying lifetimes
    • When memory is very constrained
  4. โ€œHow do you handle alignment?โ€
    • (offset + align - 1) & ~(align - 1)
    • Round up to next aligned address

Self-Assessment Checklist

  • Understand bump allocation concept
  • Implement aligned allocation correctly
  • Handle arena exhaustion gracefully
  • Valgrind shows no leaks
  • Benchmark shows speedup over malloc

Resources

Topic Book Chapter
Alignment โ€œComputer Systems: A Programmerโ€™s Perspectiveโ€ Section 3.9.3
Arena Design โ€œC Interfaces and Implementationsโ€ by David Hanson Chapter 5
Ownership โ€œEffective Cโ€ by Robert Seacord Chapter 6

Next Project: P04: JSON Parser - Apply arena allocation to a real parsing problem.