Project 4: Arena Allocator

Project 4: Arena Allocator

Sprint: 1 - Memory & Control Difficulty: Intermediate Time Estimate: Weekend - 1 week Prerequisites: Project 3 or understanding of malloc/free


Overview

What youโ€™ll build: A bump/arena allocator that allocates memory from a pre-allocated block, with O(1) allocation and bulk-free semantics.

Why it teaches memory & control: This is where you understand why allocators exist. Youโ€™ll see that malloc is just softwareโ€”someone wrote it. By building the simplest possible allocator, you understand memory as a raw resource to be carved up, not a magic service.

The Core Question Youโ€™re Answering:

โ€œWhat if I could allocate memory with just a pointer increment, and free everything at once?โ€

This is the arena allocatorโ€™s insight. It trades flexibility (individual frees) for speed (O(1) allocation, O(1) bulk free). By building one, you understand that malloc is just one way to manage memoryโ€”and often not the best way.


Learning Objectives

By the end of this project, you will be able to:

  1. Explain why allocators exist and what problem they solve
  2. Implement a bump allocator from scratch using mmap or malloc
  3. Handle memory alignment requirements for different data types
  4. Compare arena allocation patterns to general-purpose allocation
  5. Identify use cases where arena allocation is optimal
  6. Use mmap to request memory directly from the operating system
  7. Reason about allocation performance (O(1) vs O(n) patterns)
  8. Design lifetime-based memory management for batch operations

Theoretical Foundation

Why Allocators Exist

Every time you call malloc(), youโ€™re not asking the operating system directly. Instead, youโ€™re asking the malloc implementationโ€”a piece of software that mediates between your program and the OS.

Why the indirection?

System calls are expensive. Each call to the OS kernel involves:

  1. Context switch from user mode to kernel mode
  2. Security checks and permission validation
  3. Kernel data structure manipulation
  4. Context switch back to user mode
Without allocator (naive approach):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Your Program                                                  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  malloc(16)  โ†’ syscall โ†’ OS gives 16 bytes                  โ”‚
โ”‚  malloc(32)  โ†’ syscall โ†’ OS gives 32 bytes                  โ”‚
โ”‚  malloc(8)   โ†’ syscall โ†’ OS gives 8 bytes                   โ”‚
โ”‚  ...                                                         โ”‚
โ”‚  1000 allocations = 1000 syscalls = SLOW!                   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

With allocator (malloc):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Your Program                                                  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  malloc(16)  โ†’ allocator gives 16 bytes from pool           โ”‚
โ”‚  malloc(32)  โ†’ allocator gives 32 bytes from pool           โ”‚
โ”‚  malloc(8)   โ†’ allocator gives 8 bytes from pool            โ”‚
โ”‚  ...                                                         โ”‚
โ”‚  When pool empty โ†’ ONE syscall to refill                    โ”‚
โ”‚  1000 allocations = ~1 syscall + 1000 pointer ops = FAST!   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

The allocatorโ€™s job:

  1. Request large chunks from the OS (infrequently)
  2. Subdivide those chunks for your program (frequently)
  3. Track whatโ€™s in use and whatโ€™s free
  4. Coalesce free blocks to reduce fragmentation

The Bump Allocator: Simplest Possible Design

A bump allocator (also called arena, linear, or region allocator) is the simplest possible memory allocator:

Arena/Bump Allocator Structure:

Initial state:
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                                                               โ”‚
โ”‚                   1024 bytes of memory                        โ”‚
โ”‚                                                               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ–ฒ                                                               โ–ฒ
โ”‚                                                               โ”‚
base pointer                                                    end
(offset = 0)

After arena_alloc(100):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚    USED (100)   โ”‚            AVAILABLE (924)                   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ–ฒ                 โ–ฒ                                             โ–ฒ
โ”‚                 โ”‚                                             โ”‚
base              current (offset = 100)                        end

After arena_alloc(200):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚    USED (100)   โ”‚      USED (200)             โ”‚ AVAILABLE(724)โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ–ฒ                                               โ–ฒ               โ–ฒ
โ”‚                                               โ”‚               โ”‚
base                                            current (300)   end

After arena_reset():
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                                                               โ”‚
โ”‚                   1024 bytes of memory                        โ”‚
โ”‚                                                               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ–ฒ                                                               โ–ฒ
โ”‚                                                               โ”‚
base (offset = 0 again!)                                        end
All "allocations" gone instantly - O(1)!

Why itโ€™s called โ€œbumpโ€ allocation:

  • To allocate, you just โ€œbumpโ€ (increment) the current pointer
  • No searching for free blocks
  • No updating free lists
  • Just ptr = base + offset; offset += size;

The Tradeoff: Speed vs Flexibility

General-purpose allocator (malloc):

  • Can free individual allocations
  • Handles arbitrary allocation/free patterns
  • Must track every allocation
  • O(n) or O(log n) allocation in worst case
  • Complex implementation

Arena allocator:

  • Can only free ALL allocations at once
  • Perfect for batch operations
  • Minimal tracking (just one offset)
  • O(1) allocation always
  • Simple implementation (~50 lines)
When to use which:

malloc is better when:
โ”œโ”€ Allocations have varying, unpredictable lifetimes
โ”œโ”€ You need to free individual objects
โ”œโ”€ Memory pressure requires immediate reclamation
โ””โ”€ General-purpose code with no allocation pattern

Arena is better when:
โ”œโ”€ Allocations share a common lifetime (batch operations)
โ”œโ”€ You can reset/free everything together
โ”œโ”€ Performance is critical
โ””โ”€ Examples:
   โ”œโ”€ Game frame allocations (allocate during frame, reset at frame end)
   โ”œโ”€ Request handlers (allocate during request, reset when done)
   โ”œโ”€ Compiler/parser (allocate AST nodes, free when compilation done)
   โ””โ”€ JSON parser (allocate tree nodes, free entire tree at once)

Memory Alignment

Modern CPUs require data to be aligned to certain boundaries for efficient (or even correct) access:

Alignment Requirements (typical x86-64):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Type          โ”‚ Size    โ”‚ Alignment   โ”‚ Valid Addresses        โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ char          โ”‚ 1 byte  โ”‚ 1 byte      โ”‚ any address            โ”‚
โ”‚ short         โ”‚ 2 bytes โ”‚ 2 bytes     โ”‚ 0, 2, 4, 6, 8, ...    โ”‚
โ”‚ int           โ”‚ 4 bytes โ”‚ 4 bytes     โ”‚ 0, 4, 8, 12, ...      โ”‚
โ”‚ long/pointer  โ”‚ 8 bytes โ”‚ 8 bytes     โ”‚ 0, 8, 16, 24, ...     โ”‚
โ”‚ double        โ”‚ 8 bytes โ”‚ 8 bytes     โ”‚ 0, 8, 16, 24, ...     โ”‚
โ”‚ long double   โ”‚ 16 bytesโ”‚ 16 bytes    โ”‚ 0, 16, 32, ...        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

What happens with misaligned access:
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Aligned (good):                                                โ”‚
โ”‚                                                                โ”‚
โ”‚ Address: 0    4    8    12   16   20   24   28               โ”‚
โ”‚          โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ค                โ”‚
โ”‚          โ”‚int โ”‚int โ”‚int โ”‚int โ”‚... โ”‚    โ”‚    โ”‚                โ”‚
โ”‚          โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”˜                โ”‚
โ”‚          โœ“ One memory read per int                            โ”‚
โ”‚                                                                โ”‚
โ”‚ Misaligned (bad):                                              โ”‚
โ”‚                                                                โ”‚
โ”‚ Address: 0    4    8    12   16   20   24   28               โ”‚
โ”‚          โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ค                โ”‚
โ”‚          โ”‚ x  โ”‚ IN T  โ–ผโ”‚ x  โ”‚    โ”‚    โ”‚    โ”‚                โ”‚
โ”‚          โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”˜                โ”‚
โ”‚                  โ–ฒ int starting at address 2                  โ”‚
โ”‚          โœ— Two memory reads required, or CPU exception!       โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Alignment formula: To align address addr up to alignment align (where align is a power of 2):

aligned_addr = (addr + align - 1) & ~(align - 1);

Example:

Aligning 5 to 4-byte boundary:
  addr = 5      = 0b00000101
  align = 4     = 0b00000100
  align - 1     = 0b00000011

  addr + align - 1 = 5 + 3 = 8 = 0b00001000
  ~(align - 1) = 0b11111100

  (8) & 0b11111100 = 0b00001000 = 8

  Result: 5 aligned up to 8 (next multiple of 4)

Getting Memory from the OS: mmap

While you can use malloc() to get memory for your arena, using mmap() directly gives you more control:

#include <sys/mman.h>

// Request 1MB of memory directly from OS
void* block = mmap(
    NULL,                    // Let OS choose address
    1024 * 1024,            // Size in bytes
    PROT_READ | PROT_WRITE,  // Can read and write
    MAP_PRIVATE | MAP_ANONYMOUS, // Private, not backed by file
    -1,                      // No file descriptor
    0                        // No offset
);

// Give memory back to OS
munmap(block, 1024 * 1024);

Why use mmap over malloc?

  1. Direct control: Know exactly how much youโ€™re getting
  2. No metadata overhead: malloc adds headers to each allocation
  3. Guaranteed contiguous: One big block, not potentially scattered
  4. Zero-initialized: mmap gives you zeroed memory (with MAP_ANONYMOUS)
  5. Large allocations: malloc may use mmap internally anyway for large requests

Project Specification

Core API

// Create a new arena with specified capacity
Arena* arena_create(size_t capacity);

// Allocate size bytes from the arena
// Returns NULL if arena is exhausted
void* arena_alloc(Arena* arena, size_t size);

// Allocate with specific alignment
void* arena_alloc_aligned(Arena* arena, size_t size, size_t alignment);

// Reset arena - all allocations become invalid
// This is O(1) - just resets the offset
void arena_reset(Arena* arena);

// Destroy arena and release memory to OS
void arena_destroy(Arena* arena);

// Query functions
size_t arena_used(Arena* arena);
size_t arena_remaining(Arena* arena);

Expected Output Examples

Basic Usage:

$ ./test_arena

Arena created: 1048576 bytes

After allocating p1:
  Used: 54 bytes
  Free: 1048522 bytes

After allocating p2:
  Used: 108 bytes
  Free: 1048468 bytes

After allocating 100 integers:
  Used: 508 bytes
  Free: 1048068 bytes

After arena_reset():
  Used: 0 bytes (back to zero!)
  Free: 1048576 bytes (all available again!)

Visual Bump Pointer:

$ ./test_visual

Initial state:
โ”œโ”€ Base:     0x100204000
โ”œโ”€ Current:  0x100204000 (base + 0)
โ”œโ”€ Capacity: 1024 bytes
โ””โ”€ Used:     0.00%

After arena_alloc(100):
โ”œโ”€ Base:     0x100204000
โ”œโ”€ Current:  0x100204064 (base + 100)
โ”œโ”€ Capacity: 1024 bytes
โ””โ”€ Used:     9.77%

After arena_alloc(200):
โ”œโ”€ Base:     0x100204000
โ”œโ”€ Current:  0x1002040c8 (base + 300)
โ”œโ”€ Capacity: 1024 bytes
โ””โ”€ Used:     29.30%

After arena_alloc(300):
โ”œโ”€ Base:     0x100204000
โ”œโ”€ Current:  0x10020412c (base + 600)
โ”œโ”€ Capacity: 1024 bytes
โ””โ”€ Used:     58.59%

Performance Benchmark:

$ ./benchmark

Benchmarking 1000000 iterations of 100 allocations each...

malloc/free:  8.342 seconds
Arena alloc:  0.241 seconds

Arena is 34.6x faster!

Minimum Viable Features

  1. arena_create: Request memory block from OS
  2. arena_alloc: Bump allocate with default alignment
  3. arena_reset: Reset offset to 0
  4. arena_destroy: Return memory to OS

Extended Features

  1. Alignment support: Align allocations to arbitrary power-of-2 boundaries
  2. Typed allocation macros: arena_alloc_type(arena, Type)
  3. Growing arenas: Chain multiple blocks when capacity exceeded
  4. Temporary arenas: Save/restore offset for nested scopes
  5. Debug mode: Track allocation count, high watermark, statistics

Solution Architecture

Data Structures

typedef struct Arena {
    char* base;       // Start of memory block
    size_t offset;    // Current position (bytes used)
    size_t capacity;  // Total size of block
} Arena;

Memory Layout:

Arena struct (on heap or stack):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ base โ†’ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                       โ”‚
โ”‚ offset = 300        โ”‚                       โ”‚
โ”‚ capacity = 1024     โ”‚                       โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”‚โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                      โ”‚
                      โ–ผ
Memory block (from mmap):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ A (100)  โ”‚  B (200)   โ”‚            โ”‚    Available (724)     โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค            โ”‚                        โ”‚
โ”‚      USED (300)       โ”‚            โ”‚                        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ–ฒ                       โ–ฒ                                     โ–ฒ
โ”‚                       โ”‚                                     โ”‚
base                    base + offset                         base + capacity

Algorithm: Bump Allocation

arena_alloc(arena, size):
    1. Check if size fits: offset + size <= capacity?
       - If no: return NULL (out of space)

    2. Calculate pointer: ptr = base + offset

    3. Bump offset: offset += size

    4. Return ptr

Time complexity: O(1) - just arithmetic and comparison
Space overhead: 0 bytes per allocation (no headers!)

Algorithm: Aligned Allocation

arena_alloc_aligned(arena, size, alignment):
    1. Calculate aligned offset:
       aligned_offset = (offset + alignment - 1) & ~(alignment - 1)

    2. Check if aligned allocation fits:
       aligned_offset + size <= capacity?
       - If no: return NULL

    3. Calculate pointer: ptr = base + aligned_offset

    4. Bump offset: offset = aligned_offset + size

    5. Return ptr

Note: Some bytes may be "wasted" as padding between allocations
Alignment Padding Example:

Before (offset = 5, need 8-byte alignment):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚USED โ”‚                    AVAILABLE                            โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
      โ–ฒ
      โ”‚
      offset = 5

After arena_alloc_aligned(arena, 16, 8):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚USED โ”‚PADโ”‚   NEW ALLOC (16) โ”‚           AVAILABLE              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
      โ–ฒ   โ–ฒ                  โ–ฒ
      โ”‚   โ”‚                  โ”‚
      5   8                  24 (new offset)

      3 bytes of padding "wasted" for alignment

Component Organization

arena.h          - Public API and Arena typedef
arena.c          - Implementation
โ”œโ”€ arena_create  - mmap wrapper
โ”œโ”€ arena_alloc   - Bump allocation
โ”œโ”€ arena_alloc_aligned - Aligned bump allocation
โ”œโ”€ arena_reset   - Set offset to 0
โ”œโ”€ arena_destroy - munmap wrapper
โ””โ”€ query functions

test_arena.c     - Basic functionality tests
benchmark.c      - Performance comparison with malloc

Implementation Guide

Phase 1: Minimal Arena (30 minutes)

Goal: Get basic allocation working without mmap or alignment.

  1. Create Arena struct with base, offset, capacity
  2. Implement arena_create using malloc(capacity) for now
  3. Implement arena_alloc: check bounds, bump offset, return pointer
  4. Implement arena_reset: set offset to 0
  5. Implement arena_destroy: free(base)

Test: Allocate some integers, print their addresses, verify theyโ€™re sequential.

// Your first test
Arena* arena = arena_create(1024);
int* a = arena_alloc(arena, sizeof(int));
int* b = arena_alloc(arena, sizeof(int));
int* c = arena_alloc(arena, sizeof(int));

*a = 10; *b = 20; *c = 30;

printf("a at %p, b at %p, c at %p\n", a, b, c);
// Should be exactly 4 bytes apart!

arena_destroy(arena);

Phase 2: Use mmap (30 minutes)

Goal: Request memory directly from OS.

  1. Replace malloc(capacity) with mmap() call
  2. Replace free() with munmap()
  3. Handle mmap failure (returns MAP_FAILED)
#include <sys/mman.h>

Arena* arena_create(size_t capacity) {
    Arena* arena = malloc(sizeof(Arena));
    if (!arena) return NULL;

    arena->base = mmap(
        NULL,
        capacity,
        PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS,
        -1,
        0
    );

    if (arena->base == MAP_FAILED) {
        free(arena);
        return NULL;
    }

    arena->offset = 0;
    arena->capacity = capacity;
    return arena;
}

Phase 3: Alignment (45 minutes)

Goal: Support aligned allocations for any power-of-2 alignment.

  1. Implement alignment helper function
  2. Create arena_alloc_aligned(arena, size, alignment)
  3. Make arena_alloc call arena_alloc_aligned with default alignment (8 or 16)
  4. Test with different types requiring different alignments
// Alignment helper
static size_t align_up(size_t value, size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

void* arena_alloc_aligned(Arena* arena, size_t size, size_t alignment) {
    size_t aligned_offset = align_up(arena->offset, alignment);

    if (aligned_offset + size > arena->capacity) {
        return NULL;
    }

    void* ptr = arena->base + aligned_offset;
    arena->offset = aligned_offset + size;
    return ptr;
}

Phase 4: Type-Safe Macros (30 minutes)

Goal: Create convenient macros for type allocations.

// Single allocation
#define arena_new(arena, type) \
    ((type*)arena_alloc_aligned((arena), sizeof(type), _Alignof(type)))

// Array allocation
#define arena_array(arena, type, count) \
    ((type*)arena_alloc_aligned((arena), sizeof(type) * (count), _Alignof(type)))

// Usage:
Player* player = arena_new(arena, Player);
int* scores = arena_array(arena, int, 100);

Phase 5: Performance Benchmark (30 minutes)

Goal: Quantify the speed advantage.

#include <time.h>

#define ITERATIONS 1000000
#define ALLOCS_PER_ITER 100
#define ALLOC_SIZE 64

void benchmark_malloc() {
    clock_t start = clock();
    void* ptrs[ALLOCS_PER_ITER];

    for (int i = 0; i < ITERATIONS; i++) {
        for (int j = 0; j < ALLOCS_PER_ITER; j++) {
            ptrs[j] = malloc(ALLOC_SIZE);
        }
        for (int j = 0; j < ALLOCS_PER_ITER; j++) {
            free(ptrs[j]);
        }
    }

    clock_t end = clock();
    printf("malloc/free: %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);
}

void benchmark_arena() {
    clock_t start = clock();
    Arena* arena = arena_create(ALLOC_SIZE * ALLOCS_PER_ITER);

    for (int i = 0; i < ITERATIONS; i++) {
        for (int j = 0; j < ALLOCS_PER_ITER; j++) {
            arena_alloc(arena, ALLOC_SIZE);
        }
        arena_reset(arena);
    }

    arena_destroy(arena);
    clock_t end = clock();
    printf("arena: %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);
}

Phase 6: Extended Features (Optional)

Temporary/Nested Arenas:

// Save current position
size_t arena_save(Arena* arena) {
    return arena->offset;
}

// Restore to saved position (invalidates all allocations since save)
void arena_restore(Arena* arena, size_t savepoint) {
    arena->offset = savepoint;
}

// Usage:
size_t saved = arena_save(arena);
// ... temporary allocations ...
arena_restore(arena, saved);  // All temp allocations gone

Growing Arenas:

typedef struct ArenaBlock {
    char* base;
    size_t capacity;
    struct ArenaBlock* next;
} ArenaBlock;

typedef struct GrowingArena {
    ArenaBlock* head;
    ArenaBlock* current;
    size_t offset;
    size_t block_size;
} GrowingArena;

// When current block is full, allocate new block and chain

Testing Strategy

Unit Tests

void test_basic_allocation() {
    Arena* arena = arena_create(1024);
    assert(arena != NULL);

    int* a = arena_alloc(arena, sizeof(int));
    assert(a != NULL);
    *a = 42;
    assert(*a == 42);

    arena_destroy(arena);
    printf("test_basic_allocation: PASSED\n");
}

void test_sequential_addresses() {
    Arena* arena = arena_create(1024);

    char* a = arena_alloc(arena, 1);
    char* b = arena_alloc(arena, 1);
    char* c = arena_alloc(arena, 1);

    // Should be sequential
    assert(b == a + 1);
    assert(c == b + 1);

    arena_destroy(arena);
    printf("test_sequential_addresses: PASSED\n");
}

void test_reset() {
    Arena* arena = arena_create(1024);

    arena_alloc(arena, 512);
    assert(arena_used(arena) == 512);

    arena_reset(arena);
    assert(arena_used(arena) == 0);
    assert(arena_remaining(arena) == 1024);

    arena_destroy(arena);
    printf("test_reset: PASSED\n");
}

void test_exhaustion() {
    Arena* arena = arena_create(100);

    void* a = arena_alloc(arena, 50);
    assert(a != NULL);

    void* b = arena_alloc(arena, 50);
    assert(b != NULL);

    // This should fail - arena exhausted
    void* c = arena_alloc(arena, 1);
    assert(c == NULL);

    arena_destroy(arena);
    printf("test_exhaustion: PASSED\n");
}

void test_alignment() {
    Arena* arena = arena_create(1024);

    // Allocate 1 byte to misalign
    arena_alloc(arena, 1);

    // Now allocate something requiring 8-byte alignment
    double* d = arena_alloc_aligned(arena, sizeof(double), 8);

    // Address should be 8-byte aligned
    assert(((uintptr_t)d % 8) == 0);

    // Should be usable
    *d = 3.14159;
    assert(*d == 3.14159);

    arena_destroy(arena);
    printf("test_alignment: PASSED\n");
}

Integration Test: Game Frame Simulation

void test_game_frame_pattern() {
    Arena* frame_arena = arena_create(1024 * 1024);  // 1MB

    for (int frame = 0; frame < 60; frame++) {
        // Simulate frame allocations
        Vector3* positions = arena_array(frame_arena, Vector3, 1000);
        Vector3* velocities = arena_array(frame_arena, Vector3, 1000);
        char* debug_buffer = arena_alloc(frame_arena, 4096);

        // "Use" the allocations
        for (int i = 0; i < 1000; i++) {
            positions[i].x = (float)i;
            velocities[i].x = (float)i * 0.1f;
        }
        snprintf(debug_buffer, 4096, "Frame %d", frame);

        // Reset for next frame
        arena_reset(frame_arena);
    }

    arena_destroy(frame_arena);
    printf("test_game_frame_pattern: PASSED (60 frames, no leaks)\n");
}

Common Pitfalls

Pitfall 1: Using Allocations After Reset

// WRONG: Using pointer after reset
Arena* arena = arena_create(1024);
int* data = arena_alloc(arena, sizeof(int) * 100);
arena_reset(arena);
data[0] = 42;  // UNDEFINED BEHAVIOR! Memory may be reused

// CORRECT: Reset invalidates ALL pointers from that arena
Arena* arena = arena_create(1024);
int* data = arena_alloc(arena, sizeof(int) * 100);
// Use data...
// When done:
data = NULL;  // Clear your pointer
arena_reset(arena);

Pitfall 2: Forgetting Alignment

// WRONG: May cause crashes on some CPUs
Arena* arena = arena_create(1024);
char c;
arena_alloc(arena, 1);  // offset = 1
double* d = arena_alloc(arena, sizeof(double));  // offset = 1, misaligned!
*d = 3.14;  // Potential bus error!

// CORRECT: Use aligned allocation
Arena* arena = arena_create(1024);
arena_alloc(arena, 1);  // offset = 1
double* d = arena_alloc_aligned(arena, sizeof(double), alignof(double));
*d = 3.14;  // Safe, properly aligned

Pitfall 3: Integer Overflow in Size Calculations

// WRONG: Possible overflow with large sizes
void* arena_alloc(Arena* arena, size_t size) {
    if (arena->offset + size > arena->capacity) {  // Can overflow!
        return NULL;
    }
    // ...
}

// CORRECT: Check for overflow
void* arena_alloc(Arena* arena, size_t size) {
    if (size > arena->capacity - arena->offset) {  // Safe check
        return NULL;
    }
    // ...
}

Pitfall 4: Not Handling mmap Failure

// WRONG: Not checking mmap return value
Arena* arena_create(size_t capacity) {
    Arena* arena = malloc(sizeof(Arena));
    arena->base = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // If mmap failed, base == MAP_FAILED, not NULL!
    arena->offset = 0;
    arena->capacity = capacity;
    return arena;
}

// CORRECT: Check for MAP_FAILED
Arena* arena_create(size_t capacity) {
    Arena* arena = malloc(sizeof(Arena));
    if (!arena) return NULL;

    arena->base = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arena->base == MAP_FAILED) {
        free(arena);
        return NULL;
    }

    arena->offset = 0;
    arena->capacity = capacity;
    return arena;
}

Pitfall 5: Using Arena for Wrong Pattern

// WRONG: Using arena when you need individual frees
Arena* arena = arena_create(1024);
void* item;
while ((item = get_next_item())) {
    void* processed = process_in_arena(arena, item);
    output(processed);
    // Can't free processed individually - arena keeps growing!
}
// Eventually arena exhausted

// CORRECT: Use arena for batch patterns only
Arena* arena = arena_create(1024);
while (has_batches()) {
    Batch* batch = get_batch();
    for (int i = 0; i < batch->count; i++) {
        void* processed = process_in_arena(arena, batch->items[i]);
        output(processed);
    }
    arena_reset(arena);  // Reset after each batch
}

Extensions and Challenges

Challenge 1: Growing Arena (Medium)

Implement an arena that automatically grows when exhausted:

  • Chain multiple blocks together
  • Keep allocating without interruption
  • Reset frees all blocks except the first
  • Measure overhead vs fixed arena

Challenge 2: Scratch/Temp Arena (Medium)

Implement nested arena scopes:

Arena* arena = arena_create(1024);
// ... some allocations ...

ArenaTemp temp = arena_begin_temp(arena);
// ... temporary allocations ...
arena_end_temp(temp);  // Everything since begin_temp is freed

// Original allocations still valid

Challenge 3: Pool Allocator (Hard)

Build a pool allocator on top of your arena:

  • Fixed-size blocks only
  • O(1) allocate AND free
  • No fragmentation
  • Free list stored in the free blocks themselves

Challenge 4: Thread-Local Arenas (Hard)

Make arenas thread-safe:

  • Option 1: One arena per thread
  • Option 2: Mutex-protected arena
  • Compare performance

Challenge 5: Debug Arena (Medium)

Add debugging features:

  • Track number of allocations
  • Track high-water mark
  • Poison freed memory (fill with 0xDE)
  • Detect use-after-reset

Real-World Applications

Game Engine Frame Allocator

// Used by many game engines (Unreal, Unity underlying systems, etc.)
void game_loop() {
    Arena* frame_arena = arena_create(16 * 1024 * 1024);  // 16MB

    while (game_running) {
        process_input(frame_arena);
        update_physics(frame_arena);
        render_frame(frame_arena);

        // All frame-specific allocations freed instantly
        arena_reset(frame_arena);
    }

    arena_destroy(frame_arena);
}

Web Server Request Handler

// Each request gets its own arena
void handle_request(Request* req) {
    Arena* request_arena = arena_create(64 * 1024);  // 64KB

    ParsedHeaders* headers = parse_headers(request_arena, req);
    JsonValue* body = parse_json(request_arena, req->body);
    Response* response = generate_response(request_arena, headers, body);

    send_response(response);

    // All request data freed at once
    arena_destroy(request_arena);
}

Compiler/Parser

// Parse phase allocates AST nodes into arena
AST* parse_file(const char* filename) {
    Arena* parse_arena = arena_create(1024 * 1024);  // 1MB

    Tokens* tokens = lex(parse_arena, filename);
    AST* ast = parse(parse_arena, tokens);

    // AST lives in arena - return it
    // Arena destroyed when AST no longer needed
    return ast;
}

Simulation/Scientific Computing

// Monte Carlo simulation with many temporary values
void run_simulation(int iterations) {
    Arena* iter_arena = arena_create(10 * 1024 * 1024);  // 10MB

    for (int i = 0; i < iterations; i++) {
        Matrix* temp1 = matrix_create(iter_arena, 1000, 1000);
        Matrix* temp2 = matrix_multiply(iter_arena, temp1, other);
        Vector* result = matrix_reduce(iter_arena, temp2);

        accumulate(result);

        // Hundreds of MB would be allocated without arena
        // With arena: just reset
        arena_reset(iter_arena);
    }

    arena_destroy(iter_arena);
}

Interview Preparation

Common Questions

  1. โ€œWhat is an arena allocator? When would you use one?โ€
    • An arena allocator pre-allocates a large block and hands out pieces sequentially
    • Use when allocations share a lifetime and can be freed together
    • Examples: per-frame, per-request, per-parse operations
  2. โ€œWhatโ€™s the time complexity of arena allocation vs malloc?โ€
    • Arena: O(1) always - just bump a pointer
    • malloc: O(1) best case, O(n) worst case (searching free lists)
    • Arena reset: O(1) - just set offset to 0
    • free: O(1) to O(log n) depending on implementation
  3. โ€œWhy canโ€™t you free individual allocations from an arena?โ€
    • No per-allocation metadata to track whatโ€™s allocated
    • Would need free list, defeating the purpose
    • Design choice: trade individual frees for speed
  4. โ€œWhat is memory alignment and why does it matter?โ€
    • CPUs require data at specific address boundaries (2, 4, 8, 16 bytes)
    • Misaligned access: slower (multiple memory reads) or crash (bus error)
    • Align formula: (addr + align - 1) & ~(align - 1)
  5. โ€œCompare arena allocators to pool allocators.โ€
    • Arena: variable-size, no individual free, bulk reset
    • Pool: fixed-size, individual free via free list, good for objects of same size
    • Arena for temporary data, pool for persistent same-sized objects
  6. โ€œIn what scenarios would an arena allocator be faster than malloc?โ€
    • Many small allocations followed by bulk free
    • Known lifetime patterns (frame, request, transaction)
    • Performance-critical paths with predictable allocation
    • When fragmentation would hurt malloc

Self-Assessment Checklist

Understanding (Can You Explain?)

  • Why syscalls are expensive and allocators batch them
  • What โ€œbump allocationโ€ means and why itโ€™s O(1)
  • The tradeoff between flexibility and performance in allocators
  • Why alignment matters and how to calculate aligned addresses
  • When arena allocation is appropriate vs general-purpose

Implementation (Can You Build?)

  • Basic arena with create/alloc/reset/destroy
  • Aligned allocation supporting power-of-2 alignments
  • Type-safe macros for allocation
  • mmap-based memory acquisition

Application (Can You Apply?)

  • Identify code that would benefit from arena allocation
  • Benchmark arena vs malloc for specific patterns
  • Design arena usage for a game frame or request handler
  • Avoid the common pitfalls (use-after-reset, alignment)

Extension (Can You Extend?)

  • Implement growing arenas
  • Add temp/scratch arena pattern
  • Build pool allocator on top of arena
  • Add debug/statistics features

Resources

Books

Topic Book Chapter
Arena design patterns โ€œC Interfaces and Implementationsโ€ by David Hanson Ch. 5-6
Memory allocator internals โ€œComputer Systems: A Programmerโ€™s Perspectiveโ€ Ch. 9.9
mmap and virtual memory โ€œThe Linux Programming Interfaceโ€ Ch. 49
Alignment requirements โ€œComputer Systems: A Programmerโ€™s Perspectiveโ€ Ch. 3.9.3
Allocator strategies โ€œOperating Systems: Three Easy Piecesโ€ Ch. 17

Online Resources

Code References


Summary

The arena allocator teaches a fundamental truth about memory management: malloc is just one solution, and often not the best one. By building your own allocator, you understand:

  1. Memory is a resource - You can manage it however you want
  2. Tradeoffs exist - Speed vs flexibility, simplicity vs features
  3. Patterns matter - Know your allocation pattern, choose the right allocator
  4. The OS is expensive - Batch syscalls, minimize kernel transitions
  5. Alignment is real - CPUs have requirements, respect them

When you can build a custom allocator for a specific use case and quantify its performance advantage, youโ€™ve moved from โ€œusingโ€ memory to โ€œmanagingโ€ memory. This is systems programming.


Next Project: P05: Exploit Lab (Buffer Overflow) - Now that you understand how memory really works, letโ€™s see what happens when it goes wrong on purpose.