Project 1: Memory Arena & Dynamic Array
Build a bump allocator and a vector-style dynamic array, then visualize how memory moves as you grow.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Language | C (Alternatives: Rust, Zig) |
| Prerequisites | Pointers, structs, basic malloc/free |
| Key Topics | Memory layout, pointer arithmetic, amortized analysis, cache locality |
1. Learning Objectives
By completing this project, you will:
- Explain how a bump allocator reserves bytes and why it is fast.
- Implement dynamic array growth with amortized O(1) append.
- Visualize contiguous memory and the effects of reallocation.
- Analyze time/space trade-offs for array resizing strategies.
2. Theoretical Foundation
2.1 Core Concepts
- Memory as a byte array: All data lives in a flat array of bytes. Types merely interpret bytes with different widths and alignments.
- Pointer arithmetic:
ptr + nmovesnelements of the pointer type, notnbytes. This underpins array indexing. - Allocation and alignment: Allocators round sizes to alignment boundaries to keep CPU access efficient.
- Amortized analysis: Resizing by doubling spreads occasional expensive copies across many cheap pushes.
- Cache locality: Contiguous arrays benefit from spatial locality, reducing cache misses.
2.2 Why This Matters
Every data structure reduces to memory layout decisions. Understanding how arrays grow and move is foundational for performance engineering, systems programming, and safe memory handling.
2.3 Historical Context / Background
Dynamic arrays are a classic structure popularized by early systems languages and later standardized in libraries (std::vector, Java ArrayList). Bump allocators are common in compilers, game engines, and embedded systems for their simplicity and speed.
2.4 Common Misconceptions
- “Realloc is always slow”: It is often fast when there is space to grow in place.
- “Arrays are always best”: Insert/delete in the middle is expensive and can dominate runtime.
- “Pointers are opaque”: They are just addresses; arithmetic is deterministic.
3. Project Specification
3.1 What You Will Build
- A bump allocator that manages a fixed-size arena.
- A dynamic array that uses the arena for backing storage.
- A visualization routine showing allocations, offsets, and array growth.
3.2 Functional Requirements
- Arena creation: Allocate a large block and track a bump offset.
- Aligned allocation: Return aligned pointers and update the offset.
- Arena reset: Reset offset to reuse memory instantly.
- Dynamic array: Support push, pop, get, and automatic growth.
- Visualization: Show current capacity, size, and memory addresses.
3.3 Non-Functional Requirements
- Performance: Append must be amortized O(1).
- Reliability: Detect out-of-memory conditions.
- Usability: Provide readable output that explains memory moves.
3.4 Example Usage / Output
$ ./memory_demo
Arena created: 1MB at 0x7f1234500000
Alloc 100 bytes -> 0x7f1234500000
Alloc 200 bytes -> 0x7f1234500064
Dynamic array:
Push 1: [1, _, _, _] size=1 cap=4
Push 5: [1,2,3,4,5,_,_,_] size=5 cap=8 (grew)
Data moved from 0x7f1234600000 to 0x7f1234700000
3.5 Real World Outcome
When you run the demo, you see a step-by-step memory story. The terminal prints the arena base address, then shows each allocation with offsets. When the array grows, the program prints the old and new data addresses so you can see the relocation. You end with a short benchmark showing how many reallocations occurred during a million pushes.
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌──────────────────┐ ┌────────────────┐
│ Arena │────▶│ Dynamic Array │────▶│ Visualizer │
│ (bump alloc) │ │ (resizable) │ │ (prints maps) │
└──────────────┘ └──────────────────┘ └────────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Arena | Owns raw memory block | Fixed size, aligned bumps |
| DynArray | Manages size/capacity | Doubling growth strategy |
| Visualizer | Explains memory behavior | Human-readable output |
4.3 Data Structures
typedef struct {
unsigned char* base;
size_t size;
size_t offset;
} Arena;
typedef struct {
int* data;
size_t size;
size_t capacity;
Arena* arena;
} DynArray;
4.4 Algorithm Overview
Key Algorithm: Dynamic Array Growth
- If
size < capacity, write in place. - Otherwise, allocate new buffer with
capacity * 2. - Copy old elements, update pointer and capacity.
Complexity Analysis:
- Time: O(1) amortized per push, O(n) on grow
- Space: O(n) storage + growth overhead
5. Implementation Guide
5.1 Development Environment Setup
clang -Wall -Wextra -O2 -g -o memory_demo src/*.c
5.2 Project Structure
project-root/
├── src/
│ ├── arena.c
│ ├── arena.h
│ ├── dynarray.c
│ ├── dynarray.h
│ └── main.c
├── tests/
│ └── test_dynarray.c
├── Makefile
└── README.md
5.3 The Core Question You’re Answering
“If memory is just bytes, how does a dynamic array turn that into fast random access while still growing safely?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Pointer arithmetic
- Why
ptr + 1moves bysizeof(*ptr) - How arrays decay to pointers
- Book: C Programming: A Modern Approach, Ch. 12
- Why
- Alignment rules
- Why misaligned reads are slower
- How to round up to 8 or 16 bytes
- Amortized analysis
- Why doubling gives O(1) average
- Book: CLRS, Ch. 17.4
- Cache locality
- Why contiguous storage is fast
- How cache lines work
5.5 Questions to Guide Your Design
Before implementing, think through these:
- How will you ensure allocations are aligned?
- What happens if the arena is full?
- Should array growth use
*2or*1.5? - How will you display address movement clearly?
5.6 Thinking Exercise
Trace a Growth Event
Manually trace capacity doubling for pushes 1..9 starting with capacity 4. Write down how many elements are copied during each resize and compute total copies.
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “Why is
pushamortized O(1) in a dynamic array?” - “How does alignment affect allocator design?”
- “What is the difference between
mallocand a bump allocator?” - “When is a dynamic array a poor choice?”
5.8 Hints in Layers
Hint 1: Start with a fixed buffer
Allocate a large malloc block and hand out chunks from it.
Hint 2: Track alignment
Round size to a multiple of 8 before incrementing offset.
Hint 3: Log moves Print old and new data addresses each time the array grows.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Memory layout | Computer Systems: A Programmer’s Perspective | Ch. 9 |
| Amortized analysis | Introduction to Algorithms (CLRS) | Ch. 17.4 |
| Pointer fundamentals | C Programming: A Modern Approach | Ch. 12 |
5.10 Implementation Phases
Phase 1: Foundation (2 days)
Goals:
- Implement arena create/alloc/reset
- Verify alignment and bounds
Tasks:
- Write
arena_createandarena_alloc. - Add a small unit test to validate alignment.
Checkpoint: Allocations return increasing, aligned addresses.
Phase 2: Core Functionality (3 days)
Goals:
- Implement dynamic array create/push/get
- Handle growth
Tasks:
- Implement
dynarray_pushwith doubling. - Add visualization of size/capacity.
Checkpoint: Pushing 1..9 shows capacity growth at 5.
Phase 3: Polish & Edge Cases (2 days)
Goals:
- Add pop and bounds checks
- Add benchmark output
Tasks:
- Implement
dynarray_popand error paths. - Add a loop test for 1,000,000 pushes.
Checkpoint: All tests pass; benchmark prints reallocations.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Growth factor | 1.5x, 2x | 2x | Simple and standard amortized proof |
| Alignment | 4, 8, 16 | 8 | Good default for 64-bit systems |
| Arena size | Fixed, growable | Fixed | Keep allocator simple for learning |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Validate operations | push/pop, get bounds |
| Integration Tests | Arena + array together | array using arena storage |
| Edge Case Tests | Failure conditions | arena full, zero capacity |
6.2 Critical Test Cases
- Grow test: push 1..9 with capacity 4, verify size/capacity.
- Bounds test: get index == size returns error.
- Arena reset: allocations after reset reuse base.
6.3 Test Data
Push: [1..1000]
Pop: 1000..1
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing alignment | Crashes or slow access | Round sizes to alignment |
| Forgetting to copy | Data loss after growth | Use memcpy correctly |
| Off-by-one size | Last element corrupted | Update size after write |
7.2 Debugging Strategies
- Use
printfto log addresses and sizes at each step. - Run under
valgrindorasanto catch overflows.
7.3 Performance Traps
- Growing by +1 each time yields O(n^2) total work.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add
dynarray_insertat index. - Add
dynarray_removewith shift.
8.2 Intermediate Extensions
- Implement
realloc-style growth in place when possible. - Add a generic
void*array with element size.
8.3 Advanced Extensions
- Add memory profiling counters.
- Implement a free-list allocator instead of bump.
9. Real-World Connections
9.1 Industry Applications
- Game engines: arena allocators for frame memory.
- Standard libraries: vector/arraylist implementations.
9.2 Related Open Source Projects
- jemalloc: production allocator design patterns.
- Lua: uses arenas and arrays for fast tables.
9.3 Interview Relevance
- Amortized analysis and cache locality are common interview topics.
10. Resources
10.1 Essential Reading
- Computer Systems: A Programmer’s Perspective by Bryant & O’Hallaron - Ch. 9
- Introduction to Algorithms (CLRS) - Ch. 17.4
10.2 Video Resources
- “Memory Allocation Strategies” - university systems lectures
10.3 Tools & Documentation
- AddressSanitizer: https://clang.llvm.org/docs/AddressSanitizer.html
10.4 Related Projects in This Series
- Next Project: Linked List Toolkit
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how a bump allocator works.
- I can justify why doubling yields amortized O(1).
- I can describe cache locality benefits.
11.2 Implementation
- All functional requirements are met.
- All tests pass.
- Edge cases are handled.
11.3 Growth
- I can describe trade-offs vs linked lists.
- I documented lessons learned.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Arena allocates aligned blocks and resets.
- Dynamic array supports push/get and growth.
- Output clearly shows memory changes.
Full Completion:
- All tests plus benchmark output.
- Visualization covers both arena and array.
Excellence (Going Above & Beyond):
- Generic element sizes.
- Additional allocator strategy implemented.
This guide was generated from DATA_STRUCTURES_FROM_FIRST_PRINCIPLES.md. For the complete learning path, see the parent directory README.