Project 1: Build a Simple Arena Allocator
Build a safe, ergonomic bump arena that returns references tied to the arena lifetime while using unsafe internals correctly.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Main Programming Language | Rust (Alternatives: C++ for comparison) |
| Alternative Programming Languages | C, C++ |
| Coolness Level | High |
| Business Potential | Medium |
| Prerequisites | Ownership basics, lifetimes, raw pointers, alignment fundamentals |
| Key Topics | Arena allocation, lifetimes, Drop, unsafe invariants, alignment |
1. Learning Objectives
By completing this project, you will:
- Design a safe Rust API that exposes lifetime-tied references to arena-allocated values.
- Implement a bump allocator with correct alignment and layout handling.
- Document unsafe invariants and prove why the public API cannot create dangling references.
- Benchmark arena allocation performance vs
Box<T>and explain the trade-offs.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Ownership, Drop, and Lifetime Ties
Fundamentals
Ownership in Rust is the rule that each value has a single responsible owner, and that owner determines when the value is dropped. An arena allocator intentionally shifts the lifetime boundary: instead of each allocation owning its memory, a single arena owns an entire region. That means individual values do not get dropped independently; the arena’s Drop implementation becomes the only cleanup point. Lifetimes are the compile-time proof that references can never outlive their backing storage. When your arena returns &'a T, the 'a must be tied to a borrow of the arena itself, so references can never survive the arena’s own scope. This combination of ownership + Drop + lifetime ties is what lets Rust make a bump allocator safe even though it internally uses raw pointers.
Deep Dive into the concept
The power of arena allocation is that it compresses many per-object lifetimes into a single shared lifetime. In Rust, that means you deliberately avoid per-item Drop and instead tie all allocations to the arena’s lifetime. The borrow checker cannot see your raw pointer arithmetic, so it relies entirely on the lifetime relations you encode in the public API. The typical pattern is fn alloc<'a, T>(&'a self, value: T) -> &'a T, which states: if you can borrow the arena for 'a, then you can also borrow the resulting T for the same 'a. This “lifetime tethering” is the key proof that references never outlive the arena. The compiler’s job is to ensure the arena borrow ends before the reference is used, which it can do using non-lexical lifetimes.
Drop is the second pillar. Because items in the arena are not individually dropped, you must define the arena’s safety boundary. Either you constrain what can be allocated (e.g., Copy types only) or you explicitly track destructors and run them on drop. A minimal arena might support any T but ignore destructors, which is unsound for types that own resources. The safe design is either: (a) accept only T: Copy or T: 'static and use MaybeUninit<T> plus a drop list, or (b) store a list of fn(*mut u8) destructors that you invoke during arena drop. This is why a “safe API” is not only about lifetimes; it is also about guaranteeing resources are cleaned exactly once.
The borrow checker models a reference as a loan. When you call alloc, you create a loan against the arena. The arena must stay valid for the duration of that loan. This is why a method with &mut self can often be easier: the exclusive borrow proves no aliasing of the arena occurs while you mutate the bump pointer, which makes internal reasoning simpler. The trade-off is ergonomics: &mut self makes it harder to allocate from multiple parts of your program concurrently. Many arena designs keep &mut self to avoid interior mutability, while others use Cell/UnsafeCell to allow &self allocation. The safest first version uses &mut self and then explicitly refactors to &self with UnsafeCell and careful invariants.
Lifetime ties also interact with drop-check. If you store references inside values that live in the arena, the compiler must ensure those references do not outlive their referents. Because the arena drops all at once, drop-check will ensure you do not create self-referential cycles that would require one item to be dropped before another. In practice, this is why arena-allocated graphs are common: you can store references between nodes without worrying about per-node drop order, but you must still guarantee that no node references an external value with a shorter lifetime.
The “arena as owner” mental model makes API design concrete. If the arena owns all items, then it must decide which borrowing rules are exposed. This leads to clear API questions: do you allow alloc while references exist? If yes, you must guarantee allocations never move existing items. This is why arenas allocate in a contiguous buffer and never reallocate or shrink it. This constraint is a memory-safety invariant: once a reference is handed out, its address cannot change. In Rust, moving a value is effectively changing its address; your arena must ensure that no internal reallocation moves previously allocated values.
Finally, think about what the compiler can and cannot prove. The compiler can prove the lifetime relation between &self and &T, but it cannot prove your pointer math. You must provide that proof in prose, with invariants. This is the essence of unsafe Rust: the compiler enforces your stated contracts, but you must supply the contract and then obey it. A good arena API is therefore a teaching tool for the borrow checker itself: it demonstrates how to build safe abstractions by tying unsafe operations to strict lifetime and ownership boundaries.
How this fit on projects
This concept is the spine of the arena allocator. You will apply it in §3.1 (defining what the arena owns), §4.2 (deciding &mut self vs &self), and §5.3 (the core question about lifetime proof). It also prepares you for Project 9 (connection pool) where Drop returns resources, and Project 10 (capstone) where you must design a safe API around unsafe internals.
Definitions & key terms
- Owner: The binding responsible for a value’s cleanup.
- Drop: The destructor logic run when a value goes out of scope.
- Lifetime tie: A function signature that relates the lifetime of a reference to another borrow.
- Loan: A conceptual borrow tracked by the compiler.
- Drop-check: Compiler rules that ensure references are valid during destruction.
Mental model diagram (ASCII)
Arena (owner)
|
| alloc(&self) -> &'a T
v
[buffer memory] --stable addresses--> references
|
v
Drop(Arena) => all references must be dead
How it works (step-by-step)
- You create an arena that owns a contiguous buffer.
- Calling
allocborrows the arena and returns a reference tied to that borrow. - The arena updates a bump pointer to reserve space; existing items never move.
- The compiler enforces that references cannot outlive the arena borrow.
- On
Drop, the arena reclaims all memory at once, and references cannot exist anymore.
Minimal concrete example
fn alloc<'a, T>(arena: &'a Arena, value: T) -> &'a T {
// unsafe: write to buffer, return reference
unimplemented!()
}
Common misconceptions
- “If it compiles, the unsafe code is correct.” It only means the lifetime ties are correct, not the pointer logic.
- “Arena allocation is always safe.” It is safe only if you never move items after handing out references.
- “Drop doesn’t matter because we free the whole arena.” It matters for resource-owning types.
Check-your-understanding questions
- Why must
allocreturn a reference tied to the arena’s lifetime? - What happens if the arena reallocates its buffer after references were handed out?
- Why is
Dropstill important even if the arena frees everything at once?
Check-your-understanding answers
- Without the tie, references could outlive the arena, leading to dangling references.
- Reallocation moves existing items, invalidating references and causing UB.
- Resource-owning types must be dropped to release OS resources; otherwise, you leak.
Real-world applications
- Parser arenas in compilers for AST nodes.
- Query planners in databases that allocate short-lived structures.
- High-performance request handlers that want fast allocation without per-item frees.
Where you’ll apply it
- This project: §3.1 (scope of ownership), §5.4 (concept prerequisites), §5.10 (phase 1 invariants).
- Also used in: Project 9: Connection Pool, Project 10: Capstone.
References
- “The Rust Programming Language” (TRPL), Ch. 4, 10, 15.
- “Rustonomicon”, chapters on ownership and drop-check.
- “Programming Rust” (Blandy & Orendorff), memory management sections.
Key insights
A safe arena is not about clever pointer math; it is about a lifetime contract that the compiler can enforce.
Summary
Ownership, Drop, and lifetime ties are the proof system for arenas: one owner, one drop point, many safe references.
Homework/Exercises to practice the concept
- Write a tiny
ScopeLoggertype that logs when it drops. Use it to observe drop order. - Implement a function
fn borrow_twice<'a>(x: &'a i32) -> (&'a i32, &'a i32)and explain why it’s safe. - Modify a struct to include a reference to another field and observe the compiler error.
Solutions to the homework/exercises
- Implement
DroponScopeLoggerand print indrop(). Create multiple instances in a scope. - The borrow is shared, so multiple immutable references are allowed; the lifetime tie is legal.
- Rust forbids self-references because moving the struct would invalidate the reference.
2.2 Raw Pointers, Alignment, and Layout
Fundamentals
Raw pointers (*const T, *mut T) bypass Rust’s borrow checker. They allow direct memory reads and writes, which is exactly what an arena allocator needs. But they also remove safety checks. To use raw pointers safely, you must obey alignment rules and ensure you never read uninitialized memory. Alignment means that a type T must be stored at an address divisible by align_of::<T>(). Layout describes the size and alignment for a type and is used to compute offsets inside the arena. A correct arena allocator uses Layout to compute a properly aligned offset, writes the value with ptr::write, and returns a reference with the correct lifetime.
Deep Dive into the concept
Alignment is the quiet, critical rule that determines whether your program is correct or subtly broken. CPUs often require aligned access for performance and sometimes for correctness. Rust enforces alignment in safe code, but in unsafe code you must do it manually. When implementing a bump allocator, you maintain a current offset into a byte buffer. Each allocation request for T must move that offset forward to the next address that satisfies align_of::<T>(). The canonical helper is align_up(offset, align) = (offset + align - 1) & !(align - 1) for power-of-two alignments. Rust’s Layout provides size() and align() and can be combined using Layout::extend to compute struct layouts.
You must also think about initialization. A raw buffer like Vec<u8> contains uninitialized bytes, and reading them is undefined behavior. The safe pattern is to reserve space, then write the value into that space using ptr::write. This avoids dropping uninitialized memory. If you later want to free, you must drop the values in place, typically by storing destructor callbacks or using MaybeUninit<T> to represent uninitialized slots. Many bump allocators ignore per-item destructors for performance, but in Rust that can only be safe if you constrain T or explicitly document the leak.
Pointer arithmetic is another source of danger. ptr.add(n) performs byte-wise (or element-wise) arithmetic but must stay within the allocated object. In an arena, your allocated object is the entire buffer. The correct pattern is to get base = buffer.as_mut_ptr(), compute ptr = base.add(offset) in bytes, and then cast to *mut T. This is only valid if the resulting pointer is aligned and the buffer is large enough. A classic invariant is: offset + size_of::<T>() <= capacity. If you violate it, you will overwrite memory you do not own, which is immediate UB.
Alignment interacts with padding. Suppose you allocate a u8 then a u64. The u64 must be aligned to 8 bytes, so the arena must insert 7 bytes of padding. This means that naive bump allocation wastes some space, but it preserves correctness. Layout is the official way to compute these alignments, and using Layout in your allocator keeps you aligned with Rust’s ABI rules. If you design your own layout logic, you must match Rust’s repr(Rust) or repr(C) requirements. For a generic arena, you should not assume layout; you should always use Layout.
Another key detail is provenance and aliasing. Rust’s memory model treats pointers as having provenance: they are derived from a specific allocation. When you cast a *mut u8 into *mut T, you must ensure that pointer came from your arena’s allocation. This is typically fine, but it means you should not create fake pointers or offset from an arbitrary integer. Stick to as_mut_ptr + add to preserve provenance.
Finally, you must reason about concurrency. If you allow &self allocation with interior mutability, multiple threads might allocate concurrently. Without synchronization, this is a data race even though it is inside unsafe code. For the initial project, keep the arena single-threaded: use &mut self and document that Arena is not Sync. Later, if you extend to multi-threaded allocation, you need atomic bump pointers or per-thread sub-arenas.
The heart of this concept is that alignment and layout are not optional details; they are the proof obligations that keep your unsafe code sound. If you get alignment wrong, your allocator will appear to work but will be undefined behavior. By explicitly using Layout and documenting the invariants, you turn these low-level rules into a verifiable engineering discipline.
How this fit on projects
You will apply this in §3.5 (data formats/layout), §4.4 (data structures), and §5.10 Phase 1 where the bump logic is implemented. This is also reused in Project 8 (lock-free stack node allocation) and Project 7 (self-referential structs with stable addresses).
Definitions & key terms
- Alignment: Required address multiple for a type.
- Layout: Size + alignment information for a type.
- Provenance: The allocation origin of a pointer.
- Padding: Extra bytes inserted to satisfy alignment.
- MaybeUninit: Wrapper for uninitialized memory.
Mental model diagram (ASCII)
Buffer: [..........64 bytes..........]
Offset: ^
Alloc u8: [1]
Align u64: [1][pad][u64........]
How it works (step-by-step)
- Compute
LayoutforT. - Align the bump pointer up to
layout.align(). - Ensure there is enough capacity for
layout.size(). - Write
Tinto the aligned slot withptr::write. - Return a reference to the initialized memory.
Minimal concrete example
let layout = Layout::new::<u64>();
let aligned = align_up(offset, layout.align());
let ptr = unsafe { base.add(aligned) as *mut u64 };
unsafe { ptr::write(ptr, 123); }
Common misconceptions
- “Alignment is only for performance.” It is required for correctness on many platforms.
- “
Vec<u8>is always safe to cast.” Only if you respect alignment and bounds. - “Uninitialized memory can be read as zero.” Reading uninitialized memory is UB.
Check-your-understanding questions
- Why do you need padding between allocations of different types?
- What does
Layoutguarantee that manual size arithmetic might miss? - Why is
ptr::writepreferred over*ptr = valuein uninitialized memory?
Check-your-understanding answers
- To ensure the next allocation begins at an address aligned for its type.
- It encodes Rust’s alignment rules and prevents UB due to misalignment.
ptr::writeavoids reading the old uninitialized value and does not drop it.
Real-world applications
- Bump allocators in compilers and game engines.
- Custom allocators in embedded systems with no OS heap.
- Memory pools in high-performance servers.
Where you’ll apply it
- This project: §4.4 (data structures), §5.10 Phase 2, §6.2 (critical tests).
- Also used in: Project 8: Lock-Free Stack, Project 7: Pin Self-Referential.
References
- Rustonomicon: “Uninitialized Memory” and “Raw Pointers”.
- “Programming Rust” memory management chapter.
- “Computer Systems: A Programmer’s Perspective” memory layout sections.
Key insights
Correct alignment is the difference between “seems to work” and “provably safe.”
Summary
Raw pointers are powerful but dangerous; alignment and layout are the rules that make them safe.
Homework/Exercises to practice the concept
- Compute alignment for a sequence of types by hand and compare to
Layout. - Write a helper
align_upand unit-test it for multiple alignments. - Use
MaybeUninit<[u8; N]>and practice safe initialization patterns.
Solutions to the homework/exercises
- Use
size_ofandalign_ofplus padding rules; verify withstd::alloc::Layout. - Test with alignments 1, 2, 4, 8, 16 and random offsets.
- Initialize each element explicitly, then
assume_initonce fully written.
2.3 Arena Allocation Strategy and Safety Invariants
Fundamentals
An arena allocator is a region-based allocator that hands out memory linearly and frees it all at once. The simplest implementation is a bump allocator: it keeps a pointer (or offset) that always moves forward. The safety story depends on two invariants: (1) previously allocated objects never move, and (2) references returned from the arena are always tied to the arena’s lifetime. This makes arenas ideal for workloads with many short-lived allocations, such as parsing or query planning. In Rust, the arena’s public API enforces lifetimes, while the unsafe code enforces memory bounds and non-moving guarantees.
Deep Dive into the concept
Arena allocation is a deliberate trade-off: you gain speed and simplicity at the cost of per-object deallocation. Instead of pairing each alloc with a free, you pair many allocs with a single drop of the arena. This can eliminate fragmentation and the overhead of the general-purpose allocator. The bump allocator variant is especially fast because allocation is just pointer arithmetic. In Rust, that speed is only safe if the arena never shrinks or reallocates in a way that moves earlier allocations. If you store the buffer in a Vec<u8>, growing it can reallocate and move the underlying memory. That would invalidate all existing references and violate your safety contract.
To avoid this, you must either pre-allocate the full buffer or use a structure that never moves existing chunks. One common design is a “chunked arena”: a vector of fixed-size blocks, where each block is a separate allocation. When a block fills, you allocate a new block and never touch the old ones. This preserves stable addresses while allowing growth. The cost is extra indirection and potentially less cache locality, but it keeps references valid. For this project, you can choose a fixed-capacity arena to keep the logic simpler, then include an extension for chunked growth.
A central design decision is whether to allow allocations via &self (shared) or &mut self (exclusive). If you require &mut self, you avoid interior mutability and ensure that only one allocation can happen at a time. That simplifies reasoning about the bump pointer but reduces ergonomics. If you allow &self, you must use Cell or UnsafeCell to mutate the bump pointer, and you must document the invariant that concurrent access is either forbidden or synchronized. This is why a non-thread-safe arena typically does not implement Sync and only offers &self allocation in single-threaded contexts.
Another subtlety is the treatment of destructors. A fully safe arena should either restrict T to Copy/Pod types or track destructors. A common pattern is to store a list of drop functions: when alloc stores a T, it also stores a pointer to a function that will drop a T at that address. On arena drop, you iterate that list and call each destructor. This is effectively manual drop order, and you must decide whether to drop in allocation order or reverse order. Dropping in reverse allocation order is usually safer because it mirrors stack-like lifetimes, but in an arena this is not always necessary. If your values can reference each other, drop order might matter; documenting the rule makes the API honest.
Safety invariants must be explicit. For a minimal bump arena, the invariants are: (a) the buffer is large enough for all allocations, (b) the bump pointer never decreases or exceeds capacity, (c) any reference returned points to initialized memory, (d) the buffer’s allocation is never moved, and (e) the arena’s lifetime outlives all returned references. These are the rules that make the unsafe code safe. Your documentation should state them, and your tests should attempt to violate them to ensure the compiler or runtime prevents it.
Arena allocation is also a lesson in API design. A well-designed arena might expose alloc (by value), alloc_with (by closure), and alloc_slice to reduce per-item overhead. The API should also communicate capacity: if the arena is full, do you return None, Result, or panic? For a learning project, prefer Result with a custom error type so you can show failure demos and exit codes.
Finally, consider how arenas interact with borrowing rules in collections. If you allocate nodes in an arena and then store references to them in a graph, you get cheap stable references but you also must ensure the arena outlives the graph. This is why arenas are often stored at the top-level of a data structure. The arena owns the memory, and the data structure owns references into it. Rust makes that relationship explicit in types and lifetimes, which is exactly the skill this project is meant to build.
How this fit on projects
You will apply these invariants in §3.2 (requirements), §4.1 (architecture), and §5.10 (phase milestones). The chunked arena extension connects directly to Project 3 (graph) and Project 5 (string interner).
Definitions & key terms
- Arena allocator: Allocator that frees all allocations together.
- Bump allocator: Linear allocation with a monotonic pointer.
- Chunked arena: Multiple fixed-size blocks to allow growth.
- Invariant: A rule that must always hold for safety.
- Stable address: Guarantee that a value’s memory address never changes.
Mental model diagram (ASCII)
Arena
+---------------------------+
| block0 | block1 | block2 |
+---------------------------+
^ bump
Alloc -> [T][pad][U][pad][V]
Drop -> free all blocks
How it works (step-by-step)
- Initialize arena with a fixed buffer or the first chunk.
- On
alloc, align and reserve space, then write the value. - If the buffer is full, return an error or allocate a new chunk.
- Return a reference tied to the arena lifetime.
- On
Drop, optionally run destructors, then free memory.
Minimal concrete example
pub fn alloc<'a, T>(&'a mut self, value: T) -> Result<&'a T, AllocError> {
// bump pointer, write value, return reference
}
Common misconceptions
- “Arena allocators are always faster.” They are faster for many small allocations but can waste memory.
- “You can grow a
Vec<u8>without moving.” Reallocation can move the buffer. - “Dropping an arena is enough for all types.” Not for types that manage external resources.
Check-your-understanding questions
- Why does a bump allocator need stable addresses?
- What are the trade-offs of chunked arenas vs a fixed buffer?
- How do destructor lists change the safety profile of the arena?
Check-your-understanding answers
- References rely on stable addresses; moving allocations invalidates them.
- Chunked arenas allow growth but add indirection and possible fragmentation.
- They make it safe to store non-
Copytypes by ensuring proper drops.
Real-world applications
- AST node storage in compilers.
- Game engine ECS archetype storage.
- Query plan nodes in databases.
Where you’ll apply it
- This project: §3.6 (edge cases), §4.1 (design), §5.10 Phase 2-3.
- Also used in: Project 3: Graph Data Structure, Project 5: String Interning.
References
- “Rustonomicon” (Unsafe code guidelines).
- “Programming Rust” arena patterns.
- “Bumpalo” crate docs (as inspiration).
Key insights
Arenas are safe when you can prove two things: no moves, and no references outlive the arena.
Summary
Arena allocation is a controlled lifetime trade-off: speed and simplicity in exchange for bulk deallocation.
Homework/Exercises to practice the concept
- Design an arena API that supports both
allocandalloc_slice. - Create a mock arena that only accepts
Copytypes and explain why it is easier. - Write a test that proves stable addresses across multiple allocations.
Solutions to the homework/exercises
- Use separate methods with
Layoutcalculations for slices; return&'a [T]. - No destructor tracking needed because
Copytypes do not own resources. - Allocate two values and compare their addresses; ensure they never change.
3. Project Specification
3.1 What You Will Build
A Rust crate arena_lab that implements a bump arena allocator with a safe API. The arena allocates values from a contiguous buffer, returns references tied to the arena’s lifetime, and frees all allocations at once when the arena is dropped. The crate includes a CLI demo and benchmarks.
Included:
Arenatype withnew(capacity),alloc, andreset(optional).- Error handling for out-of-space conditions.
- Optional destructor list to safely store non-
Copytypes. - Benchmarks comparing
ArenavsBox<T>.
Excluded:
- A general-purpose allocator replacement.
- Thread-safe allocation (single-threaded only).
- Integration with
GlobalAlloc.
3.2 Functional Requirements
- Safe allocation:
allocreturns a reference tied to the arena lifetime. - Alignment correctness: allocations are aligned for each type
T. - Capacity enforcement: allocations beyond capacity return an error.
- Drop behavior: arena frees all memory on drop; if destructors are tracked, they run exactly once.
- Demo program: CLI example demonstrates allocation and lifetime safety.
3.3 Non-Functional Requirements
- Performance: allocation should be O(1) with minimal overhead.
- Reliability: unsafe blocks are minimal and fully documented.
- Usability: API is small, clear, and has examples in docs.
3.4 Example Usage / Output
$ cargo run --example arena_demo
arena: capacity=64MB, used=48MB
allocations: 1_000_000
elapsed: 22.4ms
attempting use-after-drop...
error[E0597]: `arena` does not live long enough
3.5 Data Formats / Schemas / Protocols
- Arena buffer: contiguous
Vec<u8>or chunk list ofVec<u8>. - Allocation metadata:
offset: usize,capacity: usize. - Optional drop list:
Vec<fn(*mut u8)>with parallelVec<*mut u8>.
3.6 Edge Cases
- Allocation of zero-sized types.
- Allocation that exactly fills capacity.
- Misalignment if padding is incorrect.
- Dropping arena while references exist (must fail to compile).
3.7 Real World Outcome
You will have a working crate and a reproducible CLI demo.
3.7.1 How to Run (Copy/Paste)
cargo run --example arena_demo
cargo bench
3.7.2 Golden Path Demo (Deterministic)
- Use fixed seed for random sizes (
--seed 42). - Allocate 1,000,000
u32values in a 64MB arena. - Expect used bytes to be deterministic and benchmark to match within tolerance.
3.7.3 CLI Transcript (Success)
$ cargo run --example arena_demo -- --seed 42 --count 1000000 --capacity-mb 64
arena: capacity=64MB, used=48MB
allocations: 1_000_000
elapsed: 22.4ms
exit code: 0
3.7.4 Failure Demo (Bad Input)
$ cargo run --example arena_demo -- --capacity-mb 1 --count 200000
error: OutOfSpace { requested: 16, remaining: 8 }
exit code: 2
4. Solution Architecture
4.1 High-Level Design
+-------------------+ +------------------------+
| Arena (public API)| ----> | ArenaInner (unsafe) |
+-------------------+ +------------------------+
| |
v v
alloc(&self) buffer + bump
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
Arena |
Safe API, lifetimes | Expose alloc tied to &self/&mut self |
ArenaInner |
Bump pointer + buffer | Use Vec<u8> and Layout |
DropList |
Destructor tracking | Optional for non-Copy types |
arena_demo |
CLI demo | Deterministic output and error cases |
4.4 Data Structures (No Full Code)
struct Arena {
inner: UnsafeCell<ArenaInner>,
// optional: drop list
}
struct ArenaInner {
buf: Vec<u8>,
offset: usize,
}
4.4 Algorithm Overview
Key Algorithm: Bump Allocation
- Compute layout for
T. - Align
offsetto layout alignment. - Check capacity.
- Write value into buffer.
- Return reference.
Complexity Analysis:
- Time: O(1) per allocation
- Space: O(n) for allocated bytes
5. Implementation Guide
5.1 Development Environment Setup
rustup default stable
cargo new arena_lab
cd arena_lab
cargo add thiserror
5.2 Project Structure
arena_lab/
├── src/
│ ├── lib.rs
│ ├── arena.rs
│ └── errors.rs
├── examples/
│ └── arena_demo.rs
├── benches/
│ └── arena_bench.rs
└── README.md
5.3 The Core Question You’re Answering
“How can I return
&Tfrom an allocator without ever allowing a dangling reference, even though the allocator uses unsafe pointers?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Lifetime ties in function signatures (TRPL Ch. 10)
- Alignment and
Layout(std::alloc::Layoutdocs) - Unsafe invariants (Rustonomicon “Unsafe”)
5.5 Questions to Guide Your Design
- Should
alloctake&mut selfor&self? - Will the arena support non-
Copytypes, and if so, how will you drop them? - How will you signal out-of-space errors?
- How will you prevent buffer reallocation from moving memory?
5.6 Thinking Exercise
Sketch the arena state before and after three allocations of different alignments. Mark the padding bytes.
5.7 The Interview Questions They’ll Ask
- “How does Rust’s borrow checker make an arena allocator safe?”
- “Why is alignment important in a bump allocator?”
- “How do you ensure non-
Copytypes are dropped?”
5.8 Hints in Layers
Hint 1: Start with &mut self to avoid interior mutability.
Hint 2: Use Layout and align_up helper.
Hint 3: If supporting non-Copy, store drop callbacks.
Hint 4: Add a CLI example for deterministic output.
5.9 Books That Will Help
| Topic | Book | Chapter | |—|—|—| | Ownership + lifetimes | TRPL | Ch. 4, 10 | | Unsafe + raw pointers | Rustonomicon | “Unsafe”, “Raw Pointers” | | Allocation patterns | Programming Rust | Memory management sections |
5.10 Implementation Phases
Phase 1: Foundation (2-3 days)
Goals:
- Implement
ArenaInnerwith bump pointer. - Add
align_upand capacity checks.
Tasks:
- Create
ArenaInnerwith buffer and offset. - Implement
alloc_raw(layout)that returns*mut u8.
Checkpoint: Unit tests pass for alignment and bounds.
Phase 2: Core Functionality (3-5 days)
Goals:
- Implement
alloc<T>and lifetime ties. - Add error handling and demo.
Tasks:
- Implement
alloc<T>returning&'a T. - Add
AllocErrorand error mapping.
Checkpoint: Example demo compiles and runs deterministically.
Phase 3: Polish & Edge Cases (2-4 days)
Goals:
- Optional destructor list.
- Benchmarks and docs.
Tasks:
- Add drop list if needed.
- Write benchmarks comparing
Boxand arena.
Checkpoint: Benchmarks show expected allocation speedups.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|—|—|—|—|
| Allocation API | &mut self vs &self | Start with &mut self | Simpler invariants |
| Destructor support | none vs drop list | Include drop list | Safe for non-Copy |
| Growth strategy | fixed vs chunked | Fixed for v1 | Simpler proofs |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|—|—|—|
| Unit Tests | Validate alignment, bounds | alloc with u8, u64 |
| Integration Tests | CLI demo success/failure | arena_demo |
| Edge Case Tests | Zero-sized types, full capacity | alloc::<()>() |
6.2 Critical Test Cases
- Alignment correctness:
alloc<u64>afteralloc<u8>returns aligned address. - Capacity enforcement: allocation beyond capacity returns
AllocError. - Lifetime safety: compile-fail test for use-after-drop.
6.3 Test Data
capacity=64MB
count=1_000_000
seed=42
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|—|—|—|
| Forgetting alignment | Crash or UB | Use Layout + align_up |
| Reallocating buffer | Dangling references | Pre-allocate or chunk |
| Not handling Drop | Resource leaks | Track destructors |
7.2 Debugging Strategies
- Use Miri to catch UB in unsafe code.
- Log offsets for each allocation to validate padding.
7.3 Performance Traps
- Overly small buffer causes frequent allocation errors.
- Excessive drop-list overhead can reduce performance.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add
reset()to reuse the arena without freeing memory. - Add
alloc_slicefor contiguous arrays.
8.2 Intermediate Extensions
- Implement chunked growth strategy.
- Add optional per-item destructor tracking.
8.3 Advanced Extensions
- Implement a thread-safe arena with atomic bump pointer.
- Integrate with
GlobalAllocfor custom allocation.
9. Real-World Connections
9.1 Industry Applications
- Compilers: AST node allocation with fast teardown.
- Databases: Query planning structures built per-query.
9.2 Related Open Source Projects
- bumpalo: High-quality Rust bump arena implementation.
- typed-arena: Typed arena with lifetime-safe API.
9.3 Interview Relevance
- Ownership and lifetimes in resource management.
- Safe API design around unsafe internals.
10. Resources
10.1 Essential Reading
- The Rust Programming Language by Klabnik & Nichols - Ch. 4, 10, 15.
- Rustonomicon - “Unsafe”, “Raw Pointers”.
10.2 Video Resources
- “Rust Ownership and Lifetimes” - RustConf talks.
10.3 Tools & Documentation
- Miri: UB detection.
- Criterion: Benchmarking.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain why
allocreturns&'a Ttied to&'a self. - I can justify each unsafe block with invariants.
- I can explain alignment and padding rules.
11.2 Implementation
- All functional requirements are met.
- Benchmarks run and show expected performance.
- Edge cases are handled.
11.3 Growth
- I can explain arena design in an interview.
- I can identify at least one improvement to this design.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Arena allocates values with correct lifetimes.
- CLI demo shows deterministic output and failure case.
- Alignment tests pass.
Full Completion:
- Includes destructor tracking or clear constraints for
T. - Benchmarks compare arena vs
Box.
Excellence (Going Above & Beyond):
- Chunked arena growth implemented.
- Thread-safe extension or
GlobalAllocintegration.