Project 5: ARM Memory Allocator (malloc from Scratch)
Build a custom heap memory allocator for bare-metal ARM systems, implementing malloc(), free(), and realloc() with proper alignment, coalescing, and fragmentation handling.
Quick Reference
| Attribute | Value |
|---|---|
| Language | C (alt: Rust, ARM Assembly) |
| Difficulty | Advanced |
| Time | 1-2 weeks |
| Coolness | ★★★☆☆ Genuinely Clever |
| Portfolio Value | Resume Gold |
| Prerequisites | Project 3 (bare-metal environment), strong C pointer skills |
| Key Topics | Memory management, alignment, free lists, coalescing |
Learning Objectives
By completing this project, you will:
- Understand heap memory management: Implement the fundamental algorithms behind dynamic memory allocation
- Master ARM alignment requirements: Handle 4-byte (ARM32) and 8-byte (AArch64) alignment constraints correctly
- Build free list data structures: Design and implement linked lists within raw memory regions
- Implement block coalescing: Merge adjacent free blocks to combat memory fragmentation
- Debug memory corruption: Detect and diagnose common heap bugs using canary values and validation
- Work without standard library: Build malloc without sbrk(), using fixed memory pools
The Core Question You’re Answering
“How does malloc actually work, and why is memory alignment critical on ARM processors?”
This question forces you to understand that malloc is not magic—it’s a data structure (free list) and a set of algorithms (first-fit, best-fit, coalescing) operating on raw memory. On ARM, alignment violations can cause bus errors or severe performance penalties, making correct implementation essential for bare-metal systems.
Concepts You Must Understand First
Before starting this project, verify your understanding:
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Pointer arithmetic in C | You’ll be navigating raw memory | K&R Chapter 5, CS:APP 3.8 |
| Linked list implementation | Free lists are linked lists in memory | Any data structures course |
| Memory layout (stack, heap, data, text) | You need to know where your heap lives | CS:APP Chapter 9 |
| Bitwise operations | Alignment calculations use masks | CS:APP Chapter 2 |
| ARM memory model | Understand alignment faults and penalties | ARM Architecture Reference Manual |
| Bare-metal memory maps | Know where your RAM starts and ends | Your MCU’s datasheet |
Self-Assessment Questions
Before starting, answer these without looking:
- Pointer arithmetic: If
char *p = 0x1000, what is(void *)(p + 16)? - Alignment: How do you round up an address to the nearest 8-byte boundary?
- Memory layout: In your STM32, where does RAM start and how large is it?
- Bit masking: What is
0x1003 & ~7? What about(0x1003 + 7) & ~7? - Linked lists: How do you insert a node into a sorted linked list?
Theoretical Foundation
How malloc Works: The Big Picture
When you call malloc(32), you expect to get a pointer to 32 usable bytes. But where do these bytes come from? In a typical OS, malloc manages a region called the “heap” that grows via sbrk() or mmap(). On bare metal, we have a fixed pool of memory.
Traditional malloc (with OS): Bare-metal malloc (this project):
┌─────────────────────────┐ ┌─────────────────────────┐
│ Program calls malloc(n) │ │ Program calls malloc(n) │
└───────────┬─────────────┘ └───────────┬─────────────┘
│ │
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────┐
│ Search free list for │ │ Search free list for │
│ block >= n bytes │ │ block >= n bytes │
└───────────┬─────────────┘ └───────────┬─────────────┘
│ │
┌────┴────┐ ┌────┴────┐
│ Found? │ │ Found? │
└────┬────┘ └────┬────┘
Yes │ No Yes │ No
│ │ │ │ │ │
▼ │ ▼ ▼ │ ▼
Return │ Call sbrk() Return │ Return NULL
block │ to grow heap block │ (out of memory)
│ │ │
└────┘ │
│
Memory Block Structure
Every allocated block needs metadata. We store this in a “header” before the user data:
Block structure:
┌──────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ Block Header │ │ User Data │ │
│ │ (metadata) │ │ (what malloc returns) │ │
│ └─────────────────┘ └─────────────────────────────────┘ │
│ ↑ ↑ │
│ Block start Pointer returned to user │
│ │
└──────────────────────────────────────────────────────────────┘
Header contents:
┌─────────────────────────────────────────────────────────────┐
│ size_t size │ Total block size including header │
│ block_t *next │ Pointer to next free block (if free) │
│ uint32_t magic │ 0xDEADBEEF=allocated, 0xFEEDFACE=free │
└─────────────────────────────────────────────────────────────┘
Free List Organization
A free list tracks available memory blocks. When memory is freed, the block is added to this list. When memory is allocated, we search this list:
Free list visualization:
heap_start
│
▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ HDR │ │ │ HDR │ │ │ HDR │ │
│ 128 │ │────▶│ 64 │ │────▶│ 256 │ │──▶ NULL
│ FACE│ │ │ FACE│ │ │ FACE│ │
└───────────┘ └───────────┘ └───────────┘
Free: 128 Free: 64 Free: 256
When malloc(50) is called:
1. Search list: 128 >= 50+header? Yes!
2. Split block if much larger
3. Remove from free list
4. Return pointer after header
ARM Alignment Requirements
ARM processors have strict alignment requirements:
| Data Type | ARM32 Alignment | AArch64 Alignment |
|---|---|---|
| char | 1 byte | 1 byte |
| short | 2 bytes | 2 bytes |
| int | 4 bytes | 4 bytes |
| long | 4 bytes | 8 bytes |
| pointer | 4 bytes | 8 bytes |
| double | 8 bytes | 8 bytes |
Why alignment matters:
- Performance: Unaligned access requires multiple memory operations
- Correctness: Some ARM processors fault on unaligned access
- Atomicity: Unaligned access may not be atomic
Alignment calculation:
┌─────────────────────────────────────────────────────────────┐
│ // Round up to 8-byte boundary │
│ #define ALIGN8(x) (((x) + 7) & ~7) │
│ │
│ Example: ALIGN8(13) │
│ 13 + 7 = 20 = 0b00010100 │
│ ~7 = 0b...11111000 │
│ 20 & ~7 = 0b00010000 = 16 │
│ │
│ So malloc(13) actually reserves at least 16 bytes of │
│ user data, ensuring the next block starts aligned. │
└─────────────────────────────────────────────────────────────┘
Allocation Strategies
Different strategies for finding a suitable free block:
First Fit: Take the first block that’s large enough
Pros: Fast (O(1) in best case)
Cons: Tends to leave small fragments at the beginning
Best Fit: Find the smallest block that’s large enough
Pros: Minimizes wasted space in chosen block
Cons: Slow (O(n)), leaves many tiny fragments
Segregated Lists: Multiple lists for different size classes
Pros: Fast allocation, good for common sizes
Cons: More complex, may waste memory between classes
Block Coalescing
When a block is freed, we check if neighbors are also free and merge them:
Before coalescing:
┌─────────┬─────────┬─────────┐
│ A (free)│ B (free)│ C (free)│
│ 64 │ 32 │ 64 │
└─────────┴─────────┴─────────┘
Three small free blocks: 64, 32, 64
After coalescing:
┌─────────────────────────────┐
│ A+B+C (free) │
│ 160 │
└─────────────────────────────┘
One large free block: 160 (can satisfy larger requests)
Immediate coalescing: Merge on every free() call Deferred coalescing: Merge only when allocation fails
Why This Matters
Understanding malloc is essential for:
- Embedded systems: Often need custom allocators for determinism
- Real-time systems: Standard malloc may have unbounded runtime
- Game development: Pool allocators for specific object types
- Security: Heap exploits target malloc metadata
- Performance: Custom allocators can be 10x faster for specific patterns
Project Specification
What You Will Build
A complete memory allocator for bare-metal ARM with:
- malloc(size_t size): Allocate
sizebytes from the heap - free(void *ptr): Return memory to the free list
- realloc(void *ptr, size_t size): Resize an allocation
- calloc(size_t num, size_t size): Allocate and zero memory
Functional Requirements
- Basic Allocation:
- Satisfy requests from 1 byte to pool size
- Return NULL when out of memory
- Maintain proper alignment (8-byte for this project)
- Deallocation:
- Return blocks to free list
- Coalesce adjacent free blocks
- Detect double-free errors
- Reallocation:
- Grow or shrink existing allocations
- Copy data when moving to new location
- Handle realloc(NULL, n) as malloc(n)
- Debugging Features:
- Magic numbers to detect corruption
- Heap statistics (total, used, free, fragments)
- Optional integrity checking
Non-Functional Requirements
- Deterministic: Same allocation pattern gives same results
- No external dependencies: Works without libc
- Thread-unsafe: Single-threaded only (simplifies implementation)
- Minimum block size: 16 bytes (to fit header + meaningful data)
Real World Outcome
// Test program output:
=== ARM Malloc Test Suite ===
Test 1: Basic allocation
malloc(32) = 0x20001008 ✓
malloc(64) = 0x20001030 ✓
malloc(16) = 0x20001078 ✓
Test 2: Free and reuse
free(0x20001030)
malloc(32) = 0x20001030 ✓ (reused!)
Test 3: Coalescing
free(0x20001008)
free(0x20001078)
malloc(100) = 0x20001008 ✓ (coalesced block)
Test 4: Alignment
malloc(1) = 0x20001074 (aligned to 8) ✓
malloc(7) = 0x20001080 (aligned to 8) ✓
Test 5: Large allocation
malloc(2048) = 0x20001088 ✓
Test 6: Out of memory
malloc(1000000) = NULL ✓ (expected failure)
Test 7: Realloc
p = malloc(32) = 0x20001088
realloc(p, 64) = 0x20001088 ✓ (expanded in place)
realloc(p, 16) = 0x20001088 ✓ (shrunk in place)
Test 8: Calloc
calloc(10, 4) = 0x200010C8 ✓
First 40 bytes are zero ✓
Heap stats:
Total pool: 4096 bytes
Allocated: 148 bytes (3.6%)
Free: 3948 bytes (96.4%)
Fragments: 2
Largest free block: 3916 bytes
All tests passed!
Solution Architecture
High-Level Design
┌─────────────────────────────────────────────────────────────────────┐
│ Memory Allocator │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Public API │ │ Free List │ │ Block │ │
│ │ malloc/free │──▶│ Management │──▶│ Operations │ │
│ │ realloc/calloc│ │ │ │ │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ │
│ │ │ │ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Heap Memory Pool │ │
│ │ [HDR|DATA ][HDR|DATA][HDR| FREE ] │ │
│ │ ↑ allocated ↑ alloc ↑ free │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Key Components
| Component | Responsibility | Key Functions |
|---|---|---|
| Public API | User-facing allocation interface | malloc, free, realloc, calloc |
| Free List | Track available memory blocks | find_free_block, add_to_free_list, remove_from_free_list |
| Block Operations | Split, coalesce, validate blocks | split_block, coalesce, validate_block |
| Statistics | Track heap health metrics | get_heap_stats, print_heap_state |
Data Structures
// Block header - stored before every allocation
typedef struct block_header {
size_t size; // Total block size including header
struct block_header *next; // Next free block (only used when free)
uint32_t magic; // Magic number for corruption detection
// On 64-bit: add padding to maintain alignment
} block_header_t;
#define MAGIC_ALLOCATED 0xDEADBEEF
#define MAGIC_FREE 0xFEEDFACE
#define MIN_BLOCK_SIZE (sizeof(block_header_t) + 8) // Header + minimum data
// Heap state
typedef struct {
uint8_t *pool_start; // Start of memory pool
uint8_t *pool_end; // End of memory pool
block_header_t *free_list; // Head of free list
size_t total_size; // Total pool size
size_t allocated_size; // Currently allocated
size_t allocation_count; // Number of active allocations
} heap_state_t;
// Statistics structure
typedef struct {
size_t total_bytes;
size_t allocated_bytes;
size_t free_bytes;
size_t fragment_count;
size_t largest_free_block;
size_t allocation_count;
} heap_stats_t;
Memory Layout
Memory Pool Layout (4KB example at 0x20001000):
0x20001000 ┌─────────────────────────────────────────┐
│ Block Header (12 bytes ARM32) │
│ size: 56 │
│ next: NULL │
│ magic: 0xDEADBEEF (allocated) │
0x2000100C ├─────────────────────────────────────────┤
│ User Data (44 bytes) │
│ (pointer returned by malloc) │
0x20001038 ├─────────────────────────────────────────┤
│ Block Header (12 bytes) │
│ size: 32 │
│ next: 0x20001078 │
│ magic: 0xFEEDFACE (free) │
0x20001044 ├─────────────────────────────────────────┤
│ Free Space (20 bytes) │
0x20001058 ├─────────────────────────────────────────┤
│ Block Header (12 bytes) │
│ size: 24 │
│ next: NULL │
│ magic: 0xDEADBEEF (allocated) │
0x20001064 ├─────────────────────────────────────────┤
│ User Data (12 bytes) │
0x20001070 ├─────────────────────────────────────────┤
│ Remaining Pool (free) │
│ ... │
0x20002000 └─────────────────────────────────────────┘
Algorithm: malloc
malloc(size):
1. aligned_size = ALIGN8(size + sizeof(header))
2. Search free list for block >= aligned_size
3. If found:
a. Remove block from free list
b. If block is much larger, split it
c. Mark as allocated (magic = ALLOCATED)
d. Return pointer after header
4. If not found:
a. Return NULL (out of memory)
Algorithm: free
free(ptr):
1. header = ptr - sizeof(header)
2. Validate magic == ALLOCATED (detect double-free)
3. Mark as free (magic = FREE)
4. Coalesce with adjacent free blocks:
a. Check if previous block is free → merge backward
b. Check if next block is free → merge forward
5. Add to free list (sorted by address for easy coalescing)
Algorithm: Coalescing
coalesce(block):
// Try to merge with next block
next = (header *)((char *)block + block->size)
if next is within pool AND next->magic == FREE:
block->size += next->size
remove next from free list
// Try to merge with previous block (if using boundary tags)
// This requires footer in each block, or walking the list
Implementation Guide
Development Environment Setup
# Required tools
sudo apt-get install gcc-arm-none-eabi gdb-multiarch qemu-system-arm
# Verify ARM compiler
arm-none-eabi-gcc --version
# Create project structure
mkdir -p arm-malloc/{src,include,tests}
cd arm-malloc
Project Structure
arm-malloc/
├── src/
│ ├── malloc.c # Main allocator implementation
│ ├── startup.s # ARM startup code (from Project 3)
│ └── main.c # Test program
├── include/
│ ├── malloc.h # Public API
│ └── malloc_internal.h # Internal structures
├── linker.ld # Linker script with heap definition
├── Makefile
└── tests/
├── test_basic.c
├── test_stress.c
└── test_corruption.c
Implementation Phases
Phase 1: Foundation (Days 1-3)
Goals:
- Set up project structure
- Define data structures
- Implement heap initialization
Tasks:
- Create block header structure with magic numbers
- Define the memory pool (use linker script or static array)
- Implement
heap_init()to set up initial free block - Implement basic
get_heap_stats()for debugging
Code skeleton:
// malloc.c
#include "malloc.h"
#define HEAP_SIZE 4096
static uint8_t heap_pool[HEAP_SIZE] __attribute__((aligned(8)));
static heap_state_t heap;
void heap_init(void) {
heap.pool_start = heap_pool;
heap.pool_end = heap_pool + HEAP_SIZE;
heap.total_size = HEAP_SIZE;
heap.allocated_size = 0;
heap.allocation_count = 0;
// Create one large free block covering entire pool
block_header_t *initial = (block_header_t *)heap_pool;
initial->size = HEAP_SIZE;
initial->next = NULL;
initial->magic = MAGIC_FREE;
heap.free_list = initial;
}
Checkpoint: heap_init() creates a valid initial state, get_heap_stats() reports correct total size.
Phase 2: Basic malloc (Days 4-6)
Goals:
- Implement first-fit allocation
- Handle alignment correctly
- Implement block splitting
Tasks:
- Implement
find_free_block()- search free list for suitable block - Implement
split_block()- divide large blocks - Implement
malloc()- tie it all together - Test with various sizes
Key code:
void *malloc(size_t size) {
if (size == 0) return NULL;
// Calculate total size needed (header + aligned user data)
size_t total_size = ALIGN8(sizeof(block_header_t) + size);
if (total_size < MIN_BLOCK_SIZE) {
total_size = MIN_BLOCK_SIZE;
}
// Find a free block
block_header_t *block = find_free_block(total_size);
if (!block) return NULL;
// Remove from free list
remove_from_free_list(block);
// Split if block is much larger than needed
if (block->size >= total_size + MIN_BLOCK_SIZE) {
split_block(block, total_size);
}
// Mark as allocated
block->magic = MAGIC_ALLOCATED;
heap.allocated_size += block->size;
heap.allocation_count++;
// Return pointer to user data (after header)
return (void *)((uint8_t *)block + sizeof(block_header_t));
}
Checkpoint: malloc() returns valid, aligned pointers. Multiple allocations don’t overlap.
Phase 3: free and coalescing (Days 7-9)
Goals:
- Implement free with validation
- Implement forward and backward coalescing
- Detect double-free errors
Tasks:
- Implement
free()with magic number validation - Implement
coalesce_forward()- merge with next block - Implement
coalesce_backward()- merge with previous block (requires walking list) - Add to free list in sorted order (by address)
Key code:
void free(void *ptr) {
if (!ptr) return;
block_header_t *block = (block_header_t *)((uint8_t *)ptr - sizeof(block_header_t));
// Validate
if (block->magic != MAGIC_ALLOCATED) {
if (block->magic == MAGIC_FREE) {
// Double free detected!
debug_print("ERROR: Double free detected!\n");
} else {
// Corruption detected!
debug_print("ERROR: Heap corruption at %p\n", block);
}
return;
}
// Mark as free
block->magic = MAGIC_FREE;
heap.allocated_size -= block->size;
heap.allocation_count--;
// Coalesce with neighbors
block = coalesce(block);
// Add to free list (sorted by address)
add_to_free_list_sorted(block);
}
Checkpoint: Free blocks are reused. Coalescing combines adjacent blocks.
Phase 4: realloc and calloc (Days 10-11)
Goals:
- Implement realloc with in-place growth when possible
- Implement calloc (allocate + zero)
- Handle edge cases
Tasks:
- Implement
realloc()- try to grow in place, else allocate+copy+free - Implement
calloc()- malloc + memset - Handle edge cases (NULL ptr, size 0)
Key code:
void *realloc(void *ptr, size_t size) {
if (!ptr) return malloc(size);
if (size == 0) { free(ptr); return NULL; }
block_header_t *block = GET_HEADER(ptr);
size_t current_size = block->size - sizeof(block_header_t);
if (size <= current_size) {
// Shrinking - could split off excess, or just keep it
return ptr;
}
// Need to grow - check if next block is free and big enough
block_header_t *next = GET_NEXT_BLOCK(block);
if (next && next->magic == MAGIC_FREE) {
size_t combined = block->size + next->size;
if (combined >= ALIGN8(sizeof(block_header_t) + size)) {
// Can grow in place
remove_from_free_list(next);
block->size = combined;
return ptr;
}
}
// Must relocate
void *new_ptr = malloc(size);
if (new_ptr) {
memcpy(new_ptr, ptr, current_size);
free(ptr);
}
return new_ptr;
}
Checkpoint: realloc grows and shrinks allocations correctly. calloc returns zeroed memory.
Phase 5: Testing and Polish (Days 12-14)
Goals:
- Comprehensive testing
- Performance measurement
- Documentation
Tasks:
- Write stress tests (many allocations/frees)
- Write corruption detection tests
- Measure fragmentation under various patterns
- Add debug visualization of heap state
- Document API and implementation
Checkpoint: All tests pass. Heap stats accurately track state.
Hints in Layers
Hint 1: Getting Started - The Alignment Macro
The most important macro for this project:
#define ALIGN8(x) (((x) + 7) & ~7)
This rounds up to the nearest multiple of 8. Test it:
- ALIGN8(1) = 8
- ALIGN8(8) = 8
- ALIGN8(9) = 16
- ALIGN8(15) = 16
For ARM32, you could use ALIGN4, but ALIGN8 is safer for doubles and 64-bit values.
Hint 2: Block Header Navigation
To get from user pointer to header and back:
// User pointer to header
#define GET_HEADER(ptr) \
((block_header_t *)((uint8_t *)(ptr) - sizeof(block_header_t)))
// Header to user pointer
#define GET_USER_PTR(block) \
((void *)((uint8_t *)(block) + sizeof(block_header_t)))
// Get next block in memory (not free list)
#define GET_NEXT_BLOCK(block) \
((block_header_t *)((uint8_t *)(block) + (block)->size))
// Check if address is within heap
#define IS_IN_HEAP(ptr) \
((uint8_t *)(ptr) >= heap.pool_start && (uint8_t *)(ptr) < heap.pool_end)
Hint 3: Free List Management
Keep the free list sorted by address for easy coalescing:
void add_to_free_list_sorted(block_header_t *block) {
block_header_t **current = &heap.free_list;
while (*current && *current < block) {
current = &((*current)->next);
}
block->next = *current;
*current = block;
}
void remove_from_free_list(block_header_t *block) {
block_header_t **current = &heap.free_list;
while (*current && *current != block) {
current = &((*current)->next);
}
if (*current) {
*current = block->next;
}
}
Hint 4: Coalescing Strategy
With a sorted free list, coalescing becomes easier:
block_header_t *coalesce(block_header_t *block) {
// Forward coalescing - check next block
block_header_t *next = GET_NEXT_BLOCK(block);
if (IS_IN_HEAP(next) && next->magic == MAGIC_FREE) {
remove_from_free_list(next);
block->size += next->size;
// 'next' is now invalid - absorbed into 'block'
}
// Backward coalescing - find previous block
// This requires either boundary tags or walking the heap
block_header_t *prev = find_previous_block(block);
if (prev && prev->magic == MAGIC_FREE) {
remove_from_free_list(prev);
prev->size += block->size;
return prev; // Return the merged block
}
return block;
}
For backward coalescing without boundary tags, you’d need to walk from heap start:
block_header_t *find_previous_block(block_header_t *block) {
block_header_t *prev = NULL;
block_header_t *current = (block_header_t *)heap.pool_start;
while (current < block) {
prev = current;
current = GET_NEXT_BLOCK(current);
}
return prev;
}
Hint 5: Block Splitting
When you find a free block much larger than needed, split it:
void split_block(block_header_t *block, size_t needed_size) {
size_t remaining = block->size - needed_size;
if (remaining < MIN_BLOCK_SIZE) {
return; // Not enough left for a new block
}
// Create new free block from the remainder
block_header_t *new_block = (block_header_t *)((uint8_t *)block + needed_size);
new_block->size = remaining;
new_block->magic = MAGIC_FREE;
new_block->next = NULL;
// Shrink original block
block->size = needed_size;
// Add new block to free list
add_to_free_list_sorted(new_block);
}
Hint 6: Debug Visualization
A heap dump function is invaluable for debugging:
void dump_heap(void) {
debug_print("=== Heap Dump ===\n");
debug_print("Pool: %p - %p (%u bytes)\n",
heap.pool_start, heap.pool_end, heap.total_size);
block_header_t *block = (block_header_t *)heap.pool_start;
int count = 0;
while (IS_IN_HEAP(block) && count < 100) {
const char *status;
if (block->magic == MAGIC_ALLOCATED) status = "USED";
else if (block->magic == MAGIC_FREE) status = "FREE";
else status = "CORRUPT";
debug_print(" [%d] %p: size=%u, status=%s\n",
count, block, block->size, status);
block = GET_NEXT_BLOCK(block);
count++;
}
debug_print("Free list: ");
block_header_t *free = heap.free_list;
while (free) {
debug_print("%p(%u) -> ", free, free->size);
free = free->next;
}
debug_print("NULL\n");
}
Testing Strategy
Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Basic | Verify core functionality | Single alloc/free, alignment |
| Stress | Test under heavy load | 1000 allocations, random sizes |
| Coalescing | Verify block merging | Free adjacent blocks, check merge |
| Fragmentation | Measure memory efficiency | Alternating alloc/free patterns |
| Corruption | Test error detection | Double free, buffer overflow |
Critical Test Cases
// Test 1: Basic allocation and alignment
void test_basic(void) {
void *p1 = malloc(1); assert(p1 && IS_ALIGNED(p1, 8));
void *p2 = malloc(7); assert(p2 && IS_ALIGNED(p2, 8));
void *p3 = malloc(8); assert(p3 && IS_ALIGNED(p3, 8));
void *p4 = malloc(100); assert(p4 && IS_ALIGNED(p4, 8));
// Verify no overlap
assert(p2 >= p1 + 8 || p1 >= p2 + 8);
free(p1); free(p2); free(p3); free(p4);
}
// Test 2: Coalescing
void test_coalescing(void) {
void *p1 = malloc(32);
void *p2 = malloc(32);
void *p3 = malloc(32);
free(p2); // Free middle block
free(p1); // Should coalesce with p2
free(p3); // Should coalesce all three
// Now should be able to allocate all memory as one block
void *big = malloc(80); // Should succeed with coalesced block
assert(big != NULL);
free(big);
}
// Test 3: Stress test
void test_stress(void) {
void *ptrs[100];
// Allocate all
for (int i = 0; i < 100; i++) {
ptrs[i] = malloc(16 + (i % 64));
assert(ptrs[i] != NULL);
}
// Free odd indices
for (int i = 1; i < 100; i += 2) {
free(ptrs[i]);
ptrs[i] = NULL;
}
// Allocate again
for (int i = 1; i < 100; i += 2) {
ptrs[i] = malloc(32);
assert(ptrs[i] != NULL);
}
// Free all
for (int i = 0; i < 100; i++) {
free(ptrs[i]);
}
// Verify all memory returned
heap_stats_t stats;
get_heap_stats(&stats);
assert(stats.allocated_bytes == 0);
}
// Test 4: Realloc
void test_realloc(void) {
void *p = malloc(32);
memset(p, 0xAA, 32);
// Grow
p = realloc(p, 64);
assert(p != NULL);
assert(((uint8_t *)p)[0] == 0xAA); // Data preserved
// Shrink
p = realloc(p, 16);
assert(p != NULL);
assert(((uint8_t *)p)[0] == 0xAA);
free(p);
}
Common Pitfalls & Debugging
Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Off-by-one in alignment | Corruption after N allocations | Test ALIGN8 macro thoroughly |
| Forgetting to update free list | Memory leaks, double allocation | Always pair add/remove operations |
| Wrong pointer arithmetic | Wild pointers, crashes | Cast to uint8_t * before arithmetic |
| Not checking boundaries | Read/write outside heap | Validate all pointers with IS_IN_HEAP |
| Split creates too-small block | Header-only blocks | Check remaining >= MIN_BLOCK_SIZE |
| Coalesce loop infinite | Hang during free | Verify next block calculation |
Debugging Strategies
- Add verbose logging: Print every malloc/free with addresses and sizes
- Validate on every operation: Check magic numbers, free list integrity
- Use a heap visualizer: Print ASCII diagram of heap state
- Test incrementally: Get malloc working before free, free before realloc
- Use QEMU with GDB: Set watchpoints on corrupted addresses
Quick Validation
void validate_heap(void) {
size_t counted_free = 0;
size_t counted_alloc = 0;
block_header_t *block = (block_header_t *)heap.pool_start;
while (IS_IN_HEAP(block)) {
assert(block->magic == MAGIC_FREE || block->magic == MAGIC_ALLOCATED);
if (block->magic == MAGIC_FREE) {
counted_free += block->size;
} else {
counted_alloc += block->size;
}
block = GET_NEXT_BLOCK(block);
}
assert(counted_alloc == heap.allocated_size);
assert(counted_free + counted_alloc == heap.total_size);
}
Extensions & Challenges
Beginner Extensions
- Heap statistics CLI: Print stats via UART command
- Memory fill patterns: Fill allocated memory with 0xCD, freed with 0xDD (like MSVC debug)
- Allocation tracking: Log allocation call sites (requires stack walking or macro trickery)
Intermediate Extensions
- Boundary tags: Add footers to blocks for O(1) backward coalescing
- Best-fit strategy: Compare fragmentation vs first-fit
- Size-class segregation: Multiple free lists for different size ranges
- Memory pools: Create specialized allocators for fixed-size objects
Advanced Extensions
- Thread safety: Add mutex/spinlock protection
- Defragmentation: Compact heap by moving allocations (requires handles)
- Guard pages: Use MPU to detect buffer overflows (requires Project 15)
- TLSF algorithm: Implement Two-Level Segregated Fit for O(1) allocation
The Interview Questions They’ll Ask
After completing this project, you’ll be ready for:
- “How does malloc work internally?”
- Discuss: free lists, block headers, splitting, coalescing
- Mention: first-fit vs best-fit tradeoffs
- “What is memory fragmentation and how do you combat it?”
- Internal vs external fragmentation
- Coalescing, compaction, segregated lists
- “Why does alignment matter?”
- Performance penalties for unaligned access
- Atomicity requirements
- ARM-specific alignment faults
- “How would you detect a double-free bug?”
- Magic numbers in headers
- Poisoning freed memory
- Guard patterns
- “What’s the difference between stack and heap allocation?”
- Lifetime, thread-safety, fragmentation, performance
- When to use each
- “Design a memory allocator for [game engine / embedded system / real-time system]”
- Pool allocators for uniform objects
- Arena allocators for bulk free
- Deterministic timing requirements
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Dynamic memory allocation algorithms | Computer Systems: A Programmer’s Perspective | Ch. 9.9 |
| Free list implementations | The Art of Computer Programming, Vol 1 | Ch. 2.5 |
| ARM alignment rules | ARM Architecture Reference Manual | Memory Model |
| Pool allocators for games | Game Programming Patterns | Ch. 19 (Object Pool) |
| Memory allocator design | Understanding the Linux Kernel | Ch. 8 (Memory Management) |
| Embedded memory management | Making Embedded Systems | Ch. 7 |
| Writing robust C | Expert C Programming | Ch. 9 |
Self-Assessment Checklist
Before considering this project complete, verify:
Understanding
- I can explain why alignment matters on ARM processors
- I can draw the memory layout of an allocated block with header
- I can trace through a malloc/free sequence on paper
- I understand when and why coalescing improves memory usage
- I can explain the tradeoffs between first-fit and best-fit
Implementation
- malloc returns properly aligned pointers
- free returns memory to the free list correctly
- Adjacent free blocks are coalesced
- Double-free is detected and reported
- realloc handles grow, shrink, and NULL cases
- All heap memory can be reclaimed after freeing all allocations
Testing
- Basic allocation/free tests pass
- Stress tests with many allocations pass
- Coalescing tests verify blocks are merged
- Edge cases (size=0, NULL ptr) are handled
Growth
- I debugged at least one corruption bug using magic numbers
- I can extend this allocator with new features
- I understand how production allocators differ (jemalloc, tcmalloc)
Learning Milestones
Track your progress through these understanding checkpoints:
- malloc/free work for single allocations → Basic structure is correct
- Memory is reused after free → Free list works
- Coalescing prevents fragmentation → Block merging works
- Stress test passes → Ready for real use
This guide was expanded from LEARN_ARM_DEEP_DIVE.md. For the complete learning path, see the project index.