Project 14: Build Your Own Malloc
Project 14: Build Your Own Malloc
Implement a dynamic memory allocator from scratch, mastering heap organization, block management, and the trade-offs between throughput and memory utilization.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1 month+ |
| Language | C (Alternatives: Rust, Zig, C++) |
| Prerequisites | Projects 2, 9, and 13 recommended |
| Key Topics | Heap layout, free lists, coalescing, alignment, fragmentation |
| CS:APP Chapters | 6, 9 |
Table of Contents
- Learning Objectives
- Deep Theoretical Foundation
- Project Specification
- Solution Architecture
- Implementation Guide
- Testing Strategy
- Common Pitfalls
- Extensions
- Real-World Connections
- Resources
- Self-Assessment Checklist
1. Learning Objectives
By completing this project, you will:
- Master heap organization: Understand how the heap is structured and managed at the byte level
- Implement memory allocation algorithms: Build malloc, free, and optionally realloc from scratch
- Design efficient data structures: Create block headers, footers, and free list organizations
- Understand fragmentation trade-offs: Balance internal vs external fragmentation through policy choices
- Apply boundary tag coalescing: Implement efficient free block merging in O(1) time
- Build a heap checker: Write invariant-checking code that catches corruption early
- Measure and optimize performance: Understand throughput vs utilization trade-offs
- Connect VM concepts to practice: See how sbrk() and mmap() extend the heap
2. Deep Theoretical Foundation
2.1 Dynamic Memory Allocation Fundamentals
Why Dynamic Allocation Exists
Static memory (stack and global variables) has fundamental limitations:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ MEMORY ALLOCATION SPECTRUM โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ STATIC ALLOCATION DYNAMIC ALLOCATION โ
โ โโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โ
โ โ
โ int arr[100]; int *arr = malloc(n * sizeof(int));โ
โ โ
โ Pros: Pros: โ
โ - Fast (no runtime cost) - Size determined at runtime โ
โ - Simple (compiler handles) - Lifetime controlled by program โ
โ - Automatic cleanup - Can grow/shrink โ
โ โ
โ Cons: Cons: โ
โ - Size fixed at compile time - Runtime overhead โ
โ - Limited by stack size - Manual cleanup required โ
โ - Lifetime tied to scope - Fragmentation possible โ
โ โ
โ Use when: Use when: โ
โ - Size is known - Size unknown until runtime โ
โ - Small, fixed allocations - Large or variable allocations โ
โ - Short-lived data - Data outlives function scope โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
The Allocatorโs Contract
A memory allocator must satisfy these requirements:
void *malloc(size_t size);
// 1. Return pointer to block of at least 'size' bytes
// 2. Block must be properly aligned (usually 8 or 16 bytes)
// 3. Block must not overlap with any other allocated block
// 4. Return NULL if request cannot be satisfied
void free(void *ptr);
// 1. Return block to the free pool
// 2. ptr must be from a previous malloc/realloc
// 3. ptr must not have been freed already (undefined behavior)
// 4. ptr == NULL is a valid no-op
void *realloc(void *ptr, size_t size);
// 1. Resize allocation to 'size' bytes
// 2. Preserve contents (up to min of old and new size)
// 3. May move the block; return new location
// 4. realloc(NULL, size) == malloc(size)
// 5. realloc(ptr, 0) == free(ptr), returns NULL
2.2 Heap Organization Strategies
The Heap in Process Memory
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PROCESS VIRTUAL ADDRESS SPACE โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ High Address (0x7fff...) โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ KERNEL SPACE โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ STACK โโ
โ โ โ grows down โโ
โ โ ... โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ (unmapped gap) โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ ... โโ
โ โ โ grows up โโ
โ โ HEAP โโ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโ
โ โ โ Your allocator manages this region! โ โโ
โ โ โ โ โโ
โ โ โ โโโโโโโโ โโโโโโโโโโโโ โโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโ โ โโ
โ โ โ โ used โ โ free โ โusedโ โ free โ โ โโ
โ โ โ โโโโโโโโ โโโโโโโโโโโโ โโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโ โ โโ
โ โ โ โ โโ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ .bss (uninitialized data) โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ .data (initialized data) โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโคโ
โ โ .text (program code) โโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Low Address (0x0...) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Heap Expansion Mechanisms
sbrk() - Traditional Heap Extension:
void *sbrk(intptr_t increment);
// Moves the "program break" (end of heap) by 'increment' bytes
// Returns pointer to the OLD break (start of new memory)
// sbrk(0) returns current break without changing it
// Example: Extend heap by 4096 bytes
void *new_mem = sbrk(4096);
if (new_mem == (void *)-1) {
// Error: out of memory
}
mmap() - Memory Mapping for Large Allocations:
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
// Maps a region of virtual address space
// For anonymous memory: fd = -1, flags = MAP_ANONYMOUS | MAP_PRIVATE
// Example: Allocate 1MB anonymous region
void *large_block = mmap(NULL, 1024*1024,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1, 0);
Trade-offs:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ sbrk() vs mmap() โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ sbrk() โ mmap() โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Heap must be contiguous โ Can map anywhere in address space โ
โ Fast for small increments โ Higher overhead per call โ
โ Can only shrink from the top โ Any region can be unmapped โ
โ Simple interface โ More complex but flexible โ
โ Traditional Unix approach โ Modern approach for large allocs โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Modern allocators: sbrk for small, mmap for large (>128KB) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.3 Block Structure: Headers and Footers
The Implicit Free List Block Format
Every block in the heap needs metadata to track its size and status:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ BLOCK STRUCTURE (WITH FOOTER) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ ALLOCATED BLOCK FREE BLOCK โ
โ โโโโโโโโโโโโโโโ โโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโ โ
โ โ HEADER (4B) โ โ HEADER (4B) โ โ
โ โ size | 1 โ โ size | 0 โ โ
โ โโโโโโโโโโโโโโโโโโค โโโโโโโโโโโโโโโโโโค โ
โ โ โ โ โ โ
โ โ PAYLOAD โ โ โ โ
โ โ (user data) โ โ FREE SPACE โ โ
โ โ โ โ (available) โ โ
โ โ โ โ โ โ
โ โโโโโโโโโโโโโโโโโโค โโโโโโโโโโโโโโโโโโค โ
โ โ PADDING โ โ โ โ
โ โ (alignment) โ โ โ โ
โ โโโโโโโโโโโโโโโโโโค โโโโโโโโโโโโโโโโโโค โ
โ โ FOOTER (4B) โ โ FOOTER (4B) โ โ
โ โ size | 1 โ โ size | 0 โ โ
โ โโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโ โ
โ โ
โ Low-order bit of size encodes allocated/free status โ
โ (works because size is always multiple of alignment) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Header Format (32-bit example):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ HEADER FORMAT โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ 31 3 2 1 0 โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโฌโโโฌโโโฌโโโ โ
โ โ BLOCK SIZE (bytes) โ โ โ Pโ Aโ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโดโโโดโโโดโโโดโโโ โ
โ โ
โ A (bit 0): Allocated flag โ
โ 0 = free, 1 = allocated โ
โ โ
โ P (bit 1): Previous block allocated (optional optimization) โ
โ 0 = previous is free, 1 = previous is allocated โ
โ (eliminates need for footer in allocated blocks) โ
โ โ
โ Size is always a multiple of 8 (or 16), so low bits are free โ
โ โ
โ Example: โ
โ Header = 0x00000019 = 24 bytes, allocated โ
โ Header = 0x00000020 = 32 bytes, free โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Why Footers?:
Footers enable O(1) coalescing with the previous block:
Without footers:
To find previous block's size, must scan from heap start = O(n)
With footers:
Just look at (current_header - 4 bytes) = O(1)
โโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโโโโโโโโโฌโโโโโโโโโ
โ HDR โ PAYLOAD โ FTR โ HDR โ PAYLOAD โ FTR โ
โ 24|1 โ โ 24|1 โ 32|0 โ โ 32|0 โ
โโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโดโโโโโโโโโดโโโโโโโโโโโโโโโโโดโโโโโโโโโ
โ โ
| Footer tells us prev |
| block is 24 bytes, alloc |
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Look back 4 bytes to find previous footer
2.4 Free List Organizations
Implicit Free List
The simplest approach: blocks are organized implicitly by their physical order.
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ IMPLICIT FREE LIST โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ HEAP STRUCTURE: โ
โ โ
โ โโโโโโโโโโโโฌโโโโโโโโโโโฌโโโโโโโโโโโฌโโโโโโโโโโโฌโโโโโโโโโโโ โ
โ โ PROLOGUE โ BLOCK โ BLOCK โ BLOCK โ EPILOGUE โ โ
โ โ 8|1 โ 32|1 โ 64|0 โ 128|1 โ 0|1 โ โ
โ โโโโโโโโโโโโดโโโโโโโโโโโดโโโโโโโโโโโดโโโโโโโโโโโดโโโโโโโโโโโ โ
โ ALLOCATED FREE ALLOCATED (end marker) โ
โ โ
โ FINDING FREE BLOCKS: โ
โ - Start at first block โ
โ - Walk through ALL blocks checking headers โ
โ - Stop at first free block that fits (first fit) โ
โ โ
โ Time Complexity: โ
โ - malloc: O(total blocks) - must scan all blocks โ
โ - free: O(1) with footers (coalescing) โ
โ โ
โ Space Overhead: โ
โ - Header: 4 bytes per block โ
โ - Footer: 4 bytes per block โ
โ - Minimum block size: 16 bytes (header + footer + 8-byte min) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Explicit Free List
Maintain a linked list of only free blocks:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ EXPLICIT FREE LIST โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ FREE BLOCK STRUCTURE: โ
โ โ
โ โโโโโโโโโโโโโโโโโโ โ
โ โ HEADER (4B) โ โ
โ โ size | 0 โ โ
โ โโโโโโโโโโโโโโโโโโค โ
โ โ NEXT (8B) โ โโโ points to next free block โ
โ โโโโโโโโโโโโโโโโโโค โ
โ โ PREV (8B) โ โโโ points to previous free block โ
โ โโโโโโโโโโโโโโโโโโค โ
โ โ โ โ
โ โ (unused) โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโค โ
โ โ FOOTER (4B) โ โ
โ โ size | 0 โ โ
โ โโโโโโโโโโโโโโโโโโ โ
โ โ
โ HEAP VIEW: โ
โ โ
โ โโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโฌโโโโโโโโโ โ
โ โPROLOGUEโ ALLOC โ FREE โ ALLOC โ FREE โEPILOGUEโ โ
โ โ โ 32|1 โ 64|0 โ 128|1 โ 256|0 โ โ โ
โ โโโโโโโโโโดโโโโโโโโโดโโโโโโโโโดโโโโโโโโโดโโโโโโโโโดโโโโโโโโโ โ
โ โ โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโ โ
โ doubly linked โ
โ โ
โ free_list_head โโโ [64|0] โโโโ [256|0] โโโโ NULL โ
โ โ
โ Time Complexity: โ
โ - malloc: O(free blocks) - only scan free blocks โ
โ - free: O(1) with immediate coalescing โ
โ โ
โ Space Overhead: โ
โ - Minimum block: 24 bytes (header + 2 pointers + footer) โ
โ - But only free blocks need the pointers โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
List Ordering Policies:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ FREE LIST ORDERING POLICIES โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ LIFO (Last-In-First-Out): โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ Insert newly freed block at HEAD of list โ
โ + Simple and fast insertion: O(1) โ
โ - Poor locality: recently freed blocks scattered โ
โ โ
โ free(block): โ
โ block->next = free_list_head โ
โ block->prev = NULL โ
โ if (free_list_head) free_list_head->prev = block โ
โ free_list_head = block โ
โ โ
โ ADDRESS-ORDERED: โ
โ โโโโโโโโโโโโโโโโ โ
โ Insert in order by address in memory โ
โ + Better locality: nearby free blocks adjacent in list โ
โ + Enables efficient coalescing โ
โ - Slower insertion: O(n) to find position โ
โ โ
โ Memory: [block_A at 0x1000] [block_B at 0x2000] [block_C at 0x3000]โ
โ List: block_A โ block_B โ block_C โ
โ โ
โ BEST PRACTICE: Address-ordered for better utilization โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Segregated Free Lists
Group free blocks by size class for O(1) access to appropriately-sized blocks:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ SEGREGATED FREE LISTS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ SIZE CLASS ARRAY: โ
โ โโโโโโโโโโโโโโโโโ โ
โ โ
โ Index Size Range Free List โ
โ โโโโโ โโโโโโโโโโ โโโโโโโโโ โ
โ 0 16-31 bytes โโโ [24|0] โ [16|0] โ NULL โ
โ 1 32-63 bytes โโโ [48|0] โ [32|0] โ [56|0] โ NULL โ
โ 2 64-127 bytes โโโ [96|0] โ NULL โ
โ 3 128-255 bytes โโโ NULL (empty) โ
โ 4 256-511 bytes โโโ [384|0] โ [256|0] โ NULL โ
โ 5 512-1023 bytes โโโ [512|0] โ NULL โ
โ 6 1024-2047 โโโ [1024|0] โ NULL โ
โ 7 2048-4095 โโโ NULL โ
โ 8 4096-โ โโโ [8192|0] โ [4096|0] โ NULL โ
โ โ
โ SIZE CLASS CALCULATION: โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ class(size) = floor(log2(size)) - 3 (for power-of-2 classes) โ
โ โ
โ Or use explicit ranges: โ
โ size < 32 โ class 0 โ
โ size < 64 โ class 1 โ
โ size < 128 โ class 2 โ
โ ... โ
โ โ
โ MALLOC ALGORITHM: โ
โ โโโโโโโโโโโโโโโโโ โ
โ 1. Calculate size class for requested size โ
โ 2. Search that class's list โ
โ 3. If empty, try next larger class โ
โ 4. If all empty, extend heap with sbrk() โ
โ 5. Split if block too large โ
โ โ
โ Time Complexity: O(1) for common case โ
โ O(classes) worst case โ
โ โ
โ Space Overhead: Array of list heads (typically 8-16 pointers) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.5 Boundary Tag Coalescing
When freeing a block, adjacent free blocks should be merged to reduce fragmentation:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ COALESCING SCENARIOS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ CASE 1: Neither adjacent block is free โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Before: [ALLOC] [FREE*] [ALLOC] โ
โ After: [ALLOC] [FREE ] [ALLOC] (no change, just mark free) โ
โ โ
โ CASE 2: Previous block is free โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Before: [FREE ] [FREE*] [ALLOC] โ
โ โโโ64โโโโโโ32โโโ โ
โ โ
โ After: [ FREE (96) ] [ALLOC] โ
โ Update header of prev, footer of current โ
โ โ
โ CASE 3: Next block is free โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Before: [ALLOC] [FREE*] [FREE ] โ
โ โโโ32โโโโโโ64โโโ โ
โ โ
โ After: [ALLOC] [ FREE (96) ] โ
โ Update header of current, footer of next โ
โ โ
โ CASE 4: Both adjacent blocks are free โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Before: [FREE ] [FREE*] [FREE ] โ
โ โโโ64โโโโโโ32โโโโโโ64โโโ โ
โ โ
โ After: [ FREE (160) ] โ
โ Update header of prev, footer of next โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Coalescing Implementation:
static void *coalesce(void *bp) {
// Get allocation status of adjacent blocks
size_t prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp))); // Previous footer
size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp))); // Next header
size_t size = GET_SIZE(HDRP(bp));
if (prev_alloc && next_alloc) { // Case 1: both allocated
return bp; // No coalescing needed
}
else if (prev_alloc && !next_alloc) { // Case 2: next is free
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
PUT(HDRP(bp), PACK(size, 0)); // Update current header
PUT(FTRP(bp), PACK(size, 0)); // Update next's footer
// (FTRP uses new size, finds the right location)
}
else if (!prev_alloc && next_alloc) { // Case 3: prev is free
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
PUT(FTRP(bp), PACK(size, 0)); // Update current footer
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0)); // Update prev header
bp = PREV_BLKP(bp); // Return prev's address
}
else { // Case 4: both are free
size += GET_SIZE(HDRP(PREV_BLKP(bp))) +
GET_SIZE(FTRP(NEXT_BLKP(bp)));
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0)); // Update prev header
PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0)); // Update next footer
bp = PREV_BLKP(bp); // Return prev's address
}
return bp;
}
2.6 Splitting Policies
When a free block is larger than needed, we may split it:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ BLOCK SPLITTING โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ REQUEST: malloc(24) (plus 8 bytes overhead = 32 bytes needed) โ
โ FOUND: Free block of 128 bytes โ
โ โ
โ WITHOUT SPLITTING: โ
โ โโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ HEADER (128|1) โ PAYLOAD (120 bytes) โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Internal fragmentation: 120 - 24 = 96 bytes WASTED! โ
โ โ
โ WITH SPLITTING: โ
โ โโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ HDR โ PAYLOAD โ FTR โ HDR โ FREE BLOCK โ FTR โ โ
โ โ32|1 โ (24) โ32|1 โ96|0 โ (88 usable) โ96|0 โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ 32 bytes 96 bytes โ
โ โ
โ SPLITTING DECISION: โ
โ โโโโโโโโโโโโโโโโโโโ โ
โ remainder = block_size - needed_size โ
โ if (remainder >= MIN_BLOCK_SIZE) โ
โ split the block โ
โ else โ
โ use entire block (accept internal fragmentation) โ
โ โ
โ MIN_BLOCK_SIZE = header(4) + min_payload(8) + footer(4) = 16 โ
โ or with pointers: 24-32 bytes โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.7 Placement Policies: First Fit, Next Fit, Best Fit
How do we choose which free block to use?
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PLACEMENT POLICIES โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ FREE BLOCKS: [32] โ [128] โ [64] โ [256] โ [48] โ NULL โ
โ REQUEST: malloc(50) needs 64 bytes with overhead โ
โ โ
โ FIRST FIT: โ
โ โโโโโโโโโโ โ
โ Scan from beginning, use first block that fits โ
โ Choice: [128] (first one >= 64) โ
โ โ
โ + Simple to implement โ
โ + Fast for small allocations (found early) โ
โ - Fragments beginning of heap โ
โ - Large blocks may not find space โ
โ โ
โ NEXT FIT: โ
โ โโโโโโโโโ โ
โ Like first fit, but start where previous search ended โ
โ If last search ended at [64], start there โ
โ Choice: [64] (first one >= 64 from current position) โ
โ โ
โ + Spreads allocations across heap โ
โ + Can be faster than first fit โ
โ - Often worse fragmentation than first fit โ
โ โ
โ BEST FIT: โ
โ โโโโโโโโโ โ
โ Scan entire list, choose smallest block that fits โ
โ Choice: [64] (exact fit!) โ
โ โ
โ + Minimizes wasted space per allocation โ
โ + Good memory utilization โ
โ - Slow: O(n) every time โ
โ - Creates many tiny unusable fragments โ
โ โ
โ PERFORMANCE COMPARISON: โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Policy Throughput Utilization Implementation โ
โ โโโโโโ โโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโโโโ โ
โ First Fit Fast Medium Simple โ
โ Next Fit Fast Poor-Medium Simple + state โ
โ Best Fit Slow Good Simple (but slow) โ
โ Seg Lists Very Fast Good Complex โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.8 Fragmentation
Internal Fragmentation
Wasted space INSIDE allocated blocks:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ INTERNAL FRAGMENTATION โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ REQUEST: malloc(1) โ
โ ACTUAL ALLOCATION: โ
โ โ
โ โโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโ โ
โ โ HEADER โ PAYLOAD โ FOOTER โ โ
โ โ 4B โ 16B minimum (aligned) โ 4B โ โ
โ โโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโ โ
โ โ โ
โ โโโ User requested 1 byte โ
โ Allocator gave 16 bytes (minimum) โ
โ Internal fragmentation: 15 bytes โ
โ โ
โ CAUSES: โ
โ โโโโโโโ โ
โ 1. Alignment requirements (must be 8 or 16 byte aligned) โ
โ 2. Metadata overhead (header/footer) โ
โ 3. Minimum block size constraints โ
โ 4. Not splitting (when remainder too small) โ
โ โ
โ MEASUREMENT: โ
โ โ
โ internal_frag = allocated_size - requested_size โ
โ internal_frag_ratio = internal_frag / allocated_size โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
External Fragmentation
Wasted space BETWEEN allocated blocks:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ EXTERNAL FRAGMENTATION โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ SCENARIO: malloc(256) fails despite 512 bytes total free โ
โ โ
โ HEAP STATE: โ
โ โโโโโโโโโฌโโโโโโโโฌโโโโโโโโฌโโโโโโโโฌโโโโโโโโฌโโโโโโโโฌโโโโโโโโฌโโโโโโโโโ
โ โ FREE โ ALLOC โ FREE โ ALLOC โ FREE โ ALLOC โ FREE โ ALLOC โโ
โ โ 128B โ 64B โ 128B โ 64B โ 128B โ 64B โ 128B โ 64B โโ
โ โโโโโโโโโดโโโโโโโโดโโโโโโโโดโโโโโโโโดโโโโโโโโดโโโโโโโโดโโโโโโโโดโโโโโโโโโ
โ โ
โ Total free: 128 ร 4 = 512 bytes โ
โ Largest contiguous free: 128 bytes โ
โ Cannot satisfy 256-byte request! โ
โ โ
โ CAUSES: โ
โ โโโโโโโ โ
โ 1. Alternating alloc/free pattern โ
โ 2. Poor placement policy (best fit creates tiny holes) โ
โ 3. No coalescing (or deferred coalescing) โ
โ 4. Long-lived allocations fragment the heap โ
โ โ
โ MITIGATION: โ
โ โโโโโโโโโโโ โ
โ 1. Immediate coalescing โ
โ 2. Segregated fits (group similar sizes) โ
โ 3. Compaction (expensive, not common in C) โ
โ 4. Better placement (first fit often better than best fit) โ
โ โ
โ MEASUREMENT: โ
โ โ
โ external_frag = total_free_space - largest_free_block โ
โ utilization = allocated_bytes / total_heap_size โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.9 Alignment Requirements
All pointers returned by malloc must be properly aligned:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ALIGNMENT REQUIREMENTS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ WHY ALIGNMENT MATTERS: โ
โ โโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Modern CPUs load data in aligned chunks. Unaligned access: โ
โ - May require multiple memory operations โ
โ - Can cause hardware exceptions (SIGBUS on some architectures) โ
โ - Always slower when supported โ
โ โ
โ ALIGNMENT RULES: โ
โ โโโโโโโโโโโโโโโโ โ
โ โ
โ Type Size Required Alignment โ
โ โโโโ โโโโ โโโโโโโโโโโโโโโโโโ โ
โ char 1 1 byte โ
โ short 2 2 bytes โ
โ int 4 4 bytes โ
โ long 8 8 bytes (on 64-bit) โ
โ double 8 8 bytes โ
โ pointer 8 8 bytes (on 64-bit) โ
โ long double 16 16 bytes โ
โ __m128 16 16 bytes (SSE) โ
โ __m256 32 32 bytes (AVX) โ
โ โ
โ MALLOC REQUIREMENT: โ
โ โโโโโโโโโโโโโโโโโโโ โ
โ โ
โ malloc must return addresses aligned to the MAXIMUM alignment โ
โ requirement of any type (typically 8 or 16 bytes) โ
โ โ
โ Valid addresses: 0x1000, 0x1008, 0x1010, 0x1018, ... โ
โ Invalid (8-byte): 0x1001, 0x1002, 0x1004, 0x1007, ... โ
โ โ
โ IMPLEMENTATION: โ
โ โโโโโโโโโโโโโโโ โ
โ โ
โ #define ALIGNMENT 8 โ
โ #define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~(ALIGNMENT-1)) โ
โ โ
โ Examples: โ
โ ALIGN(1) = 8 โ
โ ALIGN(8) = 8 โ
โ ALIGN(9) = 16 โ
โ ALIGN(15) = 16 โ
โ ALIGN(16) = 16 โ
โ โ
โ BLOCK SIZE = ALIGN(payload_size + header + footer) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.10 Memory Allocator Invariants
Invariants are properties that must ALWAYS be true. Checking them catches bugs early:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ALLOCATOR INVARIANTS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ STRUCTURAL INVARIANTS (must always hold): โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ 1. Alignment: Every block address is ALIGNMENT-byte aligned โ
โ CHECK: (block_ptr & (ALIGNMENT - 1)) == 0 โ
โ โ
โ 2. Header/Footer match: Header size == Footer size โ
โ CHECK: GET_SIZE(HDRP(bp)) == GET_SIZE(FTRP(bp)) โ
โ GET_ALLOC(HDRP(bp)) == GET_ALLOC(FTRP(bp)) โ
โ โ
โ 3. Coalescing: No two consecutive free blocks โ
โ CHECK: if (!allocated(bp)) then allocated(NEXT_BLKP(bp)) โ
โ โ
โ 4. Boundary: Prologue and epilogue exist and are marked โ
โ CHECK: heap starts with 8|1, ends with 0|1 โ
โ โ
โ 5. Heap bounds: All blocks within [heap_start, heap_end] โ
โ CHECK: heap_start <= bp < heap_end for all blocks โ
โ โ
โ FREE LIST INVARIANTS (for explicit lists): โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ 6. Every block in free list is marked free โ
โ CHECK: for each bp in list: !GET_ALLOC(HDRP(bp)) โ
โ โ
โ 7. Every free block is in the free list โ
โ CHECK: scan heap, verify all free blocks are in list โ
โ โ
โ 8. Pointer consistency: next->prev == current, prev->next == curr โ
โ CHECK: for each bp: NEXT(PREV(bp)) == bp && PREV(NEXT(bp)) == bpโ
โ โ
โ SIZE INVARIANTS: โ
โ โโโโโโโโโโโโโโโโ โ
โ โ
โ 9. Block size >= minimum block size โ
โ CHECK: GET_SIZE(HDRP(bp)) >= MIN_BLOCK_SIZE โ
โ โ
โ 10. Block size is aligned โ
โ CHECK: (GET_SIZE(HDRP(bp)) & (ALIGNMENT - 1)) == 0 โ
โ โ
โ 11. Total of all block sizes == heap size โ
โ CHECK: sum of sizes from prologue to epilogue == brk - start โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.11 Throughput vs Utilization Trade-offs
Every allocator design involves trade-offs:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ THROUGHPUT vs UTILIZATION TRADE-OFFS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ High Throughput โ
โ โฒ โ
โ โ โ
โ Implicit Free โ Segregated Lists โ
โ (simple header) โ (per-class lists) โ
โ โ โ
โ Simple โโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโ Complex โ
โ โ โ
โ Best Fit โ Buddy Allocator โ
โ (scan all) โ (power-of-2 splits) โ
โ โ โ
โ โผ โ
โ High Utilization โ
โ โ
โ DESIGN CHOICES AND THEIR EFFECTS: โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Choice Throughput Effect Utilization Effect โ
โ โโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโ โ
โ Larger min block Better (faster) Worse (more internal) โ
โ Explicit lists Better Same โ
โ Segregated lists Much better Same or better โ
โ Best fit Worse Better โ
โ First fit Better Worse โ
โ Immediate coalesce Usually better Usually better โ
โ Deferred coalesce Sometimes better Sometimes worse โ
โ Footer elimination Better (less data) Better (less overhead) โ
โ โ
โ TYPICAL TARGETS: โ
โ โโโโโโโโโโโโโโโโ โ
โ - Throughput: 10,000+ operations/second โ
โ - Utilization: 70-90% of peak memory โ
โ - Real allocators (glibc, jemalloc): highly optimized for both โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
3. Project Specification
3.1 What You Will Build
A dynamic memory allocator that implements:
- malloc(size): Allocate at least
sizebytes of memory - free(ptr): Return memory to the free pool
- realloc(ptr, size): Resize an existing allocation (optional but recommended)
Plus supporting tools:
- Heap checker: Verify invariants after operations
- Performance harness: Measure throughput and utilization
3.2 Functional Requirements
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ FUNCTIONAL REQUIREMENTS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ CORE API: โ
โ โโโโโโโโโ โ
โ โ
โ int mm_init(void); โ
โ - Initialize the heap โ
โ - Create prologue and epilogue blocks โ
โ - Return 0 on success, -1 on failure โ
โ โ
โ void *mm_malloc(size_t size); โ
โ - Return pointer to allocated block of at least 'size' bytes โ
โ - Block must be 8-byte aligned (or 16-byte on 64-bit) โ
โ - Return NULL if size == 0 or cannot satisfy request โ
โ โ
โ void mm_free(void *ptr); โ
โ - Return block to free pool โ
โ - Coalesce with adjacent free blocks โ
โ - ptr == NULL is a no-op โ
โ โ
โ void *mm_realloc(void *ptr, size_t size); โ
โ - Resize allocation to 'size' bytes โ
โ - Preserve contents up to min(old_size, new_size) โ
โ - realloc(NULL, size) == malloc(size) โ
โ - realloc(ptr, 0) == free(ptr), return NULL โ
โ โ
โ HEAP EXTENSION: โ
โ โโโโโโโโโโโโโโโ โ
โ โ
โ Use provided mem_sbrk(int incr) to extend heap โ
โ - Similar to sbrk() but works with simulated heap โ
โ - Returns pointer to new memory on success โ
โ - Returns (void *)-1 on failure โ
โ โ
โ CONSTRAINTS: โ
โ โโโโโโโโโโโโ โ
โ โ
โ - No global arrays larger than 128 bytes โ
โ - No calls to malloc/free/realloc from libc โ
โ - Must handle requests from 1 byte to large allocations โ
โ - Must not corrupt heap on any sequence of operations โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
3.3 Non-Functional Requirements
- Correctness: Never return overlapping blocks, never corrupt heap
- Alignment: All returned pointers must be properly aligned
- Performance: Reasonable throughput (>1000 ops/sec at minimum)
- Utilization: Memory utilization above 50% on standard traces
- Robustness: Handle edge cases (size=0, ptr=NULL, large sizes)
3.4 Example Allocation Trace
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ EXAMPLE ALLOCATION TRACE โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ Trace: โ
โ โโโโโโ โ
โ a0 = malloc(32) # Allocate 32 bytes, get pointer p0 โ
โ a1 = malloc(64) # Allocate 64 bytes, get pointer p1 โ
โ a2 = malloc(128) # Allocate 128 bytes, get pointer p2 โ
โ free(a1) # Free the 64-byte block โ
โ a3 = malloc(48) # Should reuse (part of) a1's block โ
โ a1 = realloc(a2, 64) # Shrink a2 from 128 to 64 bytes โ
โ free(a0) # Free original 32-byte block โ
โ free(a3) # Free 48-byte block โ
โ free(a1) # Free remaining block โ
โ โ
โ Heap evolution: โ
โ โโโโโโโโโโโโโโโ โ
โ โ
โ After init: โ
โ [PROLOGUE] โโโโโโโโโโโโ FREE โโโโโโโโโโโโโโโโโโโโโโโโโโ [EPILOGUE]โ
โ โ
โ After malloc(32): โ
โ [PRO] [a0:32|1] โโโโโโโโโโ FREE โโโโโโโโโโโโโโโโโโโโโโโ [EPI] โ
โ โ
โ After malloc(64): โ
โ [PRO] [a0:32|1] [a1:64|1] โโโโโโ FREE โโโโโโโโโโโโโโโโโ [EPI] โ
โ โ
โ After malloc(128): โ
โ [PRO] [a0:32|1] [a1:64|1] [a2:128|1] โโโ FREE โโโโโโโโโ [EPI] โ
โ โ
โ After free(a1): โ
โ [PRO] [a0:32|1] [64|0] [a2:128|1] โโโ FREE โโโโโโโโโโโโโ [EPI] โ
โ โ
โ After malloc(48): (reuses part of freed a1) โ
โ [PRO] [a0:32|1] [a3:48|1] [16|0] [a2:128|1] โโโ FREE โโโ [EPI] โ
โ or (if 16 too small, uses all 64): โ
โ [PRO] [a0:32|1] [a3:64|1] [a2:128|1] โโโ FREE โโโ [EPI] โ
โ โ
โ Final state (all freed): โ
โ [PRO] โโโโโโโโโโโโโโโโโโ FREE โโโโโโโโโโโโโโโโโโโโโโโโโโ [EPI] โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
4. Solution Architecture
4.1 High-Level Design
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ALLOCATOR ARCHITECTURE โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ mm_malloc() โ
โ โ โ
โ โโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโ โ
โ โผ โผ โผ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ Find โ โ Extend โ โ Split โ โ
โ โ Block โ โ Heap โ โ Block โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ โ โ โ
โ โโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโ โ
โ โผ โ
โ Return Pointer โ
โ โ
โ mm_free() โ
โ โ โ
โ โโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโ โ
โ โผ โผ โผ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ Mark โ โ Coalesce โ โ Update โ โ
โ โ Free โ โ Blocks โ โ List โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ HEAP LAYOUT โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค โ
โ โ โ โ
โ โ heap_listp โ โ
โ โ โ โ โ
โ โ โผ โ โ
โ โ โโโโโโโโฌโโโโโโโโโโฌโโโโโโโโโโฌโโโโโโโโโโฌโโโโโโโโโโฌโโโโโโโ โ โ
โ โ โ PRO โ BLOCK 1 โ BLOCK 2 โ BLOCK 3 โ ... โ EPI โ โ โ
โ โ โ 8|1 โ โ โ โ โ 0|1 โ โ โ
โ โ โโโโโโโโดโโโโโโโโโโดโโโโโโโโโโดโโโโโโโโโโดโโโโโโโโโโดโโโโโโโ โ โ
โ โ โ โ
โ โ free_listp (for explicit list) โ โ
โ โ โ โ โ
โ โ โผ โ โ
โ โ [FREE 1] โโโโ [FREE 2] โโโโ [FREE 3] โโโโ NULL โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
4.2 Data Structures
/* ==================== CONSTANTS ==================== */
/* Alignment requirement (bytes) */
#define ALIGNMENT 8
/* Word size (bytes) */
#define WSIZE 4
/* Double word size (bytes) */
#define DSIZE 8
/* Minimum block size: header + min payload + footer */
#define MIN_BLOCK_SIZE 16
/* Extend heap by this amount (bytes) */
#define CHUNKSIZE (1 << 12) /* 4096 bytes */
/* ==================== MACROS ==================== */
/* Max of two values */
#define MAX(x, y) ((x) > (y) ? (x) : (y))
/* Pack size and allocated bit into a word */
#define PACK(size, alloc) ((size) | (alloc))
/* Read a word at address p */
#define GET(p) (*(unsigned int *)(p))
/* Write a word at address p */
#define PUT(p, val) (*(unsigned int *)(p) = (val))
/* Read size from header/footer at address p */
#define GET_SIZE(p) (GET(p) & ~0x7)
/* Read allocated bit from header/footer at address p */
#define GET_ALLOC(p) (GET(p) & 0x1)
/* Given block ptr bp, compute address of its header */
#define HDRP(bp) ((char *)(bp) - WSIZE)
/* Given block ptr bp, compute address of its footer */
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
/* Given block ptr bp, compute address of next block */
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
/* Given block ptr bp, compute address of previous block */
#define PREV_BLKP(bp) ((char *)(bp) - GET_SIZE(((char *)(bp) - DSIZE)))
/* ==================== FOR EXPLICIT FREE LIST ==================== */
/* Given free block ptr bp, get next free block pointer */
#define NEXT_FREE(bp) (*(void **)(bp))
/* Given free block ptr bp, get previous free block pointer */
#define PREV_FREE(bp) (*(void **)((char *)(bp) + DSIZE))
/* ==================== GLOBAL VARIABLES ==================== */
/* Pointer to first block (after prologue) */
static char *heap_listp = NULL;
/* Pointer to first free block (for explicit list) */
static char *free_listp = NULL;
4.3 Block Layout
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ BLOCK LAYOUTS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ IMPLICIT FREE LIST BLOCK: โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Address Content โ
โ โโโโโโโ โโโโโโโ โ
โ bp - 4 HEADER: [size | alloc] โ
โ bp PAYLOAD start (returned to user) โ
โ bp + ? PAYLOAD end โ
โ bp + size-8 FOOTER: [size | alloc] โ
โ โ
โ โโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโ โ
โ โ HEADER โ PAYLOAD โ FOOTER โ โ
โ โ 4B โ (size - 8) bytes โ 4B โ โ
โ โโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโ โ
โ โ
โ โ
โ EXPLICIT FREE LIST BLOCK (when free): โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโฌโโโโโโโโโโโฌโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโ โ
โ โ HEADER โ NEXT_PTR โ PREV_PTR โ (unused) โ FOOTER โ โ
โ โ 4B โ 8B โ 8B โ (size-28) B โ 4B โ โ
โ โโโโโโโโโโดโโโโโโโโโโโดโโโโโโโโโโโดโโโโโโโโโโโโโโโโโโโโดโโโโโโโโโ โ
โ โ โ โ
โ bp bp + 8 โ
โ โ
โ MINIMUM BLOCK SIZE (explicit list): โ
โ header(4) + next(8) + prev(8) + footer(4) = 24 bytes โ
โ Round up to 32 for alignment โ
โ โ
โ โ
โ PROLOGUE AND EPILOGUE: โ
โ โโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโฌโโโโโโโโโ โโโโโโโโโโ โ
โ โ HDR โ FTR โ โ HDR โ โ
โ โ 8|1 โ 8|1 โ โ 0|1 โ โ
โ โโโโโโโโโโดโโโโโโโโโ โโโโโโโโโโ โ
โ PROLOGUE (8 bytes) EPILOGUE (4 bytes, marks end) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
4.4 Algorithm Overview
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ALGORITHM SUMMARY โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ mm_init(): โ
โ โโโโโโโโโโ โ
โ 1. Request initial heap space (e.g., 4 words) โ
โ 2. Create alignment padding if needed โ
โ 3. Create prologue header and footer (8|1) โ
โ 4. Create epilogue header (0|1) โ
โ 5. Set heap_listp to first block position โ
โ 6. Extend heap with a free block โ
โ โ
โ mm_malloc(size): โ
โ โโโโโโโโโโโโโโโโ โ
โ 1. Ignore spurious requests (size == 0) โ
โ 2. Adjust size: add overhead, round up to alignment โ
โ 3. Search free list for suitable block โ
โ - First fit: scan from start, take first fit โ
โ - Best fit: scan all, take smallest fit โ
โ 4. If no block found, extend heap via sbrk() โ
โ 5. Split block if remainder >= MIN_BLOCK_SIZE โ
โ 6. Mark block as allocated, return payload pointer โ
โ โ
โ mm_free(ptr): โ
โ โโโโโโโโโโโโโ โ
โ 1. If ptr == NULL, return immediately โ
โ 2. Get block size from header โ
โ 3. Mark header and footer as free โ
โ 4. Coalesce with adjacent free blocks โ
โ 5. (For explicit list) Insert into free list โ
โ โ
โ coalesce(bp): โ
โ โโโโโโโโโโโโโ โ
โ 1. Check if previous block is free (via previous footer) โ
โ 2. Check if next block is free (via next header) โ
โ 3. Handle 4 cases: โ
โ a. Neither free: do nothing โ
โ b. Next free: merge with next โ
โ c. Previous free: merge with previous โ
โ d. Both free: merge all three โ
โ 4. Update headers/footers and free list โ
โ โ
โ mm_realloc(ptr, size): โ
โ โโโโโโโโโโโโโโโโโโโโโ โ
โ 1. Handle edge cases (NULL ptr, size 0) โ
โ 2. If new size <= current size, can shrink in place โ
โ - Optionally split off excess โ
โ 3. If next block is free and combined size sufficient โ
โ - Coalesce with next block (avoid copy) โ
โ 4. Otherwise: โ
โ - malloc(size) โ
โ - memcpy(new, old, min_size) โ
โ - free(old) โ
โ - return new โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
5. Implementation Guide
5.1 Development Environment Setup
# Required tools
gcc --version # Need GCC for compilation
make --version # Build automation
gdb --version # Debugging
# Create project structure
mkdir -p malloc-lab/{src,tests,traces}
cd malloc-lab
# Download CS:APP malloc lab files (if available)
# Or create your own test harness
5.2 Project Structure
malloc-lab/
โโโ src/
โ โโโ mm.c # Your allocator implementation
โ โโโ mm.h # Public interface
โ โโโ memlib.c # Memory system simulator
โ โโโ memlib.h # Memory system interface
โโโ tests/
โ โโโ mdriver.c # Test driver
โ โโโ mm_test.c # Unit tests
โ โโโ fsecs.c # Timing utilities
โโโ traces/
โ โโโ short1.rep # Short test trace
โ โโโ short2.rep # Another short trace
โ โโโ ... # More traces
โโโ Makefile
โโโ README.md
5.3 Implementation Phases
Phase 1: Foundation (Days 1-3)
Goals:
- Set up the basic heap structure
- Implement mm_init()
- Create prologue and epilogue blocks
Tasks:
/*
* mm_init - Initialize the heap with prologue and epilogue
*/
int mm_init(void) {
/* Create the initial empty heap */
if ((heap_listp = mem_sbrk(4 * WSIZE)) == (void *)-1)
return -1;
PUT(heap_listp, 0); /* Alignment padding */
PUT(heap_listp + (1 * WSIZE), PACK(DSIZE, 1)); /* Prologue header */
PUT(heap_listp + (2 * WSIZE), PACK(DSIZE, 1)); /* Prologue footer */
PUT(heap_listp + (3 * WSIZE), PACK(0, 1)); /* Epilogue header */
heap_listp += (2 * WSIZE); /* Point to prologue block */
/* Extend the empty heap with a free block of CHUNKSIZE bytes */
if (extend_heap(CHUNKSIZE / WSIZE) == NULL)
return -1;
return 0;
}
Checkpoint: mm_init() creates valid heap structure. Can verify by examining memory.
Phase 2: Basic Allocation (Days 4-7)
Goals:
- Implement find_fit() with first-fit policy
- Implement place() to allocate blocks
- Implement basic mm_malloc()
Tasks:
/*
* find_fit - Find a free block of at least asize bytes
* Uses first-fit search
*/
static void *find_fit(size_t asize) {
void *bp;
/* First-fit search through the heap */
for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
if (!GET_ALLOC(HDRP(bp)) && (asize <= GET_SIZE(HDRP(bp)))) {
return bp;
}
}
return NULL; /* No fit found */
}
/*
* place - Place block of asize bytes at start of free block bp
* Split if remainder would be at least minimum block size
*/
static void place(void *bp, size_t asize) {
size_t csize = GET_SIZE(HDRP(bp));
if ((csize - asize) >= (2 * DSIZE)) { /* Can split */
PUT(HDRP(bp), PACK(asize, 1));
PUT(FTRP(bp), PACK(asize, 1));
bp = NEXT_BLKP(bp);
PUT(HDRP(bp), PACK(csize - asize, 0));
PUT(FTRP(bp), PACK(csize - asize, 0));
} else { /* Cannot split, use entire block */
PUT(HDRP(bp), PACK(csize, 1));
PUT(FTRP(bp), PACK(csize, 1));
}
}
/*
* mm_malloc - Allocate a block with at least size bytes of payload
*/
void *mm_malloc(size_t size) {
size_t asize; /* Adjusted block size */
size_t extendsize; /* Amount to extend heap if no fit */
char *bp;
/* Ignore spurious requests */
if (size == 0)
return NULL;
/* Adjust block size to include overhead and alignment reqs */
if (size <= DSIZE)
asize = 2 * DSIZE; /* Minimum block size */
else
asize = DSIZE * ((size + (DSIZE) + (DSIZE - 1)) / DSIZE);
/* Search the free list for a fit */
if ((bp = find_fit(asize)) != NULL) {
place(bp, asize);
return bp;
}
/* No fit found. Get more memory and place the block */
extendsize = MAX(asize, CHUNKSIZE);
if ((bp = extend_heap(extendsize / WSIZE)) == NULL)
return NULL;
place(bp, asize);
return bp;
}
Checkpoint: Can allocate blocks. Use simple test: malloc(32) returns valid pointer.
Phase 3: Free and Coalesce (Days 8-12)
Goals:
- Implement mm_free()
- Implement coalesce() with all 4 cases
- Verify no adjacent free blocks remain
Tasks:
/*
* mm_free - Free a block and coalesce adjacent free blocks
*/
void mm_free(void *ptr) {
if (ptr == NULL)
return;
size_t size = GET_SIZE(HDRP(ptr));
/* Mark block as free */
PUT(HDRP(ptr), PACK(size, 0));
PUT(FTRP(ptr), PACK(size, 0));
/* Coalesce with adjacent free blocks */
coalesce(ptr);
}
/*
* coalesce - Merge adjacent free blocks
* Returns pointer to the coalesced block
*/
static void *coalesce(void *bp) {
size_t prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp)));
size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size_t size = GET_SIZE(HDRP(bp));
if (prev_alloc && next_alloc) { /* Case 1 */
return bp;
}
else if (prev_alloc && !next_alloc) { /* Case 2 */
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0));
}
else if (!prev_alloc && next_alloc) { /* Case 3 */
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
PUT(FTRP(bp), PACK(size, 0));
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
bp = PREV_BLKP(bp);
}
else { /* Case 4 */
size += GET_SIZE(HDRP(PREV_BLKP(bp))) +
GET_SIZE(FTRP(NEXT_BLKP(bp)));
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
bp = PREV_BLKP(bp);
}
return bp;
}
Checkpoint: Free works correctly. Verify with sequence: malloc, free, malloc (should reuse block).
Phase 4: Heap Checker (Days 13-16)
Goals:
- Implement comprehensive invariant checks
- Add debug printing
- Create test suite
Tasks:
/*
* mm_check - Check the heap for consistency
* Returns 0 if consistent, prints error and returns -1 otherwise
*/
int mm_check(void) {
char *bp;
int free_count_heap = 0;
int free_count_list = 0;
/* Check prologue */
if (GET_SIZE(HDRP(heap_listp)) != DSIZE ||
!GET_ALLOC(HDRP(heap_listp))) {
printf("Error: Invalid prologue header\n");
return -1;
}
/* Iterate through all blocks */
for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
/* Check alignment */
if ((size_t)bp % ALIGNMENT != 0) {
printf("Error: Block at %p is not aligned\n", bp);
return -1;
}
/* Check header/footer match */
if (GET_SIZE(HDRP(bp)) != GET_SIZE(FTRP(bp)) ||
GET_ALLOC(HDRP(bp)) != GET_ALLOC(FTRP(bp))) {
printf("Error: Header/footer mismatch at %p\n", bp);
return -1;
}
/* Check coalescing: no consecutive free blocks */
if (!GET_ALLOC(HDRP(bp))) {
free_count_heap++;
if (!GET_ALLOC(HDRP(NEXT_BLKP(bp))) &&
GET_SIZE(HDRP(NEXT_BLKP(bp))) > 0) {
printf("Error: Consecutive free blocks at %p\n", bp);
return -1;
}
}
/* Check block size >= minimum */
if (GET_SIZE(HDRP(bp)) < 2 * DSIZE) {
printf("Error: Block at %p smaller than minimum\n", bp);
return -1;
}
}
/* Check epilogue */
if (GET_SIZE(HDRP(bp)) != 0 || !GET_ALLOC(HDRP(bp))) {
printf("Error: Invalid epilogue\n");
return -1;
}
/* For explicit list: check all free blocks are in list */
/* (Add this when implementing explicit free list) */
return 0;
}
/*
* print_heap - Print the heap for debugging
*/
void print_heap(void) {
char *bp;
printf("\n===== HEAP DUMP =====\n");
printf("heap_listp = %p\n", heap_listp);
for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
printf("Block at %p: size=%u, alloc=%d\n",
bp, GET_SIZE(HDRP(bp)), GET_ALLOC(HDRP(bp)));
}
printf("Epilogue at %p\n", bp);
printf("=====================\n\n");
}
Checkpoint: Heap checker passes on all test cases.
Phase 5: Realloc Implementation (Days 17-20)
Goals:
- Implement mm_realloc()
- Handle all edge cases
- Optimize to avoid unnecessary copying
Tasks:
/*
* mm_realloc - Resize an allocated block
*/
void *mm_realloc(void *ptr, size_t size) {
size_t oldsize;
void *newptr;
/* If ptr is NULL, just malloc */
if (ptr == NULL) {
return mm_malloc(size);
}
/* If size is 0, just free */
if (size == 0) {
mm_free(ptr);
return NULL;
}
oldsize = GET_SIZE(HDRP(ptr));
/* Calculate new required size with overhead */
size_t asize;
if (size <= DSIZE)
asize = 2 * DSIZE;
else
asize = DSIZE * ((size + DSIZE + (DSIZE - 1)) / DSIZE);
/* If new size fits in current block */
if (asize <= oldsize) {
/* Could split here if oldsize - asize >= MIN_BLOCK_SIZE */
return ptr;
}
/* Check if next block is free and combined size is enough */
size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(ptr)));
size_t next_size = GET_SIZE(HDRP(NEXT_BLKP(ptr)));
if (!next_alloc && (oldsize + next_size >= asize)) {
/* Expand into next block */
PUT(HDRP(ptr), PACK(oldsize + next_size, 1));
PUT(FTRP(ptr), PACK(oldsize + next_size, 1));
return ptr;
}
/* Must allocate new block and copy */
newptr = mm_malloc(size);
if (newptr == NULL)
return NULL;
/* Copy old data (payload only) */
size_t copysize = oldsize - DSIZE; /* Subtract header+footer */
if (size < copysize)
copysize = size;
memcpy(newptr, ptr, copysize);
/* Free old block */
mm_free(ptr);
return newptr;
}
Checkpoint: Realloc passes all tests. Verify contents preserved.
Phase 6: Performance Optimization (Days 21-28)
Goals:
- Upgrade to explicit free list
- Implement segregated lists (optional)
- Tune parameters for best throughput/utilization
Tasks:
/*
* Explicit free list functions
*/
/* Insert block at front of free list (LIFO) */
static void insert_free_block(void *bp) {
NEXT_FREE(bp) = free_listp;
PREV_FREE(bp) = NULL;
if (free_listp != NULL) {
PREV_FREE(free_listp) = bp;
}
free_listp = bp;
}
/* Remove block from free list */
static void remove_free_block(void *bp) {
void *prev = PREV_FREE(bp);
void *next = NEXT_FREE(bp);
if (prev == NULL) { /* First in list */
free_listp = next;
} else {
NEXT_FREE(prev) = next;
}
if (next != NULL) {
PREV_FREE(next) = prev;
}
}
/* Updated find_fit for explicit list */
static void *find_fit(size_t asize) {
void *bp;
/* Search the free list */
for (bp = free_listp; bp != NULL; bp = NEXT_FREE(bp)) {
if (asize <= GET_SIZE(HDRP(bp))) {
return bp;
}
}
return NULL;
}
/* Updated coalesce to maintain free list */
static void *coalesce(void *bp) {
size_t prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp)));
size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size_t size = GET_SIZE(HDRP(bp));
if (prev_alloc && next_alloc) {
/* Just insert into free list */
insert_free_block(bp);
return bp;
}
else if (prev_alloc && !next_alloc) {
/* Remove next from list, merge, insert merged */
remove_free_block(NEXT_BLKP(bp));
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0));
insert_free_block(bp);
}
else if (!prev_alloc && next_alloc) {
/* Remove prev from list, merge, insert merged */
remove_free_block(PREV_BLKP(bp));
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
PUT(FTRP(bp), PACK(size, 0));
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
bp = PREV_BLKP(bp);
insert_free_block(bp);
}
else {
/* Remove both from list, merge all three, insert merged */
remove_free_block(PREV_BLKP(bp));
remove_free_block(NEXT_BLKP(bp));
size += GET_SIZE(HDRP(PREV_BLKP(bp))) +
GET_SIZE(FTRP(NEXT_BLKP(bp)));
PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
bp = PREV_BLKP(bp);
insert_free_block(bp);
}
return bp;
}
Checkpoint: Performance meets targets. Throughput and utilization scores acceptable.
Phase 7: Polish and Documentation (Days 29-30)
Goals:
- Clean up code
- Add comments
- Document design decisions
6. Testing Strategy
6.1 Unit Tests
/* Test basic allocation */
void test_malloc_basic(void) {
mm_init();
void *p1 = mm_malloc(32);
assert(p1 != NULL);
assert((size_t)p1 % ALIGNMENT == 0);
void *p2 = mm_malloc(64);
assert(p2 != NULL);
assert(p2 != p1); /* Different blocks */
mm_free(p1);
mm_free(p2);
assert(mm_check() == 0);
printf("test_malloc_basic PASSED\n");
}
/* Test coalescing */
void test_coalesce(void) {
mm_init();
void *p1 = mm_malloc(32);
void *p2 = mm_malloc(32);
void *p3 = mm_malloc(32);
mm_free(p2); /* Create hole in middle */
/* Should have 3 blocks: alloc, free, alloc */
assert(mm_check() == 0);
mm_free(p3); /* Should coalesce p2 and p3 */
/* Should have 2 blocks: alloc, free (coalesced) */
assert(mm_check() == 0);
mm_free(p1); /* Should coalesce all three */
/* Should have 1 large free block */
assert(mm_check() == 0);
printf("test_coalesce PASSED\n");
}
/* Test realloc */
void test_realloc(void) {
mm_init();
/* Test NULL pointer */
void *p = mm_realloc(NULL, 32);
assert(p != NULL);
/* Write pattern */
memset(p, 'A', 32);
/* Grow */
p = mm_realloc(p, 64);
assert(p != NULL);
/* Verify pattern preserved */
for (int i = 0; i < 32; i++) {
assert(((char *)p)[i] == 'A');
}
/* Shrink */
p = mm_realloc(p, 16);
assert(p != NULL);
/* Free via realloc */
void *q = mm_realloc(p, 0);
assert(q == NULL);
assert(mm_check() == 0);
printf("test_realloc PASSED\n");
}
/* Test alignment */
void test_alignment(void) {
mm_init();
for (int size = 1; size <= 1024; size++) {
void *p = mm_malloc(size);
assert(p != NULL);
assert((size_t)p % ALIGNMENT == 0);
mm_free(p);
}
assert(mm_check() == 0);
printf("test_alignment PASSED\n");
}
6.2 Stress Tests
/* Random allocation pattern */
void test_random_pattern(int seed, int ops) {
mm_init();
srand(seed);
void *ptrs[1000] = {0};
for (int i = 0; i < ops; i++) {
int op = rand() % 3;
int idx = rand() % 1000;
if (op == 0 && ptrs[idx] == NULL) {
/* Allocate */
size_t size = (rand() % 512) + 1;
ptrs[idx] = mm_malloc(size);
if (ptrs[idx]) {
memset(ptrs[idx], idx & 0xFF, size);
}
}
else if (op == 1 && ptrs[idx] != NULL) {
/* Free */
mm_free(ptrs[idx]);
ptrs[idx] = NULL;
}
else if (op == 2 && ptrs[idx] != NULL) {
/* Realloc */
size_t size = (rand() % 512) + 1;
ptrs[idx] = mm_realloc(ptrs[idx], size);
}
/* Periodic check */
if (i % 100 == 0) {
assert(mm_check() == 0);
}
}
/* Cleanup */
for (int i = 0; i < 1000; i++) {
if (ptrs[i]) mm_free(ptrs[i]);
}
printf("test_random_pattern PASSED (seed=%d, ops=%d)\n", seed, ops);
}
6.3 Trace Files
Standard trace file format:
# Comments start with #
# a <id> <size> - allocate <size> bytes, assign to ptr <id>
# f <id> - free ptr <id>
# r <id> <size> - realloc ptr <id> to <size> bytes
a 0 32
a 1 64
a 2 128
f 1
a 3 48
r 2 64
f 0
f 3
f 2
6.4 Heap Checker in Action
/* Call after every operation during debugging */
#ifdef DEBUG
#define CHECKHEAP(verbose) do { \
if (mm_check()) { \
printf("Heap error after %s:%d\n", __FILE__, __LINE__); \
print_heap(); \
exit(1); \
} \
} while (0)
#else
#define CHECKHEAP(verbose)
#endif
7. Common Pitfalls
7.1 Corruption Bugs
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ CORRUPTION BUGS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ BUG: Off-by-one in block size calculation โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Symptom: Header/footer don't match after a few operations โ
โ โ
โ Wrong: PUT(FTRP(bp), PACK(size, 0)); // But FTRP uses old size โ
โ Right: PUT(HDRP(bp), PACK(size, 0)); // Header first โ
โ PUT(FTRP(bp), PACK(size, 0)); // Footer uses new size โ
โ โ
โ โ
โ BUG: Pointer arithmetic on wrong type โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Symptom: Blocks overlap or have gaps โ
โ โ
โ Wrong: bp = bp + size; // Adds size * sizeof(*bp) โ
โ Right: bp = (char *)bp + size; // Adds exactly 'size' bytes โ
โ โ
โ โ
โ BUG: Coalescing with allocated blocks โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Symptom: Allocated data gets overwritten โ
โ โ
โ Wrong: // Forgot to check GET_ALLOC before coalescing โ
โ Right: if (!prev_alloc) { ... merge ... } โ
โ โ
โ โ
โ BUG: Double free โ
โ โโโโโโโโโโโโโโโโ โ
โ โ
โ Symptom: Crash or infinite loop in coalescing โ
โ โ
โ Defense: In debug mode, mark freed blocks with magic number โ
โ Check for magic before freeing โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
7.2 Coalescing Bugs
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ COALESCING BUGS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ BUG: Not updating all headers/footers โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Case 4 (both neighbors free): โ
โ Before: [PREV:64|0] [CURR:32|0] [NEXT:64|0] โ
โ โ
โ Wrong: PUT(HDRP(PREV_BLKP(bp)), PACK(160, 0)); // Only header! โ
โ โ
โ Right: PUT(HDRP(PREV_BLKP(bp)), PACK(160, 0)); // Update prev hdrโ
โ PUT(FTRP(NEXT_BLKP(bp)), PACK(160, 0)); // Update next ftrโ
โ โ
โ โ
โ BUG: Forgetting to return new block pointer โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Case 3 or 4: Block pointer changes to prev block! โ
โ โ
โ Wrong: return bp; // Still points to old block โ
โ Right: bp = PREV_BLKP(bp); โ
โ return bp; // Return merged block start โ
โ โ
โ โ
โ BUG: Not removing from free list before coalescing โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ (For explicit free lists) โ
โ โ
โ Wrong: // Merge blocks but neighbor still in free list โ
โ Right: remove_free_block(NEXT_BLKP(bp)); // Remove first โ
โ // Then merge and insert merged block โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
7.3 Alignment Bugs
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ALIGNMENT BUGS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ BUG: Block size not multiple of alignment โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Wrong: asize = size + DSIZE; // Could be odd โ
โ Right: asize = DSIZE * ((size + DSIZE + (DSIZE-1)) / DSIZE); โ
โ โ
โ โ
โ BUG: Returning unaligned pointer โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Symptom: Bus error or incorrect data โ
โ โ
โ Check: assert((size_t)bp % ALIGNMENT == 0); โ
โ โ
โ โ
โ BUG: Header not at word boundary โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Initial heap must be aligned! โ
โ โ
โ // In mm_init: โ
โ PUT(heap_listp, 0); // Padding word for alignment โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
7.4 Performance Pitfalls
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PERFORMANCE PITFALLS โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ PITFALL: Linear search of all blocks โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Problem: Implicit free list scans ALL blocks โ
โ Solution: Explicit free list (only scan free blocks) โ
โ Better: Segregated lists (O(1) for common sizes) โ
โ โ
โ โ
โ PITFALL: Extending heap too often โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Problem: sbrk() is expensive โ
โ Solution: Extend by larger chunks (e.g., 4KB) โ
โ Reuse freed blocks before extending โ
โ โ
โ โ
โ PITFALL: Not splitting blocks โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Problem: Huge internal fragmentation โ
โ Solution: Split if remainder >= MIN_BLOCK_SIZE โ
โ โ
โ โ
โ PITFALL: Best fit with O(n) search โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Problem: Slow for large heaps โ
โ Solution: First fit is usually faster AND has good utilization โ
โ Better: Segregated fits combine speed with good utilization โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
8. Extensions
8.1 Beginner Extensions
- Footer elimination: Use โprevious allocatedโ bit to skip footer on allocated blocks
- Better fit: Implement best fit or next fit, compare performance
- Heap visualization: ASCII art showing block layout
- Statistics collection: Track peak usage, fragmentation metrics
8.2 Intermediate Extensions
- Segregated free lists: Group blocks by size class for O(1) allocation
- Immediate coalescing vs deferred: Compare strategies
- Memory-mapped large blocks: Use mmap for allocations > threshold
- Debug mode: Pattern filling, guard pages, leak detection
8.3 Advanced Extensions
- Thread-safe allocator: Add locking or use lock-free algorithms
- Per-thread arenas: Reduce contention in multithreaded programs
- Compaction: Move blocks to reduce fragmentation (requires handle system)
- SLAB allocator: For fixed-size object pools (common in kernels)
8.4 Segregated Free Lists Implementation
/*
* Segregated free lists implementation
*/
#define NUM_SIZE_CLASSES 9
/* Size class boundaries (in bytes) */
static size_t class_sizes[NUM_SIZE_CLASSES] = {
32, 64, 128, 256, 512, 1024, 2048, 4096, SIZE_MAX
};
/* Array of free list heads, one per size class */
static void *seg_lists[NUM_SIZE_CLASSES];
/* Get size class for a given size */
static int get_size_class(size_t size) {
for (int i = 0; i < NUM_SIZE_CLASSES; i++) {
if (size <= class_sizes[i])
return i;
}
return NUM_SIZE_CLASSES - 1;
}
/* Find fit in segregated lists */
static void *seg_find_fit(size_t asize) {
int class = get_size_class(asize);
/* Search this class and larger classes */
for (int i = class; i < NUM_SIZE_CLASSES; i++) {
void *bp = seg_lists[i];
while (bp != NULL) {
if (GET_SIZE(HDRP(bp)) >= asize) {
return bp;
}
bp = NEXT_FREE(bp);
}
}
return NULL; /* No fit in any class */
}
/* Insert into appropriate segregated list */
static void seg_insert(void *bp) {
size_t size = GET_SIZE(HDRP(bp));
int class = get_size_class(size);
/* LIFO insertion at head of list */
NEXT_FREE(bp) = seg_lists[class];
PREV_FREE(bp) = NULL;
if (seg_lists[class] != NULL) {
PREV_FREE(seg_lists[class]) = bp;
}
seg_lists[class] = bp;
}
/* Remove from segregated list */
static void seg_remove(void *bp) {
size_t size = GET_SIZE(HDRP(bp));
int class = get_size_class(size);
void *prev = PREV_FREE(bp);
void *next = NEXT_FREE(bp);
if (prev == NULL) {
seg_lists[class] = next;
} else {
NEXT_FREE(prev) = next;
}
if (next != NULL) {
PREV_FREE(next) = prev;
}
}
9. Real-World Connections
9.1 Production Allocators
glibc malloc (ptmalloc2):
- Uses bins (segregated lists) for small allocations
- Maintains unsorted โfastbinโ for recently freed small blocks
- Uses mmap for very large allocations
- Thread-safe with per-arena locking
jemalloc (used by Firefox, Facebook):
- Thread-local caching to reduce contention
- Size classes with minimal internal fragmentation
- Sophisticated metadata management
- Extensive debugging and profiling support
tcmalloc (Google):
- Per-thread cache for small allocations
- Central free list for larger allocations
- Page-based heap organization
- Low fragmentation, high performance
mimalloc (Microsoft):
- Free list sharding for NUMA awareness
- Extremely fast small object allocation
- Designed for modern multi-core systems
9.2 Kernel Allocators
Linux SLAB/SLUB:
- Object caching for frequently-used structures
- Per-CPU caches to avoid locking
- Constructors/destructors for complex objects
Buddy Allocator:
- Power-of-2 block sizes
- Fast coalescing via โbuddyโ relationship
- Used for page-level allocation
9.3 Language Runtime Allocators
Go runtime:
- Concurrent, generational garbage collector
- Size-class based allocation
- Integration with goroutine scheduling
Rust allocators:
- System allocator (usually jemalloc or platform default)
- Custom allocator API for embedded/specialized use
- Memory safety guarantees at compile time
9.4 Interview Relevance
This project prepares you for questions like:
- โHow would you implement malloc?โ
- โWhat is fragmentation and how do you minimize it?โ
- โExplain the trade-offs between different allocator designsโ
- โHow do production allocators achieve thread safety?โ
- โWhat data structures are used in memory allocators?โ
10. Resources
10.1 Essential Reading
- CS:APP Chapter 9: Virtual Memory (especially 9.9 Dynamic Memory Allocation)
- CS:APP Chapter 6: The Memory Hierarchy (for locality understanding)
- โC Interfaces and Implementationsโ by David Hanson: Excellent allocator examples
- glibc malloc source code: Real-world implementation
10.2 Papers
- โDynamic Storage Allocation: A Survey and Critical Reviewโ (Wilson et al.)
- โHoard: A Scalable Memory Allocator for Multithreaded Applicationsโ (Berger et al.)
- โMimalloc: Free List Sharding in Actionโ (Leijen et al.)
10.3 Online Resources
10.4 Tools
- Valgrind: Memory error detection
- AddressSanitizer: Fast memory error detection
- heaptrack: Heap memory profiler
- perf: Performance profiling
10.5 Related Projects in This Series
- Previous: P13 (Virtual Memory Map Visualizer) - VM foundations
- Next: P15 (Robust Unix I/O Toolkit) - I/O patterns
- Uses concepts from: P2 (Bitwise Data Inspector) - bit manipulation for headers
- Uses concepts from: P9 (Cache Simulator) - locality for performance
11. Self-Assessment Checklist
Understanding
- I can explain why dynamic allocation is necessary
- I understand the difference between internal and external fragmentation
- I can describe at least 3 free list organizations and their trade-offs
- I can explain boundary tag coalescing in all 4 cases
- I understand alignment requirements and why they matter
- I can list 5+ allocator invariants and explain why each is important
Implementation
- mm_init creates valid heap with prologue and epilogue
- mm_malloc returns properly aligned blocks
- mm_malloc correctly splits blocks when possible
- mm_free correctly marks blocks and coalesces
- All 4 coalescing cases work correctly
- mm_realloc handles all edge cases
- Heap checker catches common errors
- No memory corruption on random traces
Performance
- Throughput meets minimum threshold (1000+ ops/sec)
- Utilization is reasonable (>50% on standard traces)
- Explicit free list improves performance over implicit
- I can explain which design choices affect throughput vs utilization
Growth
- I debugged at least one corruption bug using the heap checker
- I measured and improved performance through design changes
- I understand how production allocators (jemalloc, tcmalloc) work
- I can discuss allocator trade-offs in an interview setting
12. Real World Outcome
When you complete this project, hereโs exactly what youโll see when running your allocator with the test driver:
Running the Malloc Driver
$ ./mdriver -V
Team Name: my-malloc
Member 1: Student Developer
Using default tracefiles in ./traces/
Measuring performance with gettimeofday().
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000183 31115
1 yes 99% 5848 0.000172 34000
2 yes 98% 6648 0.000228 29158
3 yes 99% 5380 0.000136 39559
4 yes 66% 14400 0.000132109091
5 yes 92% 4800 0.000345 13913
6 yes 92% 4800 0.000341 14076
7 yes 55% 12000 0.004792 2504
8 yes 51% 24000 0.013844 1734
9 yes 27% 14401 0.052847 272
10 yes 34% 14401 0.002219 6490
Total 74% 112372 0.075239 1494
Perf index = 44 (util) + 40 (thru) = 84/100
Heap Checker Output
$ ./mdriver -c -V -f traces/short1.rep
Checking mm_init...
Prologue: 8 bytes at 0x7f8a1c000000 [8|1]
Epilogue: 4 bytes at 0x7f8a1c000008 [0|1]
Initial heap size: 4096 bytes
PASSED
Checking mm_malloc(32)...
Returned: 0x7f8a1c00000c (aligned: YES)
Block size: 40 bytes (32 + 8 overhead)
Header: [40|1], Footer: [40|1]
PASSED
Checking mm_malloc(64)...
Returned: 0x7f8a1c000034 (aligned: YES)
Block size: 72 bytes (64 + 8 overhead)
PASSED
Checking mm_free(0x7f8a1c00000c)...
Block marked free: [40|0]
Adjacent blocks: [ALLOC] [FREE] [ALLOC]
Coalescing: Not needed (both neighbors allocated)
PASSED
Checking mm_free(0x7f8a1c000034)...
Block marked free: [72|0]
Adjacent blocks: [FREE] [FREE] [EPILOGUE]
Coalescing: Case 2 (merge with previous)
New coalesced block: [112|0]
PASSED
=== HEAP INVARIANT CHECK ===
All blocks aligned: YES
Header/footer consistency: YES
No consecutive free blocks: YES
Free list consistency: YES
Total heap size matches: YES
All checks PASSED!
Performance Comparison
$ ./compare_allocators
=== ALLOCATOR COMPARISON ===
Trace: binary-bal.rep (mixed alloc/free pattern)
Implementation Throughput Utilization Score
----------------------------------------------------------
Implicit First Fit 1,420 Kops 72% 65/100
Implicit Best Fit 890 Kops 78% 61/100
Explicit First Fit 8,230 Kops 75% 82/100
Explicit Best Fit 4,120 Kops 82% 80/100
Segregated Fits 15,800 Kops 84% 94/100
System malloc 45,000 Kops ~90% (baseline)
Analysis:
- Explicit list 5.8x faster than implicit (only scanning free blocks)
- Best fit: +3-7% utilization, -45% throughput
- Segregated fits: best of both worlds
Debugging a Heap Corruption
$ ./mdriver -V -f traces/coalescing.rep
trace 4: Running...
ERROR: Heap checker failed after operation 47
=== HEAP DUMP ===
Address Size Status Header Footer Notes
--------------------------------------------------------------
0x7f8a1c000000 8 ALLOC [8|1] [8|1] Prologue
0x7f8a1c000008 40 FREE [40|0] [40|0]
0x7f8a1c000030 40 FREE [40|0] [40|0] << CONSECUTIVE FREE!
0x7f8a1c000058 72 ALLOC [72|1] [72|1]
0x7f8a1c0000a0 0 ALLOC [0|1] N/A Epilogue
INVARIANT VIOLATION:
Consecutive free blocks at 0x7f8a1c000008 and 0x7f8a1c000030
These should have been coalesced!
Hint: Check your coalesce() function, specifically Case 2
(current block free, next block free)
13. The Core Question Youโre Answering
โWhen I call malloc(100), where does that memory come from, how does the allocator track it, and what happens when I call free()?โ
This project takes you inside one of the most fundamental abstractions in systems programming. Youโll see that โgetting memoryโ is really about managing a large region of bytes, carving it into pieces, tracking which pieces are in use, and reassembling freed pieces efficiently. Youโll understand why fragmentation matters, why free() doesnโt shrink your process, and the brilliant engineering behind production allocators.
14. Concepts You Must Understand First
Before starting this project, ensure you understand these concepts:
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Pointer arithmetic in C | Youโll compute block addresses constantly | CS:APP 3.8, any C book Ch. 5-6 |
| How the heap grows (sbrk, brk) | Youโll extend the heap when out of space | CS:APP 9.9.1 |
| Alignment requirements | Every allocation must be aligned | CS:APP 9.9.1, 3.9.3 |
| Bit manipulation (pack/unpack) | Headers encode size and status in one word | CS:APP Chapter 2 |
| Linked list operations | Free lists are linked lists | Data structures basics |
| What fragmentation means | Your allocator must minimize it | CS:APP 9.9.3-9.9.4 |
15. Questions to Guide Your Design
Work through these questions BEFORE writing code:
-
Block Layout: How big is your header? Do you need a footer? Whatโs your minimum block size?
-
Alignment: How do you ensure every returned pointer is 8-byte (or 16-byte) aligned? How does this affect block sizes?
-
Finding Free Blocks: Will you use implicit list, explicit list, or segregated lists? What are the trade-offs?
-
Placement Policy: First fit, next fit, or best fit? What are the implications for throughput and utilization?
-
Splitting: When do you split a block? Whatโs the minimum remainder size worth keeping?
-
Coalescing: Will you coalesce immediately or defer? How do footers enable O(1) coalescing with the previous block?
-
Heap Extension: How much should you extend when you run out of space? Small or large chunks?
16. Thinking Exercise
Before writing any code, trace through this scenario by hand:
Starting heap state (block addresses and sizes shown):
[PROLOGUE 8|1] [FREE 80|0] [ALLOC 48|1] [FREE 120|0] [EPILOGUE 0|1]
Operations to trace:
void *p1 = mm_malloc(32); // Needs 40 bytes with overhead
void *p2 = mm_malloc(100); // Needs 112 bytes with overhead (round up)
mm_free(p1);
void *p3 = mm_malloc(60); // Needs 72 bytes with overhead
Exercise: On paper, answer:
-
After malloc(32): Which block was used? Was it split? Draw the new heap layout with all headers/footers.
-
After malloc(100): Which block was used? What happens if no single block is big enough?
-
After free(p1): Was coalescing needed? Draw the free list if using explicit list.
-
After malloc(60): Which block was chosen (first fit)? Would best fit choose differently?
Verify your answers by implementing and adding debug printing.
17. The Interview Questions Theyโll Ask
After completing this project, youโll be ready for these common interview questions:
- โHow does malloc work internally?โ
- Expected: Manages heap via free lists, finds/splits blocks, extends heap if needed
- Bonus: Discuss headers, footers, coalescing, and alignment
- โWhatโs the difference between internal and external fragmentation?โ
- Expected: Internal = wasted space inside allocated blocks; External = scattered free blocks canโt satisfy large request
- Bonus: Give concrete examples, discuss mitigation strategies
- โExplain how you would implement free().โ
- Expected: Mark block as free, coalesce with neighbors, add to free list
- Bonus: Describe all 4 coalescing cases, explain why footers are needed
- โWhat placement policy would you use and why?โ
- Expected: First fit (fast) vs. best fit (better utilization) trade-off
- Bonus: Discuss next fit, segregated fits, and when each is appropriate
- โHow do production allocators like jemalloc achieve high performance?โ
- Expected: Thread-local caches, size classes, minimal locking
- Bonus: Discuss arena-based allocation, slab allocators, NUMA awareness
- โWhy doesnโt free() return memory to the OS immediately?โ
- Expected: Heap can only shrink from the top; fragmentation prevents shrinking
- Bonus: Discuss mmap for large allocations, MADV_DONTNEED, memory overcommit
18. Hints in Layers
If youโre stuck, reveal hints one at a time:
Hint 1: Getting Started with Block Macros
Define these macros first - theyโre the foundation:
#define WSIZE 4 // Word size (bytes)
#define DSIZE 8 // Double word size (bytes)
#define ALIGNMENT 8
// Pack size and allocated bit into a word
#define PACK(size, alloc) ((size) | (alloc))
// Read and write a word at address p
#define GET(p) (*(unsigned int *)(p))
#define PUT(p, val) (*(unsigned int *)(p) = (val))
// Read size and allocated fields from header/footer
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)
// Given block ptr bp, compute header and footer addresses
#define HDRP(bp) ((char *)(bp) - WSIZE)
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
// Given block ptr bp, compute next and previous block addresses
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)))
#define PREV_BLKP(bp) ((char *)(bp) - GET_SIZE((char *)(bp) - DSIZE))
Test these macros extensively before using them!
Hint 2: The Coalescing Cases
When you free a block, check all four cases:
Case 1: [ALLOC] [FREE*] [ALLOC] -> just mark free
Case 2: [ALLOC] [FREE*] [FREE] -> merge current + next
Case 3: [FREE] [FREE*] [ALLOC] -> merge prev + current
Case 4: [FREE] [FREE*] [FREE] -> merge prev + current + next
For cases 3 and 4, remember to return the NEW block pointer (prev blockโs address).
To check if previous is free, you need its footer: look at (bp - DSIZE).
Hint 3: Debugging with a Heap Checker
Implement mm_check() early! Check these invariants:
int mm_check(void) {
char *bp;
// Check prologue
if (GET_SIZE(heap_listp - WSIZE) != DSIZE ||
!GET_ALLOC(heap_listp - WSIZE))
return 0; // Bad prologue
// Walk all blocks
for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
// Check alignment
if ((size_t)bp % ALIGNMENT != 0) {
printf("Block %p not aligned!\n", bp);
return 0;
}
// Check header/footer match
if (GET(HDRP(bp)) != GET(FTRP(bp))) {
printf("Header/footer mismatch at %p\n", bp);
return 0;
}
// Check for consecutive free blocks
if (!GET_ALLOC(HDRP(bp)) && !GET_ALLOC(HDRP(NEXT_BLKP(bp)))) {
printf("Consecutive free blocks at %p\n", bp);
return 0;
}
}
return 1; // All checks passed
}
Call mm_check() after every malloc/free during debugging.
Hint 4: Explicit Free List
For explicit lists, free blocks contain next/prev pointers in their payload area:
Free block layout:
[HEADER 4B] [NEXT PTR 8B] [PREV PTR 8B] [...padding...] [FOOTER 4B]
Minimum free block size: 4 + 8 + 8 + 4 = 24 bytes (round to 32 for alignment)
When allocating, remove from list:
void remove_from_free_list(void *bp) {
void *prev = PREV_FREE(bp);
void *next = NEXT_FREE(bp);
if (prev) NEXT_FREE(prev) = next;
else free_list_head = next;
if (next) PREV_FREE(next) = prev;
}
When freeing, add to list (LIFO is simplest):
void add_to_free_list(void *bp) {
NEXT_FREE(bp) = free_list_head;
PREV_FREE(bp) = NULL;
if (free_list_head) PREV_FREE(free_list_head) = bp;
free_list_head = bp;
}
19. Books That Will Help
| Topic | Book | Chapter/Section |
|---|---|---|
| Dynamic memory allocation overview | CS:APP 3rd Ed | Chapter 9.9 โDynamic Memory Allocationโ |
| Allocator requirements and goals | CS:APP 3rd Ed | Chapter 9.9.1 โThe malloc and free Functionsโ |
| Fragmentation | CS:APP 3rd Ed | Chapter 9.9.3-9.9.4 โFragmentationโ |
| Implementation details | CS:APP 3rd Ed | Chapter 9.9.6-9.9.13 โImplicit Free Listsโ through โExplicit Free Listsโ |
| Segregated storage | CS:APP 3rd Ed | Chapter 9.9.14 โSegregated Free Listsโ |
| Coalescing strategies | CS:APP 3rd Ed | Chapter 9.9.10 โCoalescing Free Blocksโ |
| Production allocators | The Linux Programming Interface | Chapter 7 โMemory Allocationโ |
| Hoard allocator (scalable) | Paper: โHoard: A Scalable Memory Allocator for Multithreaded Applicationsโ | - |
| jemalloc design | jemalloc documentation | - |
20. Submission / Completion Criteria
Minimum Viable Completion:
- mm_init, mm_malloc, mm_free work correctly
- Passes basic correctness tests
- Heap checker implemented and passing
- Implicit free list with first fit
Full Completion:
- All of the above plus mm_realloc
- Explicit free list implemented
- Good throughput and utilization scores
- All 4 coalescing cases verified
Excellence (Going Above & Beyond):
- Segregated free lists
- Footer elimination optimization
- Thread-safe version
- Comparison with production allocators
- Detailed performance analysis report
This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.