Project 1: Memory Allocator (malloc/free from scratch)
Build a production-style heap allocator that can replace
mallocandfreeand explain every byte it manages.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C (Alternatives: Rust, C++) |
| Alternative Programming Languages | Rust, C++ |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | Resume Gold |
| Prerequisites | Pointers, structs, process memory layout, basic syscalls |
| Key Topics | Virtual memory, free lists, fragmentation, alignment, mmap/sbrk |
1. Learning Objectives
By completing this project, you will:
- Implement a heap allocator that supports
malloc,free,calloc, andrealloc. - Explain virtual memory, the heap, and how
sbrkandmmapgrow address space. - Design and debug free list data structures with coalescing and splitting.
- Enforce alignment requirements and understand why misalignment breaks programs.
- Measure fragmentation and describe the trade-offs between speed and memory efficiency.
- Replace the system allocator in a real program using
LD_PRELOAD.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Virtual Memory and the Heap
Description / Expanded Explanation
Virtual memory gives each process the illusion of a large, contiguous address space. The heap is just a region of that address space that the allocator manages. The OS does not give you “objects” or “arrays”; it gives you pages of memory and a set of rules. Your allocator is the system that turns pages into reusable blocks for programs.
Definitions & Key Terms
- virtual address space -> per-process address map that is translated to physical memory
- page -> fixed-size memory unit (usually 4 KB) managed by the OS
- heap -> region of virtual memory used for dynamic allocations
- program break -> end of the heap managed by
sbrk - mapping -> association of virtual pages to physical frames
Mental Model Diagram (ASCII)
High addresses
+---------------------------+ <-- kernel space
| kernel |
+---------------------------+
| stack | grows down
| |
| (gap) |
| |
| heap | grows up
+---------------------------+ <-- program break
| mmap regions |
+---------------------------+
| data | bss | text |
+---------------------------+
Low addresses
Mental Model Diagram (Image)

How It Works (Step-by-Step)
- The OS loads your program and maps text, data, and stack regions.
- The heap starts small; the program break marks its end.
sbrkmoves the break upward, mapping more pages.mmapcreates separate regions for large allocations.- Your allocator carves blocks out of these regions and tracks them.
Minimal Concrete Example
void *p = sbrk(0); // current break
void *q = sbrk(4096); // grow heap by one page
assert((char*)q - (char*)p == 4096);
Common Misconceptions
- “malloc gets memory from the OS every time” -> allocators reuse free blocks aggressively.
- “heap is a single array” -> it is a collection of mapped pages that can be fragmented.
Check-Your-Understanding Questions
- Why can two processes both use virtual address 0x1000 without conflict?
- What is the program break and why does it move?
- When would you choose
mmapinstead ofsbrk? - Predict what happens if you
sbrk(-4096)after allocations.
Where You’ll Apply It
- See 3.1 and 3.5 for allocator boundaries and metadata layout.
- See 4.1 and 4.4 for heap growth strategies.
- Also used in: P05 Container From Scratch
2.2 Block Metadata and Alignment
Description / Expanded Explanation
Every allocated block needs metadata. You need to know its size, whether it is free, and sometimes its neighbors. The allocator must ensure that the pointer returned to the user is properly aligned so that any C type can be stored safely.
Definitions & Key Terms
- header -> metadata placed before user data
- footer -> metadata at the end of a block (often size)
- alignment -> address is a multiple of a power of two (often 16)
- padding -> extra bytes inserted to meet alignment
Mental Model Diagram (ASCII)
+---------+--------------------+--------+
| header | user payload | footer |
+---------+--------------------+--------+
^ ^
block start block end
Mental Model Diagram (Image)

How It Works (Step-by-Step)
- Round the requested size up to the alignment.
- Add space for the header (and footer if used).
- Return a pointer after the header.
- On
free, subtract the header size to find metadata.
Minimal Concrete Example
#define ALIGNMENT 16
#define ALIGN(sz) (((sz) + (ALIGNMENT-1)) & ~(ALIGNMENT-1))
struct header { size_t size; int is_free; };
void *malloc_impl(size_t n) {
size_t total = ALIGN(n) + sizeof(struct header);
struct header *h = request_block(total);
h->size = total;
h->is_free = 0;
return (void*)(h + 1);
}
Common Misconceptions
- “Alignment only matters for performance” -> some architectures crash on misaligned access.
- “Metadata is just overhead” -> without it you cannot free or coalesce safely.
Check-Your-Understanding Questions
- Why is 16-byte alignment a safe default on 64-bit systems?
- What happens if
freeis given a pointer in the middle of a block? - How does alignment affect internal fragmentation?
Where You’ll Apply It
- See 3.5 for the exact block layout and header format.
- See 4.4 for block splitting logic.
- Also used in: P07 Lock-Free Queue
2.3 Free Lists, Splitting, and Coalescing
Description / Expanded Explanation
Allocators reuse freed memory by tracking free blocks in data structures. Splitting a large block into a smaller used block plus a remainder reduces waste, while coalescing adjacent free blocks reduces external fragmentation.
Definitions & Key Terms
- free list -> linked list (or tree) of free blocks
- splitting -> dividing a large block into used and free parts
- coalescing -> merging adjacent free blocks
- external fragmentation -> free memory exists but is not contiguous
Mental Model Diagram (ASCII)
Before free:
[USED 64][USED 128][USED 64]
After free middle:
[USED 64][FREE 128][USED 64]
After coalescing neighbors:
[FREE 256][USED 64] (if two adjacent frees)
Mental Model Diagram (Image)

How It Works (Step-by-Step)
freemarks the block as free and inserts it into the free list.mallocsearches the free list for a block large enough.- If the block is larger than needed, split it into two blocks.
- On
free, check neighbors; if free, merge and update size.
Minimal Concrete Example
void free_impl(void *ptr) {
struct header *h = ((struct header*)ptr) - 1;
h->is_free = 1;
insert_free_list(h);
try_coalesce(h);
}
Common Misconceptions
- “Best-fit always wins” -> it reduces fragmentation but increases search time.
- “Coalescing can wait forever” -> delayed coalescing can explode fragmentation.
Check-Your-Understanding Questions
- What is the difference between internal and external fragmentation?
- Why do boundary tags make coalescing O(1)?
- When might a free list become a performance bottleneck?
Where You’ll Apply It
- See 4.4 for allocation algorithms and complexity.
- See 5.10 Phase 2 for implementing splitting and coalescing.
- Also used in: P03 Persistent Key-Value Store
2.4 mmap vs sbrk Strategy
Description / Expanded Explanation
Small allocations typically come from a contiguous heap region (sbrk), while large allocations are better served by mmap so they can be returned directly to the OS. This split avoids heap fragmentation and improves performance for large blocks.
Definitions & Key Terms
sbrk-> moves the program break (heap end)mmap-> maps pages directly, can be unmapped- arena -> independent heap region managed by allocator
- threshold -> size cutoff for deciding
sbrkvsmmap
Mental Model Diagram (ASCII)
Heap arena (sbrk):
[small blocks][small blocks][small blocks]
Large allocations (mmap):
[ separate mapping ]
[ separate mapping ]
Mental Model Diagram (Image)

How It Works (Step-by-Step)
- Decide a size threshold (for example 128 KB).
- Requests below threshold go to the heap arena.
- Requests above threshold use
mmapand store a flag in metadata. - On
free,munmaplarge blocks directly to the OS.
Minimal Concrete Example
if (size >= MMAP_THRESHOLD) {
void *p = mmap(NULL, size, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
mark_mmap_block(p, size);
return p;
}
Common Misconceptions
- “mmap is always slower” -> it can be faster for large, long-lived allocations.
- “sbrk memory can always be returned” -> the heap only shrinks if the top is free.
Check-Your-Understanding Questions
- Why does
munmapreturn memory more reliably than shrinking the heap? - What risks exist if you
mmapmany tiny allocations? - How should the threshold be chosen?
Where You’ll Apply It
- See 3.2 and 3.5 for allocation policy and metadata flags.
- See 5.10 Phase 3 for large allocation support.
- Also used in: P02 Load Balancer for buffer management
2.5 Fragmentation Metrics and Debugging
Description / Expanded Explanation
Fragmentation is the allocator’s hidden cost. You need to measure how much memory is wasted and where. Without instrumentation you cannot tell if your allocator is behaving well or just growing the heap forever.
Definitions & Key Terms
- internal fragmentation -> wasted space inside allocated blocks
- external fragmentation -> unusable free space between blocks
- heap utilization -> used bytes / total heap bytes
- canary -> sentinel value to detect corruption
Mental Model Diagram (ASCII)
Heap view:
[USED 24][PAD 8][USED 40][FREE 32][USED 16]
Used bytes: 24 + 40 + 16 = 80
Total heap: 24+8+40+32+16 = 120
Utilization = 80 / 120 = 0.66
Mental Model Diagram (Image)

How It Works (Step-by-Step)
- Track total heap size and total allocated bytes.
- On each
malloc, log requested size vs actual block size. - Report utilization and fragmentation percentages.
- Add canaries and validate on
freefor corruption.
Minimal Concrete Example
stats.total_heap += block->size;
stats.used_bytes += requested;
printf("utilization=%.2f\n", (double)stats.used_bytes / stats.total_heap);
Common Misconceptions
- “If tests pass, fragmentation is fine” -> long-running workloads may fail.
- “Only external fragmentation matters” -> internal padding can be huge for small allocations.
Check-Your-Understanding Questions
- How can you quantify external fragmentation in a running allocator?
- Why do canaries help detect buffer overflows?
- What types of workloads maximize fragmentation?
Where You’ll Apply It
- See 6.1 and 6.2 for test categories and critical cases.
- See 7.3 for performance traps.
- Also used in: P03 Persistent Key-Value Store
3. Project Specification
3.1 What You Will Build
You will build a shared-library memory allocator that implements malloc, free, calloc, and realloc and can replace the system allocator via LD_PRELOAD. It will:
- manage heap growth using
sbrkandmmap - reuse freed memory via a free list
- split and coalesce blocks
- enforce alignment and detect double-free
- expose debug and stats modes for heap visualization
Out of scope: thread-safe allocation across multiple threads, per-CPU arenas, and advanced caching.
3.2 Functional Requirements
- malloc: return aligned pointers for any request size > 0.
- free: accept any pointer returned by your allocator and reclaim it.
- calloc: return zeroed memory.
- realloc: resize allocations while preserving data.
- Coalescing: merge adjacent free blocks.
- Large allocations: use
mmapabove a threshold. - Debug mode: heap dump and stats via env var.
- Safety: detect double-free and corrupted headers.
3.3 Non-Functional Requirements
- Performance: within 3-5x of glibc malloc on microbenchmarks.
- Reliability: should run simple tools like
/bin/lswithout crashes. - Usability: clear logging and deterministic test mode.
3.4 Example Usage / Output
$ LD_PRELOAD=./libmymalloc.so MALLOC_DEBUG=1 ./malloc-demo
[MALLOC] request=32 -> block=48 @0x70000010
[FREE] block=0x70000010 size=48
[STATS] heap=4096 used=32 free=4032 util=0.78
3.5 Data Formats / Schemas / Protocols
Block header (stored before user data):
struct block_header {
size_t size; // total block size, including header
uint8_t is_free; // 1 if free
uint8_t is_mmap; // 1 if mapped via mmap
uint16_t pad;
struct block_header *next;
struct block_header *prev;
};
Debug heap dump format:
[BLOCK addr=0x70000000 size=128 state=USED]
[BLOCK addr=0x70000080 size=256 state=FREE]
3.6 Edge Cases
malloc(0)returns NULL or a unique pointer (document choice).free(NULL)is a no-op.realloc(ptr, 0)frees the block.realloc(NULL, size)is equivalent tomalloc(size).- freeing a pointer not owned by the allocator must be detected and rejected.
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
# From project root
make clean && make
# Run deterministic demo with fixed mmap base
MALLOC_DETERMINISTIC=1 MALLOC_DEBUG=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo
3.7.2 Golden Path Demo (Deterministic)
The demo uses a fixed mapping base so addresses are stable across runs.
3.7.3 CLI Transcript (Success)
$ MALLOC_DETERMINISTIC=1 MALLOC_DEBUG=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo
[ALLOC] req=64 align=64 block=80 @0x70000000
[ALLOC] req=32 align=32 block=48 @0x70000050
[FREE] block=0x70000050 size=48
[ALLOC] req=16 align=16 block=32 @0x70000050
[STATS] heap=4096 used=80 free=3984 util=0.78
$ echo $?
0
3.7.3 CLI Transcript (Failure)
$ MALLOC_DETERMINISTIC=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo --double-free
[FREE] block=0x70000050 size=48
[ERROR] double-free detected at 0x70000050
$ echo $?
2
3.7.4 If API
Not applicable.
3.7.5 If Library
- Install:
makeproduceslibmymalloc.so - Usage:
LD_PRELOAD=./libmymalloc.so ./program - Returns:
mallocreturns aligned pointer or NULL - Error handling: set
errno=ENOMEMon failure
3.7.6 Exit Codes
0success2allocator detected corruption or double-free3OS allocation failure (sbrk/mmap)
4. Solution Architecture
4.1 High-Level Design
+------------------------------+
| malloc/free API |
+------------------------------+
| allocator core |
| - free list |
| - split/coalesce |
| - mmap threshold |
+------------------------------+
| OS interface |
| - sbrk |
| - mmap/munmap |
+------------------------------+
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Heap arena | Manage contiguous heap blocks | sbrk-based growth |
| Free list | Track free blocks | doubly-linked list for O(1) remove |
| Large alloc | Handle big requests | mmap for direct release |
| Debugger | Heap dump and stats | env flags to avoid overhead |
4.3 Data Structures (No Full Code)
struct free_node {
size_t size;
int is_free;
struct free_node *next;
struct free_node *prev;
};
4.4 Algorithm Overview
Allocation
- Align request size and add header.
- Search free list for a fit.
- If found, split if remainder is large enough.
- If not found, grow heap or mmap.
Free
- Validate pointer and header.
- Mark free and insert into free list.
- Coalesce with neighbors.
Complexity Analysis
- Time: O(n) worst-case search in free list
- Space: O(1) per block metadata
5. Implementation Guide
5.1 Development Environment Setup
# Linux or macOS
make
./malloc-demo
5.2 Project Structure
project-root/
├── src/
│ ├── allocator.c
│ ├── allocator.h
│ └── debug.c
├── tests/
│ └── test_allocator.c
├── Makefile
└── README.md
5.3 The Core Question You’re Answering
“When I call malloc, where do those bytes come from, and how do I get them back safely?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Virtual memory and page mapping
- Pointer arithmetic and alignment
- Free list invariants
- Fragmentation metrics
mmapvssbrktrade-offs
5.5 Questions to Guide Your Design
- How will you detect double-free without heavy overhead?
- What is your minimum block size?
- What strategy will you use to search the free list?
- When do you coalesce and when do you delay it?
5.6 Thinking Exercise
Trace this sequence and draw the heap:
malloc(24)
malloc(64)
free(first)
malloc(16)
free(second)
free(third)
5.7 The Interview Questions They’ll Ask
- Explain internal vs external fragmentation.
- Why do allocators use boundary tags?
- What happens if you return unaligned pointers?
- How would you make this thread-safe?
5.8 Hints in Layers
Hint 1: Build a bump allocator first to validate heap growth. Hint 2: Use a doubly-linked free list to simplify removal. Hint 3: Add a debug heap dump to visualize fragmentation.
5.9 Books That Will Help
| Topic | Book | Chapter | |——-|——|———| | Virtual memory | Computer Systems: A Programmer’s Perspective | Ch. 9 | | Allocator design | Operating Systems: Three Easy Pieces | Ch. 17 | | C interfaces | C Interfaces and Implementations | Ch. 5 |
5.10 Implementation Phases
Phase 1: Foundation (3-4 days)
Goals:
- Implement aligned bump allocation
- Basic heap growth
Tasks:
- Implement
sbrkwrapper with logging. - Return aligned pointers. Checkpoint: allocate and free in a test program.
- Implement
Phase 2: Core Functionality (7-10 days)
Goals:
- Free list with split/coalesce
reallocandcallocTasks:- Insert freed blocks into a list.
- Implement splitting and coalescing. Checkpoint: run tests for reuse and fragmentation.
Phase 3: Polish and Edge Cases (4-6 days)
Goals:
- mmap for large blocks
- debug stats and leak detection
Tasks:
- Add mmap threshold and
munmapon free. - Add heap dump and corruption detection.
Checkpoint: run
/bin/lswithLD_PRELOAD.
- Add mmap threshold and
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale | |———-|———|—————-|———–| | Search strategy | first-fit, best-fit, next-fit | first-fit | simplest and fast enough | | Coalescing | immediate, deferred | immediate | reduces fragmentation early | | Large alloc | sbrk only, mmap threshold | mmap threshold | return memory to OS |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples | |———-|———|———-| | Unit Tests | Verify block metadata and alignment | alignment, header integrity | | Integration Tests | Run real programs | /bin/ls with LD_PRELOAD | | Edge Case Tests | Validate error handling | double-free, malloc(0) |
6.2 Critical Test Cases
- Reuse: malloc 100, free, malloc 100 -> no new
sbrk. - Coalesce: allocate A,B,C; free B and A -> one merged free block.
- Large alloc: request > threshold uses
mmap. - Corruption: overwrite boundary and detect via canary.
6.3 Test Data
requests: [8, 24, 64, 128, 512, 4096]
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution | |———|———|———-| | Misaligned pointers | bus error on some CPUs | align to 16 bytes | | Free list corruption | random crashes | add canaries, validate list | | No coalescing | heap grows endlessly | merge adjacent blocks |
7.2 Debugging Strategies
- Use
straceto verifysbrkandmmapcalls. - Add heap dumps after every allocation during early development.
- Run with AddressSanitizer to catch writes beyond bounds.
7.3 Performance Traps
- Linear free list search on every allocation can be O(n).
- Coalescing with naive neighbor search can become O(n^2).
8. Extensions & Challenges
8.1 Beginner Extensions
- Add
callocandreallocsupport. - Add
malloc_statsoutput.
8.2 Intermediate Extensions
- Implement segregated free lists by size class.
- Add small object caching (per-size bins).
8.3 Advanced Extensions
- Thread-safe allocator with per-thread arenas.
- Implement
madvisefor lazy page release.
9. Real-World Connections
9.1 Industry Applications
- Memory allocators in libc, JVMs, and database engines.
- High-performance services tune allocators for latency and fragmentation.
9.2 Related Open Source Projects
- jemalloc
- tcmalloc
- mimalloc
9.3 Interview Relevance
- Explains heap allocation, fragmentation, and OS interfaces.
- Demonstrates data structure design under constraints.
10. Resources
10.1 Essential Reading
- Computer Systems: A Programmer’s Perspective (Chapter 9)
- Operating Systems: Three Easy Pieces (Chapter 17)
10.2 Video Resources
- Systems programming lectures on memory allocation
10.3 Tools & Documentation
strace,gdb,valgrind,perf
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain virtual memory and the program break.
- I can describe free list strategies and trade-offs.
- I can explain why alignment matters.
11.2 Implementation
- All allocator functions pass unit tests.
- Heap dumps match expected layouts.
- Double-free and corruption are detected.
11.3 Growth
- I can describe allocator design in an interview.
- I can compare my allocator to jemalloc at a high level.
12. Submission / Completion Criteria
Minimum Viable Completion:
mallocandfreework for basic tests.LD_PRELOADworks with at least one external program.- Alignment is correct for 16-byte boundaries.
Full Completion:
- All functions implemented (
malloc,free,calloc,realloc). - Coalescing and splitting reduce fragmentation.
- Deterministic demo passes with stable output.
Excellence (Going Above & Beyond):
- Segregated free lists or bins implemented.
- Thread-safe version with per-thread caches.