Project 4: Memory Allocator with Debugging

Build a custom allocator with size classes, coalescing, and leak detection.

Quick Reference

Attribute	Value
Difficulty	Advanced
Time Estimate	20-30 hours
Main Programming Language	C (Alternatives: Rust)
Alternative Programming Languages	Rust
Coolness Level	Level 4
Business Potential	Resume gold
Prerequisites	C pointers, memory layout, debugging tools
Key Topics	allocators, free lists, coalescing, debugging

1. Learning Objectives

By completing this project, you will:

Implement segregated free lists for fast allocation.
Use boundary tags to coalesce adjacent free blocks.
Extend the heap with sbrk or mmap and manage arenas.
Add debugging features: leak detection, double free detection, canaries.
Measure fragmentation and allocator performance.

2. All Theory Needed (Per-Concept Breakdown)

Segregated Free Lists and Size Classes

Fundamentals

A free list is a linked list of unused memory blocks that can be reused for future allocations. Segregated free lists divide free blocks into size classes, such as 16B, 32B, 64B, etc. This allows O(1) allocation for small sizes and reduces search time. Instead of scanning a single large list, you look at the list for the requested size class. If it is empty, you look at the next larger class. This design balances speed and fragmentation control. In systems programming, size classes are a key optimization because they reduce both allocation time and metadata overhead while keeping memory reuse efficient.

Deep Dive into the concept

Segregated free lists are the core of most production allocators. The idea is that allocation requests are not uniform; many allocations are small and repeated. By grouping free blocks into size classes, you can return memory quickly and keep hot blocks in cache. The simplest size class scheme is power-of-two classes (16, 32, 64, 128, …). This makes indexing trivial: class = log2(size). However, power-of-two classes can increase internal fragmentation because a 33-byte allocation may consume a 64-byte block. To reduce fragmentation, many allocators use finer-grained classes for small sizes (e.g., 8-byte increments up to 128 bytes) and power-of-two classes for larger sizes. For this project, you can implement a hybrid scheme: fixed small classes up to 256 bytes, then doubling for larger sizes.

Each free block typically stores metadata in its header: size, allocation status, and pointers for the free list. In a segregated list, blocks of similar sizes are linked together. When an allocation request arrives, you round the size up to the nearest class, then check the corresponding free list. If it is empty, you can either search larger classes (best-fit) or request more memory from the system and split it into blocks. A fast allocator uses a “first-fit within class” strategy: take the first block from the class list. This is O(1) and usually good enough for a learning allocator.

The tradeoff is fragmentation. Internal fragmentation happens when a block is larger than requested. External fragmentation happens when free blocks are too small or scattered to satisfy a large allocation. Size classes reduce allocation time but can increase internal fragmentation. You can track fragmentation by measuring total free bytes vs total unused bytes inside allocated blocks. In this project, you should compute and report both.

Segregated lists also reduce contention in multi-threaded allocators, because each thread can have its own small-object cache per size class. You do not need to implement threading here, but you should design with the mental model: size classes are about locality and fast reuse, not just speed. Many allocators, such as jemalloc and tcmalloc, use size classes plus per-thread caches. The design you implement will give you insight into how these systems work.

Another subtlety is alignment. Most allocators must return pointers aligned to 8 or 16 bytes. This means you must round allocation sizes up to the alignment and ensure block headers are also aligned. Misalignment can cause crashes or performance penalties on some architectures. Therefore, your size classes should be multiples of the alignment. The header should be placed so that the returned pointer is aligned. A typical layout is: [header][payload], where header size is a multiple of alignment. The free list pointers can be stored in the payload area when the block is free, which avoids extra overhead.

Finally, you must integrate size classes with coalescing. When a block is freed, it may be merged with neighbors to form a larger block. This merged block must be inserted into the appropriate size class list. This means your allocator must remove neighbor blocks from their lists before coalescing. This is a common source of bugs and a place where clear invariants are critical.

How this fit on projects

Segregated free lists are required in §3.2 and used in §4.4. They are implemented in §5.10 Phase 1 and tested in §6.2. They are also referenced in P01-in-memory-cache-lru.md for memory accounting and in P06-capstone-simple-database-engine.md for buffer pool allocation.

Definitions & key terms

free list -> linked list of free blocks
size class -> grouping of blocks by size
internal fragmentation -> wasted space inside allocated block
external fragmentation -> free space unusable due to scattering
alignment -> required boundary for returned pointers

Mental model diagram

Class 16: [block]->[block]
Class 32: [block]
Class 64: [block]->[block]->[block]

How it works (step-by-step, with invariants and failure modes)

Round requested size up to alignment and class.
Check free list for class; if empty, look for larger class.
If larger block found, split and return part.
Update free list pointers and block headers.
On free, insert block into its class list.

Invariants: block sizes match class; free list pointers valid; headers store correct size. Failure modes: incorrect class calculation, list corruption, or misalignment.

Minimal concrete example

size_t cls = size_to_class(size);
block_t *b = free_lists[cls];
if (b) { free_lists[cls] = b->next; return payload(b); }

Common misconceptions

“One free list is enough” -> it leads to O(n) searches.
“Power-of-two classes are always optimal” -> they can waste memory.
“Alignment doesn’t matter” -> it matters for correctness and performance.

Check-your-understanding questions

Why do size classes improve allocation speed?
What is internal fragmentation and how do size classes affect it?
Why must blocks be aligned?

Check-your-understanding answers

They reduce search time to O(1) by grouping similar sizes.
Internal fragmentation is wasted space inside a block; coarse classes increase it.
Misalignment can break ABI requirements and slow memory access.

Real-world applications

jemalloc and tcmalloc use size classes.
Memory pools in game engines and servers.

Where you’ll apply it

This project: §3.2, §4.4, §5.10 Phase 1.
Also used in: P06-capstone-simple-database-engine.md.

References

Computer Systems: A Programmer’s Perspective, Ch. 9.9
Operating Systems: Three Easy Pieces, Ch. 17

Key insights

Size classes turn allocation into a fast lookup at the cost of controlled internal fragmentation.

Summary

Segregated free lists are the allocator’s fast path. They must be aligned, carefully indexed, and integrated with coalescing to maintain correctness.

Homework/Exercises to practice the concept

Implement a size class mapping function and test boundary sizes.
Build a free list array and allocate from it in O(1).

Solutions to the homework/exercises

Verify sizes map to expected classes (e.g., 17 bytes -> 24 or 32).
Remove head from list on allocation and push on free.

Boundary Tags and Coalescing

Fundamentals

Coalescing is the process of merging adjacent free blocks to reduce external fragmentation. Boundary tags store the block size at both the header and the footer of a block, enabling you to find neighboring block sizes when freeing. If the previous or next block is free, you merge them into a larger block and update headers and footers. This allows large allocations to succeed even after many small allocations and frees. Coalescing is critical for allocator correctness and long-term performance.

From a systems perspective, you should be able to explain the invariants this concept maintains, the resources it consumes (CPU, memory, IO), and the typical failure modes when it is misused. This basic framing will help you reason about correctness before you optimize or extend the design.

Deep Dive into the concept

Boundary tags are a classic allocator technique. Each block contains a header with size and allocation status, and a footer with the same size. When you free a block, you can look at the footer of the previous block to determine its size, then compute its header address. Similarly, you can look at the header of the next block. This allows you to determine whether neighbors are free and merge them without scanning the heap. The merge process involves removing neighbor blocks from their free lists, combining sizes, and inserting the merged block into the appropriate size class. The resulting block has a header and footer that reflect the new size.

Coalescing can be immediate or lazy. Immediate coalescing merges blocks on every free. Lazy coalescing defers merging until allocation time, which can reduce free-time cost but increases allocation complexity. Immediate coalescing is easier to reason about and is recommended for this project. However, immediate coalescing requires careful list management: you must remove neighbor blocks from their free lists before merging or you will corrupt lists and double free later. This is a common bug.

The allocator must also manage a prologue and epilogue block to simplify boundary checks. A prologue is a small allocated block at the start of the heap; an epilogue is a zero-sized allocated block at the end. These sentinel blocks prevent you from coalescing beyond the heap boundaries. When you extend the heap, you replace the epilogue with a new free block and add a new epilogue. This simplifies logic and reduces edge cases.

Another subtlety is that boundary tags add overhead to each block. The header and footer consume memory, which contributes to internal fragmentation. Some allocators omit footers for allocated blocks and only maintain them for free blocks, but this complicates coalescing. For this project, keep headers and footers for all blocks for simplicity. You can optimize later.

Coalescing affects performance and fragmentation metrics. Without coalescing, external fragmentation grows quickly. With coalescing, you can satisfy larger allocations but may increase time spent on free operations. In practice, coalescing is essential for long-running systems. Your debugging output should include coalescing events to help verify correctness. A deterministic test should allocate and free blocks in a pattern that creates adjacent free blocks, then verify that they merge.

It is also useful to define a canonical coalescing order to keep your free lists consistent. For example, always coalesce with the previous block first, then with the next. This makes it easier to reason about pointer updates and to debug. When you remove neighbors from free lists, do it before rewriting their headers to avoid losing list pointers. In debug builds, you can mark a block as "in_free_list" to catch double insertions. These small practices make allocator bugs far easier to diagnose.

The correctness of coalescing is foundational. If you miscompute sizes, you can merge overlapping blocks and corrupt the heap. This can lead to subtle crashes far from the actual bug. Therefore, boundary tag validation should be part of your debug mode. You can add a heap consistency checker that scans the heap and verifies that each block’s header and footer match, and that free list blocks are actually free. This will save you time when debugging.

How this fit on projects

Coalescing is required in §3.2 and implemented in §5.10 Phase 2. It is tested in §6.2 and used in §7.1 debugging. It also appears in P05-btree-file-index.md as an analogy to page coalescing.

Definitions & key terms

boundary tag -> size stored in header and footer
coalescing -> merging adjacent free blocks
prologue/epilogue -> sentinel blocks at heap boundaries
external fragmentation -> free space scattered across heap

Mental model diagram

[free 32][free 64] -> coalesce -> [free 96]

How it works (step-by-step, with invariants and failure modes)

Free block and mark header/footer as free.
Check previous block footer and next block header.
If neighbor free, remove neighbor from free list.
Merge sizes and update header/footer.
Insert merged block into correct size class.

Invariants: header/footer sizes match; free lists contain only free blocks. Failure modes: double insertion, wrong size leading to heap corruption.

Minimal concrete example

if (prev_free) remove_free(prev);
if (next_free) remove_free(next);
size = prev->size + curr->size + next->size;
write_header_footer(merged, size, FREE);

Common misconceptions

“Coalescing is optional” -> it prevents long-term fragmentation.
“Footers are wasteful” -> they enable O(1) neighbor lookup.
“Coalescing is always safe” -> only if neighbor blocks are verified.

Check-your-understanding questions

Why do boundary tags allow O(1) neighbor detection?
What happens if you coalesce without removing neighbors from free lists?
Why are prologue/epilogue blocks useful?

Check-your-understanding answers

The footer stores the previous block size, enabling direct address computation.
The same block appears in multiple lists, causing corruption and double free.
They prevent special-case checks at heap boundaries.

Real-world applications

glibc malloc uses boundary tags.
Kernel allocators use similar coalescing strategies.

Where you’ll apply it

This project: §3.2, §5.10 Phase 2, §7.1.
Also used in: P05-btree-file-index.md as an analogy.

References

Computer Systems: A Programmer’s Perspective, Ch. 9.9
Operating Systems: Three Easy Pieces, Ch. 17

Key insights

Coalescing is what keeps your heap usable after many allocations and frees.

Summary

Boundary tags provide constant-time neighbor access, enabling immediate coalescing and reducing external fragmentation.

Homework/Exercises to practice the concept

Implement a heap checker that validates header/footer consistency.
Create an allocation pattern that forces coalescing and verify merged sizes.

Solutions to the homework/exercises

Walk the heap and compare header/footer size fields for each block.
Allocate three blocks, free the middle and neighbors, and confirm one large block.

Heap Extension: sbrk, mmap, and Arenas

Fundamentals

When the allocator cannot find a free block large enough, it must request more memory from the OS. On Unix-like systems, sbrk extends the process heap, while mmap can allocate anonymous memory regions. Many allocators use a hybrid approach: sbrk for small and medium allocations, mmap for large ones. The allocator manages these regions as arenas with metadata that tracks size and free lists. Understanding how to request and manage memory from the OS is essential for building a functioning allocator.

Deep Dive into the concept

Sbrk historically extends the program break, which is the end of the heap. This is a simple model: you call sbrk(n) and get n new bytes. You then treat this as a contiguous extension of the heap. This makes coalescing across the new region straightforward. However, sbrk is not thread-friendly and is discouraged in some modern environments. Mmap allocates independent regions of memory that are not necessarily contiguous. It is more flexible and can be released back to the OS with munmap, but it complicates coalescing because separate regions are not adjacent.

A common strategy is to use sbrk for smaller allocations and mmap for large allocations (e.g., > 128KB). Large allocations can be given their own mapping and freed directly with munmap, which reduces fragmentation. For this project, you can implement a threshold: if requested size exceeds the threshold, use mmap and mark the block as “mmapped” in its header. This also means that coalescing does not cross into mmapped blocks; freeing such a block just unmaps it.

When using sbrk, you should allocate memory in chunks rather than exactly the requested size. This amortizes the cost of system calls. For example, request memory in 64KB chunks and split into blocks. This chunk can be managed as an arena with its own metadata. When you extend the heap, you create a new free block representing the new region and coalesce it with the previous epilogue if possible. This requires correct prologue/epilogue handling as described in the coalescing section.

When using mmap, you must ensure the returned pointer is aligned and that you store metadata at the start of the mapping. A typical layout is: [header][payload], where header records size and a flag indicating mmapped. On free, check the flag; if mmapped, call munmap on the entire region. This bypasses free lists entirely. This is important to avoid contaminating size class lists with huge blocks.

Arenas are a way to partition memory for performance and concurrency. A simple allocator may have one arena (the whole heap). A more advanced allocator may have multiple arenas for different threads or sizes. For this project, you can treat each sbrk chunk as an arena but manage them uniformly. This keeps the design simple while exposing the idea that memory is divided into regions. In debug mode, you can print arena boundaries and sizes to help verify correctness.

The key insight is that memory from the OS is finite and expensive to manage. An allocator must decide when to request more memory and how to return it. This design choice affects fragmentation, performance, and system behavior under pressure. Implementing heap extension logic gives you a practical understanding of the lower boundary of your allocator and how it interacts with the OS.

One more subtlety is alignment at the OS boundary. mmap returns page-aligned memory, which is usually 4096 bytes, but your allocator may require 16-byte alignment for payloads. Your header size must preserve alignment, or you must pad. Similarly, if you use sbrk, you should ensure that the initial heap start is aligned; if not, you may need to add padding at the beginning. These details often cause hard-to-reproduce bugs, so build a small alignment test that allocates different sizes and checks pointer alignment. This ensures your allocator is ABI-safe.

How this fit on projects

Heap extension is required in §3.2 and implemented in §5.10 Phase 1 and 2. It is tested in §6.2 (large allocations). It is also relevant to P01-in-memory-cache-lru.md for memory budgeting.

Definitions & key terms

sbrk -> system call to extend heap
mmap -> system call to map new memory region
arena -> managed memory region with its own metadata
program break -> end of process heap

Mental model diagram

Heap (sbrk): [arena1][arena2]...
Large alloc: [mmap region]

How it works (step-by-step, with invariants and failure modes)

On allocation miss, request a chunk from OS.
Create a free block covering the new chunk.
Insert into appropriate free list.
For large allocations, use mmap and bypass free lists.

Invariants: sbrk arenas are contiguous; mmapped blocks are tracked separately. Failure modes: incorrect unmap size or mixing mmapped blocks into lists.

Minimal concrete example

if (size > MMAP_THRESHOLD) {
    void *p = mmap(NULL, size + header_size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
    return payload(p);
}

Common misconceptions

“sbrk is always better” -> mmap is better for large allocations.
“mmap blocks can be coalesced” -> they are separate mappings.
“OS memory is infinite” -> requests can fail under pressure.

Check-your-understanding questions

Why might you use mmap for large allocations?
What is the program break?
How does chunking reduce system call overhead?

Check-your-understanding answers

It avoids fragmenting the heap and allows direct unmap on free.
It is the current end of the process heap managed by sbrk.
Fewer system calls amortize overhead across many allocations.

Real-world applications

glibc malloc uses mmap for large blocks.
jemalloc uses arenas to reduce contention.

Where you’ll apply it

This project: §3.2, §5.10 Phase 1, §6.2.
Also used in: P01-in-memory-cache-lru.md.

References

The Linux Programming Interface, memory management chapters
Computer Systems: A Programmer’s Perspective, Ch. 9

Key insights

The allocator is a broker between your program and the OS; its heap extension policy shapes fragmentation and performance.

Summary

Heap extension via sbrk or mmap is what keeps your allocator supplied with memory. The policy you choose affects fragmentation and performance.

Homework/Exercises to practice the concept

Implement a simple sbrk-based arena and print its boundaries.
Add an mmap threshold and free large blocks with munmap.

Solutions to the homework/exercises

Call sbrk in fixed chunks and store start/end addresses.
Mark mmapped blocks in headers and unmap them on free.

Debugging Metadata: Canaries, Leak Tracking, Double-Free Detection

Fundamentals

Allocator debugging features help detect memory misuse such as leaks, double frees, and buffer overflows. You can add canary bytes before and after the payload to detect overwrites. You can track allocations in a hash table to detect leaks and double frees. When freeing, you can verify that the pointer was allocated and not already freed. These features add overhead but are invaluable for development and learning.

Deep Dive into the concept

A basic allocator is difficult to debug because memory corruption can occur far from the source. Debugging metadata makes errors visible. Canary bytes are known patterns (e.g., 0xCA, 0xFE) placed around the payload. When a block is freed, you check the canaries; if they are corrupted, you detect a buffer overflow or underflow. This is similar to what AddressSanitizer does. Canary size can be small (8-16 bytes) to limit overhead. You should store the canary value in a global constant and fill it on allocation. In debug mode, you can also fill freed blocks with a pattern (e.g., 0xDE) to detect use-after-free in tests.

Leak detection requires tracking allocated blocks. The simplest approach is a hash table keyed by allocation pointer, storing size and allocation site (file/line or backtrace). In C, you can capture a backtrace using backtrace() on Linux, but for simplicity you can store a counter or a string passed by the caller. Your allocator can provide a debug API malloc_dbg(size, file, line) and a macro wrapper. At program exit, you iterate through the allocation table and report any blocks still allocated. This gives a leak report. For double-free detection, you can check if the pointer is in the allocation table. If not, report an error. If it is, remove it and proceed. This does not catch all invalid frees but catches common cases.

These checks must be optional because they add overhead. Provide a build flag like -DALLOC_DEBUG to enable them. In release mode, the allocator should be fast. However, the presence of debug metadata can change block sizes. If you add canary bytes, the effective payload size shrinks, which affects size class mapping. This means your allocator must include debug overhead in size calculations. A robust design makes the metadata layout explicit so that both debug and release modes remain correct.

Another useful feature is a heap consistency checker. This function walks all blocks, verifies header/footer consistency, and ensures that free list blocks are not marked allocated. It also checks that allocated blocks are not in free lists. Running this checker at strategic points (e.g., after free, after coalescing) can catch bugs early. You can integrate this into debug mode and allow the user to call malloc_check() manually. This is similar to malloc_check in glibc.

In addition to canaries and leak detection, you can implement red zones and guard pages. Guard pages use mmap and mprotect to create inaccessible pages around allocations, catching out-of-bounds access by crashing immediately. This is advanced but optional. For this project, focus on canaries and a leak table. The key is to understand that allocators can be instrumented to detect misuse and that this instrumentation is often more useful than debugging with raw gdb.

To keep outputs deterministic, define a stable ordering for leak reports. For example, sort leaks by allocation sequence number or by pointer value. This makes regression tests reliable. You can also include a simple allocation id in the header and print it alongside the size, which helps you trace which allocation was leaked. If you add file/line metadata, prefer compile-time macros so the same allocation always prints the same source location. These small details make your allocator’s debug mode feel like a real tool rather than a one-off experiment.

How this fit on projects

Debugging metadata is required in §3.2 and shown in §3.7. It is implemented in §5.10 Phase 3 and tested in §6.2 leak/double free cases. It is also relevant to P01-in-memory-cache-lru.md for memory accounting verification.

Definitions & key terms

canary -> known byte pattern to detect overflow
leak -> allocation not freed before program exit
double free -> freeing the same pointer twice
red zone -> protected area around allocation
heap checker -> consistency verification pass

Mental model diagram

[header][canary][payload][canary][footer]

How it works (step-by-step, with invariants and failure modes)

On allocation, add canaries and insert into allocation table.
On free, verify canaries and lookup pointer in table.
If not found, report invalid or double free.
Remove from table and free block.
At exit, report any remaining allocations.

Invariants: allocation table contains all live blocks; canaries intact for valid blocks. Failure modes: missing table updates or incorrect canary placement.

Minimal concrete example

memset((char*)payload - 8, 0xCA, 8);
memset((char*)payload + size, 0xCA, 8);

Common misconceptions

“Debug checks are free” -> they add time and memory overhead.
“Canaries prevent overflow” -> they only detect it on free.
“Leak detection is automatic” -> you must track allocations.

Check-your-understanding questions

Why do canaries detect buffer overflows?
How does an allocation table detect double frees?
Why should debug metadata be optional?

Check-your-understanding answers

Overflows overwrite the canary bytes, which you check on free.
A freed pointer is removed; freeing again means it is not present.
It adds overhead and changes block sizes, which hurts performance.

Real-world applications

AddressSanitizer and Valgrind use similar techniques.
Debug builds of jemalloc include red zones.

Where you’ll apply it

This project: §3.7, §5.10 Phase 3, §6.2.
Also used in: P01-in-memory-cache-lru.md for memory checks.

References

The Art of Debugging with GDB
AddressSanitizer documentation

Key insights

Allocator debugging turns invisible memory bugs into visible, deterministic failures.

Summary

By adding canaries and tracking allocations, you can detect leaks, double frees, and overwrites, making your allocator robust and debuggable.

Homework/Exercises to practice the concept

Implement canaries and create a test that overwrites past the buffer.
Add a leak report printed at program exit.

Solutions to the homework/exercises

Overwrite payload[size] and verify canary mismatch on free.
Maintain a hash table of allocations and print remaining entries.

3. Project Specification

3.1 What You Will Build

A drop-in malloc/free replacement that supports size classes, coalescing, and debug features. The allocator exposes a shared library that can be preloaded using LD_PRELOAD. It reports leaks, detects double frees, and shows fragmentation statistics. The allocator must handle a mix of small and large allocations efficiently.

3.2 Functional Requirements

Implement malloc, free, realloc, calloc.
Use segregated free lists with size classes.
Coalesce adjacent free blocks using boundary tags.
Support large allocations via mmap.
Provide debug mode with canaries and leak reports.
Expose fragmentation statistics.

3.3 Non-Functional Requirements

Performance: O(1) allocation for common sizes.
Reliability: no heap corruption; consistent coalescing.
Usability: clear debug output in deterministic format.

3.4 Example Usage / Output

$ LD_PRELOAD=./libmymalloc.so ./test_program
[malloc] alloc 128 bytes -> 0x7f123400
[malloc] free  128 bytes -> 0x7f123400
[malloc] WARNING double free at 0x7f123400
[malloc] leak report: 1 block (256 bytes)

3.5 Data Formats / Schemas / Protocols

Allocator internal header format:

[block_size][flags][prev_free][next_free]

3.6 Edge Cases

Freeing NULL is a no-op.
Freeing invalid pointer prints error and exits with code 3 (debug).
Realloc to smaller size keeps data and may split block.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

cc -O2 -Wall -Wextra -fPIC -shared -o libmymalloc.so src/malloc.c src/free.c src/debug.c
LD_PRELOAD=./libmymalloc.so ./tests/alloc_demo

3.7.2 Golden Path Demo (Deterministic)

$ LD_PRELOAD=./libmymalloc.so ./tests/alloc_demo
[malloc] alloc 64 bytes -> 0x1000
[malloc] alloc 64 bytes -> 0x1040
[malloc] free  64 bytes -> 0x1000
[malloc] free  64 bytes -> 0x1040
[malloc] leak report: 0 blocks

3.7.3 Failure Demo (Deterministic)

$ LD_PRELOAD=./libmymalloc.so ./tests/alloc_demo
[malloc] free invalid pointer 0xdeadbeef
[malloc] ERROR invalid_free

Exit codes:

0 on normal exit
3 on invalid free or canary failure

4. Solution Architecture

4.1 High-Level Design

alloc request -> size class -> free list -> split/coalesce
                                |
                                +-> mmap for large blocks

4.2 Key Components

4.4 Data Structures (No Full Code)

struct block_header {
    size_t size;
    uint8_t flags;
    struct block_header *prev_free;
    struct block_header *next_free;
};

4.4 Algorithm Overview

Key Algorithm: free + coalesce

Mark block free.
Check neighbors using boundary tags.
Merge and insert into size class list.

Complexity Analysis:

Time: O(1) for free list operations; O(1) coalescing
Space: O(n) blocks + metadata

5. Implementation Guide

5.1 Development Environment Setup

cc --version

5.2 Project Structure

mymalloc/
├── src/
│   ├── malloc.c
│   ├── free.c
│   ├── coalesce.c
│   ├── debug.c
│   └── arena.c
└── README.md

5.3 The Core Question You’re Answering

“How does a memory allocator balance speed with fragmentation control?”

5.4 Concepts You Must Understand First

Size classes and free lists
Boundary tags and coalescing
Heap extension with sbrk/mmap
Debugging metadata

5.5 Questions to Guide Your Design

What size classes minimize fragmentation for your workload?
When should you coalesce (always or lazily)?
How will you detect double frees reliably?

5.6 Thinking Exercise

If you allocate 100 blocks of 64 bytes and free every other block, how much contiguous memory is available for a 1KB allocation?

5.7 The Interview Questions They’ll Ask

What is internal vs external fragmentation?
How do boundary tags enable coalescing?
Why use mmap for large allocations?

5.8 Hints in Layers

Hint 1: Implement a simple allocator with a single free list. Hint 2: Add size classes once basic allocation works. Hint 3: Add debugging metadata last.

5.9 Books That Will Help

| Topic | Book | Chapter | |——-|——|———| | Allocator design | CS:APP | Ch. 9.9 | | Free lists | OSTEP | Ch. 17 |

5.10 Implementation Phases

Phase 1: Foundation (6-10 hours)

Goals: block layout, free list, sbrk Tasks: headers/footers, basic malloc/free Checkpoint: simple allocations work

Phase 2: Core Functionality (6-10 hours)

Goals: size classes, coalescing Tasks: segregated lists, coalesce neighbors Checkpoint: fragmentation reduced under tests

Phase 3: Polish & Edge Cases (6-10 hours)

Goals: debug features, mmap large blocks Tasks: canaries, leak table, invalid free detection Checkpoint: leak report works and errors detected

5.11 Key Implementation Decisions

6. Testing Strategy

6.1 Test Categories

6.2 Critical Test Cases

Allocate/free in random order; verify heap checker passes.
Double free detection triggers error.
Buffer overflow triggers canary failure.

6.3 Test Data

alloc 64
alloc 128
free 64
free 128

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

7.2 Debugging Strategies

Add a heap checker that scans all blocks.
Print free list contents in debug mode.

7.3 Performance Traps

Too many size classes increase overhead.
Excessive debug checks slow down allocation.

8. Extensions & Challenges

8.1 Beginner Extensions

Add realloc support.
Add a simple heap visualization.

8.2 Intermediate Extensions

Implement per-thread caches.
Add delayed coalescing.

8.3 Advanced Extensions

Guard pages with mprotect.
Implement slab allocator for fixed-size objects.

9. Real-World Connections

9.1 Industry Applications

glibc malloc, jemalloc, tcmalloc.
Kernel slab allocators.

jemalloc source code
tcmalloc source code

9.3 Interview Relevance

Fragmentation tradeoffs
Allocator design questions

10. Resources

10.1 Essential Reading

CS:APP Ch. 9
OSTEP Ch. 17

10.2 Video Resources

“Malloc Internals” talks

10.3 Tools & Documentation

valgrind and AddressSanitizer

11. Self-Assessment Checklist

11.1 Understanding

I can explain size classes and fragmentation.
I can explain boundary tags.
I can explain mmap vs sbrk.

11.2 Implementation

All functional requirements are met.
Debug checks catch invalid frees.
Coalescing works under stress.

11.3 Growth

I can discuss allocator tradeoffs in interviews.

12. Submission / Completion Criteria

Minimum Viable Completion:

malloc/free with size classes
basic coalescing

Full Completion:

mmap for large blocks
debug metadata and leak report

Excellence (Going Above & Beyond):

guard pages and heap visualization