Project 1: Memory Allocator (malloc/free from scratch)

Build a production-style heap allocator that can replace malloc and free and explain every byte it manages.

Quick Reference

Attribute Value
Difficulty Advanced
Time Estimate 2-3 weeks
Main Programming Language C (Alternatives: Rust, C++)
Alternative Programming Languages Rust, C++
Coolness Level Level 4: Hardcore Tech Flex
Business Potential Resume Gold
Prerequisites Pointers, structs, process memory layout, basic syscalls
Key Topics Virtual memory, free lists, fragmentation, alignment, mmap/sbrk

1. Learning Objectives

By completing this project, you will:

  1. Implement a heap allocator that supports malloc, free, calloc, and realloc.
  2. Explain virtual memory, the heap, and how sbrk and mmap grow address space.
  3. Design and debug free list data structures with coalescing and splitting.
  4. Enforce alignment requirements and understand why misalignment breaks programs.
  5. Measure fragmentation and describe the trade-offs between speed and memory efficiency.
  6. Replace the system allocator in a real program using LD_PRELOAD.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Virtual Memory and the Heap

Description / Expanded Explanation

Virtual memory gives each process the illusion of a large, contiguous address space. The heap is just a region of that address space that the allocator manages. The OS does not give you “objects” or “arrays”; it gives you pages of memory and a set of rules. Your allocator is the system that turns pages into reusable blocks for programs.

Definitions & Key Terms
  • virtual address space -> per-process address map that is translated to physical memory
  • page -> fixed-size memory unit (usually 4 KB) managed by the OS
  • heap -> region of virtual memory used for dynamic allocations
  • program break -> end of the heap managed by sbrk
  • mapping -> association of virtual pages to physical frames
Mental Model Diagram (ASCII)
High addresses
+---------------------------+  <-- kernel space
|        kernel             |
+---------------------------+
|         stack             |  grows down
|                           |
|        (gap)              |
|                           |
|         heap              |  grows up
+---------------------------+  <-- program break
|      mmap regions         |
+---------------------------+
|   data | bss | text        |
+---------------------------+
Low addresses
Mental Model Diagram (Image)

Virtual Memory and Heap

How It Works (Step-by-Step)
  1. The OS loads your program and maps text, data, and stack regions.
  2. The heap starts small; the program break marks its end.
  3. sbrk moves the break upward, mapping more pages.
  4. mmap creates separate regions for large allocations.
  5. Your allocator carves blocks out of these regions and tracks them.
Minimal Concrete Example
void *p = sbrk(0);         // current break
void *q = sbrk(4096);      // grow heap by one page
assert((char*)q - (char*)p == 4096);
Common Misconceptions
  • “malloc gets memory from the OS every time” -> allocators reuse free blocks aggressively.
  • “heap is a single array” -> it is a collection of mapped pages that can be fragmented.
Check-Your-Understanding Questions
  1. Why can two processes both use virtual address 0x1000 without conflict?
  2. What is the program break and why does it move?
  3. When would you choose mmap instead of sbrk?
  4. Predict what happens if you sbrk(-4096) after allocations.
Where You’ll Apply It
  • See 3.1 and 3.5 for allocator boundaries and metadata layout.
  • See 4.1 and 4.4 for heap growth strategies.
  • Also used in: P05 Container From Scratch

2.2 Block Metadata and Alignment

Description / Expanded Explanation

Every allocated block needs metadata. You need to know its size, whether it is free, and sometimes its neighbors. The allocator must ensure that the pointer returned to the user is properly aligned so that any C type can be stored safely.

Definitions & Key Terms
  • header -> metadata placed before user data
  • footer -> metadata at the end of a block (often size)
  • alignment -> address is a multiple of a power of two (often 16)
  • padding -> extra bytes inserted to meet alignment
Mental Model Diagram (ASCII)
+---------+--------------------+--------+
| header  | user payload        | footer |
+---------+--------------------+--------+
^                                      ^
block start                        block end
Mental Model Diagram (Image)

Block Metadata

How It Works (Step-by-Step)
  1. Round the requested size up to the alignment.
  2. Add space for the header (and footer if used).
  3. Return a pointer after the header.
  4. On free, subtract the header size to find metadata.
Minimal Concrete Example
#define ALIGNMENT 16
#define ALIGN(sz) (((sz) + (ALIGNMENT-1)) & ~(ALIGNMENT-1))

struct header { size_t size; int is_free; };
void *malloc_impl(size_t n) {
    size_t total = ALIGN(n) + sizeof(struct header);
    struct header *h = request_block(total);
    h->size = total;
    h->is_free = 0;
    return (void*)(h + 1);
}
Common Misconceptions
  • “Alignment only matters for performance” -> some architectures crash on misaligned access.
  • “Metadata is just overhead” -> without it you cannot free or coalesce safely.
Check-Your-Understanding Questions
  1. Why is 16-byte alignment a safe default on 64-bit systems?
  2. What happens if free is given a pointer in the middle of a block?
  3. How does alignment affect internal fragmentation?
Where You’ll Apply It
  • See 3.5 for the exact block layout and header format.
  • See 4.4 for block splitting logic.
  • Also used in: P07 Lock-Free Queue

2.3 Free Lists, Splitting, and Coalescing

Description / Expanded Explanation

Allocators reuse freed memory by tracking free blocks in data structures. Splitting a large block into a smaller used block plus a remainder reduces waste, while coalescing adjacent free blocks reduces external fragmentation.

Definitions & Key Terms
  • free list -> linked list (or tree) of free blocks
  • splitting -> dividing a large block into used and free parts
  • coalescing -> merging adjacent free blocks
  • external fragmentation -> free memory exists but is not contiguous
Mental Model Diagram (ASCII)
Before free:
[USED 64][USED 128][USED 64]

After free middle:
[USED 64][FREE 128][USED 64]

After coalescing neighbors:
[FREE 256][USED 64]   (if two adjacent frees)
Mental Model Diagram (Image)

Free List and Coalescing

How It Works (Step-by-Step)
  1. free marks the block as free and inserts it into the free list.
  2. malloc searches the free list for a block large enough.
  3. If the block is larger than needed, split it into two blocks.
  4. On free, check neighbors; if free, merge and update size.
Minimal Concrete Example
void free_impl(void *ptr) {
    struct header *h = ((struct header*)ptr) - 1;
    h->is_free = 1;
    insert_free_list(h);
    try_coalesce(h);
}
Common Misconceptions
  • “Best-fit always wins” -> it reduces fragmentation but increases search time.
  • “Coalescing can wait forever” -> delayed coalescing can explode fragmentation.
Check-Your-Understanding Questions
  1. What is the difference between internal and external fragmentation?
  2. Why do boundary tags make coalescing O(1)?
  3. When might a free list become a performance bottleneck?
Where You’ll Apply It
  • See 4.4 for allocation algorithms and complexity.
  • See 5.10 Phase 2 for implementing splitting and coalescing.
  • Also used in: P03 Persistent Key-Value Store

2.4 mmap vs sbrk Strategy

Description / Expanded Explanation

Small allocations typically come from a contiguous heap region (sbrk), while large allocations are better served by mmap so they can be returned directly to the OS. This split avoids heap fragmentation and improves performance for large blocks.

Definitions & Key Terms
  • sbrk -> moves the program break (heap end)
  • mmap -> maps pages directly, can be unmapped
  • arena -> independent heap region managed by allocator
  • threshold -> size cutoff for deciding sbrk vs mmap
Mental Model Diagram (ASCII)
Heap arena (sbrk):
[small blocks][small blocks][small blocks]

Large allocations (mmap):
[  separate mapping  ]
[  separate mapping  ]
Mental Model Diagram (Image)

mmap vs sbrk

How It Works (Step-by-Step)
  1. Decide a size threshold (for example 128 KB).
  2. Requests below threshold go to the heap arena.
  3. Requests above threshold use mmap and store a flag in metadata.
  4. On free, munmap large blocks directly to the OS.
Minimal Concrete Example
if (size >= MMAP_THRESHOLD) {
    void *p = mmap(NULL, size, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    mark_mmap_block(p, size);
    return p;
}
Common Misconceptions
  • “mmap is always slower” -> it can be faster for large, long-lived allocations.
  • “sbrk memory can always be returned” -> the heap only shrinks if the top is free.
Check-Your-Understanding Questions
  1. Why does munmap return memory more reliably than shrinking the heap?
  2. What risks exist if you mmap many tiny allocations?
  3. How should the threshold be chosen?
Where You’ll Apply It
  • See 3.2 and 3.5 for allocation policy and metadata flags.
  • See 5.10 Phase 3 for large allocation support.
  • Also used in: P02 Load Balancer for buffer management

2.5 Fragmentation Metrics and Debugging

Description / Expanded Explanation

Fragmentation is the allocator’s hidden cost. You need to measure how much memory is wasted and where. Without instrumentation you cannot tell if your allocator is behaving well or just growing the heap forever.

Definitions & Key Terms
  • internal fragmentation -> wasted space inside allocated blocks
  • external fragmentation -> unusable free space between blocks
  • heap utilization -> used bytes / total heap bytes
  • canary -> sentinel value to detect corruption
Mental Model Diagram (ASCII)
Heap view:
[USED 24][PAD 8][USED 40][FREE 32][USED 16]

Used bytes: 24 + 40 + 16 = 80
Total heap: 24+8+40+32+16 = 120
Utilization = 80 / 120 = 0.66
Mental Model Diagram (Image)

Fragmentation Metrics

How It Works (Step-by-Step)
  1. Track total heap size and total allocated bytes.
  2. On each malloc, log requested size vs actual block size.
  3. Report utilization and fragmentation percentages.
  4. Add canaries and validate on free for corruption.
Minimal Concrete Example
stats.total_heap += block->size;
stats.used_bytes += requested;
printf("utilization=%.2f\n", (double)stats.used_bytes / stats.total_heap);
Common Misconceptions
  • “If tests pass, fragmentation is fine” -> long-running workloads may fail.
  • “Only external fragmentation matters” -> internal padding can be huge for small allocations.
Check-Your-Understanding Questions
  1. How can you quantify external fragmentation in a running allocator?
  2. Why do canaries help detect buffer overflows?
  3. What types of workloads maximize fragmentation?
Where You’ll Apply It

3. Project Specification

3.1 What You Will Build

You will build a shared-library memory allocator that implements malloc, free, calloc, and realloc and can replace the system allocator via LD_PRELOAD. It will:

  • manage heap growth using sbrk and mmap
  • reuse freed memory via a free list
  • split and coalesce blocks
  • enforce alignment and detect double-free
  • expose debug and stats modes for heap visualization

Out of scope: thread-safe allocation across multiple threads, per-CPU arenas, and advanced caching.

3.2 Functional Requirements

  1. malloc: return aligned pointers for any request size > 0.
  2. free: accept any pointer returned by your allocator and reclaim it.
  3. calloc: return zeroed memory.
  4. realloc: resize allocations while preserving data.
  5. Coalescing: merge adjacent free blocks.
  6. Large allocations: use mmap above a threshold.
  7. Debug mode: heap dump and stats via env var.
  8. Safety: detect double-free and corrupted headers.

3.3 Non-Functional Requirements

  • Performance: within 3-5x of glibc malloc on microbenchmarks.
  • Reliability: should run simple tools like /bin/ls without crashes.
  • Usability: clear logging and deterministic test mode.

3.4 Example Usage / Output

$ LD_PRELOAD=./libmymalloc.so MALLOC_DEBUG=1 ./malloc-demo
[MALLOC] request=32 -> block=48 @0x70000010
[FREE] block=0x70000010 size=48
[STATS] heap=4096 used=32 free=4032 util=0.78

3.5 Data Formats / Schemas / Protocols

Block header (stored before user data):

struct block_header {
    size_t size;        // total block size, including header
    uint8_t is_free;    // 1 if free
    uint8_t is_mmap;    // 1 if mapped via mmap
    uint16_t pad;
    struct block_header *next;
    struct block_header *prev;
};

Debug heap dump format:

[BLOCK addr=0x70000000 size=128 state=USED]
[BLOCK addr=0x70000080 size=256 state=FREE]

3.6 Edge Cases

  • malloc(0) returns NULL or a unique pointer (document choice).
  • free(NULL) is a no-op.
  • realloc(ptr, 0) frees the block.
  • realloc(NULL, size) is equivalent to malloc(size).
  • freeing a pointer not owned by the allocator must be detected and rejected.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

# From project root
make clean && make

# Run deterministic demo with fixed mmap base
MALLOC_DETERMINISTIC=1 MALLOC_DEBUG=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo

3.7.2 Golden Path Demo (Deterministic)

The demo uses a fixed mapping base so addresses are stable across runs.

3.7.3 CLI Transcript (Success)

$ MALLOC_DETERMINISTIC=1 MALLOC_DEBUG=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo
[ALLOC] req=64 align=64 block=80 @0x70000000
[ALLOC] req=32 align=32 block=48 @0x70000050
[FREE] block=0x70000050 size=48
[ALLOC] req=16 align=16 block=32 @0x70000050
[STATS] heap=4096 used=80 free=3984 util=0.78
$ echo $?
0

3.7.3 CLI Transcript (Failure)

$ MALLOC_DETERMINISTIC=1 LD_PRELOAD=./libmymalloc.so ./malloc-demo --double-free
[FREE] block=0x70000050 size=48
[ERROR] double-free detected at 0x70000050
$ echo $?
2

3.7.4 If API

Not applicable.

3.7.5 If Library

  • Install: make produces libmymalloc.so
  • Usage: LD_PRELOAD=./libmymalloc.so ./program
  • Returns: malloc returns aligned pointer or NULL
  • Error handling: set errno=ENOMEM on failure

3.7.6 Exit Codes

  • 0 success
  • 2 allocator detected corruption or double-free
  • 3 OS allocation failure (sbrk/mmap)

4. Solution Architecture

4.1 High-Level Design

+------------------------------+
| malloc/free API              |
+------------------------------+
| allocator core               |
| - free list                  |
| - split/coalesce             |
| - mmap threshold             |
+------------------------------+
| OS interface                 |
| - sbrk                       |
| - mmap/munmap                |
+------------------------------+

4.2 Key Components

Component Responsibility Key Decisions
Heap arena Manage contiguous heap blocks sbrk-based growth
Free list Track free blocks doubly-linked list for O(1) remove
Large alloc Handle big requests mmap for direct release
Debugger Heap dump and stats env flags to avoid overhead

4.3 Data Structures (No Full Code)

struct free_node {
    size_t size;
    int is_free;
    struct free_node *next;
    struct free_node *prev;
};

4.4 Algorithm Overview

Allocation

  1. Align request size and add header.
  2. Search free list for a fit.
  3. If found, split if remainder is large enough.
  4. If not found, grow heap or mmap.

Free

  1. Validate pointer and header.
  2. Mark free and insert into free list.
  3. Coalesce with neighbors.

Complexity Analysis

  • Time: O(n) worst-case search in free list
  • Space: O(1) per block metadata

5. Implementation Guide

5.1 Development Environment Setup

# Linux or macOS
make
./malloc-demo

5.2 Project Structure

project-root/
├── src/
│   ├── allocator.c
│   ├── allocator.h
│   └── debug.c
├── tests/
│   └── test_allocator.c
├── Makefile
└── README.md

5.3 The Core Question You’re Answering

“When I call malloc, where do those bytes come from, and how do I get them back safely?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Virtual memory and page mapping
  2. Pointer arithmetic and alignment
  3. Free list invariants
  4. Fragmentation metrics
  5. mmap vs sbrk trade-offs

5.5 Questions to Guide Your Design

  1. How will you detect double-free without heavy overhead?
  2. What is your minimum block size?
  3. What strategy will you use to search the free list?
  4. When do you coalesce and when do you delay it?

5.6 Thinking Exercise

Trace this sequence and draw the heap:

malloc(24)
malloc(64)
free(first)
malloc(16)
free(second)
free(third)

5.7 The Interview Questions They’ll Ask

  1. Explain internal vs external fragmentation.
  2. Why do allocators use boundary tags?
  3. What happens if you return unaligned pointers?
  4. How would you make this thread-safe?

5.8 Hints in Layers

Hint 1: Build a bump allocator first to validate heap growth. Hint 2: Use a doubly-linked free list to simplify removal. Hint 3: Add a debug heap dump to visualize fragmentation.

5.9 Books That Will Help

| Topic | Book | Chapter | |——-|——|———| | Virtual memory | Computer Systems: A Programmer’s Perspective | Ch. 9 | | Allocator design | Operating Systems: Three Easy Pieces | Ch. 17 | | C interfaces | C Interfaces and Implementations | Ch. 5 |

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Implement aligned bump allocation
  • Basic heap growth Tasks:
    1. Implement sbrk wrapper with logging.
    2. Return aligned pointers. Checkpoint: allocate and free in a test program.

Phase 2: Core Functionality (7-10 days)

Goals:

  • Free list with split/coalesce
  • realloc and calloc Tasks:
    1. Insert freed blocks into a list.
    2. Implement splitting and coalescing. Checkpoint: run tests for reuse and fragmentation.

Phase 3: Polish and Edge Cases (4-6 days)

Goals:

  • mmap for large blocks
  • debug stats and leak detection Tasks:
    1. Add mmap threshold and munmap on free.
    2. Add heap dump and corruption detection. Checkpoint: run /bin/ls with LD_PRELOAD.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale | |———-|———|—————-|———–| | Search strategy | first-fit, best-fit, next-fit | first-fit | simplest and fast enough | | Coalescing | immediate, deferred | immediate | reduces fragmentation early | | Large alloc | sbrk only, mmap threshold | mmap threshold | return memory to OS |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples | |———-|———|———-| | Unit Tests | Verify block metadata and alignment | alignment, header integrity | | Integration Tests | Run real programs | /bin/ls with LD_PRELOAD | | Edge Case Tests | Validate error handling | double-free, malloc(0) |

6.2 Critical Test Cases

  1. Reuse: malloc 100, free, malloc 100 -> no new sbrk.
  2. Coalesce: allocate A,B,C; free B and A -> one merged free block.
  3. Large alloc: request > threshold uses mmap.
  4. Corruption: overwrite boundary and detect via canary.

6.3 Test Data

requests: [8, 24, 64, 128, 512, 4096]

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution | |———|———|———-| | Misaligned pointers | bus error on some CPUs | align to 16 bytes | | Free list corruption | random crashes | add canaries, validate list | | No coalescing | heap grows endlessly | merge adjacent blocks |

7.2 Debugging Strategies

  • Use strace to verify sbrk and mmap calls.
  • Add heap dumps after every allocation during early development.
  • Run with AddressSanitizer to catch writes beyond bounds.

7.3 Performance Traps

  • Linear free list search on every allocation can be O(n).
  • Coalescing with naive neighbor search can become O(n^2).

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add calloc and realloc support.
  • Add malloc_stats output.

8.2 Intermediate Extensions

  • Implement segregated free lists by size class.
  • Add small object caching (per-size bins).

8.3 Advanced Extensions

  • Thread-safe allocator with per-thread arenas.
  • Implement madvise for lazy page release.

9. Real-World Connections

9.1 Industry Applications

  • Memory allocators in libc, JVMs, and database engines.
  • High-performance services tune allocators for latency and fragmentation.
  • jemalloc
  • tcmalloc
  • mimalloc

9.3 Interview Relevance

  • Explains heap allocation, fragmentation, and OS interfaces.
  • Demonstrates data structure design under constraints.

10. Resources

10.1 Essential Reading

  • Computer Systems: A Programmer’s Perspective (Chapter 9)
  • Operating Systems: Three Easy Pieces (Chapter 17)

10.2 Video Resources

  • Systems programming lectures on memory allocation

10.3 Tools & Documentation

  • strace, gdb, valgrind, perf

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain virtual memory and the program break.
  • I can describe free list strategies and trade-offs.
  • I can explain why alignment matters.

11.2 Implementation

  • All allocator functions pass unit tests.
  • Heap dumps match expected layouts.
  • Double-free and corruption are detected.

11.3 Growth

  • I can describe allocator design in an interview.
  • I can compare my allocator to jemalloc at a high level.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • malloc and free work for basic tests.
  • LD_PRELOAD works with at least one external program.
  • Alignment is correct for 16-byte boundaries.

Full Completion:

  • All functions implemented (malloc, free, calloc, realloc).
  • Coalescing and splitting reduce fragmentation.
  • Deterministic demo passes with stable output.

Excellence (Going Above & Beyond):

  • Segregated free lists or bins implemented.
  • Thread-safe version with per-thread caches.