Systems Libraries & Runtimes — Expanded Project Guides

Phase 2 — Advanced Systems Track B

This track is about building the infrastructure that other software depends on. You’ll learn to write code that is fast, correct across platforms, and free of undefined behavior.

About This Learning Path

These expanded project guides transform the original project specifications into comprehensive learning resources. Each project includes:

  • Deep Theoretical Foundation: Comprehensive explanations of underlying concepts
  • Complete Project Specification: Clear deliverables and acceptance criteria
  • Solution Architecture: Design patterns and component relationships (without spoiling the implementation)
  • Phased Implementation Guide: Step-by-step approach to build incrementally
  • Testing Strategy: How to verify your implementation works correctly
  • Common Pitfalls: Mistakes to avoid and debugging tips
  • Extensions & Challenges: Ways to go deeper after completing the core project
  • Self-Assessment Checklist: Verify your understanding before moving on

Core Concept Map

Concept | What You Must Understand | Projects
Memory Allocators | Free lists, fragmentation, arena vs general-purpose, metadata overhead | P01, P06
Threading Primitives | Mutexes, atomics, memory ordering, lock-free algorithms | P02, P06
Async Runtimes | Event loops, futures/promises, IO multiplexing (epoll/kqueue), schedulers | P03
ABI Details | Calling conventions, struct layout, symbol visibility, name mangling | P04
Platform Differences | POSIX vs Windows syscalls, endianness, feature detection | P03, P04
Performance Tuning | Cache lines, branch prediction, SIMD, profiling | P02, P05
Undefined Behavior | Strict aliasing, alignment, integer overflow, pointer provenance | P01, P04
API Design | Ergonomics vs zero-cost, error handling, versioning | All
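
To make the Undefined Behavior row concrete, here is a minimal sketch of two habits that sidestep strict-aliasing violations and signed overflow. The helper names float_bits and checked_add are hypothetical, and the first function assumes a 32-bit IEEE 754 float.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Read a float's bit pattern by copying bytes instead of casting the
 * pointer, which would break the strict aliasing rule. */
static uint32_t float_bits(float f) {
    uint32_t u;
    _Static_assert(sizeof(uint32_t) == sizeof(float), "assumes 32-bit float");
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Add two ints, testing for overflow BEFORE it happens, because signed
 * overflow itself is undefined. Returns 1 on success, 0 on overflow
 * (in which case *out is left untouched). */
static int checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

Compilers typically lower the memcpy to a single register move, so the well-defined version usually costs nothing at runtime.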

Project Index

# | Project | Difficulty | Time | Key Focus | Language
1 | Custom Memory Allocator | Advanced | 3-4 weeks | Memory, UB, Performance | C
2 | Work-Stealing Thread Pool | Advanced | 2-3 weeks | Threading, Atomics, Cache | C++
3 | Mini Async Runtime | Advanced | 2-3 weeks | Async, Platform IO, API | Rust/C
4 | Cross-Platform Syscall Abstraction | Intermediate | 2 weeks | ABI, Platform, API | C
5 | High-Performance String Search | Advanced | 2-3 weeks | SIMD, Performance, Profiling | C
6 | Embedded Key-Value Database (Capstone) | Expert | 1-2 months | All concepts combined | C

Projects in This Track

Project 1: Custom Memory Allocator

Language: C | Difficulty: Advanced | Time: 3-4 weeks

Build a production-ready memory allocator that can replace malloc in real programs via LD_PRELOAD. Learn where general-purpose allocators pay for their generality, how jemalloc achieves scalability, and how to weigh the tradeoffs between speed, fragmentation, and thread safety.

Key Skills: Free lists, size classes, coalescing, thread-local caches, LD_PRELOAD deployment

Real-World Equivalent: jemalloc, tcmalloc, mimalloc
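
As a taste of the free-list and size-class ideas, the sketch below manages a single size class with an intrusive free list. It is an illustration under assumptions, not the project's reference design: the names block, pool_t, pool_init, pool_alloc, and pool_free are hypothetical, the 64-byte class and 128-block capacity are arbitrary, and there is no thread safety or coalescing.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  64    /* one size class (assumed value) */
#define BLOCK_COUNT 128   /* fixed pool capacity (assumed value) */

/* A block is either on the free list (next is live) or handed out to the
 * caller (payload is live). A union keeps both uses well-defined. */
typedef union block {
    union block  *next;
    unsigned char payload[BLOCK_SIZE];
    max_align_t   align_;   /* forces alignment suitable for any object */
} block;

typedef struct {
    block  blocks[BLOCK_COUNT];
    block *free_list;       /* head of the singly linked free list */
} pool_t;

/* Thread every block onto the free list, lowest address first. */
static void pool_init(pool_t *p) {
    p->free_list = NULL;
    for (size_t i = BLOCK_COUNT; i-- > 0;) {
        p->blocks[i].next = p->free_list;
        p->free_list = &p->blocks[i];
    }
}

/* Pop the head of the free list; O(1). Returns NULL when exhausted. */
static void *pool_alloc(pool_t *p) {
    block *b = p->free_list;
    if (b == NULL)
        return NULL;
    p->free_list = b->next;
    return b->payload;
}

/* Push the block back; O(1). ptr must come from pool_alloc on this pool. */
static void pool_free(pool_t *p, void *ptr) {
    assert(ptr != NULL);
    block *b = (block *)ptr;   /* payload sits at offset 0 of the union */
    b->next = p->free_list;
    p->free_list = b;
}
```

A size-class allocator generalizes this by keeping one such list per class and rounding each request up to the nearest class.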


Project 2: Work-Stealing Thread Pool

Language: C++ | Difficulty: Advanced | Time: 2-3 weeks

Implement a high-performance thread pool with work-stealing scheduling similar to Rayon or Go’s runtime. Master atomic operations, memory ordering, and cache-aware programming through building lock-free data structures.

Key Skills: Chase-Lev deque, atomics, false sharing avoidance, worker scheduling

Real-World Equivalent: Rayon, Go runtime, Java ForkJoinPool
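
False sharing avoidance can be sketched with per-worker counters that each occupy their own cache line. The example uses C11 rather than C++ to match the rest of the track, assumes 64-byte cache lines, and uses hypothetical names (worker_stats, record_task, total_tasks).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE  64   /* assumption: 64-byte lines, common on x86-64 */
#define NUM_WORKERS 4    /* assumed worker count */

/* Aligning the member to a cache line pads the struct to CACHE_LINE bytes,
 * so neighbouring workers never write to the same line. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_ulong tasks_done;
} worker_stats;

static worker_stats stats[NUM_WORKERS];

/* Called by a worker after finishing a task. Relaxed ordering is enough
 * because the counter does not publish any other data. */
static void record_task(size_t worker) {
    assert(worker < NUM_WORKERS);
    atomic_fetch_add_explicit(&stats[worker].tasks_done, 1,
                              memory_order_relaxed);
}

/* Sum all per-worker counters; the result is a snapshot, not a barrier. */
static unsigned long total_tasks(void) {
    unsigned long sum = 0;
    for (size_t i = 0; i < NUM_WORKERS; i++)
        sum += atomic_load_explicit(&stats[i].tasks_done,
                                    memory_order_relaxed);
    return sum;
}
```

Without the alignment, four counters would fit in one line and every increment would bounce that line between cores.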


Project 3: Mini Async Runtime

Language: Rust or C | Difficulty: Advanced | Time: 2-3 weeks

Build a single-threaded async runtime capable of handling thousands of concurrent connections. Demystify async/await by implementing the event loop, futures, and reactor pattern from scratch.

Key Skills: epoll/kqueue, futures, wakers, non-blocking I/O, C10K problem

Real-World Equivalent: Tokio, libuv (io_uring is a related Linux kernel I/O interface rather than a runtime)
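
The reactor half of such a runtime can be sketched in a few lines of Linux-only C using epoll. The names source, reactor_add, reactor_poll_once, and count_readable are hypothetical; a real runtime would also handle write readiness, deregistration, and wakers.

```c
#include <sys/epoll.h>
#include <unistd.h>

typedef void (*on_ready_fn)(int fd, void *ctx);

/* One registered event source: a descriptor plus the callback to run
 * when it becomes readable. */
typedef struct {
    int         fd;
    on_ready_fn cb;
    void       *ctx;
} source;

/* Register s for read readiness (level-triggered) on epoll instance ep.
 * The source must outlive its registration. */
static int reactor_add(int ep, source *s) {
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = s };
    return epoll_ctl(ep, EPOLL_CTL_ADD, s->fd, &ev);
}

/* Run one turn of the event loop: wait up to timeout_ms, then dispatch
 * every ready source. Returns the number dispatched, or -1 on error. */
static int reactor_poll_once(int ep, int timeout_ms) {
    struct epoll_event events[16];
    int n = epoll_wait(ep, events, 16, timeout_ms);
    for (int i = 0; i < n; i++) {
        source *s = events[i].data.ptr;
        s->cb(s->fd, s->ctx);
    }
    return n;
}

/* Example callback: consume one byte and bump the counter behind ctx. */
static void count_readable(int fd, void *ctx) {
    char byte;
    if (read(fd, &byte, 1) == 1)
        ++*(int *)ctx;
}
```

On macOS or the BSDs the same structure would sit on top of kqueue instead.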


Project 4: Cross-Platform Syscall Abstraction Library

Language: C | Difficulty: Intermediate | Time: 2 weeks

Create a library that wraps platform-specific syscalls into a unified API, similar to libuv or Rust’s std. Discover why “POSIX” doesn’t mean identical and master the art of cross-platform development.

Key Skills: File descriptors vs HANDLEs, fork/exec vs CreateProcess, struct layout, feature detection

Real-World Equivalent: libuv, Rust std, Go runtime
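
The usual shape of such a wrapper is one public name with a per-platform body selected by the preprocessor. A minimal sketch, assuming hypothetical names plat_file_t, PLAT_INVALID_FILE, plat_file_is_valid, and plat_page_size:

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
typedef HANDLE plat_file_t;              /* Windows: opaque kernel handle */
#define PLAT_INVALID_FILE INVALID_HANDLE_VALUE
#else
#include <unistd.h>
typedef int plat_file_t;                 /* POSIX: small integer descriptor */
#define PLAT_INVALID_FILE (-1)
#endif

/* Callers compare against one sentinel instead of knowing the platform. */
static int plat_file_is_valid(plat_file_t f) {
    return f != PLAT_INVALID_FILE;
}

/* Return the virtual memory page size in bytes, or 0 if it is unknown. */
static size_t plat_page_size(void) {
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return (size_t)si.dwPageSize;
#else
    long sz = sysconf(_SC_PAGESIZE);
    return sz > 0 ? (size_t)sz : 0;
#endif
}
```

Keeping the #ifdef inside the library means application code never includes a platform header directly.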


Project 5: High-Performance String Search Library

Language: C with SIMD | Difficulty: Advanced | Time: 2-3 weeks

Build a SIMD-accelerated string search library modeled on the core of ripgrep. Learn what algorithm textbooks leave out about making real tools fast, and master the art of feeding data to modern CPUs efficiently.

Key Skills: SSE/AVX2 intrinsics, CPU feature detection, cache-aware programming, UTF-8 handling

Real-World Equivalent: ripgrep, Hyperscan, StringZilla
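
The simplest SIMD search primitive is a single-byte scan that compares 16 bytes per iteration. The sketch below assumes an x86 or x86-64 target with SSE2 and a GCC- or Clang-compatible compiler (for __builtin_ctz); find_byte is a hypothetical name.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>

/* Return the index of the first occurrence of c in buf[0..len),
 * or len if c does not occur. */
static size_t find_byte(const unsigned char *buf, size_t len, unsigned char c) {
    const __m128i needle = _mm_set1_epi8((char)c);  /* c in all 16 lanes */
    size_t i = 0;

    /* Main loop: 16 bytes per step using unaligned loads. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        /* One mask bit per byte lane; bit k is set if buf[i + k] == c. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0)
            return i + (size_t)__builtin_ctz((unsigned)mask);
    }

    /* Scalar tail for the final len % 16 bytes. */
    for (; i < len; i++)
        if (buf[i] == c)
            return i;
    return len;
}
```

Substring search is often built on top of a scan like this by first locating a rare byte of the pattern and then verifying the full match.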


Project 6: Embedded Key-Value Database (Capstone)

Language: C | Difficulty: Expert | Time: 1-2 months

The “final boss” of this track. Build a persistent, thread-safe, embedded key-value store like a simplified RocksDB or LMDB—combining everything from all previous projects.

Key Skills: LSM trees, write-ahead logging, crash recovery, MVCC, memory-mapped I/O, bloom filters

Real-World Equivalent: RocksDB, LMDB, LevelDB
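
Of the listed skills, the bloom filter is the easiest to sketch in isolation. The version below is an assumption-laden illustration: the names bloom_t, bloom_add, and bloom_maybe_contains are hypothetical, the 1024-bit size and 3 probes are arbitrary, and the hash is an FNV-1a-style function with a seed mixed in, combined by double hashing.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOOM_BITS   1024u   /* filter size in bits (assumed) */
#define BLOOM_HASHES 3u      /* probes per key (assumed) */

typedef struct {
    uint8_t bits[BLOOM_BITS / 8];   /* zero-initialize before use */
} bloom_t;

/* FNV-1a-style 64-bit hash; the seed is XORed into the offset basis. */
static uint64_t hash_bytes(const void *data, size_t len, uint64_t seed) {
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Bit index of the i-th probe, derived from two base hashes. */
static uint64_t probe_bit(const void *key, size_t len, unsigned i) {
    uint64_t h1 = hash_bytes(key, len, 0);
    uint64_t h2 = hash_bytes(key, len, 0x9e3779b97f4a7c15ULL) | 1u;
    return (h1 + (uint64_t)i * h2) % BLOOM_BITS;
}

static void bloom_add(bloom_t *b, const void *key, size_t len) {
    for (unsigned i = 0; i < BLOOM_HASHES; i++) {
        uint64_t bit = probe_bit(key, len, i);
        b->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* Returns 0 if the key was definitely never added, 1 if it may have been. */
static int bloom_maybe_contains(const bloom_t *b, const void *key, size_t len) {
    for (unsigned i = 0; i < BLOOM_HASHES; i++) {
        uint64_t bit = probe_bit(key, len, i);
        if (!(b->bits[bit / 8] & (1u << (bit % 8))))
            return 0;
    }
    return 1;
}
```

In an LSM tree, a filter like this is typically stored per on-disk table so that most lookups for absent keys skip the disk read entirely.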


Learning Paths

Path 1: Systems Networking Focus

Goal: Build and understand async I/O and networking infrastructure

P03 Mini Async Runtime → P04 Syscall Abstraction → P06 Capstone

Outcome: Understand how Node.js, Tokio, and libuv work internally

Path 2: High-Performance Computing Focus

Goal: Master low-level performance optimization

P01 Memory Allocator → P02 Thread Pool → P05 String Search → P06 Capstone

Outcome: Write code that maximizes hardware utilization

Path 3: Interview Preparation Path

Goal: Build impressive portfolio projects for systems roles

P03 Mini Async Runtime → P02 Thread Pool → P01 Memory Allocator

Outcome: “I wrote a malloc” is a powerful interview signal

Path 4: Complete Track

Goal: Deep understanding of all systems library fundamentals

P03 → P02 → P01 → P04 → P05 → P06

Start with async (a high-demand skill), then threading, then memory, followed by the platform and SIMD projects, building to the capstone.


Suggested Learning Progression

┌─────────────────────────────────────────────────────────────────────┐
│                     LEARNING PROGRESSION                             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  [P01: Memory Allocator]  ──────┐                                    │
│  Foundation: Memory management   │                                    │
│                                  │                                    │
│  [P02: Thread Pool]       ──────┼──→  [P03: Async Runtime]           │
│  Foundation: Concurrency        │      Combines threading + I/O       │
│                                  │                                    │
│  [P04: Cross-Platform]    ──────┤                                    │
│  Foundation: System interfaces  │                                    │
│                                  │                                    │
│  [P05: String Search]     ──────┴──→  [P06: KV Database Capstone]    │
│  Foundation: Performance         Final Boss: Combines everything      │
│                                                                       │
└─────────────────────────────────────────────────────────────────────┘

Prerequisites

Before starting this track, you should be comfortable with:

  • C Programming: Pointers, memory management, structs, function pointers
  • Systems Basics: Virtual memory, processes, file descriptors
  • Command Line: gcc/clang compilation, gdb debugging, make/cmake
  • Basic Threading: Creating threads, mutexes (will be deepened in projects)

Self-Assessment Questions

Before starting, can you answer these?

  1. What happens when you call malloc(100)? Where does the memory come from?
  2. What is a mutex? When would you use one vs an atomic variable?
  3. What is epoll and why does it scale better than select() for many connections?
  4. What does -fPIC do when compiling a shared library?
  5. What is a cache line and why does false sharing hurt performance?

If you can’t answer three or more of these, review the prerequisite materials first.

Project Comparison

Project | Depth | Fun | Resume Impact | Interview Value
Memory Allocator | ★★★★★ | ★★★☆☆ | ★★★★★ | ★★★★★
Thread Pool | ★★★★★ | ★★★★☆ | ★★★★☆ | ★★★★★
Async Runtime | ★★★★☆ | ★★★★★ | ★★★★★ | ★★★★☆
Syscall Abstraction | ★★★☆☆ | ★★★☆☆ | ★★★☆☆ | ★★★☆☆
String Search | ★★★★☆ | ★★★★☆ | ★★★★☆ | ★★★★☆
KV Database | ★★★★★ | ★★★★★ | ★★★★★ | ★★★★★

Essential Books for This Track

Book | Focus Areas
“Computer Systems: A Programmer’s Perspective” - Bryant & O’Hallaron | Memory, caching, linking, virtual memory
“C Interfaces and Implementations” - David Hanson | API design, memory management patterns
“Advanced Programming in the UNIX Environment” - Stevens & Rago | POSIX system calls, process control
“Rust Atomics and Locks” - Mara Bos | Memory ordering, lock-free programming
“The Linux Programming Interface” - Michael Kerrisk | Deep Linux systems programming
“Modern X86 Assembly Language Programming” - Kusswurm | SIMD, CPU architecture
“Designing Data-Intensive Applications” - Martin Kleppmann | Database internals, LSM trees
“Database Internals” - Alex Petrov | Storage engines, WAL, B-trees

These projects will transform you from someone who uses system libraries to someone who understands how they work at the deepest level.