C INTERFACES AND IMPLEMENTATIONS MASTERY
Sprint: C Interfaces and Implementations Mastery - Building Reusable Software from First Principles
Goal: Master the art of building bulletproof, reusable C libraries through David Hanson’s disciplined interface-based programming methodology. You’ll understand why separating interface from implementation creates maintainable code, how to design APIs that are impossible to misuse, and how to build a complete ecosystem of data structures and utilities that work together seamlessly. By the end, you’ll think like a library designer, not just a programmer.
Why “C Interfaces and Implementations” Matters
The Book That Changed How We Write C
Published in 1996, “C Interfaces and Implementations” emerged from David Hanson’s work on lcc, a retargetable C compiler that he co-developed with Christopher Fraser. The book wasn’t an accident—it was the distillation of hard-won lessons from building production-quality systems software.
The problem Hanson solved: C is a minimal language by design. It gives you raw power but almost no guardrails. The standard library provides basic I/O and string manipulation, but nothing for hash tables, sets, arbitrary-precision arithmetic, or even proper assertions. Worse, C has no module system—just textual inclusion of header files with all the namespace pollution and ordering headaches that implies.
Every serious C project ends up reinventing the same wheels: memory allocators, container types, error handling. And every team makes different choices, leading to codebases that are internally inconsistent and externally incompatible.
Hanson’s solution was a discipline—a systematic approach to building reusable components that:
- Hide implementation details completely (opaque types)
- Make ownership and lifetime explicit
- Fail loudly and early on misuse
- Compose together without leaking abstractions
Why It’s Still Relevant Today
You might think a 1996 book about C is obsolete. Look closer:
Modern C Projects Using Hanson's Patterns
─────────────────────────────────────────────────────────────────────────
SQLite (2000-present)
├── Opaque handles: sqlite3*, sqlite3_stmt*
├── Checked allocation: sqlite3_malloc(), sqlite3_free()
├── Interface/Implementation split: sqlite3.h vs 150K lines of .c
└── Used by: Chrome, Firefox, iOS, Android, macOS
Redis (2009-present)
├── sds (Simple Dynamic Strings): exactly Hanson's string abstraction
├── Custom allocators: zmalloc/zfree wrapper around malloc
├── Object system: redisObject with type tags
└── Used by: Twitter, GitHub, StackOverflow, Airbnb
Linux Kernel (1991-present)
├── Opaque types: struct file*, struct inode*
├── Interface headers: include/linux/*.h (public)
├── Implementation: fs/*.c, kernel/*.c (private)
├── Disciplined memory: kmalloc, kfree with explicit flags
└── Used by: 70% of world's servers
Git (2005-present)
├── Object abstraction: tree, blob, commit all behind common interface
├── Memory pools: for batch operations
├── Clean header boundaries: cache.h, object.h, refs.h
└── Used by: 95%+ of software developers
The 2024 Reality: C remains the language of systems programming. It consistently ranks among the top few languages in the TIOBE Index. The Linux kernel, PostgreSQL, Python’s CPython interpreter, and the majority of embedded systems are all C. Hanson’s patterns aren’t academic—they’re survival skills.
The Interface/Implementation Pattern
This is the core insight of the entire book:
┌─────────────────────────────────────────────────────────────────────────────┐
│ THE INTERFACE/IMPLEMENTATION PATTERN │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CLIENT CODE LIBRARY │
│ ─────────── ─────── │
│ │
│ #include "table.h" table.h (INTERFACE) │
│ ───────────────── ────────────────── │
│ Table_T t; ┌────────────────────────────────┐ │
│ t = Table_new(100, ...); ───► │ typedef struct Table_T │ │
│ Table_put(t, key, val); │ *Table_T; // OPAQUE! │ │
│ val = Table_get(t, key); │ │ │
│ Table_free(&t); │ Table_T Table_new(...); │ │
│ │ void *Table_put(...); │ │
│ │ void *Table_get(...); │ │
│ │ void Table_free(...); │ │
│ Client sees: functions └────────────────────────────────┘ │
│ Client CANNOT see: │ │
│ - struct internals │ │
│ - bucket array ▼ │
│ - hash function table.c (IMPLEMENTATION) │
│ - collision resolution ───────────────────────── │
│ ┌────────────────────────────────┐ │
│ │ struct Table_T { │ │
│ │ int size; │ │
│ │ int length; │ │
│ │ struct binding { │ │
│ │ void *key; │ │
│ │ void *value; │ │
│ │ struct binding *link; │ │
│ │ } **buckets; │ │
│ │ int (*cmp)(void*, void*); │ │
│ │ unsigned (*hash)(void*); │ │
│ │ }; │ │
│ └────────────────────────────────┘ │
│ │
│ KEY INSIGHT: │
│ ─────────── │
│ The .h file is a CONTRACT. It defines WHAT, not HOW. │
│ The .c file is SECRET. It can change without breaking clients. │
│ The typedef *pointer makes the struct truly opaque. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
C Usage in Systems Programming Today
┌─────────────────────────────────────────────────────────────────────────────┐
│ C IN THE MODERN WORLD │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Operating Systems │
│ ───────────────── │
│ Linux Kernel ........... 30+ million lines of C │
│ Windows Kernel ......... Mostly C/C++ │
│ macOS/iOS XNU .......... 1.5+ million lines of C │
│ FreeBSD ................ 2+ million lines of C │
│ │
│ Databases │
│ ───────── │
│ SQLite ................. 150K lines, most widely deployed DB │
│ PostgreSQL ............. 1.4M lines of C │
│ MySQL .................. Core engine in C │
│ Redis .................. 100K lines of C │
│ │
│ Language Runtimes │
│ ───────────────── │
│ CPython ................ Python's reference implementation │
│ Ruby (CRuby/MRI) ....... Ruby's main implementation │
│ Lua .................... 30K lines, model of clean C │
│ PHP .................... Zend Engine │
│ │
│ Infrastructure │
│ ────────────── │
│ OpenSSL/LibreSSL ....... TLS for the internet │
│ curl ................... "The glue of the internet" │
│ Nginx .................. 50% of web servers │
│ Git .................... Version control for the world │
│ │
│ Embedded & IoT │
│ ───────────── │
│ 90%+ of microcontroller code is C │
│ FreeRTOS, Zephyr, embedded Linux │
│ Automotive, medical devices, aerospace │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Prerequisites & Background Knowledge
Essential Prerequisites
| Skill | What You Need | How to Verify |
|---|---|---|
| C Fundamentals | Pointers, pointer arithmetic, structs, unions | Can you implement a linked list from scratch? |
| Memory Model | Stack vs heap, malloc/free lifecycle | Can you explain where a local variable lives vs a malloc’d buffer? |
| Header Files | #include, header guards, declarations vs definitions | Can you explain why you get “multiple definition” errors? |
| Build Process | Compile vs link, .o files, static libraries | Can you build a project with multiple .c files using make? |
| Data Structures | Linked lists, hash tables, trees (conceptually) | Do you understand O(1) vs O(n) lookup? |
Recommended Background
Before starting, ideally you’ve read:
Required:
- “The C Programming Language” (K&R) - Chapters 1-6 at minimum
Strongly Recommended:
- “Effective C” by Robert C. Seacord - Modern C best practices
- Experience debugging with GDB or LLDB
Helpful but Optional:
- “Expert C Programming” by Peter van der Linden - Deep tricks
- “21st Century C” by Ben Klemens - Modern tooling
Self-Assessment Questions
Answer these before starting. If you struggle with more than 2, review the prerequisites:
- What’s wrong with this code?
  `char *get_greeting() { char buf[100]; sprintf(buf, "Hello, World!"); return buf; }`
- Why does `sizeof(struct { char c; int i; })` return 8, not 5?
- What happens when you call `free()` on the same pointer twice?
- Given this header:
  `// mylib.h  struct Thing;  struct Thing *thing_new(void);`
  Why can’t a client do `struct Thing t; t.value = 5;`?
- What’s the difference between `#include <stdio.h>` and `#include "myheader.h"`?
- If you `malloc(100)` and only `free()` 50 bytes of it, what happens to the other 50? (Trick question!)
Development Environment Setup
Required Tools
──────────────
□ C Compiler: GCC (gcc) or Clang (clang)
- Verify: gcc --version OR clang --version
- Need: C11 support minimum (-std=c11)
□ Build System: Make (GNU Make preferred)
- Verify: make --version
- The book's code uses make
□ Memory Debugger: Valgrind (Linux) or AddressSanitizer
- Linux: valgrind --leak-check=full ./your_program
- macOS: clang -fsanitize=address -g your_code.c
□ Debugger: GDB (Linux) or LLDB (macOS)
- Essential for understanding crashes
□ Text Editor or IDE with C support
- VS Code with C/C++ extension
- Vim/Neovim with coc.nvim or similar
- CLion if you prefer full IDE
Optional but Recommended
────────────────────────
□ Cppcheck: Static analysis for C
□ ctags/cscope: Code navigation
□ Doxygen: Documentation generation
□ gcov/lcov: Code coverage
Recommended Compiler Flags
──────────────────────────
CFLAGS = -Wall -Wextra -Wpedantic -std=c11 -g -O2
For development:
CFLAGS += -fsanitize=address -fsanitize=undefined
For production:
CFLAGS += -O3 -DNDEBUG
Time Investment Estimates
| Your Background | Estimated Time | Notes |
|---|---|---|
| Experienced C Developer | 4-6 weeks | You’ll appreciate the discipline, focus on patterns |
| Know C, Learning Library Design | 6-10 weeks | Core journey, best ROI |
| Learning C + Hanson Together | 12-16 weeks | Doable but challenging; K&R first recommended |
Per-Project Estimates:
- Foundation modules (Assert, Mem, Atom): 1-2 days each
- Data structure modules (List, Table, Set): 3-5 days each
- Advanced modules (Arena, XP, AP): 5-7 days each
- Integration project: 2-3 weeks
Core Concept Analysis
1. The Interface/Implementation Pattern
The Central Question: How do you write C code that clients can use without understanding (or depending on) its internal structure?
┌─────────────────────────────────────────────────────────────────────────────┐
│ OPAQUE TYPES: THE CORE TECHNIQUE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ THE PROBLEM: │
│ ──────────── │
│ In "normal" C, if a client can see a struct definition, they can: │
│ • Access any field directly: obj.internal_counter++ │
│ • Make assumptions about size: memcpy(&obj, src, sizeof(obj)) │
│ • Depend on field order: char *name = (char*)&obj + 4; │
│ │
│ Any change to the struct BREAKS ALL CLIENT CODE. │
│ │
│ THE SOLUTION: Opaque Types │
│ ────────────────────────────── │
│ │
│ Header (stack.h): Client code: │
│ ───────────────── ──────────── │
│ typedef struct Stack_T #include "stack.h" │
│ *Stack_T; // Forward decl Stack_T s = Stack_new(); │
│ // + pointer! Stack_push(s, data); │
│ void *top = Stack_pop(s); │
│ Stack_T Stack_new(void); Stack_free(&s); │
│ void Stack_push(Stack_T, void*); │
│ void *Stack_pop(Stack_T); // Client CANNOT: │
│ void Stack_free(Stack_T *); // s->top = NULL; // Error! │
│ // sizeof(*s) // Error! │
│ │
│ Implementation (stack.c): │
│ ───────────────────────── │
│ struct Stack_T { │
│ int count; │
│ int size; │
│ void **elements; │
│ }; │
│ │
│ Stack_T Stack_new(void) { │
│ Stack_T s = malloc(sizeof(*s)); │
│ s->count = 0; │
│ s->size = 16; │
│ s->elements = malloc(s->size * sizeof(void*)); │
│ return s; │
│ } │
│ │
│ WHY THIS WORKS: │
│ ─────────────── │
│ 1. Client includes stack.h → sees only "Stack_T is a pointer to something" │
│ 2. sizeof(Stack_T) = sizeof(void*) → client can store/pass the handle │
│ 3. Implementation can change struct layout WITHOUT recompiling clients │
│ 4. Client cannot corrupt internal state (no field access) │
│ │
│ THE HANDLE PATTERN: │
│ ────────────────── │
│ │
│ ┌──────────────────────┐ │
│ │ Client's variable │ │
│ │ Stack_T s ───────────┼───► [ Pointer (8 bytes) ] │
│ └──────────────────────┘ │ │
│ │ points to │
│ ▼ │
│ ┌─────────────────────────────┐ │
│ │ Heap-allocated struct │ │
│ │ ┌─────────────────────────┐ │ │
│ │ │ count: 3 │ │ │
│ │ │ size: 16 │ │ │
│ │ │ elements: ──────────────┼─┼──► [ptr][ptr].. │
│ │ └─────────────────────────┘ │ │
│ └─────────────────────────────┘ │
│ │
│ This is identical to FILE* from stdio - you never see inside a FILE. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Key Insight: The `typedef struct X *X` pattern makes `X` a type name that is itself a pointer to the hidden struct. Clients work with handles, never with raw structs.
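To see the handle pattern end to end, here is a condensed, compilable sketch (illustrative, not Hanson's exact code; in a real project only the first five declarations would live in stack.h, and everything below them in stack.c):

```c
#include <assert.h>
#include <stdlib.h>

/* --- What stack.h would expose: an opaque handle plus prototypes --- */
typedef struct Stack_T *Stack_T;
Stack_T Stack_new(void);
void    Stack_push(Stack_T s, void *item);
void   *Stack_pop(Stack_T s);
void    Stack_free(Stack_T *s);

/* --- What stack.c would hide: the representation --- */
struct Stack_T {
    int    count;    /* elements currently stored  */
    int    size;     /* capacity of elements array */
    void **elements; /* growable array of pointers */
};

Stack_T Stack_new(void) {
    Stack_T s = malloc(sizeof(*s));
    assert(s);
    s->count = 0;
    s->size = 16;
    s->elements = malloc(s->size * sizeof(void *));
    assert(s->elements);
    return s;
}

void Stack_push(Stack_T s, void *item) {
    assert(s);
    if (s->count == s->size) {   /* grow geometrically when full */
        s->size *= 2;
        s->elements = realloc(s->elements, s->size * sizeof(void *));
        assert(s->elements);
    }
    s->elements[s->count++] = item;
}

void *Stack_pop(Stack_T s) {
    assert(s && s->count > 0);   /* popping an empty stack is a client bug */
    return s->elements[--s->count];
}

void Stack_free(Stack_T *s) {
    assert(s && *s);
    free((*s)->elements);
    free(*s);
    *s = NULL;                   /* NULLing the handle exposes use-after-free */
}
```

Because clients see only `typedef struct Stack_T *Stack_T;`, switching the array to a linked list later changes nothing in client code.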
2. Checked Runtime Errors
The Central Question: How do you catch bugs immediately instead of letting them corrupt data silently?
┌─────────────────────────────────────────────────────────────────────────────┐
│ DEFENSIVE PROGRAMMING WITH ASSERTIONS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ THE PHILOSOPHY: │
│ ────────────── │
│ "If a precondition is violated, the program is ALREADY broken. │
│ Crashing immediately is a FAVOR to the developer." │
│ │
│ — David Hanson (paraphrased) │
│ │
│ STANDARD assert() IS WEAK: │
│ ────────────────────────── │
│ │
│ assert(ptr != NULL); // Output: "Assertion failed: ptr != NULL" │
│ // Where? Which function? Which call? │
│ │
│ HANSON'S Assert MODULE: │
│ ─────────────────────── │
│ │
│ #include "assert.h" │
│ │
│ void Table_put(Table_T table, void *key, void *value) { │
│ assert(table); // Precondition: table must be valid │
│ assert(key); // Precondition: key must not be NULL │
│ // ... implementation │
│ } │
│ │
│ Hanson's assert does: │
│ 1. Raises SIGABRT with useful message │
│ 2. Includes file:line:function context │
│ 3. Can be disabled with NDEBUG (like standard assert) │
│ 4. Sets foundation for Except module (exceptions via setjmp/longjmp) │
│ │
│ TYPES OF CHECKED ERRORS: │
│ ──────────────────────── │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Error Type │ Example │ Response │ │
│ ├───────────────────┼──────────────────────────┼───────────────────────┤ │
│ │ Precondition │ NULL pointer argument │ assert() → crash │ │
│ │ Postcondition │ Invalid return state │ assert() → crash │ │
│ │ Invariant │ Corrupt internal data │ assert() → crash │ │
│ │ Resource error │ malloc returns NULL │ RAISE exception │ │
│ │ External failure │ File not found │ Return error code │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ THE FAIL-FAST PRINCIPLE: │
│ ──────────────────────── │
│ │
│ Bug occurs ──► Data corrupted ──► Crash later ──► Debug nightmare │
│ ▲ │
│ │ STOP IT HERE │
│ │ │
│ Bug occurs ──► assert() fires ──► Immediate crash with context │
│ "table.c:47: Table_put: table is NULL" │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
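A minimal version of such an assert can be built with a handful of preprocessor features. This is an illustrative sketch, not Hanson's code: his Assert raises an Assert_Failed exception through the Except module, while this simplified `my_assert` (a hypothetical name) just prints the context and aborts.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch of a Hanson-style assertion (simplified: prints and aborts
 * instead of raising an exception through an Except module). */
#ifdef NDEBUG
#define my_assert(e) ((void)0)   /* compiled out entirely in release builds */
#else
static void my_assert_fail(const char *expr, const char *file,
                           int line, const char *func) {
    fprintf(stderr, "%s:%d: %s: assertion `%s' failed\n",
            file, line, func, expr);
    abort();                     /* fail fast: SIGABRT with full context */
}
/* #e stringizes the expression; __FILE__/__LINE__/__func__ locate the call */
#define my_assert(e) \
    ((e) ? (void)0 : my_assert_fail(#e, __FILE__, __LINE__, __func__))
#endif
```

Used as a precondition check (`my_assert(table);` at the top of `Table_put`), a failure reports the file, line, function, and the exact expression that was violated.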
3. Memory Management Philosophy
The Central Question: How do you make memory allocation safe, debuggable, and appropriate to your usage patterns?
┌─────────────────────────────────────────────────────────────────────────────┐
│ WRAPPED MEMORY ALLOCATION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ WHY WRAP malloc/free? │
│ ───────────────────── │
│ │
│ Raw malloc problems: │
│ • Returns NULL on failure (rarely checked!) │
│ • No tracking of allocations │
│ • No debugging info (where allocated? how big?) │
│ • Double-free and use-after-free are silent corruption │
│ │
│ THE Mem MODULE: │
│ ─────────────── │
│ │
│ // mem.h │
│ extern void *Mem_alloc(long nbytes, const char *file, int line); │
│ extern void *Mem_calloc(long count, long nbytes, const char *file, int); │
│ extern void Mem_free(void *ptr, const char *file, int line); │
│ extern void *Mem_resize(void *ptr, long nbytes, const char *file, int); │
│ │
│ // Usage via macros: │
│ #define ALLOC(nbytes) Mem_alloc((nbytes), __FILE__, __LINE__) │
│ #define FREE(ptr) ((void)(Mem_free((ptr), __FILE__, __LINE__), \ │
│ (ptr) = NULL)) │
│ #define RESIZE(ptr, n) Mem_resize((ptr), (n), __FILE__, __LINE__) │
│ │
│ WHAT THIS GIVES YOU: │
│ ─────────────────── │
│ │
│ 1. NEVER returns NULL - raises exception on failure │
│ 2. Every allocation tagged with file:line │
│ 3. FREE() macro sets pointer to NULL (catches use-after-free) │
│ 4. Can log all allocations for leak detection │
│ 5. Can add guard bytes to detect buffer overflows │
│ │
│ ALLOCATION TRACKING (debug mode): │
│ ───────────────────────────────── │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Address Size File:Line │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ 0x7f8a4000 1024 table.c:47 │ │
│ │ 0x7f8a4400 256 atom.c:23 │ │
│ │ 0x7f8a4500 64 main.c:12 │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ On program exit: "3 blocks still allocated, 1344 bytes leaked" │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
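A simplified sketch of the wrapping idea follows. Hedged: the real Mem module raises a Mem_Failed exception via Except and, in its checking build, records every live allocation; this version just aborts on failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch of a checked allocator in the spirit of CII's Mem module. */
void *Mem_alloc(long nbytes, const char *file, int line) {
    void *p;
    assert(nbytes > 0);
    p = malloc((size_t)nbytes);
    if (p == NULL) {             /* callers NEVER see NULL */
        fprintf(stderr, "%s:%d: allocation of %ld bytes failed\n",
                file, line, nbytes);
        abort();
    }
    return p;
}

void Mem_free(void *ptr, const char *file, int line) {
    (void)file; (void)line;      /* a checking build would log the call site */
    free(ptr);
}

/* Client-facing macros capture the call site automatically. */
#define ALLOC(nbytes) Mem_alloc((nbytes), __FILE__, __LINE__)
#define FREE(ptr) ((void)(Mem_free((ptr), __FILE__, __LINE__), (ptr) = NULL))
```

Note the design choice in `FREE`: the comma expression frees the block and then assigns NULL to the caller's pointer variable, so a later dereference crashes immediately instead of silently reading freed memory.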
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARENA ALLOCATION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ THE PROBLEM: │
│ ──────────── │
│ Many programs allocate lots of small objects with the SAME LIFETIME: │
│ • Compiler: AST nodes for one function │
│ • Parser: tokens for one file │
│ • Request: all data for one HTTP request │
│ │
│ Tracking individual frees is wasteful and error-prone. │
│ │
│ THE SOLUTION: Arena (Memory Pool) │
│ ───────────────────────────────── │
│ │
│ Normal allocation: Arena allocation: │
│ │
│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
│ │ ptr1 = malloc(32) │ │ Arena a = Arena_new() │ │
│ │ ptr2 = malloc(64) │ │ ptr1 = Arena_alloc(a,32)│ │
│ │ ptr3 = malloc(128) │ │ ptr2 = Arena_alloc(a,64)│ │
│ │ ...100 more allocations │ │ ptr3 = Arena_alloc(a,..)│ │
│ │ free(ptr1) │ │ ...100 more allocations │ │
│ │ free(ptr2) │ │ │ │
│ │ free(ptr3) │ │ Arena_dispose(&a) │ │
│ │ ...pray you didn't miss │ │ // ONE call frees ALL │ │
│ │ any or double-free │ │ // No leaks possible! │ │
│ └─────────────────────────┘ └─────────────────────────┘ │
│ │
│ ARENA INTERNALS: │
│ ──────────────── │
│ │
│ Arena_T │
│ │ │
│ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Chunk 1 │───►│ Chunk 2 │───►│ Chunk 3 │ │
│ │ [allocated] │ │ [allocated] │ │ [current] │ │
│ │ [data.....] │ │ [data.....] │ │ [data] │ │
│ │ [data.....] │ │ [data.....] │ │ [free..] │◄── avail pointer │
│ │ [data.....] │ │ [data.....] │ │ [.......] │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ Arena_alloc: Bump the pointer, return old position │
│ Arena_dispose: Walk list, free each chunk. DONE. │
│ │
│ BENEFITS: │
│ ──────── │
│ • Allocation: O(1) - just pointer bump │
│ • Deallocation: O(n) where n = number of CHUNKS, not objects │
│ • No fragmentation within arena │
│ • Cache-friendly (sequential allocation) │
│ • Impossible to leak or double-free individual objects │
│ │
│ REAL-WORLD USE: │
│ ─────────────── │
│ • Nginx: memory pools per request │
│ • Apache: apr_pool for request lifetime │
│ • Compilers: per-function arenas for AST │
│ • Game engines: per-frame allocators │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
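The bump-pointer mechanics fit in a short sketch. This is illustrative, not CII's Arena (which recycles freed chunks and offers Arena_free to reset an arena without destroying it); the chunk size and alignment choices here are assumptions.

```c
#include <assert.h>
#include <stddef.h>   /* max_align_t */
#include <stdlib.h>

#define CHUNK_SIZE 4096

struct chunk {
    struct chunk *next;
    size_t        used;
    _Alignas(max_align_t) unsigned char data[CHUNK_SIZE];
};

typedef struct Arena_T { struct chunk *head; } *Arena_T;

Arena_T Arena_new(void) {
    Arena_T a = malloc(sizeof(*a));
    assert(a);
    a->head = NULL;
    return a;
}

void *Arena_alloc(Arena_T a, size_t nbytes) {
    size_t align = _Alignof(max_align_t);
    nbytes = (nbytes + align - 1) / align * align;   /* keep results aligned */
    assert(a && nbytes > 0 && nbytes <= CHUNK_SIZE);
    if (a->head == NULL || a->head->used + nbytes > CHUNK_SIZE) {
        struct chunk *c = malloc(sizeof(*c));        /* start a fresh chunk */
        assert(c);
        c->next = a->head;
        c->used = 0;
        a->head = c;
    }
    void *p = a->head->data + a->head->used;         /* bump the pointer... */
    a->head->used += nbytes;                         /* ...that is the whole
                                                        allocation fast path */
    return p;
}

void Arena_dispose(Arena_T *ap) {
    assert(ap && *ap);
    struct chunk *c = (*ap)->head;
    while (c) {                  /* one walk frees EVERY object at once */
        struct chunk *next = c->next;
        free(c);
        c = next;
    }
    free(*ap);
    *ap = NULL;
}
```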
4. Abstract Data Types in C
The Central Question: How do you achieve polymorphism and encapsulation in a language without classes?
┌─────────────────────────────────────────────────────────────────────────────┐
│ OBJECTS IN C (HANSON STYLE) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ THE PATTERN: │
│ ──────────── │
│ │
│ 1. Opaque type (handle) hides data │
│ 2. Functions operate on handle (methods) │
│ 3. Function pointers enable polymorphism │
│ 4. Naming convention: Module_operation (Table_put, List_push) │
│ │
│ EXAMPLE: A Generic Container │
│ ──────────────────────────── │
│ │
│ // table.h │
│ typedef struct Table_T *Table_T; │
│ │
│ Table_T Table_new(int hint, │
│ int cmp(const void *x, const void *y), // Compare function │
│ unsigned hash(const void *key)); // Hash function │
│ │
│ void *Table_put(Table_T table, const void *key, void *value); │
│ void *Table_get(Table_T table, const void *key); │
│ void *Table_remove(Table_T table, const void *key); │
│ void Table_free(Table_T *table); │
│ │
│ // Usage: Tables with different key types │
│ Table_T string_table = Table_new(100, string_cmp, string_hash); │
│ Table_T int_table = Table_new(100, int_cmp, int_hash); │
│ │
│ POLYMORPHISM THROUGH FUNCTION POINTERS: │
│ ─────────────────────────────────────── │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Table_T │ │
│ │ ├── cmp ──────► int (*)(void*, void*) ──► string_cmp() │ │
│ │ │ [function pointer] OR int_cmp() │ │
│ │ │ [POLYMORPHISM!] OR custom_cmp() │ │
│ │ │ │ │
│ │ ├── hash ─────► unsigned (*)(void*) ──► string_hash() │ │
│ │ │ [function pointer] OR int_hash() │ │
│ │ │ │ │
│ │ └── buckets ──► [ array of linked lists containing data ] │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ The Table doesn't know what keys are - it just calls cmp() and hash(). │
│ This is duck typing in C: "If it compares and hashes, it's a key." │
│ │
│ INFORMATION HIDING IN ACTION: │
│ ───────────────────────────── │
│ │
│ Version 1: Linked list buckets Version 2: Open addressing │
│ struct Table_T { struct Table_T { │
│ struct binding **buckets; struct slot *slots; │
│ int size; int capacity; │
│ ... ... │
│ }; }; │
│ │
│ CLIENTS DON'T KNOW AND DON'T CARE. │
│ Same .h file. Same API. Different performance characteristics. │
│ │
│ THE NAMING CONVENTION: │
│ ───────────────────── │
│ │
│ Module_T - The type (Table_T, List_T, Set_T) │
│ Module_new - Constructor │
│ Module_free - Destructor (takes T* to NULL the pointer) │
│ Module_verb - Operations (Table_put, List_push, Set_member) │
│ │
│ This is a namespace convention. C doesn't have namespaces, so we fake it. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
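The callback plumbing is small enough to show directly. These cmp/hash implementations are illustrative (the multiplicative hash is an assumed choice, not Hanson's); they match the parameter shapes of the `Table_new` signature above.

```c
#include <assert.h>
#include <string.h>

/* The container never inspects keys itself; it only invokes these. */

static int string_cmp(const void *x, const void *y) {
    return strcmp(x, y);                     /* keys are char * */
}

static unsigned string_hash(const void *key) {
    const unsigned char *s = key;
    unsigned h = 0;
    while (*s)
        h = h * 31 + *s++;                   /* simple multiplicative hash */
    return h;
}

static int int_cmp(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);                /* keys are int * */
}

static unsigned int_hash(const void *key) {
    return (unsigned)*(const int *)key;
}

/* Two tables, one implementation, different behavior chosen at run time:
 *   Table_T names = Table_new(100, string_cmp, string_hash);
 *   Table_T ids   = Table_new(100, int_cmp,    int_hash);
 */
```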
5. Literate Programming Influence
The Central Question: How should code be organized to be understood by humans, not just compilers?
┌─────────────────────────────────────────────────────────────────────────────┐
│ CODE ORGANIZATION & DOCUMENTATION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ HANSON'S APPROACH (from cweb/noweb heritage): │
│ ──────────────────────────────────────────── │
│ │
│ The book itself is an example of literate programming: │
│ • Code and explanation interleaved │
│ • Each module has INTERFACE section, then IMPLEMENTATION │
│ • Concepts explained before code that implements them │
│ │
│ FOR YOUR IMPLEMENTATIONS, follow this structure: │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ module.h (Interface) │ │
│ │ ───────────────────── │ │
│ │ /* │ │
│ │ * Module: One-line description │ │
│ │ * │ │
│ │ * Overview paragraph explaining WHAT and WHY. │ │
│ │ * │ │
│ │ * Ownership: Who owns returned pointers? When to free? │ │
│ │ * │ │
│ │ * Thread safety: Is it safe? What locks are needed? │ │
│ │ */ │ │
│ │ │ │
│ │ #ifndef MODULE_INCLUDED │ │
│ │ #define MODULE_INCLUDED │ │
│ │ │ │
│ │ /* Types */ │ │
│ │ typedef struct Module_T *Module_T; │ │
│ │ │ │
│ │ /* Exceptions (if any) */ │ │
│ │ extern const Except_T Module_Error; │ │
│ │ │ │
│ │ /* Functions grouped by purpose */ │ │
│ │ /* -- Creation and destruction */ │ │
│ │ extern Module_T Module_new(...); │ │
│ │ extern void Module_free(Module_T *m); │ │
│ │ │ │
│ │ /* -- Primary operations */ │ │
│ │ extern void Module_op1(Module_T m, ...); │ │
│ │ extern Result Module_op2(Module_T m, ...); │ │
│ │ │ │
│ │ #endif │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ module.c (Implementation) │ │
│ │ ───────────────────────── │ │
│ │ │ │
│ │ #include "module.h" │ │
│ │ #include "mem.h" │ │
│ │ #include "assert.h" │ │
│ │ │ │
│ │ /* Internal constants and macros */ │ │
│ │ #define DEFAULT_SIZE 64 │ │
│ │ │ │
│ │ /* The representation - NEVER visible to clients */ │ │
│ │ struct Module_T { │ │
│ │ int field1; │ │
│ │ char *field2; │ │
│ │ ... │ │
│ │ }; │ │
│ │ │ │
│ │ /* Static (file-scope) helper functions */ │ │
│ │ static int helper(Module_T m) { │ │
│ │ assert(m); │ │
│ │ ... │ │
│ │ } │ │
│ │ │ │
│ │ /* Public function implementations */ │ │
│ │ Module_T Module_new(...) { │ │
│ │ Module_T m; │ │
│ │ NEW(m); /* Checked allocation */ │ │
│ │ ... │ │
│ │ return m; │ │
│ │ } │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ KEY DOCUMENTATION PRINCIPLES: │
│ ───────────────────────────── │
│ │
│ 1. The .h file is the primary documentation │
│ - A reader should understand usage from .h alone │
│ - Comments explain contract, not implementation │
│ │
│ 2. Implementation comments explain WHY, not WHAT │
│ - "Resize when load > 0.75 for amortized O(1)" ✓ │
│ - "Loop through array" ✗ │
│ │
│    3. Assertions ARE documentation                                          │
│ - assert(ptr) means "ptr must not be NULL" │
│ - assert(n > 0) means "n must be positive" │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Concept Summary Table
| Concept Cluster | What You Need to Internalize | Common Mistakes |
|---|---|---|
| Opaque Types | Hide struct internals behind `typedef struct X *X` | Exposing struct definition in header |
| Checked Allocation | Every malloc wrapped, every error caught or raised | Ignoring malloc return value |
| Interface Discipline | .h files are contracts, .c files are secrets | Leaking implementation details to headers |
| Arena Memory | Bulk allocate, bulk free - no tracking individual objects | Mixing arena and regular allocation lifetimes |
| Assertion Philosophy | Fail fast, fail loud, fail with information | Assertions that can be false in valid usage |
| Function Pointers | Polymorphism via compare/hash function parameters | Hardcoding behavior that should be pluggable |
| Naming Conventions | Module_Type, Module_new, Module_operation | Inconsistent naming across modules |
| Ownership Semantics | Document who owns returned pointers explicitly | Ambiguous ownership leading to leaks or double-free |
Deep Dive Reading by Concept
Primary: “C Interfaces and Implementations” by David R. Hanson
| Concept | CII Chapters | Key Pages |
|---|---|---|
| Interface/Implementation Pattern | Ch. 1, 2 | The entire philosophy |
| Assertions | Ch. 4 | Assert module |
| Checked Memory | Ch. 5, 6 | Mem module, Arena module |
| Atoms (Interning) | Ch. 3 | Atom module |
| Exceptions | Ch. 4 | Except module |
| Data Structures | Ch. 7-13 | List, Table, Set, Seq, Ring, Bit modules |
| Strings | Ch. 14-16 | Fmt, Str, Text modules |
| Arithmetic | Ch. 17-19 | XP, AP, MP modules |
| Threading | Ch. 20 | Thread module |
Supporting Books
| Concept | Book & Chapter | Why It Helps |
|---|---|---|
| C Language Fundamentals | “The C Programming Language” (K&R), All | Foundation everything builds on |
| Modern C Practices | “Effective C” Ch. 1-6 | Memory, types, expressions, arrays |
| Debugging C | “The Art of Debugging” Ch. 1-5 | GDB/Valgrind fluency |
| Deep C Quirks | “Expert C Programming” by van der Linden | Understanding C’s dark corners |
| Systems Context | “Computer Systems: A Programmer’s Perspective” | How C maps to machine |
| API Design | “C Programming: A Modern Approach” Ch. 15-19 | Program organization |
Online Resources
| Resource | URL | Value |
|---|---|---|
| CII Source Code | https://github.com/drh/cii | Official implementation to study |
| Hanson’s Home Page | https://www.cs.princeton.edu/~drh/ | Papers and errata |
| “Redis Internals” | https://redis.io/docs/reference/internals/ | Real-world application of patterns |
| SQLite Architecture | https://sqlite.org/arch.html | Production opaque handle design |
Quick Start: Your First 48 Hours
Day 1: Build the Foundation (4-6 hours)
Morning: Environment Setup
- Clone or download the CII source code from Hanson’s repository
- Set up your build environment (gcc, make, valgrind)
- Build the existing library - make sure all tests pass
- Read Chapter 1-2 of the book
Afternoon: Assert Module
- Read the assert.h interface carefully
- Implement your own version from scratch (don’t copy!)
- Write tests that trigger assertions
- Verify with valgrind that you have no memory issues
Key Checkpoint: You can explain:
- Why assertions should crash, not return error codes
- How `__FILE__` and `__LINE__` macros work
- What NDEBUG does and when to use it
Day 2: Memory Foundation (4-6 hours)
Morning: Mem Module
- Read Chapter 5 (Mem interface)
- Study how Mem wraps malloc/free
- Implement ALLOC, FREE, RESIZE macros
- Add allocation tracking for debugging
Afternoon: Atom Module
- Read Chapter 3 (Atom interface)
- Understand string interning concept
- Implement Atom_string, Atom_int, Atom_length
- Write tests that verify atom uniqueness
Key Checkpoint: You can explain:
- Why Mem_alloc never returns NULL
- What string interning is and why it’s useful
- How atoms enable pointer comparison for strings
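The core of string interning fits in a few lines. This is a deliberately simplified sketch (a linear scan over a flat table instead of Atom's hash table; like the real Atom module, atoms are never freed):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified interning in the spirit of CII's Atom module. */
#define MAX_ATOMS 1024
static const char *atoms[MAX_ATOMS];
static int natoms;

const char *Atom_string(const char *str) {
    assert(str);
    for (int i = 0; i < natoms; i++)
        if (strcmp(atoms[i], str) == 0)
            return atoms[i];              /* already interned: same pointer */
    assert(natoms < MAX_ATOMS);
    char *copy = malloc(strlen(str) + 1); /* intern a private, immortal copy */
    assert(copy);
    strcpy(copy, str);
    atoms[natoms++] = copy;
    return copy;
}
```

After interning, string equality becomes pointer equality: `Atom_string(a) == Atom_string(b)` exactly when `strcmp(a, b) == 0`, which is why atoms make cheap O(1) table keys.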
Recommended Learning Paths
Path 1: The Foundations Builder (4 weeks)
For developers who want the core patterns without every module.
Week 1: Assert → Mem → Arena
Week 2: Atom → Except
Week 3: List → Table
Week 4: Set → Build a mini-project using all modules
You'll have: Core patterns, 7 modules, ability to build real libraries
Path 2: The Data Structures Focus (6 weeks)
For developers who want to implement serious container libraries.
Week 1: Assert → Mem
Week 2: Atom → List (functional linked lists)
Week 3: Table (hash tables) → Set (set operations)
Week 4: Array (dynamic arrays) → Seq (deques)
Week 5: Ring (circular buffers) → Bit (bit vectors)
Week 6: Integration project: combine for a real application
You'll have: 10 modules, complete container library, real implementation experience
Path 3: The Completionist (10-12 weeks)
For developers who want to master every module and the philosophy deeply.
Weeks 1-2: Foundation (Assert, Mem, Arena, Atom, Except)
Weeks 3-4: Data Structures I (List, Table, Set)
Weeks 5-6: Data Structures II (Array, Seq, Ring, Bit)
Weeks 7-8: Strings (Fmt, Str, Text)
Weeks 9-10: Arithmetic (XP, AP, MP)
Week 11: Threading (Thread module)
Week 12: Capstone: Mini Standard Library
You'll have: All 20 modules, deep understanding of C library design
Project List
The following projects mirror the modules in Hanson’s book, each teaching specific aspects of interface-based programming.
| # | Project | What You’ll Build | Key Concept |
|---|---|---|---|
| 1 | Assert Module | Runtime checking with informative errors | Fail-fast philosophy, preprocessor macros |
| 2 | Mem Module | Checked memory allocation | Wrapping system calls, debugging aids |
| 3 | Arena Module | Bulk memory management | Memory pools, lifetime-based allocation |
| 4 | Atom Module | String interning | Hash tables, pointer identity |
| 5 | Except Module | Exception handling with setjmp/longjmp | Non-local jumps, cleanup handlers |
| 6 | List Module | Functional-style linked lists | Immutable operations, recursion |
| 7 | Table Module | Generic hash tables | Function pointers for polymorphism |
| 8 | Set Module | Mathematical set operations | Generic containers, hash-based lookup |
| 9 | Array Module | Dynamic arrays with bounds checking | Resizable arrays, index safety |
| 10 | Seq Module | Double-ended queues | Efficient push/pop at both ends |
| 11 | Ring Module | Circular buffers | Fixed-size bounded buffers |
| 12 | Bit Module | Bit vector operations | Compact boolean sets, bitwise ops |
| 13 | Fmt Module | Extensible printf | Variadic functions, format strings |
| 14 | Str Module | Low-level string operations | Pointer arithmetic, slices |
| 15 | Text Module | High-level immutable strings | Copy-on-write, position abstraction |
| 16 | XP Module | Extended precision arithmetic | Arbitrary-size unsigned integers |
| 17 | AP Module | Arbitrary precision signed integers | Sign handling, negation |
| 18 | MP Module | Multiple precision fixed integers | Fixed-width large integers |
| 19 | Thread Module | Cooperative threading | setjmp/longjmp coroutines |
| 20 | Integration: Mini Standard Library | Combine all modules into a cohesive library | Library packaging, testing, documentation |
Project 1: Assert Module - Runtime Checking That Actually Helps
- File: P01-ASSERT-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust (for comparison of panic! vs assert), Zig (for comparison of @panic)
- Coolness Level: Level 3 - Genuinely Clever
- Business Potential: Level 1 - Resume Gold
- Difficulty: Level 2 - Intermediate
- Knowledge Area: Error Handling, Macros, Debugging
- Software or Tool: GCC, GDB
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 4
What you’ll build: A custom assertion module that provides informative failure messages, includes file/line/expression information, integrates with debugging tools, and supports both simple checks and formatted diagnostic messages.
Why it teaches the topic: The standard C assert() macro is bare-bones - it tells you an assertion failed but provides minimal context for debugging. By building your own, you’ll understand variadic macros, the preprocessor’s role in debugging, and why assertions are the first line of defense against bugs. Hanson’s assert establishes the pattern used by all other modules.
Core challenges you’ll face:
- Variadic macro design -> Maps to understanding C99 __VA_ARGS__ and C23 __VA_OPT__
- Source location capture -> Maps to understanding __FILE__, __LINE__, __func__
- Conditional compilation -> Maps to understanding NDEBUG and release builds
- Abort vs return -> Maps to understanding when to crash vs when to recover
- Integration with Except -> Maps to understanding how assertion failures can raise exceptions
Real World Outcome
What you will see:
- A header file: assert.h that defines your assertion macros
- An implementation file: assert.c with the failure handler
- Test programs: Demonstrating assertion failures with clear diagnostics
- Integration tests: Showing how assertions catch bugs immediately
Command Line Outcome Example:
# 1. Compile a test program with assertions enabled (default)
$ gcc -std=c11 -g -o test_assert test_assert.c assert.c
# 2. Run a program that violates a simple assertion
$ ./test_assert null_ptr
Assertion failed: ptr != NULL
Expression: ptr != NULL
File: test_assert.c
Line: 42
Function: process_data
Aborted (core dumped)
# 3. Run with a formatted message assertion
$ ./test_assert bounds
Assertion failed: index < size
Expression: index < size
File: test_assert.c
Line: 67
Function: get_element
Message: Array index 100 out of bounds (array size is 50)
Aborted (core dumped)
# 4. The stack trace in GDB shows exactly where the bug is
$ gdb ./test_assert core
(gdb) bt
#0 __GI_abort () at abort.c:79
#1 0x0000555555555234 in Assert_fail (expr=0x555555556012 "ptr != NULL",
file=0x555555556004 "test_assert.c", line=42,
func=0x555555556020 "process_data", ...) at assert.c:28
#2 0x00005555555552a1 in process_data (data=0x0, n=10)
at test_assert.c:42
#3 0x0000555555555312 in main () at test_assert.c:58
# 5. Compile with NDEBUG to disable assertions for release
$ gcc -std=c11 -DNDEBUG -O3 -o test_release test_assert.c assert.c
$ ./test_release null_ptr
# (runs without assertion check - undefined behavior ensues)
# This is why NDEBUG is dangerous if assertions guard against UB!
# 6. Demonstrate the assert vs assert with message
$ ./test_assert demo
Testing simple assert: assert(x > 0)
Testing message assert: assert(x > 0, "x must be positive, got %d", x)
# 7. Compare with standard assert (less informative)
$ gcc -std=c11 -o test_standard test_standard_assert.c
$ ./test_standard
test_standard_assert.c:42: process_data: Assertion `ptr != NULL' failed.
Aborted (core dumped)
# Notice: No custom message, no function context, harder to debug
# 8. Show assertion preventing silent corruption
$ ./memory_test without_assert
# Program corrupts memory, crashes 1000 lines later in unrelated code
$ ./memory_test with_assert
Assertion failed: nbytes > 0
Expression: nbytes > 0
File: memory_test.c
Line: 15
Function: safe_alloc
Message: Cannot allocate non-positive bytes: -42
Aborted (core dumped)
# Bug caught at source, not at symptom
The Core Question You’re Answering
“How can I catch bugs at the moment they occur, with enough context to fix them immediately?”
Before you write any code, understand that assertions are not error handling - they’re bug detection. An assertion failure means the program has a bug that needs to be fixed, not an error condition that needs to be handled at runtime.
The distinction is crucial:
- Errors are expected: File not found, network timeout, user input invalid. These should be handled gracefully - return error codes, throw exceptions, prompt for retry.
- Bugs are not expected: Null pointer dereference, negative array size, violated invariant. These indicate a programmer mistake. The correct response is to crash immediately with maximum diagnostic information.
When you see assert(table != NULL) at the start of Table_put(), it means “if table is NULL here, my program is already broken and continuing would only make things worse.”
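The distinction is easy to see in code. The sketch below is illustrative (read_config is an invented example function, not part of Hanson's library): the assertions guard against programmer mistakes, while the NULL return from fopen is an expected condition the caller handles.

```c
#include <assert.h>
#include <stdio.h>

/* Assertions catch bugs: a NULL argument here is a programming error.
   A missing file is an expected error, reported through the return code. */
int read_config(const char *path, char *buf, size_t bufsize) {
    assert(path != NULL);               /* bug if this ever fires */
    assert(buf != NULL && bufsize > 0); /* caller violated the contract */

    FILE *f = fopen(path, "r");
    if (f == NULL)                      /* expected: handle gracefully */
        return -1;

    size_t n = fread(buf, 1, bufsize - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

A caller recovers from the -1 return; it must never "recover" from a failed assertion.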
Concepts You Must Understand First
Stop and research these before coding:
- Variadic Macros
- What is __VA_ARGS__ and how does it work?
- What is __VA_OPT__ (C23) and why was it added?
- How do you write a macro that takes a variable number of arguments?
- What is the problem with empty __VA_ARGS__ and trailing commas?
- Book Reference: “21st Century C” by Klemens - Ch. 10
- Preprocessor Source Location Macros
- What does __FILE__ expand to?
- What does __LINE__ expand to?
- What does __func__ expand to? (Note: it’s not technically a macro!)
- Why must these be captured in the macro, not the function?
- What happens if you pass __LINE__ to a function?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 4
- The abort() Function
- What does abort() do differently from exit()?
- Why does assertion failure call abort() instead of exit(1)?
- How does abort() interact with debuggers?
- What signals does abort() raise?
- Book Reference: “Expert C Programming” by van der Linden - Ch. 9
- NDEBUG Convention
- What is the NDEBUG macro and who defines it?
- Why do release builds typically disable assertions?
- What are the risks of disabled assertions?
- Should assertions check for conditions that could occur in valid usage?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 4
Questions to Guide Your Design
Before implementing, think through these:
- Macro Interface
- Should your assert take just an expression, or also a message?
- How will you handle assertions with formatted messages (like printf)?
- Should you have two macros: assert(expr) and assertf(expr, fmt, ...)?
- What should the macro expand to when NDEBUG is defined?
- Failure Behavior
- What information should be printed on failure?
- Should output go to stdout or stderr?
- Should you flush output before aborting?
- Should you call a user-definable hook before aborting?
- Integration with Tools
- How can you make it easier to set breakpoints on assertion failures?
- Should you support a “soft assert” that logs but doesn’t abort?
- How will your assert work with sanitizers (ASan, UBSan)?
- Should assertions be able to raise Except_T exceptions?
- Defensive Considerations
- What if the format string in an assertion message is NULL?
- What if an assertion is in a signal handler?
- What about assertions in library code vs application code?
Thinking Exercise
Trace Through an Assertion Failure
Before coding, trace what happens when this code runs:
#define assert(e) \
((void)((e) || (Assert_fail(#e, __FILE__, __LINE__, __func__), 0)))
void process_data(int *data, int n) {
assert(data != NULL);
assert(n > 0);
// ... process data ...
}
int main(void) {
process_data(NULL, 10); // Bug: passing NULL
return 0;
}
Questions while tracing:
- What does #e expand to? (Hint: stringification operator)
- When is Assert_fail called? (Hint: short-circuit evaluation)
- Why is there a , 0 at the end of the macro?
- What type does the whole expression have? (Hint: (void) cast)
- If Assert_fail returns, what happens? (It shouldn’t, but what if?)
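If the stringification step is unclear, this tiny sketch isolates the # operator on its own (STRINGIFY is a hypothetical helper for the exercise, not part of the module):

```c
#include <string.h>

/* The # operator turns the macro argument's tokens into a string literal.
   Whitespace between tokens is normalized to single spaces. */
#define STRINGIFY(e) #e
```

So in the assert macro, #e produces exactly the text that gets printed in the "Assertion failed:" message.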
Now add a formatted message:
#define assertf(e, ...) \
((void)((e) || (Assert_fail(#e, __FILE__, __LINE__, __func__, __VA_ARGS__), 0)))
void get_element(int *array, int size, int index) {
assertf(index >= 0 && index < size,
"Index %d out of bounds [0, %d)", index, size);
return array[index];
}
Additional questions:
- What does Assert_fail need to do with __VA_ARGS__?
- What if someone calls assertf(x > 0) with no message?
- How does __VA_OPT__ solve the trailing comma problem?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What’s the difference between assertions and error handling? Give an example of when you’d use each.”
  Strong answer should include: Assertions are for programmer errors (bugs), not runtime errors. An assertion should never fail if the program is correct. assert(ptr != NULL) catches bugs; if (file == NULL) return ERROR; handles expected failures.
- “Why do release builds typically disable assertions? What are the tradeoffs?”
  Strong answer should include: Performance (no check overhead) and binary size; but if assertions guard against UB, disabling them is dangerous. Some teams keep assertions in production (“crash early” philosophy). Maguire’s “Writing Solid Code” recommends this.
- “How would you implement an assertion macro that prints the failed expression and source location?”
  Strong answer should include: Use #e for stringification, plus __FILE__, __LINE__, __func__. Call a noreturn failure function. Use short-circuit evaluation: (e) || (fail(), 0). Handle variadic messages with __VA_ARGS__.
- “Explain a bug you caught using assertions. How did the assertion help you find it quickly?”
  Strong answer should include: A specific example showing how early detection (at the assertion) saved debugging time versus finding the symptom later. Emphasize the value of the diagnostic information.
- “Some teams use assertions in production. What are the arguments for and against this?”
  Strong answer should include: For: crash early rather than corrupt data, security (some UB is exploitable), debuggability. Against: performance, denial-of-service if assertions can be triggered by input, user experience.
Hints in Layers
Hint 1: Start with the Simplest Version
Don’t worry about formatted messages at first. Implement a basic assert that just prints the expression and aborts:
// Pseudocode - not actual implementation
define assert(expression)
if expression is false
print "Assertion failed:", stringified expression
print "File:", current file, "Line:", current line
call abort()
// The key insight: use short-circuit evaluation
// (expr) || (fail(), 0)
// If expr is true, right side never evaluated
// If expr is false, fail() is called, then 0 returned (but fail never returns)
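The pseudocode above can be turned into a compilable sketch. Names like MY_ASSERT and my_assert_fail are placeholders for this hint, not Hanson's exact interface:

```c
#include <stdio.h>
#include <stdlib.h>

/* Failure handler: prints diagnostics, flushes, then aborts. */
static void my_assert_fail(const char *expr, const char *file,
                           int line, const char *func) {
    fprintf(stderr,
            "Assertion failed: %s\n  File: %s\n  Line: %d\n  Function: %s\n",
            expr, file, line, func);
    fflush(stderr);   /* make sure the message is visible before abort */
    abort();
}

/* Short-circuit: the right side runs only when (e) is false,
   so the expression is evaluated exactly once. */
#define MY_ASSERT(e) \
    ((void)((e) || (my_assert_fail(#e, __FILE__, __LINE__, __func__), 0)))
```

A passing check is a no-op; a failing check prints its location and aborts.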
Hint 2: Add the Failure Handler Function
Separating the failure handling into a function has benefits:
- Keeps the macro small (better debugging)
- Allows setting breakpoints on the function
- Enables variadic arguments for messages
- Can be marked noreturn for better optimization
// Pseudocode structure
function Assert_fail(expression_string, file, line, function_name)
print formatted failure message to stderr
flush stderr (ensure message is visible before abort)
call abort()
// Mark as noreturn for compiler optimization
__attribute__((noreturn))
void Assert_fail(...);
// Macro calls the function
define assert(expression)
if not expression
call Assert_fail with stringified expression and location
Hint 3: Add Formatted Messages with Variadic Macros
The tricky part is handling the optional message. Consider two approaches:
// Approach 1: Two macros
#define assert(e) assert_impl(e, #e, __FILE__, __LINE__, __func__, NULL)
#define assertf(e, fmt, ...) assert_impl(e, #e, __FILE__, __LINE__, __func__, fmt, __VA_ARGS__)
// Approach 2: Single macro with __VA_OPT__ (C23)
#define assert(e, ...) \
((void)((e) || (Assert_fail(#e, __FILE__, __LINE__, __func__ \
__VA_OPT__(,) __VA_ARGS__), 0)))
// Approach 3: GNU extension ##__VA_ARGS__
#define assert(e, ...) \
((void)((e) || (Assert_fail(#e, __FILE__, __LINE__, __func__, ##__VA_ARGS__), 0)))
Hint 4: Handle NDEBUG Correctly
// Pseudocode for conditional compilation
#ifdef NDEBUG
// Option A: Expand to nothing (expression not evaluated)
#define assert(e, ...) ((void)0)
// Option B: Evaluate but don't check (for side effects)
#define assert(e, ...) ((void)(e))
#else
#define assert(e, ...) /* full version */
#endif
// IMPORTANT DESIGN DECISION:
// If the expression has side effects, Option A changes program behavior!
// Example: assert(ptr = malloc(100))
// With Option A: malloc never called in release
// With Option B: malloc called, but result not checked
// Best practice: NEVER put side effects in assertions
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Assert design | “C Interfaces and Implementations” by Hanson | Ch. 4 |
| Macro techniques | “21st Century C” by Klemens | Ch. 10 |
| Defensive programming | “Writing Solid Code” by Maguire | Ch. 2-3 |
| Debugging philosophy | “The Practice of Programming” by Kernighan & Pike | Ch. 5 |
| Preprocessor deep dive | “Expert C Programming” by van der Linden | Ch. 8 |
Common Pitfalls & Debugging
Problem 1: “My assert doesn’t show the right line number”
- Why: You’re calling a function that then uses __LINE__, which gives the function’s line, not the caller’s
- Debug: __LINE__ is expanded at the point the macro is used. If you pass it to another macro, it expands there.
- Fix: The macro must expand __LINE__ and pass the value to the failure function
Problem 2: “The expression is evaluated twice”
- Why: Your macro uses the expression in both the condition and the message
- Debug: assert(x++) increments twice
- Fix: Use short-circuit evaluation: ((e) || (fail(#e, ...), 0)) evaluates e exactly once
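That single-evaluation guarantee can be verified directly. In this sketch, fake_fail is a hypothetical stand-in for the real handler so a failing check counts instead of aborting:

```c
/* Counting stand-in for the failure handler (illustrative only). */
static int fail_count = 0;
static void fake_fail(const char *expr) { (void)expr; fail_count++; }

/* Same short-circuit shape as the real macro: (e) evaluated once. */
#define CHECK(e) ((void)((e) || (fake_fail(#e), 0)))
```

CHECK(++x > 0) increments x exactly once whether the check passes or fails.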
Problem 3: “My formatted message doesn’t work with no arguments”
- Why: __VA_ARGS__ might be empty, leaving a trailing comma
- Debug: assert(x > 0) might expand to Assert_fail(..., )
- Fix: Use __VA_OPT__(,) __VA_ARGS__ in C23, or GNU’s ##__VA_ARGS__, or provide two macro versions
Problem 4: “The debugger doesn’t stop at my assertion”
- Why: The debugger needs a breakable function, not just a macro
- Debug: Set a breakpoint with break Assert_fail in GDB
- Fix: Make sure Assert_fail is a real function (not inlined) in debug builds
Problem 5: “Assertion fires in library code but I want to handle it”
- Why: Assertions should not be used for recoverable errors
- Debug: Consider if the condition is truly a bug or a handleable error
- Fix: If it’s handleable, use the Except module instead (see Project 5)
Project 2: Mem Module - Checked Memory Allocation
- File: P02-MEM-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust (for comparison with ownership model), Zig (for comparison with explicit allocators)
- Coolness Level: Level 4 - Hardcore Tech Flex
- Business Potential: Level 2 - Side Hustle Potential
- Difficulty: Level 3 - Advanced
- Knowledge Area: Memory Management, Debugging, Data Structures
- Software or Tool: GCC, Valgrind, AddressSanitizer
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 5
What you’ll build: A memory allocation wrapper that provides leak detection, double-free detection, use-after-free detection (partial), memory usage statistics, and source location tracking for every allocation. This becomes your debugging superpower for all C projects.
Why it teaches the topic: Raw malloc/free are error-prone because they provide no feedback about misuse. A null return from malloc is almost never checked. Double-free corrupts the heap silently. Use-after-free might work for hours before crashing. By building a checked allocator, you’ll understand memory management internals and create a debugging tool you’ll use in every future C project.
Core challenges you’ll face:
- Allocation tracking -> Maps to maintaining metadata about every allocation
- Leak detection at exit -> Maps to tracking allocations that are never freed
- Double-free detection -> Maps to detecting when the same memory is freed twice
- Pointer validation -> Maps to verifying pointers were allocated by your module
- Performance overhead -> Maps to balancing debugging capability with speed
Real World Outcome
What you will see:
- A header file: mem.h with your allocation API
- An implementation file: mem.c with tracking data structures
- Test programs: Demonstrating leak reports, double-free catches, and statistics
- Memory reports: Showing exactly where your program allocates and leaks memory
Command Line Outcome Example:
# 1. Compile a program with the Mem module in debug mode
$ gcc -std=c11 -g -DMEM_DEBUG -o myapp myapp.c mem.c assert.c
# 2. Run a program with memory leaks
$ ./myapp
... program output ...
=== MEMORY REPORT ===
Total allocations: 1,247
Total deallocations: 1,244
Memory leaked: 3 blocks, 1,248 bytes total
Leaked block 1:
Address: 0x55f8a4b2c010
Size: 1024 bytes
Allocated at: parser.c:142 in parse_expression()
Leaked block 2:
Address: 0x55f8a4b2c420
Size: 128 bytes
Allocated at: parser.c:89 in parse_statement()
Leaked block 3:
Address: 0x55f8a4b2c4b0
Size: 96 bytes
Allocated at: lexer.c:234 in read_token()
# 3. Run a program with a double-free bug
$ ./myapp_buggy double_free
=== MEMORY ERROR ===
Double-free detected!
Address: 0x55f8a4b2c010
Size: 256 bytes
Originally allocated at: main.c:25 in process_data()
First freed at: main.c:42 in cleanup()
Double-free attempted at: main.c:58 in main()
Aborted (core dumped)
# 4. Run a program freeing unallocated memory
$ ./myapp_buggy bad_free
=== MEMORY ERROR ===
Attempted to free untracked pointer!
Address: 0x7ffd45678900
(This address was never allocated by Mem_alloc)
Free attempted at: main.c:30 in buggy_function()
Aborted (core dumped)
# 5. Get allocation statistics
$ MEM_STATS=1 ./myapp
... program output ...
=== ALLOCATION STATISTICS ===
Peak memory usage: 4,587,232 bytes
Current usage: 0 bytes (clean exit!)
Allocation sizes:
1-16 bytes: 523 allocations (42%)
17-64 bytes: 412 allocations (33%)
65-256 bytes: 198 allocations (16%)
257-1024 bytes: 89 allocations (7%)
>1024 bytes: 25 allocations (2%)
Top 5 allocation sites:
parser.c:142 - 234 calls, 239,616 bytes total
lexer.c:89 - 189 calls, 24,192 bytes total
ast.c:56 - 156 calls, 159,744 bytes total
table.c:78 - 98 calls, 6,272 bytes total
main.c:12 - 45 calls, 1,440 bytes total
# 6. Compile without debugging for release (Mem wraps to raw malloc)
$ gcc -std=c11 -O3 -o myapp_release myapp.c mem.c assert.c
$ ./myapp_release
... runs with no tracking overhead, just thin malloc wrappers ...
# 7. Compare with Valgrind (which your module can partially replace)
$ valgrind --leak-check=full ./myapp_valgrind 2>&1 | head -20
==12345== LEAK SUMMARY:
==12345== definitely lost: 1,248 bytes in 3 blocks
# (Same as your Mem module found, but slower and external tool)
The Core Question You’re Answering
“How can I make memory bugs immediately visible instead of silently corrupting my program?”
Before you write any code, understand that memory bugs are insidious because they often don’t crash immediately. A use-after-free might work fine 99% of the time, then suddenly corrupt unrelated data. A memory leak might only matter after hours of uptime. A double-free might appear to work but corrupt the heap allocator’s internal structures.
By tracking every allocation and adding runtime checks, you transform silent corruption into immediate, diagnosable failures. The goal is not just to find bugs, but to find them at the point of error, not three functions and 1000 lines of code later.
Concepts You Must Understand First
Stop and research these before coding:
- malloc/calloc/realloc/free Semantics
- What does each function do and return?
- What happens if malloc fails? (returns NULL)
- What are the rules for realloc’s return value? (can return new pointer, can return NULL, original may be freed)
- What is the behavior of free(NULL)? (defined as no-op)
- What happens if you free the same pointer twice? (undefined behavior)
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 5
- Memory Block Metadata
- How can you store information about an allocation alongside the allocation itself?
- What is a “header” in memory allocator design?
- How do you get from a user pointer back to your metadata?
- What’s the tradeoff between inline headers vs separate tracking?
- Book Reference: “The C Programming Language” by K&R - Section 8.7
- Hash Tables for Fast Lookup
- How do you quickly find metadata for a given pointer?
- What hash function works well for pointers?
- How do you handle collisions?
- Book Reference: “Algorithms” by Sedgewick - Ch. 3.4
- Memory Poisoning
- What is “poisoning” freed memory?
- How does it help detect use-after-free?
- What patterns are commonly used? (0xDEADBEEF, 0xFEEDFACE, 0xABABABAB)
- Book Reference: “Writing Solid Code” by Maguire - Ch. 3
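The poisoning idea can be sketched as a small debug helper. The 0xDD pattern here is an illustrative choice, not a standard:

```c
#include <string.h>

#define POISON_BYTE 0xDD  /* illustrative "dead memory" pattern */

/* A debug-build check can scan a block for the pattern: if user data
   still looks poisoned, someone is probably reading freed memory. */
static int looks_poisoned(const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (p[i] != POISON_BYTE)
            return 0;
    return 1;
}
```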
Questions to Guide Your Design
Before implementing, think through these:
- Metadata Storage
- Will you store metadata in a separate hash table or in a header before each allocation?
- How much metadata do you need? (size, file, line, freed flag, freed location, timestamp?)
- What’s the memory overhead per allocation?
- Can you reduce overhead by storing file names as atoms?
- Tracking Data Structures
- How will you find an allocation’s metadata given only a pointer?
- How will you iterate over all allocations (for leak reports)?
- How will you handle thousands of allocations efficiently?
- What happens when your tracking structures themselves need memory?
- Detection Mechanisms
- How will you detect double-free?
- How will you detect freeing a pointer not from your allocator?
- Should you poison freed memory? With what pattern?
- Should you keep freed blocks around to detect use-after-free?
- Build Configuration
- How will you disable tracking for release builds?
- Should Mem_alloc just become malloc when debugging is off?
- How will you minimize overhead in production?
- Can you make debug/release a compile-time switch?
Thinking Exercise
Design the Metadata Structure
Before coding, design how you’ll track allocations. Consider what the error messages in “Real World Outcome” require:
// What information do you need per allocation?
struct allocation_info {
void *ptr; // The allocated pointer (or derive from position?)
size_t size; // Allocation size (for poisoning, stats)
const char *file; // Source file (consider: store as atom?)
int line; // Line number
const char *func; // Function name (C99 __func__)
int freed; // Has this been freed? (for double-free detection)
const char *free_file; // Where it was freed (for better errors)
int free_line;
// Timestamp? Allocation sequence number? Checksum?
};
Questions while designing:
- How do you get from a user pointer to this metadata?
- How much memory overhead is acceptable? (8 bytes per alloc? 64? 256?)
- What if someone allocates millions of tiny blocks?
- Should metadata be allocated with your allocator? (Bootstrap problem!)
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you implement memory leak detection in a custom allocator?”
  Strong answer should include: Track every allocation in a data structure (hash table). On program exit (atexit handler), iterate and report any not marked as freed. Store source location for useful reports.
- “Explain how you would detect double-free bugs. What data structures would you use?”
  Strong answer should include: Mark freed blocks in metadata. On free, check if already marked. Either keep metadata forever (memory cost) or use tombstones. Hash table by pointer address for O(1) lookup.
- “What is use-after-free and why is it difficult to detect reliably?”
  Strong answer should include: Accessing memory after freeing it. Difficult because: memory may be reallocated (overwriting your poison pattern), the access may appear to work, and detecting reads is harder than writes. Full detection requires tools like ASan.
- “You’re debugging a program that crashes randomly. You suspect a memory bug. Walk me through your approach.”
  Strong answer should include: Run under ASan/Valgrind, look for a pattern in the crashes, add assertions at boundaries, use the Mem module for tracking, and check for uninitialized memory, double-free, and use-after-free.
- “What are the tradeoffs between debug allocators (like your Mem module) and tools like Valgrind or AddressSanitizer?”
  Strong answer should include: Debug allocator: portable, customizable, always available. Valgrind: complete but slow (10-50x). ASan: fast (~2x), thorough, but requires recompilation. Your module: lightweight, can be always-on.
Hints in Layers
Hint 1: Start with a Simple Wrapper
Before adding tracking, just wrap malloc/free with your interface:
// Pseudocode for basic wrapper
function Mem_alloc(size, file, line)
assert(size > 0) // Use your Assert module!
ptr = malloc(size)
if ptr == NULL
RAISE(Mem_Failed) // Or abort with message
return ptr
function Mem_free(ptr, file, line)
assert(ptr != NULL) // Debatable: some allow free(NULL)
free(ptr)
// Convenience macros
#define ALLOC(n) Mem_alloc((n), __FILE__, __LINE__)
#define FREE(ptr) (Mem_free((ptr), __FILE__, __LINE__), (ptr) = NULL)
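As a starting point, that wrapper might be fleshed out like this. The names follow Hanson's style, but this is a sketch, and the exception-raising path is replaced with abort-and-message since the Except module comes later (Project 5):

```c
#include <stdio.h>
#include <stdlib.h>

/* Checked allocation: validates the size and never returns NULL. */
void *Mem_alloc(long nbytes, const char *file, int line) {
    void *ptr;
    if (nbytes <= 0) {
        fprintf(stderr, "%s:%d: invalid allocation size %ld\n",
                file, line, nbytes);
        abort();
    }
    ptr = malloc((size_t)nbytes);
    if (ptr == NULL) {
        fprintf(stderr, "%s:%d: allocation of %ld bytes failed\n",
                file, line, nbytes);
        abort();   /* would RAISE(Mem_Failed) once Except exists */
    }
    return ptr;
}

void Mem_free(void *ptr, const char *file, int line) {
    (void)file; (void)line;   /* tracking uses these later */
    if (ptr)
        free(ptr);            /* tolerate NULL, like free() */
}

/* Convenience macros capture the caller's source location. */
#define ALLOC(n)  Mem_alloc((n), __FILE__, __LINE__)
#define FREE(ptr) ((void)(Mem_free((ptr), __FILE__, __LINE__), (ptr) = NULL))
```

Note that FREE also nulls the pointer, which turns many use-after-free bugs into immediate NULL-dereference crashes.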
Hint 2: Add Allocation Tracking with a Hash Table
Store metadata about each allocation:
// Pseudocode for tracking
global: hash_table mapping pointers to allocation_info
function Mem_alloc(size, file, line)
ptr = malloc(size)
if ptr == NULL: raise exception
info = new allocation_info
info.ptr = ptr
info.size = size
info.file = file // Consider: Atom_string(file) for dedup
info.line = line
info.freed = 0
hash_table_insert(ptr, info)
return ptr
function Mem_free(ptr, file, line)
info = hash_table_lookup(ptr)
if info == NULL:
ABORT("Freeing untracked pointer at file:line")
if info.freed:
ABORT("Double-free detected! Originally allocated at..., first freed at..., now at file:line")
info.freed = 1
info.free_file = file
info.free_line = line
free(ptr)
Hint 3: Add Memory Poisoning
Fill freed memory with a recognizable pattern:
// Pseudocode for poisoning
#define POISON_BYTE 0xDD // "Dead" memory
function Mem_free(ptr, file, line)
info = lookup(ptr)
// ... validation ...
// Poison the memory before freeing
memset(ptr, POISON_BYTE, info.size)
// Option A: Free immediately
free(ptr)
// Option B: Keep in quarantine (better use-after-free detection)
// Don't free yet, add to quarantine list
// Free oldest quarantine entry when list gets too big
Hint 4: Add Leak Reporting at Exit
Use atexit() to report leaks when the program ends:
// Pseudocode for leak reporting
function report_leaks()
leak_count = 0
leak_bytes = 0
for each entry in hash_table:
if not entry.freed:
print "Leaked: address, size, allocated at file:line"
leak_count++
leak_bytes += entry.size
if leak_count > 0:
print "Total: leak_count blocks, leak_bytes bytes leaked"
// In module initialization
static int initialized = 0;
function ensure_init()
if not initialized:
initialized = 1
atexit(report_leaks)
// Call ensure_init() at start of every Mem_ function
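The lazy-initialization-plus-atexit pattern can be shown in miniature; here a live-block counter stands in for the full hash table walk (an assumption made to keep the sketch short):

```c
#include <stdio.h>
#include <stdlib.h>

static long live_blocks = 0;         /* stand-in for per-block metadata */
static int  reporter_registered = 0;

/* Runs automatically at exit; real code would list each leaked block. */
static void report_leaks(void) {
    if (live_blocks > 0)
        fprintf(stderr, "LEAK: %ld block(s) never freed\n", live_blocks);
}

static void ensure_init(void) {
    if (!reporter_registered) {
        reporter_registered = 1;
        atexit(report_leaks);        /* register the reporter exactly once */
    }
}

void *tracked_alloc(size_t n) {
    ensure_init();
    void *p = malloc(n);
    if (p) live_blocks++;
    return p;
}

void tracked_free(void *p) {
    ensure_init();
    if (p) { live_blocks--; free(p); }
}
```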
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Mem module design | “C Interfaces and Implementations” by Hanson | Ch. 5 |
| Allocator internals | “The C Programming Language” by K&R | Section 8.7 |
| Memory debugging | “Writing Solid Code” by Maguire | Ch. 3 |
| Hash tables | “Algorithms” by Sedgewick | Ch. 3.4 |
| Real allocators | “Modern Operating Systems” by Tanenbaum | Ch. 3 |
Common Pitfalls & Debugging
Problem 1: “My leak report shows allocations from my own module”
- Why: The hash table itself uses malloc, which your Mem_alloc might wrap
- Debug: Track whether you’re inside a Mem function to avoid recursion
- Fix: Use raw malloc for internal data structures, not your wrapped version
Problem 2: “I can’t detect all use-after-free bugs”
- Why: If the freed memory is reallocated, the new data overwrites your poison pattern
- Debug: This is a fundamental limitation; you can only detect some cases
- Fix: Keep freed blocks in quarantine longer, or use AddressSanitizer for complete detection
Problem 3: “My hash table lookup is slow with many allocations”
- Why: Poor hash function or too many collisions
- Debug: Profile your lookup time with thousands of allocations
- Fix: Use a better hash function for pointers: (uintptr_t)ptr >> 3 often works well
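That fix can be sketched directly. Pointers returned by malloc are typically at least 8-byte aligned, so the low bits carry no information; shift them away before masking into a bucket index. The bucket count here is an illustrative power of two:

```c
#include <stdint.h>

#define NBUCKETS 1024   /* illustrative; must be a power of two for the mask */

/* Discard the always-zero low bits, then mask into the bucket range. */
static unsigned ptr_hash(const void *p) {
    return (unsigned)(((uintptr_t)p >> 3) & (NBUCKETS - 1));
}
```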
Problem 4: “Memory overhead is too high”
- Why: You’re storing too much metadata per allocation
- Debug: Calculate: overhead = allocations * sizeof(allocation_info)
- Fix: Store file names as atoms, use compact line number storage, drop non-essential fields in release
Problem 5: “Bootstrap problem: how do I allocate the hash table?”
- Why: You need memory to track memory
- Debug: First allocation has nowhere to record itself
- Fix: Use raw malloc for internal structures, OR pre-allocate a fixed tracking array, OR use static initial storage
Project 3: Arena Module - Bulk Memory Management
- File: P03-ARENA-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust (for comparison with bumpalo crate), Go (for comparison with escape analysis), Zig (for custom allocators)
- Coolness Level: Level 4 - Hardcore Tech Flex
- Business Potential: Level 2 - Side Hustle Potential
- Difficulty: Level 3 - Advanced
- Knowledge Area: Memory Management, Compilers, Performance Optimization
- Software or Tool: GCC, Valgrind, perf
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 6
What you’ll build: An arena (or region) allocator that allocates memory from large blocks and frees everything at once. This is how compilers, games, web servers, and high-performance applications manage memory for objects with shared lifetimes.
Why it teaches the topic: Individual malloc/free calls are slow (system call overhead, fragmentation management) and error-prone (forget to free = leak, free twice = corruption). Arenas provide O(1) allocation (just a pointer bump), zero fragmentation within the arena, and no memory leaks possible (you free the entire arena). Understanding arenas is essential for systems programming.
Core challenges you’ll face:
- Block-based allocation -> Maps to understanding how to subdivide large memory chunks
- Alignment requirements -> Maps to ensuring allocations meet CPU alignment requirements
- Block chaining -> Maps to handling when one block fills up
- Free-all-at-once semantics -> Maps to understanding object lifetimes
Real World Outcome
What you will see:
- A header file: arena.h with arena creation, allocation, and disposal
- An implementation file: arena.c with block management
- A demo parser: Using arenas to allocate AST nodes, then freeing all at once
- Performance benchmarks: Showing arena vs malloc/free speed difference
Command Line Outcome Example:
# 1. Compile a parser that uses arena allocation
$ gcc -std=c11 -g -O2 -o parser parser.c arena.c mem.c assert.c
# 2. Parse a large file - watch memory usage and timing
$ ./parser large_source_file.c
Parsing: large_source_file.c (50,000 lines)
Tokens: 892,341
AST nodes created: 234,567
Arena statistics:
Blocks allocated: 47
Block size: 65,536 bytes (64 KB)
Total memory: 3,080,192 bytes (2.94 MB)
Memory used: 2,847,293 bytes (92.4% utilization)
Wasted to alignment: 89,134 bytes (2.9%)
Wasted at block ends: 143,765 bytes (4.7%)
Parse time: 0.342 seconds
Arena dispose time: 0.001 seconds # <-- THIS is the magic!
# 3. Compare with malloc/free approach
$ ./parser_malloc large_source_file.c
Parsing: large_source_file.c (50,000 lines)
Parse time: 0.512 seconds # 50% slower to allocate
Free time: 0.089 seconds # 89x slower to free!
Peak memory: 4,234,567 bytes # More memory due to per-alloc overhead
Arena speedup: 1.5x allocation, 89x deallocation
# 4. Demonstrate nested arenas (for temporary work)
$ ./nested_arena_demo
Creating outer arena (for result)...
Allocated: header structure (64 bytes)
Allocated: result metadata (256 bytes)
Creating inner arena (for scratch space)...
Allocated: temp buffer 1 (4096 bytes)
Allocated: temp buffer 2 (4096 bytes)
Allocated: temp buffer 3 (4096 bytes)
Processing with scratch space...
Disposing inner arena... done (12,288 bytes freed instantly)
Inner arena memory returned, outer arena unaffected!
Allocated: final result (1024 bytes)
Outer arena now has: 1,344 bytes in use
Disposing outer arena... done
# 5. Show zero memory leaks (impossible with arenas used correctly)
$ valgrind --leak-check=full ./parser small_file.c 2>&1 | tail -5
==12345== LEAK SUMMARY:
==12345== definitely lost: 0 bytes in 0 blocks
==12345== indirectly lost: 0 bytes in 0 blocks
==12345== possibly lost: 0 bytes in 0 blocks
==12345== still reachable: 0 bytes in 0 blocks
# 6. Benchmark allocation speed
$ ./arena_benchmark
Allocating 1,000,000 small objects (16-64 bytes each)...
malloc/free:
Allocation time: 0.234 seconds
Free time: 0.198 seconds
Total: 0.432 seconds
Arena:
Allocation time: 0.012 seconds # 19x faster to allocate
Dispose time: 0.001 seconds # 198x faster to free
Total: 0.013 seconds
Arena is 33x faster overall!
# 7. Show arena with calloc-like zeroing
$ ./arena_demo
Arena_alloc(arena, 100): returns uninitialized memory (fast)
Arena_calloc(arena, 10, 10): returns zeroed memory (like calloc)
First 10 bytes: 00 00 00 00 00 00 00 00 00 00
The Core Question You’re Answering
“What if I could allocate memory as fast as incrementing a pointer, and free it all with a single call?”
Before you write any code, understand the key insight: many programs allocate objects that have the same lifetime. Consider these examples:
- Compiler: All AST nodes for one function are created during parsing and freed after code generation
- Web server: All data for one HTTP request is created when the request arrives and freed when the response is sent
- Game engine: All objects for one frame are created during update and freed before the next frame
- Document parser: All nodes for one document are created during parsing and freed when the document is closed
In all these cases, individual free() calls are wasted effort. You’re going to free everything anyway, so why track each allocation individually?
Concepts You Must Understand First
Stop and research these before coding:
- Memory Alignment
- Why must some types start at specific addresses (e.g., doubles at 8-byte boundaries)?
- What is max_align_t and why does it exist?
- How do you round an address up to the next aligned boundary?
- What happens on x86 with misaligned access? (Works, but slower.) On ARM? (May fault!)
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 6
- Bump/Arena Allocation
- What is a “bump allocator” or “linear allocator”?
- How does allocation work with just a pointer increment?
- Why is there no individual free? (The whole point!)
- What’s the relationship between arenas and garbage collection?
- Book Reference: “Crafting Interpreters” by Nystrom - Ch. 14.5
- Memory Blocks and Chaining
- What happens when one block is full?
- How do you chain multiple blocks together?
- What’s the tradeoff in block size? (Big = waste at end, small = more mallocs)
- Should blocks be same size or grow exponentially?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 6
- Lifetime-Based Allocation
- What does it mean for objects to have “the same lifetime”?
- How do nested lifetimes work? (Inner arena freed before outer)
- What happens if you save a pointer from an arena and the arena is disposed?
- Book Reference: “Game Programming Patterns” by Nystrom - “Data Locality”
Questions to Guide Your Design
Before implementing, think through these:
- Block Management
- How big should each block be? (Consider: too small = many mallocs, too big = waste)
- What happens when a single allocation exceeds block size?
- How will you track which blocks belong to which arena?
- Should you grow block sizes as the arena grows?
- Allocation Strategy
- How will you handle alignment for each allocation?
- Should you support zero-sized allocations? (Hanson says no)
- What about realloc equivalent? (Arena_resize?)
- Should you track allocation count/size for debugging?
- Disposal
- When you dispose an arena, do you free the blocks or keep them for reuse?
- How do you support nested arenas?
- Should disposal be immediate or deferred?
- What if someone tries to allocate from a disposed arena?
- API Design
- Should Arena_alloc return NULL on failure or raise exception?
- Should there be Arena_calloc (zeroed allocation)?
- How do you handle the arena running out of memory?
Thinking Exercise
Trace Arena Allocation
Before coding, trace memory layout for these allocations. Assume 16-byte alignment and a 256-byte block:
Arena_T arena = Arena_new(); // Block at 0x1000, size 256
void *a = Arena_alloc(arena, 8); // 8 bytes
void *b = Arena_alloc(arena, 1); // 1 byte
void *c = Arena_alloc(arena, 16); // 16 bytes
void *d = Arena_alloc(arena, 100); // 100 bytes
void *e = Arena_alloc(arena, 200); // 200 bytes - doesn't fit!
Questions while tracing:
- What address is a? (0x1000, aligned)
- What address is b? (0x1008? or 0x1010 with padding?)
- How much space is wasted between allocations for alignment?
- When e is requested and doesn't fit, what happens?
- After e, how many blocks exist?
Trace block chaining:
Block 1 (256 bytes):
0x1000: a (8 bytes)
0x1008: [padding for alignment]
0x1010: b (1 byte)
0x1011: [padding]
0x1020: c (16 bytes)
0x1030: d (100 bytes)
0x1094: [unused: 108 bytes wasted at end]
Block 2 (256 bytes):
0x2000: e (200 bytes)
0x20C8: [unused: 56 bytes available]
The Interview Questions They’ll Ask
Prepare to answer these:
-
“Explain arena allocation and why it’s faster than malloc/free for certain use cases.”
Strong answer should include: Arena = allocate by bumping pointer, free all at once. Fast because: no free list management, no fragmentation, O(1) alloc, O(blocks) free. Use when objects share lifetime.
-
“How would you implement an arena allocator? What data structures would you use?”
Strong answer should include: Linked list of blocks, current pointer and limit in current block. Alloc: align pointer, check fits, bump pointer. New block if full. Dispose: walk list, free each block.
-
“What types of programs benefit most from arena allocation? Give specific examples.”
Strong answer should include: Compilers (AST per function), web servers (per-request), games (per-frame), parsers (per-document), any batch processing where you can identify object lifetimes.
-
“What are the downsides of arena allocation? When would you not use it?”
Strong answer should include: Can’t free individual objects, memory only freed on dispose, not suitable for long-lived objects with different lifetimes, wastes memory at end of blocks.
-
“How do games and compilers use arena allocation? Describe the memory lifecycle.”
Strong answer should include: Game: scratch arena cleared every frame, level arena cleared on level change. Compiler: per-function arena for AST, per-file arena for tokens, global arena for symbol tables.
Hints in Layers
Hint 1: Start with a Single Block
Implement allocation from a single, fixed-size block first:
// Pseudocode for single-block arena
struct Arena_T {
char *avail; // Next available byte
char *limit; // One past last byte in block
};
function Arena_new()
block = malloc(BLOCK_SIZE)
arena.avail = block
arena.limit = block + BLOCK_SIZE
return arena
function Arena_alloc(arena, size)
// Align avail to proper boundary
arena.avail = round_up(arena.avail, ALIGNMENT)
if arena.avail + size > arena.limit
ERROR: out of space! // Handle later
result = arena.avail
arena.avail += size
return result
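The single-block pseudocode above translates almost directly into C. Below is a minimal sketch; the names (arena_new, arena_alloc, arena_dispose) and the fixed BLOCK_SIZE/ALIGNMENT constants are illustrative, not Hanson's exact API, which also threads file/line information through for diagnostics:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096
#define ALIGNMENT  16

typedef struct {
    char *base;   /* start of the block (kept so dispose can free it) */
    char *avail;  /* next free byte */
    char *limit;  /* one past the last byte of the block */
} Arena;

static Arena arena_new(void) {
    Arena a;
    a.base = malloc(BLOCK_SIZE);
    assert(a.base != NULL);
    a.avail = a.base;
    a.limit = a.base + BLOCK_SIZE;
    return a;
}

static void *arena_alloc(Arena *a, size_t size) {
    /* Round avail up to the next ALIGNMENT boundary */
    uintptr_t addr = (uintptr_t)a->avail;
    addr = (addr + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1);
    a->avail = (char *)addr;
    if (a->avail + size > a->limit)
        return NULL;              /* out of space; multi-block handling comes in Hint 2 */
    void *result = a->avail;
    a->avail += size;             /* the "bump" */
    return result;
}

static void arena_dispose(Arena *a) {
    free(a->base);
    a->base = a->avail = a->limit = NULL;
}
```

Note that allocation is just an alignment round-up, a bounds check, and a pointer increment; there is no per-object bookkeeping at all.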
Hint 2: Add Multiple Blocks
Chain blocks together when one fills up:
// Pseudocode for multi-block arena
struct Block {
struct Block *next;
char memory[]; // Flexible array member
};
struct Arena_T {
struct Block *first; // Head of block list
char *avail; // Current allocation pointer
char *limit; // End of current block
};
function Arena_alloc(arena, size)
avail = round_up(arena.avail, ALIGNMENT)
if avail + size > arena.limit
// Allocate new block
block = malloc(sizeof(Block) + max(size, BLOCK_SIZE))
block.next = arena.first
arena.first = block
arena.avail = block.memory
arena.limit = block.memory + max(size, BLOCK_SIZE)
avail = arena.avail
arena.avail = avail + size
return avail
Hint 3: Handle Alignment Properly
Alignment is crucial for correctness:
// Pseudocode for alignment
#define ALIGNMENT (sizeof(union { long l; double d; void *p; long long ll; }))
// Or use alignof(max_align_t) in C11
function round_up(ptr, alignment)
uintptr_t addr = (uintptr_t)ptr
uintptr_t mask = alignment - 1
return (char *)((addr + mask) & ~mask)
// Example: round_up(0x1001, 16) = 0x1010
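The round_up pseudocode above becomes a one-line bit trick in C. This sketch assumes the alignment is a power of two, so that (alignment - 1) forms an all-ones mask:

```c
#include <stdint.h>

/* Round ptr up to the next multiple of alignment.
   Requires alignment to be a power of two. */
static char *round_up(char *ptr, uintptr_t alignment) {
    uintptr_t addr = (uintptr_t)ptr;
    uintptr_t mask = alignment - 1;
    /* Adding the mask pushes addr past any boundary it isn't already on;
       AND-ing with ~mask then clears the low bits back down to the boundary. */
    return (char *)((addr + mask) & ~mask);
}
```

Addresses already on a boundary are unchanged, since adding alignment - 1 doesn't carry past the next multiple.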
Hint 4: Implement Dispose
Dispose frees all blocks at once:
// Pseudocode for dispose
function Arena_dispose(arena_ptr)
arena = *arena_ptr
block = arena.first
while block != NULL
next = block.next
free(block)
block = next
// Set pointer to NULL to catch use-after-dispose
*arena_ptr = NULL
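Putting Hints 2 through 4 together, here is a compact multi-block arena in C. It is a sketch under simplifying assumptions (fixed BLOCK_SIZE, oversized requests get a dedicated block, and the tail of the previous block is abandoned when a new block is chained); the names are illustrative rather than Hanson's:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096
#define ALIGNMENT  16

struct block {
    struct block *next;     /* older blocks in the chain */
    char *limit;            /* one past the end of this block's payload */
    /* payload bytes follow the header */
};

typedef struct {
    struct block *first;    /* most recently added block (head of list) */
    char *avail;            /* bump pointer into the head block */
} Arena;

static char *align_up(char *p) {
    uintptr_t a = (uintptr_t)p;
    return (char *)((a + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1));
}

static void *arena_alloc(Arena *ar, size_t size) {
    char *p = ar->first ? align_up(ar->avail) : NULL;
    if (p == NULL || p + size > ar->first->limit) {
        /* No block yet, or the head block is full: chain a new one,
           sized up for requests larger than BLOCK_SIZE. */
        size_t payload = size > BLOCK_SIZE ? size : BLOCK_SIZE;
        struct block *b = malloc(sizeof *b + payload + ALIGNMENT);
        assert(b != NULL);
        b->next = ar->first;
        b->limit = (char *)(b + 1) + payload + ALIGNMENT;
        ar->first = b;
        p = align_up((char *)(b + 1));
    }
    ar->avail = p + size;
    return p;
}

static void arena_dispose(Arena *ar) {
    struct block *b = ar->first;
    while (b != NULL) {           /* one pass frees everything */
        struct block *next = b->next;
        free(b);
        b = next;
    }
    ar->first = NULL;
    ar->avail = NULL;
}
```

An arena starts as the zero value ({NULL, NULL}); the first allocation lazily creates the first block, and dispose walks the chain exactly as in the pseudocode.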
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Arena design | “C Interfaces and Implementations” by Hanson | Ch. 6 |
| Game allocators | “Game Programming Patterns” by Nystrom | “Data Locality” |
| Compiler memory | “Crafting Interpreters” by Nystrom | Ch. 14.5 |
| Alignment rules | “Effective C, 2nd Edition” by Seacord | Ch. 2 |
| Memory systems | “Computer Systems: A Programmer’s Perspective” | Ch. 9 |
Common Pitfalls & Debugging
Problem 1: “I get crashes on some architectures but not others”
- Why: You’re returning misaligned pointers
- Debug: printf("Allocated at %p (mod %zu = %zu)\n", ptr, (size_t)ALIGN, (size_t)((uintptr_t)ptr % ALIGN))
- Fix: Always align to at least 8 bytes, or use alignof(max_align_t) for maximum portability
Problem 2: “Memory usage keeps growing even after Arena_dispose”
- Why: You’re not freeing the blocks, just resetting pointers
- Debug: Run with Valgrind: valgrind --leak-check=full ./program
- Fix: Actually call free() on each block in dispose, don't just reset pointers
Problem 3: “Large allocations fail even with lots of memory”
- Why: Single allocation larger than block size
- Debug: Check if requested_size > BLOCK_SIZE
- Fix: Allocate a dedicated block for large requests (big enough for that specific allocation)
Problem 4: “Arena seems slower than expected”
- Why: Block size is too small, causing many malloc calls for new blocks
- Debug: Count number of blocks allocated
- Fix: Increase default block size to 64KB-1MB; sweet spot depends on use case
Problem 5: “Use-after-dispose bug is hard to find”
- Why: Pointer to disposed arena memory may still “work” temporarily
- Debug: Poison arena memory before dispose (fill with 0xDD)
- Fix: Set arena pointer to NULL after dispose; consider a sentinel value in the first block
Project 4: Atom Module - String Interning
- File: P04-ATOM-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Python (compare to built-in interning), Java (compare to String.intern()), Lisp (symbols are atoms)
- Coolness Level: Level 4 - Hardcore Tech Flex
- Business Potential: Level 2 - Side Hustle Potential
- Difficulty: Level 3 - Advanced
- Knowledge Area: Hash Tables, String Algorithms, Compilers, Language Implementation
- Software or Tool: GCC, GDB
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 3
What you’ll build: An atom (string interning) module that stores each unique string exactly once and enables O(1) string comparison by pointer equality. Two strings with the same content will always have the same pointer.
Why it teaches the topic: String comparison with strcmp is O(n) - expensive for compilers and interpreters that compare identifiers thousands of times. Atoms convert this to O(1) pointer comparison. Every compiler, database query processor, symbol table, and many applications use this technique. Understanding atoms teaches hash table design, memory efficiency, and a fundamental optimization pattern.
Core challenges you’ll face:
- Hash table implementation -> Maps to understanding hash functions and collision handling
- String storage -> Maps to understanding permanent vs temporary memory
- Thread safety -> Maps to understanding concurrent access (optional advanced topic)
- Memory efficiency -> Maps to understanding when sharing saves memory
Real World Outcome
What you will see:
- A header file: atom.h with interning and lookup functions
- An implementation file: atom.c with hash table and string storage
- A simple interpreter demo: Demonstrating fast identifier lookup using atoms
- Performance comparison: Showing atom comparison vs strcmp
Command Line Outcome Example:
# 1. Compile a simple interpreter using atoms
$ gcc -std=c11 -g -O2 -o interp interp.c atom.c arena.c mem.c assert.c
# 2. Run the interpreter on a file with many identifier references
$ ./interp example_script.lang
Variables defined: 47
Variable references in code: 23,456
Function calls: 8,901
Atom table statistics:
Unique atoms: 312
Total string bytes stored: 3,847
Hash table buckets: 512
Load factor: 0.61
Average chain length: 0.61
Longest chain: 3
Memory saved by sharing: 78,934 bytes (vs storing each occurrence)
# 3. Demonstrate pointer equality for identical strings
$ ./atom_demo
Creating atoms from different sources...
atom1 = Atom_string("hello") -> 0x55f8a4b2c010
atom2 = Atom_string("hello") -> 0x55f8a4b2c010 # SAME pointer!
atom3 = Atom_string("Hello") -> 0x55f8a4b2c020 # Different (case sensitive)
atom4 = Atom_new("hello", 5) -> 0x55f8a4b2c010 # SAME again!
char buffer[] = "hello";
atom5 = Atom_string(buffer) -> 0x55f8a4b2c010 # SAME! String content matters, not location
Pointer equality tests:
atom1 == atom2 : true (same content -> same pointer)
atom1 == atom3 : false (different content -> different pointer)
atom1 == atom4 : true
atom1 == atom5 : true
No strcmp needed! Just compare pointers.
# 4. Show comparison speed difference
$ ./atom_benchmark
Comparing same identifier 10,000,000 times...
Using strcmp():
Time: 1.234 seconds
strcmp calls: 10,000,000
Total character comparisons: ~50,000,000
Using atom pointer equality:
Time: 0.067 seconds
Pointer comparisons: 10,000,000
Atoms are 18.4x faster for comparison!
# 5. Show memory savings
$ ./atom_memory_demo
Storing variable name "username" appearing in code...
Without atoms:
String "username" appears 10,000 times in AST
Memory used: 90,000 bytes (9 bytes * 10,000)
With atoms:
String "username" stored once: 9 bytes
10,000 atom pointers: 80,000 bytes
Total: 80,009 bytes
Immediate savings: 11%
For longer identifiers:
"authenticateUserWithCredentials" (32 chars) appears 10,000 times
Without atoms: 320,000 bytes
With atoms: 80,032 bytes
Savings: 75%!
# 6. Integration with lexer
$ ./lexer_demo source_file.c
Lexing: source_file.c
Token stream (showing atom addresses):
IDENT "int" -> atom@0x1000
IDENT "main" -> atom@0x1008
LPAREN
IDENT "void" -> atom@0x1010
RPAREN
LBRACE
IDENT "int" -> atom@0x1000 # Same atom as first "int"!
IDENT "x" -> atom@0x1018
...
Symbol table can use atom as key directly - O(1) lookup!
# 7. Atom from length (for non-null-terminated strings)
$ ./atom_demo length
Source buffer: "hello world"
Atom_new(buffer, 5) = "hello" -> 0x1000
Atom_new(buffer+6, 5) = "world" -> 0x1008
Atom_length(0x1000) = 5
The Core Question You’re Answering
“How can I compare strings instantly and share identical strings in memory?”
Before you write any code, understand the problem deeply:
When your interpreter sees x = x + 1, it encounters the identifier x twice. Without atoms, you’d:
- Store “x” once for the left side, once for the right side (2 bytes + overhead each)
- Use strcmp to check if they refer to the same variable (O(n) per comparison)
With atoms:
- Store “x” exactly once (1 byte + overhead, once)
- Both references are the same pointer
- Comparison is ptr1 == ptr2 (O(1), a single CPU instruction)
The insight: if two strings have the same content, they should have the same address. This is called "interning": each distinct string is stored once in an internal table, and the pointer into that table becomes the string's identity.
Concepts You Must Understand First
Stop and research these before coding:
- Hash Functions for Strings
- What makes a good hash function for strings?
- What is the djb2 hash function? (Simple, decent quality)
- What is FNV-1a? (Better distribution)
- How do you handle long strings efficiently?
- Book Reference: “Algorithms” by Sedgewick - Ch. 3.4
- Hash Table Collision Handling
- What is separate chaining (linked lists in each bucket)?
- What is open addressing (linear probing, quadratic probing)?
- How do you choose table size?
- When should you resize the table?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 3
- String Interning Concept
- What is “interning” a string?
- Why does interned string comparison become pointer comparison?
- What languages have built-in interning? (Python small strings, Java String.intern(), Lisp symbols)
- What is the relationship between atoms and symbol tables?
- Book Reference: “Compilers: Principles, Techniques, and Tools” by Aho - Ch. 2
- Immutability Requirements
- Why must interned strings be immutable?
- What happens if someone modifies an interned string?
- Where should interned strings be stored? (Not on stack!)
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 3
Questions to Guide Your Design
Before implementing, think through these:
- Hash Table Design
- How many buckets should your initial table have? (Prime number helps)
- When should you grow the table? (Load factor > 0.75?)
- Will you use separate chaining or open addressing?
- What if the caller passes an empty string?
- String Storage
- Where will you store the actual string bytes? (They must be permanent!)
- Can you use an Arena for this? (Yes! Atoms never freed individually)
- How do you handle strings with embedded nulls? (Use length-based storage)
- Should atoms include the length? (Hanson: yes)
- Interface Design
- Atom_string(const char *s) - intern a null-terminated string
- Atom_new(const char *s, int len) - intern a buffer with an explicit length
- Atom_int(long n) - convert an integer to an atom
- Atom_length(const char *atom) - get the length of an atom
- What should the return type be? const char * or a dedicated type?
- Lifetime Management
- Should atoms ever be freed? (Hanson: no, live forever)
- What if you want temporary atoms? (Use arena directly instead)
- How do atoms interact with the program lifecycle?
Thinking Exercise
Trace Hash Table Operations
Before coding, trace these operations with a 4-bucket hash table:
// Assume hash("hello") % 4 = 2
// Assume hash("world") % 4 = 1
// Assume hash("help") % 4 = 2 (collision with "hello")
// Assume hash("hi") % 4 = 3
Atom_string("hello") // Insert "hello" into bucket 2
Atom_string("world") // Insert "world" into bucket 1
Atom_string("help") // Insert "help" into bucket 2 (collision!)
Atom_string("hello") // Find existing "hello" in bucket 2
Atom_string("hi") // Insert "hi" into bucket 3
Draw the hash table after each operation:
Bucket 0: []
Bucket 1: []
Bucket 2: []
Bucket 3: []
Questions while tracing:
- Which bucket does each string go into?
- How do you handle the collision between “hello” and “help”?
- When looking up “hello” the second time, how do you find it?
- What is the chain in bucket 2 after all operations?
The Interview Questions They’ll Ask
Prepare to answer these:
-
“What is string interning and why would you use it?”
Strong answer should include: Store each unique string once, return same pointer for same content. Benefits: O(1) comparison by pointer equality, memory savings when strings repeated. Used in: compilers, databases, interpreters, configuration.
-
“How would you implement a string intern pool? Describe the data structures.”
Strong answer should include: Hash table mapping string content to permanent storage. On insert: hash string, check if exists, if not copy to permanent storage and add to table. On lookup: hash and compare. Return pointer to stored copy.
-
“What hash function would you use for strings? Why?”
Strong answer should include: djb2 (simple, reasonable), FNV-1a (better distribution), or xxHash/MurmurHash (fast for long strings). Key properties: fast, good distribution, handles various string patterns.
-
“What are the memory tradeoffs of string interning? When does it help? Hurt?”
Strong answer should include: Helps when: strings repeated many times, comparisons frequent. Hurts when: many unique strings (hash table overhead), strings used once. Break-even depends on: string length, repetition count, comparison frequency.
-
“How do compilers use string interning? Why is it important for symbol tables?”
Strong answer should include: Identifiers interned during lexing. Symbol table keys are atoms. Scope lookup is O(1) pointer comparison. Memory efficient for repeated identifiers. Keywords often pre-interned.
Hints in Layers
Hint 1: Start with the Hash Function
Choose a simple, well-known hash function:
// Pseudocode for djb2 hash
function hash_string(str, len)
hash = 5381
for i from 0 to len-1:
hash = hash * 33 + str[i]
// Or: hash = ((hash << 5) + hash) + str[i]
return hash
// Why 5381? Dan Bernstein chose it empirically.
// Why 33? Good distribution properties for ASCII text.
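The djb2 pseudocode above as a self-contained C function. This sketch accumulates into a 64-bit integer; a 32-bit variant is identical except that the hash simply wraps modulo 2^32:

```c
#include <stddef.h>
#include <stdint.h>

/* djb2: hash = hash * 33 + c, starting from the magic constant 5381.
   Takes an explicit length so it also works for strings with embedded NULs. */
static uint64_t hash_string(const char *str, size_t len) {
    uint64_t hash = 5381;
    for (size_t i = 0; i < len; i++)
        hash = hash * 33 + (unsigned char)str[i];
    return hash;
}
```

Casting each byte through unsigned char matters: a plain char may be signed, and negative bytes would perturb the hash differently across platforms.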
Hint 2: Implement Basic Hash Table with Chaining
Use separate chaining (linked lists in each bucket):
// Pseudocode for atom table
struct Entry {
struct Entry *next; // Chain for collision
int len; // String length
char str[]; // Flexible array: the actual string
};
struct AtomTable {
int size; // Number of buckets
int count; // Number of atoms stored
struct Entry **buckets; // Array of Entry pointers
};
function Atom_string(s)
return Atom_new(s, strlen(s))
function Atom_new(str, len)
hash = hash_string(str, len)
bucket_index = hash % table.size
// Search bucket for existing atom
for entry in table.buckets[bucket_index]:
if entry.len == len AND memcmp(entry.str, str, len) == 0:
return entry.str // Found it!
// Not found, create new atom
entry = allocate(sizeof(Entry) + len + 1) // +1 for null terminator
entry.len = len
memcpy(entry.str, str, len)
entry.str[len] = '\0'
// Add to bucket (prepend to chain)
entry.next = table.buckets[bucket_index]
table.buckets[bucket_index] = entry
table.count++
return entry.str
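The pseudocode above can be realized as a small working intern pool in C. This sketch fixes the table size (no resizing, which Hint 4 covers) and uses illustrative lowercase names (atom_new, atom_string) rather than Hanson's exact interface:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 256    /* fixed-size table for the sketch; no resizing */

struct entry {
    struct entry *next;   /* collision chain within a bucket */
    size_t len;           /* string length (allows embedded NULs) */
    char str[];           /* flexible array member holding the bytes */
};

static struct entry *buckets[NBUCKETS];

static size_t hash_bytes(const char *s, size_t len) {
    size_t h = 5381;                       /* djb2 */
    for (size_t i = 0; i < len; i++)
        h = h * 33 + (unsigned char)s[i];
    return h;
}

const char *atom_new(const char *str, size_t len) {
    size_t i = hash_bytes(str, len) % NBUCKETS;
    /* Look for an existing atom with the same bytes */
    for (struct entry *e = buckets[i]; e != NULL; e = e->next)
        if (e->len == len && memcmp(e->str, str, len) == 0)
            return e->str;                 /* found: same content, same pointer */
    /* Not found: copy into permanent storage and prepend to the bucket */
    struct entry *e = malloc(sizeof *e + len + 1);
    assert(e != NULL);
    e->len = len;
    memcpy(e->str, str, len);
    e->str[len] = '\0';                    /* NUL so atoms also work with %s */
    e->next = buckets[i];
    buckets[i] = e;
    return e->str;
}

const char *atom_string(const char *s) {
    return atom_new(s, strlen(s));
}
```

Because entries are never freed or moved, the returned pointer is stable for the life of the program, which is exactly what makes pointer equality a valid comparison.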
Hint 3: Store String Length
Hanson stores length to enable Atom_length() and handle embedded nulls:
// Pseudocode for length access
// The length is stored just before the string in the Entry
function Atom_length(atom)
// atom points to the str[] field in Entry
// Entry is: [next][len][str...]
// So length is at atom - sizeof(int) - sizeof(pointer)
// But that's fragile! Better design:
entry_ptr = atom - offsetof(Entry, str)
return entry_ptr->len
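The offsetof approach from the hint, made concrete. The struct entry here mirrors the hint's layout (it is illustrative, not Hanson's exact structure); the key idea is to recover the enclosing entry from the atom pointer without hard-coding field sizes:

```c
#include <stddef.h>

struct entry {
    struct entry *next;
    size_t len;
    char str[];        /* atoms point at this field */
};

/* Step back from the str[] field to the start of the entry,
   letting the compiler compute the offset. */
size_t atom_length(const char *atom) {
    const struct entry *e =
        (const struct entry *)(atom - offsetof(struct entry, str));
    return e->len;
}
```

This only works for pointers that really were returned by the intern pool; passing an arbitrary string is undefined behavior, which is one argument for a dedicated atom type over const char *.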
Hint 4: Consider Table Growth
When the table gets too full, resize:
// Pseudocode for table resizing
function maybe_grow()
if table.count > table.size * 0.75:
new_size = table.size * 2 // Or next prime
new_buckets = allocate(new_size * sizeof(Entry*))
// Rehash all entries
for bucket in old_buckets:
for entry in bucket:
new_index = hash_string(entry.str, entry.len) % new_size
entry.next = new_buckets[new_index]
new_buckets[new_index] = entry
free(old_buckets)
table.buckets = new_buckets
table.size = new_size
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Atom design | “C Interfaces and Implementations” by Hanson | Ch. 3 |
| Hash tables | “Algorithms” by Sedgewick | Ch. 3.4 |
| Symbol tables | “Compilers: Principles, Techniques, and Tools” by Aho | Ch. 2.7 |
| String algorithms | “The Practice of Programming” by Kernighan & Pike | Ch. 3 |
| Hash functions | “Introduction to Algorithms” by CLRS | Ch. 11 |
Common Pitfalls & Debugging
Problem 1: “Different strings are getting the same atom”
- Why: Hash collision AND broken comparison (not checking content after hash match)
- Debug: Print hash values and bucket indices for both strings
- Fix: Make sure you compare both length AND content with
memcmp, not just hash
Problem 2: “Atom lookup is slow”
- Why: Too many collisions (bad hash function or tiny table)
- Debug: Print chain lengths for each bucket, should be mostly 0-3
- Fix: Implement table resizing when load factor > 0.75, or use better hash function
Problem 3: “Memory keeps growing even with repeated strings”
- Why: You’re not finding existing atoms (comparison bug)
- Debug: Print whether each Atom_new creates or finds
- Fix: Verify your comparison uses the right length and memcmp
Problem 4: “Atom addresses change unexpectedly”
- Why: You’re reallocating or moving string storage
- Debug: Atoms must be permanent; the pointer IS the identity
- Fix: Never move or realloc atom storage; use permanent arena or never-freed memory
Problem 5: “Atom_length crashes or returns wrong value”
- Why: Offset calculation to find length is wrong
- Debug: Print entry structure addresses and verify layout
- Fix: Use offsetof() correctly, or store the length in a side table
Project 5: Except Module - Exception Handling in C
- File: P05-EXCEPT-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: C++ (native exceptions for comparison), Java (try/catch/finally), Python (exception handling)
- Coolness Level: Level 5 - Interview Ace
- Business Potential: Level 2 - Side Hustle Potential
- Difficulty: Level 4 - Expert
- Knowledge Area: Error Handling, Control Flow, Stack Management, Non-local Jumps
- Software or Tool: GCC, GDB
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 4
What you’ll build: A portable exception handling system using setjmp/longjmp, providing TRY/EXCEPT/FINALLY macros that enable structured error handling in C without cluttering code with error checks after every function call.
Why it teaches the topic: C’s only built-in error handling is return codes, which have serious problems: they clutter code, are easy to ignore, and make cleanup tedious. Hanson’s Except module provides a cleaner model - separate the error path from the happy path, ensure cleanup happens, and propagate errors without explicit checks. Understanding how this is built with setjmp/longjmp teaches deep lessons about the call stack and program state.
Core challenges you’ll face:
- setjmp/longjmp mechanics -> Maps to understanding non-local jumps and what state is preserved
- Exception context stacks -> Maps to understanding nested TRY blocks
- FINALLY semantics -> Maps to ensuring cleanup happens regardless of how blocks exit
- Stack unwinding limitations -> Maps to understanding what C cannot do (automatic destructors)
- Re-raising exceptions -> Maps to understanding exception propagation
Real World Outcome
What you will see:
- A header file:
except.hwith TRY/EXCEPT/FINALLY/RAISE macros and exception types - An implementation file:
except.cwith the exception context stack - A file processing demo: Showing graceful error handling with proper cleanup
- Nested exception demo: Showing propagation through multiple TRY blocks
Command Line Outcome Example:
# 1. Compile a file processing program with exceptions
$ gcc -std=c11 -g -o file_processor file_processor.c except.c mem.c assert.c
# 2. Run with a valid file - happy path
$ ./file_processor valid_data.txt
Opening file: valid_data.txt
Processing 1,247 records...
Record 1: processed
Record 2: processed
...
Record 1,247: processed
Closing file... done.
Success: All 1,247 records processed.
# 3. Run with a file that has an error in the middle
$ ./file_processor corrupt_data.txt
Opening file: corrupt_data.txt
Processing records...
Record 1: processed
Record 2: processed
Record 3: processed
Record 4: ERROR - Invalid format!
Exception raised: Parse_Error
Message: Expected integer at line 4, column 12, got 'abc'
Raised at: parser.c:89 in parse_record()
FINALLY: Closing file... done.
FINALLY: Freeing buffers... done.
EXCEPT: Error handled, partial results discarded.
Program exiting gracefully after error.
# 4. Demonstrate nested TRY blocks and propagation
$ ./nested_exception_demo
TRY block 1 entered
TRY block 2 entered
TRY block 3 entered
Raising exception from innermost block...
Exception propagating through block 3...
FINALLY block 3: cleanup C done.
Exception propagating through block 2 (not handling this type)...
FINALLY block 2: cleanup B done.
Exception caught in block 1: Deep_Exception
Original location: demo.c:45
Stack depth when raised: 3
FINALLY block 1: cleanup A done.
All cleanup completed, exception handled.
# 5. Demonstrate FINALLY always runs
$ ./finally_demo
Opening 3 resources...
Resource 1: opened
Resource 2: opened
Resource 3: opened
Simulating error during processing...
Exception raised: Process_Error
FINALLY: Closing resource 3... done (even though exception!)
FINALLY: Closing resource 2... done
FINALLY: Closing resource 1... done
EXCEPT: Error logged and handled.
All resources properly closed despite error.
# 6. Show re-raising exceptions
$ ./reraise_demo
Transaction started...
TRY: Executing database operation
Database error: connection timeout!
EXCEPT: Logging error locally... done
EXCEPT: Re-raising for caller to handle...
Caller received propagated exception: Database_Error
Original location: db.c:234 in execute_query()
Message: Connection timeout after 30 seconds
Transaction rolled back at top level.
# 7. Compare code with and without exceptions
$ cat error_handling_comparison.txt
WITHOUT EXCEPTIONS (traditional C):
────────────────────────────────────
FILE *f = fopen(path, "r");
if (f == NULL) { return ERROR_FILE; }
char *buf = malloc(size);
if (buf == NULL) { fclose(f); return ERROR_MEMORY; }
int n = fread(buf, 1, size, f);
if (n < size) { free(buf); fclose(f); return ERROR_READ; }
result = process(buf, n);
if (result < 0) { free(buf); fclose(f); return result; }
free(buf);
fclose(f);
return SUCCESS;
// 4 error checks, cleanup repeated across 4 exit paths, easy to miss one
WITH EXCEPTIONS (Hanson style):
─────────────────────────────────
FILE *f = NULL;
char *buf = NULL;
TRY
f = fopen_or_raise(path);
buf = malloc_or_raise(size);
read_or_raise(f, buf, size);
result = process_or_raise(buf);
FINALLY
if (buf) free(buf);
if (f) fclose(f);
END_TRY
return result;
// Zero explicit error checks, cleanup in one place, impossible to forget
The Core Question You’re Answering
“How can I separate error handling from normal flow, and ensure cleanup always happens?”
Before you write any code, understand the fundamental problem with C’s return-code error handling:
Problem 1: Code Clutter
Every function call that might fail needs an if check. A function with 10 operations needs 10 error checks, obscuring the logic.
Problem 2: Cleanup Duplication If you open a file, allocate memory, then fail on a network call, you must close the file and free the memory. If you add more operations, cleanup code multiplies.
Problem 3: Easy to Ignore
int result = some_function(); compiles fine even if you never check result. Errors silently propagate.
Exceptions solve these by:
- Separating error path from success path
- Centralizing cleanup in FINALLY
- Making errors impossible to ignore (they propagate if not caught)
The tradeoff: C’s setjmp/longjmp exceptions don’t “unwind the stack” like C++. Local variables might have wrong values. Cleanup code you would have run is skipped. You MUST use FINALLY for cleanup, or use arenas.
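To make the mechanism concrete, here is a deliberately minimal TRY/EXCEPT sketch built on setjmp/longjmp: one EXCEPT clause, string exceptions, no FINALLY, no thread safety. It is an illustration of the frame-stack idea, not Hanson's Except module (his version uses Except_T structures, records file/line, and adds ELSE, FINALLY, and RERAISE):

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct except_frame {
    jmp_buf env;                 /* where setjmp saved the context */
    struct except_frame *prev;   /* enclosing TRY block, if any */
    const char *exception;       /* what was raised, if anything */
} ExceptFrame;

static ExceptFrame *except_stack = NULL;   /* innermost active TRY */

#define TRY do {                                        \
        ExceptFrame frame_;                             \
        frame_.prev = except_stack;                     \
        frame_.exception = NULL;                        \
        except_stack = &frame_;                         \
        if (setjmp(frame_.env) == 0) {

#define EXCEPT(e)                                       \
            except_stack = frame_.prev;  /* normal exit: pop */   \
        } else {                                        \
            except_stack = frame_.prev;  /* exception: pop too */ \
            const char *e = frame_.exception;

#define END_TRY } } while (0)

static void RAISE(const char *exception) {
    ExceptFrame *f = except_stack;
    if (f == NULL) {                       /* no active TRY: uncaught */
        fprintf(stderr, "uncaught exception: %s\n", exception);
        abort();
    }
    f->exception = exception;
    longjmp(f->env, 1);                    /* resume at the TRY's setjmp */
}
```

A TRY block pushes a frame, setjmp returns 0 and the body runs; RAISE longjmps back, setjmp now returns 1, and control lands in the EXCEPT branch with the frame already popped so a re-raise propagates to the enclosing TRY.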
Concepts You Must Understand First
Stop and research these before coding:
- setjmp/longjmp Mechanics
- What does setjmp() save? (Registers, stack pointer, program counter)
- What does longjmp() do? (Restore saved state, make setjmp return again with a different value)
- What is a jmp_buf and what's in it?
- What are the rules about local variables after longjmp?
- Can you longjmp after the setjmp function has returned? (NO! Undefined!)
- Book Reference: “Advanced Programming in the UNIX Environment” by Stevens - Ch. 7.10
- What does
- The Volatile Problem
- Why might local variables have “wrong” values after longjmp?
- What does volatile do in this context?
- Which variables need to be volatile?
- Book Reference: C standard section 7.13.2.1
- Exception Context Stacks
- Why do you need a stack of jmp_bufs?
- What happens with nested TRY blocks?
- How do you “pop” when exiting a TRY block normally?
- What if you RAISE but there’s no active TRY?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 4
- FINALLY Semantics
- Why is FINALLY needed if TRY/EXCEPT exist?
- When does FINALLY run? (Always: normal exit, exception, re-raise)
- How do you implement FINALLY with setjmp/longjmp?
- Book Reference: “C Interfaces and Implementations” by Hanson - Ch. 4
Questions to Guide Your Design
Before implementing, think through these:
- Exception Representation
- What information does an exception need? (Type, message, source location?)
- Should exception types be strings, integers, or struct pointers?
- How do you define new exception types?
- Should there be a global “current exception”?
- Macro Design
- What should TRY, EXCEPT, FINALLY, END_TRY expand to?
- How do you handle the syntax: EXCEPT(exception_type) vs EXCEPT_ANY?
- How do you handle exceptions that escape without being caught?
- Can a user have multiple EXCEPT clauses for different types?
- Stack Management
- How do you maintain the stack of TRY blocks?
- What data structure represents an “exception frame”?
- What happens if longjmp is called with no active TRY?
- Is your stack global or per-thread?
- Cleanup Guarantees
- How does FINALLY know whether an exception is active?
- Should FINALLY re-raise the exception automatically?
- What if FINALLY itself raises an exception?
- What if someone does return inside a TRY block?
Thinking Exercise
Trace setjmp/longjmp Behavior
Before coding, trace what happens with this code:
#include <setjmp.h>
#include <stdio.h>
jmp_buf env;
void dangerous_operation(void) {
printf("About to fail\n");
longjmp(env, 1);
printf("This never prints\n"); // Unreachable!
}
void intermediate(void) {
printf("Intermediate start\n");
dangerous_operation();
printf("Intermediate end\n"); // Does this print?
}
int main(void) {
printf("Main start\n");
if (setjmp(env) == 0) {
printf("Normal path\n");
intermediate();
printf("After intermediate\n"); // Does this print?
} else {
printf("Exception path\n");
}
printf("Main end\n");
return 0;
}
Write down the exact output, then verify by running.
Questions while tracing:
- What is the output of this program?
- Which stack frames are abandoned (not returned from)?
- If intermediate allocated memory with malloc, is it leaked?
- Why doesn’t “Intermediate end” print?
- What value does setjmp return initially vs after longjmp?
The Interview Questions They’ll Ask
Prepare to answer these:
-
“Explain how setjmp/longjmp work. What does setjmp save, and what does longjmp restore?”
Strong answer should include: setjmp saves CPU registers, stack pointer, program counter into jmp_buf. Returns 0 initially. longjmp restores this state and makes setjmp return the specified value (never 0). It’s a non-local goto.
-
“How would you implement TRY/CATCH in C? What are the limitations compared to C++?”
Strong answer should include: Use setjmp for TRY, longjmp for throw. Need stack of jmp_bufs for nesting. Limitations: no automatic destructors, local variables may have wrong values, can’t jump after setjmp function returns.
-
“What is the problem with resource cleanup when using longjmp? How would you solve it?”
Strong answer should include: Code between setjmp and longjmp is skipped. malloc’d memory is leaked. Solutions: FINALLY block (always runs), arena allocation (bulk free), or careful design to cleanup before throw.
-
“Why does C++ have destructors and RAII but C doesn’t? What’s the fundamental difference?”
Strong answer should include: C has no concept of object lifetime beyond scope. No way to run code when variable goes out of scope. Stack unwinding in C++ calls destructors. C has no destructors to call. RAII requires language support.
-
“Describe a bug you might encounter when using setjmp/longjmp. How would you avoid it?”
Strong answer should include: Local variable values after longjmp. longjmp after function return (undefined). SIGKILL during longjmp. Solutions: use volatile for variables modified between setjmp/longjmp, never store jmp_buf beyond function.
Hints in Layers
Hint 1: Understand Basic setjmp/longjmp
Before adding macros, understand the raw mechanism:
// Pseudocode for basic exception mechanism
global: jmp_buf exception_point
global: Exception current_exception
function RAISE(exception)
current_exception = exception
longjmp(exception_point, 1)
// Usage:
if setjmp(exception_point) == 0
// Normal execution
call functions that might RAISE
else
// Exception was raised
handle current_exception
Hint 2: Add a Stack of Exception Frames
For nested TRY blocks, you need a stack:
// Pseudocode for exception frame stack
struct Exception_Frame {
jmp_buf env;
Exception_Frame *prev;
const Except_T *exception;
enum { Entered, Handled, Finalized } state;
};
global: Exception_Frame *exception_stack = NULL;
// Entering a TRY block
Exception_Frame frame;
frame.prev = exception_stack;
exception_stack = &frame;
frame.state = Entered;
if setjmp(frame.env) == 0
// TRY body
else
// Exception raised, check if we handle this type
Hint 3: Design the Macros
The macros create the syntactic sugar:
// Pseudocode for macro expansion
#define TRY do { \
volatile int _except_flag; \
Exception_Frame _frame; \
_frame.prev = exception_stack; \
exception_stack = &_frame; \
_except_flag = setjmp(_frame.env); \
if (_except_flag == Except_entered) {
#define EXCEPT(e) \
} if (_except_flag == Except_raised && _frame.exception == &(e)) { \
_except_flag = Except_handled;
#define FINALLY \
} { \
// This block always runs
#define END_TRY \
    } \
    if (_except_flag == Except_raised) RERAISE; \
    if (_except_flag == Except_entered) \
        exception_stack = exception_stack->prev; \
} while(0)
// Note: pop only on normal exit — RAISE (below) already pops the frame
// on the exception path, so an unconditional pop here would pop twice.
Hint 4: Handle Re-raising and Uncaught Exceptions
// Pseudocode for RAISE
#define RAISE(e) do { \
const Except_T *_e = &(e); \
Exception_Frame *_p = exception_stack; \
if (_p == NULL) { \
// No handler! Uncaught exception \
fprintf(stderr, "Uncaught exception: %s\n", _e->reason); \
abort(); \
} \
_p->exception = _e; \
exception_stack = _p->prev; \
longjmp(_p->env, Except_raised); \
} while(0)
// RERAISE: propagate current exception to outer handler
#define RERAISE RAISE(*_frame.exception)
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Except module design | “C Interfaces and Implementations” by Hanson | Ch. 4 |
| setjmp/longjmp | “Advanced Programming in the UNIX Environment” by Stevens | Ch. 7.10 |
| C++ exceptions | “Exceptional C++” by Sutter | Items 8-19 |
| Error handling | “The Practice of Programming” by Kernighan & Pike | Ch. 4 |
| Signal handling | “Computer Systems: A Programmer’s Perspective” | Ch. 8 |
Common Pitfalls & Debugging
Problem 1: “Local variables have wrong values after exception”
- Why: The C standard says automatic variables modified between setjmp and longjmp have indeterminate values unless declared volatile
- Debug: Print variable values before and after the TRY block
- Fix: Declare variables that must survive exceptions as volatile
Problem 2: “FINALLY doesn’t run when exception is raised”
- Why: Your macro structure doesn’t execute FINALLY block on exception path
- Debug: Add printf at start of FINALLY to trace execution
- Fix: FINALLY must be outside the if/else that separates normal from exception path
Problem 3: “Nested exceptions crash the program”
- Why: Frame stack is corrupted or you’re not properly pushing/popping
- Debug: Print stack depth at each PUSH/POP
- Fix: Ensure frames are properly linked; exception_stack = frame.prev in all exit paths
Problem 4: “Memory leaks when exceptions are raised”
- Why: Code between allocation and longjmp is skipped
- Debug: Run with Valgrind; leaks appear when exception is raised
- Fix: Always put cleanup in FINALLY, or use arenas for exception-unsafe code
Problem 5: “longjmp called after function returned”
- Why: jmp_buf stored beyond the function that called setjmp
- Debug: This is undefined behavior; may crash, may corrupt
- Fix: Never save jmp_buf in global or heap after the TRY function returns
Problem 6: “RAISE seems to return to wrong handler”
- Why: Exception stack not properly maintained on normal TRY exit
- Debug: Track exception_stack value at each TRY entry and exit
- Fix: Make sure END_TRY pops the frame even when no exception occurred
Project 6: List Module - Functional-Style Linked Lists
- File: P06-LIST-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 3 - Genuinely Clever
- Difficulty: Level 2 - Intermediate
- Knowledge Area: Data Structures, Functional Programming Patterns
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 7
What you’ll build: A singly-linked list module that follows functional programming conventions - operations create new lists rather than mutating existing ones, enabling safe composition and predictable behavior.
Why it teaches the concept: Hanson’s List module is not your typical textbook linked list. It draws from Lisp and ML traditions, treating lists as values rather than mutable containers. This forces you to think carefully about memory ownership, shared structure, and the difference between imperative and functional design in C.
Core challenges you’ll face:
- Head pointer only - No tail pointer means append is O(n), but that’s intentional
- Shared structure - Multiple lists can share suffixes without copying
- Memory ownership - Who owns what when lists share nodes?
- Functional operations - Implementing map, filter, and fold in C
- The sentinel node pattern - Understanding why Hanson uses a sentinel
The Core Question You’re Answering
“How do you implement functional programming patterns in an imperative language, and what does this teach you about data ownership and sharing?”
Traditional imperative linked lists let you insert and delete nodes anywhere. Hanson’s List is different - it’s closer to Lisp’s cons cells. Operations like push and pop work on the head. Concatenation and reversal create new lists. This design enables structure sharing: two lists can share a common tail without either “owning” it in the traditional sense.
Concepts You Must Understand First
- Singly-Linked List Basics - Book Reference: “Algorithms in C” by Sedgewick - Ch. 3.3-3.4
- Functional vs. Imperative Lists - Book Reference: “Structure and Interpretation of Computer Programs” - Ch. 2.2
- Memory Ownership Patterns - Book Reference: “Mastering Algorithms with C” by Loudon - Ch. 5
- The Apply Pattern (Map/Fold) - What does it mean to “apply” a function to each element?
Real World Outcome
A word frequency counter that uses lists for intermediate storage:
$ ./wordfreq sample.txt
Word Frequency Analysis
=======================
Total words: 18
Unique words: 10
the 4 ████████████
quick 2 ██████
fox 2 ██████
...
$ valgrind --leak-check=full ./wordfreq sample.txt
==12345== All heap blocks were freed -- no leaks are possible
The Interview Questions They’ll Ask
- “What’s the difference between a functional list and a mutable linked list?”
- “If two lists share a tail, how do you manage memory?”
- “Implement a function that reverses a linked list. Time and space complexity?”
- “How would you implement ‘filter’ for a linked list in C?”
- “What’s the time complexity of appending two functional lists?”
Hints in Layers
Hint 1: A list node contains a void* for data and a pointer to the next node. List_T is just a pointer to the first node (or NULL for empty).
Hint 2: List_apply takes a function pointer and a “closure” void* that gets passed to each invocation.
Hint 3: Empty list is NULL, not a dummy node. Many operations need NULL checks.
Hint 4: To reverse, iterate pushing each element onto a new list. O(n) time AND O(n) space.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| The List Module | “C Interfaces and Implementations” - Hanson | Ch. 7 |
| Linked List Fundamentals | “Algorithms in C” - Sedgewick | Ch. 3.3-3.5 |
| Functional Data Structures | “Purely Functional Data Structures” - Okasaki | Ch. 2 |
| C Linked Lists | “Mastering Algorithms with C” - Loudon | Ch. 5 |
Common Pitfalls & Debugging
- “List_free causes double-free” - Structure sharing means you can’t naively free all nodes
- “List_append modifies original” - Must copy all nodes of first list, not mutate
- “List_apply crashes on empty” - Early return if list is NULL
Project 7: Table Module - Generic Hash Tables
- File: P07-TABLE-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 4 - Industry Standard
- Difficulty: Level 3 - Advanced
- Knowledge Area: Data Structures, Generic Programming
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 8
What you’ll build: A generic hash table using client-provided comparison and hash functions to achieve type-agnostic behavior in C.
Why it teaches the concept: The Table module demonstrates using function pointers for polymorphism. The same code stores strings, integers, or structs - all without templates. This is how qsort works and how professional C libraries achieve flexibility.
The Core Question You’re Answering
“How do you build a container in C that works with any data type?”
Options: (1) Separate code per type, (2) void* + function pointers (Hanson’s approach), (3) Macros for type-specific code.
Concepts You Must Understand First
- Hash Function Basics - Book Reference: “Algorithms in C” by Sedgewick - Ch. 14.1-14.3
- Collision Resolution - Book Reference: “Mastering Algorithms with C” by Loudon - Ch. 8
- Function Pointers in C - Book Reference: “The C Programming Language” by K&R - Section 5.11
- Load Factor and Resizing - Amortized O(1) insertion
Real World Outcome
A symbol table for a mini programming language:
$ ./symtable program.mini
Symbol Table Contents:
x : int (line 1)
name : string (line 2)
Table Statistics:
Entries: 5, Buckets: 16, Load factor: 0.31
The Interview Questions They’ll Ask
- “Design a hash table. Time complexity of insert, lookup, delete?”
- “What makes a good hash function?”
- “Compare chaining vs. open addressing.”
- “How would you make this thread-safe?”
- “What happens when you resize?”
Hints in Layers
Hint 1: The Table struct needs: a bucket array, count, size, and compare and hash function pointers.
Hint 2: Each binding has a key (void*), a value (void*), and a next pointer.
Hint 3: Table_put checks whether the key exists, replaces if yes, adds if no, and returns the old value.
Hint 4: Table_map iterates; document that modification during iteration is forbidden.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| The Table Module | “C Interfaces and Implementations” - Hanson | Ch. 8 |
| Hash Table Theory | “Introduction to Algorithms” - CLRS | Ch. 11 |
| Practical Hashing | “Algorithms in C” - Sedgewick | Ch. 14 |
| Hash Functions | “Mastering Algorithms with C” - Loudon | Ch. 8 |
Project 8: Set Module - Mathematical Set Operations
- File: P08-SET-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 3 - Genuinely Clever
- Difficulty: Level 3 - Advanced
- Knowledge Area: Data Structures, Discrete Mathematics
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 9
What you’ll build: A Set module supporting union, intersection, difference, symmetric difference, member testing, and subset relationships.
Why it teaches the concept: Sets appear everywhere: database queries, permissions, tag filtering. Hanson’s Set builds on Table, teaching API layering and code reuse.
The Core Question You’re Answering
“How do mathematical abstractions map to efficient data structure implementations?”
Sets model unordered collections of distinct elements. Operations: membership, union, intersection, difference.
Concepts You Must Understand First
- Set Theory Basics - Union, intersection, difference, symmetric difference
- Sets and Hash Tables - A set is a hash table with only keys
- Operation Complexity - When to iterate smaller vs. larger set
- Subset and Equality - A ⊆ B means every element of A is in B
Real World Outcome
A file diff tool using set operations:
$ ./setdiff file1.txt file2.txt
Only in file1.txt: apple, elderberry
Only in file2.txt: fig, grape
In both files: banana, cherry, date
The Interview Questions They’ll Ask
- “Implement a set data structure. What operations?”
- “Given two large sets, compute intersection efficiently.”
- “Difference between set and hash table?”
- “Represent sparse set of integers 0 to 1 billion?”
- “Implement symmetric difference using other operations.”
Hints in Layers
Hint 1: The simplest Set wraps Table, storing NULL values.
Hint 2: Union: create a new set, add all of A, add all of B.
Hint 3: Intersection: iterate the smaller set, check membership in the larger.
Hint 4: Test with identities: A ∪ A = A, A ∩ ∅ = ∅, A △ A = ∅.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| The Set Module | “C Interfaces and Implementations” - Hanson | Ch. 9 |
| Set Theory | “Discrete Mathematics” - Rosen | Ch. 2 |
| Hash-Based Sets | “Algorithms in C” - Sedgewick | Ch. 14 |
| Set Implementation | “Mastering Algorithms with C” - Loudon | Ch. 8 |
Project 9: Array Module - Dynamic Arrays with Bounds Checking
- File: P09-ARRAY-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 3 - Genuinely Clever
- Difficulty: Level 2 - Intermediate
- Knowledge Area: Data Structures, Memory Safety
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 10
What you’ll build: A dynamic array with automatic resizing and bounds checking on every access.
Why it teaches the concept: Raw C arrays cause buffer overflows. Hanson’s Array shows how to maintain performance while adding runtime safety - the same idea behind Rust’s bounds checking.
The Core Question You’re Answering
“What is the true cost of memory safety, and how do you design APIs that make unsafe operations impossible?”
C’s arr[i] enables cache-friendly access but also enables buffer overflows. Hanson trades a tiny comparison per access for guaranteed safety.
Concepts You Must Understand First
- C Array Fundamentals - Book Reference: “The C Programming Language” by K&R - Ch. 5
- Dynamic Memory and Realloc - Book Reference: “C: A Modern Approach” by King - Ch. 17
- Buffer Overflow Vulnerabilities - Book Reference: “Hacking: The Art of Exploitation” - Ch. 2
- Growth Strategies - Book Reference: “Algorithms in C” by Sedgewick - Ch. 3.4
Real World Outcome
An RPN calculator using your Array as a stack:
$ ./rpn "3 4 + 2 *"
14
$ ./rpn "+"
Error: stack underflow at '+' (need 2 operands, have 0)
$ valgrind --leak-check=full ./rpn "3 4 +"
==12345== All heap blocks were freed -- no leaks are possible
The Interview Questions They’ll Ask
- “Implement a dynamic array. Amortized complexity of push?”
- “Why grow by 2x instead of constant?”
- “How does C’s lack of bounds checking cause vulnerabilities?”
- “Compare Array module to C++ std::vector.”
- “Performance overhead of bounds checking?”
Hints in Layers
Hint 1: The struct needs: a buffer pointer, length, capacity, and element size.
Hint 2: Array_get asserts 0 <= i < length, then returns buffer + i * element_size.
Hint 3: Growth: new_capacity = 2 * capacity; use realloc to preserve data.
Hint 4: void* genericity means typed access is awkward: int *p = Array_get(arr, i).
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| The Array Module | “C Interfaces and Implementations” - Hanson | Ch. 10 |
| Dynamic Arrays | “Algorithms in C” - Sedgewick | Ch. 3.4 |
| Memory Safety | “Effective C” - Seacord | Ch. 5 |
| Amortized Analysis | “Introduction to Algorithms” - CLRS | Ch. 17 |
Project 10: Seq Module - Double-Ended Queues
- File: P10-SEQ-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 3 - Genuinely Clever
- Difficulty: Level 3 - Advanced
- Knowledge Area: Data Structures, Algorithm Design
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 11
What you’ll build: A sequence (deque) module supporting O(1) insertion and removal at both ends using a circular buffer.
Why it teaches the concept: Deques achieve O(1) at both ends with array performance - key for BFS, sliding window problems, and work-stealing schedulers.
The Core Question You’re Answering
“How do you achieve O(1) operations at both ends while maintaining contiguous memory for cache efficiency?”
Array: O(1) back, O(n) front. Linked list: O(1) both but poor cache. Deque: O(1) both with good cache.
Concepts You Must Understand First
- Circular Buffer Concept - Book Reference: “Mastering Algorithms with C” by Loudon - Ch. 6
- Deque Operations - addhi/addlo, remhi/remlo
- Empty vs. Full Detection - Both have head == tail; need count or waste one slot
- Growth Strategy - Book Reference: “Algorithms in C” by Sedgewick - Ch. 4.7
Real World Outcome
Sliding window maximum algorithm - a classic interview problem:
$ ./sliding_max "1 3 -1 -3 5 3 6 7" 3
Result: [3, 3, 5, 5, 6, 7]
Algorithm: O(n) because each element pushed and popped at most once!
$ valgrind --leak-check=full ./sliding_max "1 3 -1 -3 5 3 6 7" 3
==12345== All heap blocks were freed -- no leaks are possible
The Interview Questions They’ll Ask
- “Implement a deque. Operations and complexities?”
- “Explain sliding window maximum. Solve in O(n)?”
- “When use deque vs. array vs. linked list?”
- “Handle wrap-around in circular buffer?”
- “Difference between std::deque and circular buffer?”
Hints in Layers
Hint 1: Store: element array, capacity, length, head index. Compute the tail from head + length.
Hint 2: Element at position i: array[(head + i) % capacity].
Hint 3: Growing: copy in logical order, then set head = 0.
Hint 4: Empty check: length == 0. Full: length == capacity.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| The Seq Module | “C Interfaces and Implementations” - Hanson | Ch. 11 |
| Circular Buffers | “Mastering Algorithms with C” - Loudon | Ch. 6 |
| Deques and Queues | “Algorithms in C” - Sedgewick | Ch. 4.6-4.7 |
| Sliding Window | “Algorithms” - Sedgewick & Wayne | Ch. 1.3 |
Common Pitfalls & Debugging
- “Elements in wrong order” - Be consistent: addhi goes tail side, addlo goes head side
- “Crash after growing” - Copy in logical order (0 to len-1), not physical order
- “Off-by-one in modulo” - Use (h - 1 + cap) % cap for decrement (C modulo can be negative!)
- “Can’t tell empty from full” - Track length separately, not just head/tail
Project 11: Ring Module - Circular Buffers
- File: P11-RING-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3 - Genuinely Clever
- Business Potential: Level 2 - Micro-SaaS / Pro Tool
- Difficulty: Level 3 - Advanced
- Knowledge Area: Data Structures, Embedded Systems, Audio/Video
- Software or Tool: GCC, Valgrind
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 12
What you’ll build: A fixed-size circular buffer with O(1) insertions and deletions at both ends, plus rotation operations.
Why it teaches C Interfaces: Rings demonstrate how fixed-size constraints can enable simpler, faster implementations. Unlike Seq which grows dynamically, Ring has a fixed capacity—this constraint is part of the interface contract.
Core challenges you’ll face:
- Index wrap-around → Understanding modular arithmetic for circular access
- Full vs Empty ambiguity → The classic ring buffer problem (same head/tail state)
- Rotation semantics → Moving elements efficiently without copying
- Memory layout → Contiguous buffer with logical circularity
Real World Outcome
# Build ring module test
$ make ring_test
# Test 1: Basic ring operations
$ ./ring_test basic
Ring created with capacity 8
Added: A B C D E
Ring contents (head to tail): [A, B, C, D, E, _, _, _]
Ring length: 5, capacity: 8
Removed from head: A
Removed from tail: E
Ring contents: [B, C, D, _, _, _, _, _]
Added: F G H I
Ring FULL - cannot add more
Ring contents: [B, C, D, F, G, H, I, _]
Note: FULL at only 7 elements: this ring wastes one slot to distinguish full from empty. Ring reports: length=7, capacity=8
# Test 2: Rotation operations
$ ./ring_test rotation
Ring: [1, 2, 3, 4, 5]
Rotate left by 2: [3, 4, 5, 1, 2]
Rotate right by 1: [2, 3, 4, 5, 1]
# Test 3: Audio buffer simulation
$ ./ring_test audio
Simulating audio buffer (1024 samples, 44100 Hz)
Producer writes 256 samples... OK
Consumer reads 128 samples... OK
Buffer level: 128/1024 (12.5%)
Producer writes 512 samples... OK
Consumer reads 512 samples... OK
Buffer level: 128/1024 (12.5%)
Underrun test: Consumer tries to read 256...
UNDERRUN! Only 128 samples available
The Core Question You’re Answering
“How do you create a bounded buffer that provides O(1) operations at both ends while preventing overflow and underflow?”
This question appears everywhere: audio drivers, network packet buffers, keyboard input queues, and producer-consumer systems.
Concepts You Must Understand First
- Modular Arithmetic
- How does (head + 1) % capacity work?
- What happens when head wraps past capacity?
- Book Reference: “C Interfaces and Implementations” Ch. 12
- The Full/Empty Ambiguity
- If head == tail, is the ring full or empty?
- Three solutions: waste a slot, use a count, use a flag
- Which did Hanson choose and why?
- Producer-Consumer Pattern
- What happens when producer is faster than consumer?
- What about the reverse?
Questions to Guide Your Design
- Capacity: Should capacity be power-of-2 for fast modulo (bitwise AND)?
- Empty slot: Do you waste one slot to distinguish full from empty?
- Growth: Should Ring ever grow, or is fixed-size fundamental to its contract?
- Iteration: How do you iterate from head to tail correctly?
Thinking Exercise
Draw the ring buffer state after these operations on a capacity-4 ring:
add_tail(A) → add_tail(B) → add_tail(C) → remove_head() →
add_tail(D) → add_tail(E) → remove_head() → add_head(X)
Track head, tail, and count at each step. Where does wrap-around occur?
The Interview Questions They’ll Ask
- “Implement a circular buffer for a real-time audio system. What’s your overflow strategy?”
- “How would you implement a ring buffer that’s lock-free for single producer, single consumer?”
- “What’s the difference between a ring buffer and a deque? When would you choose each?”
- “How do you distinguish between a full and empty ring buffer without wasting space?”
- “Design a bounded queue for a keyboard driver. What happens on overflow?”
Hints in Layers
Hint 1: Start with head, tail, and count as your state. Count solves the full/empty ambiguity directly.
Hint 2: For add_tail: store at buffer[tail], then tail = (tail + 1) % capacity. For remove_head: read from buffer[head], then head = (head + 1) % capacity.
Hint 3: Rotation can be O(n) by shifting elements, or O(1) by adjusting head/tail indices. Hanson uses index adjustment—think about what that means.
Hint 4: Test with capacity 1, capacity 2, and alternating add/remove to catch edge cases.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Ring implementation | “C Interfaces and Implementations” by Hanson | Ch. 12 |
| Circular buffer theory | “Algorithms in C” by Sedgewick | Queues section |
| Lock-free rings | “C++ Concurrency in Action” by Williams | Ch. 7 (concepts apply to C) |
Common Pitfalls & Debugging
Problem 1: “Off-by-one errors on wrap-around”
- Why: Confusing < vs <= or pre/post increment
- Fix: Always use modulo: index = (index + 1) % capacity
- Test: Verify add/remove at the capacity boundary
Problem 2: “Ring appears full when empty (or vice versa)”
- Why: Not tracking count separately
- Fix: Maintain explicit count, or waste one slot
- Test: Fill completely, empty completely, repeat
Project 12: Bit Module - Bit Vector Operations
- File: P12-BIT-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++
- Coolness Level: Level 4 - Hardcore Tech Flex
- Business Potential: Level 2 - Micro-SaaS / Pro Tool
- Difficulty: Level 3 - Advanced
- Knowledge Area: Bit Manipulation, Set Theory, Performance Optimization
- Software or Tool: GCC, perf
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 13
What you’ll build: A bit vector (bit array) ADT supporting set/clear/test operations plus set-theoretic operations (union, intersection, difference) on bits.
Why it teaches C Interfaces: Bit vectors are the ultimate space optimization—1 bit per boolean instead of 8 or 32. The interface hides the word-packing complexity while providing clean set operations.
Core challenges you’ll face:
- Bit indexing → Converting bit index to word index + bit position
- Word operations → Applying operations to whole words for speed
- Set operations → Implementing union/intersection/difference on bit arrays
- Portability → Handling different word sizes (32-bit vs 64-bit)
Real World Outcome
$ ./bit_test
# Test 1: Basic bit operations
Created bit vector of 1000 bits
Setting bits: 0, 7, 42, 999
Testing bit 0: SET
Testing bit 1: CLEAR
Testing bit 42: SET
Testing bit 1000: ERROR - index out of range
# Test 2: Counting
Set 100 random bits...
Bit count (popcount): 100
# Test 3: Set operations
Vector A: bits {0, 1, 2, 3, 4}
Vector B: bits {3, 4, 5, 6, 7}
A ∪ B (union): {0, 1, 2, 3, 4, 5, 6, 7}
A ∩ B (intersection): {3, 4}
A - B (difference): {0, 1, 2}
A △ B (symmetric diff): {0, 1, 2, 5, 6, 7}
# Test 4: Bloom filter simulation
$ ./bloom_test
Bloom filter: 10000 bits, 3 hash functions
Adding 1000 words from dictionary...
Testing known words: 1000/1000 found (100%)
Testing random strings: 23/1000 false positives (2.3%)
The Core Question You’re Answering
“How do you store and manipulate millions of boolean values with minimal memory while maintaining fast set operations?”
Databases use bit vectors for bitmap indices. Bloom filters use them for probabilistic membership. Compilers use them for liveness analysis.
Concepts You Must Understand First
- Bit Manipulation in C
- What do &, |, ^, ~, <<, >> do?
- How do you set bit N? Clear bit N? Toggle bit N? Test bit N?
- Book Reference: “The C Programming Language” by K&R - Ch. 2.9
- Word Packing
- If you have 1000 bits, how many 64-bit words do you need?
- How do you find which word contains bit 42?
- How do you find which bit within that word?
- Population Count (popcount)
- How do you count the number of set bits efficiently?
- What’s the Brian Kernighan trick? What about hardware popcnt?
Questions to Guide Your Design
- Word size: Use unsigned long (platform-dependent) or fixed-width uint64_t?
- Bit order: Is bit 0 the LSB or MSB of the first word?
- Length tracking: Store length in bits or words?
- Bounds checking: Check every operation or trust the caller?
Thinking Exercise
Given a 64-bit word and wanting to access bit 42:
- Which word index? (42 / 64 = 0)
- Which bit within word? (42 % 64 = 42)
- Mask to test bit 42: 1ULL << 42 (ULL, since a 32-bit unsigned long would overflow)
Now write the formulas for: set bit N, clear bit N, test bit N.
The Interview Questions They’ll Ask
- “Implement a Bloom filter. What operations does your bit vector need?”
- “How would you find the first set bit in a bit vector efficiently?”
- “Count set bits in a 64-bit integer without a loop.”
- “Implement set intersection on two bit vectors. What’s the complexity?”
- “Design a bitmap index for a database column with 1 billion rows.”
Hints in Layers
Hint 1: Store bits in an array of unsigned long words. Bit N is in word N / BITS_PER_WORD at position N % BITS_PER_WORD.
Hint 2: For set operations on equal-length vectors, just apply bitwise ops word-by-word: result[i] = a[i] | b[i] for union.
Hint 3: For popcount, use __builtin_popcountl() if available, otherwise the Kernighan loop: while(n) { count++; n &= n-1; }.
Hint 4: First set bit: __builtin_ffsl() or manual binary search on words.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Bit vector implementation | “C Interfaces and Implementations” by Hanson | Ch. 13 |
| Bit manipulation tricks | “Hacker’s Delight” by Warren | Throughout |
| Bloom filters | “Algorithms” by Sedgewick & Wayne | Ch. 3.5 |
Project 13: Fmt Module - Extensible Printf
- File: P13-FMT-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: None (C-specific)
- Coolness Level: Level 4 - Hardcore Tech Flex
- Business Potential: Level 1 - Resume Gold
- Difficulty: Level 4 - Expert
- Knowledge Area: String Formatting, Variadic Functions, Extensibility
- Software or Tool: GCC
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 14
What you’ll build: An extensible printf-like formatting system where you can register custom format specifiers.
Why it teaches C Interfaces: Fmt shows how to design an extensible interface—users can add new behaviors without modifying library code. This is the Open/Closed Principle in C.
Core challenges you’ll face:
- Variadic argument handling → Using va_list and va_arg correctly
- Format string parsing → Handling width, precision, flags
- Extensibility design → Registering and dispatching custom converters
- Output abstraction → Writing to strings, files, or callbacks
Real World Outcome
$ ./fmt_test
# Test 1: Standard conversions
Fmt_print("Integer: %d, String: %s\n", 42, "hello")
Output: Integer: 42, String: hello
# Test 2: Width and precision
Fmt_print("|%10d|%-10d|%010d|\n", 42, 42, 42)
Output: |        42|42        |0000000042|
Fmt_print("|%.5s|%10.5s|\n", "hello world", "hello world")
Output: |hello|     hello|
# Test 3: Custom converter - dates
Registering 'D' for Date type...
Fmt_print("Today: %D\n", &today)
Output: Today: 2024-12-28
# Test 4: Custom converter - IP addresses
Registering 'I' for IP address...
Fmt_print("Server: %I\n", &ip_addr)
Output: Server: 192.168.1.1
# Test 5: Output to string
char buffer[100];
Fmt_string(buffer, sizeof(buffer), "Name: %s, Age: %d", "Alice", 30)
Buffer contains: "Name: Alice, Age: 30"
# Test 6: Output to file
Fmt_fprint(stderr, "Error code: %d\n", errno)
The Core Question You’re Answering
“How do you design a formatting system that’s as convenient as printf but allows users to add new format specifiers without modifying the library?”
This is about extensible design—making your interface open for extension but closed for modification.
Concepts You Must Understand First
- Variadic Functions in C
- How do va_start, va_arg, and va_end work?
- Why must you know the type of each argument in advance?
- Book Reference: “The C Programming Language” by K&R - Ch. 7.3
- Printf Format Specifications
- What do %, flags, width, precision, and the conversion specifier mean?
- How does %10.5s work?
- Function Pointers for Extensibility
- How do you store and call functions dynamically?
- What’s the converter function signature?
Questions to Guide Your Design
- Registration: How do users register a custom converter? What’s the function signature?
- Dispatch: How do you look up the converter for a given character?
- Context: What information does a converter need? (output function, flags, width, precision)
- Output: How do you abstract output to work with files, strings, and callbacks?
The Interview Questions They’ll Ask
- “Design a logging library with custom format specifiers. How do you handle extensions?”
- “Implement printf for an embedded system with no heap. What changes?”
- “How does printf know the types of its arguments?”
- “What’s wrong with printf from a type-safety perspective? How do languages fix this?”
- “Implement a format string vulnerability detector.”
Hints in Layers
Hint 1: Create a table mapping characters to converter functions: converters['d'] = int_converter.
Hint 2: Each converter has signature: void converter(int code, va_list *ap, int (*put)(int, void*), void *cl, int flags, int width, int precision).
Hint 3: The output function put(char, client_data) abstracts where output goes. For strings, client_data tracks buffer position.
Hint 4: Parse format string character by character. On %, parse flags/width/precision, then dispatch to converter.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Fmt implementation | “C Interfaces and Implementations” by Hanson | Ch. 14 |
| Variadic functions | “The C Programming Language” by K&R | Ch. 7.3 |
| Printf internals | “Expert C Programming” by van der Linden | Ch. 5 |
Project 14: Str Module - Low-Level String Operations
- File: P14-STR-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust (for comparison)
- Coolness Level: Level 3 - Genuinely Clever
- Business Potential: Level 1 - Resume Gold
- Difficulty: Level 3 - Advanced
- Knowledge Area: String Processing, Memory Efficiency
- Software or Tool: GCC, Valgrind
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 15
What you’ll build: A string module with position-based indexing (1-based, with negative indices from end) that operates on substrings without allocation.
Why it teaches C Interfaces: Str shows how to design an interface that’s safer and more convenient than raw C strings while remaining efficient. The position indexing prevents off-by-one errors.
Core challenges you’ll face:
- Position semantics → 1-based indexing, negative indices from end
- Substring without copy → Returning pointers into existing strings
- Length tracking → Explicit length vs null-termination
- Conversion → Interoperating with standard C strings
Real World Outcome
$ ./str_test
# Test 1: Position-based indexing (1-based!)
s = "Hello, World!"
Str_sub(s, 1, 5) → "Hello" # positions 1-5
Str_sub(s, 8, 0) → "World!" # position 8 to end (0 = end)
Str_sub(s, -6, 0) → "World!" # -6 = 6th from end
# Test 2: Negative indices
Str_sub(s, -1, -1) → "!" # last character
Str_sub(s, 1, -2) → "Hello, World" # all but last
# Test 3: No allocation - returns view
char *sub = Str_sub(s, 1, 5, &len);
# sub points into s, no malloc!
# len = 5
# Test 4: Character operations
Str_chr(s, 1, 0, 'o') → 5 # first 'o' at position 5
Str_rchr(s, 1, 0, 'o') → 9 # last 'o' at position 9
# Test 5: Comparison
Str_cmp(s, 1, 5, t, 1, 5) → 0 # compare substrings
The Core Question You’re Answering
“How do you design a string interface that prevents the off-by-one and buffer overflow errors that plague C string code?”
Str’s 1-based, end-inclusive positions and explicit lengths make correct code easier to write.
Concepts You Must Understand First
- C String Problems
- Why is strlen() O(n)?
- What causes buffer overflows in strcpy()?
- Why are off-by-one errors so common with 0-based indexing?
- Position Semantics
- Position 1 = first character
- Position -1 = last character
- Position 0 = after last character (useful as endpoint)
- String Views vs Owned Strings
- What’s the difference between pointing into a string and copying it?
- What are the lifetime implications?
Questions to Guide Your Design
- Validation: What happens if positions are out of range?
- Null termination: Does Str guarantee null-terminated results?
- Mutability: Can Str operations modify the source string?
- Integration: How do you convert between Str and char*?
The Interview Questions They’ll Ask
- “Design a string library that prevents buffer overflows. What’s your approach?”
- “Implement substring search. What algorithm would you use and why?”
- “Compare string views (Rust/C++) to null-terminated strings. Trade-offs?”
- “How would you implement Unicode support in a string library?”
- “What’s the complexity of your substring operation? Does it allocate?”
Hints in Layers
Hint 1: Internal representation: { char *s; int len; } - pointer plus length, not null-terminated.
Hint 2: Convert positions: if pos > 0, use pos - 1 as index. If pos < 0, use len + pos as index. If pos == 0 at end, use len.
Hint 3: Substring returns a pointer into the original plus a length - no allocation needed.
Hint 4: For output to C functions, you may need to copy and null-terminate. Provide a Str_dup() for this.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Str implementation | “C Interfaces and Implementations” by Hanson | Ch. 15 |
| String algorithms | “Algorithms in C” by Sedgewick | String processing |
| Safe string handling | “Secure Coding in C and C++” by Seacord | Ch. 2 |
Project 15: Text Module - High-Level Immutable Strings
- File: P15-TEXT-MODULE.md
- Main Programming Language: C
- Alternative Programming Languages: Java (String), Rust (str)
- Coolness Level: Level 3 - Genuinely Clever
- Business Potential: Level 1 - Resume Gold
- Difficulty: Level 3 - Advanced
- Knowledge Area: Immutable Data Structures, String Processing
- Software or Tool: GCC, Valgrind
- Main Book: “C Interfaces and Implementations” by David Hanson - Ch. 16
What you’ll build: An immutable string abstraction where concatenation and substring create new strings, never modifying originals.
Why it teaches C Interfaces: Text demonstrates how to implement immutable value semantics in C. Immutability eliminates aliasing bugs and enables safe sharing.
Core challenges you’ll face:
- Immutability enforcement → No operation modifies existing Text
- Memory management → Who frees Text objects?
- Efficient concatenation → Avoiding O(n^2) repeated appends
- Boxing/unboxing → Converting between Text and char*
Real World Outcome
$ ./text_test
# Test 1: Creation
t1 = Text_put("Hello")
t2 = Text_put(" World")
# Both are now immutable Text values
# Test 2: Concatenation creates new Text
t3 = Text_cat(t1, t2)
# t1 is still "Hello"
# t2 is still " World"
# t3 is "Hello World"
# Test 3: Substring
t4 = Text_sub(t3, 1, 5) # "Hello"
# t3 unchanged
# Test 4: Comparison
Text_cmp(t1, t4) → 0 # equal (both "Hello")
# Test 5: Interop with C strings
char *s = Text_get(NULL, 0, t3); # extract as C string
# s = "Hello World\0"
# (caller must free)
# Test 6: Memory efficiency
t5 = Text_put("Hello")
# t5 may share storage with t1 (same content)
# Test 7: Text in data structures
Table_put(table, Text_box("key"), Text_box("value"));
# Text values can be stored directly
The Core Question You’re Answering
“How do you implement immutable strings in a mutable language like C, and why would you want to?”
Immutability eliminates a class of bugs (aliasing, unexpected mutation) and enables optimization (sharing, caching).
Concepts You Must Understand First
- Immutability
- What does it mean for a value to be immutable?
- Why does Java’s String class work this way?
- What bugs does immutability prevent?
- Value vs Reference Semantics
- When you assign t2 = t1, do they share or copy?
- How do you implement value semantics in C?
- String Interning (Atoms) Relationship
- Could Text reuse Atom’s string table?
- What’s the difference between Atom and Text?
Questions to Guide Your Design
- Storage: How do you store immutable strings? (Arena? Reference counting?)
- Sharing: When two Texts have the same content, do they share storage?
- Lifetime: How long does a Text live? Who decides?
- Efficiency: How do you avoid O(n^2) for building strings incrementally?
The Interview Questions They’ll Ask
- “Why are strings immutable in Java/Python? What are the trade-offs?”
- “Implement efficient string concatenation for building large strings.”
- “Compare immutable strings to StringBuilder. When use each?”
- “Design a text editor buffer. Would you use immutable or mutable strings?”
- “How would you implement copy-on-write strings?”
Hints in Layers
Hint 1: Text is a struct containing a pointer and length. All operations return new Text values.
Hint 2: Use an arena or pool allocator for Text storage. Texts allocated in same arena share lifetime.
Hint 3: For building strings, provide Text_buffer that collects pieces, then Text_buffer_finish to create immutable result.
Hint 4: Consider rope data structure for efficient concatenation (tree of string pieces).
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Text implementation | “C Interfaces and Implementations” by Hanson | Ch. 16 |
| Immutable data structures | “Purely Functional Data Structures” by Okasaki | Ch. 1-2 |
| Rope data structure | “Algorithms” by Sedgewick & Wayne | String algorithms |
Common Pitfalls & Debugging (Projects 11-15)
Problem: Ring buffer gives wrong results after wraparound
- Verify: (head + 1) % capacity used consistently
- Test: Fill to capacity-1, add one more, remove one
Problem: Bit operations affect wrong bits
- Verify: Using 1UL << n, not 1 << n, for 64-bit words
- Test: Set bit 63, verify bit 0 unchanged
Problem: Fmt custom converter crashes
- Verify: va_arg() called with the correct type
- Test: Pass a wrong type intentionally to see the symptoms
Problem: Str negative indices off by one
- Verify: -1 means last char, not past end
- Test: Str_sub(s, -1, -1) should return the last character only
Problem: Text memory grows unbounded
- Verify: Using arena and deallocating periodically
- Test: Create/discard millions of Text values, monitor memory
Project 16: XP Module - Extended Precision Unsigned Arithmetic
File Reference
- File: P16-XP-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 4 - Hardcore Tech Flex
- Difficulty: Level 4 - Expert
- Estimated Time: 2-3 weeks
- Knowledge Area: Arbitrary Precision Math, Cryptography Foundations
- Main Book: “C Interfaces and Implementations” by David Hanson - Chapter 17
What You Will Learn
This project takes you into the world of numbers that exceed what hardware can natively represent. When you need to compute RSA keys, hash astronomical datasets, or calculate factorials of large numbers, 64-bit integers simply cannot hold the results. The XP module teaches you how to represent and manipulate unsigned integers of arbitrary length.
Core Concepts
Why Extended Precision Matters
Modern cryptography operates on numbers with hundreds or thousands of digits. A 2048-bit RSA key is a 617-digit decimal number. Your CPU’s largest native integer (typically 64 bits) maxes out at about 20 decimal digits. The gap between hardware capability and cryptographic requirements created the field of arbitrary precision arithmetic.
Hardware Limit (64-bit): 18,446,744,073,709,551,615
2048-bit RSA modulus: A number with ~617 decimal digits
1000!: A number with 2,568 decimal digits
Representing Large Numbers
The fundamental insight is storing numbers as arrays of smaller units (typically 32-bit or 64-bit words), similar to how we write decimal numbers as sequences of digits:
Extended Precision Number Representation
────────────────────────────────────────────────────────────────────────────────
Traditional decimal: 1,234,567,890
^ ^ ^
| | |
Digit Digit Digit
positions
Extended precision (base 2^32):
Number: 0x123456789ABCDEF0123456789ABCDEF0
+-----------+-----------+-----------+-----------+
| Word[3] | Word[2] | Word[1] | Word[0] |
| 0x12345678| 0x9ABCDEF0| 0x12345678| 0x9ABCDEF0|
+-----------+-----------+-----------+-----------+
^ ^ ^ ^
| | | |
Most Significant Least Significant
Value = Word[3] * (2^32)^3 + Word[2] * (2^32)^2 +
Word[1] * (2^32)^1 + Word[0] * (2^32)^0
Addition with Carry Propagation
Adding two XP numbers requires handling carries that ripple from least significant to most significant words:
XP Addition with Carry
────────────────────────────────────────────────────────────────────────────────
a: [0xFFFFFFFF] [0x00000001] [0x00000000]
+ b: [0x00000001] [0xFFFFFFFF] [0x00000001]
-------------------------------------------
Step 1: Add Word[0]s
0x00000000 + 0x00000001 = 0x00000001, carry = 0
Step 2: Add Word[1]s + carry
0x00000001 + 0xFFFFFFFF + 0 = 0x100000000
Result = 0x00000000, carry = 1 (overflow!)
Step 3: Add Word[2]s + carry
0xFFFFFFFF + 0x00000001 + 1 = 0x100000001
Result = 0x00000001, carry = 1
Final: [0x00000001] [0x00000000] [0x00000001] + overflow bit
Subtraction with Borrow
Subtraction mirrors addition but propagates borrows instead of carries:
XP Subtraction (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function xp_subtract(a[], b[], result[], n):
borrow = 0
for i = 0 to n-1:
diff = a[i] - b[i] - borrow
if diff underflows (negative):
diff = diff + BASE // BASE = 2^32
borrow = 1
else:
borrow = 0
result[i] = diff
return borrow // Non-zero means a < b
Multiplication Algorithms
The simplest approach is grade-school multiplication, where each word of one operand multiplies every word of the other:
Grade-School Multiplication
────────────────────────────────────────────────────────────────────────────────
Multiplying 3-word numbers:
a[2] a[1] a[0]
x b[2] b[1] b[0]
--------------------------------
a*b[0] (partial product 0)
+ a*b[1] (partial product 1, shifted)
+ a*b[2] (partial product 2, shifted twice)
--------------------------------
Each a*b[i] produces a 4-word result (3 words * 1 word = 4 words max)
Time Complexity: O(n^2) multiplications
For very large numbers, the Karatsuba algorithm reduces multiplications at the cost of more additions:
Karatsuba Insight
────────────────────────────────────────────────────────────────────────────────
Traditional: (a1*B + a0) * (b1*B + b0) requires 4 multiplications:
a1*b1, a1*b0, a0*b1, a0*b0
Karatsuba trick:
z2 = a1 * b1
z0 = a0 * b0
z1 = (a1 + a0) * (b1 + b0) - z2 - z0
Result = z2*B^2 + z1*B + z0
Only 3 multiplications!
Time Complexity: O(n^1.585)
Division Algorithm
Division is the most complex operation, typically implemented as long division with estimation:
XP Division (Simplified Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function xp_divide(dividend[], divisor[], quotient[], remainder[]):
// Normalize: shift divisor so high bit is set
shift = count_leading_zeros(divisor[high])
dividend = shift_left(dividend, shift)
divisor = shift_left(divisor, shift)
for i = high_word down to 0:
// Estimate quotient digit
q_estimate = (dividend[i+1] * BASE + dividend[i]) / divisor[high]
// Refine estimate (may be off by 1 or 2)
while q_estimate * divisor > remaining_dividend:
q_estimate -= 1
quotient[i] = q_estimate
dividend -= q_estimate * divisor * BASE^i
remainder = shift_right(dividend, shift) // Unnormalize
Why Cryptography Needs This
RSA encryption computes: ciphertext = message^e mod n
Where n is a 2048-bit (or larger) number. This operation requires:
- Extended precision representation (XP module)
- Modular multiplication (repeated squaring)
- Modular exponentiation (square-and-multiply algorithm)
Without XP arithmetic, modern public-key cryptography would be impossible.
Guiding Questions
- Why do we use base 2^32 instead of base 10 for XP numbers?
- What’s the maximum size of the product of two n-word numbers?
- Why does Karatsuba become beneficial only for very large numbers?
- How does the choice of word size affect carry/borrow detection?
- Why is division so much harder than multiplication?
Hints and Progression
Hint 1 - Starting Point: Begin with addition and subtraction. Use a simple array of uint32_t values. The carry/borrow is just a single bit you track.
Hint 2 - Multiplication Foundation: For multiplication, the product of two n-word numbers is at most 2n words. Allocate appropriately.
Hint 3 - Handling Overflow: When multiplying two 32-bit numbers, the result can be 64 bits. Use 64-bit intermediate results, then split into word and carry.
Hint 4 - Division Strategy: Start with simple cases (single-word divisor) before implementing full multi-word division.
Hint 5 - Testing Approach: Test with numbers you can verify manually. 0xFFFFFFFF + 1 should give 0x100000000 (two words).
Real World Outcome
Computing factorials of large numbers (1000!)
After completing this project, you can calculate 1000! - a number with 2,568 decimal digits. This computation is impossible with native integer types but straightforward with your XP module.
Your implementation should:
- Accept an integer n as input
- Compute n! using your XP multiplication
- Convert the result to decimal for display
- Handle numbers up to 10,000! or beyond
Interview Preparation
Conceptual Questions:
- Explain how you’d add two 256-bit numbers using 64-bit registers
- What’s the time complexity of grade-school multiplication? Can you do better?
- How would you implement comparison of two XP numbers?
Coding Challenges:
- Implement XP addition with carry propagation
- Write a function to convert XP number to decimal string
- Implement XP multiplication using the grade-school algorithm
Book References
- Primary: Hanson, Ch. 17 - XP module implementation
- Algorithm Background: Knuth, “The Art of Computer Programming” Vol. 2, Seminumerical Algorithms
- Cryptographic Context: Schneier, “Applied Cryptography”
Project 17: AP Module - Arbitrary Precision Signed Integers
File Reference
- File: P17-AP-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 4 - Hardcore Tech Flex
- Difficulty: Level 4 - Expert
- Estimated Time: 2-3 weeks
- Knowledge Area: Signed Arithmetic, Mathematical Computing
- Main Book: “C Interfaces and Implementations” by David Hanson - Chapter 18
What You Will Learn
The XP module handles unsigned numbers beautifully, but mathematics requires negative numbers too. The AP module extends XP to handle signed integers of arbitrary size, enabling you to build a complete mathematical computing system.
Core Concepts
Sign-Magnitude Representation
Unlike hardware integers that typically use two’s complement, the AP module uses sign-magnitude representation:
Sign-Magnitude vs Two's Complement
────────────────────────────────────────────────────────────────────────────────
Sign-Magnitude (AP style):
+42 = { sign: +, magnitude: [0x2A] }
-42 = { sign: -, magnitude: [0x2A] }
Same magnitude, different sign bit
Two's Complement (hardware style):
+42 = 0x0000002A
-42 = 0xFFFFFFD6 (bitwise NOT + 1)
Completely different bit patterns
AP Structure:
+---------+------------------------+
| Sign | XP Magnitude |
| (1 bit) | (array of words) |
+---------+------------------------+
Why Sign-Magnitude for Arbitrary Precision?
Two’s complement requires a fixed number of bits. The “sign bit” is the most significant bit, which means the representation changes if you add more bits. For arbitrary precision:
Two's Complement Extension Problem
────────────────────────────────────────────────────────────────────────────────
-1 in 8 bits: 0xFF
-1 in 16 bits: 0xFFFF
-1 in 32 bits: 0xFFFFFFFF
To extend, you must know the sign and pad with 1s.
But arbitrary precision means "we don't know how many words yet."
Sign-magnitude solves this:
-1 = { sign: -, magnitude: [1] }
Adding more words just means adding leading zeros to magnitude.
No sign-dependent padding required.
Arithmetic with Signed Numbers
Operations must consider sign combinations:
AP Addition (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function ap_add(a, b):
if a.sign == b.sign:
// Same signs: add magnitudes, keep sign
result.sign = a.sign
result.magnitude = xp_add(a.magnitude, b.magnitude)
else:
// Different signs: subtract smaller from larger
cmp = xp_compare(a.magnitude, b.magnitude)
if cmp > 0:
result.sign = a.sign
result.magnitude = xp_sub(a.magnitude, b.magnitude)
elif cmp < 0:
result.sign = b.sign
result.magnitude = xp_sub(b.magnitude, a.magnitude)
else:
result = ZERO // Equal magnitudes, opposite signs
return result
Sign Handling in Operations
────────────────────────────────────────────────────────────────────────────────
Addition: a + b
(+a) + (+b) = +(a + b) // Add magnitudes
(-a) + (-b) = -(a + b) // Add magnitudes, negative
(+a) + (-b) = a - b // Subtract (may change sign)
(-a) + (+b) = b - a // Subtract (may change sign)
Subtraction: a - b = a + (-b)
Just negate b and add
Multiplication: a * b
|a * b| = |a| * |b| // Multiply magnitudes
sign = (a.sign == b.sign) ? + : -
Division: a / b
|a / b| = |a| / |b| // Divide magnitudes
sign = (a.sign == b.sign) ? + : -
(Plus remainder handling)
Handling Zero
Zero is special - it has no sign (or rather, both +0 and -0 should be treated identically):
Canonicalizing Zero (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function canonicalize(ap):
if magnitude_is_zero(ap.magnitude):
ap.sign = POSITIVE // Convention: zero is always +0
return ap
Comparison Operations
Comparing signed AP numbers requires checking signs first:
AP Comparison (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function ap_compare(a, b):
// Handle sign differences
if a.sign == POSITIVE and b.sign == NEGATIVE:
return +1 // a > b
if a.sign == NEGATIVE and b.sign == POSITIVE:
return -1 // a < b
// Same sign: compare magnitudes
mag_cmp = xp_compare(a.magnitude, b.magnitude)
if a.sign == POSITIVE:
return mag_cmp // Larger magnitude = larger number
else:
return -mag_cmp // Larger magnitude = smaller number (more negative)
Building on XP
The AP module is a layer on top of XP:
AP Module Architecture
────────────────────────────────────────────────────────────────────────────────
+------------------------------------------+
| AP Interface |
| ap_add, ap_sub, ap_mul, ap_div, ... |
+------------------------------------------+
|
| Uses
v
+------------------------------------------+
| XP Interface |
| xp_add, xp_sub, xp_mul, xp_div, ... |
+------------------------------------------+
|
| Operates on
v
+------------------------------------------+
| Word Arrays (uint32_t[]) |
+------------------------------------------+
Trade-offs: Sign-Magnitude vs Two’s Complement
| Aspect | Sign-Magnitude | Two’s Complement |
|---|---|---|
| Negation | Flip sign bit (O(1)) | Complement & add 1 (O(n)) |
| Extension | Just add zero words | Must sign-extend |
| Zero | Two representations (+0, -0) | One representation |
| Hardware support | Minimal | Excellent |
| Bitwise ops | Awkward | Natural |
| Comparison | Check sign, then magnitude | Direct subtraction |
Guiding Questions
- Why does sign-magnitude work better than two’s complement for arbitrary precision?
- How do you handle the result sign when multiplying two negative numbers?
- What’s the issue with having both +0 and -0?
- Why is negation O(1) in sign-magnitude but O(n) in two’s complement?
- How should integer division handle negative dividends and divisors?
Hints and Progression
Hint 1 - Start Simple: Implement AP as a struct with a sign field and an XP array. Don’t optimize yet.
Hint 2 - Reuse XP: Every AP operation should delegate magnitude work to XP functions. AP only handles signs.
Hint 3 - Test Signs Exhaustively: Test all sign combinations: (+,+), (+,-), (-,+), (-,-).
Hint 4 - Zero Edge Case: Always canonicalize results. After subtraction of equal values, ensure result is +0, not -0.
Hint 5 - Division Semantics: Decide on truncation vs floor division for negative numbers. Document your choice.
Real World Outcome
A big integer calculator (like bc)
Build an interactive calculator that:
- Parses arithmetic expressions with +, -, *, /, %, and ()
- Handles arbitrarily large positive and negative integers
- Provides exact results (no floating-point approximation)
- Includes common functions: factorial, power, gcd
Example session:
> 2^1000
107150860718626732094842504906000181056140481170553360744375038837035105112493612249...
> factorial(100)
93326215443944152681699238856266700490715968264381621468592963895217599993229915608...
> -123456789012345678901234567890 * 987654321098765432109876543210
-121932631137021795226185032733622923332237463801111263526900...
Interview Preparation
Conceptual Questions:
- Compare sign-magnitude and two’s complement for arbitrary precision
- How would you implement efficient negation?
- Explain the sign rules for division with negative operands
Coding Challenges:
- Implement signed multiplication using unsigned XP routines
- Write a function that computes (-1)^n efficiently for AP numbers
- Implement modular exponentiation for negative bases
Book References
- Primary: Hanson, Ch. 18 - AP module implementation
- Mathematical Foundations: Graham, Knuth, Patashnik, “Concrete Mathematics”
- Language Comparison: Python’s int type uses sign-magnitude internally
Project 18: MP Module - Multiple Precision Fixed-Size Integers
File Reference
- File: P18-MP-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 5 - Legendary/Mythical
- Difficulty: Level 5 - Legendary
- Estimated Time: 3-4 weeks
- Knowledge Area: Modular Arithmetic, Cryptography Implementation
- Main Book: “C Interfaces and Implementations” by David Hanson - Chapter 19
What You Will Learn
While AP handles numbers of any size, cryptography often requires numbers of a specific, fixed size (256-bit, 512-bit, 2048-bit). The MP module provides fixed-size integers with modular arithmetic - the mathematical foundation for RSA, Diffie-Hellman, and elliptic curve cryptography.
Core Concepts
Why Fixed-Size for Cryptography?
Fixed vs Arbitrary Precision in Crypto
────────────────────────────────────────────────────────────────────────────────
RSA with 2048-bit key:
- All operations are mod N (a 2048-bit number)
- Results are always < N (exactly 2048 bits or fewer)
- No need for arbitrary growth
- Fixed allocation = no timing side channels from realloc
+------- 2048 bits exactly -------+
| |
| All intermediate values fit |
| in this fixed space |
| |
+---------------------------------+
Arbitrary precision:
- Size grows with computation
- Must reallocate memory
- Timing varies with size
- Memory allocation = potential side channel
Modular Arithmetic Fundamentals
All MP operations occur within a modular ring:
Modular Arithmetic Ring
────────────────────────────────────────────────────────────────────────────────
For N = 7 (simple example):
Regular integers: ... -2 -1 0 1 2 3 4 5 6 7 8 9 10 ...
Mod 7 ring: 0 1 2 3 4 5 6
\ /
\ /
\ /
\ /
\ /
*
Wraps around
3 + 5 = 8 = 1 (mod 7)
3 * 5 = 15 = 1 (mod 7)
5 - 6 = -1 = 6 (mod 7)
For cryptographic sizes:
All operations mod N (where N is the modulus):
a +_N b = (a + b) mod N
a *_N b = (a * b) mod N
a -_N b = (a - b + N) mod N // Ensure non-negative
The trick: intermediate a + b might exceed N,
but the result always fits in N's bit size.
Montgomery Multiplication
Standard modular multiplication requires division (expensive). Montgomery multiplication replaces division with shifts:
Montgomery Domain
────────────────────────────────────────────────────────────────────────────────
Regular domain: Montgomery domain:
a aR mod N
b bR mod N
a * b mod N (aR)(bR) = abR^2 mod N
Need to remove one R:
Montgomery reduce: abR^2 * R^-1 = abR mod N
Convert to Montgomery, do all multiplications there,
convert back at the end.
Why faster? Montgomery reduction uses only:
- Multiplication
- Addition
- Right shift (instead of division!)
R is chosen as 2^k where k = bit size of N
Division by R is just a right shift
Montgomery Reduction (Simplified Pseudocode)
────────────────────────────────────────────────────────────────────────────────
// Compute xR^-1 mod N without division
function mont_reduce(x, N, k):
// k = number of bits in N
// N' is precomputed such that N * N' = -1 (mod R)
m = (x * N_prime) mod R // Low bits only
t = (x + m * N) / R // Exact division (shift)
if t >= N:
t = t - N
return t
Fixed-Size Representation
MP Number Layout
────────────────────────────────────────────────────────────────────────────────
512-bit MP number (16 x 32-bit words):
+------+------+------+------+------+------+------+------+
|W[15] |W[14] |W[13] |W[12] |W[11] |W[10] | W[9] | W[8] |
+------+------+------+------+------+------+------+------+
|W[7] |W[6] |W[5] |W[4] |W[3] |W[2] | W[1] | W[0] |
+------+------+------+------+------+------+------+------+
MSW LSW
Fixed 64 bytes - never grows, never shrinks
Allocated once, reused for all operations
Modular Exponentiation
The core operation for RSA: a^e mod N
Square-and-Multiply Exponentiation (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function mod_exp(base, exponent, modulus):
    result = 1
    base = base mod modulus
    while exponent > 0:
        if exponent is odd:
            result = (result * base) mod modulus
        exponent = exponent >> 1             // Divide by 2
        base = (base * base) mod modulus
    return result
// For RSA decryption with 2048-bit exponent:
// About 2048 squarings + 1024 multiplications on average
// Each operation is 2048-bit modular multiplication
Square-and-Multiply Example
────────────────────────────────────────────────────────────────────────────────
Computing 3^13 mod 17:

13 in binary: 1101 (processed LSB first: 1, 0, 1, 1)

Step | Bit | Operation           | result         | base
-----|-----|---------------------|----------------|-----------------
Init |     |                     | 1              | 3
1    | 1   | result *= base      | 3              |
     |     | base = base^2       |                | 9
2    | 0   | (bit is 0, skip)    | 3              |
     |     | base = base^2       |                | 81 mod 17 = 13
3    | 1   | result *= base      | 39 mod 17 = 5  |
     |     | base = base^2       |                | 169 mod 17 = 16
4    | 1   | result *= base      | 80 mod 17 = 12 |
     |     | (done)              |                |

3^13 mod 17 = 12 (check: 1594323 mod 17 = 12)
Why Fixed-Size Is Sometimes Better
Fixed vs Arbitrary Trade-offs
────────────────────────────────────────────────────────────────────────────────
Arbitrary (AP/XP):
+ Handles any size naturally
+ No overflow concerns
- Memory allocation overhead
- Timing varies with operand size
- Cache behavior unpredictable
Fixed (MP):
+ Constant-time operations possible
+ Predictable memory usage
+ Better cache utilization
+ No allocation during operations
- Must choose size upfront
- Wasted space if numbers are small
For cryptography, fixed-size wins because:
1. Security requires constant-time
2. Key sizes are known at compile time
3. Performance is critical
Guiding Questions
- Why is modular reduction after every operation necessary in MP?
- What makes Montgomery multiplication faster than naive modular multiplication?
- Why is constant-time execution important for cryptography?
- How does the choice of R in Montgomery multiplication affect efficiency?
- What’s the relationship between MP and XP internally?
Hints and Progression
Hint 1 - Start with Modular Arithmetic: Implement addition, subtraction, and multiplication mod N using regular XP operations followed by reduction.
Hint 2 - Understand Montgomery Domain: Before implementing Montgomery multiplication, work through several examples by hand.
Hint 3 - Precomputation Matters: Montgomery requires precomputing N’ such that N*N’ = -1 (mod R). Use the extended Euclidean algorithm.
Hint 4 - Constant-Time Considerations: Avoid if statements that depend on secret data. Use conditional moves instead.
Hint 5 - Test Against Known Values: RSA test vectors are available in standards documents. Use them.
Real World Outcome
A modular exponentiation routine (foundation for RSA)
Build a function that computes a^e mod n for arbitrary 2048-bit values:
// Your function signature
void mp_modexp(mp_t result, const mp_t base, const mp_t exp, const mp_t mod);
// Should correctly compute:
// - RSA encryption: message^e mod n
// - RSA decryption: ciphertext^d mod n
// - Diffie-Hellman: g^x mod p
Test with actual RSA key generation:
- Generate two large primes p and q
- Compute n = p * q
- Compute phi = (p-1)(q-1)
- Choose e (typically 65537)
- Compute d = e^-1 mod phi
- Verify: message^ed mod n = message
Interview Preparation
Conceptual Questions:
- Explain why RSA uses modular exponentiation
- What is Montgomery multiplication and why is it faster?
- Why do cryptographic implementations need constant-time operations?
Coding Challenges:
- Implement modular exponentiation using square-and-multiply
- Write a function to compute modular inverse using extended Euclidean algorithm
- Implement basic modular addition and subtraction for fixed-size integers
Book References
- Primary: Hanson, Ch. 19 - MP module implementation
- Cryptographic Context: Menezes, van Oorschot, Vanstone, “Handbook of Applied Cryptography” (free online)
- Montgomery Details: Montgomery’s original paper (1985)
- Side Channels: Kocher, “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS”
Project 19: Thread Module - Cooperative Multithreading
File Reference
- File: P19-THREAD-MODULE.md
- Main Programming Language: C
- Coolness Level: Level 5 - Legendary/Mythical
- Difficulty: Level 5 - Legendary
- Estimated Time: 3-4 weeks
- Knowledge Area: Operating Systems, Concurrency Primitives
- Main Book: “C Interfaces and Implementations” by David Hanson - Chapter 20
What You Will Learn
Before POSIX threads were universally available, programmers needed threading abstractions that worked everywhere. Hanson’s Thread module implements cooperative (non-preemptive) multithreading using only standard C constructs - specifically setjmp and longjmp. This project teaches you how threading actually works at the lowest level.
Core Concepts
Cooperative vs Preemptive Threading
Threading Models
────────────────────────────────────────────────────────────────────────────────
Preemptive (OS threads, pthreads):
Thread A Thread B Thread C
| | |
|<--- OS switches arbitrarily -->|
| | |
[running] [running] [running]
| | |
X<-- Timer interrupt forces switch
| |
[running] |
| |
X<-- Another interrupt
|
[running]
Threads don't control when they're interrupted.
OS decides based on time slices, priorities, I/O.
Cooperative (green threads, fibers):
Thread A Thread B Thread C
|
[running] [waiting] [waiting]
|
|--yield()---->|
[running] [waiting]
|
|---yield()---->|
[running]
|
<---yield()-----|
[running]
Threads explicitly yield control.
A thread runs until it voluntarily gives up CPU.
The Magic of setjmp/longjmp
These standard C functions allow non-local jumps - essentially saving and restoring execution context:
setjmp/longjmp Mechanism
────────────────────────────────────────────────────────────────────────────────
Normal function call/return:
main() foo() bar()
| | |
|--call------->| |
| |--call-------->|
| | |
| |<--return------|
|<--return-----|
|
Returns unwind one frame at a time.
With longjmp:
main() foo() bar()
| | |
|--setjmp()--->| |
| (saves SP, | |
| PC, regs) |--call-------->|
| | |
|<========================longjmp()|
|
[Jumps directly back to setjmp point!]
[Stack frames for foo and bar are abandoned]
Thread Context
Each cooperative thread needs its own stack and saved registers:
Thread State
────────────────────────────────────────────────────────────────────────────────
Thread Control Block (TCB):
+-----------------------+
| thread_id |
+-----------------------+
| jmp_buf context | <-- Saved SP, PC, registers
| (setjmp buffer) |
+-----------------------+
| stack_pointer |
+-----------------------+
| stack_base |--+
+-----------------------+ |
| stack_size | |
+-----------------------+ |
| state | |
| (READY/RUNNING/ | |
| WAITING/DEAD) | |
+-----------------------+ |
|
Thread's private stack: |
+-----------------------+<-+
| |
| Local variables |
| Return addresses |
| Function arguments |
| |
+-----------------------+
Context Switching Implementation
Cooperative Context Switch (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
global current_thread   // Currently running thread
global ready_queue      // Queue of runnable threads

function thread_yield():
    // Save current context
    if setjmp(current_thread.context) == 0:
        // First return from setjmp - we're saving
        enqueue(ready_queue, current_thread)
        // Pick next thread
        next = dequeue(ready_queue)
        current_thread = next
        // Restore next thread's context
        longjmp(next.context, 1)
    else:
        // Second return from setjmp - we're being restored
        // Just continue executing
        return
Context Switch Flow
────────────────────────────────────────────────────────────────────────────────
Thread A running: Thread B waiting:
+----------------+ +----------------+
| Stack frame | | Stack frame |
+----------------+ +----------------+
| Local vars | | Saved at yield |
+----------------+ +----------------+
| ... | | jmp_buf has |
+----------------+ | saved SP, PC |
| yield() | +----------------+
+----------------+
|
v
setjmp() saves A's state
|
v
longjmp() to B's context ------------------>|
v
B resumes here!
(setjmp returns 1)
Thread Creation
Creating a thread requires allocating a new stack and setting up initial context:
Thread Creation (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function thread_create(function, argument):
    // Allocate thread control block
    thread = allocate(sizeof(TCB))
    // Allocate private stack
    thread.stack_base = allocate(STACK_SIZE)
    thread.stack_pointer = thread.stack_base + STACK_SIZE  // Stacks grow down
    // Set up initial context. This is tricky - we need to:
    //   1. Switch to the new stack
    //   2. Arrange that, when restored, it calls function(argument)
    //   3. Then switch back to our current stack
    // One approach: use a "trampoline" function
    setup_initial_context(thread, thread_start_trampoline, function, argument)
    // Mark as ready and add to queue
    thread.state = READY
    enqueue(ready_queue, thread)
    return thread.id

function thread_start_trampoline(function, argument):
    function(argument)   // Run the thread function
    thread_exit()        // Clean up when it returns
Thread Join
Thread Join (Pseudocode)
────────────────────────────────────────────────────────────────────────────────
function thread_join(thread_id):
    thread = find_thread(thread_id)
    while thread.state != DEAD:
        // Can't run until the target thread finishes
        current_thread.state = WAITING
        current_thread.waiting_for = thread_id
        thread_yield()
    // Target thread is dead, we can continue
    // Optionally retrieve return value and clean up

function thread_exit():
    current_thread.state = DEAD
    // Wake up any threads waiting for us
    for each t in all_threads:
        if t.waiting_for == current_thread.id:
            t.state = READY
            enqueue(ready_queue, t)
    // Switch to another thread (we can't continue)
    next = dequeue(ready_queue)
    current_thread = next
    longjmp(next.context, 1)
    // Never returns - we're dead
Why This Was Innovative in 1996
1996 Threading Landscape
────────────────────────────────────────────────────────────────────────────────
1996:
+-------------------+ +-------------------+
| Windows | | Unix |
+-------------------+ +-------------------+
| Win32 Threads | | Many variations: |
| (proprietary) | | - Solaris threads |
+-------------------+ | - POSIX pthreads |
| - SunOS LWPs |
| - Each different! |
+-------------------+
Hanson's Thread Module:
+-------------------------------------------+
| Works anywhere with standard C |
| (setjmp/longjmp are ANSI C) |
| Same code on Windows, Unix, DOS, VMS... |
+-------------------------------------------+
Today (pthreads everywhere):
Why still relevant?
- Understand how threads REALLY work
- User-space threading (goroutines, green threads) uses these ideas
- Fibers in Windows
- Coroutines in modern languages
Shared State Considerations
Cooperative threads still have shared state issues:
Shared State with Cooperative Threads
────────────────────────────────────────────────────────────────────────────────
Preemptive hazard:
Thread A Thread B
read x = 5
<-- INTERRUPT -->
read x = 5
x = x + 1
write x = 6
x = x + 1
write x = 6
Race condition! Lost update.
Cooperative - safer but not safe:
Thread A Thread B
read x = 5
x = x + 1 [waiting - can't interrupt]
write x = 6
yield()
read x = 6
x = x + 1
write x = 7
No race IF operations don't yield.
But if read, yield, write - still broken!
Guiding Questions
- Why can cooperative threading be implemented in pure portable C?
- What information must be saved in a thread context?
- Why do we need a separate stack for each thread?
- What happens if a thread overflows its allocated stack?
- How does cooperative threading compare to goroutines or Erlang processes?
Hints and Progression
Hint 1 - Understand setjmp First: Write simple programs using setjmp/longjmp before tackling threads.
Hint 2 - Stack Allocation Challenge: The trickiest part is setting up a new stack. Study how the first context switch initializes a thread.
Hint 3 - Start with Two Threads: Get two threads ping-ponging via yield() before adding more features.
Hint 4 - Debug with Care: Debuggers get confused by longjmp. Add lots of printf debugging.
Hint 5 - Study Existing Implementations: Look at libtask, libco, or even Lua’s coroutines for ideas.
Real World Outcome
A cooperative scheduler running multiple “green threads”
Build a complete threading library that demonstrates:
- Thread creation and management: Create, yield, join, exit
- Producer-consumer demo: Multiple producers and consumers sharing a buffer
- Server simulation: Cooperative handling of multiple “connections”
Example program:
void producer(void *arg) {
    int id = *(int *)arg;
    for (int i = 0; i < 10; i++) {
        int item = produce_item();
        buffer_put(item);
        printf("Producer %d: put %d\n", id, item);
        Thread_yield();
    }
}

void consumer(void *arg) {
    int id = *(int *)arg;
    for (int i = 0; i < 10; i++) {
        int item = buffer_get();
        printf("Consumer %d: got %d\n", id, item);
        Thread_yield();
    }
}

int main(void) {
    static int ids[3] = {0, 1, 2};  /* stable storage: passing &i would hand
                                       every thread the same, changing address */
    for (int i = 0; i < 3; i++)
        Thread_create(producer, &ids[i]);
    for (int i = 0; i < 3; i++)
        Thread_create(consumer, &ids[i]);
    Thread_start();  /* Run scheduler until all threads complete */
    return 0;
}
Interview Preparation
Conceptual Questions:
- Explain the difference between cooperative and preemptive threading
- How do setjmp and longjmp enable user-space threading?
- What are the advantages and disadvantages of green threads?
Coding Challenges:
- Implement a simple two-thread yield mechanism using setjmp/longjmp
- Write a round-robin scheduler for cooperative threads
- Implement thread join that blocks until a target thread completes
Book References
- Primary: Hanson, Ch. 20 - Thread module implementation
- OS Context: Silberschatz, “Operating System Concepts” (threading chapters)
- Modern Relevance: Nystrom, “Game Programming Patterns” (update method / game loop)
- Comparison: Go scheduler design documents (goroutines use similar concepts)
Project 20: Integration Project - Mini Standard Library
File Reference
- File: P20-INTEGRATION-PROJECT.md
- Main Programming Language: C
- Coolness Level: Level 5 - Legendary/Mythical
- Difficulty: Level 5 - Legendary
- Estimated Time: 4-6 weeks
- Knowledge Area: Library Design, Software Architecture
- Main Book: “C Interfaces and Implementations” by David Hanson - All Chapters
What You Will Learn
The final project brings everything together. You’ll combine all 19 modules into a cohesive, professional-quality library - and then build a substantial application using it. This is where software engineering meets systems programming.
Core Concepts
Library Architecture
Hanson Library Architecture
────────────────────────────────────────────────────────────────────────────────
Application Layer:
+----------------------------------------------------------+
| Your Application |
| (Mini-shell, Expression Evaluator, Text Processor) |
+----------------------------------------------------------+
|
v
Library Interface Layer:
+----------------------------------------------------------+
| Unified API |
| Consistent naming, error handling, memory management |
+----------------------------------------------------------+
|
+-----------------------+------------------------+
| | |
v v v
+--------+ +----------+ +----------+
| Data | | Text | | Math |
| Struct | | Process | | Modules |
+--------+ +----------+ +----------+
|Array | |Str | |XP |
|List | |Fmt | |AP |
|Seq | |Text | |MP |
|Stack | |Ring | +----------+
|Table | +----------+
|Set |
|Bit |
+--------+
| | |
v v v
+----------------------------------------------------------+
| Foundation Layer |
| Mem / Arena / Except / Assert / Atom / Thread |
+----------------------------------------------------------+
Consistent Error Handling
Choose one error handling strategy for the entire library:
Error Handling Approaches
────────────────────────────────────────────────────────────────────────────────
Approach 1: Assertions (Hanson's default)
+------------------------------------------------+
| Precondition violations -> assertion failure |
| No recovery possible |
| Simple but harsh |
+------------------------------------------------+
Approach 2: Return codes
+------------------------------------------------+
| Functions return error codes |
| Caller must check every call |
| Tedious but flexible |
+------------------------------------------------+
Approach 3: Exceptions (using Except module)
+------------------------------------------------+
| Errors raise exceptions |
| TRY/EXCEPT blocks for recovery |
| Clean code flow but hidden control flow |
+------------------------------------------------+
Approach 4: Hybrid (recommended)
+------------------------------------------------+
| Programmer errors -> assertions |
| Resource exhaustion -> exceptions |
| Expected failures -> return codes |
+------------------------------------------------+
Memory Management Strategy
Memory Strategy Decision Tree
────────────────────────────────────────────────────────────────────────────────
+------------------------+
| What kind of lifetime? |
+------------------------+
|
+-----------------+-----------------+
| |
v v
+---------------+ +---------------+
| Long-lived or | | Short-lived, |
| unpredictable | | batch-style |
+---------------+ +---------------+
| |
v v
+---------------+ +---------------+
| Use Mem module| | Use Arena |
| (individual | | (allocate many|
| alloc/free) | | free all) |
+---------------+ +---------------+
| |
v v
+---------------+ +---------------+
| Track with | | One arena per |
| reference | | request/phase |
| counting? | | Bulk dealloc |
+---------------+ +---------------+
Build System Design
Library Build System
────────────────────────────────────────────────────────────────────────────────
Source Files:
src/
|-- assert.c # Foundation
|-- except.c
|-- mem.c
|-- arena.c
|-- atom.c
|-- array.c # Data Structures
|-- list.c
|-- seq.c
|-- stack.c
|-- table.c
|-- set.c
|-- bit.c
|-- str.c # Text Processing
|-- fmt.c
|-- text.c
|-- ring.c
|-- xp.c # Math
|-- ap.c
|-- mp.c
+-- thread.c # Threading
Build Outputs:
|-- libcii.a # Static library
|-- libcii.so # Shared library (optional)
+-- include/
+-- cii/
|-- assert.h
|-- except.h
+-- ... all headers
Makefile targets:
- all: build library
- test: run test suite
- install: install headers and library
- clean: remove build artifacts
- docs: generate documentation
Writing a Real Application
Choose one of these applications to build:
Option A: Mini-Shell
Mini-Shell Architecture
────────────────────────────────────────────────────────────────────────────────
+----------------------------------+
| Main Loop |
+----------------------------------+
|
+----------+----------+
| |
v v
+--------+ +------------+
| Parser | | Executor |
| (Str, |--------->| (Table for |
| Atom) | | builtins) |
+--------+ +------------+
| |
v v
+--------+ +------------+
| History| | Environment|
| (Ring) | | (Table) |
+--------+ +------------+
Uses: Str, Atom, Table, Ring, Mem, Except
Option B: Expression Evaluator
Expression Evaluator Architecture
────────────────────────────────────────────────────────────────────────────────
Input: "let x = 100! + 2^256"
|
v
+---------------------+
| Lexer (Str, Atom) |
+---------------------+
|
v
+---------------------+
| Parser (Stack) |
| - Shunting yard |
+---------------------+
|
v
+---------------------+
| Evaluator |
| - AP for integers |
| - Table for vars |
+---------------------+
|
v
Output: "extremely large number..."
Uses: Str, Atom, Stack, Table, AP/XP, Mem, Fmt
Option C: Text Processor
Text Processor Architecture
────────────────────────────────────────────────────────────────────────────────
Input: Document files
|
v
+------------------------+
| Document Loader |
| (Str, Text) |
+------------------------+
|
v
+------------------------+
| Index Builder |
| - Word extraction |
| - Atom for words |
| - Table for word->docs |
| - Set for document IDs |
+------------------------+
|
v
+------------------------+
| Query Engine |
| - Parse query |
| - Set operations |
| - Ranked results |
+------------------------+
Uses: Str, Text, Atom, Table, Set, Array, Mem, Arena
Test Suite Design
Testing Strategy
────────────────────────────────────────────────────────────────────────────────
Unit Tests (per module):
+------------------+
| test_array.c |
| - Array_new |
| - Array_get |
| - Array_put |
| - Edge cases |
| - Error cases |
+------------------+
|
v
+------------------+
| test_table.c |
| ... etc |
+------------------+
Integration Tests:
+---------------------------+
| test_integration.c |
| - Multi-module scenarios |
| - Memory stress tests |
| - Error propagation |
+---------------------------+
Application Tests:
+---------------------------+
| test_shell.c |
| - Command parsing |
| - Script execution |
| - Edge cases |
+---------------------------+
Guiding Questions
- How do you maintain consistency across 19+ modules developed over time?
- What’s the right granularity for error handling in a library?
- How do you choose between Mem and Arena for a given use case?
- What testing strategy ensures module interactions work correctly?
- How do you document a library for other developers?
Hints and Progression
Hint 1 - Start with Foundations: Ensure Mem, Assert, Except, and Atom are rock solid. Everything depends on them.
Hint 2 - Establish Conventions Early: Define naming conventions, error handling, and memory rules before integrating.
Hint 3 - Build Incrementally: Add one module at a time to the library, testing as you go.
Hint 4 - Application Drives Integration: Choose your application early - it will reveal which modules need tighter integration.
Hint 5 - Real Users Find Real Bugs: Have someone else try to use your library. Fresh eyes find documentation gaps.
Real World Outcome
A complete mini-shell or mini-interpreter using all your modules
Your final deliverable includes:
- Complete Library: All 19 modules compiled into libcii.a
- Header Files: Clean public interfaces in include/cii/
- Test Suite: Comprehensive tests achieving >80% coverage
- Documentation: API reference and user guide
- Sample Application: Fully functional shell or evaluator
- Build System: Makefile with standard targets
The application should demonstrate:
- At least 10 different modules in use
- Clean error handling
- No memory leaks (verified by Valgrind)
- Reasonable performance
- Usable by someone who hasn’t seen the code
Interview Preparation
Conceptual Questions:
- How would you design a reusable C library from scratch?
- What are the trade-offs between static and dynamic libraries?
- How do you ensure memory safety across a large C codebase?
Coding Challenges:
- Design a consistent error handling scheme for a multi-module library
- Write a Makefile for a library with dependencies between modules
- Implement a simple REPL using hash tables, stacks, and string processing
Book References
- Primary: Hanson, All Chapters - Integration patterns
- Build Systems: “Managing Projects with GNU Make” by Robert Mecklenburg
- Library Design: “C Interfaces and Implementations” (meta-lesson)
- Testing: “Test-Driven Development for Embedded C” by James Grenning
Project Comparison Table
| Project | Module | Difficulty | Time | Depth | Fun Factor |
|---|---|---|---|---|---|
| 1 | Assert | Level 2 - Intermediate | Weekend | Medium | 3/5 |
| 2 | Mem | Level 2 - Intermediate | Weekend | High | 3/5 |
| 3 | Except | Level 3 - Advanced | 1 week | High | 4/5 |
| 4 | Arena | Level 2 - Intermediate | Weekend | Medium | 3/5 |
| 5 | Atom | Level 3 - Advanced | 1 week | High | 4/5 |
| 6 | List | Level 2 - Intermediate | Weekend | Medium | 3/5 |
| 7 | Table | Level 3 - Advanced | 1-2 weeks | High | 5/5 |
| 8 | Set | Level 3 - Advanced | 1 week | High | 4/5 |
| 9 | Array | Level 2 - Intermediate | Weekend | Medium | 3/5 |
| 10 | Seq | Level 2 - Intermediate | Weekend | Medium | 3/5 |
| 11 | Ring | Level 3 - Advanced | 1 week | Medium | 4/5 |
| 12 | Bit | Level 3 - Advanced | 1 week | High | 4/5 |
| 13 | Fmt | Level 3 - Advanced | 1-2 weeks | High | 3/5 |
| 14 | Str | Level 2 - Intermediate | 1 week | Medium | 4/5 |
| 15 | Text | Level 3 - Advanced | 1-2 weeks | High | 4/5 |
| 16 | XP | Level 4 - Expert | 2-3 weeks | Very High | 5/5 |
| 17 | AP | Level 4 - Expert | 2-3 weeks | Very High | 5/5 |
| 18 | MP | Level 5 - Legendary | 3-4 weeks | Extreme | 5/5 |
| 19 | Thread | Level 5 - Legendary | 3-4 weeks | Extreme | 5/5 |
| 20 | Integration | Level 5 - Legendary | 4-6 weeks | Extreme | 5/5 |
Difficulty Legend
- Level 2 - Intermediate: Solid C knowledge required
- Level 3 - Advanced: Deep understanding of pointers, memory, data structures
- Level 4 - Expert: Systems programming expertise needed
- Level 5 - Legendary: Research-level understanding of the domain
Recommendations by Background
New to C
Learning Path: Assert -> Mem -> Arena -> List -> Array -> Seq
Start with the foundational modules that teach essential C patterns:
- Assert teaches defensive programming and contract design
- Mem introduces memory allocation wrappers and debugging
- Arena shows batch memory management patterns
- List covers classic linked list implementation
- Array and Seq round out basic data structure skills
Time estimate: 6-8 weeks
Experienced C Programmer
Learning Path: Atom -> Table -> Set -> Except -> Fmt
Jump to the intellectually interesting modules:
- Atom demonstrates string interning and perfect hashing concepts
- Table is the workhorse - hash tables with excellent design
- Set builds on Table with mathematical set operations
- Except shows how to fake exception handling in C
- Fmt teaches extensible formatting systems
Time estimate: 6-8 weeks
Interview Preparation Focus
Priority Modules: Array, Seq, Table, Bit, Stack, List
These modules cover the most common interview topics:
- Array/Seq: Dynamic arrays (like std::vector)
- Table: Hash table design and analysis
- Bit: Bit manipulation patterns
- Stack: Classic data structure
- List: Linked list operations
For each module, practice:
- Implementing core operations from scratch
- Analyzing time/space complexity
- Handling edge cases
- Discussing trade-offs
Time estimate: 4-6 weeks
Cryptography Interest
Learning Path: XP -> AP -> MP -> Bit
The mathematics path for aspiring cryptographers:
- XP: Extended precision unsigned arithmetic (foundation)
- AP: Signed arbitrary precision (complete integer math)
- MP: Modular arithmetic for fixed sizes (RSA, DH)
- Bit: Bit manipulation for crypto operations
After completing these, you can implement:
- RSA key generation and encryption
- Diffie-Hellman key exchange
- Basic elliptic curve operations
Time estimate: 10-14 weeks
Operating Systems Interest
Learning Path: Mem -> Arena -> Thread -> Except
Understanding OS fundamentals through implementation:
- Mem: Memory allocator design
- Arena: Region-based memory management
- Thread: Cooperative scheduling and context switching
- Except: Non-local jumps and signal handling patterns
Time estimate: 8-10 weeks
Final Overall Project: The Hanson Standard Library
Project Overview
Combine all 19 modules into a single, cohesive, production-quality library that could serve as an alternative to portions of the C standard library or complement it with additional capabilities.
Step 1: Consistent Naming Conventions
Establish and enforce naming rules across all modules:
NAMING CONVENTION SPECIFICATION
────────────────────────────────────────────────────────────────────────────────
Types:
Module_T // Opaque pointer type (e.g., Table_T, Set_T)
Module_Func // Function pointer type (e.g., Table_Hash)
Functions:
Module_new // Constructor
Module_free // Destructor
Module_get // Accessor
Module_put // Mutator
Module_length // Size query
Module_map // Iterator with callback
Exceptions:
Module_Error // Exception for module-specific errors
Constants:
MODULE_DEFAULT // Default values (all caps)
Internal (static):
module_helper // Private functions (lowercase)
Step 2: Unified Build System
Create a comprehensive Makefile that builds all modules with proper dependency management.
Step 3: Comprehensive Test Suite
Build tests for each module and integration tests for module interactions.
Step 4: Sample Application
Build either a mini-shell or expression evaluator that demonstrates at least 10 modules working together.
From Learning to Production: What’s Next?
Comparing Your Implementation
After completing the Hanson modules, compare your implementation to established C libraries:
GLib (GNOME’s C utility library)
- Much larger scope (I/O, threading, Unicode)
- Similar philosophy of opaque types
- Your Table is like GHashTable
- Your List is like GList/GSList
APR (Apache Portable Runtime)
- Cross-platform focus
- Pool-based memory (like Arena)
- Your Thread concepts appear in apr_thread
BSD libc
- Standard library extensions
- queue.h macros (compare to your List)
- tree.h (compare to your Table/Set)
Skills You Now Have
After completing these 20 projects, you can:
- Design C interfaces: Opaque types, clean APIs, contracts
- Manage memory: Custom allocators, arenas, leak detection
- Implement data structures: From scratch, with full understanding
- Handle errors: Multiple strategies, trade-off analysis
- Process text: Efficient string operations, formatting
- Implement arbitrary precision math: Cryptography foundations
- Build threading primitives: Context switching, scheduling
- Create production libraries: Testing, documentation, build systems
Open Source Contribution Opportunities
Your skills are now applicable to:
- Redis: Hash tables, dynamic strings, memory management
- SQLite: Memory management, B-trees, text processing
- Lua: Similar philosophy to Hanson (clean C, portable)
- GMP: Your XP/AP/MP knowledge directly applicable
- libuv: Thread pool, event loop concepts
Career Paths Unlocked
- Systems Programming: OS development, device drivers, embedded systems
- Security Engineering: Cryptographic implementations, secure coding
- Database Development: Storage engines, query processors
- Language Implementation: Interpreters, compilers, runtime systems
- Infrastructure Engineering: High-performance servers, network protocols
Summary
Complete Project Overview
| # | Project | Language | Difficulty | Time |
|---|---|---|---|---|
| 1 | Assert Module | C | Level 2 | Weekend |
| 2 | Mem Module | C | Level 2 | Weekend |
| 3 | Arena Module | C | Level 2 | Weekend |
| 4 | Atom Module | C | Level 3 | 1 week |
| 5 | Except Module | C | Level 3 | 1 week |
| 6 | List Module | C | Level 2 | Weekend |
| 7 | Table Module | C | Level 3 | 1-2 weeks |
| 8 | Set Module | C | Level 3 | 1 week |
| 9 | Array Module | C | Level 2 | Weekend |
| 10 | Seq Module | C | Level 2 | Weekend |
| 11 | Ring Module | C | Level 3 | 1 week |
| 12 | Bit Module | C | Level 3 | 1 week |
| 13 | Fmt Module | C | Level 3 | 1-2 weeks |
| 14 | Str Module | C | Level 2 | 1 week |
| 15 | Text Module | C | Level 3 | 1-2 weeks |
| 16 | XP Module | C | Level 4 | 2-3 weeks |
| 17 | AP Module | C | Level 4 | 2-3 weeks |
| 18 | MP Module | C | Level 5 | 3-4 weeks |
| 19 | Thread Module | C | Level 5 | 3-4 weeks |
| 20 | Integration Project | C | Level 5 | 4-6 weeks |
Total Estimated Time: 30-45 weeks for all projects
Expected Outcomes
After completing this learning journey, you will have:
- Built a complete C utility library from scratch
- Deep understanding of memory management, data structures, and algorithms
- Practical experience with professional C coding patterns
- Foundation for cryptography through arbitrary precision arithmetic
- Operating systems knowledge through thread implementation
- Portfolio project demonstrating serious systems programming skills
- Interview readiness for systems programming roles
- Contribution capability to major open source C projects
Additional Resources & References
Primary Book
- Hanson, David R. - “C Interfaces and Implementations: Techniques for Creating Reusable Software” (Addison-Wesley, 1996)
- Source Code: https://github.com/drh/cii
Complementary Books
- Kernighan & Ritchie - “The C Programming Language” (2nd Edition)
- Sedgewick, Robert - “Algorithms in C” (Parts 1-5)
- Loudon, Kyle - “Mastering Algorithms with C”
- Knuth, Donald E. - “The Art of Computer Programming” (Vol. 1-3)
- Menezes, van Oorschot, Vanstone - “Handbook of Applied Cryptography” (free online)
Memory Debugging Tools
- Valgrind - Memory error detector (essential for C development)
- AddressSanitizer (ASan) - built into GCC and Clang: gcc -fsanitize=address
- MemorySanitizer (MSan) - Clang only: clang -fsanitize=memory
Build Systems
- GNU Make - Standard build tool
- CMake - Cross-platform build generator
- Meson - Fast, user-friendly build system
Related Open Source Projects
- GLib - GNOME C utility library
- Redis - In-memory data structure store (https://github.com/redis/redis)
- SQLite - Embedded SQL database (https://www.sqlite.org/src/)
- Lua - Lightweight scripting language (https://www.lua.org/source/)
This learning guide is based on “C Interfaces and Implementations” by David R. Hanson. The book remains a masterclass in C library design despite being published in 1996. The principles of clean interfaces, opaque types, and contract-based programming are as relevant today as ever.
Total learning time: 30-45 weeks for complete mastery. Choose your path based on your goals and current skill level.