LEARN WEBASSEMBLY DEEP DIVE

Learn WebAssembly: From Zero to Runtime Builder

Goal: Deeply understand WebAssembly—from its binary format to building your own interpreter, compiler, and runtime. By the end, you’ll understand every byte of a .wasm file and how it executes.


Why WebAssembly Matters

WebAssembly (Wasm) is one of the most important technologies of the last decade. It’s:

  • A compilation target: Write in C, Rust, Go → run anywhere
  • A portable bytecode: Same binary runs in browsers, servers, edge, embedded
  • A security sandbox: Untrusted code runs safely
  • A virtual machine spec: Simple enough to implement yourself

Understanding WebAssembly deeply means understanding:

  • How compilers work (targeting Wasm)
  • How virtual machines work (interpreting/compiling Wasm)
  • How sandboxing works (Wasm’s security model)
  • How modern web performance works (Wasm in browsers)

What Is WebAssembly?

The 30-Second Version

┌─────────────┐     compile      ┌─────────────┐     run in      ┌─────────────┐
│   C/Rust/Go │ ───────────────► │    .wasm    │ ──────────────► │   Browser   │
│  source     │                  │   binary    │                 │   Node.js   │
└─────────────┘                  └─────────────┘                 │   Wasmtime  │
                                       │                         │   Edge/IoT  │
                                       │                         └─────────────┘
                                       │
                                       ▼
                              ┌─────────────────┐
                              │  Stack-based VM │
                              │  Linear memory  │
                              │  Type-safe      │
                              │  Sandboxed      │
                              └─────────────────┘

The Technical Core

WebAssembly is a binary instruction format for a stack-based virtual machine:

  1. Binary format (.wasm): Compact, fast to decode
  2. Text format (.wat): Human-readable, S-expression syntax
  3. Stack machine: Instructions push/pop values from a stack
  4. Linear memory: A single, contiguous, resizable byte array
  5. Strong typing: Every value has a type (i32, i64, f32, f64, etc.)
  6. Validation: Code is verified before execution

Core Concept Analysis

The Stack Machine Model

Unlike register machines (x86, ARM), WebAssembly uses a stack:

Instruction: i32.add

Before:        After:
┌─────┐        ┌─────┐
│  3  │ ←top   │  5  │ ←top
├─────┤        ├─────┤
│  2  │        │ ... │
├─────┤        └─────┘
│ ... │
└─────┘

(Pop 3 and 2, push 3+2=5)

Example WAT (WebAssembly Text Format):

(func $add (param $a i32) (param $b i32) (result i32)
  local.get $a    ;; Push $a onto stack
  local.get $b    ;; Push $b onto stack
  i32.add         ;; Pop two values, push their sum
)
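To make the push/pop behaviour concrete, here is a minimal sketch in C of a value stack executing the three instructions of `$add`. The `Stack` type and the `push`, `pop`, and `run_add` names are illustrative assumptions, not part of any WebAssembly API, and a real engine would add bounds checks.

```c
#include <stdint.h>

// Hypothetical fixed-size operand stack, for illustration only.
typedef struct {
    int32_t values[16];
    int top;  // number of values currently on the stack
} Stack;

static void push(Stack *s, int32_t v) { s->values[s->top++] = v; }
static int32_t pop(Stack *s) { return s->values[--s->top]; }

// Mirrors the body of $add: local.get $a, local.get $b, i32.add
int32_t run_add(int32_t a, int32_t b) {
    Stack s = { .top = 0 };
    push(&s, a);                       // local.get $a
    push(&s, b);                       // local.get $b
    uint32_t rhs = (uint32_t)pop(&s);  // i32.add pops the right operand first
    uint32_t lhs = (uint32_t)pop(&s);  // then the left operand
    push(&s, (int32_t)(lhs + rhs));    // unsigned math wraps modulo 2^32, as i32.add does
    return pop(&s);                    // the single value left on the stack is the result
}
```

Calling `run_add(2, 3)` walks through the same before/after states as the diagram: two pushes, two pops, one push.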

The Type System

WebAssembly has a small set of types:

| Type | Size | Description |
|------|------|-------------|
| i32 | 32-bit | Integer (signed/unsigned interpretation by operation) |
| i64 | 64-bit | Integer |
| f32 | 32-bit | IEEE 754 float |
| f64 | 64-bit | IEEE 754 double |
| v128 | 128-bit | SIMD vector (4×f32, 2×f64, etc.) |
| funcref | - | Function reference |
| externref | - | External (host) reference |
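The note that i32 signedness is decided "by operation" can be illustrated with a small C sketch: the same 32 bits give different answers depending on whether the signed or unsigned instruction is used. The function names below simply mirror the Wasm instruction names and are assumptions for illustration; the division helpers do not model the traps Wasm raises for division by zero or for `INT32_MIN / -1`.

```c
#include <stdint.h>

// An i32 is just 32 bits; the instruction chooses how to read them.
// The uint32_t -> int32_t casts assume the usual two's-complement conversion.
int i32_lt_s(uint32_t a, uint32_t b) { return (int32_t)a < (int32_t)b; }  // signed compare
int i32_lt_u(uint32_t a, uint32_t b) { return a < b; }                    // unsigned compare

// Callers must rule out b == 0 (and INT32_MIN / -1 for the signed form),
// which trap in WebAssembly and are undefined behaviour in C.
int32_t  i32_div_s(uint32_t a, uint32_t b) { return (int32_t)a / (int32_t)b; }
uint32_t i32_div_u(uint32_t a, uint32_t b) { return a / b; }
```

For example, the bit pattern `0xFFFFFFFF` is less than 1 under `i32_lt_s` (it reads as -1) but not under `i32_lt_u` (it reads as 4294967295).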

Linear Memory

┌────────────────────────────────────────────────────────────────┐
│                      LINEAR MEMORY                             │
├────────────────────────────────────────────────────────────────┤
│ Address: 0x0000                                       0xFFFF...│
│ ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬─────┐│
│ │ byte │ byte │ byte │ byte │ byte │ byte │ byte │ byte │ ... ││
│ └──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴─────┘│
│                                                                │
│ • Contiguous array of bytes                                    │
│ • Grows in 64KB pages                                          │
│ • Max 4GB (32-bit addressing, or 16EB with memory64)           │
│ • Little-endian                                                │
│ • Bounds-checked (trap on out-of-bounds)                       │
└────────────────────────────────────────────────────────────────┘

Memory access instructions:

  • i32.load: Load 32-bit integer from memory
  • i32.store: Store 32-bit integer to memory
  • memory.size: Get current memory size in pages
  • memory.grow: Request more memory pages

The Binary Format

Every .wasm file starts with:

0x00 0x61 0x73 0x6D  = "\0asm" (magic number)
0x01 0x00 0x00 0x00  = version 1 (little-endian)
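A header check over an in-memory buffer might look like the sketch below. The function name `wasm_header_version` is an assumption for illustration. It assembles the version byte by byte, so it does not depend on the host's endianness.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Returns the module version, or -1 if the buffer does not start with a Wasm header.
int64_t wasm_header_version(const uint8_t *buf, size_t len) {
    static const uint8_t magic[4] = { 0x00, 0x61, 0x73, 0x6D };  // "\0asm"
    if (len < 8 || memcmp(buf, magic, 4) != 0) {
        return -1;
    }
    // Version is a fixed-width 32-bit little-endian integer (not LEB128)
    uint32_t version = (uint32_t)buf[4]
                     | ((uint32_t)buf[5] << 8)
                     | ((uint32_t)buf[6] << 16)
                     | ((uint32_t)buf[7] << 24);
    return (int64_t)version;
}
```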

Then come sections (each optional):

| ID | Section | Purpose |
|----|---------|---------|
| 1 | Type | Function signatures |
| 2 | Import | Imported functions, memories, globals |
| 3 | Function | Function index → type index mapping |
| 4 | Table | Function tables for indirect calls |
| 5 | Memory | Memory declarations |
| 6 | Global | Global variables |
| 7 | Export | Exported functions, memories |
| 8 | Start | Entry point function |
| 9 | Element | Table initialization |
| 10 | Code | Function bodies (actual instructions) |
| 11 | Data | Memory initialization |
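To see how the header and sections fit together, here is a hand-assembled 41-byte module that exports `add(i32, i32) -> i32`, plus a sketch that walks its sections. The names `ADD_WASM` and `list_section_ids` are assumptions for illustration, and the walker deliberately assumes every section size fits in a single LEB128 byte (under 128), which only holds for tiny modules like this one.

```c
#include <stdint.h>
#include <stddef.h>

// Each section is: 1-byte ID, LEB128 size, then `size` bytes of content.
static const uint8_t ADD_WASM[] = {
    0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,        // magic + version
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,  // Type:     1 type, (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                // Function: func 0 uses type 0
    0x07, 0x07, 0x01, 0x03, 'a', 'd', 'd', 0x00, 0x00,     // Export:   "add" -> func 0
    0x0A, 0x09, 0x01, 0x07, 0x00,                          // Code:     1 body, 7 bytes, no locals
    0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B                     //   local.get 0, local.get 1, i32.add, end
};

// Stores up to max_ids section IDs in ids; returns the number of sections seen,
// or -1 if the input is malformed or uses a multi-byte section size.
int list_section_ids(const uint8_t *buf, size_t len, uint8_t *ids, int max_ids) {
    if (len < 8) return -1;
    size_t pos = 8;  // skip the 8-byte header
    int count = 0;
    while (pos < len) {
        if (pos + 2 > len) return -1;             // need ID byte and size byte
        uint8_t id = buf[pos];
        uint8_t size = buf[pos + 1];
        if (size & 0x80) return -1;               // multi-byte LEB128: outside this sketch
        if (pos + 2 + (size_t)size > len) return -1;  // section runs past end of input
        if (count < max_ids) ids[count] = id;
        count++;
        pos += 2 + (size_t)size;                  // jump to the next section
    }
    return count;
}
```

Walking `ADD_WASM` visits sections 1, 3, 7, and 10 in that order; a full parser would replace the single-byte size read with a real LEB128 decoder.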

Module Structure

┌─────────────────────────────────────────────────────────────────┐
│                        WASM MODULE                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Imports   │  │   Exports   │  │        Types            │  │
│  │ (functions, │  │ (functions, │  │  (func signatures)      │  │
│  │  memory,    │  │  memory,    │  │  (param i32) (result    │  │
│  │  globals)   │  │  tables)    │  │   i32)                  │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                      Functions                          │    │
│  │  ┌──────────────────────────────────────────────────┐   │    │
│  │  │ func 0: (locals i32 i32) code: [0x20, 0x00, ...] │   │    │
│  │  │ func 1: (locals) code: [0x41, 0x2A, 0x0B, ...]   │   │    │
│  │  │ ...                                               │   │    │
│  │  └──────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Memory    │  │   Tables    │  │        Globals          │  │
│  │ (linear     │  │ (indirect   │  │  (mutable/immutable     │  │
│  │  memory)    │  │  calls)     │  │   variables)            │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Project 1: WAT Explorer - Reading and Writing WebAssembly Text

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: WAT + JavaScript
  • Alternative Programming Languages: Any language that can invoke Wasm
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: WebAssembly Basics / Text Format
  • Software or Tool: wat2wasm, WABT toolkit
  • Main Book: “WebAssembly: The Definitive Guide” by Brian Sletten

What you’ll build: A series of hand-written WAT programs that you compile to .wasm and run, learning every instruction category through direct experimentation.

Why it teaches WebAssembly: Before you can interpret or compile WebAssembly, you need to think in WebAssembly. Writing WAT by hand forces you to understand the stack machine, types, and control flow at a fundamental level.

Core challenges you’ll face:

  • Understanding stack-based computation → maps to how values flow through operations
  • Managing control flow without goto → maps to structured control (blocks, loops, br)
  • Working with linear memory → maps to manual memory management
  • Interfacing with JavaScript → maps to imports and exports

Key Concepts:

Difficulty: Beginner | Time estimate: 1 week | Prerequisites: Basic programming knowledge, understanding of binary/hex

Real world outcome:

;; fibonacci.wat - compute nth Fibonacci number
(module
  (func $fib (export "fib") (param $n i32) (result i32)
    (if (result i32) (i32.lt_s (local.get $n) (i32.const 2))
      (then (local.get $n))
      (else
        (i32.add
          (call $fib (i32.sub (local.get $n) (i32.const 1)))
          (call $fib (i32.sub (local.get $n) (i32.const 2)))
        )
      )
    )
  )
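)

// Running in JavaScript (Node.js ES module, after `wat2wasm fibonacci.wat -o fibonacci.wasm`)
import { readFile } from 'node:fs/promises';

const wasmBytes = await readFile('fibonacci.wasm');
const wasmModule = await WebAssembly.instantiate(wasmBytes);
console.log(wasmModule.instance.exports.fib(10)); // 55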

Implementation Hints:

Install the WABT toolkit:

# macOS
brew install wabt

# Ubuntu
apt install wabt

# Or build from source: https://github.com/WebAssembly/wabt

Basic WAT structure:

(module
  ;; Type section (usually implicit)
  (type $sig (func (param i32) (result i32)))

  ;; Import from host
  (import "console" "log" (func $log (param i32)))

  ;; Memory declaration
  (memory (export "memory") 1)  ;; 1 page = 64KB

  ;; Global variable
  (global $counter (mut i32) (i32.const 0))

  ;; Function definition
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add
  )

  ;; Export function
  (export "add" (func $add))
)

Key instruction categories to explore:

  1. Numeric operations: i32.add, i32.sub, i32.mul, i32.div_s, i32.rem_s
  2. Comparisons: i32.eq, i32.lt_s, i32.gt_u, i32.eqz
  3. Local variables: local.get, local.set, local.tee
  4. Global variables: global.get, global.set
  5. Memory: i32.load, i32.store, memory.size, memory.grow
  6. Control flow: block, loop, if, br, br_if, br_table, return
  7. Function calls: call, call_indirect

Control flow example (loop):

(func $sum_to_n (param $n i32) (result i32)
  (local $i i32)
  (local $sum i32)
  (local.set $i (i32.const 1))
  (local.set $sum (i32.const 0))

  (block $break
    (loop $continue
      ;; if i > n, break
      (br_if $break (i32.gt_s (local.get $i) (local.get $n)))

      ;; sum += i
      (local.set $sum (i32.add (local.get $sum) (local.get $i)))

      ;; i++
      (local.set $i (i32.add (local.get $i) (i32.const 1)))

      ;; continue loop
      (br $continue)
    )
  )
  (local.get $sum)
)

Convert and run:

# Convert WAT to WASM
wat2wasm program.wat -o program.wasm

# Inspect the binary
wasm-objdump -d program.wasm

# Run with Node.js
node -e "
const fs = require('fs');
const bytes = fs.readFileSync('program.wasm');
WebAssembly.instantiate(bytes).then(({instance}) => {
  console.log(instance.exports.add(2, 3));
});
"

Learning milestones:

  1. Simple arithmetic works → You understand the stack model
  2. Loops and conditionals work → You understand structured control flow
  3. Memory load/store works → You understand linear memory
  4. JS interop works → You understand the embedding API

Project 2: Binary Format Parser

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Python, Go
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Binary Parsing / File Formats
  • Software or Tool: None (pure C)
  • Main Book: “The Linux Programming Interface” by Michael Kerrisk (for file I/O patterns)

What you’ll build: A parser that reads .wasm binary files and decodes them into an in-memory representation, printing the module structure.

Why it teaches WebAssembly: The binary format IS WebAssembly. Understanding every byte—the LEB128 encoding, section structure, and instruction opcodes—gives you complete insight into what a .wasm file really contains.

Core challenges you’ll face:

  • Parsing LEB128 variable-length integers → maps to space-efficient encoding
  • Handling section structure → maps to module organization
  • Decoding type signatures → maps to function types
  • Reading instruction sequences → maps to bytecode structure

Key Concepts:

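  • LEB128 encoding: Used for nearly all integers in Wasm, such as sizes, counts, indices, and integer constants (the magic number and version are fixed-width)
  • Binary format spec: WebAssembly Binary Format
  • Section structure: Magic, version, then optional sections
  • Instruction encoding: Mostly single-byte opcodes; some instructions use a prefix byte (such as 0xFC or 0xFD) followed by a LEB128 sub-opcode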
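Since LEB128 shows up in nearly every part of the format, it helps to see the encoding direction as well as the decoding direction. The sketch below writes unsigned and signed LEB128 into a caller-provided buffer; the function names are assumptions for illustration, and the caller is assumed to supply at least 10 bytes (enough for any 64-bit value).

```c
#include <stdint.h>
#include <stddef.h>

// Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last byte.
size_t encode_leb128_u(uint64_t value, uint8_t *out) {
    size_t n = 0;
    do {
        uint8_t byte = (uint8_t)(value & 0x7F);  // low 7 bits
        value >>= 7;
        if (value != 0) byte |= 0x80;            // more bytes follow
        out[n++] = byte;
    } while (value != 0);
    return n;
}

// Signed LEB128: stop once the remaining bits are pure sign extension of bit 6.
size_t encode_leb128_s(int64_t value, uint8_t *out) {
    size_t n = 0;
    int more = 1;
    while (more) {
        uint8_t byte = (uint8_t)(value & 0x7F);
        // Arithmetic shift right by 7, written so it does not rely on
        // implementation-defined >> behaviour for negative values
        value = (value < 0) ? ~(~value >> 7) : (value >> 7);
        if ((value == 0 && !(byte & 0x40)) || (value == -1 && (byte & 0x40))) {
            more = 0;
        } else {
            byte |= 0x80;
        }
        out[n++] = byte;
    }
    return n;
}
```

As a worked example, 624485 encodes to the three bytes `E5 8E 26`, and -123456 encodes to `C0 BB 78`.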

Difficulty: Intermediate | Time estimate: 2 weeks | Prerequisites: Project 1 (WAT familiarity), C file I/O

Real world outcome:

$ ./wasm_parser add.wasm
WebAssembly Module:
  Magic: 0x6d736100 (\0asm)
  Version: 1

  Section 1 (Type): 1 entries
    [0] (i32, i32) -> i32

  Section 3 (Function): 1 entries
    [0] -> type 0

  Section 7 (Export): 1 entries
    "add" -> func 0

  Section 10 (Code): 1 entries
    [0] size=7 locals=0
      0x20 0x00     local.get 0
      0x20 0x01     local.get 1
      0x6a          i32.add
      0x0b          end

Implementation Hints:

LEB128 decoding (the key to reading Wasm):

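#include <stdint.h>
#include <stdio.h>

// Unsigned LEB128
uint64_t read_leb128_u(FILE *f, int *bytes_read) {
    uint64_t result = 0;
    int shift = 0;
    *bytes_read = 0;

    while (shift < 64) {                // A 64-bit value needs at most 10 bytes
        int c = fgetc(f);
        if (c == EOF) break;            // Truncated input: caller can check feof(f)
        uint8_t byte = (uint8_t)c;
        (*bytes_read)++;

        result |= ((uint64_t)(byte & 0x7F)) << shift;

        if ((byte & 0x80) == 0) break;  // High bit clear = last byte
        shift += 7;
    }
    return result;
}

// Signed LEB128
int64_t read_leb128_s(FILE *f, int *bytes_read) {
    uint64_t result = 0;  // Accumulate unsigned: shifting into a signed sign bit is undefined
    int shift = 0;
    uint8_t byte = 0;
    *bytes_read = 0;

    do {
        int c = fgetc(f);
        if (c == EOF) {                 // Truncated input
            byte = 0;
            break;
        }
        byte = (uint8_t)c;
        (*bytes_read)++;
        result |= ((uint64_t)(byte & 0x7F)) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);

    // Sign extend
    if ((shift < 64) && (byte & 0x40)) {
        result |= (~0ULL << shift);
    }
    return (int64_t)result;
}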

Module structure:

typedef struct {
    uint32_t magic;
    uint32_t version;

    // Type section
    FuncType *types;
    uint32_t type_count;

    // Import section
    Import *imports;
    uint32_t import_count;

    // Function section (maps func index to type index)
    uint32_t *func_types;
    uint32_t func_count;

    // Memory section
    Memory *memories;
    uint32_t memory_count;

    // Global section
    Global *globals;
    uint32_t global_count;

    // Export section
    Export *exports;
    uint32_t export_count;

    // Code section
    FuncBody *code;
    uint32_t code_count;

    // Data section
    DataSegment *data;
    uint32_t data_count;

} WasmModule;

Parsing flow:

WasmModule* parse_wasm(const char *filename) {
    FILE *f = fopen(filename, "rb");
    if (!f) {
        perror(filename);
        return NULL;
    }
    WasmModule *mod = calloc(1, sizeof(WasmModule));  // needs <stdlib.h>
    if (!mod) {
        fclose(f);
        return NULL;
    }

    // 1. Read header (reading straight into uint32_t assumes a little-endian host)
    if (fread(&mod->magic, 4, 1, f) != 1 ||
        fread(&mod->version, 4, 1, f) != 1 ||
        mod->magic != 0x6d736100) {
        fprintf(stderr, "Not a WebAssembly file\n");
        free(mod);
        fclose(f);
        return NULL;
    }

    // 2. Read sections
    int c;
    while ((c = fgetc(f)) != EOF) {
        uint8_t section_id = (uint8_t)c;

        int len_bytes;
        uint32_t section_size = (uint32_t)read_leb128_u(f, &len_bytes);

        long section_start = ftell(f);

        switch (section_id) {
            case 1: parse_type_section(f, mod); break;
            case 2: parse_import_section(f, mod); break;
            case 3: parse_function_section(f, mod); break;
            case 5: parse_memory_section(f, mod); break;
            case 7: parse_export_section(f, mod); break;
            case 10: parse_code_section(f, mod); break;
            // ... other sections
            default: break;  // Unknown or custom section: skipped below
        }

        // Always resume at the declared end of the section, even if a
        // section parser read too little or too much
        fseek(f, section_start + (long)section_size, SEEK_SET);
    }

    fclose(f);
    return mod;
}

Opcode reference (subset):

| Opcode | Instruction | Description |
|--------|-------------|-------------|
| 0x00 | unreachable | Trap immediately |
| 0x01 | nop | No operation |
| 0x02 | block | Begin block |
| 0x03 | loop | Begin loop |
| 0x04 | if | Begin if |
| 0x0B | end | End block/loop/if/func |
| 0x0C | br | Branch |
| 0x0D | br_if | Conditional branch |
| 0x10 | call | Call function |
| 0x20 | local.get | Get local variable |
| 0x21 | local.set | Set local variable |
| 0x41 | i32.const | Push i32 constant |
| 0x6A | i32.add | Add two i32s |
| 0x6B | i32.sub | Subtract |
| 0x6C | i32.mul | Multiply |
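The opcode subset above is enough for a toy disassembler. The following sketch covers only those opcodes and assumes every immediate fits in one byte (true for small indices and the empty block type 0x40); `opcode_name` and `disassemble` are hypothetical names, and a real tool would decode immediates with LEB128.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

// Returns the mnemonic for an opcode in the subset, or NULL if it is not covered.
// *has_imm is set to 1 when the opcode is followed by an immediate.
const char *opcode_name(uint8_t op, int *has_imm) {
    *has_imm = 0;
    switch (op) {
        case 0x00: return "unreachable";
        case 0x01: return "nop";
        case 0x02: *has_imm = 1; return "block";      // immediate: block type
        case 0x03: *has_imm = 1; return "loop";       // immediate: block type
        case 0x04: *has_imm = 1; return "if";         // immediate: block type
        case 0x0B: return "end";
        case 0x0C: *has_imm = 1; return "br";         // immediate: label index
        case 0x0D: *has_imm = 1; return "br_if";      // immediate: label index
        case 0x10: *has_imm = 1; return "call";       // immediate: function index
        case 0x20: *has_imm = 1; return "local.get";  // immediate: local index
        case 0x21: *has_imm = 1; return "local.set";  // immediate: local index
        case 0x41: *has_imm = 1; return "i32.const";  // immediate: constant
        case 0x6A: return "i32.add";
        case 0x6B: return "i32.sub";
        case 0x6C: return "i32.mul";
        default:   return NULL;
    }
}

// Prints one instruction per line; returns the instruction count,
// or -1 on an unknown opcode or a missing immediate.
int disassemble(const uint8_t *code, size_t len) {
    size_t pc = 0;
    int count = 0;
    while (pc < len) {
        int has_imm;
        const char *name = opcode_name(code[pc++], &has_imm);
        if (!name) return -1;
        if (has_imm) {
            if (pc >= len) return -1;
            printf("%s 0x%02x\n", name, (unsigned)code[pc++]);  // single-byte immediate assumed
        } else {
            printf("%s\n", name);
        }
        count++;
    }
    return count;
}
```

Feeding it the body bytes `20 00 20 01 6a 0b` from the parser output above yields four instructions: two `local.get`, one `i32.add`, and `end`.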

Learning milestones:

  1. Header parsing works → You understand the file structure
  2. LEB128 decodes correctly → You understand the encoding
  3. All sections parse → You understand module structure
  4. Instructions decode → You’re ready to interpret!

Project 3: WebAssembly Validator

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, OCaml
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Type Systems / Static Analysis
  • Software or Tool: None (builds on Project 2)
  • Main Book: “Types and Programming Languages” by Benjamin C. Pierce

What you’ll build: A validator that checks if a .wasm module is well-formed and type-safe, rejecting invalid modules before execution.

Why it teaches WebAssembly: Validation is the heart of WebAssembly’s safety. It ensures that stack operations are balanced, types match, every referenced index (function, local, global, memory) exists, and control flow is structured. Memory bounds are not part of validation; out-of-bounds accesses are caught at runtime by a trap. Together, these checks are what make WebAssembly trustworthy.

Core challenges you’ll face:

  • Type checking the stack → maps to ensuring operations get correct types
  • Validating control flow → maps to structured control integrity
  • Checking function signatures → maps to call site verification
  • Memory and global access validation → maps to index, mutability, and alignment checks (bounds are checked at runtime)

Key Concepts:

  • Validation algorithm: WebAssembly Validation Spec
  • Type checking: Stack-based type checking
  • Control flow: Block types, branch targets
  • Type inference: Determining result types of expressions
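Before building the full algorithm, the core idea of stack-based type checking can be tried on straight-line code. The sketch below tracks only types, never values, and handles just `i32.const`, `f32.const`, `i32.add`, `f32.add`, and `end`. All names (`VType`, `VResult`, `validate_body`) are assumptions for illustration, and `i32.const` is assumed to carry a single-byte immediate.

```c
#include <stdint.h>
#include <stddef.h>

typedef enum { T_I32, T_F32 } VType;
typedef enum { V_OK, V_UNDERFLOW, V_MISMATCH, V_BAD_RESULT, V_UNSUPPORTED } VResult;

// Validates a straight-line body that must leave exactly one value of type `result`.
VResult validate_body(const uint8_t *code, size_t len, VType result) {
    VType stack[32];
    int sp = 0;
    size_t pc = 0;

    while (pc < len) {
        uint8_t op = code[pc++];
        switch (op) {
            case 0x41:  // i32.const (single-byte immediate assumed)
                if (pc + 1 > len || sp == 32) return V_UNSUPPORTED;
                pc += 1;
                stack[sp++] = T_I32;
                break;

            case 0x43:  // f32.const (4-byte IEEE 754 immediate)
                if (pc + 4 > len || sp == 32) return V_UNSUPPORTED;
                pc += 4;
                stack[sp++] = T_F32;
                break;

            case 0x6A:    // i32.add
            case 0x92: {  // f32.add
                VType want = (op == 0x6A) ? T_I32 : T_F32;
                if (sp < 2) return V_UNDERFLOW;
                if (stack[sp - 1] != want || stack[sp - 2] != want) return V_MISMATCH;
                sp -= 1;  // two operands popped, one result of the same type pushed
                break;
            }

            case 0x0B:  // end
                if (sp != 1 || stack[0] != result) return V_BAD_RESULT;
                return V_OK;

            default:
                return V_UNSUPPORTED;
        }
    }
    return V_UNSUPPORTED;  // body is missing its final `end`
}
```

The body `i32.const 2, i32.const 3, i32.add, end` validates, while `i32.const 2, i32.add, end` reports an underflow and `i32.const 1, i32.const 2, f32.add, end` reports a type mismatch.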

Difficulty: Advanced | Time estimate: 2-3 weeks | Prerequisites: Project 2 (binary parser)

Real world outcome:

$ ./wasm_validator good.wasm
Validation passed ✓

$ ./wasm_validator bad_stack.wasm
Validation error: function 0, instruction at offset 12
  Expected i32 on stack, found empty stack
  Instruction: i32.add

$ ./wasm_validator bad_type.wasm
Validation error: function 1, instruction at offset 8
  Type mismatch: expected f32, got i32
  Instruction: f32.add

Implementation Hints:

Validation state:

#include <stdbool.h>

typedef enum {
    TYPE_I32,
    TYPE_I64,
    TYPE_F32,
    TYPE_F64,
    TYPE_V128,
    TYPE_FUNCREF,
    TYPE_EXTERNREF,
    TYPE_UNKNOWN  // For polymorphic stack (after unreachable)
} ValType;

typedef struct {
    ValType *types;
    int count;
    int capacity;
} TypeStack;

typedef enum { BLOCK, LOOP, IF } ControlKind;

// Declared as its own typedef so later code can write `ControlFrame *frame`
typedef struct {
    ControlKind kind;       // Lets loops be treated differently from blocks
    ValType *label_types;   // Types expected at branch
    int label_type_count;
    ValType *end_types;     // Types at block end
    int end_type_count;
    int height;             // Stack height at block start
    bool unreachable;       // After unreachable instruction
} ControlFrame;

typedef struct {
    TypeStack stack;
    int control_depth;
    ControlFrame *control_stack;
    int control_count;
} ValidationContext;

Stack operations:

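void push_type(ValidationContext *ctx, ValType type) {
    if (ctx->stack.count >= ctx->stack.capacity) {
        // Start at 16 so a zero-initialized stack can grow (0 * 2 would stay 0)
        int new_capacity = ctx->stack.capacity ? ctx->stack.capacity * 2 : 16;
        ValType *grown = realloc(ctx->stack.types,
                                 new_capacity * sizeof(ValType));
        if (!grown) {
            validation_error("Out of memory");
            return;
        }
        ctx->stack.types = grown;
        ctx->stack.capacity = new_capacity;
    }
    ctx->stack.types[ctx->stack.count++] = type;
}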

ValType pop_type(ValidationContext *ctx, ValType expected) {
    ControlFrame *frame = &ctx->control_stack[ctx->control_count - 1];

    if (ctx->stack.count == frame->height) {
        if (frame->unreachable) {
            return expected;  // Polymorphic: accept anything
        }
        validation_error("Stack underflow");
    }

    ValType actual = ctx->stack.types[--ctx->stack.count];

    if (expected != TYPE_UNKNOWN && actual != TYPE_UNKNOWN &&
        expected != actual) {
        validation_error("Type mismatch: expected %s, got %s",
                        type_name(expected), type_name(actual));
    }

    return actual;
}

Validating instructions:

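// imm is the instruction's decoded index immediate (local or label index), if any.
// block_types/block_type_count are the decoded block result types (block/loop/if only).
void validate_instruction(ValidationContext *ctx, uint8_t opcode, uint32_t imm,
                          ValType *block_types, int block_type_count) {
    switch (opcode) {
        case 0x6A:  // i32.add
            pop_type(ctx, TYPE_I32);
            pop_type(ctx, TYPE_I32);
            push_type(ctx, TYPE_I32);
            break;

        case 0x41:  // i32.const
            push_type(ctx, TYPE_I32);
            break;

        case 0x20: {  // local.get (braces: C needs a scope for declarations after a label)
            ValType local_type = get_local_type(ctx, imm);
            push_type(ctx, local_type);
            break;
        }

        case 0x02:  // block
            push_control_frame(ctx, BLOCK, block_types, block_type_count);
            break;

        case 0x0C: {  // br
            if (imm >= (uint32_t)ctx->control_count) {
                validation_error("Unknown label");
                break;
            }
            ControlFrame *target = &ctx->control_stack[ctx->control_count - 1 - imm];
            pop_types(ctx, target->label_types, target->label_type_count);
            set_unreachable(ctx);
            break;
        }

        case 0x0B:  // end
            pop_control_frame(ctx);
            break;

        // ... 200+ more instructions
    }
}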

Control flow validation:

void push_control_frame(ValidationContext *ctx, ControlKind kind,
                        ValType *types, int type_count) {
    ControlFrame frame = {
        .kind = kind,
        .label_types = types,
        .label_type_count = type_count,
        .end_types = types,
        .end_type_count = type_count,
        .height = ctx->stack.count,
        .unreachable = false
    };

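    // A branch to a loop label jumps back to the loop start, so it expects the
    // loop's parameter types; those are empty for block types without parameters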
    if (kind == LOOP) {
        frame.label_types = NULL;
        frame.label_type_count = 0;
    }

    ctx->control_stack[ctx->control_count++] = frame;
}

void pop_control_frame(ValidationContext *ctx) {
    ControlFrame *frame = &ctx->control_stack[ctx->control_count - 1];

    // Check that stack matches expected end types
    pop_types(ctx, frame->end_types, frame->end_type_count);

    if (ctx->stack.count != frame->height) {
        validation_error("Stack height mismatch at block end");
    }

    ctx->control_count--;

    // Push end types onto enclosing frame's stack
    push_types(ctx, frame->end_types, frame->end_type_count);
}

Learning milestones:

  1. Simple functions validate → Basic type checking works
  2. Control flow validates → Block/loop/if handling works
  3. Invalid modules rejected → Error detection works
  4. Matches reference validator → You’ve implemented the spec!

Project 4: WebAssembly Interpreter

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Virtual Machines / Interpreters
  • Software or Tool: None (builds on Project 2)
  • Main Book: “Crafting Interpreters” by Robert Nystrom

What you’ll build: A complete WebAssembly interpreter that can execute .wasm modules, implementing the stack machine, memory, and all core instructions.

Why it teaches WebAssembly: An interpreter is the most direct implementation of the WebAssembly semantics. Every instruction you implement deepens your understanding of what WebAssembly code actually does.

Core challenges you’ll face:

  • Implementing the value stack → maps to operand management
  • Implementing the call stack → maps to function invocation
  • Implementing memory operations → maps to load/store semantics
  • Implementing control flow → maps to branches and blocks

Key Concepts:

  • Execution semantics: WebAssembly Execution Spec
  • Stack machine interpretation: “Crafting Interpreters”, Part II - Robert Nystrom
  • Threaded code: Dispatch techniques for interpreters
  • Memory-mapped operations: Load/store with address calculation
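As a starting point for experimenting with dispatch techniques, here is a sketch of table-based dispatch in portable C: each opcode indexes an array of handler functions instead of going through a `switch`. This is a relative of threaded code rather than threaded code proper (direct-threaded interpreters commonly rely on the GCC/Clang "labels as values" extension). The `VM` type and handler names are assumptions for illustration, and the sketch assumes already-validated code that ends with `end` and uses single-byte `i32.const` immediates.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct VM {
    const uint8_t *code;
    size_t pc;
    int32_t stack[64];
    int sp;
    int running;
} VM;

typedef void (*Handler)(VM *vm);

static void op_i32_const(VM *vm) {
    uint8_t b = vm->code[vm->pc++];
    int32_t v = b & 0x7F;
    if (b & 0x40) v -= 128;  // sign-extend the 7-bit payload of a one-byte signed LEB128
    vm->stack[vm->sp++] = v;
}

static void op_i32_add(VM *vm) {
    uint32_t b = (uint32_t)vm->stack[--vm->sp];
    uint32_t a = (uint32_t)vm->stack[--vm->sp];
    vm->stack[vm->sp++] = (int32_t)(a + b);  // unsigned math gives wrapping semantics
}

static void op_i32_mul(VM *vm) {
    uint32_t b = (uint32_t)vm->stack[--vm->sp];
    uint32_t a = (uint32_t)vm->stack[--vm->sp];
    vm->stack[vm->sp++] = (int32_t)(a * b);
}

static void op_stop(VM *vm) { vm->running = 0; }

// Runs code until `end` (or an opcode this sketch does not know) and
// returns the value on top of the stack, or 0 if the stack is empty.
int32_t vm_run(const uint8_t *code) {
    Handler table[256];
    for (int i = 0; i < 256; i++) table[i] = op_stop;  // unknown opcode: stop
    table[0x41] = op_i32_const;
    table[0x6A] = op_i32_add;
    table[0x6C] = op_i32_mul;
    table[0x0B] = op_stop;                             // end

    VM vm = { .code = code, .pc = 0, .sp = 0, .running = 1 };
    while (vm.running) {
        table[vm.code[vm.pc++]](&vm);  // one indirect call per instruction
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}
```

Timing this against an equivalent `switch` loop on the same bytecode is a simple way to see how much dispatch overhead matters on a given compiler and CPU.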

Difficulty: Advanced | Time estimate: 1 month | Prerequisites: Projects 2-3 (parser, validator)

Real world outcome:

$ ./wasm_interp factorial.wasm --invoke factorial 5
Result: 120

$ ./wasm_interp hello.wasm --invoke greet
Hello, WebAssembly!

$ ./wasm_interp fib.wasm --invoke fib 30
Result: 832040
Time: 0.8s

Implementation Hints:

Runtime data structures:

// Value on the stack
typedef struct {
    ValType type;
    union {
        int32_t i32;
        int64_t i64;
        float f32;
        double f64;
    } value;
} WasmValue;

// Call frame
typedef struct {
    WasmFunction *func;
    uint32_t pc;              // Program counter (instruction offset)
    uint32_t sp;              // Stack pointer (base of locals)
    WasmValue *locals;        // Local variables
    uint32_t local_count;
} CallFrame;

// Runtime state
typedef struct {
    WasmModule *module;

    // Value stack
    WasmValue *stack;
    uint32_t stack_top;
    uint32_t stack_capacity;

    // Call stack
    CallFrame *frames;
    uint32_t frame_count;
    uint32_t frame_capacity;

    // Linear memory
    uint8_t *memory;
    uint32_t memory_size;     // Current size in bytes
    uint32_t memory_max;      // Maximum size in pages

    // Globals
    WasmValue *globals;

    // Tables (for indirect calls)
    uint32_t *table;
    uint32_t table_size;

} WasmRuntime;

Main interpreter loop:

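// Here read_leb128_u/read_leb128_s are bytecode variants of the Project 2 decoders:
// they read from frame->func->code at frame->pc and advance frame->pc.
WasmValue interpret(WasmRuntime *rt, uint32_t func_idx, WasmValue *args) {
    CallFrame *frame = push_frame(rt, func_idx, args);

    while (1) {
        uint8_t opcode = frame->func->code[frame->pc++];

        // Cases that declare variables get their own { } scope: C does not allow a
        // declaration directly after a case label, and names like idx/val repeat.
        switch (opcode) {
            case 0x00:  // unreachable
                trap(rt, "unreachable executed");
                break;

            case 0x01:  // nop
                break;

            case 0x02:  // block
                // Read block type, push control frame
                break;

            case 0x0B:  // end
                // `end` closes blocks, loops, and ifs as well as the function body.
                // at_function_end() is a hypothetical helper that reports whether this
                // is the function's final `end` (for example, no control frames left).
                if (!at_function_end(rt, frame)) {
                    // End of block/loop/if: pop control frame
                    break;
                }
                // Falling off the end of a function behaves like return
                pop_frame(rt);
                if (rt->frame_count == 0) {
                    return pop_value(rt);
                }
                frame = &rt->frames[rt->frame_count - 1];
                break;

            case 0x0C: {  // br
                uint32_t label = read_leb128_u(frame);
                branch_to(rt, frame, label);
                break;
            }

            case 0x10: {  // call
                uint32_t callee = read_leb128_u(frame);
                frame = push_frame(rt, callee, pop_args(rt, callee));
                break;
            }

            case 0x0F:  // return
                pop_frame(rt);
                if (rt->frame_count == 0) {
                    return pop_value(rt);
                }
                frame = &rt->frames[rt->frame_count - 1];
                break;

            case 0x20: {  // local.get
                uint32_t idx = read_leb128_u(frame);
                push_value(rt, frame->locals[idx]);
                break;
            }

            case 0x21: {  // local.set
                uint32_t idx = read_leb128_u(frame);
                frame->locals[idx] = pop_value(rt);
                break;
            }

            case 0x41: {  // i32.const
                int32_t val = read_leb128_s(frame);
                push_i32(rt, val);
                break;
            }

            case 0x6A: {  // i32.add
                uint32_t b = (uint32_t)pop_i32(rt);
                uint32_t a = (uint32_t)pop_i32(rt);
                push_i32(rt, (int32_t)(a + b));  // Unsigned math wraps; signed overflow is undefined in C
                break;
            }

            case 0x28: {  // i32.load
                uint32_t align = read_leb128_u(frame);  // Alignment hint only; not needed for correctness
                uint32_t offset = read_leb128_u(frame);
                (void)align;
                // Compute the effective address in 64 bits so base + offset cannot wrap
                uint64_t addr = (uint64_t)(uint32_t)pop_i32(rt) + offset;
                if (addr + 4 > rt->memory_size) {
                    trap(rt, "out of bounds memory access");
                }
                // load_i32 reads byte by byte: no unaligned or host-endian access
                push_i32(rt, load_i32(rt, (uint32_t)addr));
                break;
            }

            // ... 200+ more instructions
        }
    }
}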

Memory operations:

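#include <stdlib.h>  // realloc
#include <string.h>  // memset

void store_i32(WasmRuntime *rt, uint32_t addr, int32_t value) {
    // 64-bit comparison: addr + 4 would wrap for addresses near 2^32
    if ((uint64_t)addr + 4 > rt->memory_size) {
        trap(rt, "out of bounds memory access");
    }
    // Little-endian store
    uint32_t bits = (uint32_t)value;
    rt->memory[addr] = bits & 0xFF;
    rt->memory[addr + 1] = (bits >> 8) & 0xFF;
    rt->memory[addr + 2] = (bits >> 16) & 0xFF;
    rt->memory[addr + 3] = (bits >> 24) & 0xFF;
}

int32_t load_i32(WasmRuntime *rt, uint32_t addr) {
    if ((uint64_t)addr + 4 > rt->memory_size) {
        trap(rt, "out of bounds memory access");
    }
    // Widen each byte to uint32_t first: shifting a promoted int by 24 can overflow
    uint32_t bits = (uint32_t)rt->memory[addr] |
                    ((uint32_t)rt->memory[addr + 1] << 8) |
                    ((uint32_t)rt->memory[addr + 2] << 16) |
                    ((uint32_t)rt->memory[addr + 3] << 24);
    return (int32_t)bits;
}

int32_t memory_grow(WasmRuntime *rt, uint32_t pages) {
    uint32_t old_pages = rt->memory_size / 65536;
    uint64_t new_pages = (uint64_t)old_pages + pages;

    // memory_size is a uint32_t byte count, so stay below 65536 pages (4 GiB)
    if (new_pages > rt->memory_max || new_pages > 65535) {
        return -1;  // Failure
    }
    if (pages == 0) {
        return (int32_t)old_pages;  // Nothing to allocate
    }

    uint8_t *grown = realloc(rt->memory, (size_t)new_pages * 65536);
    if (!grown) {
        return -1;  // Allocation failed; the old memory is still valid
    }
    rt->memory = grown;
    memset(rt->memory + rt->memory_size, 0, (size_t)pages * 65536);  // New pages start zeroed
    rt->memory_size = (uint32_t)(new_pages * 65536);

    return (int32_t)old_pages;
}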

Learning milestones:

  1. Simple functions run → Basic interpretation works
  2. Recursion works → Call stack is correct
  3. Memory operations work → Linear memory implemented
  4. Spec tests pass → You have a conformant interpreter!

Project 5: Simple Language → WebAssembly Compiler

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C (or language of choice)
  • Alternative Programming Languages: Rust, TypeScript, Python
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Compilers / Code Generation
  • Software or Tool: Builds on Project 2 (for binary output)
  • Main Book: “Writing a C Compiler” by Nora Sandler

What you’ll build: A compiler for a simple programming language (like a subset of C or a custom language) that outputs WebAssembly binary format.

Why it teaches WebAssembly: Compiling TO WebAssembly forces you to think about what instructions exist, how to express high-level constructs (if, while, functions) in Wasm’s structured control flow, and how to manage the stack and memory.

Core challenges you’ll face:

  • Parsing source language → maps to lexing and parsing
  • Type checking → maps to ensuring valid Wasm types
  • Control flow lowering → maps to blocks, loops, branches
  • Memory management → maps to stack allocation in linear memory

Key Concepts:

  • Compiler structure: “Writing a C Compiler”, Chapters 1-5 - Nora Sandler
  • Code generation for stack machines: How to emit Wasm instructions
  • Structured control flow: No arbitrary goto in Wasm
  • Binary encoding: Emitting valid .wasm files
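Because Wasm has no arbitrary goto, a `while` loop is usually expressed as a `loop` nested inside a `block`: branching to the loop label repeats, branching to the block label exits. The sketch below shows one common lowering, emitting raw bytes into a buffer. The `CodeBuf` type and the `emit`, `emit_all`, and `lower_while` names are assumptions for illustration, and the condition and body are assumed to be already-generated instruction bytes.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Hypothetical fixed-size output buffer; bytes past capacity are silently dropped.
typedef struct {
    uint8_t bytes[256];
    size_t len;
} CodeBuf;

static void emit(CodeBuf *b, uint8_t byte) {
    if (b->len < sizeof b->bytes) b->bytes[b->len++] = byte;
}

static void emit_all(CodeBuf *b, const uint8_t *src, size_t n) {
    if (n <= sizeof b->bytes - b->len) {
        memcpy(b->bytes + b->len, src, n);
        b->len += n;
    }
}

// Lowers `while (cond) body`. cond must leave one i32 on the stack; body must leave nothing.
void lower_while(CodeBuf *out, const uint8_t *cond, size_t cond_len,
                 const uint8_t *body, size_t body_len) {
    emit(out, 0x02); emit(out, 0x40);  // block (empty type): label 1 inside the loop, the exit
    emit(out, 0x03); emit(out, 0x40);  // loop  (empty type): label 0 inside the loop, the restart
    emit_all(out, cond, cond_len);     // evaluate the condition
    emit(out, 0x45);                   // i32.eqz: invert it
    emit(out, 0x0D); emit(out, 0x01);  // br_if 1: leave the block when the condition is false
    emit_all(out, body, body_len);     // loop body
    emit(out, 0x0C); emit(out, 0x00);  // br 0: jump back to the start of the loop
    emit(out, 0x0B);                   // end loop
    emit(out, 0x0B);                   // end block
}
```

With `local.get 0` as the condition and a single `nop` as the body, the output is the 14 bytes `02 40 03 40 20 00 45 0D 01 01 0C 00 0B 0B`. This is the same shape as the hand-written `$break`/`$continue` loop in Project 1.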

Difficulty: Expert | Time estimate: 1-2 months | Prerequisites: Projects 1-4, basic compiler knowledge

Real world outcome:

// mini-c source (subset of C)
int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

$ ./minicc factorial.mc -o factorial.wasm
Compiled successfully: factorial.wasm (127 bytes)

$ ./wasm_interp factorial.wasm --invoke factorial 5
Result: 120

Implementation Hints:

Language design (keep it simple!):

program     = function*
function    = type name "(" params ")" block
type        = "int" | "float"
params      = (type name ("," type name)*)?
args        = (expr ("," expr)*)?
block       = "{" statement* "}"
statement   = "return" expr ";"
            | "if" "(" expr ")" block ("else" block)?
            | "while" "(" expr ")" block
            | type name ("=" expr)? ";"
            | name "=" expr ";"
            | expr ";"
expr        = expr binop expr
            | "-" expr
            | name "(" args ")"
            | name
            | number
binop       = "+" | "-" | "*" | "/" | "==" | "!=" | "<" | ">" | "<=" | ">="
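Each `binop` in the grammar corresponds to a single Wasm instruction, so the mapping can be a plain lookup table. The sketch below covers the `int` case only and picks the signed instruction variants, on the assumption that the language's `int` is a signed 32-bit integer. The function name `i32_binop_opcode` is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Returns the i32 opcode for a source-level binary operator, or -1 if unknown.
int i32_binop_opcode(const char *op) {
    static const struct { const char *op; uint8_t opcode; } table[] = {
        { "+",  0x6A },  // i32.add
        { "-",  0x6B },  // i32.sub
        { "*",  0x6C },  // i32.mul
        { "/",  0x6D },  // i32.div_s
        { "==", 0x46 },  // i32.eq
        { "!=", 0x47 },  // i32.ne
        { "<",  0x48 },  // i32.lt_s
        { ">",  0x4A },  // i32.gt_s
        { "<=", 0x4C },  // i32.le_s
        { ">=", 0x4E },  // i32.ge_s
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(op, table[i].op) == 0) return table[i].opcode;
    }
    return -1;
}
```

A `float` operand type would need a second table with the `f32` opcodes, which is one reason the type checker has to run before code generation.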

AST structure:

typedef enum {
    NODE_FUNC,
    NODE_BLOCK,
    NODE_RETURN,
    NODE_IF,
    NODE_WHILE,
    NODE_VAR_DECL,
    NODE_ASSIGN,
    NODE_BINARY,
    NODE_UNARY,
    NODE_CALL,
    NODE_VAR,
    NODE_LITERAL
} NodeKind;

typedef struct ASTNode {
    NodeKind kind;
    Type type;
    union {
        struct { char *name; Type ret_type; Param *params; int param_count; struct ASTNode *body; } func;
        struct { struct ASTNode **stmts; int count; } block;
        struct { struct ASTNode *value; } return_stmt;
        struct { struct ASTNode *cond, *then_block, *else_block; } if_stmt;
        struct { struct ASTNode *cond, *body; } while_stmt;
        struct { char *name; struct ASTNode *init; } var_decl;
        struct { char *name; struct ASTNode *value; } assign;
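        struct { char *op; struct ASTNode *left, *right; } binary;
        struct { char *op; struct ASTNode *operand; } unary;  // for NODE_UNARY
        struct { char *name; struct ASTNode **args; int arg_count; } call;
        struct { char *name; } var;
        struct { int32_t int_val; double float_val; } literal;  // node->type selects the valid field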
    };
} ASTNode;

Code generation (recursive AST walk):

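void codegen_expr(Compiler *c, ASTNode *node) {
    switch (node->kind) {
        case NODE_LITERAL:
            // Integer literals only in this sketch
            emit_byte(c, 0x41);  // i32.const
            emit_leb128_s(c, node->literal.int_val);
            break;

        case NODE_VAR: {
            // Braces give the declaration its own scope; before C23 a
            // declaration cannot directly follow a case label
            int idx = lookup_local(c, node->var.name);
            emit_byte(c, 0x20);  // local.get
            emit_leb128_u(c, idx);
            break;
        }

        case NODE_BINARY:
            // Stack machine order: left operand, right operand, operator
            codegen_expr(c, node->binary.left);
            codegen_expr(c, node->binary.right);
            emit_binop(c, node->binary.op);
            break;

        case NODE_CALL: {
            // Arguments are pushed left to right, then the call consumes them
            for (int i = 0; i < node->call.arg_count; i++) {
                codegen_expr(c, node->call.args[i]);
            }
            int func_idx = lookup_function(c, node->call.name);
            emit_byte(c, 0x10);  // call
            emit_leb128_u(c, func_idx);
            break;
        }

        default:
            // Other node kinds (unary operators, statements) are not
            // handled by this function yet
            break;
    }
}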

void codegen_if(Compiler *c, ASTNode *node) {
    codegen_expr(c, node->if_stmt.cond);

    emit_byte(c, 0x04);  // if
    emit_byte(c, 0x40);  // void block type (no result)

    codegen_block(c, node->if_stmt.then_block);

    if (node->if_stmt.else_block) {
        emit_byte(c, 0x05);  // else
        codegen_block(c, node->if_stmt.else_block);
    }

    emit_byte(c, 0x0B);  // end
}

void codegen_while(Compiler *c, ASTNode *node) {
    // Wasm loops branch to the BEGINNING, so we need:
    // block $break
    //   loop $continue
    //     br_if $break (if !condition)
    //     body
    //     br $continue
    //   end
    // end

    emit_byte(c, 0x02);  // block
    emit_byte(c, 0x40);  // void

    emit_byte(c, 0x03);  // loop
    emit_byte(c, 0x40);  // void

    // Condition (negated for br_if to break)
    codegen_expr(c, node->while_stmt.cond);
    emit_byte(c, 0x45);  // i32.eqz (negate)
    emit_byte(c, 0x0D);  // br_if
    emit_leb128_u(c, 1); // break to outer block

    // Body
    codegen_block(c, node->while_stmt.body);

    // Loop back
    emit_byte(c, 0x0C);  // br
    emit_leb128_u(c, 0); // to loop start

    emit_byte(c, 0x0B);  // end loop
    emit_byte(c, 0x0B);  // end block
}
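`codegen_expr` calls `emit_binop` without showing it. A minimal sketch of the lookup behind it might be a table from operator text to i32 opcode, as below. The names `BinopEntry`, `I32_BINOPS`, and `i32_binop_opcode` are made up for this example; the signed variants (`div_s`, `lt_s`, and so on) are chosen on the assumption that mini-C `int` is a signed 32-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *op;     /* operator as written in the source language */
    uint8_t opcode;     /* Wasm opcode byte */
} BinopEntry;

static const BinopEntry I32_BINOPS[] = {
    { "+",  0x6A },  /* i32.add   */
    { "-",  0x6B },  /* i32.sub   */
    { "*",  0x6C },  /* i32.mul   */
    { "/",  0x6D },  /* i32.div_s */
    { "==", 0x46 },  /* i32.eq    */
    { "!=", 0x47 },  /* i32.ne    */
    { "<",  0x48 },  /* i32.lt_s  */
    { ">",  0x4A },  /* i32.gt_s  */
    { "<=", 0x4C },  /* i32.le_s  */
    { ">=", 0x4E },  /* i32.ge_s  */
};

/* Returns the opcode for op, or -1 if the operator is unknown. */
int i32_binop_opcode(const char *op) {
    size_t n = sizeof I32_BINOPS / sizeof I32_BINOPS[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(I32_BINOPS[i].op, op) == 0) {
            return I32_BINOPS[i].opcode;
        }
    }
    return -1;
}
```

An `emit_binop` built on this would emit the returned byte, and report a compile error on -1. Supporting `float` would need a second table with the corresponding floating-point opcodes, selected by the operand type.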

Emitting binary format:

void emit_module(Compiler *c) {
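    // Magic ("\0asm") and version 1. Both are fixed-width fields: the
    // version is a 4-byte little-endian integer (01 00 00 00), not LEB128
    emit_bytes(c, "\0asm", 4);
    emit_u32(c, 1);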

    // Type section
    emit_section(c, 1, emit_type_section);

    // Function section
    emit_section(c, 3, emit_function_section);

    // Export section
    emit_section(c, 7, emit_export_section);

    // Code section
    emit_section(c, 10, emit_code_section);
}
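Each section is encoded as an id byte, the body size as an unsigned LEB128 integer, and then the body. Because the size comes first, the body has to be produced before the header can be written. The sketch below shows one way to do that with a temporary buffer. `ByteBuf`, `buf_section`, and the other names are assumptions made for this example and are not the `Compiler` API used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-capacity byte buffer, large enough for this sketch only. */
typedef struct { uint8_t data[256]; size_t len; } ByteBuf;

static void buf_push(ByteBuf *b, uint8_t byte) {
    assert(b->len < sizeof b->data);
    b->data[b->len++] = byte;
}

/* Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last. */
void buf_leb128_u(ByteBuf *b, uint32_t v) {
    do {
        uint8_t byte = v & 0x7F;
        v >>= 7;
        if (v != 0) byte |= 0x80;
        buf_push(b, byte);
    } while (v != 0);
}

/* Signed LEB128: stop once the remaining bits are pure sign extension.
   Assumes >> on a negative int32_t is an arithmetic shift, which holds on
   mainstream compilers. */
void buf_leb128_s(ByteBuf *b, int32_t v) {
    for (;;) {
        uint8_t byte = v & 0x7F;
        v >>= 7;
        int done = (v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40));
        buf_push(b, done ? byte : (uint8_t)(byte | 0x80));
        if (done) break;
    }
}

typedef void (*SectionWriter)(ByteBuf *body);

/* Write the body into a scratch buffer, then emit id, size, body. */
void buf_section(ByteBuf *out, uint8_t id, SectionWriter write_body) {
    ByteBuf body = { .len = 0 };
    write_body(&body);
    buf_push(out, id);
    buf_leb128_u(out, (uint32_t)body.len);
    for (size_t i = 0; i < body.len; i++) buf_push(out, body.data[i]);
}

/* Example body: a type section with one function type, (i32) -> i32. */
void write_type_section(ByteBuf *body) {
    buf_leb128_u(body, 1);  /* number of types */
    buf_push(body, 0x60);   /* func type marker */
    buf_leb128_u(body, 1);  /* one parameter */
    buf_push(body, 0x7F);   /* i32 */
    buf_leb128_u(body, 1);  /* one result */
    buf_push(body, 0x7F);   /* i32 */
}
```

Calling `buf_section(&out, 1, write_type_section)` on an empty buffer produces the bytes `01 06 01 60 01 7F 01 7F`. Sections with known ids must appear in the order the format defines, which is why `emit_module` writes type, function, export, and code in that sequence.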

Learning milestones:

  1. Expressions compile → Basic code gen works
  2. Functions compile → Call/return works
  3. Control flow compiles → Structured control translation works
  4. Whole programs compile and run → You’ve built a real compiler!

Project 6: WASI Implementation

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: System Interfaces / Capability-Based Security
  • Software or Tool: Builds on Project 4 (interpreter)
  • Main Book: “The Linux Programming Interface” by Michael Kerrisk

What you’ll build: An implementation of WASI (WebAssembly System Interface) that allows WebAssembly modules to access files, environment variables, command-line arguments, and time.

Why it teaches WebAssembly: WASI shows how WebAssembly can run outside the browser. Understanding capability-based security and how system calls are exposed to sandboxed code is essential for server-side Wasm.

Core challenges you’ll face:

  • Implementing fd_read/fd_write → maps to file descriptor abstraction
  • Handling path virtualization → maps to directory sandboxing
  • Implementing args/environ → maps to process environment
  • Clock and random sources → maps to system resources

Key Concepts:

  • WASI specification: WASI Preview 1
  • Capability-based security: Pre-opened file descriptors
  • POSIX-like interface: Familiar but sandboxed
  • WIT definitions: Interface descriptions (WASI 0.2+)

Resources:

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Project 4 (interpreter), Unix systems programming

Real world outcome:

# Run a WASI program that reads a file
$ ./wasm_wasi cat.wasm --dir /tmp -- /tmp/hello.txt
Hello from the file!

# Run a program that uses args and environ
$ ./wasm_wasi env.wasm
ARGS: [env.wasm]
HOME=/home/user
PATH=/usr/bin:/bin

Implementation Hints:

WASI imports (Preview 1 - the stable API):

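// Host functions your runtime provides. Wasm modules import them from the
// "wasi_snapshot_preview1" module; pointer arguments are offsets into the
// module's linear memory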
typedef struct {
    // Process
    int32_t (*args_get)(int32_t argv, int32_t argv_buf);
    int32_t (*args_sizes_get)(int32_t argc_ptr, int32_t argv_buf_size_ptr);
    int32_t (*environ_get)(int32_t environ, int32_t environ_buf);
    int32_t (*environ_sizes_get)(int32_t count_ptr, int32_t buf_size_ptr);

    // Filesystem
    int32_t (*fd_read)(int32_t fd, int32_t iovs, int32_t iovs_len, int32_t nread_ptr);
    int32_t (*fd_write)(int32_t fd, int32_t iovs, int32_t iovs_len, int32_t nwritten_ptr);
    int32_t (*fd_close)(int32_t fd);
    int32_t (*fd_seek)(int32_t fd, int64_t offset, int32_t whence, int32_t newoffset_ptr);
    int32_t (*path_open)(int32_t fd, int32_t dirflags, int32_t path, int32_t path_len,
                         int32_t oflags, int64_t rights_base, int64_t rights_inheriting,
                         int32_t fdflags, int32_t opened_fd_ptr);

    // Clock
    int32_t (*clock_time_get)(int32_t id, int64_t precision, int32_t timestamp_ptr);

    // Random
    int32_t (*random_get)(int32_t buf, int32_t buf_len);

    // Exit
    void (*proc_exit)(int32_t code);

} WasiAPI;
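As an illustration of how one pair of these imports behaves, the sketch below implements `args_sizes_get` and `args_get` against a flat byte array. The guest first asks for the sizes, allocates space, and then asks the host to fill in a pointer array plus the NUL-terminated strings. The `Guest` struct, the `sketch_` prefixed names, and the helper functions are assumptions for this example; only the memory layout and the errno numbers (0 for success, 21 for fault) are taken from Preview 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_ERRNO_SUCCESS = 0, SKETCH_ERRNO_FAULT = 21 };

/* Hypothetical minimal runtime state for this example. */
typedef struct {
    uint8_t *memory;        /* guest linear memory */
    uint32_t memory_size;   /* size in bytes */
    char **args;
    int arg_count;
} Guest;

static void store_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* 64-bit arithmetic so addr + len cannot wrap around. */
static int in_bounds(const Guest *g, uint32_t addr, uint32_t len) {
    return (uint64_t)addr + len <= g->memory_size;
}

/* Writes the argument count and the total string bytes (including NULs). */
int32_t sketch_args_sizes_get(Guest *g, uint32_t argc_ptr, uint32_t buf_size_ptr) {
    if (!in_bounds(g, argc_ptr, 4) || !in_bounds(g, buf_size_ptr, 4)) {
        return SKETCH_ERRNO_FAULT;
    }
    uint32_t total = 0;
    for (int i = 0; i < g->arg_count; i++) {
        total += (uint32_t)strlen(g->args[i]) + 1;
    }
    store_le32(g->memory + argc_ptr, (uint32_t)g->arg_count);
    store_le32(g->memory + buf_size_ptr, total);
    return SKETCH_ERRNO_SUCCESS;
}

/* Fills argv[i] (guest pointers) and copies the strings into argv_buf. */
int32_t sketch_args_get(Guest *g, uint32_t argv_ptr, uint32_t argv_buf_ptr) {
    uint32_t cursor = argv_buf_ptr;
    for (int i = 0; i < g->arg_count; i++) {
        uint32_t n = (uint32_t)strlen(g->args[i]) + 1;
        uint32_t slot = argv_ptr + 4u * (uint32_t)i;
        if (!in_bounds(g, slot, 4) || !in_bounds(g, cursor, n)) {
            return SKETCH_ERRNO_FAULT;
        }
        store_le32(g->memory + slot, cursor);
        memcpy(g->memory + cursor, g->args[i], n);
        cursor += n;
    }
    return SKETCH_ERRNO_SUCCESS;
}
```

`environ_sizes_get` and `environ_get` follow the same two-call pattern, with `KEY=value` strings in place of arguments.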

File descriptor table:

typedef struct {
    int host_fd;          // The actual OS file descriptor
    char *path;           // Virtual path (for sandboxing)
    uint64_t rights;      // What operations are allowed
    uint64_t inheriting;  // Rights for children
    bool preopened;       // Was this given at startup?
} WasiFd;

typedef struct {
    WasiFd *fds;
    int fd_count;
    int fd_capacity;

    char **args;
    int argc;

    char **environ;
    int envc;

    char **preopened_dirs;
    int preopened_count;

} WasiContext;
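Before any preopens are added, guest descriptors 0, 1, and 2 conventionally map to the host's stdin, stdout, and stderr, which is what the first learning milestone depends on. A minimal sketch of setting up and growing the table could look like this. `FdTable`, `fd_table_init`, and `fd_table_add` are hypothetical names, the table is a trimmed copy of `WasiContext`, and the two right bits are placeholders rather than the values Preview 1 defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder right bits for this sketch; Preview 1 defines the real ones. */
#define SKETCH_RIGHT_READ  ((uint64_t)1 << 0)
#define SKETCH_RIGHT_WRITE ((uint64_t)1 << 1)

typedef struct {
    int host_fd;
    char *path;
    uint64_t rights;
    uint64_t inheriting;
    bool preopened;
} WasiFd;

typedef struct {
    WasiFd *fds;
    int fd_count;
    int fd_capacity;
} FdTable;

/* Append an entry, doubling the capacity when full.
   Returns the guest fd number, or -1 if allocation fails. */
int fd_table_add(FdTable *t, WasiFd entry) {
    if (t->fd_count == t->fd_capacity) {
        int new_cap = t->fd_capacity ? t->fd_capacity * 2 : 4;
        WasiFd *grown = realloc(t->fds, (size_t)new_cap * sizeof *grown);
        if (grown == NULL) return -1;
        t->fds = grown;
        t->fd_capacity = new_cap;
    }
    t->fds[t->fd_count] = entry;
    return t->fd_count++;
}

/* Guest fds 0..2 map to host stdin, stdout, stderr. Returns 0 on success. */
int fd_table_init(FdTable *t) {
    t->fds = NULL;
    t->fd_count = 0;
    t->fd_capacity = 0;
    for (int fd = 0; fd < 3; fd++) {
        WasiFd entry = {
            .host_fd = fd,
            .path = NULL,
            .rights = (fd == 0) ? SKETCH_RIGHT_READ : SKETCH_RIGHT_WRITE,
            .inheriting = 0,
            .preopened = false,
        };
        if (fd_table_add(t, entry) != fd) {
            free(t->fds);
            t->fds = NULL;
            return -1;
        }
    }
    return 0;
}
```

With this layout the first preopened directory receives guest fd 3, which is where WASI programs usually start looking for preopens.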

Implementing fd_write:

int32_t wasi_fd_write(WasiContext *ctx, WasmRuntime *rt,
                       int32_t fd, int32_t iovs_ptr, int32_t iovs_len,
                       int32_t nwritten_ptr) {
    // Validate fd
    if (fd < 0 || fd >= ctx->fd_count || ctx->fds[fd].host_fd < 0) {
        return WASI_ERRNO_BADF;
    }

    // Check rights
    if (!(ctx->fds[fd].rights & WASI_RIGHT_FD_WRITE)) {
        return WASI_ERRNO_NOTCAPABLE;
    }

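    // Read iovec array from linear memory.
    // NOTE: every guest pointer (iov_addr, buf_ptr, nwritten_ptr) must be
    // bounds-checked against the linear memory size before it is used;
    // otherwise a module can make the host read outside the sandbox.
    uint32_t total_written = 0;
    for (uint32_t i = 0; i < (uint32_t)iovs_len; i++) {
        uint32_t iov_addr = iovs_ptr + i * 8;  // sizeof(wasi_ciovec_t) = 8
        uint32_t buf_ptr = load_i32(rt, iov_addr);
        uint32_t buf_len = load_i32(rt, iov_addr + 4);

        // Write from linear memory to host fd
        ssize_t n = write(ctx->fds[fd].host_fd,
                          rt->memory + buf_ptr,
                          buf_len);

        if (n < 0) {
            return errno_to_wasi(errno);
        }
        total_written += n;
        if ((uint32_t)n < buf_len) {
            break;  // Short write: report what was written so far
        }
    }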

    // Write result
    store_i32(rt, nwritten_ptr, total_written);
    return WASI_ERRNO_SUCCESS;
}
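The `fd_write` above trusts `buf_ptr` and `buf_len` as read from the guest. Since those values are chosen by untrusted code, a host function should check that each range lies inside linear memory before dereferencing it. The sketch below shows one way to write that check. `GuestMemory`, `guest_range`, and `guest_read_iovec` are hypothetical names, and the `memory_size` field is an assumption about what the runtime tracks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of linear memory; the runtime may name these differently. */
typedef struct {
    uint8_t *memory;
    uint32_t memory_size;   /* current size in bytes */
} GuestMemory;

/* Returns a host pointer to [addr, addr + len), or NULL if any byte of the
   range lies outside linear memory. */
uint8_t *guest_range(const GuestMemory *m, uint32_t addr, uint32_t len) {
    /* Add in 64 bits so that addr + len cannot wrap around. */
    if ((uint64_t)addr + (uint64_t)len > (uint64_t)m->memory_size) {
        return NULL;
    }
    return m->memory + addr;
}

static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads one 8-byte iovec {buf_ptr, buf_len} and validates the buffer it
   describes. Returns 1 on success, 0 if anything is out of bounds. */
int guest_read_iovec(const GuestMemory *m, uint32_t iov_addr,
                     uint32_t *buf_ptr, uint32_t *buf_len) {
    const uint8_t *iov = guest_range(m, iov_addr, 8);
    if (iov == NULL) return 0;
    *buf_ptr = load_le32(iov);
    *buf_len = load_le32(iov + 4);
    return guest_range(m, *buf_ptr, *buf_len) != NULL;
}
```

A host function would return a fault errno to the guest when either helper fails, rather than trapping the whole instance.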

Path sandboxing:

// WASI uses capability-based security: modules can only access
// directories that were explicitly given to them at startup

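char* resolve_path(WasiContext *ctx, int32_t dirfd, const char *path) {
    if (dirfd < 0 || dirfd >= ctx->fd_count) {
        return NULL;
    }

    WasiFd *dir = &ctx->fds[dirfd];
    if (!dir->preopened) {
        return NULL;  // Simplification: only preopened directories are accepted
    }

    // Paths must be relative to the directory fd
    if (path[0] == '/') {
        return NULL;
    }

    // Prevent path traversal. This substring test is deliberately
    // conservative: it also rejects harmless names such as "a..b", and it
    // does not stop symlinks that point outside the directory.
    if (strstr(path, "..") != NULL) {
        return NULL;  // No escaping the sandbox!
    }

    // Note: dir->path is the guest-visible name. If it differs from the
    // host directory, open relative to dir->host_fd with openat() instead.
    size_t len = strlen(dir->path) + strlen(path) + 2;
    char *full_path = malloc(len);
    if (full_path == NULL) {
        return NULL;
    }
    snprintf(full_path, len, "%s/%s", dir->path, path);

    return full_path;  // Caller frees
}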
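A substring search for `..` is a blunt instrument. A somewhat more precise check walks the path one component at a time and tracks how deep it is relative to the starting directory; the path escapes only if that depth ever goes negative. The function below is a sketch of this idea under the assumption that the path uses `/` separators. `path_stays_inside` is a made-up name, and the check is purely lexical, so it does not protect against symlinks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if a relative path never climbs above the directory it is
   resolved against, judging only by its components. */
bool path_stays_inside(const char *path) {
    if (path[0] == '/') {
        return false;               /* absolute paths bypass the directory fd */
    }
    int depth = 0;
    const char *p = path;
    while (*p != '\0') {
        const char *slash = strchr(p, '/');
        size_t n = slash ? (size_t)(slash - p) : strlen(p);

        if (n == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0) {
                return false;       /* climbed above the starting directory */
            }
        } else if (n == 0 || (n == 1 && p[0] == '.')) {
            /* empty component ("a//b") or ".": stays at the same depth */
        } else {
            depth++;                /* ordinary name: one level deeper */
        }

        p += n;
        if (*p == '/') p++;
    }
    return true;
}
```

With this check `a/../b` and `notes..bak` are accepted while `../etc/passwd` and `a/../../b` are rejected. Production runtimes combine a lexical check like this with opening each component relative to the directory's host descriptor, so that symlinks cannot redirect the lookup.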

Preopening directories:

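void wasi_preopen_dir(WasiContext *ctx, const char *host_path, const char *guest_path) {
    // Check for room first so that no host fd is leaked on failure
    if (ctx->fd_count >= ctx->fd_capacity) {
        fprintf(stderr, "fd table full, cannot preopen %s\n", host_path);
        return;
    }

    int host_fd = open(host_path, O_RDONLY | O_DIRECTORY);
    if (host_fd < 0) {
        fprintf(stderr, "Failed to preopen %s\n", host_path);
        return;
    }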

    WasiFd fd = {
        .host_fd = host_fd,
        .path = strdup(guest_path),
        .rights = WASI_RIGHT_FD_READ | WASI_RIGHT_FD_WRITE |
                  WASI_RIGHT_PATH_OPEN | WASI_RIGHT_FD_READDIR,
        .inheriting = WASI_RIGHT_FD_READ | WASI_RIGHT_FD_WRITE,
        .preopened = true
    };

    ctx->fds[ctx->fd_count++] = fd;
}

Learning milestones:

  1. stdout/stderr work → Basic fd_write works
  2. File reading works → fd_read and path_open work
  3. Args and environ work → Process interface works
  4. Real WASI programs run → Compatible implementation!

Project 7: JavaScript Embedding API

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: JavaScript + Your Interpreter
  • Alternative Programming Languages: TypeScript
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: JavaScript / FFI / Browser APIs
  • Software or Tool: Node.js or browser
  • Main Book: “JavaScript: The Good Parts” by Douglas Crockford

What you’ll build: A JavaScript wrapper around your interpreter (or use the browser’s WebAssembly API) that demonstrates bidirectional communication between JS and Wasm.

Why it teaches WebAssembly: The browser is WebAssembly’s primary home. Understanding how JavaScript loads modules, calls exported functions, handles memory, and provides imports is essential for web development with Wasm.

Core challenges you’ll face:

  • Loading and instantiating modules → maps to WebAssembly.instantiate()
  • Calling Wasm from JS → maps to exported functions
  • Calling JS from Wasm → maps to imported functions
  • Sharing memory → maps to ArrayBuffer/TypedArrays

Key Concepts:

  • WebAssembly JS API: MDN WebAssembly JavaScript Interface
  • Memory management: ArrayBuffer (SharedArrayBuffer for shared memories), TypedArrays
  • Async instantiation: Streaming compilation
  • Error handling: WebAssembly.CompileError, RuntimeError

Difficulty: Intermediate
Time estimate: 1 week
Prerequisites: Project 1 (WAT), JavaScript knowledge

Real world outcome:

// Bidirectional communication
const imports = {
    console: {
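        log: (ptr, len) => {
            // `memory` is assigned after instantiation (see below). That is fine
            // as long as the module does not call this import while instantiating
            const bytes = new Uint8Array(memory.buffer, ptr, len);
            console.log(new TextDecoder().decode(bytes));
        }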
    },
    math: {
        random: () => Math.random()
    }
};

const { instance } = await WebAssembly.instantiateStreaming(
    fetch('game.wasm'),
    imports
);

const memory = instance.exports.memory;
const result = instance.exports.calculate(10, 20);
console.log(`Result: ${result}`);

Implementation Hints:

Basic loading:

// Pick one of the three methods. They are shown together for comparison and
// would redeclare the same names if pasted into a single scope.

// Method 1: From ArrayBuffer
const response = await fetch('module.wasm');
const bytes = await response.arrayBuffer();
const { module, instance } = await WebAssembly.instantiate(bytes, imports);

// Method 2: Streaming (more efficient, compiles while downloading)
const { module, instance } = await WebAssembly.instantiateStreaming(
    fetch('module.wasm'),
    imports
);

// Method 3: Synchronous (Node.js or small modules)
// `fs` is Node's built-in file system module
const bytes = fs.readFileSync('module.wasm');
const module = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(module, imports);

Working with memory:

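// Wasm module with exported memory
// (instantiate() on raw bytes resolves to { module, instance })
const { instance } = await WebAssembly.instantiate(bytes, {});
const memory = instance.exports.memory;

// Read/write as typed arrays. Recreate these views after the memory grows:
// growing replaces memory.buffer and detaches views of the old buffer
const i32View = new Int32Array(memory.buffer);
const u8View = new Uint8Array(memory.buffer);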

// Write a string to Wasm memory
function writeString(memory, offset, str) {
    const bytes = new TextEncoder().encode(str);
    const view = new Uint8Array(memory.buffer);
    view.set(bytes, offset);
    view[offset + bytes.length] = 0;  // Null terminator
    return bytes.length;
}

// Read a string from Wasm memory
function readString(memory, offset, len) {
    const bytes = new Uint8Array(memory.buffer, offset, len);
    return new TextDecoder().decode(bytes);
}

Providing imports:

// Create the memory first so the callbacks below can read from it
const memory = new WebAssembly.Memory({ initial: 1, maximum: 10 });  // sizes in 64 KiB pages

const imports = {
    env: {
        // Provide memory
        memory,

        // Simple function import
        abort: () => { throw new Error('Wasm abort'); },

        // Function that reads from Wasm memory
        print: (ptr, len) => {
            const str = readString(memory, ptr, len);
            console.log(str);
        },

        // Function that returns a value
        get_time: () => Date.now(),

        // Imported table (for function pointers)
        table: new WebAssembly.Table({ initial: 10, element: 'anyfunc' }),
    }
};

Error handling:

try {
    const { instance } = await WebAssembly.instantiate(bytes, imports);
    instance.exports.might_fail();
} catch (e) {
    if (e instanceof WebAssembly.CompileError) {
        console.error('Invalid Wasm module:', e.message);
    } else if (e instanceof WebAssembly.LinkError) {
        console.error('Missing import:', e.message);
    } else if (e instanceof WebAssembly.RuntimeError) {
        console.error('Wasm trap:', e.message);  // e.g., out of bounds, unreachable
    } else {
        throw e;  // Re-throw non-Wasm errors
    }
}

Complete example - Image processing:

// Load Wasm module that processes images
const { instance } = await WebAssembly.instantiate(imageProcessorBytes, {});
const { memory, alloc, free, grayscale } = instance.exports;

async function processImage(imageData) {
    const { width, height, data } = imageData;

    // Allocate memory in Wasm
    const inputPtr = alloc(data.length);
    const outputPtr = alloc(data.length);

    try {
        // Copy image data to Wasm memory. Create the view after alloc():
        // if alloc grows the memory, older views of memory.buffer are detached
        new Uint8Array(memory.buffer).set(data, inputPtr);

        // Process (returns 0 on success)
        const result = grayscale(inputPtr, outputPtr, width, height);
        if (result !== 0) {
            throw new Error(`grayscale failed with code ${result}`);
        }

        // Copy result back (slice() copies, so the data survives free())
        const output = new Uint8ClampedArray(
            memory.buffer.slice(outputPtr, outputPtr + data.length)
        );
        return new ImageData(output, width, height);
    } finally {
        // Free Wasm memory, even if processing failed
        free(inputPtr);
        free(outputPtr);
    }
}

Learning milestones:

  1. Load and run simple module → Basic API works
  2. Call functions both directions → Imports/exports work
  3. Share memory correctly → TypedArrays work
  4. Handle errors gracefully → Robust integration

Project 8: JIT Compiler (Basic)

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, C++
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 5: Master
  • Knowledge Area: JIT Compilation / Machine Code Generation
  • Software or Tool: Understanding of target architecture (x86-64 or ARM64)
  • Main Book: “Engineering a Compiler, 2nd Edition” by Keith D. Cooper

What you’ll build: A basic JIT compiler that translates WebAssembly to native machine code at runtime, dramatically improving performance over interpretation.

Why it teaches WebAssembly: This is how production Wasm runtimes (V8, SpiderMonkey, Wasmtime) achieve near-native performance. Understanding JIT compilation reveals the true potential of WebAssembly as a compilation target.

Core challenges you’ll face:

  • Generating machine code → maps to target architecture knowledge
  • Register allocation → maps to moving from stack to registers
  • Calling conventions → maps to interop with native code
  • Memory mapping for execution → maps to mmap with PROT_EXEC

Key Concepts:

  • JIT fundamentals: Compile at runtime, execute immediately
  • x86-64 instruction encoding: How machine code is structured
  • Calling conventions: System V AMD64 ABI (Linux/macOS) or Windows x64
  • Memory protection: Allocating executable memory

Resources:

Difficulty: Master
Time estimate: 2-3 months
Prerequisites: All previous projects, assembly language knowledge

Real world outcome:

$ ./wasm_jit factorial.wasm --invoke factorial 20
Result: 2432902008176640000
Time: 0.001s (vs 0.5s interpreted)

$ ./wasm_jit benchmark.wasm
Interpreted: 1234ms
JIT compiled: 45ms
Speedup: 27x

Implementation Hints:

Allocating executable memory:

#include <sys/mman.h>

void* alloc_executable(size_t size) {
    void *mem = mmap(NULL, size,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS,
                     -1, 0);
    if (mem == MAP_FAILED) {
        return NULL;
    }
    return mem;
}

Simple x86-64 code emission:

typedef struct {
    uint8_t *code;
    size_t size;
    size_t capacity;
} CodeBuffer;

void emit8(CodeBuffer *buf, uint8_t byte) {
    buf->code[buf->size++] = byte;
}

void emit32(CodeBuffer *buf, int32_t val) {
    emit8(buf, val & 0xFF);
    emit8(buf, (val >> 8) & 0xFF);
    emit8(buf, (val >> 16) & 0xFF);
    emit8(buf, (val >> 24) & 0xFF);
}

// mov rax, imm64
void emit_mov_rax_imm64(CodeBuffer *buf, int64_t val) {
    emit8(buf, 0x48);  // REX.W prefix
    emit8(buf, 0xB8);  // mov rax, imm64
    for (int i = 0; i < 8; i++) {
        emit8(buf, (val >> (i * 8)) & 0xFF);
    }
}

// add rax, rbx
void emit_add_rax_rbx(CodeBuffer *buf) {
    emit8(buf, 0x48);  // REX.W
    emit8(buf, 0x01);  // add r/m64, r64
    emit8(buf, 0xD8);  // ModRM: rax, rbx
}

// ret
void emit_ret(CodeBuffer *buf) {
    emit8(buf, 0xC3);
}

Stack-to-register mapping (simple version):

// Wasm uses a stack, x86-64 uses registers
// Simple approach: map Wasm stack positions to x86-64 stack

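void compile_i32_add(CodeBuffer *buf, int *stack_depth) {
    // rcx is used as scratch instead of rbx: rbx is callee-saved in the
    // System V ABI, so clobbering it would corrupt the caller's state
    emit8(buf, 0x59);  // pop rcx (second operand)
    emit8(buf, 0x58);  // pop rax (first operand)
    // add rax, rcx. This is a 64-bit add; the low 32 bits match i32.add,
    // so truncate when the result is read back as an i32
    emit8(buf, 0x48);  // REX.W
    emit8(buf, 0x01);  // add r/m64, r64
    emit8(buf, 0xC8);  // ModRM: rax, rcx
    emit8(buf, 0x50);  // push rax (result)

    (*stack_depth)--;  // Net effect: two pops, one push
}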

void compile_i32_const(CodeBuffer *buf, int32_t val, int *stack_depth) {
    // push immediate
    emit8(buf, 0x68);  // push imm32
    emit32(buf, val);

    (*stack_depth)++;
}

Basic JIT compilation loop:

typedef int64_t (*JittedFunc)(int64_t, int64_t);

JittedFunc jit_compile(WasmFunction *func) {
    CodeBuffer buf = { .code = alloc_executable(4096), .capacity = 4096 };
    if (buf.code == NULL) {
        return NULL;  // Could not map executable memory
    }

    // Function prologue
    emit8(&buf, 0x55);              // push rbp
    emit8(&buf, 0x48); emit8(&buf, 0x89); emit8(&buf, 0xE5);  // mov rbp, rsp

    // Save arguments to Wasm stack
    // rdi = arg0, rsi = arg1 (System V ABI)
    emit8(&buf, 0x57);  // push rdi
    emit8(&buf, 0x56);  // push rsi

    int stack_depth = 2;  // Two args pushed

    // Compile each instruction
    for (int i = 0; i < func->code_len; ) {
        uint8_t opcode = func->code[i++];

        switch (opcode) {
            case 0x41: {  // i32.const
                int32_t val = read_leb128_s_from(func->code, &i);
                compile_i32_const(&buf, val, &stack_depth);
                break;
            }

            case 0x6A:  // i32.add
                compile_i32_add(&buf, &stack_depth);
                break;

            case 0x20: {  // local.get
                uint32_t idx = read_leb128_u_from(func->code, &i);
                compile_local_get(&buf, idx, &stack_depth);
                break;
            }

            case 0x0F:  // return
            case 0x0B:  // end (at function level)
                // Pop result to rax
                emit8(&buf, 0x58);  // pop rax
                goto epilogue;
        }
    }

epilogue:
    // Function epilogue
    emit8(&buf, 0x48); emit8(&buf, 0x89); emit8(&buf, 0xEC);  // mov rsp, rbp
    emit8(&buf, 0x5D);  // pop rbp
    emit_ret(&buf);

    return (JittedFunc)buf.code;
}

// Usage
JittedFunc add = jit_compile(&module->functions[0]);
int64_t result = add(10, 20);  // Direct native call!
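`jit_compile` calls `compile_local_get` without defining it. Given the prologue above, which pushes `rdi` and `rsi` right after setting up `rbp`, local 0 sits at `[rbp - 8]` and local 1 at `[rbp - 16]`. Under that assumption, `local.get` can be compiled to a single `push` from the frame. The sketch below repeats `CodeBuffer` and `emit8` so that it stands alone; the limit of 16 locals is an artificial restriction that keeps the displacement within one signed byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *code;
    size_t size;
    size_t capacity;
} CodeBuffer;

static void emit8(CodeBuffer *buf, uint8_t byte) {
    assert(buf->size < buf->capacity);   /* sketch: no growth */
    buf->code[buf->size++] = byte;
}

/* local.get idx  =>  push qword [rbp - 8 * (idx + 1)]
   Assumes locals were pushed in index order directly after `mov rbp, rsp`. */
void compile_local_get(CodeBuffer *buf, uint32_t idx, int *stack_depth) {
    assert(idx < 16);                    /* -8 * 16 = -128 still fits in disp8 */
    int8_t disp = (int8_t)(-8 * (int)(idx + 1));

    emit8(buf, 0xFF);           /* opcode group: FF /6 is push r/m64 */
    emit8(buf, 0x75);           /* ModRM: mod=01 (disp8), reg=6, rm=rbp */
    emit8(buf, (uint8_t)disp);  /* displacement from rbp */

    (*stack_depth)++;
}
```

For index 0 this emits `FF 75 F8`, and for index 1 it emits `FF 75 F0`. Supporting more locals would mean switching to the 32-bit displacement form of the same instruction.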

Learning milestones:

  1. Simple functions compile → Basic code generation works
  2. 10x speedup over interpreter → JIT provides real benefit
  3. All instructions supported → Complete Wasm coverage
  4. Approaching native speed → You’ve built a real JIT!

Project 9: Multi-Memory and Threads

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 4: Expert
  • Knowledge Area: Concurrency / Memory Models
  • Software or Tool: pthreads
  • Main Book: “C++ Concurrency in Action” by Anthony Williams

What you’ll build: Extensions to your interpreter/JIT that support WebAssembly threads, shared memory, and atomic operations.

Why it teaches WebAssembly: Modern applications need concurrency. Understanding how Wasm handles shared memory, atomics, and thread safety shows the full power of the platform.

Core challenges you’ll face:

  • Shared memory between threads → maps to SharedArrayBuffer semantics
  • Atomic operations → maps to memory ordering and synchronization
  • Wait and notify → maps to thread coordination
  • Thread-safe instantiation → maps to concurrent module usage

Key Concepts:

  • Threads proposal: WebAssembly Threads
  • Atomics: Compare-and-swap, load/store ordering
  • Memory model: Sequential consistency vs relaxed ordering
  • Wait/Notify: Futex-like primitives

Difficulty: Expert
Time estimate: 3-4 weeks
Prerequisites: Projects 4-6, threading experience

Real world outcome:

;; Parallel sum using atomic add
(module
  (memory (export "mem") 1 1 shared)

  (func $parallel_add (param $index i32) (param $value i32)
    (i32.atomic.rmw.add
      (i32.mul (local.get $index) (i32.const 4))
      (local.get $value)
    )
    drop
  )

  (func $wait (param $addr i32) (param $expected i32) (param $timeout i64) (result i32)
    (memory.atomic.wait32
      (local.get $addr)
      (local.get $expected)
      (local.get $timeout)
    )
  )

  (func $notify (param $addr i32) (param $count i32) (result i32)
    (memory.atomic.notify
      (local.get $addr)
      (local.get $count)
    )
  )
)

Implementation Hints:

Atomic operations:

// Atomic load
int32_t atomic_load_i32(WasmRuntime *rt, uint32_t addr) {
    if (addr % 4 != 0) {
        trap(rt, "unaligned atomic access");
    }
    return __atomic_load_n((int32_t*)(rt->memory + addr), __ATOMIC_SEQ_CST);
}

// Atomic store
void atomic_store_i32(WasmRuntime *rt, uint32_t addr, int32_t value) {
    if (addr % 4 != 0) {
        trap(rt, "unaligned atomic access");
    }
    __atomic_store_n((int32_t*)(rt->memory + addr), value, __ATOMIC_SEQ_CST);
}

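// Atomic compare-and-swap
int32_t atomic_rmw_cmpxchg_i32(WasmRuntime *rt, uint32_t addr,
                               int32_t expected, int32_t replacement) {
    if (addr % 4 != 0) {
        trap(rt, "unaligned atomic access");
    }
    int32_t *ptr = (int32_t*)(rt->memory + addr);
    // On failure the builtin stores the value it found into `expected`;
    // on success `expected` already equals it. Either way it is the old value.
    __atomic_compare_exchange_n(ptr, &expected, replacement,
                                false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;  // Returns the old value
}

// Atomic add (returns old value)
int32_t atomic_rmw_add_i32(WasmRuntime *rt, uint32_t addr, int32_t value) {
    if (addr % 4 != 0) {
        trap(rt, "unaligned atomic access");
    }
    return __atomic_fetch_add((int32_t*)(rt->memory + addr),
                              value, __ATOMIC_SEQ_CST);
}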
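To see why the read-modify-write operations matter, it helps to run several host threads against the same cell, which is what guest threads calling `i32.atomic.rmw.add` amount to. The sketch below is a self-contained demonstration: four pthreads each add 1 to the same address 10,000 times using the same GCC/Clang builtin as above. The names `Job`, `worker`, and `run_parallel_add` are made up, and a static array stands in for shared linear memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

enum { THREADS = 4, ITERS = 10000 };

/* What each worker needs: the base of "linear memory" and a byte address. */
typedef struct {
    uint8_t *memory;
    uint32_t addr;
} Job;

static void *worker(void *arg) {
    Job *job = arg;
    int32_t *cell = (int32_t *)(job->memory + job->addr);
    for (int i = 0; i < ITERS; i++) {
        /* Same builtin the interpreter uses for i32.atomic.rmw.add */
        __atomic_fetch_add(cell, 1, __ATOMIC_SEQ_CST);
    }
    return NULL;
}

/* Runs THREADS workers against one cell and returns its final value. */
int32_t run_parallel_add(void) {
    static int32_t cells[16];            /* 64 bytes, 4-byte aligned */
    uint8_t *memory = (uint8_t *)cells;  /* stand-in for shared linear memory */
    const uint32_t addr = 8;             /* multiple of 4, as atomics require */

    __atomic_store_n((int32_t *)(memory + addr), 0, __ATOMIC_SEQ_CST);

    Job job = { memory, addr };
    pthread_t threads[THREADS];
    for (int i = 0; i < THREADS; i++) {
        int err = pthread_create(&threads[i], NULL, worker, &job);
        assert(err == 0);
        (void)err;
    }
    for (int i = 0; i < THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return __atomic_load_n((int32_t *)(memory + addr), __ATOMIC_SEQ_CST);
}
```

The result is always `THREADS * ITERS`. Replacing the builtin with a plain `(*cell)++` makes the total unpredictable, because increments from different threads can overwrite each other.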

Wait and notify (futex-like):

#include <pthread.h>
#include <errno.h>
#include <time.h>   // clock_gettime, struct timespec

// Each memory address can have waiters
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int waiter_count;
} WaitQueue;

// memory.atomic.wait32
int32_t wasm_memory_wait32(WasmRuntime *rt, uint32_t addr,
                           int32_t expected, int64_t timeout_ns) {
    int32_t *ptr = (int32_t*)(rt->memory + addr);

    WaitQueue *queue = get_wait_queue(rt, addr);
    pthread_mutex_lock(&queue->mutex);

    // Check if value has already changed (atomic load: other threads may
    // be writing this address concurrently)
    if (__atomic_load_n(ptr, __ATOMIC_SEQ_CST) != expected) {
        pthread_mutex_unlock(&queue->mutex);
        return 1;  // "not-equal"
    }

    queue->waiter_count++;

    // Note: condition variables can wake spuriously. A complete
    // implementation loops on a per-queue wake counter before reporting "ok".
    int result;
    if (timeout_ns < 0) {
        // Wait forever
        pthread_cond_wait(&queue->cond, &queue->mutex);
        result = 0;  // "ok"
    } else {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        ts.tv_sec += timeout_ns / 1000000000;
        ts.tv_nsec += timeout_ns % 1000000000;
        if (ts.tv_nsec >= 1000000000) {
            // Keep tv_nsec below one second, or timedwait fails with EINVAL
            ts.tv_sec++;
            ts.tv_nsec -= 1000000000;
        }

        int err = pthread_cond_timedwait(&queue->cond, &queue->mutex, &ts);
        result = (err == ETIMEDOUT) ? 2 : 0;  // "timed-out" or "ok"
    }

    queue->waiter_count--;
    pthread_mutex_unlock(&queue->mutex);

    return result;
}

// memory.atomic.notify
int32_t wasm_memory_notify(WasmRuntime *rt, uint32_t addr, uint32_t count) {
    WaitQueue *queue = get_wait_queue(rt, addr);
    pthread_mutex_lock(&queue->mutex);

    // count is unsigned in Wasm, so 0xFFFFFFFF means "wake every waiter"
    uint32_t waiters = (uint32_t)queue->waiter_count;
    uint32_t woken = (count > waiters) ? waiters : count;

    if (woken == 0) {
        // Wake none
    } else if (woken == waiters) {
        pthread_cond_broadcast(&queue->cond);
    } else {
        for (uint32_t i = 0; i < woken; i++) {
            pthread_cond_signal(&queue->cond);
        }
    }

    pthread_mutex_unlock(&queue->mutex);
    return (int32_t)woken;
}

Learning milestones:

  1. Atomic loads/stores work → Basic atomics implemented
  2. Compare-and-swap works → Lock-free data structures possible
  3. Wait/notify coordinates threads → Synchronization works
  4. Parallel programs run correctly → Full threads support!

Project 10: Component Model Explorer

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: Rust or Python
  • Alternative Programming Languages: Any with Wasm tooling
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Module Systems / Interface Design
  • Software or Tool: wit-bindgen, wasm-tools
  • Main Book: N/A - Emerging technology

What you’ll build: Explore the WebAssembly Component Model by creating components with rich interfaces, composing them, and understanding how WIT (Wasm Interface Type), the Component Model’s interface definition language, enables language-agnostic linking of components.

Why it teaches WebAssembly: The Component Model is WebAssembly’s future—enabling true language interoperability and plug-in architectures. Understanding it now puts you ahead of the curve.

Core challenges you’ll face:

  • Understanding WIT syntax → maps to interface definition
  • Generating bindings → maps to wit-bindgen tooling
  • Composing components → maps to linking modules together
  • Working with complex types → maps to canonical ABI

Key Concepts:

  • WIT (Wasm Interface Type): WIT specification
  • Canonical ABI: How types are passed across component boundaries
  • Worlds: Complete interface specifications
  • Composition: Linking components together

Resources:

Difficulty: Advanced
Time estimate: 2 weeks
Prerequisites: Rust or Python, basic Wasm knowledge

Real world outcome:

// greeter.wit
package example:greeter;

interface greet {
    greet: func(name: string) -> string;
}

world greeter {
    export greet;
}

// Rust implementation
wit_bindgen::generate!({
    path: "greeter.wit",
    world: "greeter",
});

struct MyGreeter;

impl Guest for MyGreeter {
    fn greet(name: String) -> String {
        format!("Hello, {}!", name)
    }
}

export!(MyGreeter);

$ cargo build --target wasm32-wasi --release
$ wasm-tools component new target/wasm32-wasi/release/greeter.wasm \
    -o greeter.component.wasm
$ wasmtime run greeter.component.wasm --invoke greet "World"
Hello, World!

Implementation Hints:

WIT basics:

// Primitive types
type my-int = s32;         // signed 32-bit
type my-float = f64;       // 64-bit float (older WIT spelled this float64)
type my-string = string;   // UTF-8 string
type my-bool = bool;

// Records (like structs)
record point {
    x: f32,
    y: f32,
}

// Variants (tagged unions, like Rust enums with payloads)
// Named my-result because `result` is a built-in WIT type
variant my-result {
    ok(string),
    err(string),
}

// Lists
type numbers = list<s32>;

// Options
type maybe-int = option<s32>;

// Flags (bitfields)
flags permissions {
    read,
    write,
    execute,
}

// Resources (handles to host objects)
resource file {
    constructor(path: string);
    read: func(len: u32) -> list<u8>;
    write: func(data: list<u8>) -> u32;
}

Creating a component from core Wasm:

# 1. Compile to core Wasm
cargo build --target wasm32-wasi --release

# 2. Embed WIT and create component
wasm-tools component new \
    target/wasm32-wasi/release/mylib.wasm \
    --adapt wasi_snapshot_preview1.reactor.wasm \
    -o mylib.component.wasm

# 3. Inspect the component
wasm-tools component wit mylib.component.wasm

Composing components:

# Link two components together
wasm-tools compose \
    consumer.component.wasm \
    --definitions provider.component.wasm \
    -o composed.component.wasm

Learning milestones:

  1. Create WIT interface → Understand type system
  2. Generate bindings → wit-bindgen works
  3. Build and run component → Full workflow works
  4. Compose components → Language interop achieved

Comparison Table

| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. WAT Explorer | | 1 week | Stack machine basics | ⭐⭐⭐ |
| 2. Binary Parser | ⭐⭐ | 2 weeks | Binary format mastery | ⭐⭐⭐⭐ |
| 3. Validator | ⭐⭐⭐ | 2-3 weeks | Type system depth | ⭐⭐⭐ |
| 4. Interpreter | ⭐⭐⭐ | 1 month | Execution semantics | ⭐⭐⭐⭐⭐ |
| 5. Compiler | ⭐⭐⭐⭐ | 1-2 months | Code generation | ⭐⭐⭐⭐⭐ |
| 6. WASI | ⭐⭐⭐ | 2-3 weeks | System interface | ⭐⭐⭐⭐ |
| 7. JS Embedding | ⭐⭐ | 1 week | Web integration | ⭐⭐⭐⭐ |
| 8. JIT Compiler | ⭐⭐⭐⭐⭐ | 2-3 months | Native code gen | ⭐⭐⭐⭐⭐ |
| 9. Threads | ⭐⭐⭐⭐ | 3-4 weeks | Concurrency | ⭐⭐⭐⭐ |
| 10. Component Model | ⭐⭐⭐ | 2 weeks | Future of Wasm | ⭐⭐⭐⭐ |

Path A: Understanding WebAssembly (Conceptual)

  1. Project 1 (WAT) - Learn to think in Wasm
  2. Project 2 (Binary Parser) - See what’s in the file
  3. Project 7 (JS Embedding) - Use Wasm in the browser
  4. Project 10 (Component Model) - See where Wasm is heading

Path B: Building a Runtime (Implementation)

  1. Project 1 (WAT) - Understand the semantics
  2. Project 2 (Binary Parser) - Build the front-end
  3. Project 3 (Validator) - Ensure correctness
  4. Project 4 (Interpreter) - Execute code
  5. Project 6 (WASI) - Run real programs
  6. Project 8 (JIT) - Make it fast

Path C: Compiler Writer

  1. Project 1 (WAT) - Know your target
  2. Project 5 (Compiler) - Build a compiler
  3. Project 2 (Binary Parser) - Verify your output
  4. Project 4 (Interpreter) - Test your code

Final Project: WebAssembly Runtime Contribution

  • File: LEARN_WEBASSEMBLY_DEEP_DIVE.md
  • Main Programming Language: Varies
  • Alternative Programming Languages: Any
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Open Source / Systems Programming
  • Software or Tool: Git, GitHub
  • Main Book: N/A - Real-world contribution

What you’ll do: Contribute to an open-source WebAssembly project—whether it’s Wasmtime, wasm3, Emscripten, or the spec itself.

Ideas:

  1. Implement a missing opcode in a smaller runtime
  2. Improve documentation in the spec or a runtime
  3. Add a test case that catches an edge case
  4. Port WASI to a new platform
  5. Build a novel tool (debugger, profiler, optimizer)

Resources:


Resources Summary

Official Specifications

Tutorials & Guides

Reference Implementations

  • wac - C interpreter
  • wasm3 - Fast interpreter
  • Wasmtime - Production JIT runtime
  • WAMR - Embedded runtime

Tools

Books

  • “WebAssembly: The Definitive Guide” by Brian Sletten
  • “Programming WebAssembly with Rust” by Kevin Hoffman
  • “Crafting Interpreters” by Robert Nystrom (for interpreter techniques)
  • “Engineering a Compiler” by Cooper & Torczon (for JIT/compiler techniques)

Summary

| # | Project | Main Language |
|---|---|---|
| 1 | WAT Explorer | WAT + JavaScript |
| 2 | Binary Format Parser | C |
| 3 | WebAssembly Validator | C |
| 4 | WebAssembly Interpreter | C |
| 5 | Simple Language → Wasm Compiler | C |
| 6 | WASI Implementation | C |
| 7 | JavaScript Embedding API | JavaScript |
| 8 | JIT Compiler (Basic) | C |
| 9 | Multi-Memory and Threads | C |
| 10 | Component Model Explorer | Rust/Python |
| Final | Runtime Contribution | Varies |

WebAssembly represents a new era of portable, safe, fast code. By understanding it from the ground up—from bytes to JIT—you’ll be prepared to build the next generation of applications that run anywhere. 🚀