Project 6: Full-Stack WASM Toolchain (Capstone)

Integrate compiler, validator, interpreter, and debugger into a professional-grade WebAssembly toolchain


The Core Question You’re Answering

“How do all the pieces of a compilation pipeline fit together, from source code to running bytecode?”

This capstone project answers the fundamental question that every toolchain engineer must understand: how do disparate components (lexer, parser, type checker, code generator, validator, runtime, debugger) come together to form a cohesive system? You will discover that building individual tools is straightforward compared to making them work together seamlessly, with consistent error handling, shared data structures, and a unified user experience.

The deeper insight: a toolchain is not just a collection of tools, but an ecosystem where each component must respect contracts established by others, where error messages must trace through multiple layers, and where optimization at any stage must preserve the semantics guaranteed by earlier stages.


Concepts You Must Understand First

Before attempting this capstone, ensure you have mastery of these foundational concepts:

End-to-End Compilation Pipeline

The journey from source code to execution involves multiple transformations, each with specific responsibilities:

Source Text → Tokens → AST → Typed AST → IR → Optimized IR → Binary → Validated Binary → Instantiated Module → Execution

Each arrow represents a complete component with its own algorithms, data structures, and error handling.

Linker and Module Composition

Understanding how separate compilation units combine:

  • Symbol resolution: How do we find function $add when it is defined in another module?
  • Relocation: How do we patch addresses when the final memory layout is determined?
  • Import/Export matching: How do types and signatures align across module boundaries?
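
Import/export matching can be sketched in a few lines of C. This is a minimal illustration under stated assumptions: the FuncType and FuncExport structs, the function names, and the negative return codes are all hypothetical, and only function imports are covered (a function import must match the export's signature exactly).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Value types, using their WebAssembly binary encodings. */
typedef enum { VT_I32 = 0x7F, VT_I64 = 0x7E, VT_F32 = 0x7D, VT_F64 = 0x7C } ValType;

/* Hypothetical function signature: parameter and result type lists. */
typedef struct {
    const ValType *params;
    size_t         param_count;
    const ValType *results;
    size_t         result_count;
} FuncType;

/* Hypothetical export record of the providing module. */
typedef struct {
    const char     *name;
    const FuncType *type;
} FuncExport;

/* Two signatures match only if every parameter and result type is identical. */
bool functype_equal(const FuncType *a, const FuncType *b) {
    if (a->param_count != b->param_count || a->result_count != b->result_count)
        return false;
    for (size_t i = 0; i < a->param_count; i++)
        if (a->params[i] != b->params[i]) return false;
    for (size_t i = 0; i < a->result_count; i++)
        if (a->results[i] != b->results[i]) return false;
    return true;
}

/* Symbol resolution by name, followed by a signature check.
 * Returns the export index, -1 if the name is missing, -2 on a type mismatch. */
int resolve_func_import(const FuncExport *exports, size_t export_count,
                        const char *name, const FuncType *wanted) {
    for (size_t i = 0; i < export_count; i++) {
        if (strcmp(exports[i].name, name) == 0)
            return functype_equal(exports[i].type, wanted) ? (int)i : -2;
    }
    return -1;
}
```

Distinguishing "missing" from "mismatched" lets the linker report which contract was broken instead of a generic link failure.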

Debug Information (DWARF)

DWARF is a standardized debugging-information format that gives debuggers what they need to resolve source locations, variable names, type layouts, and more. Key concepts:

  • Line number tables: Map bytecode offsets to source locations
  • Variable location descriptions: Where is variable x at each program point?
  • Type definitions: What is the structure of a user-defined type?
  • Call frame information: How to unwind the stack for backtraces?
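
To make the line-table idea concrete, here is a small sketch of the lookup a debugger performs. It assumes a flat, sorted array of rows (the LineRow type and line_for_offset name are hypothetical); real DWARF stores the table as a compact state-machine program that must be decoded into rows like these first.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a simplified line table: code from `offset` up to the next
 * row's offset belongs to `line`. Illustrative stand-in for decoded DWARF rows. */
typedef struct {
    uint32_t offset;  /* byte offset into the code section */
    uint32_t line;    /* 1-based source line */
} LineRow;

/* Binary search for the last row whose offset is <= target.
 * Rows must be sorted by offset. Returns 0 when no row covers target. */
uint32_t line_for_offset(const LineRow *rows, size_t count, uint32_t target) {
    size_t lo = 0, hi = count;   /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].offset <= target)
            lo = mid + 1;        /* row starts at or before target: look right */
        else
            hi = mid;            /* row starts after target: look left */
    }
    return lo == 0 ? 0 : rows[lo - 1].line;
}
```

The same lookup serves both breakpoint reporting ("stopped at line N") and backtraces.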

Source Maps for Debugging

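Source maps translate generated-code positions back to source locations but carry no variable locations or type layouts, so a debugger cannot inspect variables with them alone. Understanding the tradeoff: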

  • DWARF: Full debugging experience, larger files, requires tool support
  • Source maps: Widely supported, but limited to location mapping

Optimization Passes

How transformations improve code while preserving semantics:

  • Analysis passes: Gather information (liveness, dominance, reaching definitions)
  • Transformation passes: Modify code based on analysis
  • Pass ordering: Some optimizations enable others; some conflict with each other
  • Phase ordering problem: No universally optimal ordering exists

Module Instantiation and Linking

The runtime process of preparing a module for execution:

  • Allocate linear memory
  • Create function table
  • Resolve imports from other instances
  • Initialize globals and data segments
  • Execute start function if present

Questions to Guide Your Design

Integration Architecture

  1. How should parser, compiler, and runtime share module representation?
    • Should each have its own representation, or share one?
    • What are the tradeoffs of mutable vs immutable module structures?
    • How do you handle incremental updates (e.g., adding a function in REPL)?
  2. How do you maintain error context across components?
    • When the validator rejects code, how do you trace back to source?
    • How do you aggregate errors from multiple sources?
    • What information must each component preserve for debugging?

Module Linking

  1. How do you implement multi-module linking?
    • Eager vs lazy resolution: when do you verify imports match exports?
    • How do you handle cyclic dependencies?
    • What metadata must accompany each module?
  2. How do you handle versioning and compatibility?
    • What happens when an import signature changes?
    • How do you detect breaking changes vs compatible extensions?

Debug Support

  1. How do you add debug information without breaking optimization?
    • DWARF preservation through optimization passes
    • When to regenerate debug info vs preserve it?
    • How do you handle inlined functions in stack traces?
  2. How do you make the debugger responsive during execution?
    • Polling vs interrupt-based breakpoint checking
    • Impact on performance when debugging is enabled
    • How to minimize overhead when debugging is disabled?

Optimization Strategy

  1. How do you choose optimization levels?
    • What constitutes -O1 vs -O2 vs -O3?
    • How do you balance compile time vs runtime performance?
    • What optimizations should always run regardless of level?
  2. How do you verify optimizations preserve semantics?
    • Differential testing: compare optimized vs unoptimized output
    • Formal verification: prove transformations correct
    • When to trust the optimizer vs verify exhaustively?

Thinking Exercise

Trace a program through every stage of your toolchain:

Consider this simple source program:

func factorial(n: i32) -> i32 {
    if n <= 1 { return 1; }
    return n * factorial(n - 1);
}

Now trace its journey:

  1. Lexer: What tokens are produced? How do you handle the <= operator vs < followed by =?

  2. Parser: What AST structure captures the recursive call? How is the if-expression vs if-statement distinction handled?

  3. Type Checker: How do you verify that factorial(n - 1) has type i32? What context must be maintained?

  4. Code Generator: How does the recursive call become a WASM call instruction? Where do locals come from?

  5. Binary Emitter: How is the LEB128 encoding of the function index generated? What about the block structure for if?

  6. Validator: What stack state exists at each instruction? How does if affect the control stack?

  7. Optimizer: Can the recursive call be tail-call optimized? What analysis determines this?

  8. Instantiator: How is memory allocated? What happens to the function when loaded?

  9. Debugger: If we set a breakpoint at return n * ..., what state should be visible? How do we show n’s value?

  10. Disassembler: How should the output look? Should we show the original source in comments?

Write out the complete transformation at each stage. This exercise reveals the data that must flow between components.
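
For the Binary Emitter step, the LEB128 encoding can be sketched as follows. The function names are illustrative, not part of any existing API. The unsigned form covers indices and sizes; the signed form covers i32.const immediates. The signed version assumes the compiler performs an arithmetic right shift on negative values, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode an unsigned 32-bit value as LEB128: 7 payload bits per byte,
 * low bits first, with the high bit set on every byte except the last.
 * `out` needs room for up to 5 bytes. Returns the number of bytes written. */
size_t encode_uleb128(uint32_t value, uint8_t *out) {
    size_t n = 0;
    do {
        uint8_t byte = value & 0x7F;
        value >>= 7;
        if (value != 0) byte |= 0x80;   /* more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

/* Signed variant. Stops once the remaining value is pure sign extension
 * of bit 6 of the byte just produced. */
size_t encode_sleb128(int32_t value, uint8_t *out) {
    size_t n = 0;
    int more = 1;
    while (more) {
        uint8_t byte = value & 0x7F;
        value >>= 7;   /* assumes arithmetic shift for negative values */
        if ((value == 0 && !(byte & 0x40)) || (value == -1 && (byte & 0x40)))
            more = 0;
        else
            byte |= 0x80;
        out[n++] = byte;
    }
    return n;
}
```

A function index below 128 therefore occupies a single byte, which is why small modules show call targets as one byte after the call opcode.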


The Interview Questions They’ll Ask

Toolchain Architecture

  1. “Explain how a linker resolves symbols across compilation units.”
    • Discuss symbol tables, relocation entries, and the two-pass algorithm
    • Address weak vs strong symbols
    • Explain dynamic vs static linking tradeoffs
  2. “How would you design a modular compiler that supports multiple source languages and target architectures?”
    • Discuss the role of intermediate representations
    • Explain how frontends and backends decouple
    • Address the M x N problem (M languages, N targets)
  3. “What data structures would you use to represent a WASM module in memory?”
    • Discuss arena allocation vs individual heap allocations
    • Address memory layout for cache efficiency
    • Explain immutability vs mutability tradeoffs

Compilation Pipeline

  1. “Walk me through what happens when you compile a + b * c.”
    • Lexing: tokens a, +, b, *, c
    • Parsing: precedence handling to get a + (b * c)
    • Type checking: ensuring operands are compatible
    • Code generation: evaluation order considerations
  2. “How does constant folding work, and what are its limitations?”
    • Explain compile-time evaluation
    • Discuss floating-point precision concerns
    • Address overflow behavior differences
  3. “Explain the difference between a syntax error and a semantic error.”
    • Syntax: violates grammar rules (parser catches)
    • Semantic: grammatically valid but violates meaning rules such as typing or scoping (type checker catches)
    • Give examples of each in your language

Debugging

  1. “How does a debugger implement breakpoints?”
    • Software breakpoints: instruction replacement
    • Hardware breakpoints: CPU debug registers
    • For interpreters: instruction dispatch interception
  2. “What information does DWARF encode, and why is it complex?”
    • Location information that changes as program executes
    • Type information for arbitrary user-defined types
    • Call frame information for stack unwinding
    • Discuss expression languages for variable locations
  3. “How would you implement step-over vs step-into for function calls?”
    • Step-into: stop at first instruction of callee
    • Step-over: set breakpoint at return address, continue
    • Handle recursive calls correctly

Optimization

  1. “What is the phase ordering problem in compilers?”
    • Some optimizations enable others (inlining enables constant propagation)
    • Some optimizations conflict (instruction scheduling vs register allocation)
    • No universally optimal ordering exists
  2. “How does dead code elimination work?”
    • Compute liveness analysis
    • Mark all live instructions
    • Remove unmarked instructions
    • Handle side effects correctly
  3. “What’s the difference between local and global optimization?”
    • Local: within a basic block
    • Global: across basic blocks within a function
    • Interprocedural: across function boundaries

Hints in Layers

Integration Challenges

Layer 1: If components are not communicating, check that they share the same module representation. A common mistake is having the compiler emit a different structure than the validator expects.

Layer 2: Error handling across components requires a unified error type. Consider an error that carries: source location, component that detected it, severity, and suggested fixes.

Layer 3: The key insight for integration is contracts. Document what each component promises (postconditions) and requires (preconditions). Validation failures often indicate contract violations.
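
One possible shape for the unified error type from Layer 2 is sketched below. Every name here (Diagnostic, Component, diagnostic_format, the output layout) is an assumption made for illustration; the point is only that each component fills in the same record, so the CLI can render errors uniformly.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { COMP_LEXER, COMP_PARSER, COMP_TYPECHECK, COMP_CODEGEN,
               COMP_VALIDATOR, COMP_RUNTIME } Component;
typedef enum { SEV_NOTE, SEV_WARNING, SEV_ERROR } Severity;

typedef struct {
    const char *file;
    uint32_t    line, column;   /* 1-based; 0 when unknown */
} SourceLoc;

/* The fields named in Layer 2: location, detecting component, severity, fix. */
typedef struct {
    Component   component;
    Severity    severity;
    SourceLoc   loc;
    const char *message;
    const char *hint;           /* suggested fix, or NULL */
} Diagnostic;

const char *component_name(Component c) {
    switch (c) {
    case COMP_LEXER:     return "lexer";
    case COMP_PARSER:    return "parser";
    case COMP_TYPECHECK: return "typecheck";
    case COMP_CODEGEN:   return "codegen";
    case COMP_VALIDATOR: return "validator";
    case COMP_RUNTIME:   return "runtime";
    }
    return "unknown";
}

const char *severity_name(Severity s) {
    switch (s) {
    case SEV_NOTE:    return "note";
    case SEV_WARNING: return "warning";
    case SEV_ERROR:   return "error";
    }
    return "unknown";
}

/* Render as "file:line:col: severity[component]: message", plus an optional
 * hint line. Returns the length the full text needs, as snprintf does. */
int diagnostic_format(const Diagnostic *d, char *buf, size_t size) {
    int n = snprintf(buf, size, "%s:%u:%u: %s[%s]: %s",
                     d->loc.file, (unsigned)d->loc.line, (unsigned)d->loc.column,
                     severity_name(d->severity), component_name(d->component),
                     d->message);
    if (d->hint != NULL && n >= 0 && (size_t)n < size)
        n += snprintf(buf + n, size - (size_t)n, "\n  hint: %s", d->hint);
    return n;
}
```

Because the validator works on binary offsets rather than source lines, it would need a line table or similar mapping to fill in loc; that dependency is exactly the kind of contract Layer 3 asks you to document.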

Debugger Implementation

Layer 1: Start with the simplest possible debugger: stop before every instruction, print it, wait for enter key. This proves your interpreter hook works.

Layer 2: Breakpoints need efficient lookup. A hash set of (function_index, instruction_offset) pairs works well. Don’t scan a list linearly on every instruction.

Layer 3: For step-over, you need to track call depth. When step-over is requested, note current depth, continue until depth returns to that level. Handle exceptions that unwind past the step point.
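
Layers 2 and 3 can be sketched together. The fixed-capacity open-addressing set and the StepOver struct below are illustrative assumptions (no resizing, no deletion), not a prescribed design; the interpreter hook would call bp_contains and step_over_should_stop before each instruction.

```c
#include <stdbool.h>
#include <stdint.h>

#define BP_CAPACITY 256   /* power of two; this sketch never resizes */

typedef struct {
    uint64_t keys[BP_CAPACITY];
    bool     used[BP_CAPACITY];
    unsigned count;
} BreakpointSet;

/* Pack (function_index, instruction_offset) into one 64-bit key. */
uint64_t bp_key(uint32_t func_index, uint32_t offset) {
    return ((uint64_t)func_index << 32) | offset;
}

/* Multiplicative hash, reduced to a slot index. */
unsigned bp_slot(uint64_t key) {
    return (unsigned)((key * 0x9E3779B97F4A7C15ull) >> 32) & (BP_CAPACITY - 1);
}

/* Insert with linear probing. Returns false when the set is too full. */
bool bp_add(BreakpointSet *set, uint32_t func_index, uint32_t offset) {
    uint64_t key = bp_key(func_index, offset);
    unsigned i = bp_slot(key);
    while (set->used[i]) {
        if (set->keys[i] == key) return true;       /* already present */
        i = (i + 1) & (BP_CAPACITY - 1);
    }
    if (set->count >= BP_CAPACITY / 2) return false; /* keep probe chains short */
    set->used[i] = true;
    set->keys[i] = key;
    set->count++;
    return true;
}

/* Expected O(1) membership test, run before every instruction. */
bool bp_contains(const BreakpointSet *set, uint32_t func_index, uint32_t offset) {
    uint64_t key = bp_key(func_index, offset);
    for (unsigned i = bp_slot(key); set->used[i]; i = (i + 1) & (BP_CAPACITY - 1))
        if (set->keys[i] == key) return true;
    return false;
}

/* Step-over state: remember the depth at which the request was made. */
typedef struct {
    bool     active;
    uint32_t target_depth;
} StepOver;

/* Stop once execution is back at (or, after an unwind, below) the recorded depth. */
bool step_over_should_stop(StepOver *s, uint32_t call_depth) {
    if (!s->active || call_depth > s->target_depth) return false;
    s->active = false;
    return true;
}
```

Using "less than or equal" for the depth comparison is what handles a trap or exception that unwinds past the frame where the step was requested.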

Optimizer Correctness

Layer 1: Before optimizing anything, write tests that compare optimized vs unoptimized output on many inputs. The optimizer is wrong if outputs differ.

Layer 2: Implement optimization passes as pure functions: transform(module) -> new_module. Never mutate in place during development. This makes debugging easier.

Layer 3: For complex optimizations, prove correctness on paper first. What invariant does the transformation preserve? Under what conditions is it safe to apply?

CLI Design

Layer 1: Use a consistent option style throughout. If compile -o output uses -o, then optimize -o output should too. Inconsistency frustrates users.

Layer 2: Return meaningful exit codes: 0 for success, 1 for user error (bad input), 2 for internal error (bug). Scripts depend on these.

Layer 3: Structured output (JSON, machine-readable) enables integration with editors and build systems. Consider --format=json options for all commands that produce output.


Books That Will Help

  • Engineering a Compiler (3rd ed.), Keith D. Cooper and Linda Torczon
    • Key topics: Complete compiler pipeline, optimization algorithms, code generation
    • Why it matters: The definitive practical guide to building production compilers; covers every phase in depth with modern techniques
  • Advanced C and C++ Compiling, Milan Stevanovic
    • Key topics: Linking, loading, library design, ABI
    • Why it matters: Deep dive into what happens after compilation; essential for understanding module linking
  • Low-Level Programming, Igor Zhirkov
    • Key topics: Assembly, memory, calling conventions
    • Why it matters: Grounds high-level compiler concepts in concrete machine reality
  • Practical Binary Analysis, Dennis Andriesse
    • Key topics: Reverse engineering, binary formats, disassembly
    • Why it matters: Understanding binaries from the consumer side; invaluable for debugger and disassembler implementation

Additional References

Resource                      Type           Focus Area
V8 WASM Compilation Pipeline  Documentation  How a production engine handles WASM compilation
Chrome DWARF Debugging        Blog Post      Modern WASM debugging with source maps and DWARF
Emscripten Debugging Guide    Documentation  Practical debugging flags and techniques
LLVM Architecture             Documentation  How a real toolchain organizes its components

Project Overview

Attribute        Value
Difficulty       Expert
Time Estimate    2-3 months
Languages        C (primary), Rust, Go
Prerequisites    Projects 1-5 completed
Main Reference   All previous project references
Knowledge Area   Toolchain Architecture, Software Integration

Learning Objectives

After completing this project, you will be able to:

  1. Architect a complete toolchain - Design cohesive tools that work together
  2. Implement a WASM validator - Verify modules conform to the specification
  3. Build a debugger - Step through WASM execution with inspection
  4. Create a disassembler - Convert binary back to readable WAT
  5. Add optimizations - Improve generated code quality
  6. Design CLI interfaces - Create professional command-line tools
  7. Write comprehensive tests - Ensure toolchain reliability

Conceptual Foundation

1. What Is a Toolchain?

A toolchain is a collection of tools that work together to transform source code into running programs:

┌──────────────────────────────────────────────────────────────────────────────┐
│                        Complete WebAssembly Toolchain                        │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Source (.mini)                                                              │
│       │                                                                      │
│       ▼                                                                      │
│  ┌─────────────┐                                                             │
│  │   Compiler  │ mywasmcc source.mini -o module.wasm                         │
│  │  (Project 4)│                                                             │
│  └──────┬──────┘                                                             │
│         │                                                                    │
│         ▼                                                                    │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                     │
│  │  Validator  │────▶│Disassembler │────▶│  Optimizer  │                     │
│  │   (NEW)     │     │   (NEW)     │     │   (NEW)     │                     │
│  └──────┬──────┘     └─────────────┘     └──────┬──────┘                     │
│         │                                       │                            │
│         ▼                                       ▼                            │
│  ┌─────────────┐                         ┌─────────────┐                     │
│  │ Interpreter │                         │  Debugger   │                     │
│  │ (Project 3) │                         │   (NEW)     │                     │
│  │ + WASI (P5) │                         └─────────────┘                     │
│  └─────────────┘                                                             │
│                                                                              │
│  CLI Interface:                                                              │
│  ──────────────────────────────────────────────────────────────────────────  │
│  mywasm compile source.mini -o module.wasm                                   │
│  mywasm validate module.wasm                                                 │
│  mywasm disasm module.wasm                                                   │
│  mywasm run module.wasm [args...]                                            │
│  mywasm debug module.wasm                                                    │
│  mywasm optimize module.wasm -o optimized.wasm                               │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

2. Why Build a Complete Toolchain?

Building individual tools teaches concepts. Building a toolchain teaches:

  • Integration: How tools communicate and share data
  • User experience: How developers actually use tools
  • Robustness: Handling edge cases across the full pipeline
  • Architecture: Designing for extensibility and maintenance

The professional outcome: You’ll have a portfolio piece that demonstrates mastery of WebAssembly and software engineering.

3. Toolchain Components Overview

Component       Purpose                Status
Compiler        Source → WASM          From Project 4
Interpreter     Execute WASM           From Project 3
WASI Runtime    System interface       From Project 5
Validator       Verify correctness     NEW
Disassembler    WASM → WAT             NEW
Debugger        Interactive execution  NEW
Optimizer       Improve code           NEW
Linker          Combine modules        STRETCH

4. The Validator: Ensuring Correctness

WASM validation ensures a module is well-formed before execution:

Validation Checks:

1. Structure validation
   - Magic number correct
   - Version supported
   - Known sections in correct order (custom sections may appear anywhere)
   - No duplicate sections (custom sections may repeat)

2. Type validation
   - All type indices in bounds
   - Function signatures valid
   - Block types well-formed

3. Function validation
   - Operand stack holds the expected types at every instruction
   - Types consistent along every control-flow path
   - All branches target valid labels
   - All calls reference valid functions

4. Memory/Table validation
   - Indices in bounds
   - Limits valid (min ≤ max)
   - Data segments reference a valid memory (bounds are checked at instantiation)

5. Import/Export validation
   - Import descriptors well-typed (presence is checked at instantiation)
   - Export names unique
   - Indices valid
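
The structure checks in group 1 can be sketched in C as a single pass over the section headers. The function names are hypothetical, and the sketch assumes the original section set (ids 1 to 11 in increasing order, custom section id 0 allowed anywhere); the later data count section (id 12), which sits before the code section, is not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode an unsigned LEB128 value of at most 32 bits, advancing *pos. */
bool read_uleb32(const uint8_t *buf, size_t len, size_t *pos, uint32_t *out) {
    uint32_t result = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (*pos >= len) return false;                      /* ran off the end */
        uint8_t byte = buf[(*pos)++];
        if (shift == 28 && (byte & 0x70)) return false;     /* does not fit in 32 bits */
        result |= (uint32_t)(byte & 0x7F) << shift;
        if (!(byte & 0x80)) { *out = result; return true; }
    }
    return false;                                           /* more than 5 bytes */
}

/* Check magic, version, section order, duplicates, and section sizes. */
bool check_structure(const uint8_t *buf, size_t len) {
    static const uint8_t preamble[8] =
        { 0x00, 0x61, 0x73, 0x6D,    /* "\0asm" */
          0x01, 0x00, 0x00, 0x00 };  /* version 1, little-endian */
    if (len < 8 || memcmp(buf, preamble, 8) != 0) return false;

    size_t  pos = 8;
    uint8_t last_id = 0;
    while (pos < len) {
        uint8_t  id = buf[pos++];
        uint32_t size;
        if (!read_uleb32(buf, len, &pos, &size)) return false;
        if (size > len - pos) return false;                 /* section overruns the file */
        if (id != 0) {                                      /* custom sections are exempt */
            if (id > 11 || id <= last_id) return false;     /* unknown, out of order, or duplicate */
            last_id = id;
        }
        pos += size;                                        /* skip the payload */
    }
    return true;
}
```

Requiring strictly increasing ids covers both the ordering rule and the no-duplicates rule with one comparison.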

Type checking algorithm (stack-based):

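validate_function(func):
    stack = []          # operand types, not values
    # The function body is the outermost frame; its "end" checks the results
    control_stack = [{kind: FUNC, result: func.results,
                      height: 0, unreachable: false}]

    # pop_expect(stack, t) pops one type and checks it equals t. It must not
    # pop below control_stack.top().height, except that in an unreachable
    # frame a missing operand is accepted as matching any type.

    for instruction in func.body:
        match instruction:
            case i32.const(n):
                push(stack, i32)

            case i32.add:
                pop_expect(stack, i32)
                pop_expect(stack, i32)
                push(stack, i32)

            case local.get(idx):
                # func.locals lists parameters first, then declared locals
                check(idx < len(func.locals))
                type = func.locals[idx].type
                push(stack, type)

            case local.set(idx):
                check(idx < len(func.locals))
                type = func.locals[idx].type
                pop_expect(stack, type)

            case block(result_type):
                control_stack.push({
                    kind: BLOCK,
                    result: result_type,
                    height: len(stack),
                    unreachable: false
                })

            case br(depth):
                # Depth 0 is the innermost frame, so index from the top
                check(depth < len(control_stack))
                label = control_stack[len(control_stack) - 1 - depth]
                # Pop values for label arity, last result first
                # (a loop label would take the loop's parameters instead)
                for type in reverse(label.result):
                    pop_expect(stack, type)
                # Code after br is unreachable until the frame ends
                truncate(stack, control_stack.top().height)
                control_stack.top().unreachable = true

            case end:
                frame = control_stack.top()
                # Stack should hold exactly the result values
                for type in reverse(frame.result):
                    pop_expect(stack, type)
                check(len(stack) == frame.height)
                control_stack.pop()
                # Push results back for the enclosing frame
                for type in frame.result:
                    push(stack, type)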

5. The Disassembler: Binary to Text

Convert .wasm back to readable WAT:

┌──────────────────────────────────────────────────────────────────────┐
│                        Disassembler Pipeline                         │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Binary Input                WAT Output                              │
│  ────────────                ──────────                              │
│  00 61 73 6d 01 00 00 00     (module                                 │
│  01 07 01 60 02 7f 7f          (type (func (param i32 i32)           │
│  01 7f                                     (result i32)))            │
│  03 02 01 00                   ;; func 0 uses type 0                 │
│  07 07 01 03 61 64 64 00 00    (export "add" (func 0))               │
│  0a 09 01 07 00                (func (type 0)                        │
│  20 00                           local.get 0                         │
│  20 01                           local.get 1                         │
│  6a 0b                           i32.add))                           │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Disassembler features:

  • Resolve function/type indices to names
  • Format with proper indentation
  • Show hex offsets (optional)
  • Include comments with original bytes

6. The Debugger: Interactive Execution

A debugger lets you control and inspect execution:

┌──────────────────────────────────────────────────────────────────────┐
│                        Debugger Architecture                         │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  User Interface                                                      │
│  ┌──────────────────────────────────────────────────────────────────┐│
│  │ (wdb) break add                                                  ││
│  │ Breakpoint 1 set at function $add                                ││
│  │ (wdb) run                                                        ││
│  │ Breakpoint 1 hit at $add                                         ││
│  │ (wdb) stack                                                      ││
│  │   [0] i32: 5                                                     ││
│  │   [1] i32: 3                                                     ││
│  │ (wdb) step                                                       ││
│  │   local.get 0                                                    ││
│  │ (wdb) print $0                                                   ││
│  │   $0 = i32: 5                                                    ││
│  └──────────────────────────────────────────────────────────────────┘│
│                             │                                        │
│                             ▼                                        │
│  ┌──────────────────────────────────────────────────────────────────┐│
│  │                Debug Controller                                  ││
│  │  - Breakpoint management                                         ││
│  │  - Step control (step, next, continue)                           ││
│  │  - State inspection (stack, locals, memory, globals)             ││
│  └──────────────────────────┬───────────────────────────────────────┘│
│                             │                                        │
│                             ▼                                        │
│  ┌──────────────────────────────────────────────────────────────────┐│
│  │              Modified Interpreter (from P3)                      ││
│  │  - Hooks before each instruction                                 ││
│  │  - State accessible to debugger                                  ││
│  │  - Can pause/resume execution                                    ││
│  └──────────────────────────────────────────────────────────────────┘│
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

Debugger commands:

  • break <func> - Set breakpoint at function
  • break <func>:<offset> - Set breakpoint at instruction
  • run [args] - Start execution
  • continue - Resume until next breakpoint
  • step - Execute one instruction
  • next - Execute to next line (step over calls)
  • finish - Execute until function returns
  • stack - Show value stack
  • locals - Show local variables
  • memory <addr> <len> - Dump memory
  • backtrace - Show call stack
  • print <expr> - Evaluate and print

7. The Optimizer: Improving Code

Simple optimizations that improve generated code:

┌──────────────────────────────────────────────────────────────────────┐
│                         Optimization Passes                          │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  1. Constant Folding                                                 │
│     ────────────────                                                 │
│     Before:  i32.const 5                                             │
│              i32.const 3                                             │
│              i32.add                                                 │
│     After:   i32.const 8                                             │
│                                                                      │
│  2. Dead Code Elimination                                            │
│     ─────────────────────                                            │
│     Before:  return                                                  │
│              i32.const 5    ;; unreachable                           │
│              drop                                                    │
│     After:   return                                                  │
│                                                                      │
│  3. Local Variable Coalescing                                        │
│     ─────────────────────────                                        │
│     Before:  local.get 0                                             │
│              local.set 1                                             │
│              local.get 1                                             │
│     After:   local.get 0                                             │
│              local.tee 1                                             │
│                                                                      │
│  4. Strength Reduction                                               │
│     ──────────────────                                               │
│     Before:  i32.const 2                                             │
│              i32.mul                                                 │
│     After:   i32.const 1                                             │
│              i32.shl                                                 │
│                                                                      │
│  5. Block Flattening                                                 │
│     ────────────────                                                 │
│     Before:  block                                                   │
│                block                                                 │
│                  nop                                                 │
│                end                                                   │
│              end                                                     │
│     After:   nop                                                     │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

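
Constant folding, the first pass in the diagram, can be sketched as a peephole over a flat instruction array. The Instr representation and the tiny opcode set are assumptions made for this example; a real pass would work on whatever module representation the toolchain shares.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_I32_CONST, OP_I32_ADD, OP_I32_MUL, OP_LOCAL_GET } Opcode;

typedef struct {
    Opcode   op;
    uint32_t imm;   /* i32 bit pattern or local index; unused for add/mul */
} Instr;

/* Fold "i32.const a; i32.const b; i32.add|i32.mul" into a single i32.const.
 * Rewrites `code` in place and returns the new length. Matching against the
 * already-written output lets a folded constant take part in the next fold. */
size_t fold_constants(Instr *code, size_t len) {
    size_t out = 0;                       /* write cursor, never ahead of i */
    for (size_t i = 0; i < len; i++) {
        Instr in = code[i];
        if ((in.op == OP_I32_ADD || in.op == OP_I32_MUL) && out >= 2 &&
            code[out - 1].op == OP_I32_CONST &&
            code[out - 2].op == OP_I32_CONST) {
            /* Unsigned arithmetic wraps modulo 2^32, matching i32.add and
             * i32.mul, and avoids signed-overflow undefined behavior in C. */
            uint32_t a = code[out - 2].imm;
            uint32_t b = code[out - 1].imm;
            uint32_t r = (in.op == OP_I32_ADD) ? a + b : a * b;
            out -= 2;                     /* drop the two constants */
            code[out].op  = OP_I32_CONST;
            code[out].imm = r;
            out++;
        } else {
            code[out++] = in;             /* copy everything else unchanged */
        }
    }
    return out;
}
```

For example, the seven-instruction sequence const 5, const 3, add, const 2, mul, local.get 0, add shrinks to const 16, local.get 0, add. Evaluating with the target's wrap-around semantics is the "overflow behavior" concern raised in the interview questions.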

8. Unified CLI Design

Create a cohesive command-line interface:

$ mywasm --help
mywasm - A WebAssembly toolchain

USAGE:
    mywasm <COMMAND> [OPTIONS]

COMMANDS:
    compile     Compile source language to WASM
    validate    Check if a WASM module is valid
    disasm      Disassemble WASM to WAT text format
    run         Execute a WASM module
    debug       Debug a WASM module interactively
    optimize    Optimize a WASM module
    info        Display module information

OPTIONS:
    -h, --help      Print help information
    -V, --version   Print version information
    -v, --verbose   Enable verbose output

EXAMPLES:
    mywasm compile hello.mini -o hello.wasm
    mywasm validate hello.wasm
    mywasm run hello.wasm
    mywasm debug hello.wasm
    mywasm disasm hello.wasm > hello.wat

Project Specification

Required Components

Core (Must Have):

  1. Unified CLI - Single mywasm command with subcommands
  2. Validator - Structural and type validation
  3. Disassembler - Binary to WAT conversion
  4. Debugger - Basic stepping and inspection
  5. Integration - Previous projects working together

Enhanced (Should Have):

  1. Optimizer - At least constant folding
  2. Module Info - Display module structure
  3. Error Messages - Clear, actionable diagnostics
  4. Test Suite - Comprehensive automated tests

Stretch (Nice to Have):

  1. Linker - Combine multiple modules
  2. Source Maps - Map WASM back to source
  3. Profiler - Execution timing and hotspots
  4. REPL - Interactive WASM evaluation

Success Criteria

  1. End-to-end flow: Compile, validate, run a program
  2. Validation catches errors: Reject malformed modules
  3. Debugger works: Set breakpoint, hit it, inspect state
  4. Disassembly round-trips: disasm(compile(source)) is readable WAT that reassembles into a valid module
  5. Professional CLI: Help text, error messages, return codes
  6. Test coverage: All major functionality tested

Solution Architecture

Directory Structure

mywasm/
├── src/
│   ├── main.c                 # CLI entry point
│   ├── cli/
│   │   ├── cli.c              # Command parsing
│   │   ├── cli.h
│   │   ├── compile_cmd.c      # compile subcommand
│   │   ├── validate_cmd.c     # validate subcommand
│   │   ├── run_cmd.c          # run subcommand
│   │   ├── debug_cmd.c        # debug subcommand
│   │   ├── disasm_cmd.c       # disasm subcommand
│   │   └── optimize_cmd.c     # optimize subcommand
│   │
│   ├── compiler/              # From Project 4
│   │   ├── lexer.c
│   │   ├── parser.c
│   │   ├── checker.c
│   │   └── codegen.c
│   │
│   ├── runtime/               # From Projects 3 & 5
│   │   ├── parser.c           # WASM binary parser
│   │   ├── exec.c             # Interpreter
│   │   ├── memory.c
│   │   ├── stack.c
│   │   └── wasi/
│   │       ├── wasi.c
│   │       └── fd_table.c
│   │
│   ├── validator/             # NEW
│   │   ├── validator.c
│   │   ├── validator.h
│   │   ├── type_checker.c
│   │   └── struct_checker.c
│   │
│   ├── disasm/                # NEW
│   │   ├── disasm.c
│   │   └── disasm.h
│   │
│   ├── debugger/              # NEW
│   │   ├── debugger.c
│   │   ├── debugger.h
│   │   ├── breakpoints.c
│   │   └── ui.c
│   │
│   ├── optimizer/             # NEW
│   │   ├── optimizer.c
│   │   ├── optimizer.h
│   │   ├── const_fold.c
│   │   ├── dead_code.c
│   │   └── peephole.c
│   │
│   └── common/
│       ├── types.h
│       ├── error.c
│       ├── error.h
│       └── util.c
│
├── tests/
│   ├── compiler/
│   ├── validator/
│   ├── runtime/
│   ├── debugger/
│   ├── optimizer/
│   └── integration/
│
├── docs/
│   ├── user_guide.md
│   ├── architecture.md
│   └── contributing.md
│
├── examples/
│   ├── hello.mini
│   ├── factorial.mini
│   ├── fibonacci.mini
│   └── cat.mini
│
├── Makefile
└── README.md

Shared Module Representation

All tools share a common module representation:

// common/types.h
#include <stddef.h>  // size_t
#include <stdint.h>  // uint8_t, uint32_t

typedef struct {
    // Type section
    FuncType* types;
    uint32_t type_count;

    // Function section
    uint32_t* func_types;  // Type index for each function
    uint32_t func_count;

    // Code section
    FuncBody* code;

    // Memory section
    Memory* memories;
    uint32_t memory_count;

    // Global section
    Global* globals;
    uint32_t global_count;

    // Import section
    Import* imports;
    uint32_t import_count;

    // Export section
    Export* exports;
    uint32_t export_count;

    // Data section
    DataSegment* data;
    uint32_t data_count;

    // Custom sections (for names, debug info)
    CustomSection* custom;
    uint32_t custom_count;

    // Name section data (if present)
    NameSection* names;
} Module;

// Shared across tools:
Module* parse_wasm(const uint8_t* bytes, size_t len);
void free_module(Module* module);
uint8_t* emit_wasm(Module* module, size_t* out_len);
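Before `parse_wasm` fills in a `Module`, it has to confirm that the input is a WASM binary at all. The sketch below shows one way to check the 8-byte preamble (the magic number followed by version 1). The helper name `has_wasm_header` is an assumption for illustration, not a function from the earlier projects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: true if `bytes` starts with the WASM magic number
// ("\0asm") followed by binary format version 1 as a little-endian u32.
bool has_wasm_header(const uint8_t* bytes, size_t len) {
    static const uint8_t header[8] = {0x00, 0x61, 0x73, 0x6d,   // "\0asm"
                                      0x01, 0x00, 0x00, 0x00};  // version 1
    assert(bytes != NULL || len == 0);
    return len >= sizeof(header) &&
           memcmp(bytes, header, sizeof(header)) == 0;
}
```

A parser built this way can return `NULL` early for anything that fails the check, which keeps every later stage from having to defend against non-WASM input.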

Implementation Guide

Phase 1: CLI Framework (Days 1-5)

Goal: Unified command-line interface

// main.c
#include <stdio.h>
#include <string.h>

#include "cli/cli.h"  // assumed to declare the cmd_* handlers and print_* helpers

int main(int argc, char** argv) {
    if (argc < 2) {
        print_usage();
        return 1;
    }

    const char* cmd = argv[1];

    // Each handler receives argv shifted by one, so argv[0] is the subcommand
    if (strcmp(cmd, "compile") == 0) {
        return cmd_compile(argc - 1, argv + 1);
    } else if (strcmp(cmd, "validate") == 0) {
        return cmd_validate(argc - 1, argv + 1);
    } else if (strcmp(cmd, "run") == 0) {
        return cmd_run(argc - 1, argv + 1);
    } else if (strcmp(cmd, "debug") == 0) {
        return cmd_debug(argc - 1, argv + 1);
    } else if (strcmp(cmd, "disasm") == 0) {
        return cmd_disasm(argc - 1, argv + 1);
    } else if (strcmp(cmd, "optimize") == 0) {
        return cmd_optimize(argc - 1, argv + 1);
    } else if (strcmp(cmd, "info") == 0) {
        // Listed in the help text, so it needs a dispatch entry too
        return cmd_info(argc - 1, argv + 1);
    } else if (strcmp(cmd, "--help") == 0 || strcmp(cmd, "-h") == 0) {
        print_help();
        return 0;
    } else {
        fprintf(stderr, "Unknown command: %s\n", cmd);
        return 1;
    }
}

Checkpoint: mywasm --help shows all commands.
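As the command list grows, the if/else chain can be replaced by a lookup table, which also gives the help text a single source of truth. This is a minimal sketch of that alternative, not the structure the earlier projects use; the `Command` type and `find_command` function are names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*CommandFn)(int argc, char** argv);

typedef struct {
    const char* name;  // subcommand as typed on the command line
    CommandFn run;     // handler such as cmd_compile
} Command;

// Returns the table entry for `name`, or NULL if there is no such command.
const Command* find_command(const Command* table, size_t count,
                            const char* name) {
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}
```

With a table such as `{{"compile", cmd_compile}, {"run", cmd_run}}`, `main` would look up `argv[1]`, call `entry->run(argc - 1, argv + 1)` when an entry is found, and print the unknown-command error otherwise.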

Phase 2: Integration (Days 6-10)

Goal: Connect previous projects

Wire up existing code to CLI commands:

// cli/compile_cmd.c
int cmd_compile(int argc, char** argv) {
    // Parse arguments
    const char* input = NULL;
    const char* output = "a.wasm";

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-o") == 0 && i + 1 < argc) {
            output = argv[++i];
        } else if (argv[i][0] != '-') {
            input = argv[i];
        }
    }

    if (!input) {
        fprintf(stderr, "Usage: mywasm compile <input.mini> [-o output.wasm]\n");
        return 1;
    }

    // Read source
    char* source = read_file(input);
    if (!source) {
        fprintf(stderr, "Error: Cannot read %s\n", input);
        return 1;
    }

    // Compile (from Project 4)
    CompileResult result = compile(source);
    free(source);

    if (result.error) {
        fprintf(stderr, "%s:%d: %s\n",
                input, result.error_line, result.error_msg);
        return 1;
    }

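    // Write output (assumes write_file returns false on failure)
    if (!write_file(output, result.wasm, result.wasm_len)) {
        fprintf(stderr, "Error: Cannot write %s\n", output);
        free(result.wasm);
        return 1;
    }
    printf("Compiled %s -> %s (%zu bytes)\n", input, output, result.wasm_len);
    free(result.wasm);  // assumes compile() heap-allocates the output buffer

    return 0;
}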

Checkpoint: mywasm compile hello.mini -o hello.wasm && mywasm run hello.wasm works. (Without -o, the output goes to a.wasm.)
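`cmd_compile` relies on a `read_file` helper that is not shown. One possible implementation is sketched below, under the assumption that it returns a NUL-terminated heap buffer and `NULL` on failure, which is what the call site expects. The version from your earlier projects may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Reads a whole file into a NUL-terminated heap buffer.
// Returns NULL on any failure; the caller frees the result.
char* read_file(const char* path) {
    assert(path != NULL);
    FILE* f = fopen(path, "rb");
    if (!f) return NULL;

    char* buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);
            if (buf && fread(buf, 1, (size_t)size, f) == (size_t)size) {
                buf[size] = '\0';
            } else {
                free(buf);  // short read or allocation failure
                buf = NULL;
            }
        }
    }
    fclose(f);
    return buf;
}
```

Every error path closes the file and frees the buffer, so the caller only has to handle a single `NULL` check.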

Phase 3: Validator (Days 11-20)

Goal: Catch invalid modules

// validator/validator.c

typedef struct {
    bool valid;
    char error[256];
    int error_offset;
} ValidationResult;

ValidationResult validate_module(Module* module) {
    ValidationResult result = {.valid = true};

    // Check structure
    if (!validate_structure(module, &result)) return result;

    // Check types
    if (!validate_types(module, &result)) return result;

    // Check functions
    for (uint32_t i = 0; i < module->func_count; i++) {
        if (!validate_function(module, i, &result)) return result;
    }

    // Check memory
    if (!validate_memory(module, &result)) return result;

    // Check data segments
    if (!validate_data(module, &result)) return result;

    return result;
}

bool validate_function(Module* module, uint32_t idx, ValidationResult* result) {
    FuncBody* body = &module->code[idx];
    FuncType* type = &module->types[module->func_types[idx]];

    // Initialize validator state
    ValidatorState state = {
        .stack = create_type_stack(),
        .control = create_control_stack(),
        .locals = get_local_types(module, idx),
        .num_locals = body->local_count + type->param_count,
    };

    // Validate each instruction
    for (size_t i = 0; i < body->code_len; ) {
        uint8_t opcode = body->code[i++];

        if (!validate_instruction(&state, opcode, body->code, &i, result)) {
            return false;
        }
    }

    // Check final stack matches return type
    if (!check_stack_matches(&state, type->results, type->result_count, result)) {
        return false;
    }

    return true;
}

Test invalid modules:

# Stack underflow
# (--no-check stops wat2wasm from rejecting the invalid module itself)
echo "(module (func i32.add))" | wat2wasm --no-check - -o bad.wasm 2>/dev/null
./mywasm validate bad.wasm
# Expected: "Error: Stack underflow at offset 0x10"

# Type mismatch
echo "(module (func (result i32) f32.const 1.0))" | wat2wasm --no-check - -o bad.wasm 2>/dev/null
./mywasm validate bad.wasm
# Expected: "Error: Type mismatch: expected i32, got f32"

Checkpoint: Rejects invalid modules with clear errors.

Phase 4: Disassembler (Days 21-28)

Goal: Convert WASM back to WAT

// disasm/disasm.c

void disassemble(Module* module, FILE* out) {
    fprintf(out, "(module\n");

    // Types
    for (uint32_t i = 0; i < module->type_count; i++) {
        disasm_type(module, i, out);
    }

    // Imports
    for (uint32_t i = 0; i < module->import_count; i++) {
        disasm_import(module, i, out);
    }

    // Functions
    for (uint32_t i = 0; i < module->func_count; i++) {
        disasm_function(module, i, out);
    }

    // Memory
    for (uint32_t i = 0; i < module->memory_count; i++) {
        disasm_memory(module, i, out);
    }

    // Exports
    for (uint32_t i = 0; i < module->export_count; i++) {
        disasm_export(module, i, out);
    }

    // Data segments
    for (uint32_t i = 0; i < module->data_count; i++) {
        disasm_data(module, i, out);
    }

    fprintf(out, ")\n");
}

void disasm_function(Module* module, uint32_t idx, FILE* out) {
    FuncBody* body = &module->code[idx];
    FuncType* type = &module->types[module->func_types[idx]];

    // Get name if available
    const char* name = get_func_name(module, idx);

    fprintf(out, "  (func");
    if (name) fprintf(out, " $%s", name);
    fprintf(out, " (type %u)", module->func_types[idx]);

    // Parameters
    for (uint32_t i = 0; i < type->param_count; i++) {
        fprintf(out, " (param %s)", type_name(type->params[i]));
    }

    // Results
    for (uint32_t i = 0; i < type->result_count; i++) {
        fprintf(out, " (result %s)", type_name(type->results[i]));
    }

    fprintf(out, "\n");

    // Locals
    for (uint32_t i = 0; i < body->local_count; i++) {
        fprintf(out, "    (local %s)\n", type_name(body->locals[i]));
    }

    // Instructions
    disasm_instructions(module, body->code, body->code_len, out, 2);

    fprintf(out, "  )\n");
}

void disasm_instructions(Module* module, uint8_t* code, size_t len,
                         FILE* out, int indent) {
    size_t i = 0;
    while (i < len) {
        uint8_t opcode = code[i++];

        if (opcode == 0x0b) {  // end
            // The last end closes the function body. WAT expresses that
            // with the ")" printed by disasm_function, so emit nothing.
            // (Assumes code_len includes the body's terminating end.)
            if (i == len) break;
            indent--;  // dedent first so end lines up with its block
        }

        print_indent(out, indent);

        switch (opcode) {
            case 0x00:
                fprintf(out, "unreachable\n");
                break;

            case 0x01:
                fprintf(out, "nop\n");
                break;

            case 0x02: {  // block
                int8_t block_type = (int8_t)code[i++];
                fprintf(out, "block");
                if (block_type != 0x40) {
                    fprintf(out, " (result %s)", type_name(block_type));
                }
                fprintf(out, "\n");
                indent++;
                break;
            }

            case 0x03: {  // loop
                int8_t block_type = (int8_t)code[i++];
                fprintf(out, "loop");
                if (block_type != 0x40) {
                    fprintf(out, " (result %s)", type_name(block_type));
                }
                fprintf(out, "\n");
                indent++;
                break;
            }

            case 0x0b:  // end (indent already adjusted above)
                fprintf(out, "end\n");
                break;

            case 0x0c: {  // br
                uint32_t depth = read_leb128(code, &i);
                fprintf(out, "br %u\n", depth);
                break;
            }

            case 0x10: {  // call
                uint32_t func_idx = read_leb128(code, &i);
                const char* name = get_func_name(module, func_idx);
                if (name) {
                    fprintf(out, "call $%s\n", name);
                } else {
                    fprintf(out, "call %u\n", func_idx);
                }
                break;
            }

            case 0x20: {  // local.get
                uint32_t idx = read_leb128(code, &i);
                fprintf(out, "local.get %u\n", idx);
                break;
            }

            case 0x41: {  // i32.const
                int32_t val = read_sleb128(code, &i);
                fprintf(out, "i32.const %d\n", val);
                break;
            }

            case 0x6a:
                fprintf(out, "i32.add\n");
                break;

            // ... all other opcodes ...

            default:
                fprintf(out, ";; unknown opcode 0x%02x\n", opcode);
        }
    }
}

Checkpoint: mywasm disasm hello.wasm produces readable WAT.
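The disassembler calls `read_leb128` and `read_sleb128` without showing them. The sketch below is one way to write both for 32-bit values, assuming the `(code, &pos)` calling convention used above. It does no bounds checking, which a production decoder would need, and it relies on the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Decodes an unsigned LEB128 value starting at code[*pos] and advances *pos.
uint32_t read_leb128(const uint8_t* code, size_t* pos) {
    assert(code != NULL && pos != NULL);
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        byte = code[(*pos)++];
        if (shift < 32) {
            result |= (uint32_t)(byte & 0x7f) << shift;  // low 7 bits carry data
        }
        shift += 7;
    } while (byte & 0x80);  // high bit set means another byte follows
    return result;
}

// Decodes a signed LEB128 value starting at code[*pos] and advances *pos.
int32_t read_sleb128(const uint8_t* code, size_t* pos) {
    assert(code != NULL && pos != NULL);
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        byte = code[(*pos)++];
        if (shift < 32) {
            result |= (uint32_t)(byte & 0x7f) << shift;
        }
        shift += 7;
    } while (byte & 0x80);
    // Sign-extend when bit 6 of the final byte is set.
    if (shift < 32 && (byte & 0x40)) {
        result |= ~(uint32_t)0 << shift;
    }
    return (int32_t)result;
}
```

The operand of `i32.const` must go through the signed decoder: the single byte `0x7f` means -1 there, while the unsigned decoder would read it as 127.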

Phase 5: Debugger (Days 29-42)

Goal: Interactive debugging

// debugger/debugger.c

typedef struct {
    Module* module;
    Instance* instance;

    // Breakpoints
    Breakpoint* breakpoints;
    int breakpoint_count;

    // Current state
    uint32_t current_func;
    size_t current_ip;
    bool running;
    bool stepping;
} Debugger;

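// The cmd_* handlers called below take a Debugger*. Keep them static to
// debugger.c (or rename them) so that they do not clash with the CLI's
// cmd_run(argc, argv) when everything is linked into one binary.
void debug_repl(Debugger* dbg) {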
    char line[256];

    printf("WebAssembly Debugger\n");
    printf("Type 'help' for commands.\n\n");

    while (true) {
        printf("(wdb) ");
        fflush(stdout);

        if (!fgets(line, sizeof(line), stdin)) break;

        // Remove newline
        line[strcspn(line, "\n")] = 0;

        // Parse and execute command
        char* cmd = strtok(line, " ");
        if (!cmd) continue;

        if (strcmp(cmd, "run") == 0 || strcmp(cmd, "r") == 0) {
            cmd_run(dbg);
        } else if (strcmp(cmd, "break") == 0 || strcmp(cmd, "b") == 0) {
            char* arg = strtok(NULL, " ");
            cmd_break(dbg, arg);
        } else if (strcmp(cmd, "continue") == 0 || strcmp(cmd, "c") == 0) {
            cmd_continue(dbg);
        } else if (strcmp(cmd, "step") == 0 || strcmp(cmd, "s") == 0) {
            cmd_step(dbg);
        } else if (strcmp(cmd, "next") == 0 || strcmp(cmd, "n") == 0) {
            cmd_next(dbg);
        } else if (strcmp(cmd, "stack") == 0) {
            cmd_stack(dbg);
        } else if (strcmp(cmd, "locals") == 0) {
            cmd_locals(dbg);
        } else if (strcmp(cmd, "memory") == 0 || strcmp(cmd, "x") == 0) {
            char* addr_str = strtok(NULL, " ");
            char* len_str = strtok(NULL, " ");
            cmd_memory(dbg, addr_str, len_str);
        } else if (strcmp(cmd, "backtrace") == 0 || strcmp(cmd, "bt") == 0) {
            cmd_backtrace(dbg);
        } else if (strcmp(cmd, "help") == 0 || strcmp(cmd, "h") == 0) {
            cmd_help();
        } else if (strcmp(cmd, "quit") == 0 || strcmp(cmd, "q") == 0) {
            break;
        } else {
            printf("Unknown command: %s\n", cmd);
        }
    }
}

void cmd_step(Debugger* dbg) {
    if (!dbg->running) {
        printf("Program not running. Use 'run' to start.\n");
        return;
    }

    // Execute one instruction
    dbg->stepping = true;
    execute_one(dbg->instance);
    dbg->stepping = false;

    // Show current instruction
    print_current_instruction(dbg);
}

void cmd_stack(Debugger* dbg) {
    Stack* stack = &dbg->instance->stack;

    printf("Value stack (%d values):\n", stack->sp);
    for (int i = stack->sp - 1; i >= 0; i--) {
        Value* v = &stack->data[i];
        printf("  [%d] %s: ", stack->sp - 1 - i, type_name(v->type));
        print_value(v);
        printf("\n");
    }
}

void cmd_locals(Debugger* dbg) {
    Frame* frame = current_frame(dbg->instance);

    printf("Local variables:\n");
    for (uint32_t i = 0; i < frame->local_count; i++) {
        Value* v = &frame->locals[i];
        const char* name = get_local_name(dbg->module, dbg->current_func, i);
        if (name) {
            printf("  $%s: ", name);
        } else {
            printf("  [%u]: ", i);
        }
        printf("%s = ", type_name(v->type));
        print_value(v);
        printf("\n");
    }
}

Debugger integration with interpreter:

// Modify exec.c to support debugging

typedef void (*DebugHook)(Instance* inst, uint8_t opcode, size_t ip);

void execute_with_debug(Instance* inst, DebugHook hook) {
    while (!inst->halted) {
        uint8_t opcode = read_byte(inst);

        // Call debug hook before each instruction
        if (hook) {
            hook(inst, opcode, inst->ip - 1);
        }

        execute_instruction(inst, opcode);
    }
}

Checkpoint: Can set breakpoint, hit it, inspect stack and locals.
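The debug hook needs a cheap way to decide whether the current instruction has a breakpoint on it. The sketch below assumes a `Breakpoint` that records a function index and a byte offset. The original text does not define the struct's fields, so this layout and the `should_break` helper are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Assumed layout; the fields are not specified in the text above.
typedef struct {
    uint32_t func_idx;  // function containing the breakpoint
    size_t offset;      // byte offset of the instruction in that function
    bool enabled;       // disabled breakpoints stay in the list but never fire
} Breakpoint;

// True if execution at (func_idx, ip) should pause.
bool should_break(const Breakpoint* bps, int count,
                  uint32_t func_idx, size_t ip) {
    assert(bps != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        if (bps[i].enabled &&
            bps[i].func_idx == func_idx &&
            bps[i].offset == ip) {
            return true;
        }
    }
    return false;
}
```

A linear scan is adequate for the handful of breakpoints a user sets by hand; a hash set keyed on (function, offset) only becomes worthwhile if the hook shows up in profiles.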

Phase 6: Optimizer (Days 43-52)

Goal: Improve code quality

// optimizer/optimizer.c

Module* optimize(Module* module, OptimizeOptions* opts) {
    Module* opt = clone_module(module);

    for (uint32_t i = 0; i < opt->func_count; i++) {
        FuncBody* body = &opt->code[i];

        if (opts->constant_fold) {
            constant_fold(body);
        }

        if (opts->dead_code) {
            eliminate_dead_code(body);
        }

        if (opts->peephole) {
            peephole_optimize(body);
        }
    }

    return opt;
}

// optimizer/const_fold.c

void constant_fold(FuncBody* body) {
    // Build instruction list
    Instruction* instrs = decode_instructions(body->code, body->code_len);
    int count = count_instructions(instrs);

    // Look for patterns
    for (int i = 0; i < count - 2; i++) {
        // i32.const X; i32.const Y; i32.add → i32.const (X+Y)
        if (instrs[i].opcode == 0x41 &&      // i32.const
            instrs[i+1].opcode == 0x41 &&    // i32.const
            instrs[i+2].opcode == 0x6a) {    // i32.add

            // i32.add wraps modulo 2^32. Add as unsigned, because signed
            // overflow is undefined behavior in C.
            uint32_t a = (uint32_t)instrs[i].i32_val;
            uint32_t b = (uint32_t)instrs[i+1].i32_val;
            int32_t result = (int32_t)(a + b);

            // Replace with single constant
            instrs[i].i32_val = result;
            mark_deleted(&instrs[i+1]);
            mark_deleted(&instrs[i+2]);
        }

        // Similar for other operations: sub, mul, etc.
    }

    // Rebuild body. If this pass owns the old body->code buffer and the
    // instrs array, release them here to avoid leaking on every call.
    body->code = encode_instructions(instrs, &body->code_len);
}

// optimizer/dead_code.c

void eliminate_dead_code(FuncBody* body) {
    Instruction* instrs = decode_instructions(body->code, body->code_len);
    int count = count_instructions(instrs);

    // Mark instructions after unconditional branches as dead
    bool unreachable = false;
    int dead_depth = 0;  // blocks opened inside the dead region

    for (int i = 0; i < count; i++) {
        unsigned op = instrs[i].opcode;

        if (unreachable) {
            if (op == 0x02 || op == 0x03 || op == 0x04) {  // block, loop, if
                // A dead nested block is removed together with its end
                dead_depth++;
                mark_deleted(&instrs[i]);
            } else if (dead_depth > 0) {
                if (op == 0x0b) dead_depth--;  // end of a dead nested block
                mark_deleted(&instrs[i]);
            } else if (op == 0x0b || op == 0x05) {  // end, else
                // Closes the enclosing block: code after it is live again
                unreachable = false;
            } else {
                mark_deleted(&instrs[i]);
            }
            continue;
        }

        // These make following code unreachable
        if (op == 0x00 ||    // unreachable
            op == 0x0f ||    // return
            op == 0x0c ||    // br (unconditional)
            op == 0x0e) {    // br_table
            unreachable = true;
        }
    }

    body->code = encode_instructions(instrs, &body->code_len);
}

Checkpoint: mywasm optimize reduces code size on test programs.

Phase 7: Testing & Polish (Days 53-60+)

Goal: Production quality

#!/bin/bash
# tests/integration/test_full_pipeline.sh

set -e

echo "=== Integration Tests ==="

# Test 1: Full pipeline
echo "Test 1: Compile -> Validate -> Run"
./mywasm compile examples/factorial.mini -o /tmp/fact.wasm
./mywasm validate /tmp/fact.wasm
result=$(./mywasm run /tmp/fact.wasm --invoke factorial 10)
[ "$result" = "3628800" ] && echo "PASS" || echo "FAIL: expected 3628800, got $result"

# Test 2: Disassembly round-trip
echo "Test 2: Disassembly produces valid WAT"
./mywasm disasm /tmp/fact.wasm > /tmp/fact.wat
wat2wasm /tmp/fact.wat -o /tmp/fact2.wasm
./mywasm validate /tmp/fact2.wasm && echo "PASS" || echo "FAIL"

# Test 3: Optimizer preserves semantics
echo "Test 3: Optimization preserves semantics"
./mywasm optimize /tmp/fact.wasm -o /tmp/fact_opt.wasm
./mywasm validate /tmp/fact_opt.wasm
result_opt=$(./mywasm run /tmp/fact_opt.wasm --invoke factorial 10)
[ "$result_opt" = "3628800" ] && echo "PASS" || echo "FAIL"

# Test 4: Validator rejects bad modules
echo "Test 4: Validator rejects invalid module"
# Corrupt the magic number ("\0asn" instead of "\0asm"). The bare 8-byte
# header with the correct magic is a valid empty module, so it cannot
# serve as a negative test.
echo "00 61 73 6e 01 00 00 00" | xxd -r -p > /tmp/bad_magic.wasm
./mywasm validate /tmp/bad_magic.wasm 2>&1 | grep -q "invalid" && echo "PASS" || echo "FAIL"

# Test 5: Error messages are helpful
echo "Test 5: Error messages include location"
echo "func main() { return x; }" > /tmp/bad.mini
./mywasm compile /tmp/bad.mini 2>&1 | grep -q "undefined" && echo "PASS" || echo "FAIL"

echo "=== All tests completed ==="

Testing Strategy

Unit Tests

Test each component in isolation:

// tests/validator/test_type_check.c

void test_stack_balance() {
    // Valid: push, push, add = balanced
    uint8_t code[] = {0x41, 0x01, 0x41, 0x02, 0x6a, 0x0b};
    ValidationResult r = validate_code(code, sizeof(code), TYPE_I32);
    assert(r.valid);
}

void test_stack_underflow() {
    // Invalid: add with empty stack
    uint8_t code[] = {0x6a, 0x0b};
    ValidationResult r = validate_code(code, sizeof(code), TYPE_VOID);
    assert(!r.valid);
    assert(strstr(r.error, "underflow"));
}

void test_type_mismatch() {
    // Invalid: f32 when i32 expected
    uint8_t code[] = {0x43, 0x00, 0x00, 0x80, 0x3f, 0x0b};  // f32.const 1.0
    ValidationResult r = validate_code(code, sizeof(code), TYPE_I32);
    assert(!r.valid);
    assert(strstr(r.error, "mismatch"));
}

Fuzzing

Use fuzzing to find edge cases:

// tests/fuzz/fuzz_validator.c

int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    if (size < 8) return 0;

    // Try to parse as WASM
    Module* module = parse_wasm(data, size);
    if (module) {
        // Validate (should not crash)
        ValidationResult r = validate_module(module);

        // If valid, try to run
        if (r.valid) {
            Instance* inst = instantiate(module);
            if (inst) {
                // Run briefly
                execute_steps(inst, 1000);
                free_instance(inst);
            }
        }

        free_module(module);
    }

    return 0;
}

Spec Conformance

Run the official WebAssembly test suite:

# Clone spec tests
git clone https://github.com/WebAssembly/spec.git

# Convert .wast to .wasm and run
for wast in spec/test/core/*.wast; do
    wast2json "$wast" -o /tmp/test.json
    ./mywasm test /tmp/test.json
done

Common Pitfalls

1. Validator Stack Polymorphism

After unreachable, the stack is polymorphic:

(func (result i32)
  unreachable
  ;; At this point, stack could be anything
  i32.add    ;; This is actually valid!
)

Handle this with a special "polymorphic" stack state.
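One common way to model the polymorphic state, similar in spirit to the validation algorithm sketched in the WebAssembly specification's appendix, is an `unreachable` flag plus the stack height at block entry. Popping below that height succeeds only when the flag is set. The types and function names below are assumptions for this sketch, not the project's `ValidatorState`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { T_I32, T_I64, T_F32, T_F64, T_UNKNOWN } ValType;

typedef struct {
    ValType vals[64];  // operand type stack (fixed size for the sketch)
    size_t count;      // number of types currently on the stack
    size_t height;     // stack height when the current block was entered
    bool unreachable;  // set after unreachable, br, return, ...
} TypeStack;

// Marks the rest of the current block unreachable and drops its operands.
void mark_unreachable(TypeStack* s) {
    s->count = s->height;
    s->unreachable = true;
}

bool push_type(TypeStack* s, ValType t) {
    if (s->count == 64) return false;
    s->vals[s->count++] = t;
    return true;
}

// Pops a value that must have type `want`.
// Returns false on underflow or type mismatch.
bool pop_expect(TypeStack* s, ValType want) {
    assert(s != NULL);
    if (s->count == s->height) {
        // Nothing left in this block: only acceptable when polymorphic,
        // in which case the missing operand is assumed to fit.
        return s->unreachable;
    }
    ValType got = s->vals[--s->count];
    return got == want || got == T_UNKNOWN || want == T_UNKNOWN;
}
```

For the function above, `unreachable` calls `mark_unreachable`, the two pops for `i32.add` then succeed on the empty stack, and the pushed `i32` satisfies the declared result. Values pushed after the flag is set are still checked normally.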

2. Debugger Thread Safety

If you support threading later, debugger state must be synchronized:

// Use mutex for breakpoint list
pthread_mutex_lock(&dbg->breakpoint_mutex);
// ... modify breakpoints ...
pthread_mutex_unlock(&dbg->breakpoint_mutex);

3. Optimizer Correctness

Always verify optimizations preserve semantics:

// Before releasing optimization:
// 1. Run all tests with optimization
// 2. Compare output of optimized vs unoptimized
// 3. Check code still validates

4. CLI Argument Parsing

Handle edge cases:

# These should all work:
mywasm run program.wasm
mywasm run program.wasm --
mywasm run program.wasm -- arg1 arg2
mywasm run program.wasm --verbose -- arg1
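All four invocations become easy to handle once the argument list is split at the first `--`: everything before it belongs to mywasm, everything after it to the guest program. A minimal sketch follows; `find_separator` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Returns the index of the first "--" in argv[start..argc), or argc if
// there is none. Arguments before that index are mywasm options, and
// arguments after it are passed through to the WASM program.
int find_separator(int argc, char** argv, int start) {
    assert(argv != NULL && start >= 0);
    for (int i = start; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0) {
            return i;
        }
    }
    return argc;
}
```

The guest's argument count is then `argc - sep - 1` when a separator was found and 0 otherwise, so `mywasm run program.wasm --` and `mywasm run program.wasm` behave identically.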

5. Error Message Quality

Bad:  Error: validation failed
Good: Error at function $add (offset 0x42): stack underflow on i32.add

Include:

  • Location (function, offset, line if available)
  • What went wrong
  • What was expected
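Routing every diagnostic through one formatter keeps the location and cause in a consistent order. The sketch below produces the "Good" message shown above; the function name and parameter list are assumptions, and a fuller version would add an "expected" field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Hypothetical formatter: writes at most `cap` bytes (including the NUL)
// and returns what snprintf returns, i.e. the untruncated message length.
int format_validation_error(char* buf, size_t cap, const char* func_name,
                            size_t offset, const char* problem,
                            const char* opcode) {
    assert(buf != NULL && cap > 0);
    return snprintf(buf, cap, "Error at function $%s (offset 0x%zx): %s on %s",
                    func_name, offset, problem, opcode);
}
```

Because `snprintf` bounds the write, the formatter can target the fixed 256-byte `error` field of `ValidationResult` without risking an overflow.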

Extensions

1. Module Linking

Combine multiple modules:

mywasm link module1.wasm module2.wasm -o combined.wasm

2. Source Maps

Generate DWARF debug info:

typedef struct {
    uint32_t wasm_offset;
    const char* source_file;
    int source_line;
    int source_column;
} SourceMapping;
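With mappings sorted by `wasm_offset`, the debugger can find the source position for any instruction by binary-searching for the last entry at or before that offset. This is a sketch under that sorting assumption; `lookup_source` is a hypothetical name, and the struct is repeated so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t wasm_offset;
    const char* source_file;
    int source_line;
    int source_column;
} SourceMapping;

// Finds the mapping that covers `offset`: the last entry whose wasm_offset
// is <= offset. `map` must be sorted by wasm_offset in ascending order.
// Returns NULL if `offset` precedes every entry.
const SourceMapping* lookup_source(const SourceMapping* map, size_t count,
                                   uint32_t offset) {
    assert(map != NULL || count == 0);
    size_t lo = 0, hi = count;  // search window is [lo, hi)
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (map[mid].wasm_offset <= offset) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo == 0 ? NULL : &map[lo - 1];
}
```

Offsets that fall between two entries resolve to the earlier one, which matches how one source statement usually expands to several consecutive instructions.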

3. Profiler

Add execution profiling:

$ mywasm profile program.wasm
Function          Calls    Time(ms)    Time%
─────────────────────────────────────────────
factorial         1024     45.2        78.3%
helper            512      8.1         14.0%
main              1        4.5          7.7%

4. REPL

Interactive WASM evaluation:

$ mywasm repl
wasm> (module (func (export "add") (param i32 i32) (result i32) local.get 0 local.get 1 i32.add))
Module loaded.
wasm> add(5, 3)
8
wasm> (func (export "mul") (param i32 i32) (result i32) local.get 0 local.get 1 i32.mul)
Function added.
wasm> mul(4, 7)
28

5. IDE Integration

Create a language server:

{
  "capabilities": {
    "completionProvider": {},
    "hoverProvider": true,
    "definitionProvider": true,
    "diagnosticProvider": {
      "interFileDependencies": false,
      "workspaceDiagnostics": false
    }
  }
}

Real-World Outcome

Complete Development Workflow: Source to Debugging

This section demonstrates the complete toolchain workflow, showing how your tools work together from writing source code to debugging runtime issues.

The Complete Pipeline in Action

┌──────────────────────────────────────────────────────────────────────────────────────────┐
│                           COMPLETE TOOLCHAIN WORKFLOW                                    │
├──────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                          │
│  1. WRITE SOURCE CODE                                                                    │
│  ────────────────────                                                                    │
│  $ cat > math.mini << 'EOF'                                                              │
│  func gcd(a: i32, b: i32) -> i32 {                                                       │
│      while b != 0 {                                                                      │
│          let temp = b;                                                                   │
│          b = a % b;                                                                      │
│          a = temp;                                                                       │
│      }                                                                                   │
│      return a;                                                                           │
│  }                                                                                       │
│  EOF                                                                                     │
│                                                                                          │
│  2. COMPILE WITH DEBUG INFO                                                              │
│  ──────────────────────────                                                              │
│  $ mywasm compile math.mini -o math.wasm -g                                              │
│  Compiling math.mini...                                                                  │
│  ✓ Lexing: 47 tokens                                                                     │
│  ✓ Parsing: AST generated                                                                │
│  ✓ Type checking: All types verified                                                     │
│  ✓ Code generation: 89 bytes                                                             │
│  ✓ Debug info: DWARF sections embedded                                                   │
│  Output: math.wasm (142 bytes with debug info)                                           │
│                                                                                          │
│  3. VALIDATE MODULE                                                                      │
│  ──────────────────                                                                      │
│  $ mywasm validate math.wasm                                                             │
│  Validating math.wasm...                                                                 │
│  ✓ Magic number: valid                                                                   │
│  ✓ Version: 1                                                                            │
│  ✓ Section order: valid                                                                  │
│  ✓ Type section: 1 type(s)                                                               │
│  ✓ Function section: 1 function(s)                                                       │
│  ✓ Code section: All functions type-checked                                              │
│  ✓ Custom sections: names, dwarf                                                         │
│  Module is valid.                                                                        │
│                                                                                          │
│  4. DISASSEMBLE TO INSPECT                                                               │
│  ─────────────────────────                                                               │
│  $ mywasm disasm math.wasm                                                               │
│  (module                                                                                 │
│    (type (;0;) (func (param i32 i32) (result i32)))                                      │
│    (func $gcd (type 0) (param $a i32) (param $b i32) (result i32)                        │
│      (local $temp i32)                                                                   │
│      block $exit                                                                         │
│        loop $continue                                                                    │
│          local.get $b                                                                    │
│          i32.eqz                                                                         │
│          br_if $exit                                                                     │
│          local.get $b                                                                    │
│          local.set $temp                                                                 │
│          local.get $a                                                                    │
│          local.get $b                                                                    │
│          i32.rem_s                                                                       │
│          local.set $b                                                                    │
│          local.get $temp                                                                 │
│          local.set $a                                                                    │
│          br $continue                                                                    │
│        end                                                                               │
│      end                                                                                 │
│      local.get $a)                                                                       │
│    (export "gcd" (func $gcd)))                                                           │
│                                                                                          │
│  5. RUN THE PROGRAM                                                                      │
│  ──────────────────                                                                      │
│  $ mywasm run math.wasm --invoke gcd 48 18                                               │
│  Result: 6                                                                               │
│                                                                                          │
│  6. OPTIMIZE FOR PRODUCTION                                                              │
│  ──────────────────────────                                                              │
โ”‚  $ mywasm optimize math.wasm -O2 -o math.opt.wasm                                       โ”‚
โ”‚  Optimizing math.wasm...                                                                โ”‚
โ”‚  โœ“ Constant folding: 0 expressions folded                                              โ”‚
โ”‚  โœ“ Dead code elimination: 0 instructions removed                                        โ”‚
โ”‚  โœ“ Local coalescing: 0 locals merged                                                   โ”‚
โ”‚  โœ“ Strength reduction: 0 operations simplified                                          โ”‚
โ”‚  Output: math.opt.wasm (89 bytes, 0% reduction)                                         โ”‚
โ”‚                                                                                          โ”‚
โ”‚  7. DEBUG A PROBLEM                                                                      โ”‚
โ”‚  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€                                                                      โ”‚
โ”‚  $ mywasm debug math.wasm                                                               โ”‚
โ”‚  WebAssembly Debugger v1.0                                                              โ”‚
โ”‚  Module loaded: math.wasm                                                               โ”‚
โ”‚  Functions: 1 (gcd)                                                                     โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) break gcd                                                                        โ”‚
โ”‚  Breakpoint 1 at function $gcd (offset 0x00)                                            โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) run gcd 48 18                                                                    โ”‚
โ”‚  Starting execution with args: [48, 18]                                                 โ”‚
โ”‚  Breakpoint 1 hit at $gcd                                                               โ”‚
โ”‚    math.mini:1    func gcd(a: i32, b: i32) -> i32 {                                    โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) locals                                                                           โ”‚
โ”‚  $a = i32: 48                                                                           โ”‚
โ”‚  $b = i32: 18                                                                           โ”‚
โ”‚  $temp = i32: 0 (uninitialized)                                                         โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) step 5                                                                           โ”‚
โ”‚  Stepped 5 instructions                                                                 โ”‚
โ”‚    math.mini:4    b = a % b;                                                            โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) stack                                                                            โ”‚
โ”‚  [0] i32: 48  (a)                                                                       โ”‚
โ”‚  [1] i32: 18  (b)                                                                       โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) eval a % b                                                                       โ”‚
โ”‚  Result: i32: 12                                                                        โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) continue                                                                         โ”‚
โ”‚  Execution completed.                                                                   โ”‚
โ”‚  Return value: i32: 6                                                                   โ”‚
โ”‚                                                                                          โ”‚
โ”‚  (wdb) quit                                                                             โ”‚
โ”‚                                                                                          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Complete Toolchain Workflow
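
The optimizer report in the workflow shows zero folds because gcd contains no constant subexpressions. To illustrate what a constant-folding pass looks for, here is a minimal Python sketch over a flat instruction list. The tuple representation, the `fold_constants` name, and the small set of foldable opcodes are assumptions made for illustration, not the toolchain's actual IR.

```python
# Hypothetical peephole constant folder. Instructions are tuples such as
# ("i32.const", 5), ("local.get", 0) or ("i32.add",).

MASK32 = 0xFFFFFFFF


def wrap_i32(value):
    """Wrap a Python int to a signed 32-bit value, as i32 arithmetic does."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


# Only operations that can never trap are folded here.
FOLDABLE = {
    "i32.add": lambda a, b: a + b,
    "i32.sub": lambda a, b: a - b,
    "i32.mul": lambda a, b: a * b,
}


def fold_constants(instrs):
    """Return (new_instrs, folded_count) after folding const/const/op runs."""
    out = []
    folded = 0
    for ins in instrs:
        if (
            ins[0] in FOLDABLE
            and len(out) >= 2
            and out[-1][0] == "i32.const"
            and out[-2][0] == "i32.const"
        ):
            rhs = out.pop()[1]  # top of stack is the right operand
            lhs = out.pop()[1]
            out.append(("i32.const", wrap_i32(FOLDABLE[ins[0]](lhs, rhs))))
            folded += 1
        else:
            out.append(ins)
    return out, folded
```

For example, `fold_constants([("i32.const", 2), ("i32.const", 3), ("i32.add",)])` returns `([("i32.const", 5)], 1)`. Division and remainder are left out on purpose: they can trap at run time, so folding them needs extra care.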

Debugging with DWARF and Source Maps

For production debugging, your toolchain supports both DWARF and source maps:

# Compile with DWARF debug info (full debugging, larger files)
$ mywasm compile app.mini -o app.wasm -g --debug-format=dwarf
# Result: DWARF sections embedded in custom sections

# Compile with source maps (location only, smaller files, wider support)
$ mywasm compile app.mini -o app.wasm -g --debug-format=sourcemap
# Result: app.wasm.map generated alongside binary

# Strip debug info for production (keep source map separate)
$ mywasm strip app.wasm -o app.prod.wasm --keep-sourcemap
# Result: minimal binary with external source map reference
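
As a rough illustration of what a strip step might do internally, the Python sketch below walks the section list of a binary and drops custom sections whose names start with `.debug_`, the prefix conventionally used for DWARF data in WebAssembly. The function names are hypothetical and the sketch does no validation beyond the magic number; every other section, including a `sourceMappingURL` custom section, is copied through unchanged.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def strip_debug_sections(wasm):
    """Return a copy of the module without custom sections named .debug_*."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    out = bytearray(wasm[:8])  # keep magic number and version
    pos = 8
    while pos < len(wasm):
        start = pos
        section_id = wasm[pos]
        size, body = read_uleb128(wasm, pos + 1)
        end = body + size
        keep = True
        if section_id == 0:  # custom section: payload starts with its name
            name_len, name_pos = read_uleb128(wasm, body)
            name = wasm[name_pos:name_pos + name_len].decode("utf-8")
            keep = not name.startswith(".debug_")
        if keep:
            out += wasm[start:end]
        pos = end
    return bytes(out)
```

Because each section carries its own byte length, whole sections can be dropped without rewriting anything else in the file.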

CI/CD Integration Example

# .github/workflows/build.yml
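name: Build and Test WASM

# Run on every push and pull request (a workflow needs at least one trigger)
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Assumes mywasm is already installed and on the runner's PATH
      - name: Compile
        run: |
          mkdir -p dist
          mywasm compile src/main.mini -o dist/main.wasm -O2 -g
          mywasm compile src/main.mini -o dist/main.debug.wasm -g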

      - name: Validate
        run: mywasm validate dist/main.wasm --strict

      - name: Test
        run: |
          mywasm run dist/main.wasm --invoke test_suite
          mywasm run dist/main.debug.wasm --invoke test_suite

      - name: Benchmark
        run: mywasm profile dist/main.wasm --invoke benchmark > profile.txt

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: wasm-binaries
          path: dist/

Professional Toolchains

Your toolchain mirrors production tools:

| Your Tool | Production Equivalent | Key Learning |
|---|---|---|
| mywasm compile | clang, rustc, emcc | End-to-end compilation pipeline |
| mywasm validate | wasm-validate (wabt) | Type system verification |
| mywasm disasm | wasm2wat (wabt) | Binary format understanding |
| mywasm run | wasmtime, wasmer, wasm3 | Virtual machine execution |
| mywasm debug | lldb, gdb, Chrome DevTools | Debug protocol and state inspection |
| mywasm optimize | wasm-opt (binaryen) | Compiler optimization techniques |
| mywasm link | wasm-ld (LLVM) | Module composition and symbol resolution |

Contributing to the Ecosystem

Your deep understanding enables contributions to:

| Project | Contribution Opportunities |
|---|---|
| wasmtime | Runtime optimizations, new instruction support, debugging improvements |
| wasm3 | Interpreter performance, embedded platform support |
| binaryen | New optimization passes, IR transformations |
| wasi-sdk | Toolchain improvements, better error messages |
| spec | Test suite contributions, proposal implementations |

Teaching Others

Your toolchain becomes an educational platform:

# Use as teaching tool
$ mywasm explain "local.get 0"
local.get 0
  Opcode: 0x20
  Operand: 0 (LEB128)
  Effect: Push value of local variable 0 onto the stack
  Stack: [...] -> [..., local_0_value]

$ mywasm trace math.wasm --invoke gcd 48 18
[0x00] local.get $a          stack: [] -> [48]
[0x02] local.get $b          stack: [48] -> [48, 18]
[0x04] i32.rem_s             stack: [48, 18] -> [12]
...
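
To show how little machinery an explain or trace command needs, here is a minimal Python sketch that encodes instructions with LEB128 operands and prints the stack before and after each step. It covers only the two instructions in the excerpt above; the `OPCODES` table, the function names, and the output format are assumptions for illustration rather than the toolchain's real implementation.

```python
# Opcode bytes from the WebAssembly binary format for the two
# instructions used in this sketch.
OPCODES = {"local.get": 0x20, "i32.rem_s": 0x6F}


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode(instr):
    """Encode ("name", operand...) as an opcode byte plus LEB128 operands."""
    name, *operands = instr
    body = bytes([OPCODES[name]])
    for operand in operands:
        body += encode_uleb128(operand)
    return body


def i32_rem_s(lhs, rhs):
    """Signed remainder: the result takes the sign of the dividend."""
    if rhs == 0:
        raise ZeroDivisionError("trap: integer divide by zero")
    result = abs(lhs) % abs(rhs)
    return -result if lhs < 0 else result


def trace(instrs, local_values, local_names):
    """Run the instructions and return one trace line per instruction."""
    stack, offset, lines = [], 0, []
    for instr in instrs:
        before = list(stack)
        name = instr[0]
        if name == "local.get":
            stack.append(local_values[instr[1]])
            text = f"local.get {local_names[instr[1]]}"
        elif name == "i32.rem_s":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(i32_rem_s(lhs, rhs))
            text = name
        else:
            raise ValueError(f"unsupported instruction: {name}")
        lines.append(f"[0x{offset:02x}] {text:<20} stack: {before} -> {stack}")
        offset += len(encode(instr))  # byte offset of the next instruction
    return lines
```

Calling `trace([("local.get", 0), ("local.get", 1), ("i32.rem_s",)], [48, 18], ["$a", "$b"])` yields three lines at offsets 0x00, 0x02 and 0x04, ending with the stack `[12]`.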

Your knowledge enables you to:

  • Write compiler courses using your toolchain as the example project
  • Mentor junior engineers on systems programming
  • Create YouTube/blog tutorials on "how WebAssembly really works"
  • Contribute to WebAssembly education initiatives

Self-Assessment Checklist

Integration

  • All tools work together seamlessly
  • Shared module representation is consistent
  • Error messages are clear and actionable
  • CLI is intuitive and well-documented

Validator

  • Catches all structural errors
  • Type checks all instructions
  • Handles unreachable code correctly
  • Reports precise error locations

Disassembler

  • Produces valid WAT for all inputs
  • Uses names when available
  • Handles all instruction types
  • Output is properly indented

Debugger

  • Breakpoints work at function and instruction level
  • Step, next, continue all work correctly
  • State inspection shows accurate data
  • UI is responsive and clear

Testing

  • Unit tests cover all components
  • Integration tests verify full pipeline
  • Edge cases are handled gracefully
  • Performance is acceptable

Resources

Toolchain Design

Testing

Reference Tools


Key Insights

Integration is harder than implementation. Each component might work alone, but making them work together smoothly requires careful design of shared data structures and consistent error handling.

User experience matters. Clear error messages, intuitive CLI, and helpful documentation transform a technical project into a usable tool.

Testing is your safety net. With a complex toolchain, the only way to make changes confidently is comprehensive automated testing.

You've built something real. This isn't a toy; it's a functional toolchain that could genuinely be used to compile, debug, and run WebAssembly programs.


Conclusion

Completing this capstone project demonstrates mastery of:

  1. WebAssembly internals - Binary format, execution semantics, type system
  2. Compiler construction - Lexing, parsing, type checking, code generation
  3. Virtual machine design - Stack machines, memory management, control flow
  4. System programming - WASI, sandboxing, capability security
  5. Software engineering - Testing, documentation, CLI design

You now understand WebAssembly at the level of its designers. You could:

  • Contribute to production runtimes
  • Design new languages targeting WASM
  • Build the next generation of edge computing platforms
  • Teach others how WebAssembly really works

Congratulations on completing the WebAssembly Deep Learning journey.


This is the culmination of Projects 1-5. Return to individual projects to deepen specific areas, or extend this toolchain with your own innovations.