Project 15: Compile Functional Language to LLVM/C

Project Overview

Difficulty: Master
Time Estimate: 2-3 months
Primary Language: Haskell
Alternative Languages: OCaml, Rust
Knowledge Area: Compilers / Code Generation / Functional Language Implementation
Tools Required: GHC, LLVM (llvm-hs or llvm-hs-pure), Cabal/Stack
Primary Reference: “Compiling with Continuations” by Andrew Appel
Prerequisites: Projects 4 (Type Inference) and 14 (Lisp Interpreter), understanding of lambda calculus

Learning Objectives

By completing this project, you will be able to:

  1. Implement Hindley-Milner type inference to statically type a functional language before compilation
  2. Perform closure conversion to transform higher-order functions into first-order code with explicit closures
  3. Apply lambda lifting to transform local functions into global functions with captured variables as parameters
  4. Generate LLVM IR for low-level representation of functional programs
  5. Implement tail call optimization at the compiler level
  6. Handle memory management with basic garbage collection or reference counting
  7. Understand the compilation pipeline from source to executable

Deep Theoretical Foundation

Why Compile Functional Languages?

Interpreters are excellent for learning and rapid development, but they have fundamental performance limitations:

  1. Interpretation overhead: Every expression is re-traversed and re-dispatched each time it is evaluated
  2. Dynamic dispatch: Function calls resolved at runtime, not compile time
  3. No low-level optimization: Cannot leverage CPU features, SIMD, branch prediction
  4. Memory overhead: AST structures consume more memory than compiled code

A compiler translates high-level functional code into efficient machine code. This project bridges the gap between abstract lambda calculus and concrete hardware.

The challenge: functional features like closures, currying, and higher-order functions have no direct machine-level representation. We must transform them into constructs that do.

The Compilation Pipeline

Source Code
    |
    v
[Lexer/Parser] -----> Abstract Syntax Tree (AST)
    |
    v
[Type Inference] ---> Typed AST
    |
    v
[Desugaring] -------> Core Language (simpler AST)
    |
    v
[Closure Conversion] -> Explicit Closures
    |
    v
[Lambda Lifting] ----> First-Order Functions
    |
    v
[CPS Transform] -----> Continuation-Passing Style (optional)
    |
    v
[Code Generation] ---> LLVM IR / C Code
    |
    v
[LLVM/GCC] ---------> Machine Code
    |
    v
Executable

Functional Language Compilation Pipeline

Each stage transforms the program into a lower-level representation until we reach machine code.
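
To make the shape concrete, here is the driver expressed as composition of fallible stages in Haskell (a sketch: the stage types and bodies are placeholder stubs for illustration, not the real implementation):

import Control.Monad ((>=>))

-- Placeholder stage types; the real ASTs are defined later in the project.
newtype Ast    = Ast String
newtype Typed  = Typed Ast
newtype Core   = Core Typed
newtype LlvmIr = LlvmIr String

-- Each stage may fail with an error message.
parseSrc :: String -> Either String Ast
parseSrc = Right . Ast

typeCheck :: Ast -> Either String Typed
typeCheck = Right . Typed

lower :: Typed -> Either String Core   -- desugar + closure convert + lift
lower = Right . Core

codeGen :: Core -> Either String LlvmIr
codeGen _ = Right (LlvmIr "; ...")

-- The pipeline is just Kleisli composition of fallible stages.
compileSrc :: String -> Either String LlvmIr
compileSrc = parseSrc >=> typeCheck >=> lower >=> codeGen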

Source Language Design

We will compile a small functional language called “Mini-ML”:

-- Types
Int, Bool, a, b, (a -> b), (a, b)

-- Expressions
x                     -- variables
42                    -- integer literals
true, false           -- boolean literals
\x -> e              -- lambda abstraction
e1 e2                -- application
let x = e1 in e2     -- let binding
let rec f x = e1 in e2  -- recursive binding
if e1 then e2 else e3   -- conditional
(e1, e2)             -- pairs
fst e, snd e         -- projections
e1 + e2, e1 - e2, ... -- arithmetic
e1 == e2, e1 < e2, ... -- comparison

Example program:

let rec factorial = \n ->
  if n == 0 then 1
  else n * factorial (n - 1)
in factorial 10

Hindley-Milner Type Inference

Hindley-Milner (HM) type inference infers types without explicit annotations. Key ideas:

Types:

Type = Int | Bool | TVar a | Type -> Type | (Type, Type)

Type Schemes (polymorphic types):

Scheme = forall a1 ... an. Type

Inference Algorithm (Algorithm W):

  1. Generate fresh type variables for unknowns
  2. Traverse AST generating constraints
  3. Solve constraints via unification
  4. Generalize to get polymorphic types

A sketch of the main cases in Haskell-style pseudocode (fresh-variable plumbing omitted):

infer :: Env -> Expr -> (Subst, Type)

-- Variables: lookup in environment, instantiate scheme
infer env (Var x) =
    let scheme = lookup x env
        ty = instantiate scheme  -- replace forall vars with fresh vars
    in (emptySubst, ty)

-- Lambda: infer body with parameter added to env
infer env (Lam x body) =
    let alpha = freshTVar
        env' = extend env x (Forall [] alpha)
        (s, t) = infer env' body
    in (s, apply s alpha :-> t)

-- Application: infer function and argument, unify
infer env (App e1 e2) =
    let (s1, t1) = infer env e1
        (s2, t2) = infer (apply s1 env) e2
        alpha = freshTVar
        s3 = unify (apply s2 t1) (t2 :-> alpha)
    in (s3 `compose` s2 `compose` s1, apply s3 alpha)

-- Let: infer binding, generalize, infer body
infer env (Let x e1 e2) =
    let (s1, t1) = infer env e1
        scheme = generalize (apply s1 env) t1
        (s2, t2) = infer (extend (apply s1 env) x scheme) e2
    in (s2 `compose` s1, t2)

Unification:

unify :: Type -> Type -> Subst

unify (TVar a) t = bindVar a t   -- bindVar performs the occurs check (below)
unify t (TVar a) = bindVar a t
unify (t1 :-> t2) (s1 :-> s2) =
    let su1 = unify t1 s1
        su2 = unify (apply su1 t2) (apply su1 s2)
    in compose su2 su1
unify (TPair t1 t2) (TPair s1 s2) =   -- pairs unify componentwise
    let su1 = unify t1 s1
        su2 = unify (apply su1 t2) (apply su1 s2)
    in compose su2 su1
unify Int Int = emptySubst
unify Bool Bool = emptySubst
unify t1 t2 = error $ "Cannot unify " ++ show t1 ++ " with " ++ show t2

Occurs Check: Prevent infinite types like a = a -> b.

Let Polymorphism: In let x = e1 in e2, we generalize x’s type to a scheme before checking e2. This is why:

let id = \x -> x in (id 1, id true)

is valid—id gets type forall a. a -> a.
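
A small, self-contained sketch of the generalization step that gives id its polymorphic scheme (ftv collects free type variables):

import qualified Data.Set as Set

type TyVar  = String
data Type   = TInt | TBool | TVar TyVar | Type :-> Type deriving (Eq, Show)
data Scheme = Forall [TyVar] Type deriving Show

-- Free type variables of a type
ftv :: Type -> Set.Set TyVar
ftv TInt      = Set.empty
ftv TBool     = Set.empty
ftv (TVar a)  = Set.singleton a
ftv (a :-> b) = ftv a `Set.union` ftv b

-- Quantify over variables free in the type but not free in the environment
-- (here the environment's free variables are passed in directly).
generalize :: Set.Set TyVar -> Type -> Scheme
generalize envVars t =
  Forall (Set.toList (ftv t `Set.difference` envVars)) t

-- generalize Set.empty (TVar "a" :-> TVar "a")
--   ==> Forall ["a"] (TVar "a" :-> TVar "a"), i.e. forall a. a -> a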

Closure Conversion

Higher-order functions with free variables cannot directly become machine functions. Consider:

let add = \x -> (\y -> x + y)
let add5 = add 5
add5 10  -- returns 15

The inner lambda \y -> x + y references x, which is not in scope when the lambda is called. We must capture x in a closure.

Before closure conversion:

\y -> x + y

After closure conversion:

makeClosure(closureFunc, {x = x})

where closureFunc(env, y) = env.x + y

A closure is:

  1. A function pointer (the “code”)
  2. An environment record (captured variables)

Algorithm:

  1. For each lambda, find its free variables
  2. Create a closure record type containing those variables
  3. Convert the lambda body to take the environment as first argument
  4. At lambda creation, build the closure record

-- Find free variables
freeVars :: Expr -> Set String
freeVars (Var x) = singleton x
freeVars (Lam x body) = freeVars body `difference` singleton x
freeVars (App e1 e2) = freeVars e1 `union` freeVars e2
freeVars (Let x e1 e2) = freeVars e1 `union` (freeVars e2 `difference` singleton x)
...

-- Closure conversion
data CExpr
    = CVar String
    | CApp CExpr CExpr
    | CLet String CExpr CExpr
    | CMakeClosure String [String]  -- function name, captured vars
    | CEnvRef Int                   -- reference to captured variable
    | ...

closureConvert :: Expr -> CExpr
closureConvert (Lam x body) =
    let fvs = toList (freeVars body `difference` singleton x)
        funcName = freshName "closure"
        body' = convertBody fvs x body
        -- body' becomes the top-level function funcName (emitted separately);
        -- the lambda itself reduces to building the closure record
    in CMakeClosure funcName fvs

Lambda Lifting

After closure conversion, we can perform lambda lifting to move all functions to top level:

Before:

let f = \x ->
  let g = \y -> x + y
  in g 5

After lambda lifting:

g_lifted(x, y) = x + y

f_lifted(x) = g_lifted(x, 5)

The key insight: captured variables become additional parameters.
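
The same example written directly in Haskell makes the insight visible: g's free variable x has become an explicit first parameter (the algorithm follows below):

-- Before lifting (conceptually): f x = let g y = x + y in g 5
-- After lifting, both functions are top-level and first-order:
gLifted :: Int -> Int -> Int
gLifted x y = x + y

fLifted :: Int -> Int
fLifted x = gLifted x 5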

Algorithm:

  1. Collect all lambdas in the program
  2. For each lambda, add free variables as extra parameters
  3. Replace lambda with call to lifted function, passing free vars

Combined with closure conversion, this gives us first-order code suitable for machine execution.

Continuation-Passing Style (CPS)

CPS makes control flow explicit. Every function takes an extra argument: the continuation (what to do with the result).

Direct style:

let result = f(x) in g(result)

CPS:

f(x, \result -> g(result, halt))

Why CPS?

  1. Tail calls become obvious: A call in CPS is always in tail position
  2. No return addresses: Continuations explicitly handle control flow
  3. Simplifies code generation: No stack needed for function returns
  4. Enables optimizations: Inlining, dead code elimination become easier

CPS Transformation:

data CPS
    = CPSHalt Value
    | CPSCall Value [Value] Var CPS  -- f(args, \x -> k)
    | CPSRet Var Value               -- invoke the continuation bound to Var
    | CPSPrimOp Op [Value] Var CPS   -- primop(args, \x -> k)
    | CPSIf Value CPS CPS

cpsTransform :: Expr -> (Value -> CPS) -> CPS
cpsTransform (Lit n) k = k (VLit n)
cpsTransform (Var x) k = k (VVar x)
cpsTransform (App e1 e2) k =
    cpsTransform e1 (\v1 ->
        cpsTransform e2 (\v2 ->
            let r = freshVar
            in CPSCall v1 [v2] r (k (VVar r))))
cpsTransform (Lam x body) k =
    let kVar = freshVar
        body' = cpsTransform body (\v -> CPSRet kVar v)  -- return to the caller's continuation, not halt
    in k (VLam x kVar body')
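
For intuition, here is what the transform yields on a small input (assuming freshVar produces "r0"):

cpsTransform (App (Var "f") (Lit 1)) CPSHalt
  ==> CPSCall (VVar "f") [VLit 1] "r0" (CPSHalt (VVar "r0"))

-- read: call f with 1, bind the result to r0, then halt with r0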

LLVM IR Basics

LLVM (originally “Low Level Virtual Machine”, though the name is no longer an acronym) is a compiler infrastructure that provides:

  • A well-defined intermediate representation (IR)
  • Powerful optimization passes
  • Code generation for multiple architectures

LLVM IR Features:

  • Static single assignment (SSA) form
  • Typed (everything has a type)
  • Infinite virtual registers
  • First-class functions (sort of)
  • Garbage collection support

Basic LLVM IR:

; Define a function
define i32 @add(i32 %a, i32 %b) {
entry:
    %result = add i32 %a, %b
    ret i32 %result
}

; Call a function
define i32 @main() {
entry:
    %x = call i32 @add(i32 3, i32 4)
    ret i32 %x
}

LLVM Types:

  • i1, i8, i32, i64: Integer types
  • float, double: Floating point
  • ptr: Pointer (opaque in modern LLVM)
  • { i32, ptr }: Struct types
  • [10 x i32]: Array types
  • i32 (i32, i32): Function types (function pointers are plain ptr under opaque pointers)

For our compiler:

  • Closures become structs: { ptr, ... } (code pointer + captured vars)
  • Function calls become indirect calls through closure’s code pointer
  • Heap allocation for closures (needs GC or manual management)
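
A Haskell model of this runtime representation may help: the closure pairs a code function that expects the environment first with the captured values themselves (a sketch, not the generated code):

-- Runtime values: integers, or closures whose code takes the
-- environment as an implicit first argument.
data Value = VInt Int
           | VClos (Env -> Value -> Value) Env

type Env = [Value]  -- captured variables

-- Calling a closure threads its own environment into the code pointer.
applyClos :: Value -> Value -> Value
applyClos (VClos code env) arg = code env arg
applyClos _ _ = error "applied a non-function"

-- Example: the closure-converted \y -> x + y, capturing x = 5
add5 :: Value
add5 = VClos (\env (VInt y) -> let VInt x = head env in VInt (x + y)) [VInt 5]
-- applyClos add5 (VInt 10) ==> VInt 15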

Memory Management

Functional languages allocate heavily (every closure, every data structure). We need memory management.

Options:

  1. Manual allocation (simple, leaky): Allocate but never free
    • Fine for short-running programs
    • Memory grows without bound
  2. Reference counting: Track how many references to each object
    • Free when count reaches zero
    • Problem: cycles (A -> B -> A never freed)
    • Solution: weak references or cycle detection
  3. Tracing garbage collection: Periodically find all reachable objects
    • Mark-and-sweep: Mark all reachable, sweep unmarked
    • Copying: Copy all reachable to new space, abandon old
    • Generational: Young objects collected more often
  4. Using Boehm GC: Conservative GC library
    • Drop-in replacement for malloc
    • Automatically collects garbage
    • Easy to integrate

For this project, we recommend starting with Boehm GC:

#include <gc.h>

GC_INIT();                    // once, at program startup
void* ptr = GC_malloc(size);  // No need to free!

Tail Call Optimization

A tail call is a function call whose result is immediately returned:

-- Tail call (can optimize)
f x = g (x + 1)

-- Not a tail call (uses result)
f x = 1 + g x

Why TCO matters: Without it, recursive functions consume stack proportional to recursion depth.
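
Rewriting factorial with an accumulator puts the recursive call in tail position, which is exactly what makes TCO applicable:

-- Accumulator-passing factorial: the recursive call is the last thing
-- the function does, so a TCO'd compiler can reuse the stack frame.
factorial :: Int -> Int
factorial n = go n 1
  where
    go 0 acc = acc
    go k acc = go (k - 1) (k * acc)  -- tail call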

Implementation:

  1. In CPS: All calls are tail calls (no stack growth)
  2. In direct style: Check if call is in tail position, use jmp instead of call
  3. LLVM: Use the musttail attribute; the marked call must be immediately followed by ret:
    %r = musttail call i32 @factorial(i32 %n_minus_1, i32 %new_acc)
    ret i32 %r

Trampoline approach (for non-TCO targets):

typedef struct {
    int tag;  // 0 = value, 1 = thunk
    union {
        int value;
        struct { void* func; void* arg; } thunk;
    };
} Bounce;

int trampoline(Bounce b) {
    while (b.tag == 1) {
        b = ((Bounce(*)(void*))b.thunk.func)(b.thunk.arg);
    }
    return b.value;
}
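
The same idea rendered in Haskell (illustrative; in the compiler the suspended calls would be generated closures):

-- A bounce is either a final value or a suspended tail call.
data Bounce a = Done a | More (() -> Bounce a)

-- The trampoline loops instead of growing the stack.
trampoline :: Bounce a -> a
trampoline (Done x) = x
trampoline (More k) = trampoline (k ())

-- Mutual recursion that would overflow a non-TCO target:
evenB, oddB :: Int -> Bounce Bool
evenB 0 = Done True
evenB n = More (\_ -> oddB (n - 1))
oddB  0 = Done False
oddB  n = More (\_ -> evenB (n - 1))
-- trampoline (evenB 1000000) ==> True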

Calling Conventions

How do we call functions? The calling convention defines:

  • How arguments are passed (registers vs stack)
  • Who saves registers (caller vs callee)
  • How return values are passed

For closures:

call(closure, arg1, arg2, ...)
  = call(closure.code, closure.env, arg1, arg2, ...)

The environment pointer is an implicit first argument.

LLVM calling convention for closures:

; Closure struct: { code_ptr, env_ptr }
%closure = type { ptr, ptr }

; Closure call
define i32 @call_closure(ptr %clos, i32 %arg) {
    %code_ptr = getelementptr %closure, ptr %clos, i32 0, i32 0
    %code = load ptr, ptr %code_ptr
    %env_ptr = getelementptr %closure, ptr %clos, i32 0, i32 1
    %env = load ptr, ptr %env_ptr
    %result = call i32 %code(ptr %env, i32 %arg)
    ret i32 %result
}

Optimization Opportunities

Once in IR form, many optimizations become possible:

  1. Inlining: Replace function call with function body
  2. Constant folding: Evaluate constant expressions at compile time
  3. Dead code elimination: Remove unused computations
  4. Common subexpression elimination: Compute repeated expressions once
  5. Strength reduction: Replace expensive ops with cheaper ones
  6. Loop optimizations: Unrolling, invariant code motion

LLVM provides these optimizations—we just need to generate reasonable IR.
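
As a taste of what such a pass looks like, here is a minimal constant folder over a toy expression type (a simplified stand-in for the Core language):

-- Toy expression type and a bottom-up constant folder.
data E = Lit Int | Add E E | Mul E E deriving Show

fold :: E -> E
fold (Add a b) = case (fold a, fold b) of
  (Lit x, Lit y) -> Lit (x + y)
  (a', b')       -> Add a' b'
fold (Mul a b) = case (fold a, fold b) of
  (Lit x, Lit y) -> Lit (x * y)
  (a', b')       -> Mul a' b'
fold e = e

-- fold (Add (Lit 2) (Mul (Lit 3) (Lit 4)))  ==>  Lit 14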


Complete Project Specification

What You Are Building

A compiler called miniml that compiles Mini-ML to native executables:

miniml compile program.ml -o program
./program

Language Features

-- Core expressions
42, true, false, "string"   -- literals
x                           -- variables
\x -> e                     -- lambda
e1 e2                       -- application
let x = e1 in e2            -- let binding
let rec f = \x -> e in e'   -- recursive binding
if e1 then e2 else e3       -- conditional
(e1, e2)                    -- pairs
fst e, snd e                -- projections

-- Arithmetic and comparison
+, -, *, /                  -- arithmetic
==, /=, <, >, <=, >=        -- comparison
&&, ||, not                 -- boolean operations

-- Optional: Algebraic data types
data List a = Nil | Cons a (List a)
case e of { Nil -> e1 ; Cons x xs -> e2 }

Compilation Stages

Source (.ml)
    |
[Parser] -----------> Expr (AST)
    |
[Type Inference] ---> TypedExpr (AST with types)
    |
[Desugar] ----------> Core (simplified AST)
    |
[Closure Convert] --> CCExpr (explicit closures)
    |
[Lambda Lift] ------> LLExpr (all functions top-level)
    |
[CodeGen] ----------> LLVM Module
    |
[LLVM] ------------> Object file (.o)
    |
[Linker] ----------> Executable

Expected Output

$ cat factorial.ml
let rec factorial = \n ->
  if n == 0 then 1
  else n * factorial (n - 1)
in
  factorial 10

$ miniml compile factorial.ml -o factorial
Parsing... done
Type checking... factorial : Int -> Int
Closure converting... done
Lambda lifting... done
Generating LLVM IR... done
Compiling to native... done

$ ./factorial
3628800

$ miniml compile factorial.ml --emit-llvm
; ModuleID = 'factorial'

define i64 @factorial(i64 %n) {
entry:
  %cmp = icmp eq i64 %n, 0
  br i1 %cmp, label %then, label %else

then:
  ret i64 1

else:
  %n_minus_1 = sub i64 %n, 1
  %rec_result = call i64 @factorial(i64 %n_minus_1)
  %result = mul i64 %n, %rec_result
  ret i64 %result
}

define i32 @main() {
entry:
  %result = call i64 @factorial(i64 10)
  call void @print_int(i64 %result)  ; runtime print helper (name illustrative)
  ret i32 0
}

declare void @print_int(i64)

Solution Architecture

Module Structure

src/
  MiniML/
    Syntax/
      AST.hs           -- Surface syntax AST
      Core.hs          -- Core language
      Parser.hs        -- Parser (Megaparsec)
      Lexer.hs         -- Lexer
    Types/
      Type.hs          -- Type definitions
      Infer.hs         -- Type inference (Algorithm W)
      Unify.hs         -- Unification
      Subst.hs         -- Substitutions
    Transform/
      Desugar.hs       -- Desugar to Core
      ClosureConvert.hs -- Closure conversion
      LambdaLift.hs    -- Lambda lifting
      CPS.hs           -- Optional CPS transform
    CodeGen/
      LLVM.hs          -- LLVM IR generation
      Runtime.hs       -- Runtime support
    Driver/
      Compile.hs       -- Compilation pipeline
      CLI.hs           -- Command-line interface
  Main.hs              -- Entry point

runtime/
  runtime.c            -- C runtime (GC, printing, etc.)
  runtime.h

Core Data Types

-- Surface syntax
data Expr
    = ELit Lit
    | EVar String
    | ELam String Expr
    | EApp Expr Expr
    | ELet String Expr Expr
    | ELetRec String Expr Expr
    | EIf Expr Expr Expr
    | EPair Expr Expr
    | EFst Expr
    | ESnd Expr
    | EBinOp BinOp Expr Expr
    deriving (Show)

-- Types
data Type
    = TInt
    | TBool
    | TVar TyVar
    | TFun Type Type
    | TPair Type Type
    deriving (Eq, Show)

data Scheme = Forall [TyVar] Type
    deriving (Show)

-- Core language (after desugaring)
data Core
    = CLit Lit
    | CVar String
    | CLam String Core
    | CApp Core Core
    | CLet String Core Core
    | CIf Core Core Core
    | CPrimOp PrimOp [Core]
    deriving (Show)

-- After closure conversion
data CCExpr
    = CCLit Lit
    | CCVar String
    | CCMakeClosure String [String]  -- func name, captured vars
    | CCCallClosure CCExpr [CCExpr]
    | CCLet String CCExpr CCExpr
    | CCIf CCExpr CCExpr CCExpr
    | CCPrimOp PrimOp [CCExpr]
    deriving (Show)

-- Top-level function (after lambda lifting)
data TopLevel
    = TopFunc String [String] CCExpr  -- name, params, body
    deriving (Show)

Key Algorithms

Closure Conversion:

closureConvert :: Core -> State Int CCExpr
closureConvert (CLam x body) = do
    body' <- closureConvert body
    let fvs = toList $ freeVars (CLam x body)
    funcName <- freshName "closure"
    -- Register new top-level function
    emitFunction funcName ["env", x] (substituteEnv fvs body')
    return $ CCMakeClosure funcName fvs

closureConvert (CApp e1 e2) = do
    e1' <- closureConvert e1
    e2' <- closureConvert e2
    return $ CCCallClosure e1' [e2']

-- Other cases...

LLVM Code Generation:

genExpr :: CCExpr -> LLVM Operand

genExpr (CCLit (LInt n)) = return $ ConstantOperand (Int 64 n)

genExpr (CCVar x) = do
    ptr <- lookupVar x
    load ptr

genExpr (CCMakeClosure funcName captured) = do
    -- Allocate closure struct
    closSize <- closureSize (length captured)
    ptr <- gcMalloc closSize

    -- Store function pointer
    funcPtr <- getFunction funcName
    storeAt ptr 0 funcPtr

    -- Store captured variables
    forM_ (zip [1..] captured) $ \(i, var) -> do
        val <- lookupVar var
        storeAt ptr i val

    return ptr

genExpr (CCCallClosure clos args) = do
    closPtr <- genExpr clos
    argVals <- mapM genExpr args

    -- Load function pointer from slot 0 of the closure
    funcPtr <- loadAt closPtr 0

    -- With the flat layout built above (captured vars in slots 1..n),
    -- the closure itself serves as the environment argument
    call funcPtr (closPtr : argVals)

genExpr (CCPrimOp op args) = do
    vals <- mapM genExpr args
    genPrimOp op vals

Phased Implementation Guide

Phase 1: Parser and AST (Week 1)

Goal: Parse Mini-ML source code into an AST.

Tasks:

  1. Define Expr AST data type
  2. Implement lexer with Megaparsec
  3. Implement parser for all expression forms
  4. Add pretty-printer for debugging
  5. Write parser tests

Validation:

parse "\\x -> x + 1" == ELam "x" (EBinOp Add (EVar "x") (ELit (LInt 1)))

Hints:

  • Handle operator precedence carefully
  • Support multi-line expressions
  • Add source locations for error messages
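
A minimal lexer sketch using Megaparsec (a hedged starting point; the space consumer treats -- as a line comment, matching Mini-ML's syntax):

import Control.Applicative (empty)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Space consumer: skips whitespace and "--" line comments
sc :: Parser ()
sc = L.space space1 (L.skipLineComment "--") empty

lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc

symbol :: String -> Parser String
symbol = L.symbol sc

integer :: Parser Integer
integer = lexeme L.decimal

identifier :: Parser String
identifier = lexeme ((:) <$> letterChar <*> many alphaNumChar)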

Phase 2: Type Inference (Weeks 2-3)

Goal: Implement Hindley-Milner type inference.

Tasks:

  1. Define Type and Scheme data types
  2. Implement substitution and composition
  3. Implement unification with occurs check
  4. Implement Algorithm W
  5. Add type error messages
  6. Test on polymorphic examples

Validation:

infer "\\x -> x"  == Forall [a] (a -> a)
infer "\\f -> \\x -> f x" == Forall [a,b] ((a -> b) -> a -> b)

Hints:

  • Use State monad for fresh variable generation
  • Use Either TypeError for error handling
  • Generalization happens at let bindings only
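
A minimal fresh-variable supply using the State monad (Type as sketched earlier; a common starting point, not the only design):

import Control.Monad (replicateM)
import Control.Monad.State

type TyVar = String
data Type = TInt | TBool | TVar TyVar | Type :-> Type deriving (Eq, Show)

-- Inference runs in State so every call to freshTVar is unique.
type Infer a = State Int a

freshTVar :: Infer Type
freshTVar = do
  n <- get
  put (n + 1)
  pure (TVar ("t" ++ show n))

-- evalState (replicateM 3 freshTVar) 0
--   ==> [TVar "t0", TVar "t1", TVar "t2"]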

Phase 3: Closure Conversion (Week 4)

Goal: Transform lambdas into explicit closures.

Tasks:

  1. Implement free variable analysis
  2. Define closure-converted AST
  3. Implement closure conversion algorithm
  4. Generate closure allocation code
  5. Transform function applications to closure calls

Validation:

Before: \x -> \y -> x + y
After: makeClosure("f1", []) where f1(env, x) = makeClosure("f2", [x])
       and f2(env, y) = env[0] + y

Hints:

  • Track which variables are from environment vs parameters
  • Generate unique names for closure functions
  • Test with nested lambdas

Phase 4: Lambda Lifting (Week 5)

Goal: Move all functions to top level.

Tasks:

  1. Collect all closure functions
  2. Generate top-level function definitions
  3. Replace closure creation with allocation + init
  4. Handle recursive functions specially
  5. Verify all functions are now global

Validation: No nested lambdas remain; all functions at top level.

Hints:

  • Maintain a map from closure names to their definitions
  • Recursive closures need special handling (store function pointer after allocation)

Phase 5: LLVM Code Generation (Weeks 6-8)

Goal: Generate LLVM IR from closure-converted code.

Tasks:

  1. Set up llvm-hs-pure or llvm-hs library
  2. Generate code for literals and arithmetic
  3. Implement closure struct representation
  4. Generate code for closure creation
  5. Generate code for closure calls
  6. Implement primitive operations
  7. Create main function

Validation:

miniml compile factorial.ml --emit-llvm | lli
# Outputs: 3628800

Hints:

  • Start with a minimal runtime (just printing)
  • Use i64 for all integers initially
  • Test each construct in isolation
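
To get oriented with the llvm-hs-pure IRBuilder API, a tiny module that emits an add function might look like this (a sketch assuming llvm-hs-pure's module layout; render the result with llvm-hs-pretty or serialize via llvm-hs):

{-# LANGUAGE OverloadedStrings #-}
module Emit where

import LLVM.AST (Module)
import LLVM.AST.Type (i64)
import LLVM.IRBuilder.Module (buildModule, function)
import LLVM.IRBuilder.Instruction (add, ret)

-- Builds a module containing: define i64 @add(i64 %a, i64 %b)
simpleModule :: Module
simpleModule = buildModule "example" $ do
  _ <- function "add" [(i64, "a"), (i64, "b")] i64 $ \[a, b] -> do
    r <- add a b
    ret r
  pure ()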

Phase 6: Runtime and GC (Weeks 9-10)

Goal: Add runtime support and memory management.

Tasks:

  1. Create C runtime with print functions
  2. Integrate Boehm GC for allocation
  3. Handle program startup (call main)
  4. Add error handling (division by zero, etc.)
  5. Link runtime with generated code

Validation: Programs run without memory leaks (check with valgrind).

Hints:

  • Keep runtime minimal
  • Use GC_malloc for all heap allocations
  • Compile runtime to object file, link with generated code

Phase 7: Optimization and Polish (Weeks 11-12)

Goal: Optimize generated code and improve UX.

Tasks:

  1. Add LLVM optimization passes
  2. Implement tail call optimization
  3. Improve error messages
  4. Add --emit-llvm and --emit-asm flags
  5. Handle multiple source files
  6. Write comprehensive tests

Validation: Factorial of 1,000,000 runs without stack overflow (with TCO).

Hints:

  • LLVM’s optimization passes are powerful—use them
  • musttail for tail calls
  • Profile and optimize hot paths

Testing Strategy

Unit Tests

-- Parser tests
testParseLambda = parse "\\x -> x" @?= Right (ELam "x" (EVar "x"))
testParseApp = parse "f x y" @?= Right (EApp (EApp (EVar "f") (EVar "x")) (EVar "y"))

-- Type inference tests
testInferIdent = inferType "\\x -> x" @?= Right (Forall [a] (TVar a :-> TVar a))
testInferConst = inferType "\\x -> \\y -> x" @?=
    Right (Forall [a,b] (TVar a :-> TVar b :-> TVar a))

-- Closure conversion tests
testClosureSimple = closureConvert (ELam "x" (EVar "x")) @?=
    CCMakeClosure "f0" []

testClosureCapture = closureConvert (ELam "x" (ELam "y" (EBinOp Add (EVar "x") (EVar "y")))) @?=
    -- Should capture x in inner closure
    ...

Integration Tests

Create test files and verify output:

# test_arith.ml
let x = 2 + 3 in x * 4

# Expected output: 20

A harness that compiles each file and checks its output:

integrationTest :: FilePath -> String -> Test
integrationTest file expected = TestCase $ do
    compile file "test_output"
    result <- readProcess "./test_output" [] ""
    assertEqual file expected result

Property-Based Tests

-- Type inference preserves semantics
prop_typePreservation :: Expr -> Property
prop_typePreservation e =
    isRight (inferType e) ==>
    eval e == compileAndRun e

-- Closure conversion preserves semantics
prop_closureCorrect :: Expr -> Property
prop_closureCorrect e =
    interpret e == compileAndRun e

Common Pitfalls and Debugging

Pitfall 1: Wrong Free Variable Calculation

Symptom: Closures don’t capture the right variables.

Cause: Forgetting to remove bound variables.

Solution:

freeVars (ELam x body) = freeVars body `difference` singleton x
-- NOT: freeVars body (wrong!)

Pitfall 2: Closure Self-Reference

Symptom: Recursive closures segfault.

Cause: Closure tries to call itself before it’s fully constructed.

Solution:

-- Allocate first, then store function pointer
closPtr <- allocateClosure size
funcPtr <- getFunction funcName
storeAt closPtr 0 funcPtr  -- Now we can store pointer to self

Pitfall 3: Type Inference Infinite Loop

Symptom: Type inference hangs on certain expressions.

Cause: Missing occurs check in unification.

Solution:

unify (TVar a) t
    | t == TVar a    = Right emptySubst         -- binding a to itself is a no-op
    | a `occursIn` t = Left $ InfiniteType a t  -- Occurs check!
    | otherwise      = Right [(a, t)]

Pitfall 4: LLVM Type Mismatches

Symptom: LLVM rejects generated IR with type errors.

Cause: Mixing up types (i32 vs i64, ptr vs value).

Solution:

  • Always track types explicitly
  • Use helper functions that ensure type consistency
  • Validate IR with opt -verify

Pitfall 5: Missing GC Roots

Symptom: Programs crash or produce garbage after GC.

Cause: Live pointers not visible to garbage collector.

Solution:

  • Store all live pointers in GC-visible locations
  • Use GC_malloc for all heap allocation
  • Consider using Boehm GC’s conservative scanning

Debugging Tips

  1. Emit intermediate representations: Add --dump-core, --dump-closure flags
  2. Test each stage: Verify parser, then type inference, then closure conversion, etc.
  3. Use lli for testing: LLVM interpreter catches many IR errors
  4. Compare with reference: Use GHC to verify expected behavior
  5. Add runtime tracing: Print when closures are created/called

Extensions and Challenges

Extension 1: Algebraic Data Types

Add sum types and pattern matching:

data List a = Nil | Cons a (List a)

let length = \xs ->
  case xs of
    Nil -> 0
    Cons _ rest -> 1 + length rest

Requires:

  • Type inference for ADTs
  • Representation of constructors (tagged unions)
  • Pattern match compilation (decision trees)

Extension 2: Polymorphic Recursion

Support polymorphic recursive functions:

let rec length = \xs ->
  case xs of
    Nil -> 0
    Cons _ rest -> 1 + length rest  -- length called polymorphically

Requires explicit type annotations or more sophisticated inference.

Extension 3: Separate Compilation

Support multiple source files:

-- Lib.ml
let square = \x -> x * x

-- Main.ml
import Lib
square 5

Requires:

  • Module system
  • Interface files
  • Linking multiple object files

Extension 4: Optimizations

Implement optimizations:

  • Inlining small functions
  • Constant propagation
  • Dead code elimination
  • Unboxing specialized monomorphic code

Challenge: Self-Hosting

Write the compiler in Mini-ML itself:

-- In Mini-ML
let parse = \source -> ...
let typeCheck = \ast -> ...
let compile = \ast -> ...

let main = \() ->
  let source = readFile "input.ml" in
  let ast = parse source in
  let typedAst = typeCheck ast in
  compile typedAst

This is the ultimate test—your language can compile itself!


Real-World Connections

Where These Techniques Appear

  1. GHC (Haskell): Uses STG machine, a specialized closure representation
  2. OCaml: Direct-style compilation with efficient closures
  3. Rust closures: Compile to structs with captured variables
  4. JavaScript engines (V8): Closure optimization for performance
  5. Clang/LLVM: LLVM IR is the industry standard

Industry Applications

  • Facebook Hack/HHVM: JIT compilation of PHP/Hack
  • Apple Swift: LLVM-based, aggressive optimization
  • WebAssembly compilers: Functional language to Wasm
  • MLton: Whole-program optimizing SML compiler
  • Graal/Truffle: Self-optimizing language implementations

What This Project Teaches You

  • Compiler architecture (applicable to any language)
  • Type system implementation (fundamental for language tools)
  • Low-level code generation (useful for performance optimization)
  • LLVM IR (industry-standard intermediate representation)
  • Memory management strategies (important for systems programming)

Interview Questions

After completing this project, you should be able to answer:

  1. What is closure conversion and why is it necessary?
    • Higher-order functions reference free variables
    • Machine functions cannot access caller’s stack
    • Solution: pack free variables into a data structure
    • Closure = code pointer + environment
  2. How does Hindley-Milner type inference work?
    • Generate constraints from expression structure
    • Solve constraints via unification
    • Generalize at let bindings (let polymorphism)
    • Algorithm W or Algorithm J
  3. What is the difference between lambda lifting and closure conversion?
    • Closure conversion: make closures explicit data structures
    • Lambda lifting: move functions to top level, add params for free vars
    • Often used together
    • Lifting can avoid runtime closure allocation in some cases
  4. How do you implement tail call optimization?
    • Identify tail position calls
    • Reuse current stack frame instead of pushing new
    • In LLVM: musttail attribute
    • Alternative: CPS makes all calls tail calls
  5. What is CPS and what are its benefits for compilation?
    • Continuation-passing style: explicit continuations
    • Makes control flow explicit
    • All calls become tail calls
    • Easier to implement exceptions, generators, coroutines
  6. How does garbage collection work with closures?
    • Closures contain pointers to captured values
    • GC must trace through closure environments
    • Conservative GC (Boehm) scans all memory for pointers
    • Precise GC requires stack maps
  7. What is LLVM IR and why is it useful?
    • Low-level intermediate representation
    • Typed, SSA form
    • Platform-independent
    • Powerful optimization passes
    • Easy code generation for multiple architectures

Self-Assessment Checklist

Before considering this project complete, verify:

  • Parser handles all language constructs
  • Type inference correctly infers polymorphic types
  • Let polymorphism works (same function used at different types)
  • Closure conversion correctly identifies free variables
  • Closures capture their lexical environment
  • Lambda lifting produces only top-level functions
  • Generated LLVM IR is valid (passes opt -verify)
  • Simple programs compile and run correctly
  • Recursive functions work
  • Higher-order functions work
  • Tail recursive functions don’t overflow stack (with TCO)
  • Memory doesn’t grow unboundedly (GC works)
  • Error messages are helpful

Resources

Essential Reading

  • Compilation overview: “Modern Compiler Implementation in ML” by Andrew Appel (comprehensive textbook)
  • CPS and closures: “Compiling with Continuations” by Andrew Appel (deep dive into CPS-based compilation)
  • Type inference: “Types and Programming Languages” by Pierce, Ch. 22 (Hindley-Milner algorithm)
  • LLVM: LLVM Language Reference Manual (official IR documentation)
  • llvm-hs: Hackage documentation (Haskell LLVM bindings)
  • Garbage collection: “The Garbage Collection Handbook” by Jones et al. (comprehensive GC reference)

Papers

  • “The Definition of Standard ML” - Milner et al.
  • “Practical Type Inference for Arbitrary-Rank Types” - Peyton Jones et al.
  • “Orbit: An Optimizing Compiler for Scheme” - Kranz et al.
  • “Making a Fast Curry” - Marlow & Peyton Jones
  • “Lambda Lifting in Quadratic Time” - Danvy & Schultz

Books

  • “Compilers: Principles, Techniques, and Tools” (Dragon Book) by Aho et al.
  • “Engineering a Compiler” by Cooper & Torczon
  • “Essentials of Programming Languages” by Friedman & Wand
  • “Lisp in Small Pieces” by Queinnec

Reference Implementations

  • GHC: The Glasgow Haskell Compiler (production quality)
  • MLton: Whole-program SML compiler
  • Chicken Scheme: Scheme to C compiler
  • Mini-Haskell compilers: Various tutorials online

“A programming language is low level when its programs require attention to the irrelevant.” - Alan Perlis

By completing this project, you will understand why functional languages require sophisticated compilation techniques—and you’ll have built a real compiler that transforms elegant lambda calculus into efficient machine code.