Project 4: Simply Typed Lambda Calculus + Type Inference
Project 4: Simply Typed Lambda Calculus + Type Inference
Build a type checker for the simply typed lambda calculus, then extend it to full Hindley-Milner type inference with let-polymorphism.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 3-4 weeks |
| Language | Haskell (Alternatives: OCaml, Rust, Scala) |
| Prerequisites | Projects 1-3 (Lambda Calculus Interpreter), understanding of substitution, basic type theory |
| Key Topics | Simply Typed Lambda Calculus, Hindley-Milner Type System, Algorithm W, Unification, Let-Polymorphism |
| Main Resources | âTypes and Programming Languagesâ by Benjamin Pierce (Chapters 9-11, 22) |
1. Learning Objectives
By completing this project, you will:
- Understand type systems from first principles: Explain why types exist, what problems they solve, and how they constrain programs to prevent certain errors
- Implement a type checker for STLC: Given an explicitly-typed term, verify it is well-typed using bidirectional type checking
- Implement Hindley-Milner type inference: Infer the most general type of an untyped term without any annotations
- Master unification: Implement Robinsonâs unification algorithm for solving type constraints
- Understand let-polymorphism: Explain why
letenables polymorphism while lambda abstraction alone does not - Debug type errors: Given a type error, trace back to understand what constraint was violated and why
- Connect theory to practice: Relate your implementation to how GHC, OCaml, and Rust infer types
2. Conceptual Foundation
2.1 Why Types?
Before diving into implementation, we must understand why type systems exist at all.
The Fundamental Problem
Consider the untyped lambda calculus from Project 1. Every term is a function, and any term can be applied to any other term. This is maximally flexible but also maximally dangerous:
(lambda x. x + 1) "hello" -- What does "hello" + 1 mean?
(lambda x. x x) (lambda x. x x) -- Infinite loop
(lambda x. x 5) true -- What does true(5) mean?
In the untyped world, these expressions are syntactically valid. We only discover they are nonsensical at runtime (or they diverge forever).
Types solve this by classifying terms before execution. If we can prove at compile time that every function application makes sense, we can eliminate entire classes of runtime errors.
What Types Are
A type is a conservative approximation of the set of values a term can evaluate to:
Intmeans âthis term will produce an integerâBoolmeans âthis term will produce a booleanâInt -> Boolmeans âthis term is a function that takes an integer and produces a booleanâ
The key word is conservative: the type system may reject some programs that would actually run correctly, but it will never accept a program that would produce a type error at runtime.
This is the fundamental tradeoff: type systems sacrifice completeness for soundness. A well-typed program might still loop forever or throw a non-type exception, but it will never apply a function to an argument of the wrong type.
2.2 The Simply Typed Lambda Calculus (STLC)
The Simply Typed Lambda Calculus (STLC) is the simplest typed extension of the lambda calculus. It was introduced by Alonzo Church in 1940 and forms the foundation of all modern type systems.
Syntax
The STLC extends lambda calculus with type annotations:
Types:
T ::= Int -- Base type
| Bool -- Base type
| T1 -> T2 -- Function type (right-associative)
Terms:
e ::= x -- Variable
| lambda x:T. e -- Abstraction (annotated)
| e1 e2 -- Application
| n -- Integer literal
| true | false -- Boolean literals
The crucial addition is that lambda-bound variables now carry type annotations: lambda x:Int. x + 1 instead of just lambda x. x + 1.
The Function Type
The arrow type A -> B is the type of functions from A to B. It is right-associative, meaning:
A -> B -> C = A -> (B -> C)
This makes sense when you think about currying. A function add : Int -> Int -> Int takes an Int and returns a function Int -> Int. So:
add : Int -> Int -> Int -- Whole function
add 3 : Int -> Int -- Partially applied
add 3 4 : Int -- Fully applied
Typing Rules
The STLC is defined by typing judgments of the form:
Gamma |- e : T
Read as: âUnder context Gamma, expression e has type Tâ
The context (or environment) Gamma maps variable names to their types. The typing rules are:
(VAR) Gamma, x:T |- x : T
(ABS) Gamma, x:T1 |- e : T2
-------------------------
Gamma |- (lambda x:T1. e) : T1 -> T2
(APP) Gamma |- e1 : T1 -> T2 Gamma |- e2 : T1
--------------------------------------------
Gamma |- e1 e2 : T2
Letâs read these rules:
-
VAR: If
xhas typeTin the context, thenxhas typeT. (Trivial but necessary.) - ABS: To type an abstraction
lambda x:T1. e:- Add
x:T1to the context - Type-check the body
eto get some typeT2 - The whole abstraction has type
T1 -> T2
- Add
- APP: To type an application
e1 e2:- Type-check
e1to get a function typeT1 -> T2 - Type-check
e2to get typeT1(must match the domain!) - The result has type
T2
- Type-check
Type Checking vs Type Inference
In STLC, we typically do type checking: given a term with type annotations, verify it is well-typed. This is straightforward because we always know the types of bound variables.
Type inference is harder: given an unannotated term, figure out what type annotations would make it well-typed (if any). This is what Hindley-Milner does.
2.3 The Problem with Simple Types
STLC has a severe limitation: no polymorphism. Consider the identity function:
id = lambda x:Int. x
This works for integers, but if we want to use it on booleans, we need a separate function:
id_bool = lambda x:Bool. x
And for functions, yet another:
id_fun = lambda x:(Int -> Int). x
This is absurd. The identity function does the same thing regardless of type. We want to write it once:
id = lambda x. x
And have it work at any type. This requires parametric polymorphism: the ability for a single function to work at many types.
2.4 Hindley-Milner Type System
The Hindley-Milner (HM) type system solves this problem. Independently discovered by Roger Hindley in 1969 and Robin Milner in 1978, it provides:
- Type inference: No type annotations needed
- Parametric polymorphism: Functions can work at multiple types
- Principal types: Every expression has a âmost generalâ type
Type Variables and Quantification
HM introduces type variables (written as a, b, c or in Haskell notation a, b, c):
id : forall a. a -> a
Read: âFor all types a, id is a function from a to a.â
The forall is called universal quantification. It means id can be used at any type:
id 5 -- id used at type Int -> Int
id true -- id used at type Bool -> Bool
id id -- id used at type (a -> a) -> (a -> a)
Each use instantiates the type variable with a concrete type.
Monotypes vs Polytypes
HM distinguishes two kinds of types:
Monotypes (tau): Types without quantifiers
Int
Bool
a -- A type variable
a -> b -- Function with type variables
(a -> b) -> a -> b -- More complex
Polytypes or Type Schemes (sigma): Quantified types
forall a. a -> a
forall a b. a -> b -> a
forall a. (a -> a) -> a -> a
The quantifiers are only allowed at the âtop levelâ of a type scheme. This is called prenex polymorphism or rank-1 polymorphism.
The Let-Polymorphism Insight
Hereâs the key question: when can we generalize a type? Consider:
let id = lambda x. x
in (id 5, id true)
We want id to be polymorphic so we can use it at both Int and Bool. But consider:
(lambda id. (id 5, id true)) (lambda x. x)
This is equivalent beta-eta, but now we have a problem! Lambda-bound variables cannot be polymorphic in HM. The term id has a single type that must unify with both Int -> Int and Bool -> Bool, which is impossible.
The solution: let is special. When we write:
let x = e1 in e2
We can generalize the type of e1 before using it in e2. This generalization is only safe because:
- We know exactly what
e1is (its full definition) - We can specialize it differently at each use site
Lambda-bound variables donât have this luxuryâwe donât know what argument will be passed until runtime.
2.5 Algorithm W: Type Inference
Algorithm W is the standard algorithm for HM type inference. It was introduced by Milner and is the basis for type inference in ML, OCaml, Haskell, and many other languages.
The Core Idea
Algorithm W works by:
- Generating fresh type variables for unknowns
- Collecting constraints during traversal
- Solving constraints via unification
- Generalizing at let-bindings
Unification
Unification is the heart of type inference. Given two types, find a substitution (mapping from type variables to types) that makes them equal.
unify(a, Int) = {a := Int}
unify(a -> b, Int -> Bool) = {a := Int, b := Bool}
unify(a -> a, Int -> Bool) = FAIL (a cannot be both Int and Bool)
Robinsonâs unification algorithm:
unify(T1, T2):
if T1 is a type variable a:
if a == T2: return {}
if a occurs in T2: FAIL (infinite type!)
return {a := T2}
if T2 is a type variable: return unify(T2, T1)
if T1 = S1 -> S2 and T2 = R1 -> R2:
sub1 = unify(S1, R1)
sub2 = unify(apply(sub1, S2), apply(sub1, R2))
return compose(sub2, sub1)
if T1 = T2 (same base type): return {}
FAIL (incompatible types)
The occurs check prevents infinite types. Without it, unify(a, a -> Int) would produce {a := a -> Int}, but then a = a -> Int = (a -> Int) -> Int = ... forever.
The Algorithm
Algorithm W traverses the expression, maintaining:
- A type environment mapping variables to type schemes
- A substitution accumulated from unifications
- A fresh variable counter
W(env, e):
case e of:
x (variable):
scheme = lookup(x, env)
return (empty_subst, instantiate(scheme))
lambda x. body:
alpha = fresh_type_var()
(subst, tau) = W(env + {x: alpha}, body)
return (subst, apply(subst, alpha) -> tau)
e1 e2:
(s1, tau1) = W(env, e1)
(s2, tau2) = W(apply(s1, env), e2)
alpha = fresh_type_var()
s3 = unify(apply(s2, tau1), tau2 -> alpha)
return (compose(s3, compose(s2, s1)), apply(s3, alpha))
let x = e1 in e2:
(s1, tau1) = W(env, e1)
scheme = generalize(apply(s1, env), tau1)
(s2, tau2) = W(apply(s1, env) + {x: scheme}, e2)
return (compose(s2, s1), tau2)
Generalization and Instantiation
Generalization turns a monotype into a polytype by quantifying over free type variables that donât appear in the environment:
generalize(env, tau):
free_vars = freeVars(tau) - freeVars(env)
return forall free_vars. tau
Instantiation turns a polytype into a monotype by replacing quantified variables with fresh type variables:
instantiate(forall a1 ... an. tau):
alpha1 = fresh(), ..., alphan = fresh()
return tau[a1 := alpha1, ..., an := alphan]
2.6 Type Safety: Progress and Preservation
A type system is sound if it satisfies two properties:
- Progress: A well-typed term is either a value or can take a step.
- Preservation: If a well-typed term takes a step, the result is also well-typed.
Together, these guarantee: well-typed programs donât get stuck.
For STLC/HM, we can prove:
- Progress: By case analysis on the typing derivation
- Preservation: By the substitution lemma (substituting a value for a variable preserves types)
2.7 Extensions and Variations
Adding Base Types and Primitives
Real languages have more than just functions:
Gamma |- n : Int (INT)
Gamma |- true : Bool (TRUE)
Gamma |- false : Bool (FALSE)
Gamma |- e1 : Int Gamma |- e2 : Int
-------------------------------------- (PLUS)
Gamma |- e1 + e2 : Int
Gamma |- e1 : Bool Gamma |- e2 : T Gamma |- e3 : T
-------------------------------------------------------- (IF)
Gamma |- if e1 then e2 else e3 : T
Recursive Types and Recursion
The Y combinator is untypable in STLC! To support recursion, we need either:
- A fixpoint operator:
fix : forall a. (a -> a) -> a - Recursive types:
type List a = Nil | Cons a (List a)
Type Annotations
Even with inference, programmers sometimes want to annotate types:
id : forall a. a -> a
id x = x
This requires checking that the inferred type is at least as general as the declared type.
2.8 Common Type Errors and What They Mean
| Error | Meaning |
|---|---|
| âCannot unify Int with Boolâ | You used an Int where a Bool was expected (or vice versa) |
| âInfinite type: a ~ a -> bâ | Occurs check failed; youâre trying to create a recursive type |
| âUnbound variable: xâ | Variable used but not in scope |
| âCannot unify a -> b with Intâ | You applied something thatâs not a function |
| âRigid type variableâ (GHC) | Youâre trying to instantiate a type variable thatâs already fixed |
3. Project Specification
3.1 Phase 1: Simply Typed Lambda Calculus (Week 1)
Build a type checker for the simply typed lambda calculus with explicit type annotations.
Features
- Syntax:
- Types:
Int,Bool,T1 -> T2 - Terms: variables, annotated lambdas, application, literals
- Parser for the syntax
- Types:
- Type Checker:
- Implement the typing rules for STLC
- Maintain a typing context (environment)
- Produce helpful error messages
- REPL:
:type <expr>to check the type of an expression- Parse and type-check expressions interactively
Example Session
> :type lambda x:Int. x
Int -> Int
> :type lambda x:Int. x + 1
Int -> Int
> :type lambda f:(Int -> Int). lambda x:Int. f (f x)
(Int -> Int) -> Int -> Int
> :type lambda x:Int. x x
Type error: Cannot unify Int with Int -> a
In expression: x x
x has type Int but is applied as a function
> :type (lambda x:Int. x) true
Type error: Cannot unify Int with Bool
Argument 'true' has type Bool
But function expects type Int
3.2 Phase 2: Unification (Week 2)
Implement Robinsonâs unification algorithm as a standalone module.
Features
- Type representation with variables:
- Type variables (fresh, named)
- Substitutions (mappings from variables to types)
- Apply substitution to type
- Unification:
unify :: Type -> Type -> Either TypeError Substitution- Occurs check to prevent infinite types
- Compose substitutions correctly
- Tests:
- Unit tests for all unification cases
- Property tests:
unify t1 t2produces substitution such thatapply s t1 == apply s t2
3.3 Phase 3: Hindley-Milner Inference (Week 3)
Implement Algorithm W for full type inference.
Features
- Type Schemes:
forallquantification- Generalization (monotype to polytype)
- Instantiation (polytype to monotype)
- Algorithm W:
- Fresh variable generation
- Constraint collection
- Let-polymorphism
- Extended REPL:
> :type \x -> x Inferred: forall a. a -> a > :type \f -> \x -> f (f x) Inferred: forall a. (a -> a) -> a -> a > let id = \x -> x in (id 5, id true) Inferred: (Int, Bool)
3.4 Phase 4: Extensions (Week 4)
Extend your type system with practical features.
Features
- Recursive let (
let rec):> let rec fact = \n -> if n == 0 then 1 else n * fact (n - 1) in fact 5 Inferred: Int - Pattern matching (basic):
> \(x, y) -> x + y Inferred: (Int, Int) -> Int - Better error messages:
- Location information
- Suggestions for common mistakes
- âDid you mean?â for typos
4. Solution Architecture
4.1 Module Organization
src/
Types/
Syntax.hs -- Type and Term AST
Substitution.hs -- Substitutions on types
Unify.hs -- Unification algorithm
Check.hs -- STLC type checking
Infer.hs -- HM type inference
Scheme.hs -- Type schemes, generalization, instantiation
Error.hs -- Type error representation
Pretty.hs -- Pretty printing types and terms
Parser/
Lexer.hs
Parser.hs
Repl.hs
Main.hs
test/
Types/
SubstitutionSpec.hs
UnifySpec.hs
CheckSpec.hs
InferSpec.hs
4.2 Core Data Types
-- Types
data Type
= TVar TVar -- Type variable
| TInt -- Base type
| TBool -- Base type
| TArrow Type Type -- Function type
newtype TVar = TVar String
-- Type schemes (polytypes)
data Scheme = Forall [TVar] Type
-- Terms
data Expr
= Var String
| Lam String (Maybe Type) Expr -- Optional annotation
| App Expr Expr
| Let String Expr Expr
| LetRec String Expr Expr
| IntLit Int
| BoolLit Bool
| BinOp Op Expr Expr
| If Expr Expr Expr
-- Type errors
data TypeError
= UnboundVariable String
| CannotUnify Type Type
| OccursCheckFailed TVar Type
| NotAFunction Type
| TypeMismatch Type Type String -- expected, actual, context
4.3 Substitution
A substitution is a finite mapping from type variables to types:
newtype Subst = Subst (Map TVar Type)
empty :: Subst
singleton :: TVar -> Type -> Subst
compose :: Subst -> Subst -> Subst -- Order matters!
class Substitutable a where
apply :: Subst -> a -> a
freeVars :: a -> Set TVar
Key insight: compose s2 s1 means âapply s1 first, then s2â.
compose s2 s1 = Subst (Map.map (apply s2) (unSubst s1) `Map.union` unSubst s2)
4.4 Unification Algorithm
unify :: Type -> Type -> Either TypeError Subst
unify (TVar v) t = bindVar v t
unify t (TVar v) = bindVar v t
unify (TArrow a1 a2) (TArrow b1 b2) = do
s1 <- unify a1 b1
s2 <- unify (apply s1 a2) (apply s1 b2)
return (compose s2 s1)
unify TInt TInt = return empty
unify TBool TBool = return empty
unify t1 t2 = Left (CannotUnify t1 t2)
bindVar :: TVar -> Type -> Either TypeError Subst
bindVar v (TVar u) | v == u = return empty
bindVar v t | v `Set.member` freeVars t = Left (OccursCheckFailed v t)
bindVar v t = return (singleton v t)
4.5 Algorithm W
-- The inference monad
type Infer a = ExceptT TypeError (State InferState) a
data InferState = InferState
{ counter :: Int -- For fresh variables
}
fresh :: Infer Type
fresh = do
n <- gets counter
modify (\s -> s { counter = n + 1 })
return (TVar (TVar ("t" ++ show n)))
-- Main inference function
infer :: TypeEnv -> Expr -> Infer (Subst, Type)
5. Implementation Guide
5.1 Phase 1 Milestones
Milestone 1.1: Type Representation
- Define
Typedata type - Implement
ShowandEq - Parse simple types
Milestone 1.2: Term Representation
- Define
Exprdata type - Parser for explicitly-typed terms
- Pretty printer
Milestone 1.3: Type Checker
- Implement
check :: TypeEnv -> Expr -> Either TypeError Type - Handle all STLC cases
- Error messages with context
5.2 Phase 2 Milestones
Milestone 2.1: Substitution
- Implement
Substtype - Implement
applyfor types and environments - Implement
composecorrectly
Milestone 2.2: Unification
- Implement basic unification
- Add occurs check
- Comprehensive test suite
Milestone 2.3: Property Tests
- Idempotence:
apply s (apply s t) == apply s t - Correctness:
unify t1 t2 = Right s => apply s t1 == apply s t2 - Compositionality tests
5.3 Phase 3 Milestones
Milestone 3.1: Type Schemes
- Define
Schemetype - Implement
generalize - Implement
instantiate
Milestone 3.2: Algorithm W
- Implement core algorithm
- Handle let-polymorphism
- Fresh variable generation
Milestone 3.3: Integration
- Connect to REPL
- Infer types for unannotated terms
- Show most general type
5.4 Hints for Tricky Parts
Hint 1: Composition Order
The order in compose s2 s1 matters. Think of it as function composition:
apply (compose s2 s1) t == apply s2 (apply s1 t)
Hint 2: Environment Updates
When descending into a lambda body, you extend the environment. But when returning, you must apply the accumulated substitution to the extended variableâs type.
Hint 3: Let vs Lambda
The key difference:
- Lambda: The bound variable has a monotype (no generalization)
- Let: The bound variableâs type is generalized before use
Hint 4: Occurs Check
Donât forget it! Without the occurs check:
\x -> x x
Would infer a ~ a -> b, giving the âinfinite typeâ a -> a -> a -> ...
6. Testing Strategy
6.1 Unit Tests
-- Unification tests
describe "unify" $ do
it "unifies type variables" $
unify (TVar "a") TInt `shouldBe` Right (singleton "a" TInt)
it "fails occurs check" $
unify (TVar "a") (TArrow (TVar "a") TInt) `shouldSatisfy` isLeft
it "unifies function types" $
unify (TArrow (TVar "a") (TVar "b"))
(TArrow TInt TBool)
`shouldBe` Right (fromList [("a", TInt), ("b", TBool)])
-- Inference tests
describe "infer" $ do
it "infers identity" $
inferType (Lam "x" Nothing (Var "x"))
`shouldBe` Right (Forall ["a"] (TArrow (TVar "a") (TVar "a")))
it "infers const" $
inferType (Lam "x" Nothing (Lam "y" Nothing (Var "x")))
`shouldBe` Right (Forall ["a", "b"] (TArrow (TVar "a") (TArrow (TVar "b") (TVar "a"))))
6.2 Property-Based Tests
-- Substitution properties
prop_applyIdempotent :: Subst -> Type -> Bool
prop_applyIdempotent s t = apply s (apply s t) == apply s t
-- Unification properties
prop_unifyCorrect :: Type -> Type -> Property
prop_unifyCorrect t1 t2 =
case unify t1 t2 of
Left _ -> property True -- No constraint on failure
Right s -> apply s t1 === apply s t2
-- Generalization/Instantiation
prop_instantiateSpecializes :: TypeEnv -> Type -> Property
prop_instantiateSpecializes env tau =
let scheme = generalize env tau
tau' = runInfer (instantiate scheme)
in canUnify tau tau'
6.3 Integration Tests
Test complete programs:
testPrograms :: [(String, String)]
testPrograms =
[ ("\\x -> x", "forall a. a -> a")
, ("\\f -> \\x -> f (f x)", "forall a. (a -> a) -> a -> a")
, ("let id = \\x -> x in id id", "forall a. a -> a")
, ("let id = \\x -> x in (id 1, id true)", "(Int, Bool)")
, ("\\x -> x x", "TYPE ERROR") -- Should fail
]
7. Common Pitfalls
7.1 Substitution Composition
Wrong:
compose s2 s1 = Subst (unSubst s1 `Map.union` unSubst s2)
Right:
compose s2 s1 = Subst (Map.map (apply s2) (unSubst s1) `Map.union` unSubst s2)
You must apply s2 to the range of s1!
7.2 Forgetting the Occurs Check
Without the occurs check, your inferencer will loop or produce nonsense:
-- Without occurs check:
unify (TVar "a") (TArrow (TVar "a") TInt)
-- Returns: {a := a -> Int}
-- But a = a -> Int = (a -> Int) -> Int = ...
7.3 Generalization Timing
You must generalize after type checking e1 in let x = e1 in e2, not before. The substitution from checking e1 must be applied first.
7.4 Fresh Variable Collisions
Use a counter or supply monad for fresh variables. Donât use random names or youâll get collisions.
7.5 Monomorphism Restriction
In Haskell, top-level bindings without type signatures and with no function arguments donât get generalized (the âmonomorphism restrictionâ). You can ignore this for your project or implement it as an extension.
8. Extensions and Challenges
8.1 Intermediate Extensions
- Type Annotations: Allow optional annotations and check theyâre not less general than inferred
- Recursive Let: Extend
letto allow recursive definitions - Tuples and Records: Add product types with projection
- Sum Types: Add
Either-like types with case analysis
8.2 Advanced Extensions
- Higher-Rank Types: Allow
forallinside types (Rank-N polymorphism) - Constraint-Based Inference: Separate constraint generation from solving
- Error Recovery: Continue after errors to report multiple problems
- Type Classes: Add Haskell-style overloading (significant extension!)
8.3 Research Extensions
- Row Polymorphism: Extensible records
- GADTs: Generalized Algebraic Data Types
- Dependent Types: Types that depend on values
- Effect Systems: Track side effects in types
9. Real-World Connections
9.1 Where HM Appears
| Language | Type System |
|---|---|
| Haskell | HM + type classes + many extensions |
| OCaml | HM + objects + modules |
| ML (SML) | Classic HM |
| F# | HM + .NET interop |
| Rust | HM-inspired + ownership + lifetimes |
| TypeScript | Structural + limited inference (not HM) |
| Elm | Pure HM (no type classes) |
9.2 Type Inference in Production
GHC (Haskell): Uses constraint-based inference with extensive extensions (GADTs, type families, etc.). Error messages have improved dramatically with work on error provenance.
Rust: The borrow checker is separate from type inference. Rustâs inference is mostly local (within functions) with limited bidirectional inference.
TypeScript: Uses a âbest common typeâ algorithm with widening. Not HM because it needs to interop with JavaScriptâs dynamic typing.
9.3 Industry Applications
- Compiler internals: Every typed language needs a type checker
- IDE support: Type inference powers autocomplete and refactoring
- Static analysis: Type-like systems for security, dataflow
- Program synthesis: Types guide search for correct programs
10. Interview Questions
After completing this project, you should be able to answer:
- What is the difference between type checking and type inference?
- Checking: verify a term has a given type
- Inference: discover the type without annotations
- HM does inference with principal types
- Explain the occurs check and why itâs necessary.
- Prevents infinite types like
a = a -> Int - Without it, unification would loop or produce nonsense
- Related to the termination of the unification algorithm
- Prevents infinite types like
- Why can lambda-bound variables not be polymorphic in HM?
- We donât know what value will be passed at runtime
- Canât generalize because the variable might unify with something concrete
- Let-polymorphism is safe because we know the full definition
- What is a principal type?
- The most general type of an expression
- All other valid types are instances of it
- HM guarantees principal types exist and are computable
- How would you extend HM to support type classes?
- Add constraints/predicates to type schemes
- Instance resolution during type checking
- Dictionary passing in implementation
- What are the tradeoffs of Hindley-Milner vs. dependent types?
- HM: decidable inference, simpler, proven track record
- Dependent types: more expressive, harder inference, fewer mainstream languages
- Explain Algorithm W in your own words.
- Walk the syntax tree
- Generate fresh type variables for unknowns
- Collect constraints via unification
- Generalize at let-bindings
- Return accumulated substitution and final type
11. Self-Assessment Checklist
Phase 1: STLC Type Checker
- Can type-check identity:
lambda x:Int. x : Int -> Int - Can type-check composition:
lambda f:(Int->Int). lambda g:(Int->Int). lambda x:Int. f (g x) - Rejects ill-typed terms with helpful errors
- Parser handles nested lambdas and applications correctly
Phase 2: Unification
- Unifies type variables with concrete types
- Unifies function types recursively
- Occurs check prevents infinite types
- Composition of substitutions is correct
- Property tests pass
Phase 3: Algorithm W
- Infers
forall a. a -> afor\x -> x - Infers
forall a b. a -> b -> afor\x -> \y -> x - Let-polymorphism works:
let id = \x -> x in (id 1, id true) - Rejects
\x -> x xwith occurs check error - Fresh variable generation is sound
Phase 4: Extensions
- Recursive let works for factorial
- Error messages include source locations
- Type annotations are checked for consistency
- REPL provides good user experience
Understanding
- Can explain why types prevent runtime errors
- Can trace through Algorithm W by hand
- Can explain let-polymorphism vs. lambda restriction
- Can relate to Haskell/OCaml type inference
- Can discuss soundness (progress + preservation)
12. Resources
Primary Texts
- âTypes and Programming Languagesâ by Benjamin Pierce
- Chapters 9-11: Simply Typed Lambda Calculus
- Chapter 22: Type Reconstruction (Hindley-Milner)
- The definitive reference
- âHaskell Programming from First Principlesâ by Allen & Moronuki
- Chapter 5: Types
- Chapter 6: Typeclasses
- Good intuition-building
Papers
- âPrincipal Type-Schemes for Functional Programsâ by Damas and Milner (1982)
- The original Algorithm W paper
- Dense but foundational
- âA Tutorial Introduction to Type Inferenceâ by Luca Cardelli
- More accessible than Pierce
- Good pseudocode
- âTyping Haskell in Haskellâ by Mark P. Jones
- Complete type checker implementation
- Includes type classes
Online Resources
- Write You a Haskell
- Chapter on type inference
- Working Haskell code
- Algorithm W Step by Step
- Martin Grabmullerâs tutorial
- Complete implementation
- Stanford CS242: Type Inference
- Lecture notes with examples
- Good for review
Reference Implementations
- Hindley-Milner in 500 lines: Various implementations on GitHub
- GHC source: The real thing (complex but instructive)
- Mini-ML: Various teaching implementations
âThe type system is a theorem prover, and type checking is proof verification. Every program that type-checks comes with a machine-verified proof that certain errors cannot occur.â
After completing this project, youâll never look at a type error the same way again.