Project 14: Lisp Interpreter in Haskell
Project 14: Lisp Interpreter in Haskell
Project Overview
| Attribute | Details |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 3-4 weeks |
| Primary Language | Haskell |
| Alternative Languages | OCaml, Rust, Racket |
| Knowledge Area | Interpreters / Language Implementation / Lisp |
| Tools Required | GHC, Cabal/Stack, Parsec or Megaparsec |
| Primary Reference | âWrite Yourself a Scheme in 48 Hoursâ (Wikibooks) |
| Prerequisites | Project 7 (Parser Combinators), familiarity with monads |
Learning Objectives
By completing this project, you will be able to:
- Parse S-expressions using parser combinators, handling atoms, numbers, strings, lists, and quoted expressions
- Implement lexical scoping with environments represented as nested association lists or maps
- Evaluate Lisp expressions using the classic eval/apply pattern from McCarthyâs original paper
- Handle special forms (if, define, lambda, quote, let, cond) that require non-standard evaluation
- Implement closures that capture their lexical environment at definition time
- Build a REPL with proper error handling using monad transformers
- Add standard library functions for list manipulation, arithmetic, and I/O
Deep Theoretical Foundation
Why Build a Lisp Interpreter?
Building a Lisp interpreter is one of the most educational projects in computer science. There are several reasons:
-
Minimalism: Lispâs syntax is trivialâjust parentheses and atoms. This lets you focus entirely on semantics.
-
Homoiconicity: Lisp code is Lisp data. This blurs the line between interpreter and interpreted, teaching you about the fundamental nature of programming languages.
-
Lambda Calculus Connection: Lisp is lambda calculus with syntax. Everything you learned about Church encodings applies directly.
-
Historical Importance: McCarthyâs 1960 paper âRecursive Functions of Symbolic Expressionsâ created both Lisp and the concept of a meta-circular evaluatorâa language interpreter written in itself.
-
Monadic Excellence: Implementing a Lisp in Haskell requires mastery of monads for state, errors, and I/Oâall in concert.
Paul Graham called Lisp âthe Maxwellâs equations of software.â By implementing it, you understand the fundamental laws governing all programming languages.
The Origin of Lisp
In 1958, John McCarthy was thinking about how to represent and process symbolic expressions. He invented:
- S-expressions: A universal notation for structured data
- eval: A function that interprets S-expressions as code
- The meta-circular evaluator: eval written in Lisp itself
McCarthy described eval in about a page of code. This is possible because Lisp is its own abstract syntax treeâthere is no separate parsing step in the conceptual model.
The famous Lisp paper includes these core functions:
car: First element of a listcdr: Rest of a list (everything but first)cons: Construct a list from head and tailatom: Test if something is atomic (not a list)eq: Test equality of atomscond: Conditional expressionlambda: Anonymous functionquote: Prevent evaluation
From these primitives, everything else can be built.
S-Expressions: The Universal Data Structure
An S-expression (symbolic expression) is either:
- An atom: A symbol like
foo, a number like42, or a string like"hello" - A list: A sequence of S-expressions in parentheses:
(a b c)
That is the entire syntax. There are no operators, no special forms at the syntactic level, no precedence rules. Everything is prefix notation with parentheses.
; Atoms
42 ; number
"hello" ; string
foo ; symbol
#t ; boolean true
#f ; boolean false
; Lists
() ; empty list (nil)
(1 2 3) ; list of numbers
(a b c) ; list of symbols
(+ 1 2) ; looks like function call
(define x 10) ; looks like definition
(if p a b) ; looks like conditional
; Nested lists
((1 2) (3 4)) ; list of lists
(define (square x) (* x x)) ; function definition
The key insight: (+ 1 2) is just a list with three elements: the symbol +, the number 1, and the number 2. It is only when we evaluate it that it becomes a function call.
The Quote Mechanism
Without quote, every list would be evaluated as a function call:
(+ 1 2) ; Evaluates to 3
'(+ 1 2) ; Evaluates to the list (+ 1 2)
(quote (+ 1 2)) ; Same as above
Quote is the fundamental mechanism for treating code as data. This is what enables macros, code generation, and Lispâs legendary metaprogramming.
; Building code as data
(define code '(+ 1 2))
(eval code) ; Returns 3
; Code that writes code
(define (make-adder n)
(list 'lambda '(x) (list '+ 'x n)))
(make-adder 5) ; Returns (lambda (x) (+ x 5))
(eval (make-adder 5)) ; Returns a function that adds 5
Evaluation: The Heart of the Interpreter
Evaluation follows simple rules:
- Numbers, strings, booleans evaluate to themselves (self-evaluating)
- Symbols evaluate by looking them up in the environment
- Lists are evaluated as function applications:
- Evaluate the first element (should be a function)
- Evaluate the remaining elements (arguments)
- Apply the function to the arguments
- Special forms (if, define, lambda, quote, etc.) have custom evaluation rules
Here is the classic eval definition in pseudo-code:
eval(expr, env):
if self-evaluating(expr):
return expr
if symbol(expr):
return lookup(expr, env)
if quoted(expr):
return text-of-quotation(expr)
if assignment(expr):
return eval-assignment(expr, env)
if definition(expr):
return eval-definition(expr, env)
if if(expr):
return eval-if(expr, env)
if lambda(expr):
return make-procedure(params(expr), body(expr), env)
if begin(expr):
return eval-sequence(actions(expr), env)
if application(expr):
return apply(
eval(operator(expr), env),
list-of-values(operands(expr), env)
)
And apply:
apply(proc, args):
if primitive-procedure(proc):
return apply-primitive(proc, args)
if compound-procedure(proc):
return eval-sequence(
body(proc),
extend-environment(params(proc), args, environment(proc))
)
Environments and Scoping
An environment is a mapping from names to values, with a link to an enclosing environment:
Global Environment:
+ -> <primitive: addition>
- -> <primitive: subtraction>
* -> <primitive: multiplication>
Extended Environment (when evaluating a function body):
x -> 5
y -> 10
[parent] -> Global Environment
When looking up a name:
- Search in current environment
- If not found, search in parent environment
- Continue until found or reach top level
- If not found at top level, error: unbound variable
This is lexical scoping: a variable refers to the binding that was visible at the point where the code was written, not where it is executed.
Closures: Functions with Memory
A closure is a function bundled with its lexical environment. When you define a lambda:
(define (make-counter)
(let ((count 0))
(lambda ()
(set! count (+ count 1))
count)))
(define counter1 (make-counter))
(counter1) ; Returns 1
(counter1) ; Returns 2
(counter1) ; Returns 3
(define counter2 (make-counter))
(counter2) ; Returns 1 (different count!)
Each closure captures its own count variable. The function âcloses overâ its environmentâhence the name âclosure.â
In Haskell, we represent this as:
data LispVal
= ...
| Func { params :: [String]
, vararg :: Maybe String
, body :: [LispVal]
, closure :: Env
}
Special Forms vs Functions
Special forms look like function calls but have different evaluation rules:
Functions: Evaluate all arguments first, then call the function.
(+ 1 2) ; Evaluate 1, evaluate 2, then add them
Special forms: Custom evaluation of arguments.
(if test then else)
; Only evaluate test, then evaluate EITHER then OR else, not both
(define name value)
; Do NOT evaluate name---it's a symbol to bind
(lambda (x) body)
; Do NOT evaluate (x) or body---they define a function
(quote expr)
; Do NOT evaluate expr---return it literally
Common special forms:
| Form | Behavior |
|---|---|
quote |
Return argument unevaluated |
if |
Evaluate condition, then one branch |
define |
Bind name to value in environment |
set! |
Mutate existing binding |
lambda |
Create function (closure) |
let |
Local bindings |
let* |
Sequential local bindings |
letrec |
Recursive local bindings |
cond |
Multi-way conditional |
and |
Short-circuit conjunction |
or |
Short-circuit disjunction |
begin |
Sequence expressions |
Macros: Code that Writes Code
Lisp macros operate on code as data:
(define-macro (when test . body)
`(if ,test (begin ,@body) #f))
; Usage:
(when (> x 0)
(print "positive")
(print "and nonzero"))
; Expands to:
(if (> x 0) (begin (print "positive") (print "and nonzero")) #f)
The quasiquote (`) and unquote (,) and unquote-splicing (,@) allow template-based code generation.
Macros run at âcompile timeââthey transform source code before evaluation. This is Lispâs killer feature: you can extend the language itself.
Tail Call Optimization
Functional languages use recursion instead of loops. Without optimization, this would blow the stack:
(define (factorial n)
(if (= n 0)
1
(* n (factorial (- n 1)))))
(factorial 1000000) ; Stack overflow!
But with tail call optimization, recursive calls in tail position reuse the current stack frame:
(define (factorial-tail n acc)
(if (= n 0)
acc
(factorial-tail (- n 1) (* n acc))))
(factorial-tail 1000000 1) ; Works!
A call is in tail position if its result is returned directly without further computation. Implementing TCO is crucial for a production-quality interpreter.
Continuations and call/cc
Schemeâs most powerful feature is first-class continuations via call/cc (call-with-current-continuation):
(call/cc
(lambda (k)
(+ 1 (k 42)))) ; Returns 42, not 43
; k is the continuation---"what to do next"
; Calling k immediately returns 42 from the entire call/cc
Continuations enable:
- Non-local exit (like exceptions)
- Coroutines
- Backtracking
- Cooperative multitasking
- Generators
Implementing call/cc requires continuation-passing style (CPS) or explicit continuation data structures.
The Metacircular Evaluator
The most profound aspect of Lisp is the metacircular evaluator: an interpreter for Lisp written in Lisp itself. From SICP:
(define (eval exp env)
(cond ((self-evaluating? exp) exp)
((variable? exp) (lookup-variable-value exp env))
((quoted? exp) (text-of-quotation exp))
((assignment? exp) (eval-assignment exp env))
((definition? exp) (eval-definition exp env))
((if? exp) (eval-if exp env))
((lambda? exp)
(make-procedure (lambda-parameters exp)
(lambda-body exp)
env))
((begin? exp)
(eval-sequence (begin-actions exp) env))
((cond? exp) (eval (cond->if exp) env))
((application? exp)
(apply (eval (operator exp) env)
(list-of-values (operands exp) env)))
(else
(error "Unknown expression type -- EVAL" exp))))
(define (apply procedure arguments)
(cond ((primitive-procedure? procedure)
(apply-primitive-procedure procedure arguments))
((compound-procedure? procedure)
(eval-sequence
(procedure-body procedure)
(extend-environment
(procedure-parameters procedure)
arguments
(procedure-environment procedure))))
(else
(error "Unknown procedure type -- APPLY" procedure))))
This is both a running interpreter and a formal specification of Lispâs semantics.
Lisp vs Scheme vs Other Dialects
Lisp is the family name. Main dialects:
- Common Lisp: The ANSI standard, large, practical, Lisp-2 (separate namespaces for functions and variables)
- Scheme: Minimalist, clean, Lisp-1 (single namespace), emphasizes closures and continuations
- Clojure: Modern, runs on JVM, immutable by default, practical
- Racket: Scheme descendant, excellent for language design
We will implement a subset of Scheme because:
- Cleaner semantics
- Single namespace simplifies implementation
- Write Yourself a Scheme provides guidance
- Better matches Haskellâs functional style
Complete Project Specification
What You Are Building
A Scheme interpreter called hscheme that:
- Parses S-expressions from text to an AST
- Evaluates expressions in an environment
- Supports core special forms (define, lambda, if, quote, let, cond)
- Provides primitive functions (arithmetic, list operations, comparison)
- Implements closures with proper lexical scoping
- Provides a REPL for interactive use
- Handles errors gracefully with meaningful messages
Functional Requirements
; Arithmetic
(+ 1 2 3 4) ; => 10
(- 10 3) ; => 7
(* 2 3 4) ; => 24
(/ 10 2) ; => 5
; Comparison
(= 1 1) ; => #t
(< 1 2) ; => #t
(> 3 2) ; => #t
; List operations
(cons 1 '(2 3)) ; => (1 2 3)
(car '(1 2 3)) ; => 1
(cdr '(1 2 3)) ; => (2 3)
(list 1 2 3) ; => (1 2 3)
(null? '()) ; => #t
(length '(1 2 3)) ; => 3
; Variables
(define x 10)
x ; => 10
; Functions
(define (square x) (* x x))
(square 5) ; => 25
(define (factorial n)
(if (= n 0)
1
(* n (factorial (- n 1)))))
(factorial 10) ; => 3628800
; Lambda
((lambda (x y) (+ x y)) 3 4) ; => 7
; Higher-order functions
(define (map f xs)
(if (null? xs)
'()
(cons (f (car xs)) (map f (cdr xs)))))
(map square '(1 2 3 4 5)) ; => (1 4 9 16 25)
; Closures
(define (make-adder n)
(lambda (x) (+ x n)))
(define add5 (make-adder 5))
(add5 10) ; => 15
; Let bindings
(let ((x 1) (y 2)) (+ x y)) ; => 3
; Conditionals
(cond ((< x 0) 'negative)
((= x 0) 'zero)
(else 'positive))
Core Data Types
-- The Lisp value type
data LispVal
= Atom String -- Symbols
| List [LispVal] -- Lists (including nil)
| DottedList [LispVal] LispVal -- Improper lists
| Number Integer -- Integers
| Float Double -- Floating point
| String String -- Strings
| Bool Bool -- Booleans
| PrimitiveFunc ([LispVal] -> ThrowsError LispVal)
| Func { params :: [String]
, vararg :: Maybe String
, body :: [LispVal]
, closure :: Env
}
| IOFunc ([LispVal] -> IOThrowsError LispVal)
| Port Handle
-- Environment: mutable mapping from names to values
type Env = IORef [(String, IORef LispVal)]
-- Error types
data LispError
= NumArgs Integer [LispVal]
| TypeMismatch String LispVal
| Parser ParseError
| BadSpecialForm String LispVal
| NotFunction String String
| UnboundVar String String
| Default String
-- Error monad stack
type ThrowsError = Either LispError
type IOThrowsError = ExceptT LispError IO
Expected REPL Session
$ hscheme
Haskell Scheme Interpreter v1.0
Type :quit to exit, :help for help
scheme> (+ 1 2 3)
6
scheme> (define (fib n)
(cond ((= n 0) 0)
((= n 1) 1)
(else (+ (fib (- n 1))
(fib (- n 2))))))
fib
scheme> (fib 10)
55
scheme> (map (lambda (x) (* x x)) '(1 2 3 4 5))
(1 4 9 16 25)
scheme> (define (compose f g)
(lambda (x) (f (g x))))
compose
scheme> (define square-then-double
(compose (lambda (x) (* x 2)) (lambda (x) (* x x))))
square-then-double
scheme> (square-then-double 5)
50
scheme> :quit
Goodbye!
Solution Architecture
Module Structure
src/
HScheme/
Parser.hs -- S-expression parsing
Types.hs -- LispVal, LispError types
Eval.hs -- Core evaluation
Env.hs -- Environment operations
Primitives.hs -- Built-in functions
SpecialForms.hs -- if, define, lambda, etc.
REPL.hs -- Read-eval-print loop
Stdlib.hs -- Standard library (in Scheme)
Main.hs -- CLI entry point
Parser Design
Use Megaparsec or Parsec for parsing:
-- Main parser
parseExpr :: Parser LispVal
parseExpr = parseAtom
<|> parseString
<|> try parseFloat
<|> parseNumber
<|> parseQuoted
<|> parseQuasiquote
<|> parseUnquote
<|> parseList
-- Atoms (symbols)
parseAtom :: Parser LispVal
parseAtom = do
first <- letter <|> symbol
rest <- many (letter <|> digit <|> symbol)
let atom = first : rest
return $ case atom of
"#t" -> Bool True
"#f" -> Bool False
_ -> Atom atom
-- Lists
parseList :: Parser LispVal
parseList = do
char '('
x <- sepEndBy parseExpr spaces
char ')'
return $ List x
-- Quoted expressions
parseQuoted :: Parser LispVal
parseQuoted = do
char '\''
x <- parseExpr
return $ List [Atom "quote", x]
Evaluator Design
eval :: Env -> LispVal -> IOThrowsError LispVal
-- Self-evaluating forms
eval env val@(String _) = return val
eval env val@(Number _) = return val
eval env val@(Float _) = return val
eval env val@(Bool _) = return val
-- Symbol lookup
eval env (Atom id) = getVar env id
-- Quote
eval env (List [Atom "quote", val]) = return val
-- If
eval env (List [Atom "if", pred, conseq, alt]) = do
result <- eval env pred
case result of
Bool False -> eval env alt
_ -> eval env conseq
-- Define variable
eval env (List [Atom "define", Atom var, form]) = do
val <- eval env form
defineVar env var val
-- Define function (syntactic sugar)
eval env (List (Atom "define" : List (Atom var : params) : body)) =
makeNormalFunc env params body >>= defineVar env var
-- Lambda
eval env (List (Atom "lambda" : List params : body)) =
makeNormalFunc env params body
-- Let
eval env (List (Atom "let" : List bindings : body)) = do
env' <- bindVars env =<< mapM extractBinding bindings
evalBody env' body
where
extractBinding (List [Atom var, val]) = do
val' <- eval env val
return (var, val')
-- Function application
eval env (List (function : args)) = do
func <- eval env function
argVals <- mapM (eval env) args
apply func argVals
Application Logic
apply :: LispVal -> [LispVal] -> IOThrowsError LispVal
-- Primitive functions
apply (PrimitiveFunc func) args = liftThrows $ func args
-- IO functions
apply (IOFunc func) args = func args
-- User-defined functions
apply (Func params vararg body closure) args =
if num params /= num args && vararg == Nothing
then throwError $ NumArgs (num params) args
else do
env <- liftIO $ bindVars closure $ zip params args
env' <- bindVarArg vararg env
evalBody env' body
where
num = toInteger . length
remainingArgs = drop (length params) args
bindVarArg arg env = case arg of
Just argName -> liftIO $ bindVars env [(argName, List remainingArgs)]
Nothing -> return env
evalBody :: Env -> [LispVal] -> IOThrowsError LispVal
evalBody env = fmap last . mapM (eval env)
Environment Implementation
-- Create empty environment
nullEnv :: IO Env
nullEnv = newIORef []
-- Check if variable is bound
isBound :: Env -> String -> IO Bool
isBound envRef var = do
env <- readIORef envRef
return $ isJust (lookup var env)
-- Get variable value
getVar :: Env -> String -> IOThrowsError LispVal
getVar envRef var = do
env <- liftIO $ readIORef envRef
case lookup var env of
Just ref -> liftIO $ readIORef ref
Nothing -> throwError $ UnboundVar "Unbound variable" var
-- Set variable value (mutation)
setVar :: Env -> String -> LispVal -> IOThrowsError LispVal
setVar envRef var value = do
env <- liftIO $ readIORef envRef
case lookup var env of
Just ref -> liftIO $ writeIORef ref value >> return value
Nothing -> throwError $ UnboundVar "Setting unbound variable" var
-- Define new variable
defineVar :: Env -> String -> LispVal -> IOThrowsError LispVal
defineVar envRef var value = do
alreadyDefined <- liftIO $ isBound envRef var
if alreadyDefined
then setVar envRef var value
else liftIO $ do
valueRef <- newIORef value
env <- readIORef envRef
writeIORef envRef ((var, valueRef) : env)
return value
-- Bind multiple variables (for function calls)
bindVars :: Env -> [(String, LispVal)] -> IO Env
bindVars envRef bindings = do
env <- readIORef envRef
extendedEnv <- fmap (++ env) (mapM addBinding bindings)
newIORef extendedEnv
where
addBinding (var, value) = do
ref <- newIORef value
return (var, ref)
Phased Implementation Guide
Phase 1: Parser (Days 1-4)
Goal: Parse S-expressions into LispVal AST.
Tasks:
- Set up project with Parsec/Megaparsec
- Implement
parseNumberfor integers - Implement
parseAtomfor symbols and booleans - Implement
parseStringfor string literals - Implement
parseListfor proper lists - Implement
parseQuotedfor quote syntax - Add tests for all parsers
Validation:
parse "(+ 1 2)" == List [Atom "+", Number 1, Number 2]
parse "'(a b c)" == List [Atom "quote", List [Atom "a", Atom "b", Atom "c"]]
Hints:
- Handle whitespace between tokens
- Remember to handle negative numbers
- Atoms can contain special characters like
+,-,*,?,!
Phase 2: Basic Evaluator (Days 5-9)
Goal: Evaluate simple expressions with primitives.
Tasks:
- Implement
evalfor self-evaluating forms - Create primitive environment with arithmetic operations
- Implement function application for primitives
- Add comparison operators
- Implement
ifspecial form - Add basic error handling
Validation:
(+ 1 2 3) ; => 6
(if (< 1 2) 'yes 'no) ; => yes
(* (+ 1 2) (- 5 3)) ; => 6
Hints:
- Primitives take
[LispVal]and returnThrowsError LispVal - Use
foldl1for variadic arithmetic - Remember:
(if #f 'yes 'no)should return'no(only#fis false)
Phase 3: Variables and Definitions (Days 10-13)
Goal: Implement environments with variable binding.
Tasks:
- Implement
Envtype with IORef - Implement
getVar,setVar,defineVar - Add
definespecial form - Add
set!for mutation - Implement
letandlet*special forms
Validation:
(define x 10)
x ; => 10
(set! x 20)
x ; => 20
(let ((a 1) (b 2)) (+ a b)) ; => 3
Hints:
- Use
IOThrowsErrorfor environment operations (need IO for refs) letcreates new scope;set!mutates existing bindinglet*bindings can reference previous bindings
Phase 4: Functions and Closures (Days 14-18)
Goal: Implement lambda and closures.
Tasks:
- Add
Funcconstructor to LispVal - Implement
lambdaspecial form capturing environment - Implement
applyfor user-defined functions - Add syntactic sugar for
(define (f x) body) - Implement variadic functions with dot notation
Validation:
(define (square x) (* x x))
(square 5) ; => 25
((lambda (x) (* x x)) 5) ; => 25
(define (make-adder n)
(lambda (x) (+ x n)))
(define add5 (make-adder 5))
(add5 10) ; => 15 (closure works!)
Hints:
- Closure captures
envat lambda definition time - When applying, extend closureâs environment, not current environment
- This is the key to lexical scoping
Phase 5: List Operations and Standard Library (Days 19-22)
Goal: Implement list primitives and standard library.
Tasks:
- Implement
cons,car,cdr,list - Add predicates:
null?,pair?,list?,number?,string? - Implement
eq?,eqv?,equal? - Add
length,append,reverse - Implement
map,filter,fold(in Scheme)
Validation:
(cons 1 '(2 3)) ; => (1 2 3)
(car '(1 2 3)) ; => 1
(cdr '(1 2 3)) ; => (2 3)
(map (lambda (x) (* x 2)) '(1 2 3)) ; => (2 4 6)
Hints:
conscreates a pair;(cons 1 2)is different from(cons 1 '(2))equal?does deep equality;eq?is pointer equality- Standard library functions can be defined in Scheme itself
Phase 6: REPL and Polish (Days 23-28)
Goal: Build interactive REPL with good UX.
Tasks:
- Implement read-eval-print loop
- Add line editing with haskeline
- Implement
:quit,:help,:loadcommands - Add multi-line input support
- Improve error messages
- Load standard library on startup
Validation: Full interactive session with helpful errors and command history.
Hints:
- Use
haskelinefor readline-like functionality - Catch exceptions and continue REPL
- Support loading files with
(load "file.scm")
Testing Strategy
Unit Tests
-- Parser tests
testParseNumber = parse "42" @?= Number 42
testParseNegative = parse "-42" @?= Number (-42)
testParseAtom = parse "foo" @?= Atom "foo"
testParseList = parse "(1 2 3)" @?= List [Number 1, Number 2, Number 3]
testParseQuote = parse "'x" @?= List [Atom "quote", Atom "x"]
testParseNested = parse "((1 2) (3 4))" @?=
List [List [Number 1, Number 2], List [Number 3, Number 4]]
-- Evaluator tests
testArithmetic = evalStr "(+ 1 2 3)" @?= "6"
testNested = evalStr "(* (+ 1 2) (- 5 3))" @?= "6"
testIf = evalStr "(if #t 'yes 'no)" @?= "yes"
testDefine = do
env <- primitiveEnv
evalStr' env "(define x 10)"
evalStr' env "x" @?= "10"
testLambda = evalStr "((lambda (x) (* x x)) 5)" @?= "25"
testClosure = do
env <- primitiveEnv
evalStr' env "(define (make-adder n) (lambda (x) (+ x n)))"
evalStr' env "(define add5 (make-adder 5))"
evalStr' env "(add5 10)" @?= "15"
Property-Based Tests
-- Parse/print round-trip
prop_parseShowRoundtrip :: LispVal -> Property
prop_parseShowRoundtrip val =
parse (show val) === Right val
-- Arithmetic laws
prop_addCommutative :: Integer -> Integer -> Property
prop_addCommutative a b =
evalStr (printf "(+ %d %d)" a b) === evalStr (printf "(+ %d %d)" b a)
-- List operations
prop_consCarCdr :: LispVal -> [LispVal] -> Property
prop_consCarCdr x xs =
let list = List xs
in evalExpr (List [Atom "car", List [Atom "cons", quote x, quote list]]) === x
Integration Tests
Write test files in Scheme:
; test-arithmetic.scm
(define (test name expected actual)
(if (equal? expected actual)
(begin (display "PASS: ") (display name) (newline))
(begin (display "FAIL: ") (display name)
(display " expected: ") (display expected)
(display " got: ") (display actual) (newline))))
(test "addition" 6 (+ 1 2 3))
(test "nested" 12 (* (+ 1 2) (+ 1 3)))
(test "factorial" 120 (factorial 5))
Common Pitfalls and Debugging
Pitfall 1: Evaluating Quoteâs Argument
Symptom: 'x returns x instead of the symbol x.
Cause: Evaluating the argument to quote.
Solution:
-- WRONG
eval env (List [Atom "quote", val]) = eval env val
-- RIGHT
eval env (List [Atom "quote", val]) = return val
Pitfall 2: Wrong Environment for Closures
Symptom: Closures donât capture variables correctly.
Cause: Using the calling environment instead of the definition environment.
Solution:
-- When creating closure, capture CURRENT environment
eval env (List (Atom "lambda" : List params : body)) =
return $ Func (map extractAtom params) Nothing body env
^^^
This is the closure!
-- When applying, extend CLOSURE'S environment
apply (Func params vararg body closure) args = do
env <- bindVars closure (zip params args) -- closure, not caller's env
evalBody env body
Pitfall 3: Let Bindings Order
Symptom: let bindings canât reference each other.
Cause: This is correct for let! Use let* for sequential bindings.
Solution: Implement both:
-- let: all bindings evaluated in OUTER environment
(let ((a 1) (b a)) ...) ; ERROR: a not bound during b's evaluation
-- let*: each binding evaluated with previous bindings visible
(let* ((a 1) (b a)) ...) ; OK: b sees a
Pitfall 4: Tail Recursion Blows Stack
Symptom: (factorial 100000) crashes with stack overflow.
Cause: No tail call optimization.
Solution: For basic interpreter, this is expected. For TCO:
- Use trampolining
- Or transform to CPS
- Or implement explicit stack
Pitfall 5: Variable Arity Functions
Symptom: Canât define functions that take any number of arguments.
Cause: Missing vararg support.
Solution:
-- Support dot notation: (lambda (x y . z) body)
-- z binds to list of remaining arguments
data LispVal = ...
| Func { params :: [String]
, vararg :: Maybe String -- the vararg parameter
, body :: [LispVal]
, closure :: Env
}
-- In apply:
let remainingArgs = drop (length params) args
bindVarArg vararg env = case vararg of
Just name -> bindVars env [(name, List remainingArgs)]
Nothing -> return env
Debugging Tips
- Add tracing: Print expressions as theyâre evaluated
- Test incrementally: Verify parser before evaluator, evaluator before environment
- Use the REPL: Debug interactively with simple examples
- Compare to reference: Use Racket or Guile to verify expected behavior
- Print environments: Show whatâs in scope at each step
Extensions and Challenges
Extension 1: Macros
Implement define-macro for syntactic extension:
(define-macro (when test . body)
`(if ,test (begin ,@body) #f))
(when (> x 0)
(display "positive")
(newline))
Requires:
- Quasiquote (`) and unquote (,) syntax
- Macro expansion phase before evaluation
- Careful handling of hygiene
Extension 2: Continuations
Implement call/cc:
(call/cc
(lambda (return)
(+ 1 (return 42) 3))) ; => 42
Requires:
- CPS transformation or
- Explicit continuation objects
Extension 3: Proper Tail Calls
Implement tail call optimization:
(define (loop n)
(if (= n 0)
'done
(loop (- n 1))))
(loop 1000000) ; Should not overflow
Approaches:
- Trampolining
- Explicit stack with
TailCallreturn value - CPS transformation
Extension 4: Garbage Collection
Track memory and implement garbage collection:
- Mark-and-sweep
- Reference counting (with cycle detection)
- Copying collector
Challenge: Self-Hosting
Write enough of the interpreter in Scheme that it can interpret itself:
; A Scheme interpreter written in Scheme
(define (eval exp env)
(cond ((self-evaluating? exp) exp)
((variable? exp) (lookup-variable-value exp env))
...))
This is the ultimate test of language completeness.
Real-World Connections
Where Lisp Ideas Appear
- JSON: S-expressions for web (data as trees)
- React JSX: Code-as-data philosophy
- Terraform HCL: Configuration as data
- Clojure: Modern Lisp on JVM, immutable by default
- Emacs Lisp: Most powerful editor extensibility
- AutoCAD AutoLISP: CAD scripting
Industry Applications
- Grammarly: Written in Common Lisp
- NASA JPL: Deep space mission planning in Lisp
- ITA Software (Google Flights): Flight search in Common Lisp
- YCombinator: Paul Graham built first web apps in Lisp
- Nubank: Clojure for financial systems
- Walmart Labs: Clojure at scale
What This Project Teaches You
- Parser combinator patterns (useful everywhere)
- Interpreter architecture (applicable to DSLs, configs)
- Environment and scope (fundamental to all languages)
- Closure implementation (used in JavaScript, Python, etc.)
- Monad transformer stacks (real Haskell patterns)
Interview Questions
After completing this project, you should be able to answer:
- What is an S-expression and why is the syntax significant?
- Definition: atoms and lists, recursively
- Homoiconicity: code is data
- Enables macros and metaprogramming
- No parsing ambiguity
- How does lexical scoping work and why does it matter?
- Variables resolved in definition environment, not call environment
- Closures capture their lexical environment
- Enables modular reasoning about code
- Compare to dynamic scoping
- What is a closure and how do you implement one?
- Function + captured environment
- Stored together in Func data structure
- When applied, extend closureâs environment
- Key: use definition-time env, not call-time env
- What are special forms and how do they differ from functions?
- Special forms have custom evaluation rules
- Functions evaluate all arguments first
- Examples: if, define, lambda, quote
- Cannot implement if as a function (would evaluate both branches)
- How would you implement tail call optimization?
- Identify tail position (result returned directly)
- Reuse current stack frame
- Techniques: trampolining, explicit stack, CPS
- Why it matters for functional languages
- Explain the monad transformer stack you used.
ExceptT LispError IOfor errors + IO- Why we need IO (mutable environments)
- How to lift between layers
- Alternative: effect systems
- How would you add macros to your interpreter?
- Macro expansion phase before evaluation
- Quasiquote for template-based expansion
- Hygiene considerations
- Contrast with Lisp macros vs Rust/C macros
Self-Assessment Checklist
Before considering this project complete, verify:
- Parser handles all S-expression forms correctly
- Arithmetic works with multiple operands
- Variables can be defined and used
- Lambda creates proper closures
- Closures capture lexical environment (not dynamic)
letandlet*work correctly (different scoping)- Standard list operations work (car, cdr, cons)
- Higher-order functions work (map, filter via Scheme)
- Recursion works (factorial, fibonacci)
- REPL provides good user experience
- Error messages are helpful
- You can write non-trivial programs in your interpreter
Resources
Essential Reading
| Topic | Resource | Notes |
|---|---|---|
| Implementation guide | âWrite Yourself a Scheme in 48 Hoursâ | Free online, follows this project closely |
| Language semantics | SICP Chapter 4 | Metacircular evaluator |
| Original paper | âRecursive Functions of Symbolic Expressionsâ - McCarthy (1960) | Historical foundation |
| Modern Scheme | R7RS Scheme standard | Reference specification |
| Parser combinators | âParsec documentationâ on Hackage | Parser library reference |
| Error handling | â8 ways to report errors in Haskellâ | Best practices |
Books
- âStructure and Interpretation of Computer Programsâ by Abelson & Sussman - The classic
- âThe Little Schemerâ by Friedman & Felleisen - Teaches recursion beautifully
- âEssentials of Programming Languagesâ by Friedman & Wand - Interpreter design patterns
- âLisp in Small Piecesâ by Christian Queinnec - Deep dive into Lisp implementation
Papers
- âThe Original LISP Paperâ - McCarthy (1960)
- âLambda: The Ultimate Imperativeâ - Steele & Sussman
- âThe Art of the Interpreterâ - Steele & Sussman
- âCONS Should Not Evaluate Its Argumentsâ - on special forms
Online Resources
- Write Yourself a Scheme
- Beautiful Racket - Language-oriented programming
- The Racket Guide - Modern Scheme reference
- Scheme Wiki
Reference Implementations
- Racket (full-featured Scheme)
- Guile (GNU Scheme)
- Chicken Scheme (compiles to C)
- MIT Scheme (classic implementation)
âLisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.â - Eric S. Raymond
By building a Lisp interpreter, you join a lineage stretching back to 1958. Youâll understand why Paul Graham called Lisp âa programming language for people who want to understand programming languages.â