Project 14: Lisp Interpreter in Haskell

Project 14: Lisp Interpreter in Haskell

Project Overview

Attribute Details
Difficulty Advanced
Time Estimate 3-4 weeks
Primary Language Haskell
Alternative Languages OCaml, Rust, Racket
Knowledge Area Interpreters / Language Implementation / Lisp
Tools Required GHC, Cabal/Stack, Parsec or Megaparsec
Primary Reference “Write Yourself a Scheme in 48 Hours” (Wikibooks)
Prerequisites Project 7 (Parser Combinators), familiarity with monads

Learning Objectives

By completing this project, you will be able to:

  1. Parse S-expressions using parser combinators, handling atoms, numbers, strings, lists, and quoted expressions
  2. Implement lexical scoping with environments represented as nested association lists or maps
  3. Evaluate Lisp expressions using the classic eval/apply pattern from McCarthy’s original paper
  4. Handle special forms (if, define, lambda, quote, let, cond) that require non-standard evaluation
  5. Implement closures that capture their lexical environment at definition time
  6. Build a REPL with proper error handling using monad transformers
  7. Add standard library functions for list manipulation, arithmetic, and I/O

Deep Theoretical Foundation

Why Build a Lisp Interpreter?

Building a Lisp interpreter is one of the most educational projects in computer science. There are several reasons:

  1. Minimalism: Lisp’s syntax is trivial—just parentheses and atoms. This lets you focus entirely on semantics.

  2. Homoiconicity: Lisp code is Lisp data. This blurs the line between interpreter and interpreted, teaching you about the fundamental nature of programming languages.

  3. Lambda Calculus Connection: Lisp is lambda calculus with syntax. Everything you learned about Church encodings applies directly.

  4. Historical Importance: McCarthy’s 1960 paper “Recursive Functions of Symbolic Expressions” created both Lisp and the concept of a meta-circular evaluator—a language interpreter written in itself.

  5. Monadic Excellence: Implementing a Lisp in Haskell requires mastery of monads for state, errors, and I/O—all in concert.

Paul Graham called Lisp “the Maxwell’s equations of software.” By implementing it, you understand the fundamental laws governing all programming languages.

The Origin of Lisp

In 1958, John McCarthy was thinking about how to represent and process symbolic expressions. He invented:

  1. S-expressions: A universal notation for structured data
  2. eval: A function that interprets S-expressions as code
  3. The meta-circular evaluator: eval written in Lisp itself

McCarthy described eval in about a page of code. This is possible because Lisp is its own abstract syntax tree—there is no separate parsing step in the conceptual model.

The famous Lisp paper includes these core functions:

  • car: First element of a list
  • cdr: Rest of a list (everything but first)
  • cons: Construct a list from head and tail
  • atom: Test if something is atomic (not a list)
  • eq: Test equality of atoms
  • cond: Conditional expression
  • lambda: Anonymous function
  • quote: Prevent evaluation

From these primitives, everything else can be built.

S-Expressions: The Universal Data Structure

An S-expression (symbolic expression) is either:

  1. An atom: A symbol like foo, a number like 42, or a string like "hello"
  2. A list: A sequence of S-expressions in parentheses: (a b c)

That is the entire syntax. There are no operators, no special forms at the syntactic level, no precedence rules. Everything is prefix notation with parentheses.

; Atoms
42                    ; number
"hello"               ; string
foo                   ; symbol
#t                    ; boolean true
#f                    ; boolean false

; Lists
()                    ; empty list (nil)
(1 2 3)              ; list of numbers
(a b c)              ; list of symbols
(+ 1 2)              ; looks like function call
(define x 10)        ; looks like definition
(if p a b)           ; looks like conditional

; Nested lists
((1 2) (3 4))        ; list of lists
(define (square x) (* x x))  ; function definition

The key insight: (+ 1 2) is just a list with three elements: the symbol +, the number 1, and the number 2. It is only when we evaluate it that it becomes a function call.

The Quote Mechanism

Without quote, every list would be evaluated as a function call:

(+ 1 2)       ; Evaluates to 3
'(+ 1 2)      ; Evaluates to the list (+ 1 2)
(quote (+ 1 2))  ; Same as above

Quote is the fundamental mechanism for treating code as data. This is what enables macros, code generation, and Lisp’s legendary metaprogramming.

; Building code as data
(define code '(+ 1 2))
(eval code)  ; Returns 3

; Code that writes code
(define (make-adder n)
  (list 'lambda '(x) (list '+ 'x n)))
(make-adder 5)  ; Returns (lambda (x) (+ x 5))
(eval (make-adder 5))  ; Returns a function that adds 5

Evaluation: The Heart of the Interpreter

Evaluation follows simple rules:

  1. Numbers, strings, booleans evaluate to themselves (self-evaluating)
  2. Symbols evaluate by looking them up in the environment
  3. Lists are evaluated as function applications:
    • Evaluate the first element (should be a function)
    • Evaluate the remaining elements (arguments)
    • Apply the function to the arguments
  4. Special forms (if, define, lambda, quote, etc.) have custom evaluation rules

Here is the classic eval definition in pseudo-code:

eval(expr, env):
    if self-evaluating(expr):
        return expr
    if symbol(expr):
        return lookup(expr, env)
    if quoted(expr):
        return text-of-quotation(expr)
    if assignment(expr):
        return eval-assignment(expr, env)
    if definition(expr):
        return eval-definition(expr, env)
    if if(expr):
        return eval-if(expr, env)
    if lambda(expr):
        return make-procedure(params(expr), body(expr), env)
    if begin(expr):
        return eval-sequence(actions(expr), env)
    if application(expr):
        return apply(
            eval(operator(expr), env),
            list-of-values(operands(expr), env)
        )

And apply:

apply(proc, args):
    if primitive-procedure(proc):
        return apply-primitive(proc, args)
    if compound-procedure(proc):
        return eval-sequence(
            body(proc),
            extend-environment(params(proc), args, environment(proc))
        )

Environments and Scoping

An environment is a mapping from names to values, with a link to an enclosing environment:

Global Environment:
  +  -> <primitive: addition>
  -  -> <primitive: subtraction>
  *  -> <primitive: multiplication>

  Extended Environment (when evaluating a function body):
    x  -> 5
    y  -> 10
    [parent] -> Global Environment

When looking up a name:

  1. Search in current environment
  2. If not found, search in parent environment
  3. Continue until found or reach top level
  4. If not found at top level, error: unbound variable

This is lexical scoping: a variable refers to the binding that was visible at the point where the code was written, not where it is executed.

Closures: Functions with Memory

A closure is a function bundled with its lexical environment. When you define a lambda:

(define (make-counter)
  (let ((count 0))
    (lambda ()
      (set! count (+ count 1))
      count)))

(define counter1 (make-counter))
(counter1)  ; Returns 1
(counter1)  ; Returns 2
(counter1)  ; Returns 3

(define counter2 (make-counter))
(counter2)  ; Returns 1 (different count!)

Each closure captures its own count variable. The function “closes over” its environment—hence the name “closure.”

In Haskell, we represent this as:

data LispVal
    = ...
    | Func { params  :: [String]
           , vararg  :: Maybe String
           , body    :: [LispVal]
           , closure :: Env
           }

Special Forms vs Functions

Special forms look like function calls but have different evaluation rules:

Functions: Evaluate all arguments first, then call the function.

(+ 1 2)  ; Evaluate 1, evaluate 2, then add them

Special forms: Custom evaluation of arguments.

(if test then else)
; Only evaluate test, then evaluate EITHER then OR else, not both

(define name value)
; Do NOT evaluate name---it's a symbol to bind

(lambda (x) body)
; Do NOT evaluate (x) or body---they define a function

(quote expr)
; Do NOT evaluate expr---return it literally

Common special forms:

Form Behavior
quote Return argument unevaluated
if Evaluate condition, then one branch
define Bind name to value in environment
set! Mutate existing binding
lambda Create function (closure)
let Local bindings
let* Sequential local bindings
letrec Recursive local bindings
cond Multi-way conditional
and Short-circuit conjunction
or Short-circuit disjunction
begin Sequence expressions

Macros: Code that Writes Code

Lisp macros operate on code as data:

(define-macro (when test . body)
  `(if ,test (begin ,@body) #f))

; Usage:
(when (> x 0)
  (print "positive")
  (print "and nonzero"))

; Expands to:
(if (> x 0) (begin (print "positive") (print "and nonzero")) #f)

The quasiquote (`) and unquote (,) and unquote-splicing (,@) allow template-based code generation.

Macros run at “compile time”—they transform source code before evaluation. This is Lisp’s killer feature: you can extend the language itself.

Tail Call Optimization

Functional languages use recursion instead of loops. Without optimization, this would blow the stack:

(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

(factorial 1000000)  ; Stack overflow!

But with tail call optimization, recursive calls in tail position reuse the current stack frame:

(define (factorial-tail n acc)
  (if (= n 0)
      acc
      (factorial-tail (- n 1) (* n acc))))

(factorial-tail 1000000 1)  ; Works!

A call is in tail position if its result is returned directly without further computation. Implementing TCO is crucial for a production-quality interpreter.

Continuations and call/cc

Scheme’s most powerful feature is first-class continuations via call/cc (call-with-current-continuation):

(call/cc
  (lambda (k)
    (+ 1 (k 42))))  ; Returns 42, not 43

; k is the continuation---"what to do next"
; Calling k immediately returns 42 from the entire call/cc

Continuations enable:

  • Non-local exit (like exceptions)
  • Coroutines
  • Backtracking
  • Cooperative multitasking
  • Generators

Implementing call/cc requires continuation-passing style (CPS) or explicit continuation data structures.

The Metacircular Evaluator

The most profound aspect of Lisp is the metacircular evaluator: an interpreter for Lisp written in Lisp itself. From SICP:

(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((variable? exp) (lookup-variable-value exp env))
        ((quoted? exp) (text-of-quotation exp))
        ((assignment? exp) (eval-assignment exp env))
        ((definition? exp) (eval-definition exp env))
        ((if? exp) (eval-if exp env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                        (lambda-body exp)
                        env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((cond? exp) (eval (cond->if exp) env))
        ((application? exp)
         (apply (eval (operator exp) env)
               (list-of-values (operands exp) env)))
        (else
         (error "Unknown expression type -- EVAL" exp))))

(define (apply procedure arguments)
  (cond ((primitive-procedure? procedure)
         (apply-primitive-procedure procedure arguments))
        ((compound-procedure? procedure)
         (eval-sequence
          (procedure-body procedure)
          (extend-environment
           (procedure-parameters procedure)
           arguments
           (procedure-environment procedure))))
        (else
         (error "Unknown procedure type -- APPLY" procedure))))

This is both a running interpreter and a formal specification of Lisp’s semantics.

Lisp vs Scheme vs Other Dialects

Lisp is the family name. Main dialects:

  • Common Lisp: The ANSI standard, large, practical, Lisp-2 (separate namespaces for functions and variables)
  • Scheme: Minimalist, clean, Lisp-1 (single namespace), emphasizes closures and continuations
  • Clojure: Modern, runs on JVM, immutable by default, practical
  • Racket: Scheme descendant, excellent for language design

We will implement a subset of Scheme because:

  1. Cleaner semantics
  2. Single namespace simplifies implementation
  3. Write Yourself a Scheme provides guidance
  4. Better matches Haskell’s functional style

Complete Project Specification

What You Are Building

A Scheme interpreter called hscheme that:

  1. Parses S-expressions from text to an AST
  2. Evaluates expressions in an environment
  3. Supports core special forms (define, lambda, if, quote, let, cond)
  4. Provides primitive functions (arithmetic, list operations, comparison)
  5. Implements closures with proper lexical scoping
  6. Provides a REPL for interactive use
  7. Handles errors gracefully with meaningful messages

Functional Requirements

; Arithmetic
(+ 1 2 3 4)           ; => 10
(- 10 3)              ; => 7
(* 2 3 4)             ; => 24
(/ 10 2)              ; => 5

; Comparison
(= 1 1)               ; => #t
(< 1 2)               ; => #t
(> 3 2)               ; => #t

; List operations
(cons 1 '(2 3))       ; => (1 2 3)
(car '(1 2 3))        ; => 1
(cdr '(1 2 3))        ; => (2 3)
(list 1 2 3)          ; => (1 2 3)
(null? '())           ; => #t
(length '(1 2 3))     ; => 3

; Variables
(define x 10)
x                     ; => 10

; Functions
(define (square x) (* x x))
(square 5)            ; => 25

(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))
(factorial 10)        ; => 3628800

; Lambda
((lambda (x y) (+ x y)) 3 4)  ; => 7

; Higher-order functions
(define (map f xs)
  (if (null? xs)
      '()
      (cons (f (car xs)) (map f (cdr xs)))))
(map square '(1 2 3 4 5))  ; => (1 4 9 16 25)

; Closures
(define (make-adder n)
  (lambda (x) (+ x n)))
(define add5 (make-adder 5))
(add5 10)             ; => 15

; Let bindings
(let ((x 1) (y 2)) (+ x y))  ; => 3

; Conditionals
(cond ((< x 0) 'negative)
      ((= x 0) 'zero)
      (else 'positive))

Core Data Types

-- The Lisp value type
data LispVal
    = Atom String                 -- Symbols
    | List [LispVal]              -- Lists (including nil)
    | DottedList [LispVal] LispVal -- Improper lists
    | Number Integer              -- Integers
    | Float Double                -- Floating point
    | String String               -- Strings
    | Bool Bool                   -- Booleans
    | PrimitiveFunc ([LispVal] -> ThrowsError LispVal)
    | Func { params  :: [String]
           , vararg  :: Maybe String
           , body    :: [LispVal]
           , closure :: Env
           }
    | IOFunc ([LispVal] -> IOThrowsError LispVal)
    | Port Handle

-- Environment: mutable mapping from names to values
type Env = IORef [(String, IORef LispVal)]

-- Error types
data LispError
    = NumArgs Integer [LispVal]
    | TypeMismatch String LispVal
    | Parser ParseError
    | BadSpecialForm String LispVal
    | NotFunction String String
    | UnboundVar String String
    | Default String

-- Error monad stack
type ThrowsError = Either LispError
type IOThrowsError = ExceptT LispError IO

Expected REPL Session

$ hscheme
Haskell Scheme Interpreter v1.0
Type :quit to exit, :help for help

scheme> (+ 1 2 3)
6

scheme> (define (fib n)
          (cond ((= n 0) 0)
                ((= n 1) 1)
                (else (+ (fib (- n 1))
                         (fib (- n 2))))))
fib

scheme> (fib 10)
55

scheme> (map (lambda (x) (* x x)) '(1 2 3 4 5))
(1 4 9 16 25)

scheme> (define (compose f g)
          (lambda (x) (f (g x))))
compose

scheme> (define square-then-double
          (compose (lambda (x) (* x 2)) (lambda (x) (* x x))))
square-then-double

scheme> (square-then-double 5)
50

scheme> :quit
Goodbye!

Solution Architecture

Module Structure

src/
  HScheme/
    Parser.hs        -- S-expression parsing
    Types.hs         -- LispVal, LispError types
    Eval.hs          -- Core evaluation
    Env.hs           -- Environment operations
    Primitives.hs    -- Built-in functions
    SpecialForms.hs  -- if, define, lambda, etc.
    REPL.hs          -- Read-eval-print loop
    Stdlib.hs        -- Standard library (in Scheme)
  Main.hs            -- CLI entry point

Parser Design

Use Megaparsec or Parsec for parsing:

-- Main parser
parseExpr :: Parser LispVal
parseExpr = parseAtom
        <|> parseString
        <|> try parseFloat
        <|> parseNumber
        <|> parseQuoted
        <|> parseQuasiquote
        <|> parseUnquote
        <|> parseList

-- Atoms (symbols)
parseAtom :: Parser LispVal
parseAtom = do
    first <- letter <|> symbol
    rest <- many (letter <|> digit <|> symbol)
    let atom = first : rest
    return $ case atom of
        "#t" -> Bool True
        "#f" -> Bool False
        _    -> Atom atom

-- Lists
parseList :: Parser LispVal
parseList = do
    char '('
    x <- sepEndBy parseExpr spaces
    char ')'
    return $ List x

-- Quoted expressions
parseQuoted :: Parser LispVal
parseQuoted = do
    char '\''
    x <- parseExpr
    return $ List [Atom "quote", x]

Evaluator Design

eval :: Env -> LispVal -> IOThrowsError LispVal

-- Self-evaluating forms
eval env val@(String _) = return val
eval env val@(Number _) = return val
eval env val@(Float _) = return val
eval env val@(Bool _) = return val

-- Symbol lookup
eval env (Atom id) = getVar env id

-- Quote
eval env (List [Atom "quote", val]) = return val

-- If
eval env (List [Atom "if", pred, conseq, alt]) = do
    result <- eval env pred
    case result of
        Bool False -> eval env alt
        _          -> eval env conseq

-- Define variable
eval env (List [Atom "define", Atom var, form]) = do
    val <- eval env form
    defineVar env var val

-- Define function (syntactic sugar)
eval env (List (Atom "define" : List (Atom var : params) : body)) =
    makeNormalFunc env params body >>= defineVar env var

-- Lambda
eval env (List (Atom "lambda" : List params : body)) =
    makeNormalFunc env params body

-- Let
eval env (List (Atom "let" : List bindings : body)) = do
    env' <- bindVars env =<< mapM extractBinding bindings
    evalBody env' body
  where
    extractBinding (List [Atom var, val]) = do
        val' <- eval env val
        return (var, val')

-- Function application
eval env (List (function : args)) = do
    func <- eval env function
    argVals <- mapM (eval env) args
    apply func argVals

Application Logic

apply :: LispVal -> [LispVal] -> IOThrowsError LispVal

-- Primitive functions
apply (PrimitiveFunc func) args = liftThrows $ func args

-- IO functions
apply (IOFunc func) args = func args

-- User-defined functions
apply (Func params vararg body closure) args =
    if num params /= num args && vararg == Nothing
        then throwError $ NumArgs (num params) args
        else do
            env <- liftIO $ bindVars closure $ zip params args
            env' <- bindVarArg vararg env
            evalBody env' body
  where
    num = toInteger . length
    remainingArgs = drop (length params) args
    bindVarArg arg env = case arg of
        Just argName -> liftIO $ bindVars env [(argName, List remainingArgs)]
        Nothing -> return env

evalBody :: Env -> [LispVal] -> IOThrowsError LispVal
evalBody env = fmap last . mapM (eval env)

Environment Implementation

-- Create empty environment
nullEnv :: IO Env
nullEnv = newIORef []

-- Check if variable is bound
isBound :: Env -> String -> IO Bool
isBound envRef var = do
    env <- readIORef envRef
    return $ isJust (lookup var env)

-- Get variable value
getVar :: Env -> String -> IOThrowsError LispVal
getVar envRef var = do
    env <- liftIO $ readIORef envRef
    case lookup var env of
        Just ref -> liftIO $ readIORef ref
        Nothing  -> throwError $ UnboundVar "Unbound variable" var

-- Set variable value (mutation)
setVar :: Env -> String -> LispVal -> IOThrowsError LispVal
setVar envRef var value = do
    env <- liftIO $ readIORef envRef
    case lookup var env of
        Just ref -> liftIO $ writeIORef ref value >> return value
        Nothing  -> throwError $ UnboundVar "Setting unbound variable" var

-- Define new variable
defineVar :: Env -> String -> LispVal -> IOThrowsError LispVal
defineVar envRef var value = do
    alreadyDefined <- liftIO $ isBound envRef var
    if alreadyDefined
        then setVar envRef var value
        else liftIO $ do
            valueRef <- newIORef value
            env <- readIORef envRef
            writeIORef envRef ((var, valueRef) : env)
            return value

-- Bind multiple variables (for function calls)
bindVars :: Env -> [(String, LispVal)] -> IO Env
bindVars envRef bindings = do
    env <- readIORef envRef
    extendedEnv <- fmap (++ env) (mapM addBinding bindings)
    newIORef extendedEnv
  where
    addBinding (var, value) = do
        ref <- newIORef value
        return (var, ref)

Phased Implementation Guide

Phase 1: Parser (Days 1-4)

Goal: Parse S-expressions into LispVal AST.

Tasks:

  1. Set up project with Parsec/Megaparsec
  2. Implement parseNumber for integers
  3. Implement parseAtom for symbols and booleans
  4. Implement parseString for string literals
  5. Implement parseList for proper lists
  6. Implement parseQuoted for quote syntax
  7. Add tests for all parsers

Validation:

parse "(+ 1 2)" == List [Atom "+", Number 1, Number 2]
parse "'(a b c)" == List [Atom "quote", List [Atom "a", Atom "b", Atom "c"]]

Hints:

  • Handle whitespace between tokens
  • Remember to handle negative numbers
  • Atoms can contain special characters like +, -, *, ?, !

Phase 2: Basic Evaluator (Days 5-9)

Goal: Evaluate simple expressions with primitives.

Tasks:

  1. Implement eval for self-evaluating forms
  2. Create primitive environment with arithmetic operations
  3. Implement function application for primitives
  4. Add comparison operators
  5. Implement if special form
  6. Add basic error handling

Validation:

(+ 1 2 3)     ; => 6
(if (< 1 2) 'yes 'no)  ; => yes
(* (+ 1 2) (- 5 3))    ; => 6

Hints:

  • Primitives take [LispVal] and return ThrowsError LispVal
  • Use foldl1 for variadic arithmetic
  • Remember: (if #f 'yes 'no) should return 'no (only #f is false)

Phase 3: Variables and Definitions (Days 10-13)

Goal: Implement environments with variable binding.

Tasks:

  1. Implement Env type with IORef
  2. Implement getVar, setVar, defineVar
  3. Add define special form
  4. Add set! for mutation
  5. Implement let and let* special forms

Validation:

(define x 10)
x                      ; => 10
(set! x 20)
x                      ; => 20
(let ((a 1) (b 2)) (+ a b))  ; => 3

Hints:

  • Use IOThrowsError for environment operations (need IO for refs)
  • let creates new scope; set! mutates existing binding
  • let* bindings can reference previous bindings

Phase 4: Functions and Closures (Days 14-18)

Goal: Implement lambda and closures.

Tasks:

  1. Add Func constructor to LispVal
  2. Implement lambda special form capturing environment
  3. Implement apply for user-defined functions
  4. Add syntactic sugar for (define (f x) body)
  5. Implement variadic functions with dot notation

Validation:

(define (square x) (* x x))
(square 5)             ; => 25

((lambda (x) (* x x)) 5)  ; => 25

(define (make-adder n)
  (lambda (x) (+ x n)))
(define add5 (make-adder 5))
(add5 10)              ; => 15 (closure works!)

Hints:

  • Closure captures env at lambda definition time
  • When applying, extend closure’s environment, not current environment
  • This is the key to lexical scoping

Phase 5: List Operations and Standard Library (Days 19-22)

Goal: Implement list primitives and standard library.

Tasks:

  1. Implement cons, car, cdr, list
  2. Add predicates: null?, pair?, list?, number?, string?
  3. Implement eq?, eqv?, equal?
  4. Add length, append, reverse
  5. Implement map, filter, fold (in Scheme)

Validation:

(cons 1 '(2 3))        ; => (1 2 3)
(car '(1 2 3))         ; => 1
(cdr '(1 2 3))         ; => (2 3)
(map (lambda (x) (* x 2)) '(1 2 3))  ; => (2 4 6)

Hints:

  • cons creates a pair; (cons 1 2) is different from (cons 1 '(2))
  • equal? does deep equality; eq? is pointer equality
  • Standard library functions can be defined in Scheme itself

Phase 6: REPL and Polish (Days 23-28)

Goal: Build interactive REPL with good UX.

Tasks:

  1. Implement read-eval-print loop
  2. Add line editing with haskeline
  3. Implement :quit, :help, :load commands
  4. Add multi-line input support
  5. Improve error messages
  6. Load standard library on startup

Validation: Full interactive session with helpful errors and command history.

Hints:

  • Use haskeline for readline-like functionality
  • Catch exceptions and continue REPL
  • Support loading files with (load "file.scm")

Testing Strategy

Unit Tests

-- Parser tests
testParseNumber = parse "42" @?= Number 42
testParseNegative = parse "-42" @?= Number (-42)
testParseAtom = parse "foo" @?= Atom "foo"
testParseList = parse "(1 2 3)" @?= List [Number 1, Number 2, Number 3]
testParseQuote = parse "'x" @?= List [Atom "quote", Atom "x"]
testParseNested = parse "((1 2) (3 4))" @?=
    List [List [Number 1, Number 2], List [Number 3, Number 4]]

-- Evaluator tests
testArithmetic = evalStr "(+ 1 2 3)" @?= "6"
testNested = evalStr "(* (+ 1 2) (- 5 3))" @?= "6"
testIf = evalStr "(if #t 'yes 'no)" @?= "yes"
testDefine = do
    env <- primitiveEnv
    evalStr' env "(define x 10)"
    evalStr' env "x" @?= "10"
testLambda = evalStr "((lambda (x) (* x x)) 5)" @?= "25"
testClosure = do
    env <- primitiveEnv
    evalStr' env "(define (make-adder n) (lambda (x) (+ x n)))"
    evalStr' env "(define add5 (make-adder 5))"
    evalStr' env "(add5 10)" @?= "15"

Property-Based Tests

-- Parse/print round-trip
prop_parseShowRoundtrip :: LispVal -> Property
prop_parseShowRoundtrip val =
    parse (show val) === Right val

-- Arithmetic laws
prop_addCommutative :: Integer -> Integer -> Property
prop_addCommutative a b =
    evalStr (printf "(+ %d %d)" a b) === evalStr (printf "(+ %d %d)" b a)

-- List operations
prop_consCarCdr :: LispVal -> [LispVal] -> Property
prop_consCarCdr x xs =
    let list = List xs
    in evalExpr (List [Atom "car", List [Atom "cons", quote x, quote list]]) === x

Integration Tests

Write test files in Scheme:

; test-arithmetic.scm
(define (test name expected actual)
  (if (equal? expected actual)
      (begin (display "PASS: ") (display name) (newline))
      (begin (display "FAIL: ") (display name)
             (display " expected: ") (display expected)
             (display " got: ") (display actual) (newline))))

(test "addition" 6 (+ 1 2 3))
(test "nested" 12 (* (+ 1 2) (+ 1 3)))
(test "factorial" 120 (factorial 5))

Common Pitfalls and Debugging

Pitfall 1: Evaluating Quote’s Argument

Symptom: 'x returns x instead of the symbol x.

Cause: Evaluating the argument to quote.

Solution:

-- WRONG
eval env (List [Atom "quote", val]) = eval env val

-- RIGHT
eval env (List [Atom "quote", val]) = return val

Pitfall 2: Wrong Environment for Closures

Symptom: Closures don’t capture variables correctly.

Cause: Using the calling environment instead of the definition environment.

Solution:

-- When creating closure, capture CURRENT environment
eval env (List (Atom "lambda" : List params : body)) =
    return $ Func (map extractAtom params) Nothing body env
                                                       ^^^
                                                       This is the closure!

-- When applying, extend CLOSURE'S environment
apply (Func params vararg body closure) args = do
    env <- bindVars closure (zip params args)  -- closure, not caller's env
    evalBody env body

Pitfall 3: Let Bindings Order

Symptom: let bindings can’t reference each other.

Cause: This is correct for let! Use let* for sequential bindings.

Solution: Implement both:

-- let: all bindings evaluated in OUTER environment
(let ((a 1) (b a)) ...)  ; ERROR: a not bound during b's evaluation

-- let*: each binding evaluated with previous bindings visible
(let* ((a 1) (b a)) ...) ; OK: b sees a

Pitfall 4: Tail Recursion Blows Stack

Symptom: (factorial 100000) crashes with stack overflow.

Cause: No tail call optimization.

Solution: For basic interpreter, this is expected. For TCO:

  • Use trampolining
  • Or transform to CPS
  • Or implement explicit stack

Pitfall 5: Variable Arity Functions

Symptom: Can’t define functions that take any number of arguments.

Cause: Missing vararg support.

Solution:

-- Support dot notation: (lambda (x y . z) body)
-- z binds to list of remaining arguments

data LispVal = ...
    | Func { params :: [String]
           , vararg :: Maybe String  -- the vararg parameter
           , body :: [LispVal]
           , closure :: Env
           }

-- In apply:
let remainingArgs = drop (length params) args
bindVarArg vararg env = case vararg of
    Just name -> bindVars env [(name, List remainingArgs)]
    Nothing -> return env

Debugging Tips

  1. Add tracing: Print expressions as they’re evaluated
  2. Test incrementally: Verify parser before evaluator, evaluator before environment
  3. Use the REPL: Debug interactively with simple examples
  4. Compare to reference: Use Racket or Guile to verify expected behavior
  5. Print environments: Show what’s in scope at each step

Extensions and Challenges

Extension 1: Macros

Implement define-macro for syntactic extension:

(define-macro (when test . body)
  `(if ,test (begin ,@body) #f))

(when (> x 0)
  (display "positive")
  (newline))

Requires:

  • Quasiquote (`) and unquote (,) syntax
  • Macro expansion phase before evaluation
  • Careful handling of hygiene

Extension 2: Continuations

Implement call/cc:

(call/cc
  (lambda (return)
    (+ 1 (return 42) 3)))  ; => 42

Requires:

  • CPS transformation or
  • Explicit continuation objects

Extension 3: Proper Tail Calls

Implement tail call optimization:

(define (loop n)
  (if (= n 0)
      'done
      (loop (- n 1))))

(loop 1000000)  ; Should not overflow

Approaches:

  • Trampolining
  • Explicit stack with TailCall return value
  • CPS transformation

Extension 4: Garbage Collection

Track memory and implement garbage collection:

  • Mark-and-sweep
  • Reference counting (with cycle detection)
  • Copying collector

Challenge: Self-Hosting

Write enough of the interpreter in Scheme that it can interpret itself:

; A Scheme interpreter written in Scheme
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((variable? exp) (lookup-variable-value exp env))
        ...))

This is the ultimate test of language completeness.


Real-World Connections

Where Lisp Ideas Appear

  1. JSON: S-expressions for web (data as trees)
  2. React JSX: Code-as-data philosophy
  3. Terraform HCL: Configuration as data
  4. Clojure: Modern Lisp on JVM, immutable by default
  5. Emacs Lisp: Most powerful editor extensibility
  6. AutoCAD AutoLISP: CAD scripting

Industry Applications

  • Grammarly: Written in Common Lisp
  • NASA JPL: Deep space mission planning in Lisp
  • ITA Software (Google Flights): Flight search in Common Lisp
  • YCombinator: Paul Graham built first web apps in Lisp
  • Nubank: Clojure for financial systems
  • Walmart Labs: Clojure at scale

What This Project Teaches You

  • Parser combinator patterns (useful everywhere)
  • Interpreter architecture (applicable to DSLs, configs)
  • Environment and scope (fundamental to all languages)
  • Closure implementation (used in JavaScript, Python, etc.)
  • Monad transformer stacks (real Haskell patterns)

Interview Questions

After completing this project, you should be able to answer:

  1. What is an S-expression and why is the syntax significant?
    • Definition: atoms and lists, recursively
    • Homoiconicity: code is data
    • Enables macros and metaprogramming
    • No parsing ambiguity
  2. How does lexical scoping work and why does it matter?
    • Variables resolved in definition environment, not call environment
    • Closures capture their lexical environment
    • Enables modular reasoning about code
    • Compare to dynamic scoping
  3. What is a closure and how do you implement one?
    • Function + captured environment
    • Stored together in Func data structure
    • When applied, extend closure’s environment
    • Key: use definition-time env, not call-time env
  4. What are special forms and how do they differ from functions?
    • Special forms have custom evaluation rules
    • Functions evaluate all arguments first
    • Examples: if, define, lambda, quote
    • Cannot implement if as a function (would evaluate both branches)
  5. How would you implement tail call optimization?
    • Identify tail position (result returned directly)
    • Reuse current stack frame
    • Techniques: trampolining, explicit stack, CPS
    • Why it matters for functional languages
  6. Explain the monad transformer stack you used.
    • ExceptT LispError IO for errors + IO
    • Why we need IO (mutable environments)
    • How to lift between layers
    • Alternative: effect systems
  7. How would you add macros to your interpreter?
    • Macro expansion phase before evaluation
    • Quasiquote for template-based expansion
    • Hygiene considerations
    • Contrast with Lisp macros vs Rust/C macros

Self-Assessment Checklist

Before considering this project complete, verify:

  • Parser handles all S-expression forms correctly
  • Arithmetic works with multiple operands
  • Variables can be defined and used
  • Lambda creates proper closures
  • Closures capture lexical environment (not dynamic)
  • let and let* work correctly (different scoping)
  • Standard list operations work (car, cdr, cons)
  • Higher-order functions work (map, filter via Scheme)
  • Recursion works (factorial, fibonacci)
  • REPL provides good user experience
  • Error messages are helpful
  • You can write non-trivial programs in your interpreter

Resources

Essential Reading

Topic Resource Notes
Implementation guide “Write Yourself a Scheme in 48 Hours” Free online, follows this project closely
Language semantics SICP Chapter 4 Metacircular evaluator
Original paper “Recursive Functions of Symbolic Expressions” - McCarthy (1960) Historical foundation
Modern Scheme R7RS Scheme standard Reference specification
Parser combinators “Parsec documentation” on Hackage Parser library reference
Error handling “8 ways to report errors in Haskell” Best practices

Books

  • “Structure and Interpretation of Computer Programs” by Abelson & Sussman - The classic
  • “The Little Schemer” by Friedman & Felleisen - Teaches recursion beautifully
  • “Essentials of Programming Languages” by Friedman & Wand - Interpreter design patterns
  • “Lisp in Small Pieces” by Christian Queinnec - Deep dive into Lisp implementation

Papers

  • “The Original LISP Paper” - McCarthy (1960)
  • “Lambda: The Ultimate Imperative” - Steele & Sussman
  • “The Art of the Interpreter” - Steele & Sussman
  • “CONS Should Not Evaluate Its Arguments” - on special forms

Online Resources

Reference Implementations

  • Racket (full-featured Scheme)
  • Guile (GNU Scheme)
  • Chicken Scheme (compiles to C)
  • MIT Scheme (classic implementation)

“Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.” - Eric S. Raymond

By building a Lisp interpreter, you join a lineage stretching back to 1958. You’ll understand why Paul Graham called Lisp “a programming language for people who want to understand programming languages.”