Project 5: Algebraic Data Types and Pattern Matching

Project 5: Algebraic Data Types and Pattern Matching

Implement core Haskell data types from scratch: Maybe, Either, List, Tree. Understand the algebra of types and the power of pattern matching.

Quick Reference

Attribute Value
Difficulty Intermediate
Time Estimate 1-2 weeks
Language Haskell (Alternatives: OCaml, Rust, Scala)
Prerequisites Basic Haskell syntax, recursion, functions as values
Key Topics Sum Types, Product Types, Recursive Types, Pattern Matching, Catamorphisms
Main Resources “Haskell Programming from First Principles” Chapter 11, “Programming in Haskell” Chapter 8

1. Learning Objectives

By completing this project, you will:

  1. Understand types as algebra: Explain why they’re called “algebraic” data types and calculate the cardinality of any type
  2. Model data with sum and product types: Choose the right type structure to accurately represent real-world domains
  3. Implement recursive data structures: Build lists, trees, and other recursive types with correct termination
  4. Master pattern matching: Destructure values exhaustively and understand how the compiler checks coverage
  5. Write generic functions over structures: Implement folds (catamorphisms) that capture the essence of structural recursion
  6. Connect theory to practice: Relate ADTs to union types in TypeScript, enums in Rust, and sealed classes in Kotlin
  7. Design types that make illegal states unrepresentable: Use the type system to enforce business invariants

2. Conceptual Foundation

2.1 What Are Algebraic Data Types?

The name “algebraic” comes from a precise mathematical correspondence: types form an algebra where:

  • Sum types correspond to addition (choice)
  • Product types correspond to multiplication (combination)
  • Function types correspond to exponentiation
  • Recursive types correspond to fixed points

This isn’t just a metaphor. We can literally do arithmetic with types and calculate how many values each type has.

2.2 The Algebra of Types

Cardinality

The cardinality of a type is the number of distinct values it can hold:

Type Cardinality
Void (no constructors) 0
() (unit) 1
Bool 2
Int8 256
Char ~1,100,000 (Unicode)

Void: The Zero Type

data Void  -- No constructors!

There are no values of type Void. It’s the empty type, the type with zero inhabitants. You can never construct a Void, but you can define functions from it:

absurd :: Void -> a
absurd v = case v of {}  -- Empty case, vacuously true

This corresponds to the algebraic fact that 0 * a = 0 and a^0 = 1.

Unit: The One Type

data () = ()  -- One constructor, no fields

There’s exactly one value of type (), also written (). It carries no information.

unit :: ()
unit = ()

In algebra: 1 * a = a (unit is the multiplicative identity).

Sum Types (Addition)

A sum type (also called a tagged union or coproduct) represents a choice between alternatives:

data Bool = False | True
Bool has two constructors, so Bool = 2.
data Either a b = Left a | Right b

Either a b is either a value of type a (tagged with Left) or a value of type b (tagged with Right):

Either a b = a + b

This is addition! If a has 3 values and b has 5 values, then Either a b has 3 + 5 = 8 values.

Why it’s called a sum: You’re adding the possibilities. A value is one thing or another.

data Maybe a = Nothing | Just a
Maybe a = 1 + a (one for Nothing, plus all the Just values)

This is like adding zero: Maybe a is Either () a.

Product Types (Multiplication)

A product type (also called a tuple or record) combines multiple values together:

data (a, b) = (a, b)  -- Tuple
data Person = Person String Int  -- Record-ish

For a pair:

(a, b) = a * b

This is multiplication! If a has 3 values and b has 5 values, then (a, b) has 3 * 5 = 15 values.

Why it’s called a product: You’re multiplying possibilities. A value contains one thing and another.

Function Types (Exponentiation)

Functions correspond to exponentiation:

a -> b = b ^ a
Why? A function from a to b must map each of the a inputs to one of the b outputs. The number of such mappings is b * b * … * b ( a times) = b ^ a .
Example: Bool -> Bool has Bool ^ Bool = 2^2 = 4 possible functions:
id    :: Bool -> Bool  -- id x = x
not   :: Bool -> Bool  -- not x = not x
const True  :: Bool -> Bool  -- \_ -> True
const False :: Bool -> Bool  -- \_ -> False

2.3 Sum Types in Depth

The Essence of Sum Types

A sum type represents a closed set of alternatives. The compiler knows all possible cases and can verify you handle each one.

data Color = Red | Green | Blue

colorToRGB :: Color -> (Int, Int, Int)
colorToRGB Red   = (255, 0, 0)
colorToRGB Green = (0, 255, 0)
colorToRGB Blue  = (0, 0, 255)

If you forget a case, the compiler warns you (with -Wall):

Warning: Pattern match(es) are non-exhaustive
    In an equation for 'colorToRGB': Patterns not matched: Blue

Sum Types with Data

Constructors can carry data:

data Shape
  = Circle Double              -- radius
  | Rectangle Double Double    -- width, height
  | Triangle Double Double Double  -- three sides

Each constructor is like a labeled box that can hold different amounts of stuff:

area :: Shape -> Double
area (Circle r)        = pi * r * r
area (Rectangle w h)   = w * h
area (Triangle a b c)  = -- Heron's formula
  let s = (a + b + c) / 2
  in sqrt (s * (s-a) * (s-b) * (s-c))

Maybe: The Canonical Optional Type

data Maybe a = Nothing | Just a

Maybe represents the possibility of absence. Instead of null pointers:

-- Old way: returns null if not found (runtime error waiting to happen)
-- find :: String -> [Person] -> Person

-- Better way: explicit about possible absence
find :: String -> [Person] -> Maybe Person
find name []     = Nothing
find name (p:ps)
  | personName p == name = Just p
  | otherwise            = find name ps

The type system forces you to handle both cases:

greet :: Maybe Person -> String
greet Nothing  = "Hello, stranger!"
greet (Just p) = "Hello, " ++ personName p ++ "!"

Either: The Canonical Error Type

data Either a b = Left a | Right b

By convention, Left holds errors and Right holds success values (mnemonic: “right” = “correct”):

data ParseError = InvalidNumber | EmptyInput | TooLong

parseInt :: String -> Either ParseError Int
parseInt "" = Left EmptyInput
parseInt s
  | length s > 10 = Left TooLong
  | all isDigit s = Right (read s)
  | otherwise     = Left InvalidNumber

Unlike exceptions, the type signature tells you parsing can fail. You must handle both cases to use the result.

2.4 Product Types in Depth

Tuples

Haskell has built-in tuple syntax:

point :: (Double, Double)
point = (3.0, 4.0)

getName :: (String, Int) -> String
getName (name, _) = name

But tuples are just products:

-- These are equivalent:
data Pair a b = Pair a b
type Pair' a b = (a, b)

Records with Named Fields

For more than 2-3 fields, use named fields:

data Person = Person
  { personName :: String
  , personAge  :: Int
  , personEmail :: String
  }

-- Construction
alice :: Person
alice = Person { personName = "Alice", personAge = 30, personEmail = "alice@example.com" }

-- Access
greeting = "Hello, " ++ personName alice

-- Update
olderAlice = alice { personAge = 31 }

Named fields automatically create accessor functions:

personName :: Person -> String
personAge :: Person -> Int
personEmail :: Person -> String

Newtypes: Single-Constructor Single-Field

When you have exactly one constructor with exactly one field, use newtype:

newtype UserId = UserId Int
newtype Celsius = Celsius Double
newtype Meters = Meters Double

newtype has zero runtime overhead (it’s erased after type checking) but provides type safety:

addDistances :: Meters -> Meters -> Meters
addDistances (Meters a) (Meters b) = Meters (a + b)

-- This won't compile:
-- addDistances (Celsius 20) (Meters 5)

2.5 Recursive Types

The Key Insight

A recursive type is defined in terms of itself. This mirrors mathematical induction:

  • Base case: A constructor that doesn’t refer to the type
  • Recursive case: A constructor that includes the type being defined

Lists

data List a = Nil | Cons a (List a)
  • Nil is the base case (empty list)
  • Cons x xs is the recursive case (element + rest of list)

A list is either:

  • Empty (Nil), or
  • An element followed by another list (Cons x xs)
exampleList :: List Int
exampleList = Cons 1 (Cons 2 (Cons 3 Nil))
-- Equivalent to [1, 2, 3]

Trees

data Tree a = Leaf | Node a (Tree a) (Tree a)

A tree is either:

  • Empty (Leaf), or
  • A value with left and right subtrees (Node x left right)
exampleTree :: Tree Int
exampleTree = Node 1
                (Node 2 Leaf Leaf)
                (Node 3 (Node 4 Leaf Leaf) Leaf)
--     1
--    / \
--   2   3
--      /
--     4

Natural Numbers (Peano)

Even numbers can be defined recursively:

data Nat = Zero | Succ Nat
  • Zero is 0
  • Succ n is n + 1
one   = Succ Zero
two   = Succ (Succ Zero)
three = Succ (Succ (Succ Zero))

This is impractical for computation but illuminating for understanding recursion.

2.6 Pattern Matching

Pattern matching is the primary way to work with algebraic data types. It combines:

  1. Case analysis: Which constructor was used?
  2. Destructuring: Extract the components
  3. Binding: Give names to the parts

Basic Patterns

isZero :: Nat -> Bool
isZero Zero    = True
isZero (Succ _) = False

head :: List a -> Maybe a
head Nil        = Nothing
head (Cons x _) = Just x

depth :: Tree a -> Int
depth Leaf = 0
depth (Node _ left right) = 1 + max (depth left) (depth right)

Pattern Syntax

Pattern Matches
x Anything, binds to x
_ Anything, doesn’t bind
Constructor p1 p2 ... Values built with that constructor
(p1, p2) Tuples
[p1, p2, p3] Lists of exactly that length
(p1:p2) Non-empty lists
n@pattern Binds whole value to n, also matches pattern
!p Strict pattern (forces evaluation)

Guards

Guards add conditions to patterns:

classify :: Int -> String
classify n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | n < 10    = "small positive"
  | otherwise = "large positive"

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

As-Patterns

The @ pattern binds a name while also matching structure:

-- Without as-pattern (rebuilds the list):
duplicate :: List a -> List a
duplicate Nil         = Nil
duplicate (Cons x xs) = Cons x (Cons x (duplicate xs))

-- With as-pattern (reuses the cons cell):
firstTwo :: List a -> Maybe (a, a)
firstTwo (Cons x rest@(Cons y _)) = Just (x, y)
firstTwo _ = Nothing

2.7 Structural Recursion and Folds

The Pattern

Nearly every function on a recursive type follows the same structure:

-- For List:
f Nil         = <base case>
f (Cons x xs) = <combine x with (f xs)>

-- For Tree:
g Leaf              = <base case>
g (Node x left right) = <combine x with (g left) and (g right)>

This pattern is called structural recursion or primitive recursion. It always terminates because each recursive call works on a smaller structure.

Foldr: The Universal List Function

foldr :: (a -> b -> b) -> b -> List a -> b
foldr f z Nil         = z
foldr f z (Cons x xs) = f x (foldr f z xs)

foldr captures the essence of structural recursion on lists. Any function that follows the pattern can be written as a fold:

sum :: List Int -> Int
sum = foldr (+) 0

product :: List Int -> Int
product = foldr (*) 1

length :: List a -> Int
length = foldr (\_ n -> n + 1) 0

map :: (a -> b) -> List a -> List b
map f = foldr (\x xs -> Cons (f x) xs) Nil

filter :: (a -> Bool) -> List a -> List a
filter p = foldr (\x xs -> if p x then Cons x xs else xs) Nil

append :: List a -> List a -> List a
append xs ys = foldr Cons ys xs

The universality of foldr is profound: any function that recursively processes a list can be expressed as a fold.

Catamorphisms

The general concept is called a catamorphism (from Greek “cata-“ meaning “down” + “morphism” meaning “shape/form”). For any recursive type F, there’s a corresponding fold:

-- List catamorphism
foldList :: (a -> b -> b) -> b -> List a -> b

-- Tree catamorphism
foldTree :: b -> (a -> b -> b -> b) -> Tree a -> b
foldTree leaf node Leaf = leaf
foldTree leaf node (Node x l r) = node x (foldTree leaf node l) (foldTree leaf node r)

-- Nat catamorphism
foldNat :: a -> (a -> a) -> Nat -> a
foldNat z s Zero     = z
foldNat z s (Succ n) = s (foldNat z s n)

2.8 Making Illegal States Unrepresentable

The deepest lesson of ADTs is designing types so that invalid states cannot be constructed.

Bad: Representing State with Booleans

data User = User
  { name :: String
  , isLoggedIn :: Bool
  , sessionToken :: Maybe String  -- Only valid if isLoggedIn
  , loginTime :: Maybe UTCTime    -- Only valid if isLoggedIn
  }

-- Problem: What does this mean?
badUser = User "Bob" False (Just "token123") (Just someTime)
-- Not logged in, but has a session token?

Good: Encode States in Types

data User
  = Anonymous
  | LoggedIn { name :: String, sessionToken :: String, loginTime :: UTCTime }

-- Now it's impossible to have a session token without being logged in!

Another Example: Non-Empty Lists

-- Regular lists can be empty
data List a = Nil | Cons a (List a)

-- Non-empty lists cannot be empty by construction
data NonEmpty a = NonEmpty a (List a)

-- safe head:
head :: NonEmpty a -> a
head (NonEmpty x _) = x

-- No Maybe needed! Absence of elements is not representable.

2.9 Deriving Type Class Instances

Haskell can automatically derive common instances:

data Color = Red | Green | Blue
  deriving (Eq, Ord, Show, Read, Enum, Bounded)

-- Now we can:
Red == Green      -- False
Red < Blue        -- True (Ord follows constructor order)
show Green        -- "Green"
read "Blue" :: Color  -- Blue
[Red .. Blue]     -- [Red, Green, Blue]
minBound :: Color -- Red
maxBound :: Color -- Blue

For custom types:

data Point = Point Double Double
  deriving (Eq, Show)

data Tree a = Leaf | Node a (Tree a) (Tree a)
  deriving (Eq, Show, Functor, Foldable, Traversable)

3. Project Specification

3.1 Phase 1: Core Types (Week 1, Days 1-3)

Implement the fundamental algebraic data types from scratch.

Task 1: MyMaybe

data MyMaybe a = MyNothing | MyJust a

-- Required functions:
myFromMaybe :: a -> MyMaybe a -> a
myMaybe :: b -> (a -> b) -> MyMaybe a -> b  -- Catamorphism
myMapMaybe :: (a -> b) -> MyMaybe a -> MyMaybe b
myIsJust :: MyMaybe a -> Bool
myIsNothing :: MyMaybe a -> Bool
myFromJust :: MyMaybe a -> a  -- Partial, document why
myCatMaybes :: [MyMaybe a] -> [a]
myMapMaybes :: (a -> MyMaybe b) -> [a] -> [b]

Task 2: MyEither

data MyEither a b = MyLeft a | MyRight b

-- Required functions:
myEither :: (a -> c) -> (b -> c) -> MyEither a b -> c  -- Catamorphism
myFromLeft :: a -> MyEither a b -> a
myFromRight :: b -> MyEither a b -> b
myIsLeft :: MyEither a b -> Bool
myIsRight :: MyEither a b -> Bool
myLefts :: [MyEither a b] -> [a]
myRights :: [MyEither a b] -> [b]
myPartitionEithers :: [MyEither a b] -> ([a], [b])

3.2 Phase 2: Recursive Types (Week 1, Days 4-7)

Task 3: MyList

data MyList a = MyNil | MyCons a (MyList a)

-- Core functions:
myHead :: MyList a -> MyMaybe a
myTail :: MyList a -> MyMaybe (MyList a)
myLength :: MyList a -> Int
myNull :: MyList a -> Bool

-- The fold (catamorphism):
myFoldr :: (a -> b -> b) -> b -> MyList a -> b
myFoldl :: (b -> a -> b) -> b -> MyList a -> b
myFoldl' :: (b -> a -> b) -> b -> MyList a -> b  -- Strict version

-- Higher-order functions (implement using foldr):
myMap :: (a -> b) -> MyList a -> MyList b
myFilter :: (a -> Bool) -> MyList a -> MyList a
myAppend :: MyList a -> MyList a -> MyList a
myConcat :: MyList (MyList a) -> MyList a
myConcatMap :: (a -> MyList b) -> MyList a -> MyList b

-- More complex:
myZip :: MyList a -> MyList b -> MyList (a, b)
myZipWith :: (a -> b -> c) -> MyList a -> MyList b -> MyList c
myTake :: Int -> MyList a -> MyList a
myDrop :: Int -> MyList a -> MyList a
myReverse :: MyList a -> MyList a
mySplitAt :: Int -> MyList a -> (MyList a, MyList a)

Task 4: MyTree

data MyTree a = MyLeaf | MyNode a (MyTree a) (MyTree a)

-- Core functions:
treeSize :: MyTree a -> Int
treeDepth :: MyTree a -> Int
treeSum :: Num a => MyTree a -> a
treeProduct :: Num a => MyTree a -> a

-- The fold (catamorphism):
treeFold :: b -> (a -> b -> b -> b) -> MyTree a -> b

-- Traversals (implement using treeFold):
treeInOrder :: MyTree a -> MyList a
treePreOrder :: MyTree a -> MyList a
treePostOrder :: MyTree a -> MyList a
treeLevelOrder :: MyTree a -> MyList a  -- BFS, needs queue

-- Maps:
treeMap :: (a -> b) -> MyTree a -> MyTree b
treeFilter :: (a -> Bool) -> MyTree a -> MyTree a  -- Remove nodes

-- BST operations (assuming Ord):
treeInsert :: Ord a => a -> MyTree a -> MyTree a
treeMember :: Ord a => a -> MyTree a -> Bool
treeFromList :: Ord a => [a] -> MyTree a

3.3 Phase 3: Advanced Types (Week 2)

Task 5: Additional Types

-- Rose trees (arbitrary branching)
data MyRose a = MyRose a (MyList (MyRose a))

roseFold :: (a -> [b] -> b) -> MyRose a -> b
roseMap :: (a -> b) -> MyRose a -> MyRose b
roseFlatten :: MyRose a -> MyList a

-- Non-empty lists
data MyNonEmpty a = MyNonEmpty a (MyList a)

neHead :: MyNonEmpty a -> a  -- Total!
neTail :: MyNonEmpty a -> MyList a
neToList :: MyNonEmpty a -> MyList a
neFromList :: MyList a -> MyMaybe (MyNonEmpty a)

-- Natural numbers
data MyNat = MyZero | MySucc MyNat

natToInt :: MyNat -> Int
intToNat :: Int -> MyMaybe MyNat
natAdd :: MyNat -> MyNat -> MyNat
natMult :: MyNat -> MyNat -> MyNat
natFold :: a -> (a -> a) -> MyNat -> a

Task 6: Type-Safe Builder

Design a state machine using types:

-- Traffic light that can only transition legally
data Red = Red
data Yellow = Yellow
data Green = Green

data Light a where
  RedLight    :: Light Red
  YellowLight :: Light Yellow
  GreenLight  :: Light Green

-- Only valid transitions compile:
redToGreen :: Light Red -> Light Green
greenToYellow :: Light Green -> Light Yellow
yellowToRed :: Light Yellow -> Light Red

-- Invalid transitions won't even compile:
-- greenToRed :: Light Green -> Light Red  -- No implementation possible!

3.4 Deliverables

  1. Library module: MyTypes.hs with all types and functions
  2. Test suite: Property-based tests for all functions
  3. Documentation: Haddock comments explaining each type and function
  4. Examples module: Demonstration of usage patterns

4. Solution Architecture

4.1 Module Organization

src/
  MyTypes/
    Maybe.hs      -- MyMaybe type and functions
    Either.hs     -- MyEither type and functions
    List.hs       -- MyList type and functions
    Tree.hs       -- MyTree type and functions
    Rose.hs       -- MyRose type and functions
    NonEmpty.hs   -- MyNonEmpty type and functions
    Nat.hs        -- MyNat type and functions
  MyTypes.hs      -- Re-exports everything

test/
  MyTypes/
    MaybeSpec.hs
    EitherSpec.hs
    ListSpec.hs
    TreeSpec.hs
    Properties.hs  -- Shared properties

examples/
  Examples.hs

4.2 Implementation Strategy

Implement Fold First

For each recursive type, implement the fold (catamorphism) first. Then implement other functions using the fold where possible:

-- Step 1: Define type
data MyList a = MyNil | MyCons a (MyList a)

-- Step 2: Define fold
myFoldr :: (a -> b -> b) -> b -> MyList a -> b
myFoldr _ z MyNil         = z
myFoldr f z (MyCons x xs) = f x (myFoldr f z xs)

-- Step 3: Derive other functions
myLength = myFoldr (\_ n -> n + 1) 0
mySum = myFoldr (+) 0
myMap f = myFoldr (\x xs -> MyCons (f x) xs) MyNil

Test the Fold Laws

The fold should satisfy the universal property:

-- For any h, f, z, if:
--   h MyNil = z
--   h (MyCons x xs) = f x (h xs)
-- Then:
--   h = myFoldr f z

-- Example: myLength satisfies:
--   myLength MyNil = 0
--   myLength (MyCons x xs) = 1 + myLength xs
-- So:
--   myLength = myFoldr (\_ n -> n + 1) 0

4.3 Key Insights

Folds Eliminate Constructors

Think of a fold as “replacing constructors”:

-- Original list:     MyCons 1 (MyCons 2 (MyCons 3 MyNil))
-- After foldr (+) 0:    (+) 1 (   (+) 2 (   (+) 3    0 ))

The fold replaces:

  • MyCons with f
  • MyNil with z

Folds Build Structure

Conversely, to build a structure, use a fold with constructors:

myReverse :: MyList a -> MyList a
myReverse = myFoldl (flip MyCons) MyNil
-- myFoldl flips the order!

5. Implementation Guide

5.1 Phase 1 Milestones

Milestone 1.1: MyMaybe Complete

  • All functions implemented
  • Tests pass
  • Can use in other functions

Milestone 1.2: MyEither Complete

  • All functions implemented
  • Tests pass
  • Understand Left for errors, Right for success

Milestone 1.3: Conversions

  • myMaybeToEither :: a -> MyMaybe b -> MyEither a b
  • myEitherToMaybe :: MyEither a b -> MyMaybe b

5.2 Phase 2 Milestones

Milestone 2.1: Core List Functions

  • myFoldr works
  • myMap, myFilter, myAppend work
  • Conversion to/from Haskell []

Milestone 2.2: Advanced List Functions

  • myZipWith, myTake, myDrop work
  • myReverse works (use foldl)
  • All functions use folds where possible

Milestone 2.3: Tree Core

  • treeFold works
  • Traversals work
  • treeMap works

Milestone 2.4: BST Operations

  • Insert maintains BST property
  • Lookup is O(log n) for balanced trees

5.3 Hints

Hint 1: foldl vs foldr

-- foldr works from right: f x1 (f x2 (f x3 z))
-- foldl works from left: f (f (f z x1) x2) x3

-- foldr is natural for building new lists
-- foldl is natural for accumulating values
-- foldl' (strict) prevents space leaks

Hint 2: Implementing reverse

-- Wrong (O(n^2)):
myReverse xs = myFoldr (\x acc -> myAppend acc (MyCons x MyNil)) MyNil xs

-- Right (O(n)):
myReverse xs = myFoldl (flip MyCons) MyNil xs

Hint 3: Pattern matching in tree traversals

-- In-order: left, current, right
treeInOrder :: MyTree a -> MyList a
treeInOrder = treeFold MyNil (\x left right -> myAppend left (MyCons x right))

6. Testing Strategy

6.1 Unit Tests

describe "MyMaybe" $ do
  it "myFromMaybe returns default for Nothing" $
    myFromMaybe 0 MyNothing `shouldBe` 0

  it "myFromMaybe returns value for Just" $
    myFromMaybe 0 (MyJust 5) `shouldBe` 5

describe "MyList" $ do
  it "myFoldr sums correctly" $
    myFoldr (+) 0 (MyCons 1 (MyCons 2 (MyCons 3 MyNil))) `shouldBe` 6

  it "myReverse reverses" $
    myReverse (MyCons 1 (MyCons 2 (MyCons 3 MyNil)))
      `shouldBe` MyCons 3 (MyCons 2 (MyCons 1 MyNil))

6.2 Property-Based Tests

-- Functor laws
prop_mapId :: MyList Int -> Bool
prop_mapId xs = myMap id xs == xs

prop_mapCompose :: Fun Int Int -> Fun Int Int -> MyList Int -> Bool
prop_mapCompose (Fn f) (Fn g) xs =
  myMap (f . g) xs == myMap f (myMap g xs)

-- Fold properties
prop_foldrAppend :: MyList Int -> MyList Int -> Bool
prop_foldrAppend xs ys =
  myFoldr MyCons ys xs == myAppend xs ys

prop_lengthAppend :: MyList Int -> MyList Int -> Bool
prop_lengthAppend xs ys =
  myLength (myAppend xs ys) == myLength xs + myLength ys

-- Reverse properties
prop_reverseReverse :: MyList Int -> Bool
prop_reverseReverse xs = myReverse (myReverse xs) == xs

prop_reverseLength :: MyList Int -> Bool
prop_reverseLength xs = myLength (myReverse xs) == myLength xs

6.3 BST Properties

prop_insertMember :: Int -> MyTree Int -> Bool
prop_insertMember x t = treeMember x (treeInsert x t)

prop_inOrderSorted :: [Int] -> Bool
prop_inOrderSorted xs =
  let tree = treeFromList xs
      inOrder = treeInOrder tree
  in isSorted inOrder

isSorted :: Ord a => MyList a -> Bool
isSorted MyNil = True
isSorted (MyCons _ MyNil) = True
isSorted (MyCons x (MyCons y rest)) = x <= y && isSorted (MyCons y rest)

7. Common Pitfalls

7.1 Forgetting Base Cases

Wrong:

myLength (MyCons _ xs) = 1 + myLength xs
-- Missing: myLength MyNil = ???

Right:

myLength MyNil = 0
myLength (MyCons _ xs) = 1 + myLength xs

7.2 Non-Terminating Recursion

Wrong:

myMap f xs = MyCons (f (myHead xs)) (myMap f (myTail xs))
-- Never terminates! No base case.

Right:

myMap f MyNil = MyNil
myMap f (MyCons x xs) = MyCons (f x) (myMap f xs)

7.3 Space Leaks with foldl

Wrong (for large lists):

mySum = myFoldl (+) 0
-- Builds up thunks: (((0 + 1) + 2) + 3)...

Right:

mySum = myFoldl' (+) 0
-- Forces evaluation at each step

7.4 Wrong Fold Direction

Wrong:

myReverse = myFoldr (\x acc -> myAppend acc (MyCons x MyNil)) MyNil
-- O(n^2) because append is O(n)

Right:

myReverse = myFoldl (flip MyCons) MyNil
-- O(n)

7.5 Partial Functions

Dangerous:

myHead :: MyList a -> a
myHead (MyCons x _) = x
-- Runtime error on MyNil!

Safe:

myHead :: MyList a -> MyMaybe a
myHead MyNil = MyNothing
myHead (MyCons x _) = MyJust x

8. Extensions and Challenges

8.1 Basic Extensions

  1. Zipper for Trees: Navigate and modify trees efficiently
  2. Finger Trees: Efficient append and prepend
  3. Difference Lists: O(1) append

8.2 Advanced Extensions

  1. Type-Indexed Data:
    data HList (ts :: [*]) where
      HNil :: HList '[]
      HCons :: t -> HList ts -> HList (t ': ts)
    
  2. Generalized Algebraic Data Types (GADTs):
    data Expr a where
      ILit :: Int -> Expr Int
      BLit :: Bool -> Expr Bool
      Add :: Expr Int -> Expr Int -> Expr Int
      If :: Expr Bool -> Expr a -> Expr a -> Expr a
    
  3. Fix Points and F-Algebras:
    newtype Fix f = Fix { unFix :: f (Fix f) }
    
    cata :: Functor f => (f a -> a) -> Fix f -> a
    cata alg = alg . fmap (cata alg) . unFix
    

8.3 Real-World Applications

  1. JSON Parser: Use ADTs to represent JSON
  2. Expression Evaluator: Build an AST with ADTs
  3. Game State Machine: Encode valid transitions in types

9. Real-World Connections

9.1 ADTs in Other Languages

Language Sum Types Product Types
Rust enum struct, tuple
TypeScript Discriminated unions Objects, tuples
Kotlin sealed class data class
Swift enum with associated values struct
Scala sealed trait + case class case class
OCaml Variant types Records, tuples
F# Discriminated unions Records

9.2 Industry Usage

  • Elm: Uses ADTs for Model-Update-View architecture
  • Rust: Uses Option and Result (equivalent to Maybe/Either)
  • Redux: Actions are ADTs (sum type of possible events)
  • Domain-Driven Design: Aggregate roots often model as ADTs

9.3 Pattern Matching in Production

Many modern languages are adding pattern matching:

  • Python 3.10+: match statement
  • JavaScript: Pattern matching proposal (Stage 1)
  • C# 8+: Extended pattern matching
  • Java 21+: Pattern matching for switch

10. Interview Questions

After completing this project, you should be able to answer:

  1. What makes data types “algebraic”?
    • Types correspond to algebra: sum = addition, product = multiplication
    • Can calculate cardinality using arithmetic
    • Example: Either Bool (Maybe Int) = 2 + (1 + Int )
  2. Explain the difference between sum and product types with examples.
    • Sum: choice between alternatives (Either, Maybe, enums)
    • Product: combination of values (tuples, records, structs)
    • A value of a sum type is one alternative; a value of a product type contains all fields
  3. What is pattern matching and why is it preferable to accessors?
    • Destructures values while doing case analysis
    • Compiler checks exhaustiveness
    • Binds variables in same expression
    • No null pointer exceptions
  4. What is a fold/catamorphism?
    • The fundamental way to consume a recursive data structure
    • Replaces constructors with operations
    • Any structurally recursive function can be expressed as a fold
  5. What does “making illegal states unrepresentable” mean?
    • Design types so invalid combinations cannot be constructed
    • Encode invariants in the type structure
    • Example: NonEmpty list vs. checking for empty at runtime
  6. When would you use newtype vs. data?
    • newtype: single constructor, single field, zero runtime overhead
    • data: multiple constructors or fields, has runtime representation
    • newtype for type safety without performance cost
  7. How do ADTs relate to the visitor pattern in OOP?
    • ADTs + pattern matching is dual to inheritance + virtual methods
    • ADTs: easy to add functions, hard to add cases
    • OOP: easy to add classes, hard to add methods
    • This is the “expression problem”

11. Self-Assessment Checklist

Core Types

  • Can implement Maybe with all standard functions
  • Can implement Either with all standard functions
  • Understand when to use Left vs Right by convention
  • Can convert between Maybe and Either

Recursive Types

  • Can implement a linked list from scratch
  • Can implement a binary tree from scratch
  • Understand base case vs recursive case
  • Can trace through recursive function execution

Folds

  • Can write a fold for any recursive type
  • Can implement map, filter, length using fold
  • Understand the difference between foldl and foldr
  • Know when to use strict foldl’

Pattern Matching

  • Can pattern match on nested structures
  • Understand exhaustiveness checking
  • Can use guards effectively
  • Know the difference between _ and named patterns

Type Design

  • Can design types that prevent invalid states
  • Know when to use newtype
  • Can derive standard type classes
  • Understand type cardinality

Theory

  • Can explain why types are “algebraic”
  • Understand the expression problem
  • Can relate ADTs to other languages’ features
  • Understand catamorphisms conceptually

12. Resources

Primary Texts

  1. “Haskell Programming from First Principles” by Allen & Moronuki
    • Chapter 11: Algebraic datatypes
    • Chapter 12: Signaling adversity (Maybe, Either)
    • Excellent exercises
  2. “Programming in Haskell” by Graham Hutton
    • Chapter 8: Declaring types and classes
    • Chapter 7: Higher-order functions (folds)
    • Concise and mathematical
  3. “Learn You a Haskell for Great Good!” by Miran Lipovaca
    • Chapter 8: Making Our Own Types and Typeclasses
    • Visual and beginner-friendly
    • Free online

Papers

  1. “Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire” by Meijer et al.
    • The classic paper on recursion schemes
    • Introduces catamorphisms, anamorphisms, etc.
  2. “A Tutorial on the Universality and Expressiveness of Fold” by Graham Hutton
    • Why fold is so powerful
    • Very accessible

Online Resources

  1. Typeclassopedia
    • Comprehensive guide to Haskell type classes
    • Shows where ADTs fit in the larger picture
  2. Algebraic Data Types in Haskell
    • School of Haskell tutorial
    • Interactive examples
  3. Parse, don’t validate
    • Famous blog post on type-driven design
    • Making illegal states unrepresentable

Video Lectures

  1. CIS 194 (UPenn)
    • Lecture 3: Algebraic datatypes
    • Free course materials
  2. Haskell MOOC (University of Helsinki)
    • Week 3: Datatypes
    • Exercises included

“The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.” – Edsger W. Dijkstra

After completing this project, you’ll see data differently. Types are not just labels–they are precise descriptions of possibility spaces.