
Learn Property-Based Testing: From Zero to Invariant Master

Goal: Deeply understand Property-Based Testing (PBT) by building the tools that break software. You will move beyond “example-based testing” to understand how to define universal invariants, build sophisticated random data generators, and implement “shrinking” algorithms that find the smallest possible needle in a haystack of failures. By the end, you’ll be able to prove—not just hope—that your code works for all possible inputs.


Why Property-Based Testing Matters

In 1999, John Hughes and Koen Claessen created QuickCheck for Haskell. It changed how we think about correctness. Instead of writing a handful of tests with specific inputs like `add(1, 2)`, PBT asks you to define a property: `add(a, b) == add(b, a)`. The computer then tries to break that rule 10,000 times with random data.

  • The “Unforeseen” Bug Hunter: PBT finds edge cases you didn’t know existed (empty strings, NaN, giant integers, circular references).
  • Documentation as Code: Properties are formal specifications of what your code must do, always.
  • Shrinking: When a test fails with a 5,000-character string, PBT automatically shrinks it down to the exact 1 character that causes the crash.
  • State Machine Mastery: PBT can simulate years of random user interactions in seconds to find race conditions and logic flaws in complex systems.

Core Concept Analysis

1. Example-Based vs. Property-Based

Example-Based (Traditional):
[Input: 2, 3] -> [Code: Add] -> [Expected: 5]
"I hope these 3 cases cover everything."

Property-Based:
[Generator: Ints] -> [Code: Add] -> [Check: a+b == b+a]
"I'll try 1,000 random pairs to prove commutativity."

2. The Shrinking Process

When a failure is found, the engine tries to “shrink” the input to the simplest form.

Failing Input: "A very long string with a \0 in the middle"
      ↓
Attempt 1: "A very long string" (Passes)
      ↓
Attempt 2: "string with a \0" (Fails)
      ↓
Attempt 3: "\0" (Fails)
      ↓
Result: Minimal reproduction is "\0"
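A greedy shrink loop like the one traced above fits in a few lines of Python. This is only a sketch: the shrink_candidates helper here is a hypothetical, string-specific stand-in for the per-type candidate generation a real engine does.

def shrink_candidates(s):
    # Propose strictly shorter variants: drop the first half,
    # drop the second half, then drop single characters.
    n = len(s)
    if n >= 2:
        yield s[n // 2:]
        yield s[:n // 2]
    for i in range(n):
        yield s[:i] + s[i + 1:]

def shrink(failing_input, fails):
    # Greedily replace the input with any simpler variant that still fails.
    current = failing_input
    progress = True
    while progress:
        progress = False
        for candidate in shrink_candidates(current):
            if fails(candidate):
                current = candidate
                progress = True
                break
    return current

# Example: the bug is triggered by any string containing a NUL byte.
fails = lambda s: "\0" in s
print(repr(shrink("A very long string with a \0 in the middle", fails)))
# -> '\x00'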

3. The PBT Feedback Loop

┌─────────────────┐      ┌──────────────────┐
│  Data Generator │─────▶│   Test Function  │
└─────────────────┘      └──────────┬───────┘
         ▲                          │
         │ (Success)                │ (Failure!)
         │                          ▼
         │               ┌──────────────────┐
         └───────────────┤ Shrinking Engine │
                         └──────────────────┘

The Core Pillars of PBT

A. Generators (Arbitraries)

Generators are the heart of PBT. They aren’t just random(). They are combinators that know how to create valid structures:

  • oneOf([Int, String])
  • listOf(Int, min=5)
  • recursiveTreeGenerator(depth=3)
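As a rough Python sketch (illustrative names, not a real library), combinators like these can all be built from a single wrapper around a seeded RNG:

import random

class Gen:
    # A generator wraps a function from a seeded RNG to a value.
    def __init__(self, run):
        self.run = run
    def generate(self, seed):
        return self.run(random.Random(seed))

def ints(lo=-100, hi=100):
    return Gen(lambda rng: rng.randint(lo, hi))

def one_of(gens):
    # Pick one of the given generators at random, then run it.
    return Gen(lambda rng: rng.choice(gens).run(rng))

def list_of(gen, min_len=0, max_len=10):
    return Gen(lambda rng: [gen.run(rng)
                            for _ in range(rng.randint(min_len, max_len))])

print(list_of(ints(), min_len=5).generate(seed=42))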

B. Invariants (Properties)

These are the “universal truths” of your functions:

  1. Inverse: decrypt(encrypt(x)) == x
  2. Idempotency: sort(sort(x)) == sort(x)
  3. Oracle: myComplexSort(x) == simpleSlowSort(x)
  4. Conservation: sum(accountsBeforeTransfer) == sum(accountsAfterTransfer)
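Written as code, each invariant is just a boolean function over generated inputs. A hedged Python sketch, using base64 and a bubble sort as stand-ins for real systems under test:

import base64

# Inverse: decoding what you encoded returns the original bytes.
def prop_inverse(data: bytes) -> bool:
    return base64.b64decode(base64.b64encode(data)) == data

# Idempotency: sorting an already-sorted list changes nothing.
def prop_idempotent(xs: list) -> bool:
    return sorted(sorted(xs)) == sorted(xs)

# Oracle: a "clever" sort must agree with a trusted, obvious one.
def bubble_sort(xs):
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def prop_oracle(xs: list) -> bool:
    return sorted(xs) == bubble_sort(xs)

# Conservation: transferring between accounts preserves the total.
def prop_conservation(balances: dict, src, dst, amount) -> bool:
    total_before = sum(balances.values())
    balances[src] -= amount
    balances[dst] += amount
    return sum(balances.values()) == total_before

print(prop_oracle([3, 1, 2]), prop_idempotent([2, 2, 1]))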

C. Shrinking Strategies

  • Binary Shrinking: Halve the size of lists or integers.
  • Structural Shrinking: Remove branches from trees.
  • Type-Aware Shrinking: Try lower-case letters, then empty strings.

Concept Summary Table

| Concept Cluster | What You Need to Internalize |
| --- | --- |
| Invariants | Identifying what never changes, regardless of the input. |
| Generators | Composing simple random types into complex, valid domain objects. |
| Shrinking | The algorithm that simplifies a failing case into a “minimal reproduction.” |
| Stateful PBT | Modeling a system as a state machine and testing random sequences of commands. |
| Oracles | Using a simpler, trusted (but perhaps slower) implementation to verify a complex one. |

Deep Dive Reading by Concept

Foundational Theory

| Concept | Book & Chapter |
| --- | --- |
| Origins of QuickCheck | “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert — Ch. 1: “What is Property-Based Testing?” |
| Thinking in Properties | “Haskell Programming from First Principles” by Christopher Allen — Ch. 14: “Testing” (QuickCheck section) |

Advanced Techniques

| Concept | Book & Chapter |
| --- | --- |
| Stateful Testing | “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert — Ch. 9: “Stateful Properties” |
| Custom Generators | “Expert F#” by Don Syme — Ch. 18: “Unit Testing” (FsCheck sections) |

Essential Reading Order

  1. The Mindset Shift (Week 1):
    • Fred Hebert Ch. 1-2 (The transition from examples to properties).
    • Haskell Programming Ch. 14 (Understanding Generators).

Project List

Projects are ordered from fundamental understanding to advanced implementations.


Project 1: The “Arbitrary” Factory (Custom Generators)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: TypeScript
  • Alternative Programming Languages: Python, Go, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Randomness / Combinators
  • Software or Tool: Built from scratch
  • Main Book: “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert

What you’ll build: A library of “Generators” that can produce complex, nested random data structures (like JSON objects, binary trees, or valid email addresses) from simple building blocks.

Why it teaches PBT: You can’t do PBT if you can’t generate data. This project teaches you how to compose simple randomness into domain-specific data, and how to control the “shape” of random data so it doesn’t just produce junk.

Core challenges you’ll face:

  • Composition: Building a List<Email> generator using Int, String, and Char generators.
  • Biased Randomness: Ensuring you generate “edge” values (0, empty string, null, MaxInt) more often than average values.
  • Recursive Data: Preventing a tree generator from recursing infinitely and blowing the stack.

Key Concepts:

  • Probability Distribution: How to weigh “small” vs “large” values.
  • Combinators: The flatMap or bind pattern for generators.
  • Seeded Randomness: Why reproducibility is vital for debugging.

Difficulty: Beginner Time estimate: Weekend Prerequisites: Basic understanding of functions and classes.


Real World Outcome

You will have a library that can generate complex data structures for any function you want to test.

Example Output:

const emailGen = Gen.email();
console.log(emailGen.generate(5)); 

// Output:
// [
//   "a@b.com",
//   "user_123@domain.org",
//   "",               <-- Edge case (empty local part)!
//   "very.long.name.with.dots@sub.domain.co.uk",
//   "!#$%&'*+/=?^_`{|}~@example.com"
// ]

The Core Question You’re Answering

“How do I describe the ‘shape’ of my data so the computer can explore all valid (and invalid) variations of it?”

Before you write any code, sit with this question. Most tests use “nice” data like “John Doe”. Your generator needs to be the “Chaos Monkey” that finds “John \0 Doe” or “J” or a name that is 10MB long.


Concepts You Must Understand First

Stop and research these before coding:

  1. PRNG (Pseudo-Random Number Generators)
    • What is a “Seed” and why do we need it for testing?
    • Book Reference: “Art of Computer Programming, Volume 2” Ch. 3 - Donald Knuth
  2. Functional Composition
    • How can you take two generators (e.g., String and Int) and combine them into one (e.g., User)?
    • Book Reference: “Haskell Programming from First Principles” Ch. 14

Questions to Guide Your Design

Before implementing, think through these:

  1. Range Control
    • If I ask for an Integer, should it be -1 to 1, or -2^31 to 2^31?
    • How do I make sure “0” appears in at least 10% of my tests?
  2. Recursion
    • If a User has a list of Friends (who are also Users), how do I stop the generator?

Thinking Exercise

The “Weight” of Randomness

Imagine you are testing a square root function. Most random numbers are positive.

// If you just use Math.random():
// 99.9% of tests will be positive.
// You might miss the failure for negative numbers.

Questions while tracing:

  • How would you design a generator that returns negative numbers 50% of the time, even if the underlying random source is 0 to 1?
  • How do you “force” the generator to return exactly 0 or -0?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why shouldn’t we use standard Math.random() directly in property-based testing?”
  2. “How do you ensure a random generator is reproducible across different machines?”
  3. “What is the difference between a ‘Generator’ and a ‘Fuzzer’?”
  4. “How do you handle generating data for mutually recursive types?”
  5. “Explain the ‘bias’ problem in random data generation.”

Hints in Layers

Hint 1: The Interface Start with a simple class Generator<T> that has a generate(seed: number): T method.

Hint 2: Combinators Implement a map function. If you have a Generator<number>, you should be able to map it to a Generator<string> by converting the number to a string.

Hint 3: Choice Create a oneOf function that takes an array of generators and picks one at random to run.

Hint 4: Shrinking Hooks (Advanced) Think ahead: when a value is generated, can you also generate a “shrunk” version of that value?


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Random Generation | “The Art of Computer Programming, Vol. 2” | Ch. 3 |
| QuickCheck Design | “Haskell Programming from First Principles” | Ch. 14 |

Implementation Hints

Focus on the flatMap pattern. A generator should be a function that takes some entropy (random bits) and returns a value. By chaining these functions, you can build complex objects. For recursive structures, use a “depth” parameter to decrease the probability of choosing the recursive branch as you go deeper.
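A possible shape for this, continuing the illustrative Python Gen wrapper from earlier (flat_map, list_of, and tree are assumed names for this sketch, not a standard API):

import random

class Gen:
    def __init__(self, run):
        self.run = run  # run: (random.Random) -> value
    def map(self, f):
        return Gen(lambda rng: f(self.run(rng)))
    def flat_map(self, f):
        # Use the generated value to decide which generator runs next.
        return Gen(lambda rng: f(self.run(rng)).run(rng))

def ints(lo, hi):
    return Gen(lambda rng: rng.randint(lo, hi))

def list_of(gen, length_gen):
    # flat_map in action: first pick a length, then generate that many items.
    return length_gen.flat_map(
        lambda n: Gen(lambda rng: [gen.run(rng) for _ in range(n)]))

def tree(depth=3):
    # Recursive structure: at depth 0 we must emit a leaf, otherwise
    # we choose a leaf half the time to keep the tree finite.
    def run(rng):
        if depth == 0 or rng.random() < 0.5:
            return rng.randint(0, 9)          # leaf
        return (tree(depth - 1).run(rng),     # left subtree
                tree(depth - 1).run(rng))     # right subtree
    return Gen(run)

rng = random.Random(42)
print(list_of(ints(0, 9), ints(1, 5)).run(rng))
print(tree().run(rng))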


Learning Milestones

  1. You can generate random integers and strings.
  2. You can generate a List of objects by composing smaller generators.
  3. You can generate recursive structures like JSON or Trees without infinite loops.

Project 2: The Invariant Runner (The PBT Loop)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Java, C#, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Test Harnesses / Exception Handling
  • Software or Tool: Custom Test Runner
  • Main Book: “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert

What you’ll build: A test runner that takes a “Property” (a function that returns true/false) and a “Generator”, and runs the property 100+ times, reporting exactly which input caused a failure.

Why it teaches PBT: This is the engine. You’ll learn how to manage the lifecycle of a test, how to catch and log errors, and how to report a failing “counter-example” that a developer can actually use.

Core challenges you’ll face:

  • Reproducibility: Printing the “Seed” so the user can re-run the exact same failing test.
  • Early Exit: Stopping the test suite as soon as a failure is found.
  • Reporting: Formatting the failing input so it’s readable.

Key Concepts:

  • Universal Quantification: The idea that “For all x, P(x) is true”.
  • Seed Management: Passing the state of the random number generator.
  • Counter-examples: The specific input that breaks the rule.

Difficulty: Intermediate Time estimate: 1 week


Real World Outcome

A CLI tool that runs your properties and gives you a clear “REJECTED” message with the failing data.

Example Output:

$ python pbt_runner.py test_list_reversal
Running 100 tests...
[..........F..........]
FAILED!
Property: reverse(reverse(list)) == list
Iteration: 22
Seed: 987234
Failing Input: [1, "a", None, 5]
Error: list index out of range

The Core Question You’re Answering

“How do I automate the search for a ‘counter-example’ to my code’s assumptions?”


Concepts You Must Understand First

Stop and research these before coding:

  1. Assertion vs Return Values
    • Should a property return false or throw an AssertionError on failure?
    • Book Reference: “Code Complete, 2nd Edition” Ch. 8 - Steve McConnell
  2. Loop Invariants
    • What makes a property “universal”?
    • Book Reference: “The Science of Programming” - David Gries

Questions to Guide Your Design

Before implementing, think through these:

  1. Concurrency
    • If a test hangs (infinite loop), how does the runner handle it?
  2. Statistical Significance
    • Is 100 runs enough? How do you let the user configure the run count?

Thinking Exercise

The Flaky Test vs The Bug

You run a test 100 times. It passes. You run it 1,000 times. It fails once.

Questions while tracing:

  • Is this a “flaky” test?
  • If you use the same seed, will it fail again?
  • How does your runner handle “discarded” tests (e.g., when the generator produces data that doesn’t meet a precondition)?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is a counter-example?”
  2. “How do you handle flaky tests in a property-based environment?”
  3. “Explain how you would implement a ‘Precondition’ (check before running the property).”
  4. “Why is the ‘Seed’ the most important piece of information in a PBT failure?”
  5. “How would you integrate this runner into a CI/CD pipeline?”

Hints in Layers

Hint 1: The Loop The core is just a for loop from 1 to 100. In each step, call the generator, then call the property with the result.

Hint 2: Error Handling Wrap the property call in a try/except block. If it fails, capture the exception and the input that caused it.

Hint 3: Seed Passing Don’t use a global random state. Create a Random object with a specific seed and pass it to your generators.

Hint 4: Verbosity Add a “verbose” mode that prints every generated value as it’s being tested.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Test Harnesses | “xUnit Test Patterns” | Ch. 11 |
| PBT Loops | “Property-Based Testing in PropEr…” | Ch. 2 |

Implementation Hints

Ensure your runner takes a seed as an optional argument. If not provided, generate a random one and print it. If provided, use it to initialize your PRNG. This allows users to re-run specific failures. For reporting, use a library like pprint or json.dumps to make the counter-example readable.
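Putting those hints together, the whole runner fits in a page. A minimal sketch, assuming a property is a plain function that returns a boolean or raises an exception:

import random, traceback

def run_property(prop, gen, runs=100, seed=None):
    # If no seed is given, pick one and print it so failures are replayable.
    seed = seed if seed is not None else random.randrange(2**32)
    rng = random.Random(seed)
    for i in range(1, runs + 1):
        value = gen(rng)
        try:
            ok = prop(value)
        except Exception:
            ok = False
            traceback.print_exc()
        if not ok:
            print(f"FAILED! iteration={i} seed={seed}")
            print(f"Failing input: {value!r}")
            return False
    print(f"OK: {runs} tests passed (seed={seed}).")
    return True

# Example: a deliberately wrong property so the failure report shows.
gen = lambda rng: [rng.randint(-5, 5) for _ in range(rng.randint(0, 6))]
run_property(lambda xs: xs == sorted(xs), gen)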


Learning Milestones

  1. You can run a property 100 times against a generator.
  2. You can successfully capture and report a failing input.
  3. You can reproduce a failure by providing the same seed to the runner.

Project 3: The Incredible Shrinking Input (Minimal Repros)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: C++, Haskell, OCaml
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Search Algorithms / Tree Traversal
  • Software or Tool: Shrinking Engine
  • Main Book: “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert (Ch. 4)

What you’ll build: A shrinking engine. When a test fails with a complex input (e.g., a 100-node tree), the engine systematically simplifies the input until it finds the smallest possible input that still fails.

Why it teaches PBT: Shrinking is the “secret sauce” of PBT. Without it, you just have random fuzzing. You’ll learn how to navigate the space of “simpler” inputs and how to handle the “local minima” problem in search.

Core challenges you’ll face:

  • Definition of “Simpler”: Is 0 simpler than 1? Is "a" simpler than "ab"?
  • Search Strategy: How to efficiently try smaller versions without checking every possible combination.
  • Integrated Shrinking: Linking the generator to the shrinker so the shrinker knows the type of data it’s handling.

Key Concepts:

  • Greedy Search: Always moving towards the smaller failing case.
  • Monotonicity: The working assumption that if a large input fails, a smaller input contained within it is likely to fail too.
  • Tree-based Shrinking: Treating complex structures as trees and pruning them.

Difficulty: Advanced Time estimate: 1-2 weeks


Real World Outcome

A tool that takes a massive, failing crash log and hands you a 1-line reproduction case.

Example Output:

$ ./shrinker --input "{\"user\": {\"id\": 1, \"name\": \"...1000 chars...\", \"email\": \"bad\0email\"}}"
Found failure! Shrinking...
Try size 500... Fails
Try size 100... Fails
Try empty name... Passes
Try 1 char name... Fails
Minimal Failure Found:
{ "name": "A", "email": "\0" }

The Core Question You’re Answering

“How do I find the smallest possible piece of data that breaks my code?”


Concepts You Must Understand First

Stop and research these before coding:

  1. Rose Trees
    • How can we represent all possible shrinks of a value as a tree?
    • Book Reference: “Haskell Programming from First Principles” Ch. 18
  2. Greedy Search Algorithms
    • Why do we pick the first shrink that still fails?
    • Book Reference: “Algorithms” 4th Ed. - Robert Sedgewick

Questions to Guide Your Design

Before implementing, think through these:

  1. Termination
    • How do you guarantee the shrinker stops? (The “Simpler” definition must be well-founded).
  2. Complexity
    • If you have a list of 1000 items, and items 10, 500, and 999 are needed for the crash, how does the shrinker find them?

Thinking Exercise

What is “Simple”?

Rank these inputs from “Most Complex” to “Simplest”:

  1. [5, 4, 3, 2, 1]
  2. [1]
  3. []
  4. [0, 0, 0]
  5. [-1, 1]

Questions while tracing:

  • Is a list of three zeros “simpler” than a list with one 1?
  • Does your answer change if the bug is triggered by “any list longer than 2”?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is the difference between internal and external shrinking?”
  2. “Why is shrinking important for developer productivity?”
  3. “Explain the ‘local minima’ problem in shrinking.”
  4. “How do you shrink a custom data type like a Red-Black Tree?”
  5. “Can a shrinker ever return a value that passes the test?”

Hints in Layers

Hint 1: Shrinking Integers To shrink an integer $N$ towards $0$, try $0$, then $N/2$, then $N-1$.

Hint 2: Shrinking Lists Try removing the first half, then the second half, then individual elements.

Hint 3: The Shrink Loop while true: get list of candidates from current failing input. If any candidate fails, update current input and repeat. If no candidates fail, you are done.

Hint 4: Combinators If you have a shrinker for $A$ and a shrinker for $B$, how do you shrink a Tuple $(A, B)$? You shrink $A$ while keeping $B$ fixed, then shrink $B$ while keeping $A$ fixed.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Rose Trees | “Learn You a Haskell for Great Good!” | Ch. 11 |
| Search Strategies | “Algorithm Design” by Kleinberg & Tardos | Ch. 4 |

Implementation Hints

A common approach is “Type-Directed Shrinking.” Each generator returns not just a value, but a “Rose Tree” of values. The root is the generated value, and the children are all possible one-step shrinks. This makes the shrinking engine generic—it just traverses the tree looking for the first failing node.
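A toy version of this idea for integers, sketched in Python rather than Rust (the shrink ordering and helper names are illustrative):

# A rose tree pairs a value with a lazy sequence of its one-step shrinks.
def int_tree(n):
    def shrinks():
        if n != 0:
            yield int_tree(0)             # most aggressive shrink first
            if abs(n) > 1:
                yield int_tree(n // 2)    # halve
            yield int_tree(n - (1 if n > 0 else -1))  # step toward zero
    return (n, shrinks)

def minimize(tree, fails):
    # Greedy descent: follow the first child that still fails the property.
    value, children = tree
    while True:
        for child in children():
            if fails(child[0]):
                value, children = child
                break
        else:
            return value  # no child fails: this is a local minimum

# Example: any integer >= 10 triggers the bug; the minimal repro is 10.
print(minimize(int_tree(987), fails=lambda n: n >= 10))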


Project 4: The Oracle (Verifying Complex Algorithms)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Go
  • Alternative Programming Languages: C++, Rust, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Algorithms / Verification
  • Software or Tool: High-performance sorting or search algorithm
  • Main Book: “Algorithms, Fourth Edition” by Robert Sedgewick

What you’ll build: A verification suite for a high-performance, complex algorithm (like a custom TimSort or a B-Tree). You will test it against an “Oracle”—a simple, obviously correct (but slow) implementation like a bubble sort or a naive array search.

Why it teaches PBT: Sometimes you don’t know the exact result of a function, but you know it should match a simpler version. This project teaches the “Oracle” pattern, which is the most powerful way to test complex logic.

Core challenges you’ll face:

  • Consistency: Ensuring both the fast and slow versions are given the exact same input.
  • Handling Non-Determinism: If the algorithm allows multiple valid outputs (e.g., sorting items with equal keys), how do you verify they are both “correct”?
  • Performance Trade-offs: How to handle cases where the Oracle is so slow that it limits the number of PBT runs you can do.

Key Concepts:

  • Differential Testing: Comparing two implementations of the same spec.
  • Observational Equivalence: When two systems look identical from the outside.

Difficulty: Intermediate Time estimate: 1 week


Real World Outcome

A test suite that proves your “optimized” code hasn’t introduced any subtle bugs that the naive version doesn’t have.

Example Output:

$ go test -v ./optimized_sort
Running 5000 iterations...
Iteration 4532: FAILED
Input: [10, 2, 5, 2, 8]
Optimized Result: [2, 2, 5, 10, 8]  <-- Bug!
Oracle Result:    [2, 2, 5, 8, 10]

The Core Question You’re Answering

“How do I know my ‘fast’ code is correct if the logic is too complex for humans to verify manually?”


Thinking Exercise

The Sorting Oracle

If you are testing a “Stable Sort” algorithm, and your Oracle is a non-stable sort:

Questions while tracing:

  • Will the Oracle and the Optimized version always return the same array?
  • If not, how do you define a property that allows for different orders of equal elements but still verifies the sorting?

Hints in Layers

Hint 1: The Wrapper Write a function testAlgorithm(input) that runs both versions and compares the results.

Hint 2: Sorting Objects Instead of sorting just numbers, sort objects like {key: number, original_index: number}. This makes it easier to track stability and correctness.

Hint 3: Pre-conditions Some algorithms only work on specific data (e.g., a search algorithm on a sorted list). Use your generators to create already-sorted data or use a “Filter” to discard invalid inputs.
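The skeleton of the whole project is small. A Python sketch of the differential-testing loop, with optimized_sort standing in for your real, clever implementation:

import random

def optimized_sort(xs):
    return sorted(xs)  # stand-in for the fast implementation under test

def oracle_sort(xs):
    # Obviously correct O(n^2) insertion sort: the trusted reference.
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

rng = random.Random(0)
for i in range(5000):
    xs = [rng.randint(-10, 10) for _ in range(rng.randint(0, 20))]
    if optimized_sort(xs) != oracle_sort(xs):
        print(f"Iteration {i}: FAILED on input {xs}")
        break
else:
    print("5000 iterations passed.")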


Project 5: The Virtual User (Stateful System Testing)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Elixir
  • Alternative Programming Languages: Erlang, Scala, Swift
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Distributed Systems / State Machines
  • Software or Tool: A Key-Value Store or Bank Account System
  • Main Book: “Property-Based Testing in PropEr, Erlang, and Elixir” by Fred Hebert (Ch. 9)

What you’ll build: A stateful test harness for a system (like a database or a shopping cart). The generator produces a random sequence of commands (e.g., CreateAccount, Deposit, Withdraw, CloseAccount) and verifies that the system’s state remains consistent after every step.

Why it teaches PBT: Real bugs often hide in specific sequences of actions (e.g., “Withdraw after Close”). Stateful PBT teaches you how to model your system as a formal state machine and find “temporal” bugs.

Core challenges you’ll face:

  • Model vs. Reality: Maintaining a “Model” of what the system should look like and comparing it to the actual running system.
  • Valid Sequences: Ensuring the generator doesn’t try to Withdraw from an account that hasn’t been Created yet.
  • Shrinking Sequences: If a 50-step sequence fails, shrinking it to the 2 steps that actually matter.

Key Concepts:

  • Linearizability: Does the sequence of random commands make sense in a timeline?
  • Abstract State Machine: A simplified version of your system used for verification.

Difficulty: Expert Time estimate: 2-3 weeks


Real World Outcome

You’ll find “unreachable” bugs like: “If a user puts an item in the cart, clears their cookies, and then applies a discount, the price becomes negative.”

Example Output:

$ mix test.pbt
FAILED: Sequence of 4 commands leads to inconsistent state:
1. CREATE_USER("alice")
2. ADD_TO_CART("alice", "laptop")
3. DELETE_USER("alice")
4. CHECKOUT("alice") <-- SYSTEM CRASHED! (Expected: 404 User Not Found)

The Core Question You’re Answering

“How do I test a system whose behavior depends on its history, not just its current input?”


Concepts You Must Understand First

Stop and research these before coding:

  1. State Machine Modeling
    • How can you represent a system as a set of states and transitions?
    • Book Reference: “Operating Systems: Three Easy Pieces” Ch. 4 - Remzi Arpaci-Dusseau
  2. Pre-conditions vs Post-conditions
    • Why can’t we run every command in every state?
    • Book Reference: “Property-Based Testing in PropEr…” Ch. 9

Questions to Guide Your Design

Before implementing, think through these:

  1. Model Accuracy
    • If the system is a database, should your model be another database, or just an in-memory Map?
  2. Command Selection
    • How do you weigh commands so you don’t just call GET 100 times and never call POST?

Thinking Exercise

The Bank Account State Machine

States: Closed, Active. Commands: Open, Deposit, Withdraw, Close.

Questions while tracing:

  • Can you Deposit into a Closed account?
  • What happens to the balance when an account is Closed?
  • How does your model track these transitions?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is stateful property-based testing?”
  2. “Explain the ‘Model’ in stateful testing.”
  3. “How do you shrink a sequence of commands while maintaining validity?”
  4. “Why is stateful PBT particularly good at finding race conditions?”
  5. “What are the trade-offs between ‘Black-box’ and ‘White-box’ stateful testing?”

Hints in Layers

Hint 1: The Command List Define your commands as objects/records: {name: 'deposit', amount: 50}.

Hint 2: The Model Keep a simple local variable that represents the expected state of the system. Update it after every command you send to the real system.

Hint 3: Preconditions Each command needs a can_run(model) function. The generator uses this to pick only valid next steps.

Hint 4: Parallel Checking To find race conditions, run two sequences in parallel and check if any serial interleaving of those commands explains the final state.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Stateful PBT | “Property-Based Testing in PropEr…” | Ch. 9 |
| State Machines | “Structure and Interpretation of Computer Programs” | Ch. 3 |

Implementation Hints

In Elixir/Erlang, use the proper_statem behavior. It forces you to define four callbacks: initial_state, command, precondition, and postcondition. The engine handles the random generation and shrinking of command sequences for you. For the model, use a simple Map or List.
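The same model-versus-system loop, sketched in Python rather than Elixir so the moving parts are visible (the BankSystem class is a stand-in for the real system under test):

import random

class BankSystem:
    # The "real" system under test; imagine this is a remote service.
    def __init__(self):
        self.balance, self.is_open = 0, False
    def open_account(self): self.is_open = True
    def deposit(self, n):
        if self.is_open: self.balance += n
    def close(self): self.is_open = False

rng = random.Random(7)
system, model = BankSystem(), {"open": False, "balance": 0}

for step in range(200):
    # Preconditions: pick only commands that are valid in the model state.
    cmds = ["open"] if not model["open"] else ["deposit", "close"]
    cmd = rng.choice(cmds)
    if cmd == "open":
        system.open_account(); model["open"] = True
    elif cmd == "deposit":
        n = rng.randint(1, 100)
        system.deposit(n); model["balance"] += n
    else:
        system.close(); model["open"] = False
    # Postcondition: the real system must agree with the model.
    assert system.balance == model["balance"], f"Diverged at step {step}"
print("200 random commands: model and system stayed in sync.")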


Learning Milestones

  1. You can generate a random sequence of valid commands.
  2. You can maintain a local model that mirrors the system’s state.
  3. You can find bugs that only occur after a specific sequence of 5+ actions.

Project 6: The API Fuzzer (Contract Testing)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Python (using Hypothesis)
  • Alternative Programming Languages: JavaScript, Go
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Web APIs / HTTP
  • Software or Tool: A REST or GraphQL API
  • Main Book: “Designing Web APIs” by Brenda Jin

What you’ll build: A tool that reads an OpenAPI (Swagger) spec and generates thousands of valid and “near-valid” HTTP requests to your API, checking for 500 errors, schema violations, or data leaks.

Why it teaches PBT: It connects PBT to the “real world” of web development. You’ll learn how to generate data that fits a strict external schema (JSON Schema) and how to handle the latency/overhead of network testing.

Core challenges you’ll face:

  • Authentication: How to generate random tokens or maintain a session during 1000s of requests.
  • Data Pollution: How to clean up the database after your PBT runner creates 50,000 “John Doe” users.
  • Rate Limiting: Preventing your own test from being blocked by your security layers.

Key Concepts:

  • Fuzzing vs. PBT: When random junk is better than random “valid” data.
  • Contract Testing: Ensuring the server obeys its own documentation.

Difficulty: Intermediate Time estimate: 1-2 weeks


Real World Outcome

A list of specific API endpoints that crash when given unusual (but valid) JSON, or that return sensitive fields they shouldn’t.

Example Output:

$ python api_tester.py http://localhost:8080/v1
Testing POST /users...
Iteration 87: 500 INTERNAL SERVER ERROR
Payload: {"name": "\u0000", "age": 2147483648}
Database Error: Integer overflow on 'age' column.

Implementation Hints

Use the Hypothesis library in Python if possible, as it has excellent support for JSON and Schemas. Use a library like Schemathesis as a reference. Focus on the “Round-trip” property: if you POST a resource and then GET it, the values should match exactly.
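A minimal Hypothesis example of the round-trip property, using an in-process JSON round-trip as a stand-in for POST-then-GET (the strategy below is one plausible way to generate JSON-like payloads):

import json
from hypothesis import given, strategies as st

# Stand-in for POST-then-GET: serialize a payload and read it back.
def round_trip(payload):
    return json.loads(json.dumps(payload))

# Recursively build JSON-like values: scalars, lists, and string-keyed dicts.
json_values = st.recursive(
    st.none() | st.booleans() | st.integers() | st.text(),
    lambda children: st.lists(children) | st.dictionaries(st.text(), children),
    max_leaves=20,
)

@given(json_values)
def test_round_trip(payload):
    assert round_trip(payload) == payload

test_round_trip()  # run the property; Hypothesis shrinks any failure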


Project 7: The Complexity Watchdog (Performance PBT)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: C++
  • Alternative Programming Languages: Rust, C, Java
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Performance / Big O
  • Software or Tool: Custom Profiler / Benchmarking tool
  • Main Book: “Systems Performance” by Brendan Gregg

What you’ll build: A tool that searches for “Worst Case” inputs. Instead of testing for correctness, the property is: execution_time < expected_limit. The PBT engine will try to find the specific input size or shape that triggers $O(N^2)$ behavior in an $O(N \log N)$ algorithm (like “median-of-three” Quicksort killers).

Why it teaches PBT: It expands your definition of a “Property.” A property doesn’t have to be a boolean; it can be a resource bound (time, memory, stack depth). You’ll learn how to use PBT to find performance regressions and security vulnerabilities like ReDoS (Regular Expression Denial of Service).

Core challenges you’ll face:

  • Measurement Noise: How to distinguish a slow algorithm from a background process on your OS.
  • Input Scaling: How to generate “larger” and “more complex” inputs to see where the curve breaks.
  • Search Guidance: Using execution time to “guide” the generator towards slower paths (approaching “Evolutionary Testing”).

Key Concepts:

  • Algorithmic Complexity Attacks: Finding inputs that trigger worst-case performance.
  • Micro-benchmarking Pitfalls: JIT warming, cache effects, and instruction pipelining.

Difficulty: Advanced Time estimate: 2 weeks


Real World Outcome

You’ll discover inputs that cause your “fast” system to hang for seconds, potentially preventing a Denial of Service attack.

Example Output:

$ ./perf_tester --target my_regex_engine
Running search for ReDoS...
Iteration 1500: Slowdown detected!
Input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaa!"
Time: 4.5 seconds (Expected < 0.01s)
Complexity looks like O(2^N).

Concepts You Must Understand First

Stop and research these before coding:

  1. Big O Notation
    • What is the difference between average case and worst case?
    • Book Reference: “Grokking Algorithms” Ch. 1 - Aditya Bhargava
  2. ReDoS (Regular Expression DoS)
    • How can a string cause exponential backtracking in a regex engine?
    • Book Reference: “Foundations of Information Security” Ch. 8 - Jason Andress

Questions to Guide Your Design

Before implementing, think through these:

  1. Thresholds
    • How do you define “too slow”? Should it be absolute time or a relative ratio?
  2. Search Feedback
    • If a test is slow, how can the generator “learn” to produce similar slow inputs?

Thinking Exercise

The Quicksort Killer

Standard Quicksort picks a pivot. If you always pick the smallest or largest element, it becomes $O(N^2)$.

Questions while tracing:

  • If you have an array of 1000 items, what specific arrangement (sorted, reverse sorted, etc.) triggers the worst case for a “pivot at index 0” strategy?
  • How would a PBT generator find this arrangement randomly?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How can PBT be used for performance testing?”
  2. “What is an Algorithmic Complexity attack?”
  3. “How do you account for CPU frequency scaling and jitter in performance tests?”
  4. “What are metamorphic properties in the context of performance?”
  5. “Can you explain how to shrink a performance counter-example?”

Hints in Layers

Hint 1: The Timer Use high-resolution timers (like std::chrono::high_resolution_clock in C++).

Hint 2: Relative Limits Instead of time < 1s, use (time_for_2N) / (time_for_N) < limit. For $O(N)$, the ratio should be $\approx 2$. For $O(N^2)$, it will be $\approx 4$.

Hint 3: Genetic Algorithms If an input is slow, mutate it slightly rather than starting from scratch. This “hill-climbing” helps find peak slowness.

Hint 4: Profiling Integration Run the slow input through a profiler to see which function is hot.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Performance Analysis | “Systems Performance” | Ch. 2 |
| Worst-case Analysis | “Introduction to Algorithms” | Ch. 2 |

Implementation Hints

In C++, use std::chrono for timing. Run each test multiple times and take the median to reduce noise. For the “Watchdog”, you can use setitimer or a separate thread that kills the execution if it exceeds a hard deadline (e.g., 10x the average).
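A sketch of the doubling-ratio check from Hint 2, written in Python rather than C++ for brevity (the threshold of 3 is an arbitrary illustrative cutoff between "looks linear" and "looks quadratic"):

import time

def measure(f, xs, repeats=5):
    # Median of several runs reduces scheduler and cache noise.
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        f(xs)
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

def quadratic_work(xs):
    return sum(x <= y for x in xs for y in xs)  # deliberately O(N^2)

n = 500
t1 = measure(quadratic_work, list(range(n)))
t2 = measure(quadratic_work, list(range(2 * n)))
ratio = t2 / t1
# For O(N) the doubling ratio is ~2; for O(N^2) it is ~4.
print(f"doubling ratio: {ratio:.1f}")
if ratio > 3:
    print("Complexity property violated: growth looks superlinear.")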


Learning Milestones

  1. You can identify inputs that trigger significantly slower performance.
  2. You can plot a curve of “Input Size vs Time” automatically using PBT data.
  3. You can find a “Quicksort Killer” or “Regex Killer” input.

Project 8: The Crypto-Validator (Inverse Properties)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: C, Python (using PyCryptodome)
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 4: Expert
  • Knowledge Area: Cryptography / Security
  • Software or Tool: Custom implementation of AES, RSA, or a Hash function
  • Main Book: “Serious Cryptography” by Jean-Philippe Aumasson

What you’ll build: A verification suite for a cryptographic library. You’ll use the “Inverse” property (decrypt(encrypt(x)) == x) and “Algebraic” properties (e.g., in raw RSA: encrypt((a * b) % n) == (encrypt(a) * encrypt(b)) % n).

Why it teaches PBT: Cryptography is unforgiving. A single bit error breaks everything. This project teaches you how to use mathematical laws as properties and how to handle extremely large, structured random data (keys, nonces, padding).

Core challenges you’ll face:

  • Giant Integers: Generating and manipulating 2048-bit primes and numbers.
  • Structured Randomness: A random key is not just any sequence of bytes; it might need to meet specific entropy or parity requirements.
  • Known-Answer Tests (KAT): Integrating standard test vectors from NIST with your random PBT tests.

Key Concepts:

  • Round-trip Properties: The most common type of PBT property.
  • Malleability: Checking if changing a ciphertext in a specific way produces a predictable change in the plaintext.

Difficulty: Expert Time estimate: 2-3 weeks


Concepts You Must Understand First

Stop and research these before coding:

  1. Big Integer Arithmetic
    • How are 2048-bit numbers represented and multiplied?
    • Book Reference: “The Art of Computer Programming, Vol 2” Ch. 4
  2. Modular Exponentiation
    • Why is $(A \cdot B) \pmod N = ((A \pmod N) \cdot (B \pmod N)) \pmod N$ important for RSA?
    • Book Reference: “Serious Cryptography” Ch. 10

Questions to Guide Your Design

Before implementing, think through these:

  1. Test Vectors
    • How do you know your PBT generator is actually covering the “known weak” keys (like prime factors that are too close)?
  2. Inverse Quality
    • If decrypt(encrypt(x)) == x fails, is the bug in encrypt, decrypt, or the BigInt library?

Thinking Exercise

The Bit-Flip Property

In a stream-cipher mode like AES-CTR, if you flip the 3rd bit of the ciphertext, the 3rd bit of the plaintext should flip.

Questions while tracing:

  • Is this a property or a vulnerability?
  • How would you write a PBT test to verify this “Malleability” property?
  • Why would this property fail for an Authenticated Encryption mode like AES-GCM?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why is PBT uniquely suited for cryptographic software?”
  2. “What is a ‘Round-trip’ property in crypto?”
  3. “How would you test for constant-time execution using PBT?”
  4. “What are Algebraic Properties in public-key cryptography?”
  5. “Explain how you would use PBT to find a side-channel leak.”

Hints in Layers

Hint 1: Small Primes First Don’t start with 2048-bit RSA. Test your logic with 8-bit or 16-bit numbers where you can verify the math manually.

Hint 2: Property Stacking Test decrypt(encrypt(x)) == x, but also sanity-check that encrypt(x) != x for typical inputs (beware trivial fixed points such as 0 and 1 in RSA, which encrypt to themselves).

Hint 3: Edge Case Generation Force your generators to produce 0, 1, $N-1$, and powers of 2. These are where big-int bugs live.

Hint 4: Test Vectors Load NIST test vectors into your runner and treat them as “fixed examples” that must always pass alongside the random tests.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Cryptography | “Serious Cryptography” | Ch. 1–4 |
| Number Theory | “Concrete Mathematics” | Ch. 4 |

Implementation Hints

Use the num-bigint crate in Rust. When generating random plaintexts, ensure they are smaller than the modulus $N$. For RSA, remember to test both PKCS#1 v1.5 and OAEP padding. The multiplicative (homomorphic) property of raw RSA is a great PBT test: $E(M_1) \cdot E(M_2) \equiv E(M_1 \cdot M_2) \pmod N$.
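Both properties can be checked end to end with textbook-sized numbers before you touch 2048-bit keys. A Python sketch using the classic toy parameters p = 61, q = 53:

import random

# Toy RSA with textbook-sized primes, for property checks only.
n, e, d = 3233, 17, 2753   # n = 61 * 53; e*d ≡ 1 (mod phi(n))

encrypt = lambda m: pow(m, e, n)
decrypt = lambda c: pow(c, d, n)

rng = random.Random(1)
for _ in range(1000):
    m1, m2 = rng.randrange(1, n), rng.randrange(1, n)
    # Round-trip property: decryption inverts encryption.
    assert decrypt(encrypt(m1)) == m1
    # Multiplicative property of raw RSA:
    # E(m1) * E(m2) mod n == E(m1 * m2 mod n)
    assert (encrypt(m1) * encrypt(m2)) % n == encrypt((m1 * m2) % n)
print("Round-trip and multiplicative properties held for 1000 random pairs.")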


Learning Milestones

  1. You successfully verify the round-trip property for a custom cipher.
  2. You identify a bug in modular arithmetic using a large random input.
  3. You use an algebraic property (like homomorphic multiplication) to verify an implementation.


Project 9: The Race Hunter (Concurrent PBT)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Java (using Lincheck)
  • Alternative Programming Languages: C++ (with Relacy), Elixir
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Concurrency / Memory Models
  • Software or Tool: A Thread-safe Queue or Concurrent HashMap
  • Main Book: “The Art of Multiprocessor Programming” by Maurice Herlihy

What you’ll build: A tool that executes random operations on a data structure from multiple threads simultaneously. It then uses PBT to find the exact interleaving of thread operations that leads to a race condition or a corrupted state.

Why it teaches PBT: Race conditions are notoriously hard to reproduce. Concurrent PBT (like the “Parallel State Machine” approach) turns non-deterministic bugs into deterministic ones by controlling the scheduler or searching the space of interleavings.

Core challenges you’ll face:

  • Interleaving Space: There are billions of ways to order instructions from two threads. How do you find the one that fails?
  • Linearizability Checking: Given a log of parallel events, how do you determine if there’s a “valid” sequential order that explains the result?
  • Reproducing the Race: Once you find a failure, how do you force the CPU to repeat that exact timing to prove the fix?

Key Concepts:

  • Linearizability: The gold standard of concurrent correctness.
  • Thread Interleaving: The specific order in which different threads’ instructions are executed.

Difficulty: Master Time estimate: 1 month+


Concepts You Must Understand First

Stop and research these before coding:

  1. Memory Models
    • What is the difference between “Sequential Consistency” and “Weak Consistency”?
    • Book Reference: “A Primer on Memory Consistency and Cache Coherence” - Sorin et al.
  2. Linearizability
    • What does it mean for a parallel execution to be “equivalent to a serial one”?
    • Book Reference: “The Art of Multiprocessor Programming” Ch. 3

Questions to Guide Your Design

Before implementing, think through these:

  1. Observability
    • How can you record the history of events without the “Observer Effect” (where recording the bug makes the bug disappear)?
  2. State Space
    • If two threads run 5 operations each, how many possible interleavings are there? Is it feasible to test them all?

Thinking Exercise

The Lost Update

Thread A: x = x + 1
Thread B: x = x + 1
Initial: x = 0

Questions while tracing:

  • List all possible final values of x if there are no locks.
  • Which of these values are “Linearizable”?
  • How would your PBT tool detect the non-linearizable case?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is Linearizability?”
  2. “How do you test a concurrent data structure?”
  3. “Explain the ‘Parallel State Machine’ approach to PBT.”
  4. “What is a ‘context switch’ and how does it relate to race conditions?”
  5. “Can PBT prove the absence of race conditions?”

Hints in Layers

Hint 1: The Actor Pattern Think of each thread as an Actor performing random actions.

Hint 2: The Log Every operation should return a Result and a Timestamp (or a global order ID).

Hint 3: The Linearizability Checker Take the set of parallel operations and try to find a single valid serial order (using your Sequential Model from Project 5) that produces the same results. If no such order exists, you found a race.

Hint 4: Forced Yielding Inject Thread.yield() or sleep(0) randomly into your code to increase the chance of context switches at critical moments.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Concurrency | “The Art of Multiprocessor Programming” | Ch. 1–3 |
| Java Memory Model | “Java Concurrency in Practice” | Ch. 16 |

Implementation Hints

In Java, use the Lincheck library. It uses “Model-based testing.” You provide a thread-safe implementation and a non-thread-safe “Model.” The library runs them in parallel and compares results. For C++, look at CDSChecker; Microsoft’s P language is another system worth studying. The hardest part is the “Linearizability Checker,” which is essentially a search problem.
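The search problem is easiest to see on the “Lost Update” exercise from earlier. A Python sketch that enumerates every interleaving of two read-then-write threads and computes the reachable final values:

from itertools import permutations

# Each thread performs: read x, then write x+1. Model steps as (thread, op).
steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

def run(order):
    x, local = 0, {}
    for thread, op in order:
        if op == "read":
            local[thread] = x
        else:
            x = local[thread] + 1
    return x

# Keep only interleavings that preserve each thread's own program order.
results = set()
for order in permutations(steps):
    if [s for s in order if s[0] == "A"] == steps[:2] and \
       [s for s in order if s[0] == "B"] == steps[2:]:
        results.add(run(order))

print(f"Possible final values: {sorted(results)}")   # [1, 2]
# A serial execution always yields 2, so a final value of 1 is the
# non-linearizable "lost update" your checker must flag.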


Learning Milestones

  1. You can find race conditions in a simple counter.
  2. You successfully detect a “Lost Update” or “ABA Problem” in a concurrent stack.
  3. You can reproduce a race condition by replaying a specific interleaving of operations.



Project 10: The Metamorphic Tester (Testing the Untestable)

  • File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
  • Main Programming Language: Python
  • Alternative Programming Languages: R, MATLAB, Julia
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Machine Learning / Scientific Computing
  • Software or Tool: An Image Classifier or a Search Engine
  • Main Book: “Metamorphic Testing: Theory and Practice” by T.Y. Chen

What you’ll build: A test suite for a system where you don’t know the “correct” answer (e.g., a search engine or an AI model). Instead of checking if search("cats") is correct, you check the relation between inputs: results(search("cats")) should be a subset of results(search("cats" OR "dogs")).

Why it teaches PBT: This is the cutting edge. It teaches you how to test systems where an “Oracle” is impossible. You learn to identify “Metamorphic Relations”—rules about how the output should change when the input is transformed.

Core challenges you’ll face:

  • Relation Discovery: Finding useful rules (e.g., “If I rotate this image 90 degrees, the classification should not change”).
  • False Positives: Handling systems that are non-deterministic or “fuzzy” by nature.
  • Transformations: Implementing input transformations (like image rotation or query expansion) without introducing bugs in the transformations themselves.

Key Concepts:

  • Metamorphic Relation (MR): A property that relates multiple inputs and outputs.
  • Symmetry: A common MR where transforming the input doesn’t change the output.

Difficulty: Expert Time estimate: 2-3 weeks


Concepts You Must Understand First

Stop and research these before coding:

  1. Metamorphic Relations (MR)
    • What are the common types of relations (Symmetry, Inclusion, Invariance)?
    • Book Reference: “Metamorphic Testing: Theory and Practice” Ch. 2
  2. Precision vs Recall
    • How do we test a “fuzzy” system like a search engine?
    • Book Reference: “Data Science for Business” Ch. 7 - Foster Provost

Questions to Guide Your Design

Before implementing, think through these:

  1. Transformation Integrity
    • If you rotate an image to test a classifier, are you sure the rotation didn’t introduce artifacts that confuse the AI?
  2. Statistical Significance
    • If a search result changes slightly, is it a bug, or just the ranking algorithm being non-deterministic?

Thinking Exercise

The Google Maps Test

Suppose you are testing a “Shortest Path” algorithm on a map.

Questions while tracing:

  • If you add a new road to the map, can the shortest path between A and B get longer? (This is a Metamorphic Relation!)
  • If you remove a road that was not on the shortest path, can the shortest path change?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is Metamorphic Testing?”
  2. “When should you use Metamorphic Testing instead of standard PBT?”
  3. “Give an example of a Metamorphic Relation for a machine learning model.”
  4. “How do you handle non-determinism in metamorphic properties?”
  5. “Explain how metamorphic testing can be used to test compilers.”

Hints in Layers

Hint 1: Pick a Domain Start with something simple like a sum() function. A metamorphic property is sum([a, b]) == sum([b, a]).

Hint 2: Image Transformations For an image classifier, use libraries like OpenCV to flip, rotate, or change the brightness of an image. The classification should ideally stay the same.

Hint 3: Search Engine Relations Test “Subsumption”: Results(A) ⊆ Results(A OR B). If this is false, the boolean logic is broken.

Hint 4: Compilers Test “Equivalent Transformations”: If you rename a local variable in a source file, the compiled binary’s behavior (the output it produces) should be identical.


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Metamorphic Testing | “Metamorphic Testing: Theory and Practice” | Ch. 1–3 |
| Search Algorithms | “Introduction to Information Retrieval” | Ch. 1 |

Implementation Hints

In Python, use numpy or opencv for data transformations. Define a list of Transformations and a list of Relations. For each random input, apply a transformation and check if the relation holds between the original output and the transformed output.
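A Python sketch of the “adding a road never lengthens the shortest path” relation from the Google Maps exercise (graph size, edge probability, and weight range are arbitrary illustrative choices):

import heapq, random

def shortest_path_cost(adj, src, dst):
    # Dijkstra over an adjacency dict {u: {v: weight}}.
    dist, heap = {src: 0}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

rng = random.Random(3)
for _ in range(200):
    # Random small weighted graph on nodes 0..5.
    adj = {u: {v: rng.randint(1, 9) for v in range(6)
               if v != u and rng.random() < 0.4} for u in range(6)}
    before = shortest_path_cost(adj, 0, 5)
    # Metamorphic relation: adding a road can never lengthen the path.
    u, v, w = rng.randrange(6), rng.randrange(6), rng.randint(1, 9)
    adj.setdefault(u, {})
    adj[u][v] = min(adj[u].get(v, w), w)  # add a road; never worsen one
    after = shortest_path_cost(adj, 0, 5)
    assert after <= before, f"Adding an edge lengthened the path! {adj}"
print("Metamorphic relation held for 200 random graphs.")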


Learning Milestones

  1. You can define a metamorphic relation for a search engine.
  2. You successfully identify a “Symmetry” bug in a machine learning model.
  3. You can test a complex system without knowing the expected output for any single test case.

Project Comparison Table

| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
| --- | --- | --- | --- | --- |
| 1. Arbitrary Factory | Level 1 | Weekend | Medium | High |
| 2. Invariant Runner | Level 2 | 1 Week | Medium | Medium |
| 3. Shrinking Engine | Level 3 | 2 Weeks | High | High |
| 4. Sorting Oracle | Level 2 | 1 Week | Medium | Low |
| 5. Stateful User | Level 4 | 3 Weeks | Very High | High |
| 6. API Fuzzer | Level 2 | 1 Week | Medium | Medium |
| 7. Complexity Watchdog | Level 3 | 2 Weeks | High | Medium |
| 8. Crypto Validator | Level 4 | 2 Weeks | High | Low |
| 9. Race Hunter | Level 5 | 1 Month | Maximum | High |
| 10. Metamorphic Tester | Level 4 | 3 Weeks | High | High |

Recommendation

If you are new to PBT: Start with Project 1 (The Arbitrary Factory). Understanding how data is generated and composed is the prerequisite for everything else. Once you can generate a random User or Tree, you’ve unlocked the mindset.

If you are a Systems Engineer: Jump straight to Project 9 (The Race Hunter). It’s the most challenging, but solving a “one-in-a-million” race condition with a script you wrote is the ultimate engineering dopamine hit.


Final Overall Project: The “Bulletproof” Compiler

What you’ll build: A small compiler or interpreter for a Lisp-like language. To prove its correctness, you will build a PBT suite that uses:

  1. Generators for random valid source code.
  2. Stateful Testing to simulate a debugger stepping through the code.
  3. An Oracle (an existing Lisp interpreter) to verify the results.
  4. Shrinking to turn a giant failing program into a 3-line bug report.

This project combines everything: data generation, oracle verification, state management, and the “magic” of shrinking.


Summary

This learning path covers Property-Based Testing through 10 hands-on projects. Here’s the complete list:

| # | Project Name | Main Language | Difficulty | Time Estimate |
| --- | --- | --- | --- | --- |
| 1 | The “Arbitrary” Factory | TypeScript | Level 1 | Weekend |
| 2 | The Invariant Runner | Python | Level 2 | 1 Week |
| 3 | The Shrinking Engine | Rust | Level 3 | 2 Weeks |
| 4 | The Oracle | Go | Level 2 | 1 Week |
| 5 | The Virtual User | Elixir | Level 4 | 3 Weeks |
| 6 | The API Fuzzer | Python | Level 2 | 1 Week |
| 7 | The Complexity Watchdog | C++ | Level 3 | 2 Weeks |
| 8 | The Crypto-Validator | Rust | Level 4 | 2 Weeks |
| 9 | The Race Hunter | Java | Level 5 | 1 Month+ |
| 10 | The Metamorphic Tester | Python | Level 4 | 3 Weeks |

For beginners: Start with projects #1, #2, #6.
For intermediate: Jump to projects #3, #4, #7.
For advanced: Focus on projects #5, #8, #9, #10.

Expected Outcomes

After completing these projects, you will:

  • Think in terms of universal invariants rather than specific cases.
  • Be able to build sophisticated data generators for any domain.
  • Master the shrinking algorithms that make automated testing useful.
  • Find race conditions and temporal bugs that manual testing misses.
  • Verify complex algorithms and security protocols with mathematical rigor.

You’ll have built 10 working projects that demonstrate deep understanding of Property-Based Testing from first principles.