Learn Property-Based Testing: From Zero to Invariant Master
Goal: Deeply understand Property-Based Testing (PBT) by building the tools that break software. You will move beyond "example-based testing" to understand how to define universal invariants, build sophisticated random data generators, and implement "shrinking" algorithms that find the smallest possible needle in a haystack of failures. By the end, you'll be able to prove, not just hope, that your code works for all possible inputs.
Why Property-Based Testing Matters
In 1999, John Hughes and Koen Claessen created QuickCheck for Haskell. It changed how we think about correctness. Instead of writing five tests with specific inputs like add(1, 2), PBT asks you to define a property: add(a, b) == add(b, a). The computer then tries to break that rule 10,000 times with random data.
- The "Unforeseen" Bug Hunter: PBT finds edge cases you didn't know existed (empty strings, NaN, giant integers, circular references).
- Documentation as Code: Properties are formal specifications of what your code must do, always.
- Shrinking: When a test fails with a 5,000-character string, PBT automatically shrinks it down to the exact 1 character that causes the crash.
- State Machine Mastery: PBT can simulate years of random user interactions in seconds to find race conditions and logic flaws in complex systems.
Core Concept Analysis
1. Example-Based vs. Property-Based
Example-Based (Traditional):
[Input: 2, 3] -> [Code: Add] -> [Expected: 5]
"I hope these 3 cases cover everything."
Property-Based:
[Generator: Ints] -> [Code: Add] -> [Check: a+b == b+a]
"I'll try 1,000 random pairs to prove commutativity."
2. The Shrinking Process
When a failure is found, the engine tries to "shrink" the input to the simplest form.
Failing Input: "A very long string with a \0 in the middle"
   ↓
Attempt 1: "A very long string" (Passes)
   ↓
Attempt 2: "string with a \0" (Fails)
   ↓
Attempt 3: "\0" (Fails)
   ↓
Result: Minimal reproduction is "\0"
3. The PBT Feedback Loop
+----------------+           +------------------+
| Data Generator | --------> |  Test Function   |
+----------------+           +------------------+
        ^                             |
        | (Success)                   | (Failure!)
        |                             v
        |                    +------------------+
        +------------------- | Shrinking Engine |
                             +------------------+
The Core Pillars of PBT
A. Generators (Arbitraries)
Generators are the heart of PBT. They aren't just random(). They are combinators that know how to create valid structures:
- oneOf([Int, String])
- listOf(Int, min=5)
- recursiveTreeGenerator(depth=3)
B. Invariants (Properties)
These are the "universal truths" of your functions:
- Inverse: decrypt(encrypt(x)) == x
- Idempotency: sort(sort(x)) == sort(x)
- Oracle: myComplexSort(x) == simpleSlowSort(x)
- Conservation: sum(accountsBeforeTransfer) == sum(accountsAfterTransfer)
C. Shrinking Strategies
- Binary Shrinking: Halve the size of lists or integers.
- Structural Shrinking: Remove branches from trees.
- Type-Aware Shrinking: Try lower-case letters, then empty strings.
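A hedged sketch of what "one-step simpler" candidates might look like for integers and lists; the candidate set and its ordering are design choices here, not a standard:

```typescript
// Candidate "one-step" shrinks for an integer: move it towards zero.
function shrinkInt(n: number): number[] {
  if (n === 0) return [];
  const candidates = [0, Math.trunc(n / 2), n - Math.sign(n)];
  // Deduplicate and keep only values strictly closer to zero
  return [...new Set(candidates)].filter((c) => Math.abs(c) < Math.abs(n));
}

// Candidate "one-step" shrinks for a list: drop halves or single elements.
function shrinkList<T>(xs: T[]): T[][] {
  if (xs.length === 0) return [];
  const candidates: T[][] = [[]];
  const half = Math.floor(xs.length / 2);
  if (half > 0) {
    candidates.push(xs.slice(0, half)); // first half
    candidates.push(xs.slice(half));    // second half
  }
  for (let i = 0; i < xs.length; i++) {
    candidates.push([...xs.slice(0, i), ...xs.slice(i + 1)]); // drop one element
  }
  return candidates;
}

console.log(shrinkInt(10));         // [ 0, 5, 9 ]
console.log(shrinkList([1, 2, 3])); // [ [], [1], [2, 3], [2, 3], [1, 3], [1, 2] ]
```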
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Invariants | Identifying what never changes, regardless of the input. |
| Generators | Composing simple random types into complex, valid domain objects. |
| Shrinking | The algorithm that simplifies a failing case into a "minimal reproduction." |
| Stateful PBT | Modeling a system as a state machine and testing random sequences of commands. |
| Oracles | Using a simpler, trusted (but perhaps slower) implementation to verify a complex one. |
Deep Dive Reading by Concept
Foundational Theory
| Concept | Book & Chapter |
|---|---|
| Origins of QuickCheck | "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert - Ch. 1: "What is Property-Based Testing?" |
| Thinking in Properties | "Haskell Programming from First Principles" by Christopher Allen - Ch. 14: "Testing" (QuickCheck section) |
Advanced Techniques
| Concept | Book & Chapter |
|---|---|
| Stateful Testing | "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert - Ch. 9: "Stateful Properties" |
| Custom Generators | "Expert F#" by Don Syme - Ch. 18: "Unit Testing" (FsCheck sections) |
Essential Reading Order
- The Mindset Shift (Week 1):
- Fred Hebert Ch. 1-2 (The transition from examples to properties).
- Haskell Programming Ch. 14 (Understanding Generators).
Project List
Projects are ordered from fundamental understanding to advanced implementations.
Project 1: The "Arbitrary" Factory (Custom Generators)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: TypeScript
- Alternative Programming Languages: Python, Go, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The "Resume Gold"
- Difficulty: Level 1: Beginner
- Knowledge Area: Randomness / Combinators
- Software or Tool: Built from scratch
- Main Book: "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert
What you'll build: A library of "Generators" that can produce complex, nested random data structures (like JSON objects, binary trees, or valid email addresses) based on simple building blocks.
Why it teaches PBT: You can't do PBT if you can't generate data. This project teaches you how to compose simple randomness into domain-specific data, and how to control the "shape" of random data so it doesn't just produce junk.
Core challenges you'll face:
- Composition: Building a List<Email> generator using Int, String, and Char generators.
- Biased Randomness: Ensuring you generate "edge" values (0, empty string, null, MaxInt) more often than average values.
- Recursive Data: Preventing a tree generator from recursing infinitely and blowing the stack.
Key Concepts:
- Probability Distribution: How to weigh "small" vs "large" values.
- Combinators: The flatMap or bind pattern for generators.
- Seeded Randomness: Why reproducibility is vital for debugging.
Difficulty: Beginner Time estimate: Weekend Prerequisites: Basic understanding of functions and classes.
Real World Outcome
You will have a library that can generate complex data structures for any function you want to test.
Example Output:
const emailGen = Gen.email();
console.log(emailGen.generate(5));
// Output:
// [
// "a@b.com",
// "user_123@domain.org",
// "", <-- Edge case (empty local part)!
// "very.long.name.with.dots@sub.domain.co.uk",
// "!#$%&'*+/=?^_`{|}~@example.com"
// ]
The Core Question You're Answering
"How do I describe the 'shape' of my data so the computer can explore all valid (and invalid) variations of it?"
Before you write any code, sit with this question. Most tests use "nice" data like "John Doe". Your generator needs to be the "Chaos Monkey" that finds "John \0 Doe" or "J" or a name that is 10MB long.
Concepts You Must Understand First
Stop and research these before coding:
- PRNG (Pseudo-Random Number Generators)
- What is a "Seed" and why do we need it for testing?
- Book Reference: "Art of Computer Programming, Volume 2" Ch. 3 - Donald Knuth
- Functional Composition
- How can you take two generators (e.g., String and Int) and combine them into one (e.g., User)?
- Book Reference: "Haskell Programming from First Principles" Ch. 14
Questions to Guide Your Design
Before implementing, think through these:
- Range Control
- If I ask for an Integer, should it be -1 to 1, or -2^31 to 2^31?
- How do I make sure "0" appears in at least 10% of my tests?
- Recursion
- If a User has a list of Friends (who are also Users), how do I stop the generator?
Thinking Exercise
The âWeightâ of Randomness
Imagine you are testing a square root function. Most random numbers are positive.
// If you just use Math.random():
// 99.9% of tests will be positive.
// You might miss the failure for negative numbers.
Questions while tracing:
- How would you design a generator that returns negative numbers 50% of the time, even if the underlying random source is 0 to 1?
- How do you "force" the generator to return exactly 0 or -0?
The Interview Questions They'll Ask
Prepare to answer these:
- "Why shouldn't we use standard Math.random() directly in property-based testing?"
- "How do you ensure a random generator is reproducible across different machines?"
- "What is the difference between a 'Generator' and a 'Fuzzer'?"
- "How do you handle generating data for mutually recursive types?"
- "Explain the 'bias' problem in random data generation."
Hints in Layers
Hint 1: The Interface
Start with a simple class Generator<T> that has a generate(seed: number): T method.
Hint 2: Combinators
Implement a map function. If you have a Generator<number>, you should be able to map it to a Generator<string> by converting the number to a string.
Hint 3: Choice
Create a oneOf function that takes an array of generators and picks one at random to run.
Hint 4: Shrinking Hooks (Advanced) Think ahead: when a value is generated, can you also generate a "shrunk" version of that value?
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Random Generation | "The Art of Computer Programming, Vol 2" | Ch 3 |
| QuickCheck Design | "Haskell Programming from First Principles" | Ch 14 |
Implementation Hints
Focus on the flatMap pattern. A generator should be a function that takes some entropy (random bits) and returns a value. By chaining these functions, you can build complex objects. For recursive structures, use a "depth" parameter to decrease the probability of choosing the recursive branch as you go deeper.
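Here is a hedged TypeScript sketch of that flatMap pattern plus a seeded PRNG; the `Gen` class, the mulberry32-style generator, and the combinator names are illustrative, not a particular library's API:

```typescript
type Rng = () => number; // returns a float in [0, 1)

function seededRng(seed: number): Rng {
  let s = seed >>> 0;
  return () => {
    // Small deterministic PRNG (mulberry32 variant) so failures are replayable
    s = (s + 0x6d2b79f5) >>> 0;
    let t = Math.imul(s ^ (s >>> 15), 1 | s);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

class Gen<T> {
  constructor(readonly run: (rng: Rng) => T) {}
  map<U>(f: (t: T) => U): Gen<U> {
    return new Gen((rng) => f(this.run(rng)));
  }
  flatMap<U>(f: (t: T) => Gen<U>): Gen<U> {
    return new Gen((rng) => f(this.run(rng)).run(rng));
  }
  static int(min: number, max: number): Gen<number> {
    return new Gen((rng) => min + Math.floor(rng() * (max - min + 1)));
  }
  static oneOf<T>(gens: Gen<T>[]): Gen<T> {
    return Gen.int(0, gens.length - 1).flatMap((i) => gens[i]);
  }
  static listOf<T>(item: Gen<T>, maxLen: number): Gen<T[]> {
    return Gen.int(0, maxLen).flatMap((len) =>
      new Gen((rng) => Array.from({ length: len }, () => item.run(rng))),
    );
  }
}

// Usage: a (very loose) "email-like" generator built from smaller pieces
const letter = Gen.int(97, 122).map((c) => String.fromCharCode(c));
const word = Gen.listOf(letter, 8).map((cs) => cs.join(""));
const email = word.flatMap((local) => word.map((domain) => `${local}@${domain}.com`));
console.log(email.run(seededRng(42)));
```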
Learning Milestones
- You can generate random integers and strings.
- You can generate a List of objects by composing smaller generators.
- You can generate recursive structures like JSON or Trees without infinite loops.
Project 2: The Invariant Runner (The PBT Loop)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: Java, C#, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The "Micro-SaaS / Pro Tool"
- Difficulty: Level 2: Intermediate
- Knowledge Area: Test Harnesses / Exception Handling
- Software or Tool: Custom Test Runner
- Main Book: "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert
What you'll build: A test runner that takes a "Property" (a function that returns true/false) and a "Generator", and runs the property 100+ times, reporting exactly which input caused a failure.
Why it teaches PBT: This is the engine. You'll learn how to manage the lifecycle of a test, how to catch and log errors, and how to report a failing "counter-example" that a developer can actually use.
Core challenges you'll face:
- Reproducibility: Printing the "Seed" so the user can re-run the exact same failing test.
- Early Exit: Stopping the test suite as soon as a failure is found.
- Reporting: Formatting the failing input so it's readable.
Key Concepts:
- Universal Quantification: The idea that "For all x, P(x) is true".
- Seed Management: Passing the state of the random number generator.
- Counter-examples: The specific input that breaks the rule.
Difficulty: Intermediate Time estimate: 1 week
Real World Outcome
A CLI tool that runs your properties and gives you a clear "REJECTED" message with the failing data.
Example Output:
$ python pbt_runner.py test_list_reversal
Running 100 tests...
[..........F..........]
FAILED!
Property: reverse(reverse(list)) == list
Iteration: 22
Seed: 987234
Failing Input: [1, "a", null, 5]
Error: list index out of range
The Core Question You're Answering
"How do I automate the search for a 'counter-example' to my code's assumptions?"
Concepts You Must Understand First
Stop and research these before coding:
- Assertion vs Return Values
- Should a property return false or throw an AssertionError on failure?
- Book Reference: "Code Complete, 2nd Edition" Ch. 8 - Steve McConnell
- Loop Invariants
- What makes a property "universal"?
- Book Reference: "The Science of Programming" - David Gries
Questions to Guide Your Design
Before implementing, think through these:
- Concurrency
- If a test hangs (infinite loop), how does the runner handle it?
- Statistical Significance
- Is 100 runs enough? How do you let the user configure the run count?
Thinking Exercise
The Flaky Test vs The Bug
You run a test 100 times. It passes. You run it 1,000 times. It fails once.
Questions while tracing:
- Is this a "flaky" test?
- If you use the same seed, will it fail again?
- How does your runner handle "discarded" tests (e.g., when the generator produces data that doesn't meet a precondition)?
The Interview Questions They'll Ask
Prepare to answer these:
- "What is a counter-example?"
- "How do you handle flaky tests in a property-based environment?"
- "Explain how you would implement a 'Precondition' (check before running the property)."
- "Why is the 'Seed' the most important piece of information in a PBT failure?"
- "How would you integrate this runner into a CI/CD pipeline?"
Hints in Layers
Hint 1: The Loop
The core is just a for loop from 1 to 100. In each step, call the generator, then call the property with the result.
Hint 2: Error Handling
Wrap the property call in a try/except block. If it fails, capture the exception and the input that caused it.
Hint 3: Seed Passing
Don't use a global random state. Create a Random object with a specific seed and pass it to your generators.
Hint 4: Verbosity Add a "verbose" mode that prints every generated value as it's being tested.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Test Harnesses | "xUnit Test Patterns" | Ch. 11 |
| PBT Loops | "Property-Based Testing in PropEr..." | Ch. 2 |
Implementation Hints
Ensure your runner takes a seed as an optional argument. If not provided, generate a random one and print it. If provided, use it to initialize your PRNG. This allows users to re-run specific failures. For reporting, use a library like pprint or json.dumps to make the counter-example readable.
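As a sketch of that loop (in TypeScript rather than Python; names, defaults, and the report shape are illustrative):

```typescript
// Run a property N times against a generator, report the seed and the first
// counter-example so the failure can be replayed.
type Rng = () => number;
type Generator<T> = (rng: Rng) => T;
type Property<T> = (value: T) => boolean;

function seededRng(seed: number): Rng {
  let s = seed >>> 0; // simple LCG so the whole run is reproducible from the seed
  return () => {
    s = (Math.imul(s, 1664525) + 1013904223) >>> 0;
    return s / 4294967296;
  };
}

interface Report<T> { ok: boolean; runs: number; seed: number; counterExample?: T; error?: unknown }

function forAll<T>(gen: Generator<T>, prop: Property<T>, runs = 100, seed?: number): Report<T> {
  const usedSeed = seed ?? Math.floor(Math.random() * 2 ** 31);
  const rng = seededRng(usedSeed);
  for (let i = 1; i <= runs; i++) {
    const value = gen(rng);
    try {
      if (!prop(value)) return { ok: false, runs: i, seed: usedSeed, counterExample: value };
    } catch (error) {
      // A thrown exception counts as a failure too
      return { ok: false, runs: i, seed: usedSeed, counterExample: value, error };
    }
  }
  return { ok: true, runs, seed: usedSeed };
}

// Usage: "reverse twice is identity"; rerun with report.seed to reproduce a failure
const listGen: Generator<number[]> = (rng) =>
  Array.from({ length: Math.floor(rng() * 20) }, () => Math.floor(rng() * 200) - 100);
const report = forAll(listGen, (xs) => JSON.stringify([...xs].reverse().reverse()) === JSON.stringify(xs));
console.log(report.ok ? `OK (seed ${report.seed})` : `FAILED at run ${report.runs}`, report.counterExample ?? "");
```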
Learning Milestones
- You can run a property 100 times against a generator.
- You can successfully capture and report a failing input.
- You can reproduce a failure by providing the same seed to the runner.
Project 3: The Incredible Shrinking Input (Minimal Repros)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Rust
- Alternative Programming Languages: C++, Haskell, OCaml
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The "Resume Gold"
- Difficulty: Level 3: Advanced
- Knowledge Area: Search Algorithms / Tree Traversal
- Software or Tool: Shrinking Engine
- Main Book: "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert (Ch. 4)
What you'll build: A shrinking engine. When a test fails with a complex input (e.g., a 100-node tree), the engine systematically simplifies the input until it finds the smallest possible input that still fails.
Why it teaches PBT: Shrinking is the "secret sauce" of PBT. Without it, you just have random fuzzing. You'll learn how to navigate the space of "simpler" inputs and how to handle the "local minima" problem in search.
Core challenges you'll face:
- Definition of "Simpler": Is 0 simpler than 1? Is "a" simpler than "ab"?
- Search Strategy: How to efficiently try smaller versions without checking every possible combination.
- Integrated Shrinking: Linking the generator to the shrinker so the shrinker knows the type of data it's handling.
Key Concepts:
- Greedy Search: Always moving towards the smaller failing case.
- Monotonicity: The assumption that if a small thing fails, a larger thing containing it might also fail.
- Tree-based Shrinking: Treating complex structures as trees and pruning them.
Difficulty: Advanced Time estimate: 1-2 weeks
Real World Outcome
A tool that takes a massive, failing crash log and hands you a 1-line reproduction case.
Example Output:
$ ./shrinker --input "{\"user\": {\"id\": 1, \"name\": \"...1000 chars...\", \"email\": \"bad\0email\"}}"
Found failure! Shrinking...
Try size 500... Fails
Try size 100... Fails
Try empty name... Passes
Try 1 char name... Fails
Minimal Failure Found:
{ "name": "A", "email": "\0" }
The Core Question You're Answering
"How do I find the smallest possible piece of data that breaks my code?"
Concepts You Must Understand First
Stop and research these before coding:
- Rose Trees
- How can we represent all possible shrinks of a value as a tree?
- Book Reference: "Haskell Programming from First Principles" Ch. 18
- Greedy Search Algorithms
- Why do we pick the first shrink that still fails?
- Book Reference: "Algorithms" 4th Ed. - Robert Sedgewick
Questions to Guide Your Design
Before implementing, think through these:
- Termination
- How do you guarantee the shrinker stops? (The "Simpler" definition must be well-founded).
- Complexity
- If you have a list of 1000 items, and items 10, 500, and 999 are needed for the crash, how does the shrinker find them?
Thinking Exercise
What is "Simple"?
Rank these inputs from "Most Complex" to "Simplest":
- [5, 4, 3, 2, 1]
- [1]
- []
- [0, 0, 0]
- [-1, 1]
Questions while tracing:
- Is a list of three zeros "simpler" than a list with one 1?
- Does your answer change if the bug is triggered by "any list longer than 2"?
The Interview Questions They'll Ask
Prepare to answer these:
- "What is the difference between internal and external shrinking?"
- "Why is shrinking important for developer productivity?"
- "Explain the 'local minima' problem in shrinking."
- "How do you shrink a custom data type like a Red-Black Tree?"
- "Can a shrinker ever return a value that passes the test?"
Hints in Layers
Hint 1: Shrinking Integers To shrink an integer $N$ towards $0$, try $0$, then $N/2$, then $N-1$.
Hint 2: Shrinking Lists Try removing the first half, then the second half, then individual elements.
Hint 3: The Shrink Loop
while true: get list of candidates from current failing input. If any candidate fails, update current input and repeat. If no candidates fail, you are done.
Hint 4: Combinators If you have a shrinker for $A$ and a shrinker for $B$, how do you shrink a Tuple $(A, B)$? You shrink $A$ while keeping $B$ fixed, then shrink $B$ while keeping $A$ fixed.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Rose Trees | "Learn You a Haskell for Great Good!" | Ch. 11 |
| Search Strategies | "Algorithm Design" by Kleinberg & Tardos | Ch. 4 |
Implementation Hints
A common approach is "Type-Directed Shrinking." Each generator returns not just a value, but a "Rose Tree" of values. The root is the generated value, and the children are all possible one-step shrinks. This makes the shrinking engine generic: it just traverses the tree looking for the first failing node.
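Whichever representation you choose, the outer search is a small greedy loop. A hedged TypeScript sketch (the helper passed as `shrinkCandidates` could be something like the `shrinkList` sketched earlier; termination relies on candidates always being strictly "smaller"):

```typescript
// Greedy shrink loop: repeatedly take the first "one-step simpler" candidate
// that still fails, and stop when no candidate fails any more.
function shrink<T>(
  failing: T,
  stillFails: (candidate: T) => boolean,
  shrinkCandidates: (value: T) => T[],
): T {
  let current = failing;
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const candidate of shrinkCandidates(current)) {
      if (stillFails(candidate)) {
        current = candidate; // greedily accept the first simpler failing input
        progressed = true;
        break;
      }
    }
  }
  return current; // a local minimum: no one-step-simpler input still fails
}

// Usage sketch: if any list containing a 0 crashes the code under test, then
// shrink([5, 0, 9, 0], xs => xs.includes(0), shrinkList) ends at [0].
```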
Project 4: The Oracle (Verifying Complex Algorithms)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Go
- Alternative Programming Languages: C++, Rust, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The "Service & Support" Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Algorithms / Verification
- Software or Tool: High-performance sorting or search algorithm
- Main Book: "Algorithms, Fourth Edition" by Robert Sedgewick
What you'll build: A verification suite for a high-performance, complex algorithm (like a custom TimSort or a B-Tree). You will test it against an "Oracle": a simple, obviously correct (but slow) implementation like a bubble sort or a naive array search.
Why it teaches PBT: Sometimes you don't know the exact result of a function, but you know it should match a simpler version. This project teaches the "Oracle" pattern, which is the most powerful way to test complex logic.
Core challenges you'll face:
- Consistency: Ensuring both the fast and slow versions are given the exact same input.
- Handling Non-Determinism: If the algorithm allows multiple valid outputs (e.g., sorting items with equal keys), how do you verify they are both "correct"?
- Performance Trade-offs: How to handle cases where the Oracle is so slow that it limits the number of PBT runs you can do.
Key Concepts:
- Differential Testing: Comparing two implementations of the same spec.
- Observational Equivalence: When two systems look identical from the outside.
Difficulty: Intermediate Time estimate: 1 week
Real World Outcome
A test suite that proves your "optimised" code hasn't introduced any subtle bugs that the naive version doesn't have.
Example Output:
$ go test -v ./optimized_sort
Running 5000 iterations...
Iteration 4532: FAILED
Input: [10, 2, 5, 2, 8]
Optimized Result: [2, 2, 5, 10, 8] <-- Bug!
Oracle Result: [2, 2, 5, 8, 10]
The Core Question You're Answering
"How do I know my 'fast' code is correct if the logic is too complex for humans to verify manually?"
Thinking Exercise
The Sorting Oracle
If you are testing a "Stable Sort" algorithm, and your Oracle is a non-stable sort:
Questions while tracing:
- Will the Oracle and the Optimized version always return the same array?
- If not, how do you define a property that allows for different orders of equal elements but still verifies the sorting?
Hints in Layers
Hint 1: The Wrapper
Write a function testAlgorithm(input) that runs both versions and compares the results.
Hint 2: Sorting Objects
Instead of sorting just numbers, sort objects like {key: number, original_index: number}. This makes it easier to track stability and correctness.
Hint 3: Pre-conditions Some algorithms only work on specific data (e.g., a search algorithm on a sorted list). Use your generators to create already-sorted data or use a "Filter" to discard invalid inputs.
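A hedged TypeScript sketch of the Oracle pattern (the "fast" sort here is just the built-in sort standing in for your optimized implementation; the insertion sort is the slow-but-obvious oracle):

```typescript
// Oracle: insertion sort, slow but clearly correct.
function oracleSort(xs: number[]): number[] {
  const out: number[] = [];
  for (const x of xs) {
    let i = 0;
    while (i < out.length && out[i] <= x) i++; // find the sorted position
    out.splice(i, 0, x);
  }
  return out;
}

// The implementation under test (a stand-in for your optimized algorithm).
function fastSort(xs: number[]): number[] {
  return [...xs].sort((a, b) => a - b);
}

function randomArray(maxLen: number): number[] {
  const len = Math.floor(Math.random() * (maxLen + 1));
  return Array.from({ length: len }, () => Math.floor(Math.random() * 201) - 100);
}

// Differential test: both versions get the exact same input every iteration.
for (let i = 0; i < 5000; i++) {
  const input = randomArray(50);
  const expected = oracleSort(input);
  const actual = fastSort(input);
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    console.log("FAILED", { input, actual, expected });
    break;
  }
}
```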
Project 5: The Virtual User (Stateful System Testing)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Elixir
- Alternative Programming Languages: Erlang, Scala, Swift
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The "Open Core" Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Distributed Systems / State Machines
- Software or Tool: A Key-Value Store or Bank Account System
- Main Book: "Property-Based Testing in PropEr, Erlang, and Elixir" by Fred Hebert (Ch. 9)
What you'll build: A stateful test harness for a system (like a database or a shopping cart). The generator produces a random sequence of commands (e.g., CreateAccount, Deposit, Withdraw, CloseAccount) and verifies that the system's state remains consistent after every step.
Why it teaches PBT: Real bugs often hide in specific sequences of actions (e.g., "Withdraw after Close"). Stateful PBT teaches you how to model your system as a formal state machine and find "temporal" bugs.
Core challenges you'll face:
- Model vs. Reality: Maintaining a "Model" of what the system should look like and comparing it to the actual running system.
- Valid Sequences: Ensuring the generator doesn't try to Withdraw from an account that hasn't been Created yet.
- Shrinking Sequences: If a 50-step sequence fails, shrinking it to the 2 steps that actually matter.
Key Concepts:
- Linearizability: Does the sequence of random commands make sense in a timeline?
- Abstract State Machine: A simplified version of your system used for verification.
Difficulty: Expert Time estimate: 2-3 weeks
Real World Outcome
You'll find "unreachable" bugs like: "If a user puts an item in the cart, clears their cookies, and then applies a discount, the price becomes negative."
Example Output:
$ mix test.pbt
FAILED: Sequence of 4 commands leads to inconsistent state:
1. CREATE_USER("alice")
2. ADD_TO_CART("alice", "laptop")
3. DELETE_USER("alice")
4. CHECKOUT("alice") <-- SYSTEM CRASHED! (Expected: 404 User Not Found)
The Core Question You're Answering
"How do I test a system whose behavior depends on its history, not just its current input?"
Concepts You Must Understand First
Stop and research these before coding:
- State Machine Modeling
- How can you represent a system as a set of states and transitions?
- Book Reference: "Operating Systems: Three Easy Pieces" Ch. 4 - Remzi Arpaci-Dusseau
- Pre-conditions vs Post-conditions
- Why can't we run every command in every state?
- Book Reference: "Property-Based Testing in PropEr..." Ch. 9
Questions to Guide Your Design
Before implementing, think through these:
- Model Accuracy
- If the system is a database, should your model be another database, or just an in-memory Map?
- Command Selection
- How do you weigh commands so you don't just call GET 100 times and never call POST?
Thinking Exercise
The Bank Account State Machine
States: Closed, Active.
Commands: Open, Deposit, Withdraw, Close.
Questions while tracing:
- Can you Deposit into a Closed account?
- What happens to the balance when an account is Closed?
- How does your model track these transitions?
The Interview Questions They'll Ask
Prepare to answer these:
- "What is stateful property-based testing?"
- "Explain the 'Model' in stateful testing."
- "How do you shrink a sequence of commands while maintaining validity?"
- "Why is stateful PBT particularly good at finding race conditions?"
- "What are the trade-offs between 'Black-box' and 'White-box' stateful testing?"
Hints in Layers
Hint 1: The Command List
Define your commands as objects/records: {name: 'deposit', amount: 50}.
Hint 2: The Model Keep a simple local variable that represents the expected state of the system. Update it after every command you send to the real system.
Hint 3: Preconditions
Each command needs a can_run(model) function. The generator uses this to pick only valid next steps.
Hint 4: Parallel Checking To find race conditions, run two sequences in parallel and check if any serial interleaving of those commands explains the final state.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Stateful PBT | "Property-Based Testing in PropEr..." | Ch. 9 |
| State Machines | "Structure and Interpretation of Computer Programs" | Ch. 3 |
Implementation Hints
In Elixir/Erlang, use the proper_statem behavior. It asks you to define callbacks such as initial_state, command, precondition, postcondition, and next_state. The engine handles the random generation and shrinking of command sequences for you. For the model, use a simple Map or List.
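For intuition, here is a language-agnostic sketch of the model-vs-system idea written in TypeScript; the command set, the precondition logic, and the postcondition are illustrative, and a real harness would also drive the actual system and compare it to the model at every step:

```typescript
interface Model { open: boolean; balance: number }

type Command =
  | { kind: "open" }
  | { kind: "deposit"; amount: number }
  | { kind: "withdraw"; amount: number }
  | { kind: "close" };

// Precondition baked into generation: only pick commands valid in this state.
function nextCommand(model: Model): Command {
  const candidates: Command[] = model.open
    ? [{ kind: "deposit", amount: randInt(1, 100) },
       { kind: "withdraw", amount: randInt(1, 100) },
       { kind: "close" }]
    : [{ kind: "open" }];
  return candidates[randInt(0, candidates.length - 1)];
}

// The abstract state machine: what the system SHOULD do.
function applyToModel(model: Model, cmd: Command): Model {
  switch (cmd.kind) {
    case "open": return { open: true, balance: 0 };
    case "close": return { ...model, open: false };
    case "deposit": return { ...model, balance: model.balance + cmd.amount };
    case "withdraw": return { ...model, balance: Math.max(0, model.balance - cmd.amount) };
  }
}

function randInt(min: number, max: number): number {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// In a real harness you would also send `cmd` to the system under test and
// compare its observable state/return value to the model after every step.
let model: Model = { open: false, balance: 0 };
for (let step = 0; step < 50; step++) {
  const cmd = nextCommand(model);
  model = applyToModel(model, cmd);
  // Postcondition example: the model never reports a negative balance
  if (model.balance < 0) throw new Error(`Invariant violated at step ${step}`);
}
```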
Learning Milestones
- You can generate a random sequence of valid commands.
- You can maintain a local model that mirrors the system's state.
- You can find bugs that only occur after a specific sequence of 5+ actions.
Project 6: The API Fuzzer (Contract Testing)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Python (using Hypothesis)
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The "Service & Support" Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Web APIs / HTTP
- Software or Tool: A REST or GraphQL API
- Main Book: "Designing Web APIs" by Brenda Jin
What you'll build: A tool that reads an OpenAPI (Swagger) spec and generates thousands of valid and "near-valid" HTTP requests to your API, checking for 500 errors, schema violations, or data leaks.
Why it teaches PBT: It connects PBT to the "real world" of web development. You'll learn how to generate data that fits a strict external schema (JSON Schema) and how to handle the latency/overhead of network testing.
Core challenges you'll face:
- Authentication: How to generate random tokens or maintain a session during 1000s of requests.
- Data Pollution: How to clean up the database after your PBT runner creates 50,000 "John Doe" users.
- Rate Limiting: Preventing your own test from being blocked by your security layers.
Key Concepts:
- Fuzzing vs. PBT: When random junk is better than random "valid" data.
- Contract Testing: Ensuring the server obeys its own documentation.
Difficulty: Intermediate Time estimate: 1-2 weeks
Real World Outcome
A list of specific API endpoints that crash when given unusual (but valid) JSON, or that return sensitive fields they shouldn't.
Example Output:
$ python api_tester.py http://localhost:8080/v1
Testing POST /users...
Iteration 87: 500 INTERNAL SERVER ERROR
Payload: {"name": "\u0000", "age": 2147483648}
Database Error: Integer overflow on 'age' column.
Implementation Hints
Use the Hypothesis library in Python if possible, as it has excellent support for JSON and Schemas. Use a library like Schemathesis as a reference. Focus on the "Round-trip" property: if you POST a resource and then GET it, the values should match exactly.
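A hedged sketch of that round-trip property in TypeScript (assumes a global fetch as in Node 18+; the /users endpoint, payload shape, and id field are hypothetical, and the inputs here are varied deterministically rather than drawn from a real generator):

```typescript
// Round-trip property: whatever we POST, a subsequent GET must return unchanged.
async function roundTripProperty(baseUrl: string): Promise<void> {
  for (let i = 0; i < 100; i++) {
    // Deliberately awkward but "valid" data: null bytes, long strings, odd ages
    const name = i % 3 === 0 ? "\u0000weird" : "x".repeat(1 + (i * 37) % 500);
    const payload = { name, age: (i * 7919) % 130 };

    const postRes = await fetch(`${baseUrl}/users`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });
    if (postRes.status >= 500) {
      throw new Error(`Server error on POST: ${postRes.status} for ${JSON.stringify(payload)}`);
    }
    const created = await postRes.json();

    const getRes = await fetch(`${baseUrl}/users/${created.id}`);
    const fetched = await getRes.json();
    if (fetched.name !== payload.name || fetched.age !== payload.age) {
      throw new Error(`Round-trip mismatch: sent ${JSON.stringify(payload)}, got ${JSON.stringify(fetched)}`);
    }
  }
}

// roundTripProperty("http://localhost:8080/v1").catch(console.error);
```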
Project 7: The Complexity Watchdog (Performance PBT)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: C++
- Alternative Programming Languages: Rust, C, Java
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The "Service & Support" Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Performance / Big O
- Software or Tool: Custom Profiler / Benchmarking tool
- Main Book: "Systems Performance" by Brendan Gregg
What you'll build: A tool that searches for "Worst Case" inputs. Instead of testing for correctness, the property is: execution_time < expected_limit. The PBT engine will try to find the specific input size or shape that triggers $O(N^2)$ behavior in an $O(N \log N)$ algorithm (like "median-of-three" Quicksort killers).
Why it teaches PBT: It expands your definition of a "Property." A property doesn't have to be a boolean; it can be a resource bound (time, memory, stack depth). You'll learn how to use PBT to find performance regressions and security vulnerabilities like ReDoS (Regular Expression Denial of Service).
Core challenges you'll face:
- Measurement Noise: How to distinguish a slow algorithm from a background process on your OS.
- Input Scaling: How to generate "larger" and "more complex" inputs to see where the curve breaks.
- Search Guidance: Using execution time to "guide" the generator towards slower paths (approaching "Evolutionary Testing").
Key Concepts:
- Algorithmic Complexity Attacks: Finding inputs that trigger worst-case performance.
- Micro-benchmarking Pitfalls: JIT warming, cache effects, and instruction pipelining.
Difficulty: Advanced Time estimate: 2 weeks
Real World Outcome
You'll discover inputs that cause your "fast" system to hang for seconds, potentially preventing a Denial of Service attack.
Example Output:
$ ./perf_tester --target my_regex_engine
Running search for ReDoS...
Iteration 1500: Slowdown detected!
Input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaa!"
Time: 4.5 seconds (Expected < 0.01s)
Complexity looks like O(2^N).
Concepts You Must Understand First
Stop and research these before coding:
- Big O Notation
- What is the difference between average case and worst case?
- Book Reference: "Grokking Algorithms" Ch. 1 - Aditya Bhargava
- ReDoS (Regular Expression DoS)
- How can a string cause exponential backtracking in a regex engine?
- Book Reference: "Foundations of Information Security" Ch. 8 - Jason Andress
Questions to Guide Your Design
Before implementing, think through these:
- Thresholds
- How do you define "too slow"? Should it be absolute time or a relative ratio?
- Search Feedback
- If a test is slow, how can the generator "learn" to produce similar slow inputs?
Thinking Exercise
The Quicksort Killer
Standard Quicksort picks a pivot. If you always pick the smallest or largest element, it becomes $O(N^2)$.
Questions while tracing:
- If you have an array of 1000 items, what specific arrangement (sorted, reverse sorted, etc.) triggers the worst case for a "pivot at index 0" strategy?
- How would a PBT generator find this arrangement randomly?
The Interview Questions They'll Ask
Prepare to answer these:
- "How can PBT be used for performance testing?"
- "What is an Algorithmic Complexity attack?"
- "How do you account for CPU frequency scaling and jitter in performance tests?"
- "What are metamorphic properties in the context of performance?"
- "Can you explain how to shrink a performance counter-example?"
Hints in Layers
Hint 1: The Timer
Use high-resolution timers (like std::chrono::high_resolution_clock in C++).
Hint 2: Relative Limits
Instead of time < 1s, use (time_for_2N) / (time_for_N) < limit. For $O(N)$, the ratio should be $\approx 2$. For $O(N^2)$, it will be $\approx 4$.
Hint 3: Genetic Algorithms If an input is slow, mutate it slightly rather than starting from scratch. This "hill-climbing" helps find peak slowness.
Hint 4: Profiling Integration Run the slow input through a profiler to see which function is hot.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Performance Analysis | "Systems Performance" | Ch. 2 |
| Worst-case Analysis | "Introduction to Algorithms" | Ch. 2 |
Implementation Hints
In C++, use std::chrono for timing. Run each test multiple times and take the median to reduce noise. For the "Watchdog", you can use setitimer or a separate thread that kills the execution if it exceeds a hard deadline (e.g., 10x the average).
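To illustrate the relative-ratio idea from Hint 2 without C++ scaffolding, here is a hedged TypeScript sketch using performance.now(); the function under test, the input sizes, and the threshold of 3 are illustrative:

```typescript
// Deliberately quadratic stand-in for "the code under test": a naive
// de-duplication that scans the whole array for every element.
function underTest(xs: number[]): number[] {
  return xs.filter((x, i) => xs.indexOf(x) === i);
}

// Median of a few runs to dampen measurement noise.
function medianTimeMs(f: () => void, repeats = 5): number {
  const samples: number[] = [];
  for (let i = 0; i < repeats; i++) {
    const start = performance.now();
    f();
    samples.push(performance.now() - start);
  }
  return samples.sort((a, b) => a - b)[Math.floor(repeats / 2)];
}

const n = 10_000;
const small = Array.from({ length: n }, (_, i) => i);     // all distinct values
const large = Array.from({ length: 2 * n }, (_, i) => i);

const tSmall = medianTimeMs(() => underTest(small));
const tLarge = medianTimeMs(() => underTest(large));
const ratio = tLarge / tSmall;

// For O(N) the ratio should be ~2; ~4 or more suggests O(N^2) growth.
console.log({ tSmall, tLarge, ratio });
if (ratio > 3) console.log("WARNING: growth looks worse than linear");
```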
Learning Milestones
- You can identify inputs that trigger significantly slower performance.
- You can plot a curve of "Input Size vs Time" automatically using PBT data.
- You can find a "Quicksort Killer" or "Regex Killer" input.
Project 8: The Crypto-Validator (Inverse Properties)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Rust
- Alternative Programming Languages: C, Python (using PyCryptodome)
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 3. The "Service & Support" Model
- Difficulty: Level 4: Expert
- Knowledge Area: Cryptography / Security
- Software or Tool: Custom implementation of AES, RSA, or a Hash function
- Main Book: "Serious Cryptography" by Jean-Philippe Aumasson
What you'll build: A verification suite for a cryptographic library. You'll use the "Inverse" property (decrypt(encrypt(x)) == x) and "Algebraic" properties (e.g., in RSA: encrypt(a * b) == (encrypt(a) * encrypt(b)) % n).
Why it teaches PBT: Cryptography is unforgiving. A single bit error breaks everything. This project teaches you how to use mathematical laws as properties and how to handle extremely large, structured random data (keys, nonces, padding).
Core challenges you'll face:
- Giant Integers: Generating and manipulating 2048-bit primes and numbers.
- Structured Randomness: A random key is not just any sequence of bytes; it might need to meet specific entropy or parity requirements.
- Known-Answer Tests (KAT): Integrating standard test vectors from NIST with your random PBT tests.
Key Concepts:
- Round-trip Properties: The most common type of PBT property.
- Malleability: Checking if changing a ciphertext in a specific way produces a predictable change in the plaintext.
Difficulty: Expert Time estimate: 2-3 weeks
Concepts You Must Understand First
Stop and research these before coding:
- Big Integer Arithmetic
- How are 2048-bit numbers represented and multiplied?
- Book Reference: "The Art of Computer Programming, Vol 2" Ch. 4
- Modular Exponentiation
- Why is $(A \cdot B) \pmod N = ((A \pmod N) \cdot (B \pmod N)) \pmod N$ important for RSA?
- Book Reference: "Serious Cryptography" Ch. 10
Questions to Guide Your Design
Before implementing, think through these:
- Test Vectors
- How do you know your PBT generator is actually covering the "known weak" keys (like prime factors that are too close)?
- Inverse Quality
- If decrypt(encrypt(x)) == x fails, is the bug in encrypt, decrypt, or the BigInt library?
Thinking Exercise
The Bit-Flip Property
In a stream cipher like AES-CTR, if you flip the 3rd bit of the ciphertext, the 3rd bit of the plaintext should flip.
Questions while tracing:
- Is this a property or a vulnerability?
- How would you write a PBT test to verify this "Malleability" property?
- Why would this property fail for an Authenticated Encryption mode like AES-GCM?
The Interview Questions They'll Ask
Prepare to answer these:
- "Why is PBT uniquely suited for cryptographic software?"
- "What is a 'Round-trip' property in crypto?"
- "How would you test for constant-time execution using PBT?"
- "What are Algebraic Properties in public-key cryptography?"
- "Explain how you would use PBT to find a side-channel leak."
Hints in Layers
Hint 1: Small Primes First Don't start with 2048-bit RSA. Test your logic with 8-bit or 16-bit numbers where you can verify the math manually.
Hint 2: Property Stacking
Test decrypt(encrypt(x)) == x, but also test encrypt(x) != x (it should never be identity).
Hint 3: Edge Case Generation Force your generators to produce 0, 1, $N-1$, and powers of 2. These are where big-int bugs live.
Hint 4: Test Vectors Load NIST test vectors into your runner and treat them as "fixed examples" that must always pass alongside the random tests.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Cryptography | "Serious Cryptography" | Ch. 1-4 |
| Number Theory | "Concrete Mathematics" | Ch. 4 |
Implementation Hints
Use the num-bigint crate in Rust. When generating random plaintexts, ensure they are smaller than the modulus $N$. For RSA, remember to test both PKCS#1 v1.5 and OAEP padding. The multiplicative ("homomorphic") property of textbook RSA is a great PBT test: $E(M1) \cdot E(M2) \pmod N = E(M1 \cdot M2) \pmod N$.
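The algebraic property is easy to sanity-check with toy numbers before touching a real key size. A hedged TypeScript sketch using BigInt and textbook-RSA parameters (p = 61, q = 53, e = 17, all illustrative; real code would use a crypto library and proper padding):

```typescript
// Square-and-multiply modular exponentiation over BigInt.
function modPow(base: bigint, exp: bigint, mod: bigint): bigint {
  let result = 1n;
  let b = base % mod;
  let e = exp;
  while (e > 0n) {
    if (e & 1n) result = (result * b) % mod;
    b = (b * b) % mod;
    e >>= 1n;
  }
  return result;
}

const n = 61n * 53n; // 3233, a toy modulus
const e = 17n;
const encrypt = (m: bigint) => modPow(m, e, n); // textbook RSA encryption

// Property: E(m1) * E(m2) mod n == E(m1 * m2 mod n)
for (let i = 0; i < 1000; i++) {
  const m1 = BigInt(Math.floor(Math.random() * 3233));
  const m2 = BigInt(Math.floor(Math.random() * 3233));
  const lhs = (encrypt(m1) * encrypt(m2)) % n;
  const rhs = encrypt((m1 * m2) % n);
  if (lhs !== rhs) {
    throw new Error(`Homomorphic property violated for m1=${m1}, m2=${m2}`);
  }
}
console.log("E(m1) * E(m2) mod n == E(m1 * m2 mod n) held for 1000 random pairs");
```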
Learning Milestones
- You successfully verify the round-trip property for a custom cipher.
- You identify a bug in modular arithmetic using a large random input.
- You use an algebraic property (like homomorphic multiplication) to verify an implementation.
Project 9: The Race Hunter (Concurrent PBT)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Java (using Lin-Check)
- Alternative Programming Languages: C++ (with Relacy), Elixir
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 5. The "Industry Disruptor"
- Difficulty: Level 5: Master
- Knowledge Area: Concurrency / Memory Models
- Software or Tool: A Thread-safe Queue or Concurrent HashMap
- Main Book: "The Art of Multiprocessor Programming" by Maurice Herlihy
What you'll build: A tool that executes random operations on a data structure from multiple threads simultaneously. It then uses PBT to find the exact interleaving of thread operations that leads to a race condition or a corrupted state.
Why it teaches PBT: Race conditions are notoriously hard to reproduce. Concurrent PBT (like the "Parallel State Machine" approach) turns non-deterministic bugs into deterministic ones by controlling the scheduler or searching the space of interleavings.
Core challenges you'll face:
- Interleaving Space: There are billions of ways to order instructions from two threads. How do you find the one that fails?
- Linearizability Checking: Given a log of parallel events, how do you determine if there's a "valid" sequential order that explains the result?
- Reproducing the Race: Once you find a failure, how do you force the CPU to repeat that exact timing to prove the fix?
Key Concepts:
- Linearizability: The gold standard of concurrent correctness.
- Thread Interleaving: The specific order in which different threadsâ instructions are executed.
Difficulty: Master Time estimate: 1 month+
Concepts You Must Understand First
Stop and research these before coding:
- Memory Models
- What is the difference between "Sequential Consistency" and "Weak Consistency"?
- Book Reference: "A Primer on Memory Consistency and Cache Coherence" - Sorin et al.
- Linearizability
- What does it mean for a parallel execution to be "equivalent to a serial one"?
- Book Reference: "The Art of Multiprocessor Programming" Ch. 3
Questions to Guide Your Design
Before implementing, think through these:
- Observability
- How can you record the history of events without the "Observer Effect" (where recording the bug makes the bug disappear)?
- State Space
- If two threads run 5 operations each, how many possible interleavings are there? Is it feasible to test them all?
Thinking Exercise
The Lost Update
Thread A: x = x + 1
Thread B: x = x + 1
Initial: x = 0
Questions while tracing:
- List all possible final values of x if there are no locks.
- Which of these values are "Linearizable"?
- How would your PBT tool detect the non-linearizable case?
The Interview Questions They'll Ask
Prepare to answer these:
- "What is Linearizability?"
- "How do you test a concurrent data structure?"
- "Explain the 'Parallel State Machine' approach to PBT."
- "What is a 'context switch' and how does it relate to race conditions?"
- "Can PBT prove the absence of race conditions?"
Hints in Layers
Hint 1: The Actor Pattern Think of each thread as an Actor performing random actions.
Hint 2: The Log
Every operation should return a Result and a Timestamp (or a global order ID).
Hint 3: The Linearizability Checker Take the set of parallel operations and try to find a single valid serial order (using your Sequential Model from Project 5) that produces the same results. If no such order exists, you found a race.
Hint 4: Forced Yielding
Inject Thread.yield() or sleep(0) randomly into your code to increase the chance of context switches at critical moments.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Concurrency | "The Art of Multiprocessor Programming" | Ch. 1-3 |
| Java Memory Model | "Java Concurrency in Practice" | Ch. 16 |
Implementation Hints
In Java, use the Lin-Check library. It uses "Model-based testing": you provide a thread-safe implementation and a non-thread-safe "Model," and the library runs them in parallel and compares results. For C++, look at CDSChecker; the P language from Microsoft is also worth a look for model-checking stateful systems. The hardest part is the "Linearizability Checker," which is essentially a search problem.
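To see why the checker is a search problem, here is a hedged TypeScript sketch that checks whether any ordering consistent with each thread's program order explains the observed results of a shared counter (a simplified, sequential-consistency-style check; full linearizability also constrains the order by real-time overlap):

```typescript
type Op = { kind: "incr" } | { kind: "get"; observed: number };

// Try to replay operations from all threads in some order that respects each
// thread's own sequence; the "model" is just the counter value.
function explainable(threads: Op[][]): boolean {
  function search(positions: number[], counter: number): boolean {
    if (positions.every((p, t) => p === threads[t].length)) return true;
    for (let t = 0; t < threads.length; t++) {
      const p = positions[t];
      if (p === threads[t].length) continue;
      const op = threads[t][p];
      if (op.kind === "get" && op.observed !== counter) continue; // inconsistent here
      const nextPositions = [...positions];
      nextPositions[t] = p + 1;
      if (search(nextPositions, op.kind === "incr" ? counter + 1 : counter)) return true;
    }
    return false;
  }
  return search(threads.map(() => 0), 0);
}

// Thread A incremented and then read 0: no order respecting program order
// can explain that, so this history points at a broken counter.
const history: Op[][] = [
  [{ kind: "incr" }, { kind: "get", observed: 0 }],
  [{ kind: "incr" }],
];
console.log(explainable(history)); // false
```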
Learning Milestones
- You can find race conditions in a simple counter.
- You successfully detect a "Lost Update" or "ABA Problem" in a concurrent stack.
- You can reproduce a race condition by replaying a specific interleaving of operations.
Project 10: The Metamorphic Tester (Testing the Untestable)
- File: LEARN_PROPERTY_BASED_TESTING_DEEP_DIVE.md
- Main Programming Language: Python
- Alternative Programming Languages: R, MATLAB, Julia
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 5. The "Industry Disruptor"
- Difficulty: Level 4: Expert
- Knowledge Area: Machine Learning / Scientific Computing
- Software or Tool: An Image Classifier or a Search Engine
- Main Book: "Metamorphic Testing: Theory and Practice" by T.Y. Chen
What you'll build: A test suite for a system where you don't know the "correct" answer (e.g., a search engine or an AI model). Instead of checking if search("cats") is correct, you check the relation between inputs: results(search("cats")) should be a subset of results(search("cats" OR "dogs")).
Why it teaches PBT: This is the cutting edge. It teaches you how to test systems where an "Oracle" is impossible. You learn to identify "Metamorphic Relations": rules about how the output should change when the input is transformed.
Core challenges you'll face:
- Relation Discovery: Finding useful rules (e.g., "If I rotate this image 90 degrees, the classification should not change").
- False Positives: Handling systems that are non-deterministic or "fuzzy" by nature.
- Transformations: Implementing input transformations (like image rotation or query expansion) without introducing bugs in the transformations themselves.
Key Concepts:
- Metamorphic Relation (MR): A property that relates multiple inputs and outputs.
- Symmetry: A common MR where transforming the input doesn't change the output.
Difficulty: Expert Time estimate: 2-3 weeks
Concepts You Must Understand First
Stop and research these before coding:
- Metamorphic Relations (MR)
- What are the common types of relations (Symmetry, Inclusion, Invariance)?
- Book Reference: "Metamorphic Testing: Theory and Practice" Ch. 2
- Precision vs Recall
- How do we test a "fuzzy" system like a search engine?
- Book Reference: "Data Science for Business" Ch. 7 - Foster Provost
Questions to Guide Your Design
Before implementing, think through these:
- Transformation Integrity
- If you rotate an image to test a classifier, are you sure the rotation didn't introduce artifacts that confuse the AI?
- Statistical Significance
- If a search result changes slightly, is it a bug, or just the ranking algorithm being non-deterministic?
Thinking Exercise
The Google Maps Test
Suppose you are testing a "Shortest Path" algorithm on a map.
Questions while tracing:
- If you add a new road to the map, can the shortest path between A and B get longer? (This is a Metamorphic Relation!)
- If you remove a road that was not on the shortest path, can the shortest path change?
The Interview Questions They'll Ask
Prepare to answer these:
- "What is Metamorphic Testing?"
- "When should you use Metamorphic Testing instead of standard PBT?"
- "Give an example of a Metamorphic Relation for a machine learning model."
- "How do you handle non-determinism in metamorphic properties?"
- "Explain how metamorphic testing can be used to test compilers."
Hints in Layers
Hint 1: Pick a Domain
Start with something simple like a sum() function. A metamorphic property is sum([a, b]) == sum([b, a]).
Hint 2: Image Transformations
For an image classifier, use libraries like OpenCV to flip, rotate, or change the brightness of an image. The classification should ideally stay the same.
Hint 3: Search Engine Relations
Test "Subsumption": Results(A) ⊆ Results(A OR B). If this is false, the boolean logic is broken.
Hint 4: Compilers Test "Equivalent Transformations": If you rename a local variable in a source file, the compiled binary's behavior (the output it produces) should be identical.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Metamorphic Testing | "Metamorphic Testing: Theory and Practice" | Ch. 1-3 |
| Search Algorithms | "Introduction to Information Retrieval" | Ch. 1 |
Implementation Hints
In Python, use numpy or opencv for data transformations. Define a list of Transformations and a list of Relations. For each random input, apply a transformation and check if the relation holds between the original output and the transformed output.
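A hedged TypeScript sketch of the subsumption relation from Hint 3 against a toy keyword search (the search function is a stand-in for the real system whose exact output you cannot predict; the word list and document sizes are arbitrary):

```typescript
// Toy OR-query search: a document matches if it contains any query term.
function search(docs: string[], query: string): string[] {
  const terms = query.toLowerCase().split(/\s+or\s+/);
  return docs.filter((d) => terms.some((t) => d.toLowerCase().includes(t)));
}

function randomWord(): string {
  const words = ["cat", "dog", "fish", "bird", "tree", "car"];
  return words[Math.floor(Math.random() * words.length)];
}

function randomDoc(): string {
  return Array.from({ length: 1 + Math.floor(Math.random() * 5) }, randomWord).join(" ");
}

// Metamorphic relation: results(A) must be a subset of results("A OR B"),
// even though we never know what the "correct" result set is.
for (let i = 0; i < 1000; i++) {
  const docs = Array.from({ length: 20 }, randomDoc);
  const a = randomWord();
  const b = randomWord();
  const narrow = new Set(search(docs, a));
  const wide = new Set(search(docs, `${a} OR ${b}`));
  for (const doc of narrow) {
    if (!wide.has(doc)) {
      throw new Error(`Subsumption violated: "${doc}" matched "${a}" but not "${a} OR ${b}"`);
    }
  }
}
console.log("Results(A) ⊆ Results(A OR B) held for 1000 random cases");
```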
Learning Milestones
- You can define a metamorphic relation for a search engine.
- You successfully identify a "Symmetry" bug in a machine learning model.
- You can test a complex system without knowing the expected output for any single test case.
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Arbitrary Factory | Level 1 | Weekend | Medium | High |
| 2. Invariant Runner | Level 2 | 1 Week | Medium | Medium |
| 3. Shrinking Engine | Level 3 | 2 Weeks | High | High |
| 4. Sorting Oracle | Level 2 | 1 Week | Medium | Low |
| 5. Stateful User | Level 4 | 3 Weeks | Very High | High |
| 6. API Fuzzer | Level 2 | 1 Week | Medium | Medium |
| 7. Complexity Watchdog | Level 3 | 2 Weeks | High | Medium |
| 8. Crypto Validator | Level 4 | 2 Weeks | High | Low |
| 9. Race Hunter | Level 5 | 1 Month | Maximum | High |
| 10. Metamorphic Tester | Level 4 | 3 Weeks | High | High |
Recommendation
If you are new to PBT: Start with Project 1 (The Arbitrary Factory). Understanding how data is generated and composed is the prerequisite for everything else. Once you can generate a random User or Tree, you've unlocked the mindset.
If you are a Systems Engineer: Jump straight to Project 9 (The Race Hunter). It's the most challenging, but solving a "one-in-a-million" race condition with a script you wrote is the ultimate engineering dopamine hit.
Final Overall Project: The "Bulletproof" Compiler
What you'll build: A small compiler or interpreter for a Lisp-like language. To prove its correctness, you will build a PBT suite that uses:
- Generators for random valid source code.
- Stateful Testing to simulate a debugger stepping through the code.
- An Oracle (an existing Lisp interpreter) to verify the results.
- Shrinking to turn a giant failing program into a 3-line bug report.
This project combines everything: data generation, oracle verification, state management, and the "magic" of shrinking.
Summary
This learning path covers Property-Based Testing through 10 hands-on projects. Here's the complete list:
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | The "Arbitrary" Factory | TypeScript | Level 1 | Weekend |
| 2 | The Invariant Runner | Python | Level 2 | 1 Week |
| 3 | The Shrinking Engine | Rust | Level 3 | 2 Weeks |
| 4 | The Oracle | Go | Level 2 | 1 Week |
| 5 | The Virtual User | Elixir | Level 4 | 3 Weeks |
| 6 | The API Fuzzer | Python | Level 2 | 1 Week |
| 7 | The Complexity Watchdog | C++ | Level 3 | 2 Weeks |
| 8 | The Crypto-Validator | Rust | Level 4 | 2 Weeks |
| 9 | The Race Hunter | Java | Level 5 | 1 Month+ |
| 10 | The Metamorphic Tester | Python | Level 4 | 3 Weeks |
Recommended Learning Path
For beginners: Start with projects #1, #2, #6.
For intermediate: Jump to projects #3, #4, #7.
For advanced: Focus on projects #5, #8, #9, #10.
Expected Outcomes
After completing these projects, you will:
- Think in terms of universal invariants rather than specific cases.
- Be able to build sophisticated data generators for any domain.
- Master the shrinking algorithms that make automated testing useful.
- Find race conditions and temporal bugs that manual testing misses.
- Verify complex algorithms and security protocols with mathematical rigor.
You'll have built 10 working projects that demonstrate deep understanding of Property-Based Testing from first principles.