Project 6: Statistical Arbitrage “Pairs Trading” Backtester
Build a backtester for a mean-reversion pairs strategy.
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1-2 weeks |
| Main Language | Python |
| Alternative Languages | R, JavaScript |
| Knowledge Area | Mean reversion |
| Tools | Time series data |
| Main Book | “Algorithmic Trading” by Ernest Chan |
What you’ll build: A system that identifies a pair, computes the spread, and trades when it diverges.
Why it teaches quant: It forces you to model relationships, not just single-asset trends.
Core challenges you’ll face:
- Normalizing prices into a spread
- Detecting mean reversion signals
- Managing entry/exit thresholds
Real World Outcome
You will run a backtest and see trade logs, equity curves, and spread charts.
Example Output:
$ python pairs.py --a KO --b PEP
Trades: 18
Total return: 9.3%
Saved spread chart to charts/KO_PEP_spread.png
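The command line above can be parsed with a few lines of `argparse`. This is a minimal sketch: only `--a` and `--b` come from the example command, while `--entry-z`, `--exit-z`, and the helper `chart_path` are hypothetical additions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the two ticker flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Pairs trading backtester")
    parser.add_argument("--a", required=True, help="ticker of the first leg, e.g. KO")
    parser.add_argument("--b", required=True, help="ticker of the second leg, e.g. PEP")
    # Hypothetical extras (not in the example command): z-score thresholds.
    parser.add_argument("--entry-z", type=float, default=2.0)
    parser.add_argument("--exit-z", type=float, default=0.5)
    return parser.parse_args(argv)


def chart_path(a, b):
    """Build the chart filename in the style of the example output."""
    return f"charts/{a}_{b}_spread.png"
```

Calling `parse_args(["--a", "KO", "--b", "PEP"])` returns a namespace with `a`, `b`, `entry_z`, and `exit_z` attributes.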
Verification steps:
- Check spread stationarity visually: the spread should oscillate around a stable mean rather than drift
- Confirm trades trigger on threshold crossings
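A visual check can be complemented by a rough numerical one. The sketch below estimates a mean-reversion half-life by regressing the spread's one-step change on its lagged level. This is a common heuristic, not a formal stationarity test, and the function name and the use of numpy are assumptions made for illustration.

```python
import numpy as np


def mean_reversion_half_life(spread):
    """Estimate half-life from the fit  s[t] - s[t-1] = alpha + lam * s[t-1] + noise."""
    s = np.asarray(spread, dtype=float)
    lagged = s[:-1]          # s[t-1]
    delta = np.diff(s)       # s[t] - s[t-1]
    lam, _alpha = np.polyfit(lagged, delta, 1)  # slope first, then intercept
    if lam >= 0:
        return float("inf")  # no pull back toward the mean detected
    # Continuous-time approximation of the decay half-life, in bars.
    return -np.log(2) / lam
```

A short half-life relative to your holding horizon suggests the spread reverts fast enough to trade; an infinite or very long one suggests it behaves more like a trend or random walk.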
The Core Question You’re Answering
“How do you trade relative value instead of absolute direction?”
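Pairs trading bets that the price relationship between two related assets will revert to its historical norm, so profit depends on the spread between them rather than on the direction of the market.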
Concepts You Must Understand First
Stop and research these before coding:
- Spread construction
  - How do you build a spread using hedge ratios?
  - Book Reference: “Algorithmic Trading” by Ernest Chan, Ch. 9
- Mean reversion
  - What does stationarity mean in practice?
  - Book Reference: “Quantitative Trading” by Ernest Chan, Ch. 6
- Z-scores
  - Why use standardized deviations as signals?
  - Book Reference: “Algorithmic Trading” by Ernest Chan, Ch. 9
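To make the spread and z-score concepts concrete, here is a minimal sketch. It assumes the spread is defined as price A minus the hedge ratio times price B, and it standardizes with full-sample statistics, which is fine for exploration but uses future data and so is not suitable inside a backtest. The function name is hypothetical.

```python
import numpy as np


def spread_and_zscore(price_a, price_b, hedge_ratio):
    """Return the spread A - hedge_ratio * B and its full-sample z-score."""
    a = np.asarray(price_a, dtype=float)
    b = np.asarray(price_b, dtype=float)
    spread = a - hedge_ratio * b
    # Full-sample mean/std: exploratory only, since it looks at the whole history.
    z = (spread - spread.mean()) / spread.std(ddof=1)
    return spread, z
```

The z-score expresses how many standard deviations the spread sits from its mean, which makes thresholds comparable across pairs with different price levels.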
Questions to Guide Your Design
- Pair selection
  - Will you pick pairs manually or screen for them statistically (for example, by correlation or cointegration)?
  - How will you test stability over time?
- Risk control
  - How will you size each leg?
  - What stop-loss rules will you use?
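One possible answer to the sizing and stop-loss questions is sketched below. It assumes the spread is A minus hedge ratio times B, splits a gross capital budget across both legs in that ratio, and stops out when the z-score moves further against the position. The function names and the default `stop_z=3.5` are illustrative assumptions, not recommendations.

```python
def size_legs(capital, price_a, price_b, hedge_ratio, direction):
    """Size both legs so gross exposure equals `capital`.

    direction=+1 is long the spread (long A, short B); -1 is the reverse.
    Returns signed share counts (fractional shares allowed in this sketch).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    # One 'unit' of spread holds 1 share of A and hedge_ratio shares of B.
    units = capital / (price_a + abs(hedge_ratio) * price_b)
    shares_a = direction * units
    shares_b = -direction * hedge_ratio * units
    return shares_a, shares_b


def stop_hit(z, stop_z=3.5):
    """True when the spread has diverged beyond the stop threshold."""
    return abs(z) >= stop_z
```

For example, `size_legs(10000, 60, 40, 1.0, 1)` buys 100 shares of A and shorts 100 shares of B, for a gross exposure of 10,000.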
Thinking Exercise
Spread Signal
Given a spread series with mean 0 and std 1, what does a z-score of +2 imply? When should you enter?
Questions while working:
- Why is mean reversion assumed?
- What happens if the spread regime shifts?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What is a pairs trading strategy?”
- “Why do you use z-scores?”
- “How do you choose the hedge ratio?”
- “What can break mean reversion?”
- “How do you control risk in a spread trade?”
Hints in Layers
Hint 1 (Starting Point): Start with a fixed hedge ratio and a simple z-score rule.
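A simple z-score rule can be written as a small state machine. The sketch below assumes an entry threshold of 2.0 and an exit threshold of 0.5; both values and the function name are illustrative choices.

```python
def zscore_positions(z_values, entry_z=2.0, exit_z=0.5):
    """Map a z-score series to spread positions: +1 long, -1 short, 0 flat."""
    position = 0
    positions = []
    for z in z_values:
        if position == 0:
            if z >= entry_z:
                position = -1   # spread is rich: short it
            elif z <= -entry_z:
                position = 1    # spread is cheap: buy it
        elif abs(z) <= exit_z:
            position = 0        # spread is back near its mean: close
        positions.append(position)
    return positions
```

Using a separate, smaller exit threshold avoids flipping in and out of a trade when the z-score hovers around the entry level.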
Hint 2 (Next Level): Estimate the hedge ratio by regressing one price series on the other.
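A least-squares fit of price A on price B gives one estimate of the hedge ratio. This sketch uses `numpy.polyfit` with degree 1; the function name is hypothetical, and in a backtest the fit should use only a formation period that precedes the trading period.

```python
import numpy as np


def ols_hedge_ratio(price_a, price_b):
    """Fit price_a ~ intercept + beta * price_b and return (beta, intercept)."""
    a = np.asarray(price_a, dtype=float)
    b = np.asarray(price_b, dtype=float)
    beta, intercept = np.polyfit(b, a, 1)  # slope first, then intercept
    return beta, intercept
```

The slope `beta` is the number of units of B to hold against each unit of A. Regressing B on A instead generally gives a different ratio, so pick one convention and keep it.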
Hint 3 (Technical Details): Use rolling windows to update the spread's mean and standard deviation.
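With pandas, a rolling z-score takes only a few lines. The sketch below assumes a 20-bar window (an arbitrary choice) and a hypothetical function name. Each value uses only the current and earlier observations, so the first `window - 1` entries are NaN.

```python
import pandas as pd


def rolling_zscore(spread, window=20):
    """Z-score of the spread against its trailing rolling mean and std."""
    s = pd.Series(spread, dtype=float)
    mean = s.rolling(window).mean()
    std = s.rolling(window).std()  # sample standard deviation (ddof=1)
    return (s - mean) / std
```

Because the window is trailing, appending new data never changes earlier z-scores, which is the property you want in a backtest.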
Hint 4 (Tools/Debugging): Plot the spread and highlight entry/exit points.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Pairs trading | “Algorithmic Trading” by Ernest Chan | Ch. 9 |
| Mean reversion | “Quantitative Trading” by Ernest Chan | Ch. 6 |
| Z-scores | “Algorithmic Trading” by Ernest Chan | Ch. 9 |
Implementation Hints
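- Normalize the data (for example, use adjusted prices) and align both series on common trading dates, dropping days where either leg is missing.
- Compute rolling statistics from past data only, and act on a signal no earlier than the next bar, to avoid lookahead bias.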
- Keep trade logs detailed for analysis.
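The lookahead and logging hints can be combined in a small P&L routine. This is a sketch under simplifying assumptions: P&L is measured per unit of spread, transaction costs are ignored, and a position decided at bar t only earns the spread change from t to t+1. The function name and log fields are hypothetical.

```python
import numpy as np


def backtest_spread(spread, positions):
    """Return (cumulative P&L per unit of spread, log of position changes)."""
    s = np.asarray(spread, dtype=float)
    pos = np.asarray(positions, dtype=float)
    # Shift positions by one bar: today's P&L comes from yesterday's decision.
    held = np.concatenate(([0.0], pos[:-1]))
    change = np.concatenate(([0.0], np.diff(s)))
    equity = (held * change).cumsum()

    trade_log = []
    prev = 0.0
    for t, p in enumerate(pos):
        if p != prev:  # record every change of position
            trade_log.append(
                {"bar": t, "from": int(prev), "to": int(p), "spread": float(s[t])}
            )
            prev = p
    return equity, trade_log
```

Shorting the spread at 2 and closing at 0, as in `backtest_spread([0, 2, 1, 0, 0.5], [0, -1, -1, 0, 0])`, ends with a cumulative P&L of 2 and two log entries (one entry, one exit).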
Learning Milestones
- First milestone: You can compute and plot spreads.
- Second milestone: You can trigger trades from z-scores.
- Final milestone: You can explain when pairs trading fails.