Project 4: Trading Strategy Backtester

Build a backtesting engine that replays historical market data, simulates order execution with realistic fills and slippage, and calculates performance metrics - the essential tool every quantitative trader uses.

Quick Reference

Attribute        Value
───────────────────────────────────────────────────────────────────────────────
Difficulty       Intermediate
Time Estimate    2-3 weeks
Language         Rust or C++ (Python for comparison)
Prerequisites    Basic statistics, order book understanding
Key Topics       Event-driven architecture, market microstructure, quant metrics
Main Book        “Designing Data-Intensive Applications” by Martin Kleppmann

1. Learning Objectives

After completing this project, you will be able to:

  1. Design event-driven systems - Build an architecture where market events drive strategy execution in proper chronological order
  2. Understand market microstructure - Model how orders actually get filled in real markets, including slippage, spread, and market impact
  3. Implement realistic fill simulation - Create execution models that don’t lie about strategy performance
  4. Calculate quantitative metrics - Compute Sharpe ratio, Sortino ratio, maximum drawdown, and other risk-adjusted returns
  5. Process large datasets efficiently - Use memory-mapped files and streaming to handle gigabytes of tick data
  6. Identify backtesting biases - Recognize and avoid lookahead bias, survivorship bias, and overfitting
  7. Build modular strategy interfaces - Design clean abstractions that separate strategy logic from execution infrastructure
  8. Profile and optimize data processing - Make your backtester fast enough to test thousands of parameter combinations
  9. Generate actionable reports - Produce trade logs, P&L curves, and performance summaries
  10. Bridge theory and practice - Connect academic finance concepts to working code

2. Theoretical Foundation

2.1 Core Concepts

Event-Driven Backtesting

The fundamental architecture decision in backtesting is event-driven vs. vectorized processing:

VECTORIZED BACKTESTING (NumPy/Pandas Style)
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  prices = load_all_prices()                    # Load entire    │
│  signals = strategy(prices)                     # history       │
│  returns = prices.pct_change() * signals.shift(1)  # Vector ops │
│  sharpe = returns.mean() / returns.std()                        │
│                                                                 │
│  Problem: This is CHEATING!                                     │
│  - strategy() can see future prices                             │
│  - No fill simulation (assumes instant execution)               │
│  - Cannot model order book dynamics                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

EVENT-DRIVEN BACKTESTING (What You'll Build)
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  Event Queue (time-ordered)                                     │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ 09:30:00.123 QUOTE AAPL bid=150.25 ask=150.26            │  │
│  │ 09:30:00.124 TRADE AAPL price=150.26 qty=100             │  │
│  │ 09:30:00.125 QUOTE AAPL bid=150.26 ask=150.27            │  │
│  │ 09:30:00.126 ORDER_FILL (your order) price=150.27        │  │
│  │ ...                                                       │  │
│  └───────────────────────────────────────────────────────────┘  │
│                          │                                       │
│                          ▼                                       │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │    Strategy     │──│   Execution     │──│   Portfolio     │  │
│  │    Engine       │  │   Simulator     │  │    Tracker      │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
│                                                                 │
│  Key: Strategy ONLY sees events up to "now" in the simulation   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Why Event-Driven Matters:

  • Strategy receives information in the same order as live trading
  • Strategy code cannot read future events, which removes the most common source of lookahead bias
  • Fill simulation happens at the correct timestamp
  • Supports complex order types and partial fills
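
The time-ordered queue in the diagram above can be sketched with a binary heap. This is a minimal illustration, not the project's required design: `EventKind`, `EventQueue`, and `replay` are hypothetical names, timestamps are assumed to be integer microseconds, and an insertion counter keeps events with equal timestamps in arrival order.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Simplified event tag; a real engine would carry prices and sizes too.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EventKind {
    Quote,
    Trade,
    OrderFill,
}

/// Min-heap of events keyed by (timestamp, insertion sequence).
#[derive(Default)]
pub struct EventQueue {
    heap: BinaryHeap<Reverse<(u64, u64, EventKind)>>,
    next_seq: u64,
}

impl EventQueue {
    pub fn new() -> Self {
        Self::default()
    }

    /// Schedule an event at `timestamp_us` (assumed: microseconds).
    pub fn push(&mut self, timestamp_us: u64, kind: EventKind) {
        // `Reverse` turns the max-heap into a min-heap on the tuple.
        self.heap.push(Reverse((timestamp_us, self.next_seq, kind)));
        self.next_seq += 1;
    }

    /// Remove and return the earliest event, or None when the queue is empty.
    pub fn pop(&mut self) -> Option<(u64, EventKind)> {
        self.heap.pop().map(|Reverse((ts, _seq, kind))| (ts, kind))
    }
}

/// Drain the queue in time order, handing each event to `handler`.
/// The handler never sees an event later than the one it is given.
pub fn replay<F: FnMut(u64, EventKind)>(queue: &mut EventQueue, mut handler: F) {
    while let Some((ts, kind)) = queue.pop() {
        handler(ts, kind);
    }
}
```

Because simulated fills are pushed into the same queue as market data, a fill scheduled for "now + latency" is delivered at the right point in the stream without any special casing.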

Fill Simulation

When your strategy decides to buy, what price do you actually get? This is the hardest problem in backtesting:

THE FILL PROBLEM
════════════════════════════════════════════════════════════════════

Your strategy says: "BUY 1000 shares of AAPL at market"

Current order book:
┌───────────────────────────────────────────────────────────────┐
│         BIDS              │           ASKS                    │
├───────────────────────────┼───────────────────────────────────┤
│  150.25 x 500             │  150.26 x 200  ◀── Best ask       │
│  150.24 x 300             │  150.27 x 400                     │
│  150.23 x 1000            │  150.28 x 600                     │
│  150.22 x 800             │  150.29 x 500                     │
└───────────────────────────┴───────────────────────────────────┘

Naive fill: "You got filled at 150.26" (best ask)
            Cost = 1000 x 150.26 = $150,260

Reality:    You WALK UP the book!
            - 200 shares @ 150.26 = $30,052
            - 400 shares @ 150.27 = $60,108
            - 400 shares @ 150.28 = $60,112
            ────────────────────────────────
            Total = $150,272
            Average = $150.272/share (0.008% worse!)

SLIPPAGE = Reality - Naive = $12 per 1000 shares

At 10,000 trades/year: $120,000 difference!
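
The book walk above can be reproduced in a few lines. The sketch below is one possible shape, with hypothetical names (`Level`, `FillSummary`, `walk_asks`, `slippage_cents`); prices are held as integer cents so the arithmetic is exact, and the visible book is assumed to stay static while the order sweeps it.

```rust
/// One price level on the ask side: price in cents, shares available.
#[derive(Debug, Clone, Copy)]
pub struct Level {
    pub price_cents: u64,
    pub qty: u64,
}

/// Result of sweeping the book with a market buy.
#[derive(Debug, PartialEq, Eq)]
pub struct FillSummary {
    pub filled_qty: u64,
    pub cost_cents: u64,
}

/// Walk ask levels (best price first) until `qty` shares are bought
/// or the visible book is exhausted.
pub fn walk_asks(asks: &[Level], qty: u64) -> FillSummary {
    let mut remaining = qty;
    let mut cost_cents = 0u64;
    for level in asks {
        if remaining == 0 {
            break;
        }
        let take = remaining.min(level.qty);
        cost_cents += take * level.price_cents;
        remaining -= take;
    }
    FillSummary {
        filled_qty: qty - remaining,
        cost_cents,
    }
}

/// Extra cost versus the naive "everything fills at best ask" assumption.
/// Returns None when the book is empty.
pub fn slippage_cents(asks: &[Level], qty: u64) -> Option<u64> {
    let best = asks.first()?.price_cents;
    let fill = walk_asks(asks, qty);
    Some(fill.cost_cents - fill.filled_qty * best)
}
```

Fed the four ask levels from the example with a 1,000-share order, `walk_asks` gives a cost of 15,027,200 cents ($150,272) and `slippage_cents` gives 1,200 cents ($12), matching the figures above.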

Market Impact Model:

Your orders affect the market. Large orders move prices:

MARKET IMPACT VISUALIZATION
═══════════════════════════════════════════════════════════════

Price before your buy order:
    |
    |     ●────────────────
    |                       ← 150.26
    └────────────────────────────►  time

Price after your buy order:
    |                  ●────
    |     ●───────────/      ← 150.28 (market moved!)
    |                       ← 150.26
    └────────────────────────────►  time
      ↑               ↑
  Order placed    Order filled

Your impact = 150.28 - 150.26 = $0.02
Future orders start at higher price

Common Impact Models:

  1. Linear impact: Impact = k * (volume / ADV)
    • ADV = Average Daily Volume
    • Corresponds to Kyle’s lambda from market microstructure theory
  2. Square root law: Impact = k * sqrt(volume / ADV)
    • k = market-specific constant (typically 0.1-0.5)
    • Empirically validated across many markets
  3. Almgren-Chriss model: Optimal execution trajectories
    • Considers temporary vs permanent impact
    • Used by institutional traders for large orders
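
As a rough sketch, the two simplest functional forms, one linear and one square-root in participation (volume / ADV), can be written as plain functions. The names `linear_impact` and `sqrt_impact` are hypothetical, the result is in whatever units `k` was calibrated in, and `adv` is assumed to be positive.

```rust
/// Linear form: impact grows in proportion to participation (volume / ADV).
/// Assumes `adv > 0`.
pub fn linear_impact(k: f64, order_volume: f64, adv: f64) -> f64 {
    k * (order_volume / adv)
}

/// Square-root form: impact grows with the square root of participation.
/// Assumes `adv > 0`.
pub fn sqrt_impact(k: f64, order_volume: f64, adv: f64) -> f64 {
    k * (order_volume / adv).sqrt()
}
```

The practical difference shows up when sizing up: under the square-root form, quadrupling the order only doubles the impact, while under the linear form it quadruples it.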

Performance Metrics

The goal of backtesting is not just “did we make money?” but “were the returns worth the risk we took to earn them?”

KEY PERFORMANCE METRICS
═══════════════════════════════════════════════════════════════

SHARPE RATIO - Risk-adjusted excess return
┌────────────────────────────────────────────────────────────┐
│                                                            │
│              E[R - Rf]         Mean excess return          │
│  Sharpe = ────────────── = ──────────────────────────      │
│            σ(R - Rf)       Std dev of excess return        │
│                                                            │
│  Where: R = strategy returns                               │
│         Rf = risk-free rate (usually ~4% annual)           │
│                                                            │
│  Interpretation:                                           │
│    < 0.5  : Poor                                           │
│    0.5-1  : Good for most strategies                       │
│    1-2    : Very good                                      │
│    > 2    : Exceptional (or likely overfitted!)            │
│    > 3    : Almost certainly a bug                         │
│                                                            │
└────────────────────────────────────────────────────────────┘

SORTINO RATIO - Only penalizes downside volatility
┌────────────────────────────────────────────────────────────┐
│                                                            │
│              E[R - Rf]                                     │
│  Sortino = ───────────────                                 │
│             σ(downside)                                    │
│                                                            │
│  Where: downside = returns below threshold (usually 0)     │
│                                                            │
│  Why it matters:                                           │
│  - Sharpe penalizes UPSIDE volatility (good variance!)     │
│  - Sortino only counts BAD variance                        │
│  - Sortino > Sharpe indicates positively skewed returns    │
│                                                            │
└────────────────────────────────────────────────────────────┘

MAXIMUM DRAWDOWN - Largest peak-to-trough decline
┌────────────────────────────────────────────────────────────┐
│                                                            │
│  Cumulative P&L                                            │
│        │                                                   │
│   150k │        ●─────────●  ← Peak = $150k                │
│        │       /           \                               │
│   100k │      /             \                              │
│        │     /               \   ● ← Trough = $90k         │
│    90k │    /                 \ /                          │
│        │   ●                   ●                           │
│    50k │  /                                                │
│        └──────────────────────────────► time               │
│                                                            │
│  Max Drawdown = (150k - 90k) / 150k = 40%                  │
│                                                            │
│  Recovery: How long to reach $150k again?                  │
│                                                            │
│  Interpretation:                                           │
│    < 10%  : Conservative (bond-like)                       │
│    10-20% : Typical for diversified equity strategies      │
│    20-50% : Aggressive (a 50% loss needs +100% to recover) │
│    > 50%  : Extreme (crypto, leverage, concentrated)       │
│                                                            │
└────────────────────────────────────────────────────────────┘

WIN RATE AND PROFIT FACTOR
┌────────────────────────────────────────────────────────────┐
│                                                            │
│  Win Rate = Winning Trades / Total Trades                  │
│                                                            │
│  Profit Factor = Gross Profits / Gross Losses              │
│                                                            │
│  Example:                                                  │
│    100 trades: 55 winners (+$50 avg), 45 losers (-$40 avg) │
│    Win Rate = 55%                                          │
│    Profit Factor = (55 × 50) / (45 × 40) = 1.53            │
│                                                            │
│  A strategy with 40% win rate can still be profitable!     │
│    If avg_win = $3 and avg_loss = $1:                      │
│    Expected value = 0.4 × 3 + 0.6 × (-1) = +$0.60/trade    │
│                                                            │
└────────────────────────────────────────────────────────────┘
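
The Sharpe, Sortino, and drawdown definitions above translate fairly directly into code. The sketch below is one reasonable reading, not the only one: the function names are hypothetical, ratios are per period (not annualized), the Sharpe denominator uses the sample standard deviation, and the Sortino downside deviation is taken as the root mean square of returns below 0 across all periods.

```rust
/// Arithmetic mean; returns None for an empty slice.
fn mean(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }
    Some(xs.iter().sum::<f64>() / xs.len() as f64)
}

/// Per-period Sharpe ratio: mean excess return over its sample std dev.
pub fn sharpe(returns: &[f64], rf_per_period: f64) -> Option<f64> {
    if returns.len() < 2 {
        return None;
    }
    let excess: Vec<f64> = returns.iter().map(|&r| r - rf_per_period).collect();
    let m = mean(&excess)?;
    let var = excess.iter().map(|&x| (x - m).powi(2)).sum::<f64>()
        / (excess.len() - 1) as f64;
    if var == 0.0 {
        None
    } else {
        Some(m / var.sqrt())
    }
}

/// Per-period Sortino ratio with a downside threshold of 0.
pub fn sortino(returns: &[f64], rf_per_period: f64) -> Option<f64> {
    let excess: Vec<f64> = returns.iter().map(|&r| r - rf_per_period).collect();
    let m = mean(&excess)?;
    // Only negative returns contribute; positive ones count as zero.
    let downside_sq = returns.iter().map(|&r| r.min(0.0).powi(2)).sum::<f64>()
        / returns.len() as f64;
    if downside_sq == 0.0 {
        None
    } else {
        Some(m / downside_sq.sqrt())
    }
}

/// Maximum peak-to-trough decline of an equity curve, as a fraction of the peak.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &e in equity {
        peak = peak.max(e);
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}
```

With daily returns, the usual convention is to multiply the per-period ratios by sqrt(252) to annualize them. An equity curve that peaks at 150,000 and then falls to 90,000 gives `max_drawdown` of 0.4, the 40% from the diagram.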

2.2 Why This Matters

The Backtest-to-Live Gap

Every professional trader knows this truth:

"Backtests are not predictions. They are sanity checks."
                                    - Every quant, eventually

Your backtester must be realistic enough that:

  1. Profitable backtests have a reasonable chance of being profitable live
  2. Unprofitable backtests definitively kill bad ideas
  3. You understand WHY a strategy works (or doesn’t)

What Goes Wrong Without Realistic Backtesting:

THE BACKTEST LIE PROGRESSION
═══════════════════════════════════════════════════════════════

Month 1 (Backtest Development):
  "My strategy returns 500% per year with 0.8 Sharpe!"
  └── Reality: You're overfitting to historical patterns

Month 2 (Paper Trading):
  "Hmm, returns are only 200%... must be bad luck"
  └── Reality: Your fill assumptions were too optimistic

Month 3 (Live Trading with Small Size):
  "50% returns, still profitable! Scaling up..."
  └── Reality: Your orders are small enough to hide impact

Month 4 (Live Trading with Real Size):
  "Wait, I'm LOSING money? But the backtest..."
  └── Reality: Market impact + slippage + latency = death

The fix: Build a backtester that breaks your illusions BEFORE
you lose real money.

2.3 Historical Context

The Evolution of Backtesting

1980s: Spreadsheet Era
├── Manual calculations in Lotus 1-2-3
├── Daily data only (end-of-day closes)
├── No fill simulation
└── Survivorship bias everywhere

1990s: TradeStation/MetaStock Era
├── First automated backtesting platforms
├── Minute bars become available
├── Still no realistic execution modeling
└── Retail traders start overfitting

2000s: Quantitative Revolution
├── Institutional-grade platforms (MATLAB, R)
├── Tick data becomes accessible
├── Market microstructure research explodes
├── Almgren-Chriss model published; Kyle's 1985 model widely applied
└── HFT firms build custom backtesters

2010s: Open Source Era
├── Zipline, Backtrader, QuantConnect
├── Cloud computing enables large-scale testing
├── Machine learning applied to alpha research
├── Execution quality becomes differentiator
└── Slippage modeling goes mainstream

2020s: Realistic Simulation Era
├── Full order book replay
├── Latency simulation at microsecond level
├── Agent-based market simulators
├── Your backtester: somewhere in this progression
└── Goal: Be honest about what you DON'T know

2.4 Common Misconceptions

Misconception 1: “My Sharpe is 3.0, I found alpha!”

SHARPE RATIO SKEPTICISM GUIDE
═══════════════════════════════════════════════════════════════

Reported Sharpe    Probability of Being Real (roughly)
──────────────────────────────────────────────────────────
0.5                High - this is a typical good strategy
1.0                Good - still believable
1.5                Skeptical - need strong out-of-sample proof
2.0                Very skeptical - check for bugs/biases
3.0                Almost certainly wrong
5.0+               Definitely a bug (or your strategy prints money)

Why high Sharpes are suspicious:
1. Overfitting to in-sample data
2. Survivorship bias in your universe
3. Fill assumption errors (assuming perfect fills)
4. Lookahead bias (using future data)
5. Transaction cost underestimation
6. Selection bias (you tested 1000 strategies, this is the winner)

Misconception 2: “The backtest was profitable, so it will work live”

BACKTEST vs LIVE DISCREPANCIES
═══════════════════════════════════════════════════════════════

Factor              Backtest          Reality
───────────────────────────────────────────────────────────────
Order fill price    Best bid/ask      Walk up/down the book
Fill probability    100%              Maybe 60-80% on limits
Execution latency   0 nanoseconds     10 microseconds - 10ms
Market impact       Usually ignored   0.1-1% for large orders
Data quality        Clean, adjusted   Errors, splits, gaps
Bid-ask spread      Often ignored     Real cost on every trade
Slippage            Underestimated    2-10x what you modeled
Market regimes      All look alike    Some are untradeable

Misconception 3: “More data = better backtest”

THE CURSE OF LONG BACKTESTS
═══════════════════════════════════════════════════════════════

Too short (< 1 year):
├── Not enough data for statistical significance
├── Might just be a bull/bear market effect
└── Seasonal patterns not captured

Too long (> 10 years):
├── Market structure has changed dramatically
├── Your edge may have been arbitraged away
├── Different regulatory environment
├── Different technology (2010 HFT != 2024 HFT)
└── Risk of overfitting to ancient patterns

Sweet spot (3-5 years):
├── Multiple market regimes included
├── Enough trades for significance
├── Market structure still relevant
└── Out-of-sample period available

Misconception 4: “Lookahead bias is obvious, I’d never do that”

SUBTLE LOOKAHEAD BIAS EXAMPLES
═══════════════════════════════════════════════════════════════

Obvious (you'd catch this):
  if tomorrow_price > today_price:
      buy()

Subtle (often missed):
  1. Using adjusted close prices
     - Stock splits are applied retroactively
     - Using "close" vs "adj_close" at wrong time

  2. Index membership
     - Backtesting S&P 500 stocks... but using TODAY's list
     - In 2010, you wouldn't know which stocks survive to 2024

  3. Universe selection
     - "I only trade liquid stocks" (>1M shares/day)
     - But you're using liquidity calculated over FUTURE data

  4. Parameter selection
     - "My 20-day moving average works great!"
     - Did you test 5, 10, 15, 20, 25, 30 day MAs first?
     - If so, you've looked ahead to find optimal params

  5. Corporate actions
     - Earnings dates, dividend dates, M&A announcements
     - When exactly was this information PUBLIC?

3. Project Specification

3.1 What You Will Build

A high-performance backtesting engine that:

  1. Replays historical market data in chronological order
  2. Simulates order execution with configurable slippage models
  3. Tracks positions and P&L in real-time during simulation
  4. Calculates risk-adjusted performance metrics
  5. Generates detailed trade logs for analysis
  6. Processes gigabytes of data efficiently using memory-mapped files

BACKTESTER ARCHITECTURE OVERVIEW
═══════════════════════════════════════════════════════════════

                    ┌─────────────────────────────┐
                    │     Historical Data         │
                    │  (CSV, Binary, or Memory)   │
                    └─────────────┬───────────────┘
                                  │
                                  ▼
                    ┌─────────────────────────────┐
                    │      Data Loader            │
                    │  • Memory-mapped files      │
                    │  • Streaming iterator       │
                    │  • Normalization            │
                    └─────────────┬───────────────┘
                                  │
                                  ▼
    ┌─────────────────────────────────────────────────────────┐
    │                     EVENT ENGINE                         │
    │  ┌───────────────────────────────────────────────────┐  │
    │  │              Priority Queue (by timestamp)        │  │
    │  │  • Market data events                             │  │
    │  │  • Order events (new, cancel, fill)               │  │
    │  │  • Timer events (for strategy scheduling)         │  │
    │  └────────────────────────┬──────────────────────────┘  │
    └───────────────────────────┬─────────────────────────────┘
                                │
              ┌─────────────────┼─────────────────┐
              ▼                 ▼                 ▼
    ┌─────────────────┐  ┌─────────────┐  ┌─────────────────┐
    │   STRATEGY      │  │  EXECUTION  │  │   PORTFOLIO     │
    │                 │  │  SIMULATOR  │  │                 │
    │ • on_quote()    │  │             │  │ • Positions     │
    │ • on_trade()    │  │ • Slippage  │  │ • Cash          │
    │ • on_fill()     │  │ • Latency   │  │ • P&L           │
    │                 │  │ • Impact    │  │ • Risk limits   │
    └────────┬────────┘  └──────┬──────┘  └────────┬────────┘
             │                  │                   │
             │   Submit Order   │                   │
             └─────────────────►│                   │
                                │ Fill occurred     │
                                └──────────────────►│
                                                    │
    ┌───────────────────────────────────────────────┼─────────┐
    │                   METRICS ENGINE              ▼         │
    │  • Trade log      • Sharpe/Sortino   • P&L curve       │
    │  • Win rate       • Max drawdown     • Daily returns   │
    └─────────────────────────────────────────────────────────┘
                                │
                                ▼
                    ┌─────────────────────────────┐
                    │        OUTPUT               │
                    │  • Summary statistics       │
                    │  • Trade log CSV            │
                    │  • P&L curve CSV            │
                    │  • Performance report       │
                    └─────────────────────────────┘

3.2 Functional Requirements

F1: Data Loading

  • Load historical bar data (OHLCV) from CSV files
  • Load historical tick data (quote + trade) from CSV or binary
  • Support memory-mapped file access for large datasets (>1GB)
  • Validate data integrity (timestamps ordered, no missing bars)
  • Handle different timezones and market hours
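
A minimal sketch of bar parsing and integrity checking follows. The column order `timestamp_us,open,high,low,close,volume`, the microsecond integer timestamp, and the names `Bar`, `parse_bar`, and `first_out_of_order` are all assumptions for illustration; real vendor files will differ.

```rust
/// One OHLCV bar; the timestamp is an integer microsecond count (assumed encoding).
#[derive(Debug, Clone, PartialEq)]
pub struct Bar {
    pub timestamp_us: u64,
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: u64,
}

/// Parse one line of the assumed form `timestamp_us,open,high,low,close,volume`.
pub fn parse_bar(line: &str) -> Result<Bar, String> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 6 {
        return Err(format!("expected 6 fields, got {}", fields.len()));
    }
    // Helper: parse field `i` as a price.
    let num = |i: usize| {
        fields[i]
            .trim()
            .parse::<f64>()
            .map_err(|e| format!("field {}: {}", i, e))
    };
    let bar = Bar {
        timestamp_us: fields[0]
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("timestamp: {}", e))?,
        open: num(1)?,
        high: num(2)?,
        low: num(3)?,
        close: num(4)?,
        volume: fields[5]
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("volume: {}", e))?,
    };
    // Basic sanity check: open and close must lie within [low, high].
    let in_range = |p: f64| p >= bar.low && p <= bar.high;
    if bar.low > bar.high || !in_range(bar.open) || !in_range(bar.close) {
        return Err("inconsistent OHLC values".to_string());
    }
    Ok(bar)
}

/// Index of the first bar whose timestamp does not strictly increase, if any.
pub fn first_out_of_order(bars: &[Bar]) -> Option<usize> {
    bars.windows(2)
        .position(|w| w[1].timestamp_us <= w[0].timestamp_us)
        .map(|i| i + 1)
}
```

Detecting missing bars needs a trading calendar and the bar interval, so it is left out of this sketch.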

F2: Event Engine

  • Process events in strict chronological order
  • Support multiple event types: Quote, Trade, Bar, OrderFill, Timer
  • Allow strategy to subscribe to specific event types
  • Handle events with identical timestamps (order by type priority)
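
One way to make same-timestamp ordering explicit is a composite sort key. In the sketch below the specific ranking (market data first, then fills, then timers) is an assumption chosen for illustration, as are the names `EventType`, `Stamped`, and `sort_events`; the point is only that the tie-break is deterministic.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    Quote,
    Trade,
    Bar,
    OrderFill,
    Timer,
}

impl EventType {
    /// Lower number = processed first when timestamps tie.
    /// This particular ranking is an assumption, not a rule from the spec.
    pub fn priority(self) -> u8 {
        match self {
            EventType::Quote => 0,
            EventType::Trade => 1,
            EventType::Bar => 2,
            EventType::OrderFill => 3,
            EventType::Timer => 4,
        }
    }
}

/// An event header: when it happens, its arrival order, and its type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stamped {
    pub timestamp_us: u64,
    pub seq: u64,
    pub kind: EventType,
}

impl Stamped {
    /// Time first, then type priority, then arrival order as the final tie-break.
    pub fn sort_key(&self) -> (u64, u8, u64) {
        (self.timestamp_us, self.kind.priority(), self.seq)
    }
}

/// Put a batch of events into processing order.
pub fn sort_events(events: &mut [Stamped]) {
    events.sort_by_key(|e| e.sort_key());
}
```

Including `seq` in the key means two runs over the same input always produce the same order, which the determinism requirement later in this section depends on.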

F3: Strategy Interface

  • Provide callbacks: on_bar(), on_quote(), on_trade(), on_fill()
  • Allow strategy to access current market state (last price, bid/ask)
  • Provide historical data access (last N bars) without lookahead
  • Support multiple simultaneous strategies
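
Historical access without lookahead can be enforced structurally: the strategy reads from a buffer that only ever contains bars the engine has already delivered. A minimal sketch, assuming a hypothetical `History` type that stores closing prices only:

```rust
use std::collections::VecDeque;

/// Rolling window of the most recent closes, fed one bar at a time.
pub struct History {
    capacity: usize,
    closes: VecDeque<f64>,
}

impl History {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            capacity,
            closes: VecDeque::with_capacity(capacity),
        }
    }

    /// Called by the engine for each bar event, before the strategy callback.
    pub fn push(&mut self, close: f64) {
        if self.closes.len() == self.capacity {
            // Drop the oldest bar to keep memory bounded.
            self.closes.pop_front();
        }
        self.closes.push_back(close);
    }

    /// The last `n` closes, oldest first; None until `n` bars have been seen.
    pub fn last_n(&self, n: usize) -> Option<Vec<f64>> {
        if n == 0 || n > self.closes.len() {
            return None;
        }
        Some(self.closes.iter().skip(self.closes.len() - n).copied().collect())
    }

    /// Simple moving average over the last `n` closes.
    pub fn sma(&self, n: usize) -> Option<f64> {
        self.last_n(n).map(|xs| xs.iter().sum::<f64>() / n as f64)
    }
}
```

Returning `None` during warm-up forces the strategy to handle the "not enough data yet" case instead of silently computing an indicator on a short window.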

F4: Execution Simulation

  • Support market orders with slippage modeling
  • Support limit orders with partial fills
  • Model configurable latency (order-to-market delay)
  • Implement market impact for large orders
  • Track order state machine: New -> Acknowledged -> Partial -> Filled/Cancelled
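
The order lifecycle can be modelled as an explicit state machine that rejects illegal transitions. This is a sketch under assumptions: the names `OrderState`, `OrderAction`, and `Order` are hypothetical, and cancellation is allowed from any non-terminal state, which the requirement above does not spell out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OrderState {
    New,
    Acknowledged,
    Partial,
    Filled,
    Cancelled,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OrderAction {
    Ack,
    Fill { qty: u64 },
    Cancel,
}

#[derive(Debug)]
pub struct Order {
    pub quantity: u64,
    pub filled: u64,
    pub state: OrderState,
}

impl Order {
    pub fn new(quantity: u64) -> Self {
        Order { quantity, filled: 0, state: OrderState::New }
    }

    /// Apply an action. On Err the order is left unchanged.
    pub fn apply(&mut self, action: OrderAction) -> Result<OrderState, String> {
        use OrderState::*;
        let next = match (self.state, action) {
            (New, OrderAction::Ack) => Acknowledged,
            (Acknowledged | Partial, OrderAction::Fill { qty }) => {
                if qty == 0 || self.filled + qty > self.quantity {
                    return Err(format!("invalid fill quantity {}", qty));
                }
                self.filled += qty;
                if self.filled == self.quantity { Filled } else { Partial }
            }
            // Assumption: any live order may be cancelled.
            (New | Acknowledged | Partial, OrderAction::Cancel) => Cancelled,
            (state, action) => {
                return Err(format!("illegal transition: {:?} in state {:?}", action, state));
            }
        };
        self.state = next;
        Ok(next)
    }
}
```

Keeping the transition table in one `match` makes it easy to unit-test every legal and illegal pair.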

F5: Portfolio Tracking

  • Track positions per symbol (quantity, average cost)
  • Calculate mark-to-market P&L continuously
  • Enforce position limits and risk checks
  • Support multiple currencies
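
Position tracking with average cost can be sketched as below. The `Position` type and its method names are hypothetical; the sketch assumes a single currency, ignores fees, and uses average-cost accounting (not FIFO or LIFO lots).

```rust
/// Signed position in one symbol: positive = long, negative = short.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Position {
    pub quantity: i64,
    pub avg_cost: f64,
    pub realized_pnl: f64,
}

impl Position {
    /// Apply a fill; `signed_qty` is positive for buys, negative for sells.
    pub fn apply_fill(&mut self, signed_qty: i64, price: f64) {
        if signed_qty == 0 {
            return;
        }
        let same_direction =
            self.quantity == 0 || (self.quantity > 0) == (signed_qty > 0);
        if same_direction {
            // Adding to the position: blend the average cost.
            let old = self.quantity.abs() as f64;
            let add = signed_qty.abs() as f64;
            self.avg_cost = (self.avg_cost * old + price * add) / (old + add);
            self.quantity += signed_qty;
            return;
        }
        // Reducing or flipping: realize P&L on the closed portion.
        let closed = signed_qty.abs().min(self.quantity.abs());
        let direction = if self.quantity > 0 { 1.0 } else { -1.0 };
        self.realized_pnl += direction * (price - self.avg_cost) * closed as f64;
        self.quantity += signed_qty;
        if self.quantity == 0 {
            self.avg_cost = 0.0;
        } else if (self.quantity > 0) == (signed_qty > 0) {
            // Flipped through zero: the remainder opens at the fill price.
            self.avg_cost = price;
        }
    }

    /// Mark-to-market P&L of the open position at `mark_price`.
    pub fn unrealized_pnl(&self, mark_price: f64) -> f64 {
        (mark_price - self.avg_cost) * self.quantity as f64
    }
}
```

A portfolio would hold one `Position` per symbol and call `unrealized_pnl` with the latest price on every market data event to keep equity current.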

F6: Metrics Calculation

  • Calculate Sharpe and Sortino ratios
  • Track maximum drawdown and drawdown duration
  • Compute win rate, profit factor, average win/loss
  • Generate daily return series for analysis
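
The trade-level statistics can all be derived from a list of closed-trade P&Ls. A minimal sketch, assuming a hypothetical `trade_stats` function; breakeven trades (P&L of exactly 0) count toward the total but are treated as neither wins nor losses.

```rust
#[derive(Debug, PartialEq)]
pub struct TradeStats {
    pub win_rate: f64,
    pub profit_factor: f64,
    pub avg_win: f64,
    pub avg_loss: f64,
    pub expectancy: f64,
}

/// Summarize closed-trade P&Ls. Returns None when there are no winners
/// or no losers, because the ratios would be undefined.
pub fn trade_stats(pnls: &[f64]) -> Option<TradeStats> {
    let wins: Vec<f64> = pnls.iter().copied().filter(|&p| p > 0.0).collect();
    let losses: Vec<f64> = pnls.iter().copied().filter(|&p| p < 0.0).collect();
    if wins.is_empty() || losses.is_empty() {
        return None;
    }
    let gross_profit: f64 = wins.iter().sum();
    // Stored as a positive magnitude.
    let gross_loss: f64 = -losses.iter().sum::<f64>();
    let n = pnls.len() as f64;
    Some(TradeStats {
        win_rate: wins.len() as f64 / n,
        profit_factor: gross_profit / gross_loss,
        avg_win: gross_profit / wins.len() as f64,
        avg_loss: -gross_loss / losses.len() as f64,
        expectancy: (gross_profit - gross_loss) / n,
    })
}
```

For 55 winners at +$50 and 45 losers at -$40, this gives a win rate of 0.55, a profit factor of about 1.53, and an expectancy of $9.50 per trade.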

F7: Output Generation

  • Print summary statistics to stdout
  • Export trade log to CSV (entry time, exit time, P&L, etc.)
  • Export P&L curve to CSV for plotting
  • Support JSON output for integration with other tools

3.3 Non-Functional Requirements

N1: Performance

  • Process at least 1 million bars per second on a single core
  • Process at least 100,000 ticks per second with full simulation
  • Memory usage < 2x the size of input data

N2: Accuracy

  • Floating-point P&L calculations accurate to 4 decimal places
  • Timestamps preserved at microsecond precision
  • No lookahead bias by construction

N3: Modularity

  • Strategy swappable without changing core engine
  • Execution model swappable (simple, realistic, custom)
  • Data source swappable (CSV, binary, API)

N4: Testability

  • Each component unit-testable in isolation
  • Deterministic: same inputs always produce same outputs
  • Comparison tests against known results

3.4 Example Usage / Output

$ ./backtester --strategy mean_reversion \
               --data aapl_2023.csv \
               --capital 100000 \
               --slippage 0.01% \
               --latency 100us \
               --output report.json

Loading data... 2,450,000 bars loaded (1.2GB)
Running simulation... [████████████████████] 100%
Time elapsed: 3.2 seconds (765,625 bars/sec)

═══════════════════════════════════════════════════════════════════
                      BACKTEST RESULTS
═══════════════════════════════════════════════════════════════════

Strategy:       Mean Reversion (lookback=20, z_threshold=2.0)
Symbol:         AAPL
Period:         2023-01-03 to 2023-12-29 (252 trading days)
Initial Capital: $100,000.00

───────────────────────────────────────────────────────────────────
                      PERFORMANCE SUMMARY
───────────────────────────────────────────────────────────────────
Total Trades:          1,247
Winning Trades:        676 (54.2%)
Losing Trades:         571 (45.8%)

Gross Profit:          $112,450.30
Gross Loss:            -$67,219.80
Net P&L:               $45,230.50 (+45.2%)

Average Win:           $166.35
Average Loss:          -$117.72
Profit Factor:         1.67
Expectancy:            $36.27 per trade

───────────────────────────────────────────────────────────────────
                      RISK METRICS
───────────────────────────────────────────────────────────────────
Sharpe Ratio:          1.82 (annualized, rf=5%)
Sortino Ratio:         2.15
Calmar Ratio:          3.68

Max Drawdown:          -$12,340.50 (-12.3%)
Max Drawdown Duration: 18 days
Recovery Time:         23 days

Daily VaR (95%):       -$1,234.50
Daily VaR (99%):       -$2,156.70

───────────────────────────────────────────────────────────────────
                      EXECUTION ANALYSIS
───────────────────────────────────────────────────────────────────
Total Slippage Paid:   $3,456.78
Avg Slippage/Trade:    $2.77 (0.018%)
Fill Rate:             98.3% (21 orders unfilled)

Trades by Hour (EST):
  09:30-10:30:  312 trades (25.0%)
  10:30-12:00:  287 trades (23.0%)
  12:00-14:00:  198 trades (15.9%)
  14:00-15:30:  256 trades (20.5%)
  15:30-16:00:  194 trades (15.6%)

───────────────────────────────────────────────────────────────────
                      OUTPUT FILES
───────────────────────────────────────────────────────────────────
Trade Log:      output/trades_2023_mean_reversion.csv
P&L Curve:      output/pnl_2023_mean_reversion.csv
Daily Returns:  output/returns_2023_mean_reversion.csv
Full Report:    output/report_2023_mean_reversion.json

═══════════════════════════════════════════════════════════════════

Trade Log Output (CSV):

trade_id,symbol,side,quantity,entry_time,entry_price,exit_time,exit_price,pnl,pnl_pct,duration_sec,slippage
1,AAPL,LONG,100,2023-01-03T09:45:23.456789,128.50,2023-01-03T10:12:45.123456,129.25,74.50,0.58%,1642,0.04
2,AAPL,SHORT,150,2023-01-03T10:35:12.789012,129.80,2023-01-03T11:02:33.456789,129.45,52.05,0.27%,1641,0.06
3,AAPL,LONG,100,2023-01-03T11:15:45.123456,129.10,2023-01-03T11:45:12.789012,128.75,-35.50,-0.27%,1768,0.03
...

3.5 Real World Outcome

After completing this project, you will have:

  1. A working backtester that you can use to test trading ideas before risking real money

  2. Quantitative intuition about what makes strategies profitable or unprofitable

  3. Implementation skills in event-driven systems that translate to other domains (game engines, simulators, reactive systems)

  4. Portfolio evidence of systems programming and quantitative finance knowledge

  5. Foundation for more advanced projects (live trading, multi-asset, optimization)


4. Solution Architecture

4.1 High-Level Design

DETAILED SYSTEM ARCHITECTURE
═══════════════════════════════════════════════════════════════════

┌─────────────────────────────────────────────────────────────────┐
│                        DATA LAYER                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────┐  ┌──────────────────┐  ┌────────────────┐ │
│  │   CSV Parser     │  │  Binary Reader   │  │  Memory Map    │ │
│  │                  │  │                  │  │                │ │
│  │ • Line iterator  │  │ • Struct layout  │  │ • mmap()       │ │
│  │ • Field parsing  │  │ • Zero-copy      │  │ • Page faults  │ │
│  │ • Type convert   │  │ • Endianness     │  │ • Prefetching  │ │
│  └────────┬─────────┘  └────────┬─────────┘  └───────┬────────┘ │
│           │                     │                     │          │
│           └──────────────┬──────┴─────────────────────┘          │
│                          │                                       │
│                          ▼                                       │
│           ┌──────────────────────────────┐                       │
│           │        Event Stream          │                       │
│           │   Iterator<Item = Event>     │                       │
│           └──────────────┬───────────────┘                       │
│                          │                                       │
└──────────────────────────┼──────────────────────────────────────┘
                           │
┌──────────────────────────┼──────────────────────────────────────┐
│                          ▼          EVENT ENGINE                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                   Event Priority Queue                    │   │
│  │                                                           │   │
│  │   Time: 09:30:00.100    ┌─────────────────────────────┐  │   │
│  │                         │ Quote { bid, ask, time }     │  │   │
│  │                         └─────────────────────────────┘  │   │
│  │   Time: 09:30:00.100    ┌─────────────────────────────┐  │   │
│  │                         │ Trade { price, qty, time }   │  │   │
│  │                         └─────────────────────────────┘  │   │
│  │   Time: 09:30:00.105    ┌─────────────────────────────┐  │   │
│  │   (internal)            │ OrderFill { id, price, qty } │  │   │
│  │                         └─────────────────────────────┘  │   │
│  │                                                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                           │                                      │
│                           ▼                                      │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Event Dispatcher                       │   │
│  │                                                           │   │
│  │   match event {                                           │   │
│  │       Quote(q) => {                                       │   │
│  │           market_state.update_quote(q);                   │   │
│  │           strategy.on_quote(q, &market_state);            │   │
│  │       }                                                   │   │
│  │       Trade(t) => strategy.on_trade(t, &market_state);    │   │
│  │       Fill(f) => {                                        │   │
│  │           portfolio.apply_fill(f);                        │   │
│  │           strategy.on_fill(f);                            │   │
│  │       }                                                   │   │
│  │   }                                                       │   │
│  │                                                           │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
                           │
         ┌─────────────────┼─────────────────┐
         ▼                 ▼                 ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│    STRATEGY     │ │   EXECUTION     │ │   PORTFOLIO     │
│                 │ │   SIMULATOR     │ │                 │
├─────────────────┤ ├─────────────────┤ ├─────────────────┤
│                 │ │                 │ │                 │
│ Callbacks:      │ │ Models:         │ │ State:          │
│ • on_bar()      │ │ • NaiveExec     │ │ • positions     │
│ • on_quote()    │ │ • SlippageExec  │ │ • cash          │
│ • on_trade()    │ │ • ImpactExec    │ │ • equity        │
│ • on_fill()     │ │ • OrderBookExec │ │ • pnl_history   │
│                 │ │                 │ │                 │
│ Actions:        │ │ Order Queue:    │ │ Operations:     │
│ • submit_order()│ │ • pending[]     │ │ • update_pos()  │
│ • cancel_order()│ │ • fill_logic()  │ │ • mark_to_mkt() │
│                 │ │ • latency_model │ │ • calc_margin() │
│                 │ │                 │ │                 │
│ State:          │ │ Metrics:        │ │ Risk:           │
│ • custom_data   │ │ • slippage_paid │ │ • max_position  │
│ • signals       │ │ • fill_rate     │ │ • max_drawdown  │
│ • indicators    │ │ • impact_cost   │ │ • stop_loss     │
│                 │ │                 │ │                 │
└─────────────────┘ └─────────────────┘ └─────────────────┘

4.2 Key Components

Component 1: Event Types

EVENT TYPE HIERARCHY
═══════════════════════════════════════════════════════════════

Event
├── MarketDataEvent
│   ├── QuoteEvent
│   │   └── { symbol, timestamp, bid, bid_size, ask, ask_size }
│   ├── TradeEvent
│   │   └── { symbol, timestamp, price, quantity, aggressor_side }
│   └── BarEvent
│       └── { symbol, timestamp, open, high, low, close, volume }
│
├── OrderEvent
│   ├── NewOrderEvent
│   │   └── { order_id, symbol, side, quantity, order_type, limit_price }
│   ├── CancelOrderEvent
│   │   └── { order_id }
│   └── FillEvent
│       └── { order_id, symbol, side, quantity, price, fee, timestamp }
│
└── SystemEvent
    ├── TimerEvent
    │   └── { callback_id, scheduled_time }
    └── EndOfDataEvent
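
One way to encode part of this hierarchy in Rust is a single enum with a helper that exposes the time at which the engine should process each event. This is only a sketch: the flat `Event` enum, the `timestamp()` helper, and the use of integer cents for prices are assumptions made to keep the example free of external crates, not part of the design above.

```rust
// Hypothetical encoding of part of the event hierarchy.
// Prices are integer cents here only to keep the sketch std-only.
#[derive(Debug, Clone, PartialEq)]
pub enum Side {
    Buy,
    Sell,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Quote { symbol: String, timestamp: i64, bid: i64, bid_size: i64, ask: i64, ask_size: i64 },
    Trade { symbol: String, timestamp: i64, price: i64, quantity: i64, aggressor: Side },
    Bar { symbol: String, timestamp: i64, open: i64, high: i64, low: i64, close: i64, volume: i64 },
    Fill { order_id: u64, symbol: String, side: Side, quantity: i64, price: i64, fee: i64, timestamp: i64 },
    Timer { callback_id: u64, scheduled_time: i64 },
    EndOfData,
}

impl Event {
    /// Time at which the engine should process this event.
    /// `EndOfData` carries no timestamp, so it yields `None`.
    pub fn timestamp(&self) -> Option<i64> {
        match self {
            Event::Quote { timestamp, .. }
            | Event::Trade { timestamp, .. }
            | Event::Bar { timestamp, .. }
            | Event::Fill { timestamp, .. } => Some(*timestamp),
            Event::Timer { scheduled_time, .. } => Some(*scheduled_time),
            Event::EndOfData => None,
        }
    }
}
```

A single enum keeps dispatch to one `match`, at the cost of every variant sharing the size of the largest one.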

Component 2: Strategy Interface

STRATEGY TRAIT / INTERFACE
═══════════════════════════════════════════════════════════════

trait Strategy {
    // Lifecycle
    fn initialize(&mut self, context: &Context);
    fn teardown(&mut self, context: &Context);

    // Market data callbacks
    fn on_bar(&mut self, bar: &Bar, context: &mut Context);
    fn on_quote(&mut self, quote: &Quote, context: &mut Context);
    fn on_trade(&mut self, trade: &Trade, context: &mut Context);

    // Order callbacks
    fn on_fill(&mut self, fill: &Fill, context: &mut Context);
    fn on_cancel(&mut self, order_id: OrderId, context: &mut Context);
    fn on_reject(&mut self, order_id: OrderId, reason: &str, context: &mut Context);
}

Context provides:
├── submit_order(symbol, side, qty, order_type) -> OrderId
├── cancel_order(order_id)
├── get_position(symbol) -> Position
├── get_cash() -> Decimal
├── get_equity() -> Decimal
├── get_last_price(symbol) -> Option<Decimal>
├── get_last_quote(symbol) -> Option<Quote>
├── current_time() -> Timestamp
└── log(level, message)

Component 3: Execution Models

EXECUTION MODEL HIERARCHY
═══════════════════════════════════════════════════════════════

trait ExecutionModel {
    fn process_order(
        &mut self,
        order: &Order,
        market_state: &MarketState,
        current_time: Timestamp
    ) -> Vec<ExecutionEvent>;
}

Implementation 1: NaiveExecution
├── Market orders fill at last price
├── Limit orders fill if price crosses
├── No slippage, no latency
└── Use for: Quick sanity checks only

Implementation 2: SlippageExecution
├── Market orders: last_price * (1 + slippage_pct * side_sign)
├── Limit orders: Partial fills based on volume
├── Configurable slippage percentage
└── Use for: Moderate realism

Implementation 3: MarketImpactExecution
├── Square-root impact: impact = k * sqrt(qty / ADV)
├── Temporary vs permanent impact modeling
├── Order size affects subsequent prices
└── Use for: Large order simulation

Implementation 4: OrderBookExecution
├── Maintain simulated order book
├── Walk the book for market orders
├── Queue position for limit orders
└── Use for: Maximum realism (also most complex)
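
The two price formulas above (fixed-percentage slippage and square-root impact) can be sketched as plain functions. The function names, the use of `f64`, and treating the impact as a fraction of the reference price are assumptions for illustration; the coefficient `k` has to be calibrated and is not specified by this guide.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Buy,
    Sell,
}

impl Side {
    /// +1 for buys (you pay more), -1 for sells (you receive less).
    fn sign(self) -> f64 {
        match self {
            Side::Buy => 1.0,
            Side::Sell => -1.0,
        }
    }
}

/// Fixed-percentage slippage: reference * (1 + slippage_pct * side_sign).
pub fn slippage_fill_price(reference: f64, side: Side, slippage_pct: f64) -> f64 {
    reference * (1.0 + slippage_pct * side.sign())
}

/// Square-root impact as a fraction of price: k * sqrt(qty / adv).
/// Returns 0 for non-positive inputs instead of producing NaN.
pub fn sqrt_impact(qty: f64, adv: f64, k: f64) -> f64 {
    if adv <= 0.0 || qty <= 0.0 {
        return 0.0;
    }
    k * (qty / adv).sqrt()
}

/// Fill price after applying the impact against the trader.
pub fn impact_fill_price(reference: f64, side: Side, qty: f64, adv: f64, k: f64) -> f64 {
    reference * (1.0 + sqrt_impact(qty, adv, k) * side.sign())
}
```

With `k = 0.1`, an order for 1% of ADV moves the price by 1%, and quadrupling the order size only doubles the impact, which is the defining property of the square-root model.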

4.3 Data Structures

Core Data Types:

DATA TYPE DEFINITIONS
═══════════════════════════════════════════════════════════════

// Use fixed-point for money (avoid floating-point errors)
type Price = Decimal;      // 128-bit decimal (e.g. rust_decimal: 96-bit mantissa + scale)
type Quantity = i64;       // Signed for short positions
type Money = Decimal;
type Timestamp = i64;      // Microseconds since epoch

struct Symbol {
    ticker: String,        // "AAPL", "BTC-USD"
    exchange: String,      // "NYSE", "COINBASE"
}

struct Bar {
    symbol: Symbol,
    timestamp: Timestamp,
    open: Price,
    high: Price,
    low: Price,
    close: Price,
    volume: Quantity,
}

struct Quote {
    symbol: Symbol,
    timestamp: Timestamp,
    bid: Price,
    bid_size: Quantity,
    ask: Price,
    ask_size: Quantity,
}

struct Trade {
    symbol: Symbol,
    timestamp: Timestamp,
    price: Price,
    quantity: Quantity,
    side: Side,            // BUY or SELL (aggressor)
}

enum Side { Buy, Sell }

enum OrderType {
    Market,
    Limit(Price),
    StopMarket(Price),
    StopLimit { stop: Price, limit: Price },
}

struct Order {
    id: OrderId,
    symbol: Symbol,
    side: Side,
    quantity: Quantity,
    order_type: OrderType,
    status: OrderStatus,
    submitted_at: Timestamp,
}

enum OrderStatus {
    Pending,
    Acknowledged,
    PartialFill { filled: Quantity },
    Filled,
    Cancelled,
    Rejected { reason: String },
}

struct Position {
    symbol: Symbol,
    quantity: Quantity,        // Positive = long, negative = short
    average_cost: Price,
    realized_pnl: Money,
    unrealized_pnl: Money,
}

struct Fill {
    order_id: OrderId,
    symbol: Symbol,
    side: Side,
    quantity: Quantity,
    price: Price,
    fee: Money,
    timestamp: Timestamp,
}

Performance-Critical Structures:

CACHE-FRIENDLY LAYOUTS
═══════════════════════════════════════════════════════════════

// Hot path: Bar data for strategy calculations
// Pack frequently accessed fields together
struct BarCompact {
    close: f64,           // Most accessed
    volume: u64,
    open: f64,
    high: f64,
    low: f64,
    timestamp: i64,
}  // 48 bytes, smaller than a 64-byte cache line

// Historical bar buffer (ring buffer for lookback)
struct BarBuffer {
    data: Vec<BarCompact>,   // Pre-allocated
    head: usize,             // Next write position
    count: usize,            // Current fill level
    capacity: usize,         // Maximum lookback
}

impl BarBuffer {
    fn push(&mut self, bar: BarCompact) {
        self.data[self.head] = bar;
        self.head = (self.head + 1) % self.capacity;
        self.count = std::cmp::min(self.count + 1, self.capacity);
    }

    fn get(&self, bars_ago: usize) -> Option<&BarCompact> {
        if bars_ago >= self.count { return None; }
        let idx = (self.head + self.capacity - 1 - bars_ago) % self.capacity;
        Some(&self.data[idx])
    }
}

4.4 Algorithm Overview

Event Processing Loop:

MAIN EVENT LOOP PSEUDOCODE
═══════════════════════════════════════════════════════════════

function run_backtest(data_source, strategy, config):
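    // Initialize components
    event_queue = PriorityQueue(by_timestamp)
    portfolio = Portfolio(initial_capital)
    execution = ExecutionSimulator(config.slippage_model)
    market_state = MarketState()
    metrics = MetricsCollector()
    pending_orders = OrderList()          // Orders resting at the simulated exchange
    latency = config.latency              // Simulated order/fill delay
    context = Context(portfolio, market_state, event_queue)
    strategy.initialize(context)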

    // Load initial events from data
    for event in data_source.events():
        event_queue.push(event)

    // Main loop
    while not event_queue.is_empty():
        event = event_queue.pop()
        current_time = event.timestamp

        // Update simulation clock
        market_state.advance_time(current_time)

        // Dispatch based on event type
        match event:
            Quote(q):
                market_state.update_quote(q)
                strategy.on_quote(q, context)

            Trade(t):
                market_state.update_last_trade(t)
                strategy.on_trade(t, context)
                // Check if any pending orders should fill
                new_fills = execution.check_fills(pending_orders, market_state)
                for fill in new_fills:
                    event_queue.push(FillEvent(fill, current_time + latency))

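            Bar(b):
                market_state.update_bar(b)
                strategy.on_bar(b, context)
                // Bars can trigger fills too (needed for bar-only data)
                new_fills = execution.check_fills(pending_orders, market_state)
                for fill in new_fills:
                    event_queue.push(FillEvent(fill, current_time + latency))

            Fill(f):
                portfolio.apply_fill(f)
                strategy.on_fill(f, context)
                metrics.record_trade(f, portfolio)

            NewOrder(o):
                // Add latency before order reaches "exchange":
                // it becomes fillable only once its ack is processed
                event_queue.push(OrderAck(o, current_time + latency))

            OrderAck(o):
                pending_orders.add(o)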

        // Mark-to-market at each event
        portfolio.mark_to_market(market_state.current_prices())
        metrics.record_equity(current_time, portfolio.equity())

    // Finalize
    strategy.teardown(context)
    return metrics.generate_report()

Sharpe Ratio Calculation:

SHARPE RATIO ALGORITHM
═══════════════════════════════════════════════════════════════

function calculate_sharpe(daily_returns, risk_free_rate, annualization=252):
    // Convert daily returns to excess returns
    daily_rf = risk_free_rate / annualization
    excess_returns = [r - daily_rf for r in daily_returns]

    // Calculate mean and std dev
    mean_excess = sum(excess_returns) / len(excess_returns)
    variance = sum((r - mean_excess)^2 for r in excess_returns) / (len(excess_returns) - 1)
    std_excess = sqrt(variance)

    // Annualize
    annual_mean = mean_excess * annualization
    annual_std = std_excess * sqrt(annualization)

    // Handle zero volatility (flat returns)
    if annual_std == 0:
        return 0 if annual_mean == 0 else infinity

    return annual_mean / annual_std
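
The pseudocode above translates to Rust fairly directly. The sketch below assumes `f64` returns and a hypothetical function name `sharpe_ratio`; unlike the pseudocode, it returns `None` for fewer than two observations or zero volatility rather than 0 or infinity, which is a design choice of this example only.

```rust
/// Annualized Sharpe ratio from periodic simple returns.
/// Uses the sample standard deviation (n - 1 denominator).
/// Returns None when the ratio is undefined (n < 2 or zero volatility).
pub fn sharpe_ratio(returns: &[f64], annual_rf: f64, periods_per_year: f64) -> Option<f64> {
    let n = returns.len();
    if n < 2 {
        return None;
    }
    // Convert the annual risk-free rate to a per-period rate.
    let rf = annual_rf / periods_per_year;
    let excess: Vec<f64> = returns.iter().map(|r| r - rf).collect();

    let mean = excess.iter().sum::<f64>() / n as f64;
    let variance = excess.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n as f64 - 1.0);
    let std_dev = variance.sqrt();
    if std_dev == 0.0 {
        return None;
    }
    // mean * N / (std * sqrt(N)) simplifies to mean / std * sqrt(N).
    Some(mean / std_dev * periods_per_year.sqrt())
}
```

For the five daily returns 1%, -0.5%, 2%, -1%, 1.5% with a 5% annual risk-free rate, the mean excess return is about 0.005802, the sample standard deviation is about 0.012942, and the annualized ratio is about 7.12. A value that high reflects the tiny sample, not a realistic strategy.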

Maximum Drawdown Algorithm:

MAXIMUM DRAWDOWN ALGORITHM
═══════════════════════════════════════════════════════════════

function calculate_max_drawdown(equity_curve):
    // equity_curve is array of (timestamp, equity_value)

    max_equity = -infinity
    max_drawdown = 0
    max_drawdown_start = None
    max_drawdown_end = None
    current_peak_time = None

    for (time, equity) in equity_curve:
        if equity > max_equity:
            max_equity = equity
            current_peak_time = time

        drawdown = (max_equity - equity) / max_equity

        if drawdown > max_drawdown:
            max_drawdown = drawdown
            max_drawdown_start = current_peak_time
            max_drawdown_end = time

    return {
        max_drawdown: max_drawdown,          // As a fraction of peak (0.25 = 25%)
        peak_time: max_drawdown_start,
        trough_time: max_drawdown_end,
        duration_days: (max_drawdown_end - max_drawdown_start).days  // Undefined if no drawdown occurred
    }
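
A compact Rust version of the same single-pass scan is sketched below. It assumes the equity curve is a slice of `f64` values and reports indices instead of timestamps; the `Drawdown` struct and the `max_drawdown` function name are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub struct Drawdown {
    /// Largest peak-to-trough decline as a fraction of the peak (0.136 = 13.6%).
    pub max_drawdown: f64,
    pub peak_index: usize,
    pub trough_index: usize,
}

/// Single pass over the equity curve, tracking the running peak.
/// Returns None for an empty curve.
pub fn max_drawdown(equity: &[f64]) -> Option<Drawdown> {
    let first = *equity.first()?;
    let mut peak = first;
    let mut peak_idx = 0;
    let mut best = Drawdown { max_drawdown: 0.0, peak_index: 0, trough_index: 0 };

    for (i, &e) in equity.iter().enumerate() {
        if e > peak {
            peak = e;
            peak_idx = i;
        }
        // Skip the ratio when the peak is not positive to avoid dividing by zero.
        if peak > 0.0 {
            let dd = (peak - e) / peak;
            if dd > best.max_drawdown {
                best = Drawdown { max_drawdown: dd, peak_index: peak_idx, trough_index: i };
            }
        }
    }
    Some(best)
}
```

For the curve 100, 110, 105, 95, 100, 115 the peak is at index 1 and the trough at index 3, giving (110 - 95) / 110, roughly 13.6%.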

5. Implementation Guide

5.1 Development Environment Setup

For Rust:

# Create project
cargo new backtester --bin
cd backtester

# Add dependencies to Cargo.toml
# [dependencies]
# rust_decimal = "1.32"      # Decimal arithmetic for money
# rust_decimal_macros = "1.32"  # dec!() macro used in the tests
# chrono = "0.4"             # Datetime handling
# csv = "1.3"                # CSV parsing
# memmap2 = "0.9"            # Memory-mapped files
# serde = { version = "1.0", features = ["derive"] }
# clap = { version = "4.4", features = ["derive"] }  # CLI args
# rand = "0.8"               # For testing

cargo build

For C++:

# Create project structure
mkdir backtester && cd backtester
mkdir src include tests data

# Create CMakeLists.txt
cat > CMakeLists.txt << 'EOF'
cmake_minimum_required(VERSION 3.16)
project(backtester CXX)

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Dependencies (install via vcpkg or system packages)
find_package(Boost REQUIRED)
find_package(fmt REQUIRED)

add_executable(backtester
    src/main.cpp
    src/engine.cpp
    src/data_loader.cpp
    src/execution.cpp
    src/portfolio.cpp
    src/metrics.cpp
)

target_include_directories(backtester PRIVATE include)
target_link_libraries(backtester PRIVATE Boost::boost fmt::fmt)
EOF

mkdir build && cd build
cmake ..
make

Sample Data for Testing:

# Download sample data (or create synthetic)
# Many free sources: Yahoo Finance, Alpha Vantage, Databento

# Create synthetic test data
cat > data/test_bars.csv << 'EOF'
timestamp,symbol,open,high,low,close,volume
2023-01-03T09:30:00,AAPL,128.50,128.75,128.45,128.60,10000
2023-01-03T09:31:00,AAPL,128.60,128.90,128.55,128.85,15000
2023-01-03T09:32:00,AAPL,128.85,129.00,128.70,128.75,12000
EOF

5.2 Project Structure

RECOMMENDED DIRECTORY LAYOUT
═══════════════════════════════════════════════════════════════

backtester/
├── src/
│   ├── main.rs (or main.cpp)       # CLI entry point
│   ├── lib.rs                      # Library root
│   ├── engine/
│   │   ├── mod.rs                  # Event engine
│   │   ├── event.rs                # Event types
│   │   ├── clock.rs                # Simulation clock
│   │   └── dispatcher.rs           # Event routing
│   ├── data/
│   │   ├── mod.rs                  # Data loading
│   │   ├── csv_loader.rs           # CSV parsing
│   │   ├── bar.rs                  # Bar data types
│   │   └── quote.rs                # Quote/trade types
│   ├── execution/
│   │   ├── mod.rs                  # Execution simulation
│   │   ├── order.rs                # Order types
│   │   ├── models/
│   │   │   ├── naive.rs            # Simple fill model
│   │   │   ├── slippage.rs         # Slippage model
│   │   │   └── impact.rs           # Market impact model
│   │   └── fill.rs                 # Fill events
│   ├── portfolio/
│   │   ├── mod.rs                  # Portfolio tracker
│   │   ├── position.rs             # Position management
│   │   └── risk.rs                 # Risk checks
│   ├── strategy/
│   │   ├── mod.rs                  # Strategy trait
│   │   ├── context.rs              # Strategy context
│   │   └── examples/
│   │       ├── buy_and_hold.rs     # Simplest strategy
│   │       ├── mean_reversion.rs   # Example strategy
│   │       └── momentum.rs         # Another example
│   ├── metrics/
│   │   ├── mod.rs                  # Metrics collection
│   │   ├── returns.rs              # Return calculations
│   │   ├── drawdown.rs             # Drawdown tracking
│   │   └── report.rs               # Report generation
│   └── types/
│       ├── mod.rs                  # Shared types
│       ├── price.rs                # Price type (Decimal)
│       ├── timestamp.rs            # Timestamp handling
│       └── symbol.rs               # Symbol type
├── tests/
│   ├── engine_tests.rs
│   ├── execution_tests.rs
│   ├── portfolio_tests.rs
│   └── metrics_tests.rs
├── benches/
│   └── throughput.rs               # Performance benchmarks
├── data/
│   └── sample/
│       └── aapl_2023.csv           # Sample data
├── Cargo.toml
└── README.md

5.3 The Core Question You’re Answering

“How do you simulate trading realistically enough that backtest results predict live performance?”

This question forces you to confront the fundamental tension in backtesting:

Speed vs. Realism:

  • Vectorized calculations are fast but unrealistic
  • Full order book simulation is realistic but slow
  • Your job: find the right tradeoff for the use case

Historical Data vs. Live Markets:

  • Historical data is clean, ordered, complete
  • Live markets have gaps, errors, latency
  • Your job: model the messiness

Strategy Development vs. Validation:

  • Developing a strategy requires many iterations
  • Validating requires strict out-of-sample testing
  • Your job: make both possible

5.4 Concepts You Must Understand First

Before writing code, ensure you can answer these questions:

  1. Event-Driven Architecture
    • What is an event loop? How does it differ from procedural code?
    • Why is time ordering critical in financial simulation?
    • Book Reference: “Designing Data-Intensive Applications” Ch. 11
  2. Market Microstructure
    • What is the bid-ask spread and why does it exist?
    • How does a limit order book work?
    • What is the difference between a market order and a limit order?
    • Book Reference: “Trading and Exchanges” by Larry Harris, Ch. 1-5
  3. Statistical Foundations
    • What is standard deviation and why does it measure risk?
    • What is the difference between arithmetic and geometric returns?
    • Why do we annualize the Sharpe ratio?
    • Book Reference: “Statistics and Data Analysis for Financial Engineering” by Ruppert
  4. Fixed-Point Arithmetic
    • Why is floating-point dangerous for money calculations?
    • What is the difference between 0.1 + 0.2 in floats vs. decimals?
    • Book Reference: “Computer Systems: A Programmer’s Perspective” Ch. 2
  5. Memory-Mapped Files
    • What is mmap() and when is it faster than read()?
    • What are the tradeoffs of memory-mapped I/O?
    • Book Reference: “The Linux Programming Interface” Ch. 49

5.5 Questions to Guide Your Design

Architecture Questions:

  1. Should events be pulled (iterator) or pushed (callbacks)? What are the tradeoffs?
  2. How do you handle multiple symbols with interleaved events?
  3. What happens if two events have the exact same timestamp?
  4. How do you model the delay between order submission and execution?

Execution Questions:

  1. If your order is larger than the best bid/ask size, how do you fill it?
  2. Should limit orders that cross the spread immediately fill at the limit or the market?
  3. How do you model the probability that a limit order gets filled?
  4. What happens to your pending orders when the market gaps overnight?

Portfolio Questions:

  1. How do you handle partial fills? Is your position the sum of all fills?
  2. What is the average cost of a position that has been added to multiple times?
  3. How do you calculate unrealized P&L with a position that has multiple entry prices?
  4. When do you realize gains/losses - on each exit trade or when position is flat?
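
One possible set of answers to the portfolio questions is the average-cost method: adding to a position updates a quantity-weighted average entry price, and every reducing fill realizes P&L on the closed quantity only. The sketch below assumes signed quantities and `f64` prices to stay std-only (the guide recommends a decimal type for money), and the `Position::apply_fill` signature is hypothetical.

```rust
#[derive(Debug, Default)]
pub struct Position {
    pub quantity: i64,     // positive = long, negative = short
    pub average_cost: f64, // average entry price of the open quantity
    pub realized_pnl: f64,
}

impl Position {
    /// Apply a signed fill: `signed_qty` > 0 for buys, < 0 for sells.
    pub fn apply_fill(&mut self, signed_qty: i64, price: f64) {
        if signed_qty == 0 {
            return;
        }
        let same_direction = self.quantity == 0 || (self.quantity > 0) == (signed_qty > 0);
        if same_direction {
            // Adding to the position: quantity-weighted average of entry prices.
            let old = self.quantity.abs() as f64;
            let add = signed_qty.abs() as f64;
            self.average_cost = (self.average_cost * old + price * add) / (old + add);
            self.quantity += signed_qty;
            return;
        }
        // Reducing or flipping: realize P&L on the closed portion only.
        let closed = signed_qty.abs().min(self.quantity.abs());
        let direction = if self.quantity > 0 { 1.0 } else { -1.0 };
        self.realized_pnl += (price - self.average_cost) * closed as f64 * direction;
        let remainder = signed_qty.abs() - closed; // > 0 only when the fill flips the position
        self.quantity += signed_qty;
        if self.quantity == 0 {
            self.average_cost = 0.0;
        } else if remainder > 0 {
            self.average_cost = price; // the new position opens at the fill price
        }
    }

    /// Mark-to-market P&L of the open quantity at price `mark`.
    pub fn unrealized_pnl(&self, mark: f64) -> f64 {
        (mark - self.average_cost) * self.quantity as f64
    }
}
```

FIFO or LIFO lot matching would give different realized P&L for the same fills, so whichever convention you choose should be stated in the report.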

Metrics Questions:

  1. Should Sharpe ratio use daily, weekly, or monthly returns?
  2. How do you handle days with zero trades (no returns)?
  3. Is maximum drawdown based on intraday equity or end-of-day?
  4. How do you annualize returns when the test period is not exactly one year?
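
For the last metrics question, a common convention is the compound annual growth rate, which scales the total return by the length of the test period. The sketch below assumes calendar days and a 365.25-day year; the function name `annualized_return` is hypothetical.

```rust
/// Compound annual growth rate: (final / initial)^(1 / years) - 1.
/// Returns None for non-positive capital or a non-positive period.
pub fn annualized_return(initial: f64, final_equity: f64, days: f64) -> Option<f64> {
    if initial <= 0.0 || final_equity < 0.0 || days <= 0.0 {
        return None;
    }
    let years = days / 365.25;
    Some((final_equity / initial).powf(1.0 / years) - 1.0)
}
```

A 21% total return over two years annualizes to 10%, not 10.5%, because the growth compounds. For very short test periods the exponent becomes large and the annualized figure is of little use.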

5.6 Thinking Exercise

The Time-Travel Detector

Before implementing your backtester, try this exercise to internalize lookahead bias:

Scenario: You have bar data with timestamp, open, high, low, close, volume.

Question: Which of these strategy rules contains lookahead bias?

Rule 1: "Buy when close > SMA(close, 20)"
Rule 2: "Buy when close > open (bullish bar)"
Rule 3: "Buy when today's high > yesterday's high"
Rule 4: "Place buy order at open + 0.01 when yesterday closed above SMA"
Rule 5: "Buy when RSI(close, 14) < 30"

Now for each rule, determine:
- At what TIME can you evaluate this rule?
- At what TIME can you place the order?
- At what PRICE will you get filled?

Work through this on paper before reading the answers.

Analysis:

LOOKAHEAD BIAS ANALYSIS
═══════════════════════════════════════════════════════════════

Rule 1: "Buy when close > SMA(close, 20)"

Timeline of a bar:
  09:30 ──── open ────────────────────────────── 09:31
         │             (bar forms)              │
         │                                      close
         │                                      │
         ├──────────────────────────────────────┤

When can you evaluate "close > SMA(close, 20)"?
  -> Only AFTER the bar closes (09:31)

When can you place the order?
  -> Earliest is 09:31 (next bar opens at 09:31)

At what price will you fill?
  -> The NEXT bar's open, NOT the current close!

Common bug: Filling at the close of the signal bar.
This assumes you can time-travel!

Rule 2: "Buy when close > open"

Same problem! You don't know close > open until 09:31.
If you buy at the close, you're using future info.

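Rule 3: "Buy when today's high > yesterday's high"

When do you know today's high?
  -> The FINAL high is known only at end of day!

With daily bars you see the completed high only after the close,
so the earliest fill is the next day's open. Intraday, the rule
becomes true the moment price trades above yesterday's high, so
a realistic fill is at or above that level (a buy stop),
never at today's open or low.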

Rule 4: "Place buy order at open + 0.01 when yesterday closed above SMA"

This is VALID. Here's why:
  - Yesterday's close is known at end of yesterday
  - You can calculate SMA using yesterday's data
  - The signal is therefore known before today's open
  - The order price (open + 0.01) is known as soon as today's open
    prints, and the order fills only if price later trades there

Rule 5: "Buy when RSI(close, 14) < 30"

Same as Rule 1. RSI(14) needs 14 price changes (15 closes).
If bar N's close triggers the signal, you can only
execute on bar N+1's open.
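
The fix for Rules 1, 2, and 5 is mechanical: evaluate the signal on the completed bar, then fill on the following bar's open. The sketch below shows this for "close > SMA(close, n)". The minimal `Bar` struct, the `next_open_fills` function, and its tuple return type are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Bar {
    pub open: f64,
    pub close: f64,
}

/// Simple moving average of the last `n` closes ending at index `i` (inclusive).
/// Returns None until `n` bars are available.
fn sma(bars: &[Bar], i: usize, n: usize) -> Option<f64> {
    if n == 0 || i + 1 < n {
        return None;
    }
    let window = &bars[i + 1 - n..=i];
    Some(window.iter().map(|b| b.close).sum::<f64>() / n as f64)
}

/// For every bar where close > SMA(close, n), returns
/// (signal_bar_index, fill_bar_index, fill_price), filling at the NEXT bar's open.
pub fn next_open_fills(bars: &[Bar], n: usize) -> Vec<(usize, usize, f64)> {
    let mut fills = Vec::new();
    for i in 0..bars.len() {
        if let Some(avg) = sma(bars, i, n) {
            if bars[i].close > avg {
                // The signal is only known once bar i has closed,
                // so the earliest tradable price is bar i+1's open.
                if let Some(next) = bars.get(i + 1) {
                    fills.push((i, i + 1, next.open));
                }
            }
        }
    }
    fills
}
```

A signal on the final bar produces no fill because there is no later bar to trade on. Silently filling it at the last close would reintroduce the bias.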

5.7 Hints in Layers

Hint 1: Start with the Simplest Possible Version

Your first backtester should be embarrassingly simple:

  • Single symbol
  • Bar data only (no quotes/trades)
  • Market orders only (fill at the next bar's open)
  • No slippage, no latency
  • Buy-and-hold strategy

If you can’t get this working, you won’t get the complex version working.

Hint 2: The Event Loop is Simpler Than You Think

loop {
    event = next_event_from_data();
    if event is None { break; }

    match event {
        Bar(b) => strategy.on_bar(b),
        _ => {}
    }
}

That’s it. Everything else is elaboration.

Hint 3: Separate State from Logic

Your strategy should not own its market data. It receives it. Your strategy should not execute orders directly. It requests them.

// Good
fn on_bar(&mut self, bar: &Bar, ctx: &mut Context) {
    if self.should_buy(bar) {
        ctx.submit_order(Order::market_buy(bar.symbol.clone(), 100));
    }
}

// Bad
fn on_bar(&mut self, bar: &Bar) {
    if self.data[self.data.len()-1].close > self.data[self.data.len()-21..].avg() {
        self.portfolio.buy(100, bar.close);  // Direct mutation!
    }
}

Hint 4: Use a Priority Queue for Multi-Source Events

When you have quotes, trades, and fills all interleaved:

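// BinaryHeap is a max-heap, so wrap entries in Reverse to pop the
// earliest timestamp first (every tuple element must implement Ord)
event_queue: BinaryHeap<Reverse<(Timestamp, EventType, Event)>>

// EventType for tie-breaking (same timestamp)
#[derive(PartialEq, Eq, PartialOrd, Ord)]
enum EventType {
    MarketData = 0,   // Process first
    OrderSubmit = 1,  // Then orders
    Fill = 2,         // Then fills
}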
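
A self-contained version of this queue is sketched below. Rust's `std::collections::BinaryHeap` is a max-heap, so entries are wrapped in `std::cmp::Reverse` to pop the earliest event first. The `EventQueue` type, the `EventKind` name, the `String` payload, and the sequence number (which keeps same-time, same-kind events in insertion order) are assumptions for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Tie-break order for events that share a timestamp.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EventKind {
    MarketData = 0,  // processed first
    OrderSubmit = 1, // then orders
    Fill = 2,        // then fills
}

/// Min-heap of events keyed by (timestamp, kind, sequence number).
pub struct EventQueue {
    heap: BinaryHeap<Reverse<(i64, EventKind, u64, String)>>,
    next_seq: u64,
}

impl EventQueue {
    pub fn new() -> Self {
        EventQueue { heap: BinaryHeap::new(), next_seq: 0 }
    }

    pub fn push(&mut self, timestamp: i64, kind: EventKind, payload: &str) {
        // The unique sequence number means the payload never decides ordering.
        self.heap.push(Reverse((timestamp, kind, self.next_seq, payload.to_string())));
        self.next_seq += 1;
    }

    /// Pops the earliest event; ties resolve by kind, then insertion order.
    pub fn pop(&mut self) -> Option<(i64, EventKind, String)> {
        self.heap
            .pop()
            .map(|Reverse((ts, kind, _seq, payload))| (ts, kind, payload))
    }
}
```

Without the sequence number, two quotes with the same timestamp could come out in either order, which makes backtest runs hard to reproduce.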

Hint 5: Start Metrics Collection from Day 1

Every time you touch equity, record it:

fn apply_fill(&mut self, fill: &Fill) {
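    // Update position... (update_position is an assumed helper that
    // returns the P&L realized by this fill)
    let realized_pnl = self.update_position(fill);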
    self.metrics.record_event(
        fill.timestamp,
        MetricEvent::Trade {
            symbol: fill.symbol.clone(),
            pnl: realized_pnl,
        }
    );
    self.metrics.record_event(
        fill.timestamp,
        MetricEvent::Equity(self.total_equity()),
    );
}

5.8 The Interview Questions They’ll Ask

  1. “Walk me through how your backtester handles a limit order that should partially fill.”
    • Expected: Describe the order book model, queue position, fill probability, and how remaining quantity is tracked.
  2. “How do you prevent lookahead bias in your system?”
    • Expected: Explain event ordering, strategy only receiving past/current data, fill prices based on post-signal data.
  3. “What’s the difference between Sharpe and Sortino, and when would you prefer each?”
    • Expected: Sharpe penalizes all volatility; Sortino only downside. Prefer Sortino for strategies with positive skew.
  4. “How would you model market impact for a strategy trading 1% of daily volume?”
    • Expected: Discuss linear/square-root impact, temporary vs. permanent impact, Almgren-Chriss optimal execution.
  5. “Your backtest shows a 3.0 Sharpe ratio. What’s wrong?”
    • Expected: Discuss overfitting, survivorship bias, lookahead bias, transaction cost underestimation, or simply too good to be true.
  6. “How do you validate that your backtester itself is correct?”
    • Expected: Unit tests for each component, comparison against known results, sanity checks (buy-and-hold should match benchmark).
  7. “What’s the tradeoff between event-driven and vectorized backtesting?”
    • Expected: Speed vs. realism. Vectorized is faster but prone to lookahead; event-driven is realistic but slower.
  8. “How would you backtest a strategy that reacts to news or sentiment?”
    • Expected: Discuss timestamp accuracy of news data, point-in-time databases, and the challenge of replay fidelity.
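
For the Sharpe versus Sortino question, it helps to have the Sortino calculation in front of you. The sketch below uses one common convention: downside deviation is the root mean square of returns below a per-period target, averaged over all observations. Other conventions exist (for example, averaging over losing periods only), and the function name and `f64` types are assumptions of this example.

```rust
/// Annualized Sortino ratio from periodic returns.
/// Only returns below `target_per_period` contribute to the risk term.
/// Returns None for empty input or when there is no downside at all.
pub fn sortino_ratio(returns: &[f64], target_per_period: f64, periods_per_year: f64) -> Option<f64> {
    if returns.is_empty() {
        return None;
    }
    let n = returns.len() as f64;
    let mean_excess = returns.iter().map(|r| r - target_per_period).sum::<f64>() / n;

    // Mean squared shortfall: gains are clamped to zero before squaring.
    let downside_sq = returns
        .iter()
        .map(|r| (r - target_per_period).min(0.0).powi(2))
        .sum::<f64>()
        / n;
    let downside_dev = downside_sq.sqrt();
    if downside_dev == 0.0 {
        return None;
    }
    Some(mean_excess / downside_dev * periods_per_year.sqrt())
}
```

Because upside moves are ignored in the denominator, a strategy with occasional large gains scores better here than it does on Sharpe, which is why Sortino is preferred for positively skewed returns.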

5.9 Books That Will Help

Topic | Book | Chapters
Event-Driven Systems | “Designing Data-Intensive Applications” by Kleppmann | Ch. 11: Stream Processing
Market Microstructure | “Trading and Exchanges” by Larry Harris | Ch. 1-10: Market Structure
Quantitative Finance | “Advances in Financial ML” by de Prado | Ch. 2: Financial Data Structures
Backtesting Methodology | “Advances in Financial ML” by de Prado | Ch. 7: Cross-Validation; Ch. 11-14: Backtesting
Systems Performance | “Computer Systems: A Programmer’s Perspective” | Ch. 10: System-Level I/O
Memory-Mapped I/O | “The Linux Programming Interface” by Kerrisk | Ch. 49: Memory Mappings
Efficient I/O | “Building Low Latency Applications with C++” | Ch. 4: Data Handling
Risk Management | “Risk Management and Financial Institutions” by Hull | Ch. 1-5: Risk Measures

5.10 Implementation Phases

Phase 1: Basic Event Replay (Days 1-5)

Goals:

  • Load bar data from CSV
  • Create event stream abstraction
  • Implement simple event loop
  • Create Strategy trait with on_bar() callback
  • Implement buy-and-hold strategy for testing

Deliverable:

$ ./backtester --strategy buy_hold --data aapl.csv
Loaded 252 bars
Final equity: $110,000 (10% return)

Success Criteria:

  • Buy-and-hold returns match manual calculation
  • Events processed in chronological order
  • Strategy receives bars one at a time

Phase 2: Portfolio and Execution (Days 6-10)

Goals:

  • Implement Portfolio struct with position tracking
  • Create Order and Fill types
  • Implement simple execution (fill at next bar open)
  • Add market and limit order support
  • Track realized and unrealized P&L

Deliverable:

$ ./backtester --strategy sma_crossover --data aapl.csv
Trades: 45
Net P&L: $12,340
Win rate: 52%

Success Criteria:

  • Position quantity and average cost are correct
  • Realized P&L calculated on each close
  • Orders submitted on bar N fill on bar N+1

Phase 3: Realistic Fill Simulation (Days 11-14)

Goals:

  • Add slippage model (percentage or fixed)
  • Implement latency simulation
  • Add partial fill support for limit orders
  • Create market impact model for large orders
  • Track slippage costs separately

Deliverable:

$ ./backtester --strategy sma_crossover --data aapl.csv --slippage 0.1%
Net P&L: $10,890 (slippage cost: $1,450)

Success Criteria:

  • Same strategy, higher slippage = lower returns
  • Large orders have more impact than small orders
  • Fill prices are realistic (worse than ideal)

Phase 4: Metrics and Reporting (Days 15-18)

Goals:

  • Calculate Sharpe, Sortino, Calmar ratios
  • Track maximum drawdown with timestamps
  • Generate trade log CSV
  • Generate P&L curve CSV
  • Create summary report

Deliverable:

$ ./backtester --strategy mean_reversion --data aapl.csv --output report.json
[Full report as shown in Section 3.4]

Success Criteria:

  • Sharpe ratio matches manual calculation
  • Drawdown periods identified correctly
  • Output files are valid and complete

Phase 5: Performance Optimization (Days 19-21)

Goals:

  • Profile current performance
  • Implement memory-mapped file loading
  • Optimize hot paths (event dispatch, P&L calculation)
  • Add multi-symbol support
  • Benchmark against targets

Deliverable:

$ ./backtester --benchmark
Loading: 10M bars in 0.8s (12.5M bars/sec)
Processing: 10M events in 9.2s (1.1M events/sec)

Success Criteria:

  • 1M+ bars/second throughput
  • Memory usage < 2x data size
  • Multi-symbol backtest works correctly

5.11 Key Implementation Decisions

Decision 1: Floating-Point vs Fixed-Point for Prices

ANALYSIS: PRICE REPRESENTATION
═══════════════════════════════════════════════════════════════

Option A: f64 (double precision float)
  Pros:
    - Native hardware support (fast)
    - Standard math operations work
    - Sufficient precision for most cases
  Cons:
    - 0.1 + 0.2 != 0.3 (representation error)
    - Accumulating errors over millions of trades
    - Equality comparison is dangerous

Option B: Fixed-point Decimal (e.g., rust_decimal, decimal.js)
  Pros:
    - Exact representation of decimal values
    - No accumulating errors
    - Safe equality comparison
  Cons:
    - Slower (software implementation)
    - Different API than standard math
    - Need library dependency

Recommendation: Use Decimal for money, f64 for intermediate calculations
  - All P&L, position values, fees: Decimal
  - Technical indicators (SMA, RSI): f64
  - Convert at boundaries
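
The representation error is easy to reproduce. The sketch below compares repeated `f64` addition with integer cents, a very simple fixed-point scheme that stands in for a decimal library so the example needs no external crates. The helper names are hypothetical.

```rust
/// Adds `amount` to a running total `times` times using binary floating point.
pub fn sum_f64(amount: f64, times: usize) -> f64 {
    (0..times).map(|_| amount).sum()
}

/// Same accumulation with money held as an integer number of cents.
pub fn sum_cents(amount_cents: i64, times: usize) -> i64 {
    (0..times).map(|_| amount_cents).sum()
}

/// Convert at the boundary: exact cents to f64 for indicator math.
pub fn cents_to_f64(cents: i64) -> f64 {
    cents as f64 / 100.0
}
```

Adding 0.1 ten times in `f64` gives a value just below 1.0, while ten additions of 10 cents give exactly 100 cents. The error is tiny per operation, but equality checks such as "is the position flat?" or "does cash reconcile?" break on it.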

Decision 2: Event Queue vs. Iterator

ANALYSIS: EVENT FLOW ARCHITECTURE
═══════════════════════════════════════════════════════════════

Option A: Pull-based Iterator
  for event in data_source.events() {
      process(event);
  }

  Pros:
    - Simple, familiar pattern
    - Natural backpressure
    - Easy to reason about
  Cons:
    - Harder to inject internal events (fills, timers)
    - Less flexible for multi-source

Option B: Priority Queue
  while let Some(event) = queue.pop() {
      let new_events = process(event);
      queue.extend(new_events);
  }

  Pros:
    - Natural multi-source merging
    - Internal events (fills) easy to add
    - Time ordering guaranteed
  Cons:
    - More memory for queue
    - Need to seed queue initially

Recommendation: Start with Iterator (simpler), add Queue when needed

Decision 3: Callback vs. Return Events

ANALYSIS: STRATEGY INTERFACE STYLE
═══════════════════════════════════════════════════════════════

Option A: Callback with mutable context
  fn on_bar(&mut self, bar: &Bar, ctx: &mut Context) {
      ctx.submit_order(...);
  }

  Pros:
    - Can submit multiple orders
    - Access to full context
    - Familiar OOP pattern
  Cons:
    - Mutable reference complexity
    - Harder to test in isolation
    - State mutation less explicit

Option B: Return orders from handler
  fn on_bar(&mut self, bar: &Bar) -> Vec<Order> {
      vec![Order::market_buy(...)]
  }

  Pros:
    - No engine side effects, easy to test
    - Explicit outputs
    - No hidden state mutation
  Cons:
    - Strategy can't query context
    - Need to pass more info in
    - Less flexible

Recommendation: Callback style for flexibility, but keep Context interface minimal

6. Testing Strategy

6.1 Unit Tests

Data Loading Tests:

#[test]
fn test_csv_bar_parsing() {
    let csv = "timestamp,symbol,open,high,low,close,volume\n\
               2023-01-03T09:30:00,AAPL,128.50,128.75,128.45,128.60,10000";

    let bars: Vec<Bar> = parse_csv_bars(csv).collect();

    assert_eq!(bars.len(), 1);
    assert_eq!(bars[0].symbol.ticker, "AAPL");
    assert_eq!(bars[0].open, dec!(128.50));
    assert_eq!(bars[0].high, dec!(128.75));
    assert_eq!(bars[0].volume, 10000);
}

#[test]
fn test_timestamp_ordering() {
    let bars = vec![
        Bar { timestamp: 1000, .. },
        Bar { timestamp: 900, .. },  // Out of order!
        Bar { timestamp: 1100, .. },
    ];

    assert!(validate_bar_ordering(&bars).is_err());
}

Execution Model Tests:

#[test]
fn test_market_order_fills_at_next_bar() {
    let bars = vec![
        Bar { close: dec!(100.00), .. },
        Bar { open: dec!(100.50), close: dec!(101.00), .. },
    ];

    let order = Order::market_buy("AAPL", 100);
    let mut exec = NaiveExecution::new();

    // Order submitted after bar 0
    exec.submit_order(order, bars[0].timestamp + 1);

    // Process bar 1 - should fill at open
    let fills = exec.process_bar(&bars[1]);

    assert_eq!(fills.len(), 1);
    assert_eq!(fills[0].price, dec!(100.50));  // Bar 1's open
}

#[test]
fn test_slippage_applied_correctly() {
    let mut exec = SlippageExecution::new(Decimal::new(10, 4)); // 0.1%

    let fill = exec.fill_market_order(
        Side::Buy,
        100,
        dec!(100.00),  // Reference price
    );

    // Buy order should slip UP
    assert_eq!(fill.price, dec!(100.10));  // 100 * 1.001
}

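#[test]
fn test_limit_order_no_fill_when_price_not_crossed() {
    let order = Order::limit_buy("AAPL", 100, dec!(99.00));
    let bar = Bar { low: dec!(99.50), .. };  // Never touches 99.00
    let mut exec = NaiveExecution::new();

    // The order must be resting before the bar is processed
    exec.submit_order(order, bar.timestamp - 1);
    let fills = exec.process_bar(&bar);

    assert!(fills.is_empty());
}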

Portfolio Tests:

#[test]
fn test_position_average_cost() {
    let mut portfolio = Portfolio::new(dec!(100000));

    portfolio.apply_fill(Fill { side: Side::Buy, quantity: 100, price: dec!(50.00), .. });
    portfolio.apply_fill(Fill { side: Side::Buy, quantity: 100, price: dec!(52.00), .. });

    let pos = portfolio.get_position("AAPL").unwrap();
    assert_eq!(pos.quantity, 200);
    assert_eq!(pos.average_cost, dec!(51.00));  // (100*50 + 100*52) / 200
}

#[test]
fn test_realized_pnl_on_close() {
    let mut portfolio = Portfolio::new(dec!(100000));

    portfolio.apply_fill(Fill { side: Side::Buy, quantity: 100, price: dec!(50.00), .. });
    portfolio.apply_fill(Fill { side: Side::Sell, quantity: 100, price: dec!(55.00), .. });

    let pos = portfolio.get_position("AAPL").unwrap();
    assert_eq!(pos.quantity, 0);
    assert_eq!(pos.realized_pnl, dec!(500.00));  // (55 - 50) * 100
}
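
The bookkeeping behind both portfolio tests is short enough to sketch. The version below is a simplified, long-only illustration: it uses `f64` instead of `Decimal`, and the `buy`/`sell` methods are hypothetical stand-ins for the `apply_fill` call used in the tests.

```rust
#[derive(Debug, Default)]
pub struct Position {
    pub quantity: i64,
    pub average_cost: f64,
    pub realized_pnl: f64,
}

impl Position {
    /// Buying changes the average cost: it becomes the total cost of all
    /// shares held divided by the new share count.
    pub fn buy(&mut self, qty: i64, price: f64) {
        assert!(qty > 0, "quantity must be positive");
        let total_cost = self.average_cost * self.quantity as f64 + price * qty as f64;
        self.quantity += qty;
        self.average_cost = total_cost / self.quantity as f64;
    }

    /// Selling realizes P&L against the average cost and leaves the
    /// average cost of the remaining shares unchanged.
    pub fn sell(&mut self, qty: i64, price: f64) {
        assert!(qty > 0, "quantity must be positive");
        assert!(qty <= self.quantity, "this sketch does not model short selling");
        self.realized_pnl += (price - self.average_cost) * qty as f64;
        self.quantity -= qty;
        if self.quantity == 0 {
            self.average_cost = 0.0; // flat: reset for the next round trip
        }
    }
}
```

Note the asymmetry: only buys move the average cost, only sells move realized P&L. Supporting shorts means applying the same logic with the signs reversed and handling fills that flip the position through zero.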

Metrics Tests:

#[test]
fn test_sharpe_ratio_calculation() {
    // Daily returns: 1%, -0.5%, 2%, -1%, 1.5%
    let returns = vec![0.01, -0.005, 0.02, -0.01, 0.015];
    let rf_annual = 0.05;  // 5% annual

    let sharpe = calculate_sharpe(&returns, rf_annual, 252);

    // Manual calculation (sample standard deviation, n - 1):
    // Mean daily return = (0.01 - 0.005 + 0.02 - 0.01 + 0.015) / 5 = 0.006
    // Mean daily excess return = 0.006 - 0.05/252 = 0.006 - 0.000198 = 0.005802
    // Std dev of returns = sqrt(0.00067 / 4) = 0.012942
    // Annualized: mean * 252 / (std * sqrt(252)) = (0.005802 / 0.012942) * sqrt(252) = 7.116

    assert!((sharpe - 7.116).abs() < 0.01);  // Approximate expected value
}
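
For reference, here is a minimal sketch of a `calculate_sharpe` that follows the annualization formula in the comment above. It assumes the sample standard deviation (dividing by n - 1) and takes `periods_per_year` as an `f64`; both are choices made for this illustration, so adjust them to whatever convention your metrics module documents.

```rust
/// Annualized Sharpe ratio from per-period returns.
/// Returns `None` when there are fewer than two returns or no variance.
pub fn calculate_sharpe(returns: &[f64], rf_annual: f64, periods_per_year: f64) -> Option<f64> {
    let n = returns.len();
    if n < 2 {
        return None;
    }
    let rf_per_period = rf_annual / periods_per_year;
    let mean = returns.iter().sum::<f64>() / n as f64;

    // Sample variance: squared deviations from the mean, divided by n - 1.
    let variance = returns
        .iter()
        .map(|r| (r - mean).powi(2))
        .sum::<f64>()
        / (n as f64 - 1.0);
    let std_dev = variance.sqrt();
    if std_dev == 0.0 {
        return None;
    }

    // (mean excess * N) / (std * sqrt(N)) simplifies to (mean excess / std) * sqrt(N).
    Some((mean - rf_per_period) / std_dev * periods_per_year.sqrt())
}
```

Subtracting a constant risk-free rate does not change the standard deviation, so it is applied only to the mean.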

#[test]
fn test_max_drawdown() {
    let equity = vec![100.0, 110.0, 105.0, 95.0, 100.0, 115.0];

    let dd = calculate_max_drawdown(&equity);

    // Peak at 110, trough at 95: (110 - 95) / 110 = 13.6%
    assert!((dd.max_drawdown - 0.136).abs() < 0.01);
    assert_eq!(dd.peak_index, 1);
    assert_eq!(dd.trough_index, 3);
}
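
A single pass over the equity curve is enough to find the maximum drawdown. The sketch below mirrors the fields the test reads (`max_drawdown`, `peak_index`, `trough_index`); the `Drawdown` struct name is an assumption, and the code assumes equity stays positive.

```rust
#[derive(Debug, PartialEq)]
pub struct Drawdown {
    pub max_drawdown: f64,   // fraction of the peak, e.g. 0.136 for 13.6%
    pub peak_index: usize,
    pub trough_index: usize,
}

pub fn calculate_max_drawdown(equity: &[f64]) -> Drawdown {
    let mut worst = Drawdown { max_drawdown: 0.0, peak_index: 0, trough_index: 0 };
    let mut peak_idx = 0;

    for (i, &value) in equity.iter().enumerate() {
        // Track the running high-water mark.
        if value > equity[peak_idx] {
            peak_idx = i;
        }
        // Drawdown is measured from the most recent peak, not the start.
        let dd = (equity[peak_idx] - value) / equity[peak_idx];
        if dd > worst.max_drawdown {
            worst = Drawdown { max_drawdown: dd, peak_index: peak_idx, trough_index: i };
        }
    }
    worst
}
```

Storing the indices alongside the depth lets you map the drawdown back to timestamps for the report.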

6.2 Integration Tests

#[test]
fn test_buy_and_hold_matches_manual() {
    let bars = load_test_bars("tests/data/aapl_100_bars.csv");
    let initial_capital = dec!(100000);

    let result = run_backtest(
        BuyAndHoldStrategy::new(),
        &bars,
        BacktestConfig {
            initial_capital,
            slippage: Decimal::ZERO,
            latency_us: 0,
        },
    );

    // Manual calculation (assumes the strategy invests all capital):
    // the order placed on bar 0 fills at bar 1's open; mark at last bar's close
    let expected_return = (bars.last().unwrap().close - bars[1].open)
        / bars[1].open;

    assert!((result.total_return - expected_return).abs() < dec!(0.0001));
}

#[test]
fn test_no_lookahead_bias() {
    // Strategy that should fail if there's lookahead bias
    struct CheatStrategy { bars_seen: Vec<Bar> }

    impl Strategy for CheatStrategy {
        fn on_bar(&mut self, bar: &Bar, ctx: &mut Context) {
            // Try to access future data (should fail)
            let future_bar = ctx.get_bar(1);  // 1 bar into future
            assert!(future_bar.is_none(), "Lookahead bias detected!");

            self.bars_seen.push(bar.clone());
        }
    }

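    let bars = load_test_bars("tests/data/aapl_100_bars.csv");
    let config = BacktestConfig {
        initial_capital: dec!(100000),
        slippage: Decimal::ZERO,
        latency_us: 0,
    };

    run_backtest(CheatStrategy { bars_seen: vec![] }, &bars, config);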
}
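
For the test above to pass, the context has to refuse any request for a future bar. One way to get that property is sketched below; it is an illustration only, with closing prices standing in for full bars and a hypothetical `Context::new` constructor that the engine would call once per bar.

```rust
/// Read-only view the engine hands to the strategy for one bar.
pub struct Context<'a> {
    closes: &'a [f64], // full data set, owned by the engine
    current: usize,    // index of the bar being processed
}

impl<'a> Context<'a> {
    pub fn new(closes: &'a [f64], current: usize) -> Self {
        Context { closes, current }
    }

    /// `offset` 0 is the current bar, -1 the previous one, and so on.
    /// Positive offsets would be future data, so they always return `None`.
    pub fn get_bar(&self, offset: i64) -> Option<f64> {
        if offset > 0 {
            return None;
        }
        let back = offset.unsigned_abs() as usize;
        self.current
            .checked_sub(back)
            .and_then(|i| self.closes.get(i))
            .copied()
    }
}
```

An even stricter design passes the strategy only the slice `&closes[..=current]`, so future bars are not merely hidden behind a check but unreachable.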

6.3 Property-Based Tests

use proptest::prelude::*;

proptest! {
    #[test]
    fn test_pnl_equals_equity_change(
        initial in 10000.0..1000000.0f64,
        returns in proptest::collection::vec(-0.1..0.1f64, 10..100)
    ) {
        let mut equity = initial;
        let mut total_pnl = 0.0;

        for r in returns {
            let pnl = equity * r;
            equity += pnl;
            total_pnl += pnl;
        }

        let equity_change = equity - initial;

        // P&L should equal equity change
        assert!((total_pnl - equity_change).abs() < 0.01);
    }

    #[test]
    fn test_position_quantity_consistency(
        fills in proptest::collection::vec(
            (any::<bool>(), 1..1000i64),  // (is_buy, quantity)
            1..50
        )
    ) {
        let mut position = 0i64;

        for &(is_buy, qty) in &fills {  // Borrow: `fills` is used again below
            if is_buy {
                position += qty;
            } else {
                position -= qty;
            }
        }

        // Sum of signed fills should equal final position
        let computed: i64 = fills.iter()
            .map(|(is_buy, qty)| if *is_buy { *qty } else { -*qty })
            .sum();

        assert_eq!(position, computed);
    }
}

7. Common Pitfalls and Debugging

7.1 Pitfall: Using Close Price for Same-Bar Execution

Symptom: Strategy returns are impossibly good.

Problem:

// WRONG
fn on_bar(&mut self, bar: &Bar, ctx: &mut Context) {
    if bar.close > self.sma {
        ctx.submit_order(Order::market_buy("AAPL", 100));
    }
}
// ... later in execution ...
fn fill_order(&mut self, order: &Order, bar: &Bar) -> Fill {
    Fill { price: bar.close, .. }  // Filling at the same bar!
}

Why It’s Wrong:

  • Signal generated using bar’s close price
  • Order filled at the same close price
  • In reality, you don’t know the close until the bar ends, so the earliest realistic fill is at the next bar’s open

Fix:

fn fill_order(&mut self, order: &Order, next_bar: &Bar) -> Fill {
    Fill { price: next_bar.open, .. }  // Fill at NEXT bar's open
}
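
The fix implies that orders must wait in a queue until the next bar arrives. A minimal sketch of that queue follows; `NextOpenExecution` and its simplified `Bar`/`Fill` types are illustrative names, and orders are reduced to a signed quantity.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bar {
    pub timestamp: i64,
    pub open: f64,
    pub close: f64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Fill {
    pub timestamp: i64,
    pub quantity: i64,
    pub price: f64,
}

#[derive(Default)]
pub struct NextOpenExecution {
    pending: Vec<(i64, i64)>, // (submitted_at, signed quantity)
}

impl NextOpenExecution {
    pub fn submit_order(&mut self, quantity: i64, submitted_at: i64) {
        self.pending.push((submitted_at, quantity));
    }

    /// Fills every order submitted strictly before this bar began,
    /// at this bar's open. Newer orders keep waiting.
    pub fn process_bar(&mut self, bar: &Bar) -> Vec<Fill> {
        let mut fills = Vec::new();
        self.pending.retain(|&(submitted_at, quantity)| {
            if submitted_at < bar.timestamp {
                fills.push(Fill { timestamp: bar.timestamp, quantity, price: bar.open });
                false // filled: drop from the queue
            } else {
                true // submitted during or after this bar: keep waiting
            }
        });
        fills
    }
}
```

Because the fill price is read only from the bar passed to `process_bar`, a same-bar fill at the signal bar's close cannot happen by accident.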

7.2 Pitfall: Survivorship Bias in Stock Universe

Symptom: All your stocks went up. Sharpe is high. Feels wrong.

Problem:

// Loading "current" S&P 500 constituents
let symbols = load_sp500_symbols();  // As of today

for symbol in symbols {
    let bars = load_historical_data(symbol, "2010-01-01", "2020-12-31");
    backtest(strategy, bars);
}
// Problem: Companies that went bankrupt aren't in today's S&P 500!

Why It’s Wrong:

  • In 2010, different companies were in the index
  • Companies that failed, got acquired, or were delisted are missing
  • You’re only testing on “survivors”

Fix:

// Use point-in-time index constituents
let symbols_by_date = load_historical_constituents("SP500");

for date in trading_days("2010-01-01", "2020-12-31") {
    let symbols = symbols_by_date.get(&date);
    // Only trade symbols that were actually in the index on this date
}
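
A point-in-time lookup needs "the latest membership list on or before this date". The sketch below shows one way to do that with a sorted map; the `Constituents` type, the `YYYYMMDD` integer date encoding, and the ticker names are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Index membership snapshots keyed by effective date (YYYYMMDD as an integer,
/// which sorts chronologically).
#[derive(Default)]
pub struct Constituents {
    snapshots: BTreeMap<u32, Vec<String>>,
}

impl Constituents {
    pub fn add_snapshot(&mut self, effective: u32, symbols: &[&str]) {
        let owned = symbols.iter().map(|s| s.to_string()).collect();
        self.snapshots.insert(effective, owned);
    }

    /// Membership as of `date`: the most recent snapshot that took effect
    /// on or before that date, or `None` if the date precedes all snapshots.
    pub fn as_of(&self, date: u32) -> Option<&[String]> {
        self.snapshots
            .range(..=date)
            .next_back()
            .map(|(_, symbols)| symbols.as_slice())
    }
}
```

The lookup never consults a snapshot dated after `date`, so a symbol added to the index in 2015 cannot leak into a 2012 trading decision.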

7.3 Pitfall: Ignoring Transaction Costs

Symptom: Live trading P&L is much worse than backtest.

Problem:

// Costs not modeled:
// - Bid-ask spread (you buy at ask, sell at bid)
// - Exchange fees ($0.003/share typical)
// - SEC fees (charged on sells; the rate is reset periodically, e.g. 0.00051% of notional)
// - Borrowing costs for shorts
// - Slippage on larger orders

Reality Check:

1000 shares AAPL at $150 ($150,000 notional):
  Bid-ask spread (0.01%):    $15.00
  Exchange fees:              $3.00
  Slippage (0.02%):          $30.00
  ─────────────────────────────────
  Total per trade:           $48.00

At 10 trades/day, 252 days:
  Annual transaction costs: 10 × 252 × $48 = $120,960 (about $121,000)

Your strategy needs to make about $121k a year just to break even!

Fix:

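use rust_decimal::prelude::*;  // Decimal, FromPrimitive

struct TransactionCosts {
    spread_pct: Decimal,
    fee_per_share: Decimal,  // Charged separately per share, not part of the fill price
    slippage_pct: Decimal,
}

impl TransactionCosts {
    // `adv` is the symbol's average daily volume in shares
    fn calculate_fill_price(&self, side: Side, reference: Decimal, qty: i64, adv: f64) -> Decimal {
        // Crossing the spread costs half the quoted spread per side
        let spread_impact = reference * self.spread_pct / Decimal::from(2);
        // Square-root impact: slippage grows with sqrt(order size / ADV)
        let size_factor = Decimal::from_f64((qty as f64 / adv).sqrt()).unwrap_or(Decimal::ZERO);
        let slippage = reference * self.slippage_pct * size_factor;

        match side {
            Side::Buy => reference + spread_impact + slippage,
            Side::Sell => reference - spread_impact - slippage,
        }
    }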
}
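
The Reality Check arithmetic is easy to turn into a function you can rerun with your own assumptions. This is a rough sketch in `f64` (acceptable for an estimate, not for bookkeeping); `CostAssumptions`, `cost_per_trade`, and `annual_cost` are hypothetical names.

```rust
pub struct CostAssumptions {
    pub spread_pct: f64,    // fraction of notional, e.g. 0.0001 for 0.01%
    pub fee_per_share: f64, // dollars per share
    pub slippage_pct: f64,  // fraction of notional
}

/// Estimated cost of one trade in dollars.
pub fn cost_per_trade(shares: f64, price: f64, c: &CostAssumptions) -> f64 {
    let notional = shares * price;
    notional * c.spread_pct + shares * c.fee_per_share + notional * c.slippage_pct
}

/// Scales the per-trade cost to a year of trading.
pub fn annual_cost(per_trade: f64, trades_per_day: f64, trading_days: f64) -> f64 {
    per_trade * trades_per_day * trading_days
}
```

With the figures used above (1000 shares at $150, 0.01% spread, $0.003 per share, 0.02% slippage), `cost_per_trade` comes to $48 and `annual_cost` at 10 trades a day over 252 days comes to $120,960.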

7.4 Pitfall: Floating-Point Accumulation Errors

Symptom: P&L doesn’t match sum of trade P&Ls. Small discrepancies grow over time.

Problem:

// Each trade adds tiny error
double total_pnl = 0.0;
for (auto& trade : trades) {
    total_pnl += trade.pnl;  // 0.1 + 0.2 != 0.3 in floats!
}

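// The error grows with the number of additions and the size of the running
// total; with 32-bit floats it can reach whole dollars after 1M trades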

Fix:

use rust_decimal::Decimal;

let mut total_pnl = Decimal::ZERO;
for trade in &trades {
    total_pnl += trade.pnl;  // Exact decimal arithmetic
}
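
If you would rather not pull in a decimal crate, integer cents give the same exactness using only the standard library. The sketch below contrasts the two approaches; the function names are illustrative, and the `f32` version exists only to make the drift visible.

```rust
/// Sums P&L held as integer cents: every addition is exact.
pub fn total_pnl_cents(trades: &[i64]) -> i64 {
    trades.iter().sum()
}

/// Sums P&L held as 32-bit float dollars: each addition rounds.
pub fn total_pnl_f32(trades: &[f32]) -> f32 {
    trades.iter().sum()
}

/// Formats integer cents as a dollar string, e.g. 12345 -> "123.45".
pub fn format_cents(cents: i64) -> String {
    let sign = if cents < 0 { "-" } else { "" };
    let abs = cents.unsigned_abs();
    format!("{}{}.{:02}", sign, abs / 100, abs % 100)
}
```

Summing one million 10-cent trades with `total_pnl_cents` gives exactly 10,000,000 cents, while the `f32` sum of the same trades is off by more than a dollar. Integer cents work well for cash and P&L totals; average costs and percentage-based fees still need a rounding policy, which is where a decimal type is more convenient.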

7.5 Pitfall: Not Handling Stock Splits and Dividends

Symptom: Your strategy suddenly has 2x the shares, or P&L is wildly off on certain dates.

Problem:

Historical data for AAPL:
  2020-08-28: Close = $499.23 (pre-split)
  2020-08-31: Close = $129.04 (post 4:1 split)

If you're not using adjusted prices:
  - Your "yesterday's close" is $499.23
  - Today's close of $129.04 looks like a -74% crash
  - Strategy goes crazy

Fix Options:

  1. Use adjusted prices (common but has issues):
    • All historical prices adjusted to current split/dividend state
    • Problem: Prices change retroactively as new splits occur
  2. Handle splits explicitly (better for realism):
    • Store split events as data
    • Adjust position quantity on split dates
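    • Use raw prices for fill simulation

use rust_decimal::prelude::*;  // Decimal, FromPrimitive

fn handle_corporate_action(&mut self, action: &CorporateAction) {
    match action {
        CorporateAction::Split { ratio, .. } => {
            // `ratio` is bound as `&f64` here, so dereference it.
            // The cast truncates fractional shares (paid as cash in lieu in practice).
            self.quantity = (self.quantity as f64 * *ratio) as i64;
            self.average_cost /= Decimal::from_f64(*ratio).unwrap();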
        }
        CorporateAction::Dividend { amount, .. } => {
            self.cash += amount * Decimal::from(self.quantity);
        }
    }
}
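
Storing the split ratio as a float invites rounding trouble for ratios such as 3-for-2. A dependency-free variant that keeps the ratio as two integers is sketched below; the field names (`numerator`, `denominator`, `amount_per_share`) and the simplified `Position` are assumptions for illustration.

```rust
pub struct Position {
    pub quantity: i64,
    pub average_cost: f64,
    pub cash: f64,
}

pub enum CorporateAction {
    /// A 4-for-1 split is `numerator: 4, denominator: 1`.
    Split { numerator: i64, denominator: i64 },
    Dividend { amount_per_share: f64 },
}

impl Position {
    pub fn handle_corporate_action(&mut self, action: &CorporateAction) {
        match *action {
            CorporateAction::Split { numerator, denominator } => {
                let old_qty = self.quantity;
                // Integer math: whole shares only, any fraction is dropped.
                let new_qty = old_qty * numerator / denominator;
                if new_qty != 0 {
                    // Total cost basis is unchanged by a split.
                    self.average_cost = self.average_cost * old_qty as f64 / new_qty as f64;
                }
                self.quantity = new_qty;
            }
            CorporateAction::Dividend { amount_per_share } => {
                self.cash += amount_per_share * self.quantity as f64;
            }
        }
    }
}
```

Applied to the AAPL example, 100 shares at an average cost of $499.23 become 400 shares at $124.8075, and the position's total cost stays at $49,923.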

7.6 Debugging Techniques

Technique 1: Trade Log Analysis

# Export every trade with full context
./backtester --strategy my_strat --data aapl.csv --verbose > trades.log

# Look for suspicious patterns
grep "FILL" trades.log | awk '{print $5}' | sort -n | uniq -c
# ^ Shows fill price distribution - any outliers?

# Check for same-bar execution
awk '/SIGNAL/ {signal_time=$2} /FILL/ {if($2==signal_time) print "SAME BAR!"}' trades.log

Technique 2: Sanity Check Strategies

// Buy-and-hold should match benchmark
let bh_return = (final_price - initial_price) / initial_price;
assert!((strategy_return - bh_return).abs() < 0.01);

// 100% cash strategy should have 0% return
// (tests that you're not accidentally trading)

// Reversing every signal should give roughly the opposite return
// (only approximately: costs hurt both directions)
let expected_reverse_return = -strategy_return;

Technique 3: Visualize the Equity Curve

import pandas as pd
import matplotlib.pyplot as plt

pnl = pd.read_csv('pnl.csv')
pnl['timestamp'] = pd.to_datetime(pnl['timestamp'])
pnl.set_index('timestamp', inplace=True)

plt.figure(figsize=(12, 6))
plt.plot(pnl['equity'])
plt.title('Equity Curve')
plt.xlabel('Date')
plt.ylabel('Equity ($)')
plt.savefig('equity_curve.png')

# Look for:
# - Sudden jumps (data error or lookahead)
# - Long flat stretches (not enough trades)
# - Consistent upward slope (too good, suspicious)

8. Extensions and Challenges

8.1 Extension: Multi-Asset Portfolio

Extend your backtester to handle multiple symbols simultaneously:

Challenges:
- Correlation between assets
- Portfolio-level position limits
- Rebalancing logic
- Cross-asset signals (pairs trading)
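
Before any of these challenges, the engine has to interleave several per-symbol bar streams into one chronological event sequence. A minimal k-way merge sketch is shown below; the simplified `Bar` and the `merge_streams` name are illustrative, and each input stream is assumed to be sorted already.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

#[derive(Debug, Clone, PartialEq)]
pub struct Bar {
    pub timestamp: i64,
    pub symbol: &'static str,
    pub close: f64,
}

/// Merges per-symbol streams (each sorted by timestamp) into one
/// chronologically ordered vector.
pub fn merge_streams(streams: &[Vec<Bar>]) -> Vec<Bar> {
    // Min-heap of (timestamp, stream index, position within stream).
    let mut heap = BinaryHeap::new();
    for (s, stream) in streams.iter().enumerate() {
        if let Some(first) = stream.first() {
            heap.push(Reverse((first.timestamp, s, 0usize)));
        }
    }

    let mut merged = Vec::new();
    while let Some(Reverse((_, s, i))) = heap.pop() {
        merged.push(streams[s][i].clone());
        // Queue the next bar from the stream we just consumed.
        if let Some(next) = streams[s].get(i + 1) {
            heap.push(Reverse((next.timestamp, s, i + 1)));
        }
    }
    merged
}
```

Ties on timestamp are broken by stream index, which keeps runs deterministic. For large data sets you would hold iterators instead of materialized vectors, but the heap logic stays the same.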

8.2 Extension: Order Book Simulation

Implement a full order book simulator:

Features:
- Limit order queue position
- Partial fills over time
- Market orders walk the book
- Your orders affect the book
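
"Market orders walk the book" means a large order consumes one price level after another, paying a worse price at each step. The sketch below shows that loop for a market buy against the ask side; prices are in integer cents, and `Level` and `walk_asks` are hypothetical names.

```rust
/// One resting price level on the ask side of the book.
#[derive(Debug, Clone, Copy)]
pub struct Level {
    pub price_cents: i64,
    pub size: i64,
}

/// Executes a market buy against `asks` (sorted best price first).
/// Returns (filled quantity, total cost in cents). The fill is partial
/// if the book holds fewer shares than requested.
pub fn walk_asks(asks: &[Level], quantity: i64) -> (i64, i64) {
    let mut remaining = quantity;
    let mut cost_cents = 0;
    for level in asks {
        if remaining == 0 {
            break;
        }
        // Take as much as this level offers, up to what is still needed.
        let take = remaining.min(level.size);
        cost_cents += take * level.price_cents;
        remaining -= take;
    }
    (quantity - remaining, cost_cents)
}
```

Dividing the total cost by the filled quantity gives the volume-weighted fill price; its distance from the best ask is the slippage caused by the order's own size.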

8.3 Extension: Walk-Forward Optimization

Implement rolling window parameter optimization:

Process:
1. Train on years 1-2
2. Test on year 3
3. Roll forward one year: train on years 2-3
4. Test on year 4
...

Outputs:
- Out-of-sample Sharpe for each window
- Parameter stability over time
- Degradation analysis
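
Generating the rolling train/test windows is a small index exercise. The sketch below works in abstract periods (years, months, or bar counts); the `Window` struct and `walk_forward_windows` name are illustrative, and the windows advance by one test length per step.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq)]
pub struct Window {
    pub train: Range<usize>,
    pub test: Range<usize>,
}

/// Splits `n` periods into rolling (train, test) index ranges.
/// Each step slides forward by `test_len`, so test ranges never overlap.
pub fn walk_forward_windows(n: usize, train_len: usize, test_len: usize) -> Vec<Window> {
    let mut windows = Vec::new();
    if train_len == 0 || test_len == 0 {
        return windows;
    }
    let mut start = 0;
    while start + train_len + test_len <= n {
        let split = start + train_len;
        windows.push(Window {
            train: start..split,
            test: split..split + test_len,
        });
        start += test_len;
    }
    windows
}
```

With four yearly periods, a two-year training window, and a one-year test window, this yields two windows: train on periods 0-1 and test on 2, then train on 1-2 and test on 3. Every test range begins where its training range ends, so parameters are never fitted on data they are later scored against.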

8.4 Extension: Monte Carlo Simulation

Add randomization to stress test strategies:

Variations to simulate:
- Random slippage (normal distribution around mean)
- Random fill delays
- Randomly drop some fills (partial fills)
- Bootstrap resample returns
- Path dependency analysis
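
Bootstrap resampling is the simplest of these variations to start with: draw returns at random, with replacement, from the observed series and compound them into alternative paths. The sketch below uses a tiny seeded linear congruential generator so it runs on the standard library alone; that generator is an assumption for illustration and is not statistically strong, so swap in a proper RNG crate for real studies.

```rust
/// Small seeded LCG (multiplier and increment from Knuth's MMIX).
/// Good enough for a reproducible demo, not for serious simulation.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }

    /// Pseudo-random index in 0..n, taken from the higher-quality upper bits.
    pub fn next_index(&mut self, n: usize) -> usize {
        ((self.next_u64() >> 33) as usize) % n
    }
}

/// Builds `paths` resampled histories, each as long as the original, and
/// returns the compounded total return of every path.
pub fn bootstrap_total_returns(returns: &[f64], paths: usize, seed: u64) -> Vec<f64> {
    let mut rng = Lcg::new(seed);
    (0..paths)
        .map(|_| {
            let mut equity = 1.0;
            for _ in 0..returns.len() {
                equity *= 1.0 + returns[rng.next_index(returns.len())];
            }
            equity - 1.0
        })
        .collect()
}
```

Sorting the resulting vector gives an empirical distribution of outcomes, from which you can read percentiles. Plain resampling destroys any serial correlation in the returns; a block bootstrap preserves more of it.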

8.5 Challenge: Match Production Backtest

If you have access to a production backtester (QuantConnect, Backtrader, etc.):

Challenge:
1. Implement the same strategy in both systems
2. Use the same data
3. Compare fill-by-fill
4. Explain any discrepancies

9. Real-World Connections

9.1 How Professional Firms Backtest

Hedge Fund Approach:

Infrastructure:
- Dedicated backtesting cluster (100+ cores)
- Point-in-time databases (no lookahead possible by design)
- Full tick data with order book (terabytes)
- Version-controlled strategy code
- Automated regression testing

Process:
1. Idea generation (fundamental or quantitative)
2. Quick vectorized backtest (sanity check)
3. Full event-driven backtest (realistic)
4. Paper trading (live data, simulated execution)
5. Small live allocation ($1M)
6. Scale if profitable (months of evidence)

Key Differences from Your Backtester:

Aspect       Your Backtester       Production System
Data         CSV files             Tick database (100TB+)
Execution    Simple models         Full order book replay
Scale        1 symbol              5000+ symbols
Speed        1M bars/sec           100M+ events/sec
Validation   Manual review         Automated pipelines
Cost model   Percentage slippage   Market impact + fees + borrow

9.2 Career Applications

Quant Developer:

  • Build and maintain backtesting infrastructure
  • Optimize execution simulation
  • Scale to multi-asset, multi-strategy

Quant Researcher:

  • Use backtester to develop strategies
  • Analyze results for alpha decay
  • Iterate on parameter selection

Risk Manager:

  • Stress test strategies with extreme scenarios
  • Validate drawdown limits
  • Monitor strategy degradation

Trading Systems Engineer:

  • Connect backtester to live execution
  • Ensure consistency between simulation and production
  • Performance optimization

10. Resources

10.1 Documentation and References

10.2 Academic Papers

  • Almgren, R., & Chriss, N. (2000). “Optimal Execution of Portfolio Transactions” - Market impact model
  • Kyle, A. S. (1985). “Continuous Auctions and Insider Trading” - Market microstructure theory
  • López de Prado, M. (2018). “Advances in Financial Machine Learning” (book) - Modern backtesting methodology

10.3 Books

Book                                      Focus                      Level
“Trading and Exchanges” by Harris         Market microstructure      Foundational
“Advances in Financial ML” by de Prado    Backtesting methodology    Advanced
“Quantitative Trading” by Chan            Practical implementation   Intermediate
“Inside the Black Box” by Narang          Strategy development       Intermediate
“Algorithmic Trading” by Chan             Technical implementation   Intermediate

10.4 Open Source Projects


11. Self-Assessment Checklist

Before considering this project complete, verify you can answer “yes” to all:

Architecture:

  • I can explain why event-driven backtesting is more realistic than vectorized
  • My backtester processes events in strict chronological order
  • Strategy code cannot access future data by design
  • I can add new event types without modifying the core engine

Execution:

  • Market orders fill at the next bar’s open (not the signal bar’s close)
  • I have implemented at least one slippage model
  • I understand why my fill prices differ from ideal prices
  • I can explain how market impact affects large orders

Portfolio:

  • Position quantities and average costs are always correct
  • Realized P&L is calculated correctly on position close
  • I handle partial fills without losing track of position state
  • Cash balance is updated correctly for all transactions

Metrics:

  • I can calculate Sharpe ratio from first principles
  • I track maximum drawdown with timestamps
  • My win rate and profit factor match manual calculation
  • I generate a complete trade log for audit

Performance:

  • My backtester processes 1M+ bars per second
  • I use memory-mapped files for large datasets
  • I have benchmarked the hot paths

Validation:

  • Buy-and-hold strategy returns match benchmark
  • Adding slippage reduces (never increases) returns
  • I have tested with known-result datasets
  • I have found and fixed at least one bug using trade log analysis

12. Submission / Completion Criteria

Your backtester is complete when you can demonstrate:

1. Correct Basic Operation

$ ./backtester --strategy buy_hold --data sample.csv
# Returns match manual calculation

2. Realistic Execution

$ ./backtester --strategy mean_reversion --data sample.csv --slippage 0.1%
# Returns lower than zero-slippage run

3. Complete Metrics

$ ./backtester --strategy mean_reversion --data sample.csv --output report.json
# Report contains: Sharpe, Sortino, max drawdown, trade count, win rate

4. Performance Targets

$ ./backtester --benchmark
# 1M+ bars/second throughput

5. Trade Log Audit

$ head -20 output/trades.csv
# Every fill has: timestamp, symbol, side, quantity, price, P&L

6. Documentation

  • README explaining how to run
  • At least one example strategy
  • Test suite with passing tests

Summary

This project teaches you to build a realistic backtesting engine - the essential tool for any quantitative trader. You’ve learned:

  1. Event-driven architecture prevents lookahead bias by design
  2. Fill simulation must model slippage, spread, and market impact
  3. Performance metrics like Sharpe ratio measure risk-adjusted returns
  4. Common biases (survivorship, lookahead, overfitting) destroy backtest validity
  5. Efficient data handling enables testing thousands of parameter combinations

The backtester you’ve built is a foundation for more advanced projects: optimization, live trading integration, and multi-asset portfolio management.

Remember: A backtest is not a prediction. It’s a tool to discard bad ideas quickly and build intuition about what might work. The real test is always live trading - but a good backtester saves you from losing money on obviously bad ideas.


Next Step: Use your backtester to test real strategies, or move on to P05: Custom Memory Allocator to eliminate malloc from your hot paths.